Least Rattling Feedback from Strong Time-scale Separation

In most interacting many-body systems associated with some "emergent phenomena," we can identify sub-groups of degrees of freedom that relax on dramatically different time-scales. Time-scale separation of this kind is particularly helpful in nonequilibrium systems where only the fast variables are subjected to external driving; in such a case, it may be shown through elimination of fast variables that the slow coordinates effectively experience a thermal bath of spatially-varying temperature. In this work, we investigate how such a temperature landscape arises according to how the slow variables affect the character of the driven quasi-steady-state reached by the fast variables. Brownian motion in the presence of spatial temperature gradients is known to lead to the accumulation of probability density in low temperature regions. Here, we focus on the implications of attraction to low effective temperature for the long-term evolution of slow variables. After quantitatively deriving the temperature landscape for a general class of overdamped systems using a path integral technique, we illustrate in a simple dynamical system how the attraction to low effective temperature has a fine-tuning effect on the slow variable, selecting configurations that bring about exceptionally low force fluctuation in the fast-variable steady-state. We furthermore demonstrate that a particularly strong effect of this kind can take place when the slow variable is tuned to bring about orderly, integrable motion in the fast dynamics that avoids thermalizing energy absorbed from the drive. We thus point to a potentially general feedback mechanism in multi-time-scale active systems that leads to the exploration of slow variable space, as if in search of fine-tuning for a "least rattling" response in the fast coordinates.


I. INTRODUCTION
A broad range of many-body nonequilibrium systems have in common that different degrees of freedom within them undergo motion on two well-separated time-scales, and that the faster degrees of freedom are the only ones directly subject to external driving. Such separation can occur if a faster set of active particles acts as a bath for a heavier, more slowly relaxing set of larger, extended degrees of freedom, such as in the example of a polymer immersed in a mixture of self-propelling particles [1]. Alternatively, in many systems one can usefully identify coarse-grained variables describing global features of the many-body dynamics, which may relax more slowly than the coordinates of individual particles. Such order parameters might then be thought of as a set of slowly-varying constraints on the driven fast dynamics, as for example in [2].
In all such cases, it is possible in principle for the particular configuration of a set of slow variables to have a significant influence on the specific nonequilibrium steady-state reached by the fast variables. Thus, in general, a feedback loop can arise in which the slow variables first establish the features of the fast steady-state, and then the statistics of this steady-state in turn determine the stochastic dynamics of the resulting local motion in slow variable space. The goal of this paper is to characterize the dynamical attractors of slow variable evolution in terms of the particular, special properties of the fast steady-states to which they give rise.
* pchvykov@mit.edu
Nonequilibrium systems with time-scale separation have been extensively studied over the last several decades. The most common context in which they have come up is in formalizing the concept of a "thermal bath": explicitly modelling the fast bath degrees of freedom as a Hamiltonian system, and studying their effects on the slow variables. In this way, one can in some cases recover the effective friction tensor [3] and the corresponding noise term, related by the fluctuation-dissipation theorem [4]. There is also extensive literature studying the conditions and effects of deviations from this basic result, which are generally termed "anomalous diffusion" (see, e.g., [5]). Within this context, the "slow" degrees of freedom lack their own dynamics, and are considered only as probes of the fast bath. More recent studies have considered the minimal dissipation required from an external agent to slowly move such probes. A geometric interpretation of this bound was presented in [6], and extended to nonequilibrium baths in [7], as well as to reversible external protocols in [8]. Systems where slow variables have their own dynamics under a conservative coupling to the fast bath have received relatively little attention, excepting notable recent work for a simple harmonic oscillator probe in [9], and a more general exploration in [10,11], where some formal results relating dissipation and forces on the slow variables were derived.
Most of this previous work has relied on the projection operator technique to adiabatically eliminate fast variables and obtain the reduced Fokker-Planck equations for the slow variables, as in Ch. 6.4 of [12], or see [13] for a recent review. The straightforward implication of this approach is that at long times, probability density in slow variable space is expected to accumulate in locations where inward mean drift is strong, and where local diffusion is low. Here, we first derive this effect for a general class of Langevin systems using a response-field path integral framework that makes clear the relationship between the reduced Fokker-Planck parameters and the absorption and thermalization of drive energy in the fast steady-state. Some related path-integral system reduction techniques have been studied before (e.g., [14,15]), but in substantially different contexts. Having established a means of explicitly calculating the parameters of the multiplicative noise stochastic process governing the slow variables, we then proceed to analyse the implications for what we term "least rattling feedback," in which slow variables dynamically fine-tune themselves to bring about fast variable steady-states that attenuate force fluctuations so as to lower the slow variable effective temperature.
The tendency of slow variables in driven systems to move thermophoretically towards regions of lower effective temperature has been noticed in the past, most commonly in situations where the slow variables find a way to reduce the influx of energy from the drive (as in [16,17]). As we shall see here, however, a striking alternative can arise if the fast variables are capable of exhibiting regular, integrable dynamics; in such a case, least rattling stability can co-exist with strong coupling to, and absorption of work from, the external drive.
In section II of this article, we will present the derivation of our main analytical result, which establishes a relationship between force fluctuations in fast driven variables and the resulting effective temperature experienced by the slow variables in a driven system. In section III, we will carry out a numerical analysis of the kicked rotor on a cart, a time-scale separated, damped, driven dynamical system that is ideally suited for demonstrating the predictive power of the "least rattling" framework. Not only will this analysis draw clear connections to methods of equilibrium statistical physics and show how they generalize in such a nonequilibrium scenario, but it will also underline how "least rattling" helps to explain the non-trivial relationship between dissipation rate and local kinetic stability in driven systems.

II. ANALYTICAL SLOW DYNAMICS
In this section, we lay out a general formalism for extracting slow dynamics in stochastic systems with strong time-scale separation. We will model "slow" variables $x_a$ and "fast" variables $y_i$ that evolve according to a coupled system of Langevin equations. Our approach will be to integrate out the fast degrees of freedom and develop an effective theory for the dynamics of the slow variables that is controlled by a small number $\epsilon$, which quantifies the time-scale separation between fast and slow. As we carry out this integration, we will show that the effects on $x_a$ from the fast steady-state of the $y_i$ variables at leading order in $\epsilon$ are an average force and, more subtly, a random force and renormalized drag that are calculated from the two-point correlation function of the forces acting between $x_a$ and $y_i$. These latter effects are identified as an emergent, position-dependent effective temperature experienced by the slow coordinates.

A. Setup
While the method we present here is not restricted to this context, it is easiest to illustrate on systems whose dynamics can be given by first-order equations, as below. In particular, it works the same way for other types of fast dynamics (such as inertial or discrete), as long as there is a fast relaxation to a steady-state.
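As a point of reference, one form of the first-order equations that is consistent with the interpretation given in the text (overdamped motion in a bath of temperature $T$, damping coefficients $\eta$ for the slow and $\mu$ for the fast variables, and noise amplitudes fixed by Einstein's relation) can be sketched as follows. The exact arguments of the forces $F_a$ and $f_i$, in particular any explicit time dependence, are an assumption made here for illustration.

```latex
\begin{align}
  \eta\,\dot{x}_a &= F_a(x, y, t) + \sqrt{2\,T\,\eta}\;\xi_a , \\
  \mu\,\dot{y}_i &= f_i(x, y, t) + \sqrt{2\,T\,\mu}\;\xi_i .
\end{align}
```

Under the rescaling $t \to \mu\, t$ this form gives fast variables that relax on a time-scale of order one, while the slow force and noise are suppressed by $\epsilon = \mu/\eta$ and $\sqrt{\epsilon}$ respectively.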
Here the noise $\xi$ is the usual Gaussian white noise: $\langle \xi_\alpha(t) \rangle = 0$ and $\langle \xi_\alpha(t)\, \xi_\beta(s) \rangle = \delta_{\alpha,\beta}\, \delta(t - s)$. Taking the limit $\mu/\eta \equiv \epsilon \ll 1$ amounts to explicitly separating $x_a$ as slow modes and $y_i$ as fast ones ($a, b, c$ index the slow configuration space, and $i, j, k$ the fast one). The natural physical interpretation of this system as overdamped dynamics in a thermal bath of temperature $T$, with two different damping coefficients $\mu$ and $\eta$, the noise amplitudes given by Einstein's relation, and with the forces $F_a$ and $f_i$, will be implied from now on for concreteness, but is not at all necessary. With a slight adjustment the system could as well represent underdamped dynamics, such as in the kicked rotor model system we characterize below.

B. Results
The detailed derivation of the effective slow dynamics is relegated to Appendix A. Here we mention only the key steps in the derivation. First, we rescale time $t \to \mu\, t$, making the slow dynamics obey $\dot{x}_a = \epsilon\, F_a + \sqrt{2 \epsilon T}\, \xi_a$, while the relaxation time of the fast variables becomes of $O[1]$. Second, we express the probability of slow trajectories in terms of the Martin-Siggia-Rose path integral (also termed the response-field formalism) [18]. Third, we do a cumulant expansion controlled by $\epsilon$. In the resulting expression, $Z_x$ is the normalization and $\tilde{x}(t)$ is the auxiliary "response" field. In the last line of that expansion, we see that the $O[\epsilon^2]$ term, like the temperature $T$, comes in $\propto \tilde{x}^2$, and thus gives a correction to the noise on the slow dynamics; this is the effect that we will focus on throughout the rest of this paper. Doing this more carefully (as shown in the Appendix), the resulting slow dynamics, which are our main analytical result, are given by eq. 3, where the matrix square root is defined by $B \equiv \sqrt{D} \Leftrightarrow B \cdot B^T = D$. Dots denote Itô products, which will be typical here (see sec. A 2). Note that only the connected components of the expectations appear in the expressions for $\gamma(x)$ and $D(x)$ (denoted by commas, as in $\langle F, F \rangle$), and thus are insensitive to any deterministic motion of the fast variables. Further note that there is also an $O[\epsilon]$ correction of the damping coefficient, which, for a fully conservative (undriven) system, matches the noise correction to preserve Einstein's relation, as it must (see sec. A 4). For non-conservative forces, however, this will not be the case, and the ratio of the effective noise to damping amplitudes can be used to define an effective temperature tensor $T_{\rm eff}(x_a) \equiv \gamma^{-1} \cdot D \cdot (\gamma^{-1})^T$, which will generally depend on the slow coordinates; i.e., the noise on slow variables becomes multiplicative.
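To make the two matrix definitions above concrete, here is a minimal numerical sketch, assuming that $\gamma$ and $D$ have already been obtained at some fixed slow configuration. The function names and the 2x2 values are hypothetical and chosen only for illustration; the Cholesky factor is used as one valid choice of $B$ satisfying $B \cdot B^T = D$, not necessarily the choice made in the paper.

```python
import numpy as np


def effective_temperature(gamma, D):
    """Return T_eff = gamma^{-1} . D . (gamma^{-1})^T for square arrays gamma, D."""
    gamma_inv = np.linalg.inv(gamma)
    return gamma_inv @ D @ gamma_inv.T


def noise_amplitude(D):
    """Return one matrix B with B @ B.T == D (lower-triangular Cholesky factor)."""
    return np.linalg.cholesky(D)


# Hypothetical values for two slow coordinates (not taken from the paper).
gamma = np.array([[1.0, 0.1],
                  [0.0, 1.2]])
D = np.array([[0.5, 0.1],
              [0.1, 0.3]])

T_eff = effective_temperature(gamma, D)
B = noise_amplitude(D)
print(T_eff)
print(np.allclose(B @ B.T, D))
```

Because $D$ is symmetric positive-definite, the resulting $T_{\rm eff}$ is also symmetric positive-definite for any invertible $\gamma$.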

C. Least Rattling
The significance of the above formal result is that to extract the effective slow dynamics we need not know everything about the fast modes, but only the mean and variance of the force fluctuations $F_a$ in the $y_i$ (fast) steady-state at fixed $x_a$ (slow d.o.f.). All other details of the fast dynamics become irrelevant by the same mechanism as for the central limit theorem. The slow dynamics thus follow the simple eq. 3, which can often be solved analytically. Its qualitative behavior is guided by a competition between the mean drift along the average force $\langle F \rangle (x)$ and a median drift down the gradients of the effective temperature $T_{\rm eff}(x)$. While the former effect is larger by a factor $1/\epsilon$, it is a vector quantity, and as such may be suppressed by averaging in the case of high-dimensional disordered fast dynamics. This is in contrast to $T_{\rm eff}$, which comes in as a positive-definite tensor, making it robust to averaging-out. Without rigorously exploring this trade-off for now, in this work we simply choose to focus on the effect of $T_{\rm eff}(x)$, which guides the slow variables towards regions in their configuration space that yield more orderly, less chaotic, or less "rattling" fast dynamics (see sec. A 5). We suggest that this effect might result in the self-organization lately studied in many non-equilibrium systems [19,20].
We now expand on a few of the points mentioned above. First, how general is this method? Its scope is basically inherited from the regime of applicability of the Central Limit Theorem (CLT): our requirement of strong time-scale separation amounts to the condition that fast fluctuations decorrelate faster than the dynamical time-scale of the slow variables. This way their effect on the slow coordinates adds up as i.i.d. random variables, satisfying the conditions of the CLT. Thus any fast fluctuations must either decorrelate quickly (e.g., due to thermal noise or chaos), thus contributing to the Gaussian noise amplitude, or not decorrelate at all (as with integrable behavior), contributing to the mean force $\langle F \rangle$. This requirement could notably be broken if some fast fluctuations decay slower than exponentially, a scenario that leads to effective colored noise and anomalous diffusion, but retains much of the general intuition from eq. 3.
This framework is particularly useful in cases where fast dynamics can be in several qualitatively different dynamical phases, controlled by the slow variables. For example, if a fast variable undergoes a transition from chaotic to integrable behavior as a function of some slow coordinate, then we will typically expect its effect to transition from a noise contribution to an average force contribution respectively, as we will see in the toy system below. Making this precise and describing the relevant universality classes of these transitions based on their symmetry structure can be done within the broader framework of renormalization group flow. This could allow extracting the effective slow dynamics, much like it allows finding large scale physics for quantum or statistical fields [21].
Finally, we mentioned above that while the average force $\langle F \rangle$ causes the mean of the $x_a$-ensemble (slow variables) to drift, the multiplicative Itô noise given by the effective temperature bath $T_{\rm eff}(x)$ affects only a drift of the median of that same ensemble. This latter effect is realized by virtue of the probability distribution $p(x_a)$ growing increasingly heavy-tailed with time (e.g., log-normal distributions are typical), and so while the mean remains fixed, the median will drift towards the low-noise regions. This means that any finite ensemble of trajectories will also settle in the low-noise region, and the mean will never be realized experimentally. Some aspects of this ergodicity-breaking phenomenon were discussed in [22], and a similar problem considered in [23]. The key for us is that the least-rattling effect is inherently non-ergodic, and is observed only by monitoring the system over time.
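The distinction between a fixed mean and a drifting median can be illustrated with a minimal sketch, assuming the simplest one-dimensional Itô process whose noise amplitude grows linearly with position, $dx = s\, x\, dW$ (so the local "temperature" scales as $x^2$). This generic process, the parameter `s`, and the function name are illustrative assumptions, not the cart equation of the paper. Its exact solution is log-normal, with mean $x_0$ and median $x_0 e^{-s^2 t/2}$.

```python
import numpy as np


def simulate_ito_multiplicative(x0=1.0, s=1.0, t_final=1.0, dt=0.01,
                                n_traj=200_000, seed=0):
    """Euler-Maruyama integration of the Ito SDE dx = s * x * dW."""
    rng = np.random.default_rng(seed)
    x = np.full(n_traj, x0)
    n_steps = int(round(t_final / dt))
    for _ in range(n_steps):
        # Ito convention: the noise amplitude is evaluated at the start of the step.
        x = x + s * x * np.sqrt(dt) * rng.standard_normal(n_traj)
    return x


x_final = simulate_ito_multiplicative()
sample_mean = x_final.mean()                      # stays near x0 = 1
sample_median = np.median(x_final)                # drifts towards low noise (small x)
predicted_median = np.exp(-0.5 * 1.0**2 * 1.0)    # x0 * exp(-s^2 t / 2)
print(sample_mean, sample_median, predicted_median)
```

The ensemble mean is carried by rare trajectories in the heavy tail, which is why a typical (median) trajectory still moves towards the low-noise region.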

III. TOY MODEL
To illustrate the above results, we consider a toy model that is designed to be the simplest possible example capturing all the qualitative features we might expect of more general scale-separated driven systems of interest. Specifically, we take the kicked-rotor-on-a-cart setup shown in fig. 1a. The fast kicked rotor (Chirikov standard map) dynamics here are chosen as the simplest system that can realize both chaotic and integrable behaviors under different parameter regimes. Essentially, the system is a rigid pendulum that experiences no external forces except for periodic kicks of a uniform force field (as though gravity gets turned on in brief bursts), and is given by the first two lines in eq. 4. We modelled the system to be immersed in a thermal bath by adding a small damping and noise (see third line in eq. 4), whose effects have been studied in [24,25]. The point relevant for the following analysis is that when the driving force amplitude (henceforth called "kicking strength") is large, the rotor dynamics are fully chaotic, but if the kicking strength drops below a critical value ($K \lesssim 5$), periodic orbits appear in the configuration space, and are made globally attractive in the presence of damping, thus quickly making the dynamics integrable (we refer to this phenomenon below as "dynamical regularization"). Thus, by controlling the effective drive strength, it is possible to switch between chaotic and regular regimes of fast dynamics.
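A minimal sketch of the fast dynamics alone may help fix ideas. It assumes a stroboscopic map built from an impulsive kick $v \to v + K \sin\theta$ at integer times followed by free rotation with linear damping $\dot{v} = -b\, v$ over one period, with the thermal noise omitted. The sign convention of the kick, the function names, and the parameter values are assumptions made for illustration, not the paper's eq. 4.

```python
import numpy as np


def kicked_rotor_step(theta, v, K, b):
    """Advance the rotor by one kicking period (kick, then damped free rotation)."""
    v = v + K * np.sin(theta)                    # impulsive kick at integer time
    if b > 0.0:
        decay = np.exp(-b)                       # velocity decay over one period
        theta = theta + v * (1.0 - decay) / b    # angle swept while v decays
        v = v * decay
    else:
        theta = theta + v                        # undamped free rotation
    return theta % (2.0 * np.pi), v


def run_rotor(theta0, v0, K, b, n_kicks):
    """Iterate the map; return arrays of (theta, v) at the end of each period."""
    thetas, vs = np.empty(n_kicks), np.empty(n_kicks)
    theta, v = theta0, v0
    for n in range(n_kicks):
        theta, v = kicked_rotor_step(theta, v, K, b)
        thetas[n], vs[n] = theta, v
    return thetas, vs


# Undamped example: one full revolution per kick is a periodic orbit of the map,
# because the kick vanishes at theta = 0 and the angle returns there each period.
thetas, vs = run_rotor(theta0=0.0, v0=2.0 * np.pi, K=3.0, b=0.0, n_kicks=10)
print(vs)
```

Orbits of this kind, with an integer number of revolutions per kick, are the regular motions referred to in the text; whether they attract nearby trajectories depends on $K$ and on the damping.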
We then fasten the pivot of the fast kicked rotor on a slow cart that can slide back and forth in a highly viscous medium, perpendicular to the direction of the kick accelerations (i.e., along the symmetry axis of the rotor dynamics; see fig. 1a). The cart is pulled by the tension in the rod, which depends on the fast dynamics, while the global cart position $x$ can feed back on the fast dynamics through a kicking field $K(x)$ that varies along the cart's track. This way we have slow variables conservatively coupled to driven fast dynamics, and a feedback loop controlled through the arbitrary form of $K(x)$, providing a flexible testing ground. Overall, we argue that, while vastly simplified, this model captures essential physical features of many multi-particle nonequilibrium systems of potential interest.

A. Model Setup
The toy model explored here is presented in fig. 1a: the kicked rotor is attached to a massless cart moving on a highly viscous track, which ensures that the cart's velocity is much smaller than the rotor's. The exact equations of motion for the system can be derived from a force balance, and in their dimensionless form are given by eq. 4, where all lengths are measured in units of the rotor arm length, time in units of the kicking period, and the angle $\theta$ is $2\pi$-periodic. Note that for practical reasons (see Appendix B), we also assumed that the cart is momentarily pinned down during each kick, so as to remove the term $\frac{1}{2} K(x) \sin 2\theta\; \delta(t - n)$ that should otherwise be included in $F_x$ due to the direct coupling of the kicks to the cart. For now, we can motivate this by saying that the interesting problem is where the driving force affects the slow dynamics only by means of the fast ones, and not directly, while this chosen implementation can simply be viewed as an additional component of the drive protocol. Additionally, to provide more modelling freedom, we can include an arbitrary potential $U(x)$ acting directly on the cart to produce a conservative force. Time-scale separation in this model implies that the back reaction from the cart dynamics on the rotor is small, i.e., here that $\ddot{x} \ll K$ (by differentiating the last line, we see that indeed $\ddot{x} \sim O[v^3/c] \ll 1$ for $c \gg 1$). Thus the leading-order feedback from the slow variables onto the fast dynamics comes from the $x$-dependence of $K$, over which we have full control, making for a convenient toy model. We also independently assume that $b \ll 1$ so that the fast dynamics are close to the ideal kicked rotor and retain its features.

B. Analytical Evaluation
For large $K$ (above the dynamical regularization threshold, i.e. $K \gtrsim 5$), the steady-state of the fast dynamics is fully chaotic, and thus thermal: we assume thermalization of the entirety of the drive energy among the fast fluctuations, as happens in [26] for example. This way, the steady-state distribution is Boltzmann, which is here uniform over $\theta$ and Gaussian over $v$, with a variance that we call $T_R$ (rotor temperature). The symmetry of this state over $\theta$ and $v$ gives $\langle F_x \rangle_{\rm s.s.} = 0$, making the fluctuations dominant. The only remaining parameter we need to find is then $T_R$, which is fully constrained by energy balance as follows. In general, keeping an ergodic system at an effective temperature that is higher than that of its bath requires dissipation [27] (stated for 1D systems with mass = 1). Moreover, we can find the power exerted by the kicking force to be $P = K^2/4$ in the chaotic regime, which in the steady-state must balance the dissipated power. This allows us to extract the effective rotor temperature $T_R$. This, however, only gives us information about the fast behavior, while the $x$-noise correction that we want will also depend on the nature of the rotor-cart coupling. We therefore need to evaluate the correlations of the force $F_x$ on the cart, which set the noise correction $\delta T_x$. The calculation is relatively straightforward and detailed in sec. B 2, where we also show that the $\gamma$ damping-coefficient correction is 0 by symmetry of the $(\theta, v)$ distribution. We find that, while the inertial term can be ignored at leading order, the correlations of the centripetal force $F_c$ give us $\delta T_x = \frac{1}{2c} \int dt\, \langle F_c, F_c \rangle = K^2/16c$. Note here that this multiplicative noise correction should be interpreted in the Itô sense, as the $\langle F_c, F_c \rangle$ correlations decay on a time-scale faster than the kick period (see Appendix, sec. A 2).
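The energy-balance argument can be written out as a short worked sketch. It assumes that the rotor obeys linear damping with coefficient $b$ and thermal noise at bath temperature $T_0$ (so that the mean heat flow to the bath is $b(\langle v^2 \rangle - T_0)$ for unit mass), that each kick changes the velocity by $K \sin\theta$, and that $\theta$ is uniformly distributed and independent of $v$ in the chaotic steady-state. The specific form of the dissipation law below is this assumption, not a quotation of [27].

```latex
\begin{align}
  \dot{Q} &= b\,\bigl(T_R - T_0\bigr), \\
  P &= \Bigl\langle \tfrac{1}{2}\,(v + K \sin\theta)^2 - \tfrac{1}{2}\,v^2 \Bigr\rangle
     = K\,\langle v \sin\theta \rangle + \frac{K^2}{2}\,\langle \sin^2\theta \rangle
     = \frac{K^2}{4}, \\
  \dot{Q} = P \quad &\Longrightarrow \quad T_R = T_0 + \frac{K^2}{4\,b}.
\end{align}
```

The cross term vanishes because $\theta$ is uniform and uncorrelated with $v$, and the energy gained per kick equals the power because time is measured in units of the kicking period.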
For $K \lesssim 5$, on the other hand, the rotor moves periodically in one of the regular attractors. This means that the cart experiences no additional stochasticity other than that from the thermal bath, giving a low $T_{\rm eff} = T_0$, but as some of these attractors spontaneously break the left-right symmetry of the problem, we get $\langle F_x \rangle_{\rm s.s.} \neq 0$. As the motion in most of these attractors is very simple ($n \in \mathbb{Z}$ full revolutions of the rotor per kick), we can estimate this force explicitly: [...]$_n/2c$, where $v_n \equiv 2\pi n$, and $\theta(t)$ and $v(t)$ were estimated by integrating the equations of motion (eq. 4) at leading order (see sec. B 3).
Compiling the resulting predictions for the cart motion, we get eq. 6, with $v_n \equiv 2\pi n$ and $n$ some random integer, typically smaller than $O[\sqrt{T_R}]$ (since the rotor first explores its phase-space thermally before finding one of the regular attractors). Another quantity we can easily estimate for the two phases is the energy dissipation rate $\dot{Q}$. Numerical simulations confirm these predictions in fig. 1c, d, and e respectively.

C. Numerical Tests
To verify the above analytical results, we can run numerical simulations of the full system dynamics in eq. 4. To begin, we check the cart dynamics for different values of ($x$-independent) $K$ (and $U(x) = 0$). Fig. 1b shows typical cart trajectories for $K$ in the regular and chaotic regimes. More systematically, plotting the apparent average drift $\langle F \rangle$ and fluctuations $T_{\rm eff}$ for multiple realizations at each $K$, we get the plots in panels c and d of fig. 1 respectively. We thus see quantitative agreement between the prominent features of these plots and the results of eq. 6, shown here as black lines. Finally, fig. 1e shows the heat dissipation rate in the different possible steady-states, showing that while lowering $T_{\rm eff}$ corresponds to decreased dissipation within the chaotic phase, this rule is violated if we enter a regular dynamic attractor.
Note that as the original problem is stated exactly, and our method allows for full analytical treatment of the slow variables, there are no fitting parameters in any of the curves we are comparing against throughout the numerical study. We use $c = 5 \times 10^4$, $b = 0.1$ for all simulations, and to emphasize the effects from fluctuations of the fast variables, we take the actual thermal bath to be at a vanishingly low temperature $T_0 \sim 0$, unless otherwise stated.
While fig. 1 shows agreement of the one- and two-point functions of the cart position with our analytical prediction, we have yet to check that the fast dynamics can really be approximated by an effective thermal bath. One convincing way to do this is to introduce a non-trivial potential landscape $U(x)$ acting on the cart's position $x$, and check the resulting steady-state distribution $p(x)$ against Boltzmann statistics at the predicted temperature $T_{\rm eff}$. Figure 2a shows the agreement between the histogram produced by this simulation and the curve for the expected Boltzmann distribution.
To see that the $T_{\rm eff}(x)$ landscape remains the appropriate description even for non-uniform $K(x)$, we can calculate the steady-state distribution in a $K(x)$ landscape, now letting $U(x) = 0$. The expected distribution for free diffusion in a temperature landscape can easily be found using, e.g., the Fokker-Planck equation, and gives $p(x) \propto 1/T(x)$ (note that this arises precisely because our effective slow dynamics have Itô multiplicative noise; for Stratonovich it would be $1/\sqrt{T(x)}$). This is well confirmed by simulations in fig. 2b, thus showing that at least in the steady-state, probability density does indeed collect in low-temperature regions.
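The two stationary distributions quoted above follow from a short calculation, sketched here under the assumption of a one-dimensional process $\dot{x} = \sqrt{2\,T(x)}\;\xi$ with unit mobility, no mean force, and zero probability current in the steady-state (for example, with reflecting or periodic boundaries). The unit-mobility normalization is a simplifying assumption for illustration.

```latex
\begin{align}
  \text{It\^o:} \quad
  \partial_t p &= \partial_x^2 \bigl[\, T(x)\, p \,\bigr]
  \;\;\Rightarrow\;\; \partial_x \bigl[\, T\, p \,\bigr] = 0
  \;\;\Rightarrow\;\; p(x) \propto \frac{1}{T(x)} , \\
  \text{Stratonovich:} \quad
  \partial_t p &= \partial_x \Bigl[ \sqrt{T(x)}\; \partial_x \bigl( \sqrt{T(x)}\; p \bigr) \Bigr]
  \;\;\Rightarrow\;\; \partial_x \bigl[ \sqrt{T}\; p \bigr] = 0
  \;\;\Rightarrow\;\; p(x) \propto \frac{1}{\sqrt{T(x)}} .
\end{align}
```

In both conventions probability accumulates where $T(x)$ is low; the Itô result is simply the more strongly peaked of the two.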
The last natural test that we mention here is to see how the $T_{\rm eff}(x)$ landscape can counteract the forces of $U(x)$ [...] right wells respectively), as shown in fig. 2c (the analytical prediction is given by eq. 8). In the limit of a discrete jump process between the two wells (wells with equal internal entropy separated by a high barrier), this exact solution becomes well approximated by that obtained from current-matching with the expected jump rates. The key non-equilibrium feature in these solutions is the dependence of the probabilities in either well on the barrier height $U(0)$ via $\Delta$: for higher barriers the temperature difference becomes more important. This example gives the first non-trivial application of thermodynamic intuition from the $T_{\rm eff}(x)$ landscape to the solution of our non-equilibrium system. Projecting this concept onto a broader context, we note that this setup is a particular realization in the class of problems of iterative annealing (e.g., used in chaperoned protein folding [28], etc).

FIG. 2 (partial caption):
[...] as in a. and $K(x)$ step-function (analytical prediction plotted in black is given by eq. 8). d. $U(x) = \text{const}$ and $K(x)$ as in b., but shifted down to dip below the critical $K_c \sim 5$ value, shown as the dotted red line (the black curve again shows $1/T_{\rm eff}(x)$ outside of the ordered region); this shows that probability gets localized at the two transition points at long times.

FIG. 3 (partial caption):
[...] (brown). Black curve shows the analytical prediction for the median, while the mean is expected to be constant at small times. Inset shows the regularization transition at $K_c \sim 5$, where the effective temperature drops abruptly to 0, and the median departs from the smooth decay. The x-axis is labelled in units of $K$.

D. Least rattling
Having confirmed the steady-state and thermal properties of the slow behaviors, we next want to look at the predictive power of our formalism for transient behaviors and currents, again in the presence of inhomogeneous fast dynamics. The first example we consider is transient cart motion in a linearly varying $K(x) = \kappa\, x$. The simulation results are shown in fig. 3. As mentioned above, free diffusion in a temperature gradient results in a median drift to low $T$, as observed here. Explicitly, the slow dynamics in this case are $c\,\dot{x} = \kappa$ [...]; the latter is plotted in black in fig. 3 and reproduces well the simulation result in brown. Note that for any finite ensemble of trajectories, or for a bounded system, the mean will eventually go to low temperatures as well, but not as cleanly or predictably, so the constant mean value is not practically realizable at long times. The inset focuses on the crossover into regular dynamics, where we see that the symmetry-broken drift force $F_x$ can either take the cart to the $K = 0$ absorbing state (as detailed in the further inset), or back out into the chaotic regime. In the latter case, the cart typically diffuses back down to the transition again. The resulting oscillations cause a (transient) accumulation of probability around the critical point, giving a peculiar realization of self-organized criticality. This critical region itself is also interesting, as the correlations of the fast variables persist for long times and can thus break the time-scale separation assumption, but this will have little effect on the global system behavior. The overall takeaway here is the emergent "least rattling": slow dynamics drift towards regions where fast ones are less stochastic.
To further illustrate the importance of the regularization transition for the slow dynamics, we consider the probability distribution $p(x)$ in the presence of a $K(x)$ landscape (and no potential, $U = 0$), as in fig. 2b, but shifted down such that it dips slightly below the regularization transition at its lowest point (fig. 2d). The resulting small zero-temperature region in $x$, corresponding to integrable fast dynamics, becomes absorbing, collecting most of the probability density over time (see fig. 2d). Note again that probability accumulates at the critical transition points, giving the two-pronged shape. We stress here the observed sharp localization transition of the steady-state distribution as soon as some regular regime of the fast dynamics becomes accessible; i.e., the slow variables find the regularized region even if it requires some fine-tuning. (This trade-off between least-rattling and entropic forces can be made quantitative.)

E. Anomalous diffusion
The last example we present shows that besides an effective temperature landscape, the regular dynamical phase accessible to this model can give rise to apparent anomalous diffusion. To begin, fig. 4a shows an implementation of a Büttiker-Landauer ratchet using our model: periodic $U(x)$ and $K(x)$ landscapes with a relative phase-offset of $\pi/2$ create a pumped steady-state current, in this case to the right. Intuitively, this happens because a higher effective temperature in the right half of the potential well makes it easier for the cart to overcome the right potential barrier than the left one. The interesting behavior appears when we shift the $K(x)$ wave downward to straddle the transition point at $K \sim 5$ (fig. 4b). In this case the pumped current reverses direction and becomes an order of magnitude larger, even if we reduce the amplitude of the $K(x)$ variation. To understand this, it helps to look at some typical realizations of barrier-crossing trajectories at the bottom of fig. 4. While in panel a. transitions are achieved by stochastic fluctuations that are exponentially suppressed by the Boltzmann factor, in panel b. they are achieved by a directed symmetry-broken drift force $F > \partial_x U$, and thus the crossing probability is just the probability of the fast dynamics finding the appropriate regular attractors. These ballistic-like trajectories of the cart in the regular regime can be usefully thought of as anomalous super-diffusion with exponent $\alpha = 2$. Also, in so far as the barrier crossing becomes easier as we lower $K(x)$ through the critical value, we can say that the diffusion becomes stronger, thus showing non-monotonicity with $K$, reminiscent of the findings in [29].

IV. DISCUSSION
The equilibrium partition function that is computed for the Boltzmann distribution is a powerful formal tool for making predictive calculations in thermally fluctuating systems. Its success stems from two key simplifying assumptions: first, that energy only enters or leaves the system of interest in the form of heat exchanged at a single temperature, and second, that the system and surrounding heat bath uniformly sample joint states of constant energy. This latter ergodic assumption essentially amounts to eliminating time from the picture, so that energy and probability become interchangeable.
The nonequilibrium scenario is generally less tractable than its equilibrium counterpart, both because time has not been eliminated from our description of the system, and because energy is permitted to enter and leave the system via different couplings to the external environment. The specific approach to modelling some nonequilibrium systems that we have described here seeks to recover some of the advantages of the equilibrium description by exploiting time-scale separation in two ways: first, by only allowing nonequilibrium drives to couple to a fast subset of variables, and second, by "partially removing" time from the picture by replacing the fast variables with a timeless thermal bath approximation. This "conveys" the entire time-dependence of the problem into the resulting effective slow dynamics.
FIG. 4: Simulated steady-state cart-position distributions (panels a and b) for the shown U(x) (blue) and K(x) (red) landscapes (x is periodic). These result in a pumped steady-state current (block arrows), with the typical barrier-crossing trajectories shown at the bottom. Unlike in all the above simulations, a thermal bath temperature T_0 = 10^−4 > 0 was used here to smooth out the critical behavior. Straddling the critical point with K(x) in panel b produces a ten-fold larger (and reversed) current, even for smaller absolute variation in K(x).

Adopting such an approach by no means recovers the simplicity of the equilibrium picture; however, it does give rise to a relatively tractable effective description of the dynamics. As we have seen, slow variables in such a scenario experience not only a mean force landscape from the steady-state of the fast variables, but are also expected to drift in the direction of decreasing fictitious temperature set by the fast-variable force fluctuations. Crucially, the latter effect is non-ergodic, thus somehow capturing the breaking of ergodicity typical of driven dynamics in a simple and tractable picture. We have established that this effective picture is quantitatively predictive of the diffusive and stationary behavior of distributions for such slow variables in a simple rotor-on-cart toy model. The tendency of such systems to gravitate to values of slow variables that reduce the effective temperature of fast ones suggests an interesting relationship between dissipation and kinetic stability in driven systems. Although nonequilibrium steady-states are not in general required to be extrema of the average dissipation rate, it is true that the minimum required dissipation to maintain an effective temperature scales with T_eff. Accordingly, there may be a subset of systems where the drift to lower effective temperature is indeed accompanied by a drop in dissipation. However, for cases where dissipation instead goes to maintaining dynamically regular motions, steady-state behavior might be dominated by a highly dissipative, stable attractor of low T_eff.
Moreover, if fast variables can undergo a dynamical ordering transition that is controlled by the slow coordinates, the corresponding drop in effective temperature can be dramatic. As such, this case opens up the intriguing possibility that dynamical ordering in fast variables might serve as a mechanism for the long-term kinetic stability of slow variables. Moreover, if dynamical ordering can only occur for rare, finely-tuned choices of slow variables, this stability could appear as a tendency toward self-organized fine-tuning in the slow-variable dynamics.
Accordingly, we suggest that an interesting future set of applications for the least rattling approach may lie in the active matter setting, where it is frequently the case that coarse-grained macroscopic features of active particle mixtures relax more slowly than the strongly driven microscopic components. The diversity of self-organized, dynamically ordered collective behaviors exhibited by such systems is well-known [20], and it may be useful to characterize these behaviors in terms of their possibly fine-tuned relationships to driven force fluctuations on the microscopic level. Future work must focus on generalizing our current approach to modelling the dynamics of such coarse-grained variables.

Appendix A: Derivation of Effective slow dynamics
Starting with the explicitly time-scale separated dynamics given by the system in eq. 1, the first step is to bring our small parameter ε ≡ µ/η ≪ 1 explicitly into the equations by rescaling time t → µ t, which gives (eq. A1): This will allow us to do a systematic expansion in ε below. For later convenience, let us explicitly introduce the time-scales of slow, τ_S ∼ O[1/ε], and fast, τ_F ∼ O[1], relaxation. Next we want to integrate out the fast degrees of freedom y_i, which we can do explicitly by writing down the Martin-Siggia-Rose (MSR) path integral expression for the probability of a particular slow trajectory x_a(t), as given in the first line of eq. 2. For clarity of notation, we have represented the full path-integration over the fast dynamics as the average (eq. A2): Note that so far, this is defined for a specific fixed slow trajectory x(t). With this notation set, we now observe that the only y-dependence that the average can act on in eq. 2 is that in F(x, y, t). Thus, all the other terms can be taken out of the average, while the remaining small F_a exponent can be treated with a cumulant expansion (eq. A3): As this is the only part of the path integral in eq. 2 that carries the coupling to fast dynamics, it will source all the interesting emergent effects (i.e., coupling renormalizations) for the slow modes, and we thus focus on it for most of this Appendix.

Averages over fast dynamics
Before getting into the physical implications of the different terms in the expansion, let us discuss how to calculate the averages ⟨O⟩_{y|x(t)}. As defined in eq. A2, these averages are just shorthand for path integrals over the full fast dynamics in the presence of arbitrarily time-varying slow variables x_a(t), and hence at this point are merely formal, not very informative, quantities. On the other hand, intuitively we know that at the lowest order in ε, these averages should reduce to averages over the y-steady-states under fixed x: p_ss(y|x). To derive this result, as well as the first correction in ε, we need to once again develop a systematic expansion. Besides O (see below), the only dependence on the trajectory x(t) in eq. A2 comes in through the force f_i (and similarly in the partition function), which by the time-scale separation assumption will vary only slightly on the fast time-scale τ_F: Plugging this expansion into eq. A2 and Taylor expanding both numerator and denominator (normalization Z_y) in ε, we get (eq. A4): The averages here, ⟨O⟩_{y|x(t_0)}, are at a fixed x, and thus are precisely the averages over y-steady-states p_ss(y|x(t_0)). Note also that at this order, the possible x-dependence of O is accounted for at the slow time-scale and does not give any additional contributions here. At the next order in ε, however, the variations of O and f_i on the relaxation time-scale τ_F begin to interact, giving new contributions. Generally, Feynman diagrams are the only practical way to go to higher orders, as the number of correction terms potentially becomes large. We do not employ diagrams here because they are not practical for the general context we are working with, but they do become very useful in specific examples. Applying the result in eq. A4 to our cumulant expansion A3, we get (eq. A5): Remembering the form of eq. 2, we recognize the correction term here as a correction (or renormalization) of the damping coefficient of the original slow dynamics. Crucially, this correction comes in at the same order in ε as the O[ε²] term in eq. A3, and must thus be kept in our expansion. By the same token, the equivalent correction of the ⟨F_a(t), F_b(t')⟩_{y|fix x(t)} term only comes in at a higher order and is thus ignored at this stage. Nonetheless, it is interesting to note that including higher-order corrections would also introduce an inertia-like term into our slow dynamics (mass renormalization, ∝ ẍ, as discussed in [9]), as well as higher derivatives at progressively higher orders.

Noise correction
The key thing to note from the above results is that at O[ε²], all we need to compute the slow dynamics are the one- and two-point functions of the various variables in the y-steady-states. Thus we need neither the full form of the steady-state distribution of the fast variables, nor the deviations from this steady-state under dynamic x(t). This result should be thought of as (and really is a form of) the central limit theorem. Now that we have a sense of what the different terms in the cumulant expansion A3 mean mathematically, we can turn to their physical implications. We already mentioned the correction of the x-damping coefficient that we get by resolving the ⟨F_a⟩_{y|x(t)} term in terms of y-steady-states. The only other contribution at this same order is the second term in the expansion A3. This will contribute an additional noise term to the resulting slow dynamics, as it will enter the path integral along with T, correcting the x̃² operator. However, this noise term would only be white if ⟨F_a(t), F_b(t')⟩_y ∼ δ(t − t'), which in general need not be the case, hence making the noise correction colored. Intuitively, we see that because of time-scale separation, y fluctuations will decorrelate much faster (δt ∼ O[1]) than the slow time-scale we are sampling by observing x (τ_S ∼ O[1/ε]). This makes the short-range correlations of the noise correction unimportant for the slow evolution, allowing us to approximate it by white noise.
More formally, this situation is precisely identical to having a UV cutoff in a field theory given by, e.g., finite lattice spacing. Similarly, taking the white-noise approximation here corresponds to sending such a cutoff to infinity, which is justified as long as all our observables are confined to energy-scales (or here time-scales) far lower than said cutoff. Explicitly, the approximation we are making (which formally comes from the assumption of RG universality in the fast dynamics) reads (eq. A6): One reason why we must be careful in taking this white-noise limit is that the precise limiting procedure will determine whether the correct interpretation of the resulting multiplicative white noise is Itô or Stratonovich, resulting in observable consequences on the slow time-scale. This question is related to the choice of the point x at which to evaluate the y-steady-state in the RHS of eq. A6, as well as, independently, where to evaluate the explicit dependence F_a(x). To avoid very messy notation, we assume away the latter point by restricting to the additive form F_a(x, y, t) = F_a(x) + F_a(y, t), i.e., an x-only part plus a (y, t)-only part, which then makes the above expression A6 depend on x only via the steady-state p_ss(y|x) (the x-only part drops out altogether as it only contributes to the disconnected cumulant): Again, this restriction is not necessary and is taken here for convenience.
Thus, we see that if we discretely change x, two separate time-scales (both fast, ∼ O[1]) control the relaxation of δT(x): τ_F, on which p_ss(y|x) globally relaxes to its new form (i.e., the relaxation time of one-point functions), and τ_F2, on which the two-point function ⟨F_a(y, t), F_b(y, t')⟩_{y|fix x} decays. If τ_F ≪ τ_F2, then we have the usual result that the white-noise limit of multiplicative colored noise should be interpreted as Stratonovich (see Ch. 6.5 in [12]). On the other hand, for τ_F ≫ τ_F2 we see that p_ss(y|x) remains essentially fixed while the noise correlations decay, and so the noise amplitude must be evaluated according to the value of x at the beginning of the τ_F2 interval, i.e., in the non-anticipating Itô convention. Note that while both limits are possible, the latter is typical, especially for many-body systems, since the relaxation of p_ss(y|x) proceeds via relaxations of two-point functions throughout the system. Finally, note that the same Itô / Stratonovich ambiguity occurs in the expression for the damping correction A5 and is resolved in exactly the same way as here.
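The observable difference between the two conventions can be seen in the simplest possible setting. The following is an illustrative special case, not the general result of this Appendix: one slow variable with constant damping γ, no mean force, a smooth temperature landscape T(x), and zero probability flux in the steady state.

```latex
% Illustrative special case: \gamma\,\dot{x} = \sqrt{2\gamma T(x)}\;\xi(t), zero steady-state flux.
\begin{align}
\text{It\^o:}\quad & \partial_t p = \frac{1}{\gamma}\,\partial_x^2\!\big[T(x)\,p\big]
  &&\Rightarrow\quad p_{ss}(x) \propto \frac{1}{T(x)}, \\
\text{Stratonovich:}\quad & \partial_t p = \frac{1}{\gamma}\,\partial_x\!\Big[\sqrt{T(x)}\;\partial_x\!\big(\sqrt{T(x)}\,p\big)\Big]
  &&\Rightarrow\quad p_{ss}(x) \propto \frac{1}{\sqrt{T(x)}} .
\end{align}
```

In both conventions probability accumulates where T(x) is low, but the accumulation is stronger in the Itô case, which is why the choice of convention has measurable consequences for the slow variable.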

Compiling results
Finally, we are in a position to put everything together.
We use our final expressions for damping A5 (δγ ab (x) ≡ dt iỹ i ∂ b f i t , F a t y|f ix x(t) ), and noise correction A6 (δT (x) ≡ 2 dt F a (y, t), F b (y, t ) y|f ixx ) in the cumulant expansion A3, and plug that into the full expression for the probability distribution over slow paths 2 to get: The resulting path integral can then be used to extract the corresponding Langevin equation for the slow dynamics: where ξ a is the usual white noise: ξ a (t), ξ b (t ) = δ ab δ(t − t ), and the square root of the matrix D ab is defined by the condition √ D . √ D T = D. Finally the notation is used to denote the Itô or Stratonovich dot according to the conditions described in the last section: τ F and τ F 2 are the decay time-scales for the one-and two-point functions of the fast dynamics respectively. This is then the main analytical result of our work, shown in eq. 3 for the more common Itô case.

Equilibrium: sanity check
Now that we have the effective slow dynamics for general stochastic systems with time-scale separation, we want to check that in the equilibrium case we recover the expected fluctuation-dissipation relation: D_ab(x) = T γ_ab(x). Equilibrium in our original system corresponds to the lack of any driving forces: all the forces must then come from gradients of a single potential landscape U(x_a, y_i), with F_a = −∂_a U and f_i = −∂_i U. Focusing on the expression for γ_ab above, we note that in this case ∂_b f_i = −∂_b ∂_i U = ∂_i F_b. We then note that the response field for the force F_b is given by F̃_b = ỹ_i ∂_i F_b when x is fixed, as it is here. Finally, in MSR we know that ⟨iF̃_b, F_a⟩ gives the linear response function for F, and so by the fluctuation-dissipation theorem ⟨iF̃_b(t), F_a(t')⟩ = ∂_t ⟨F_b(t), F_a(t')⟩ / T for t < t' (zero otherwise). Using this in the above expression and integrating by parts: (The factor of two dividing the integral comes from the fact that while the correlator is time-symmetric, the response function is causal.) We thus recover the desired result.

Fast dynamics and T ef f (x)
The last question we must address is why the effective temperature T_eff(x) found above generally correlates with how chaotic the fast variables are. In the case where our fast dynamics undergo a phase transition, we clearly see that under integrable dynamics (zero Lyapunov exponents), the connected correlator ⟨F_a(y, t) F_b(y, t')⟩ − ⟨F_a(y, t)⟩⟨F_b(y, t')⟩ vanishes (or is proportional to the small thermal bath temperature). By the same token, in the chaotic phase (Lyapunov exponents comparable to the inverse characteristic time), the averages ⟨F_a(y, t)⟩ are insensitive to the amplitude of the chaotic fluctuations (by symmetry), and thus we get a high T_eff. This is the case in the toy model we studied, as illustrated in Fig. 5.
The issue is more subtle, however, when we are not explicitly considering a phase transition in the fast behavior. For example, consider a system that can have chaotic behavior as well as regular self-oscillations, but with slow random phase-drift. In this case, the steady-state probability in both regimes is distributed throughout the accessible configuration space, with the only distinguishing feature being the correlation decay time τ_F2, which scales inversely with the Lyapunov exponents λ_Lyap. It turns out that in this case also, T_eff ∝ 1/τ_F2 ∝ λ_Lyap is higher for more chaotic systems, as long as τ_F2 > τ_char, the characteristic return time of the fast dynamics.
We can motivate this claim by first realizing that if the fast steady-state is confined to a finite configuration-space region, then it must have some cyclicity with a finite characteristic return time τ_char. This means that the correlator ⟨F_a(y, t), F_b(y, t')⟩, besides exponentially decaying, will also fluctuate (not necessarily periodically), with persistence time ≤ τ_char (depending on the details of the fast-slow coupling). Thus, we can Fourier transform the force correlator as ⟨F_a(y, t), F_b(y, t')⟩ ∼ e^{−t/τ_F2} ∫_{ω_0}^∞ dω f(ω) cos(t ω), with t on the right-hand side denoting the time separation, where the infra-red cutoff ω_0 = 2π/τ_char is given by the fact that the fast dynamics have no time-scales longer than τ_char. Integrating this correlator to recover the effective temperature, we see that T_eff(x) ∝ ∫_{ω_0}^∞ dω f(ω) / (ω² τ_F2) as long as τ_F2 > τ_char, as stated above. Of course, all this assumes that the amplitude of the force fluctuations stays roughly the same as their correlation time changes, but the systems we are interested in are those that exhibit a qualitative change in their Lyapunov exponents, thus making this the dominant effect.
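The scaling with 1/τ_F2 follows from an elementary integral. As a sketch, taking the effective temperature to be proportional to the time integral of the force correlator (prefactors omitted) and writing τ ≡ τ_F2:

```latex
\begin{align}
\int_0^\infty dt\; e^{-t/\tau}\cos(\omega t) &= \frac{\tau}{1+\omega^2\tau^2}
  \;\xrightarrow{\;\omega\tau \gg 1\;}\; \frac{1}{\omega^2\,\tau}, \\
T_{eff}(x) &\propto \int_{\omega_0}^{\infty} d\omega\, f(\omega)\,\frac{\tau}{1+\omega^2\tau^2}
  \;\approx\; \frac{1}{\tau}\int_{\omega_0}^{\infty} d\omega\,\frac{f(\omega)}{\omega^2} .
\end{align}
```

The large-ωτ limit applies to every frequency in the integral because ω ≥ ω_0 = 2π/τ_char and τ > τ_char, so ωτ > 2π throughout.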

Appendix B: Kicked Rotor on a Cart
In this appendix we present the analytical calculations required to make the predictions for the kicked rotor on a cart toy model described in the main text. For convenience, we reproduce the dimensionless equations of motion here (this time including the direct effect of kicks on the cart in F_x) as eq. B1: The units were chosen such that the rotor arm length, mass, and kicking period are all equal to 1. The part in black gives simple kicked rotor dynamics, the red part weakly (for b ≪ 1) couples it to a thermal bath at temperature T, and the blue part gives the coupling and dynamics of the cart. We assume throughout that the bath temperature here is very low, T ∼ 0, to highlight the effect of the chaotic fluctuations of the kicked rotor. Strong time-scale separation, which here is achieved by assuming c ≫ 1, implies that terms ∝ ẍ will be small. To see when precisely we are justified in dropping these, we estimate their magnitude for the two regimes: the forced regime ("during the kick") and the free rotation. For the forced regime, since δ(t) is a distribution, we can only talk about the integrals: For the free rotation, we can simply differentiate the unforced part of the last line in eq. B1 with respect to time, which gives c ẍ = v³ cos θ to leading order. Thus, to ignore the ẍ terms, we need K v/c ≪ K for the driven regime, and v³/c ≪ b v for free rotation. While the former condition is easy to satisfy for a large c, the latter competes with our additional assumption that b ≪ 1 and can be difficult to satisfy numerically, especially as velocities v can sometimes get very large. We will therefore keep the ẍ sin θ term as an additional perturbative correction to the free dynamics.

Chaotic Kicked Rotor steady-state
To proceed in evaluating the different terms in the expression for the effective slow dynamics (eq. 3 in the main text), we need to find the steady-state distribution over (θ, v) for a fixed cart position x. As mentioned in the main text, for strong driving K ≳ 5, the kicked rotor dynamics are fully chaotic, and thus the steady-state thermalizes all input energy. This immediately implies that the probability distribution is of the form p_ss(θ, v | x) ∝ exp[−v² / (2 T_R(x))], i.e., uniform over θ and Gaussian over v, parametrized by a single number T_R(x), the effective rotor temperature. The symmetries of this distribution guarantee that ⟨F_x⟩_ss = 0.
To find this temperature, we can use the argument from eq. 5 of the main text, which tells us that this steady-state will have a dissipation rate: where for underdamped, forced Langevin dynamics, we have in general: Thus, while v_t is completely uncorrelated with ξ_t, ⟨v_t ξ_t⟩ = 0, v_{t+δt} is correlated only via the thermal noise term, ⟨v_{t+δt} ξ_t⟩ = √(2 T_0 b), and is independent of any interaction or driving forces F(x, v, t). This gives, for mass = 1: (note that if v were a vector in d dimensions, we would multiply this expression by d). With this, and neglecting the bath temperature T_0, we can balance the work flow in and heat flow out per kick, to get: where v_pre is the pre-kick velocity, which is uncorrelated with θ. Since this gives the variance of typical rotor velocities, we can use it to simplify the time-scale separation condition derived above, v³/c ≪ b v, down to K/c ≪ b², which is quite difficult to satisfy in addition to b ≪ 1. Thus, while this result is correct, it turns out that the O[1/c] correction coming from the cart coupling term ẍ cos θ is very important here. Note that eq. 3 in the main text tells us to include the x-dependence of the steady-state in correcting the effective damping coefficient γ; however, as this term depends only on ẍ and not on x itself, the steady-state does not gain any x-dependence from it, but rather an x-uniform correction which must be included directly. In this case the dynamics are still chaotic and the distribution thermal, and the work extracted from the drive δW is not affected (since any work done on the overdamped cart is immediately dissipated and so can be ignored), so the coupling to the cart simply adds another channel for heat dissipation: where we assumed that v and θ are uncorrelated during most of the time 0 < t < 1, as justified below.
This correction significantly lowers the typical velocities and is well reproduced in the results of the simulations for large, but practical, values of c (e.g., for b = 10^−2, c = 10^4, K = 10 ⇒ T_R ≈ 375 < K²/4b = 2500, or for b = 10^−1, c = 5 × 10^4, K = 10 ⇒ T_R ≈ 230 < K²/4b = 250).
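The uncorrected values K²/4b quoted above can be checked by a short quadrature. The sketch below assumes, as in the text, that the pre-kick velocity is uncorrelated with a uniformly distributed θ, that a kick changes the velocity by −K sin θ, and that for T_0 → 0 weak damping removes heat at a rate b⟨v²⟩ = b T_R per kicking period. The function names are illustrative, and the O[1/c] cart correction is deliberately not included.

```python
import numpy as np

def mean_work_per_kick(K, v_pre, n_theta=4096):
    """Average kinetic energy gained in one kick, for unit mass.

    The kick maps v_pre -> v_pre - K*sin(theta); theta is averaged over a
    uniform grid, i.e. it is assumed uncorrelated with v_pre.
    """
    theta = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    v_post = v_pre - K * np.sin(theta)
    return np.mean(0.5 * v_post**2 - 0.5 * v_pre**2)

def rotor_temperature_no_cart(K, b):
    """Balance work in per kick against heat out b*T_R per period (T_0 neglected)."""
    return mean_work_per_kick(K, v_pre=0.0) / b

work_slow = mean_work_per_kick(10.0, v_pre=0.0)
work_fast = mean_work_per_kick(10.0, v_pre=37.0)  # arbitrary pre-kick velocity
T_R_a = rotor_temperature_no_cart(10.0, 1e-2)     # b = 10^-2, K = 10
T_R_b = rotor_temperature_no_cart(10.0, 1e-1)     # b = 10^-1, K = 10
```

The work per kick comes out as K²/4 regardless of the pre-kick velocity, because the cross term in v_pre sin θ averages to zero, and the two temperatures reproduce the uncorrected estimates 2500 and 250 against which the simulated T_R values are compared.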

Cart Damping and Noise correction
With the above understanding of the steady-state, we now proceed to compute the two-time correlation functions needed to get the corrections to the slow dynamics given by eq. 3 of the main text. We begin by noting their general structure here: each kick introduces correlations between θ and v, after which, while v remains approximately constant until the next kick (for b ≪ 1), θ spins around and the correlations decay. The typical decay time-scale in this system can be estimated by looking at the decay of ⟨sin θ(0) sin θ(t)⟩_ss = ⟨sin θ(0) sin(θ(0) + v t)⟩_ss, where we assume θ and v to be uncorrelated over the time window. This gives a decay time τ ∼ 1/√T_R, which for typical values of parameters (e.g., for b = 10^−1, c = 5 × 10^4, K = 10) could be around 1/20. The key here is that in most cases the decay time is much shorter than 1 (the kicking period). Figure 5 shows numerical results for the ⟨F_x(t), F_x(t')⟩ correlation in the chaotic phase to give a sense of how these quantities typically look for the given system. Since the θ − v correlations are only generated by kicks, this implies that most of the time they are uncorrelated, as we have assumed a few times above. Moreover, this means that our noise and damping corrections should always be interpreted as Itô for this system, as discussed in Appendix A 2 of the main text.
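The decay estimate can be made explicit under the assumptions already stated: θ(0) uniform on [0, 2π), v Gaussian with variance T_R, and the two uncorrelated over the time window. The average over θ(0) uses sin a sin(a + φ) = [cos φ − cos(2a + φ)]/2, and the average over v is a Gaussian characteristic function:

```latex
\begin{align}
\big\langle \sin\theta(0)\,\sin\!\big(\theta(0)+v t\big)\big\rangle_{ss}
  &= \tfrac{1}{2}\,\big\langle \cos(v t)\big\rangle_{v}
   = \tfrac{1}{2}\,e^{-T_R\,t^2/2} .
\end{align}
```

The correlation therefore falls off on a time-scale τ ∼ 1/√T_R, which is short compared with the kicking period whenever T_R ≫ 1.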
Using this result we can immediately see that the damping correction vanishes (see eq. B1 for definitions of f_θ, f_v): as the correlator vanishes by the symmetries of the thermal steady-state.
With that, to find the effective temperature experienced by the cart, we need only compute the noise correction, as: From eq. B1, we get F_x = v² sin θ + K(x) sin θ cos θ δ(t − n) − ẍ sin²θ = (centripetal F_c) + (direct kick coupling F_k) − (inertia F_i). Unlike in f_v, where the term b v was of comparable magnitude to ẍ cos θ, here F_c and F_k are both > O[1], and thus the inertia F_i is distinctly sub-leading and can be dropped. We now proceed to individually compute the ⟨F_c, F_c⟩, ⟨F_k, F_k⟩, and ⟨F_c, F_k⟩ = ⟨F_k, F_c⟩ contributions. For the ⟨F_c, F_c⟩ term, we see that far from the kicks, where θ and v are uncorrelated, we get ∫dt ⟨v² sin(θ + t v), v² sin θ⟩_{(θ,v)|fix x} = 0, which can be evaluated analytically in Mathematica. The leading correction to this quantity then comes from the θ − v correlations generated by the kicks.
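The vanishing of the uncorrelated ⟨F_c, F_c⟩ integral can also be seen by hand. As a sketch, assume the time integral runs over all t, and that θ is uniform and independent of v with a smooth velocity density p(v). The disconnected part vanishes because ⟨sin θ⟩ = 0, and the remaining time integral produces a delta function in v:

```latex
\begin{align}
\int_{-\infty}^{\infty} dt\,\big\langle v^2\sin(\theta+t v)\,,\,v^2\sin\theta\big\rangle
  &= \tfrac{1}{2}\int_{-\infty}^{\infty} dt\,\big\langle v^4\cos(t v)\big\rangle_v
   = \pi\int dv\;p(v)\,v^4\,\delta(v) = 0 .
\end{align}
```

The factor v⁴ kills the only contribution, at v = 0, so any finite result must come from the correlations that the kicks build between θ and v.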
To capture these, we write all velocities and angles in terms of their values before the last kick, at a time when they were guaranteed to be uncorrelated. Thus v(t) = v_pre for t < 0 and v(t) = v_pre − K sin θ_pre for t > 0; likewise θ(t) = θ_pre + t v_pre for t < 0 and θ(t) = θ_pre + t (v_pre − K sin θ_pre) for t > 0. Using these expressions, we can evaluate ∫dt ∫dt' ⟨F_c(t), F_c(t')⟩ piecewise (where a second integral must be included since time-translation invariance is now broken). Dropping the subscript pre, we have: where all integrals can be evaluated analytically (in Mathematica) if we take the time integrals first, and then average over the (uncorrelated) θ and v. Similarly for the other terms: Adding up all four terms, we thus get T_eff = (1/2c) ∫dt ⟨F_x(t), F_x(t')⟩_{(θ,v)|fix x} = 0. Here we clearly see that the cancellation comes about due to the anticorrelations between the centripetal force and the kick coupling. In this particular case, the cancellation is somewhat accidental, and is a consequence of the simplicity of the system: the functional form of the couplings is quite restricted. In general, we expect such cancellations to be unlikely in higher-dimensional systems. As discussed in the main text, to make the system interesting and get a finite T_eff, we can simply eliminate the direct kick-cart coupling F_k from the dynamics altogether, with the physical interpretation of "pinning" down the cart at the instant of the kick. This leaves only F_c, thus giving T_eff = K²/16c, as desired.
Note also that the rotor temperature T_R calculated above ends up dropping out and does not affect any of the time-integrated correlators, but only the particulars of their time-dependence, as in ⟨F_x(t), F_x(t')⟩_{(θ,v)|fix x}. Thus the only really key role it played for us was to show that these correlators decay faster than the kicking period.

Ordered KR steady-state
On the other hand, for weak driving K ≲ 5, the kicked rotor undergoes dynamic regularization, and in steady-state is found in one of the integrable attractors in its phase space. Thus, none of the above arguments apply here. Instead, the main regular regions correspond to the rotor completing n full revolutions per kick, with n = ..., −2, −1, 0, 1, 2, .... As it does, there are no stochastic fluctuations other than those from the thermal bath, and as the steady-state lacks the symmetries of the thermal state, ⟨F_x⟩_ss ≠ 0 except at n = 0. In fact, depending on the attractor that the rotor falls into, it will exert a persistent force on the cart, causing constant directed drift. We can easily estimate this drift force for the n'th attractor as (here we again assume that b ∼ O[1/c] ≪ 1, and let v_n ≡ 2πn): where v(t) = 2πn − b v t − 1 c and the v(t) dynamics are found by directly integrating eq. B1 to first order in small parameters. However, as it is impossible to predict which of the attractors will be chosen, we cannot tell a priori the direction or speed at which the cart will be moving, though the options are restricted to the above small discrete set of possibilities parametrized by n.