Violation of Local Detailed Balance Despite a Clear Time-Scale Separation

Integrating out fast degrees of freedom is known to yield, to a good approximation, memory-less, i.e. Markovian, dynamics. In the presence of such a time-scale separation local detailed balance is believed to emerge and to guarantee thermodynamic consistency arbitrarily far from equilibrium. Here we present a transparent example of a Markov model of a molecular motor where local detailed balance can be violated despite a clear time-scale separation and hence Markovian dynamics. Driving the system far from equilibrium can lead to a violation of local detailed balance against the driving force. We further show that local detailed balance can be restored, even in the presence of memory, if the coarse-graining is carried out as Milestoning. Our work establishes Milestoning not only as a kinetically but for the first time also as a thermodynamically consistent coarse-graining method. Our results are relevant as soon as individual transition paths are appreciable or can be resolved.

When the underlying degrees of freedom can assume continuous values, any coarse-graining that lumps states as shown in Fig. 1a inherently leads to non-Markovian jump dynamics in continuous time [40,41] due to fast re-crossings in the transition region between A and B. Notably, these can nowadays be experimentally resolved [42-47] and are therefore important in practice. Conversely, Milestoning [48,49] (see [50-52] for a broader perspective) turned out to be a coarse-graining scheme that allows for a kinetically consistent mapping of high-dimensional dynamics onto a drastically simplified Markov-jump process [41,53]. The state space is dissected into hypersurfaces which may enclose sub-volumes that are called "cores" [41,53]. Fig. 1b depicts two such cores A and B, whereby the color of the trajectory encodes the last visited core. Beyond a short transient, Markov-jump dynamics emerges from the coarse-graining whenever the trajectory upon leaving any core either (i) quickly returns to it or (ii) quickly transits to the next core [41]. Here, condition (i) ensures a local equilibration prior to leaving a state, which is required for the emergence of local detailed balance [2-5]. Besides being kinetically consistent, Milestoning offers two main advantages over lumping.
First, in experiments probing low-dimensional observables one may be able to separate pairs of metastable states even if their projections onto the observable overlap [54]. This is illustrated in Fig. 1c, where two seemingly overlapping metastable states in the projected space x are resolved by choosing the respective milestones outside the overlapping region. Whenever a milestone is left, the trajectory rapidly returns or quickly transits to the other milestone. Thus, the last visited milestone to a good approximation reflects the currently visited metastable region in a possibly higher-dimensional (here 2d) underlying space. Second, we recently discovered that Milestoning naturally ensures local detailed balance in the presence of a time-scale separation [32]. Surprisingly, this extends even to systems without a clear time-scale separation, which we investigate further below. Notably, with so-called "dynamical coring" [54,55] one can, under certain conditions, convert a "lumped" process into a "milestoned" process by manually discarding short recrossing events such as those shown in Fig. 1a.
In this Letter we show, by means of a simple yet biophysically relevant example, that time-scale separation, surprisingly and against common belief, does not ensure the existence of local detailed balance. The minimum time-scale separation required for Eq. (1) to hold may grow exponentially with the thermodynamic driving force. In other words, time-scale separation may not suffice arbitrarily far from equilibrium. Milestoning, in stark contrast to lumping (see Fig. 1), robustly ensures local detailed balance in the limit of a time-scale separation. This result indicates that, unlike lumping, Milestoning generically yields a thermodynamically consistent coarse-graining.
F1-ATPase driven far from equilibrium.- We consider the molecular motor F1-ATPase driven by the hydrolysis of adenosine triphosphate (ATP). The dynamics evolves as a Markov process on six rotational states [56] as shown in Fig. 2a: The binding of ATP occurs with a rate $\kappa_+$ proportional to the concentration of ATP and effects a 90° rotation. The reverse unbinding occurs with the rate $\kappa_-$. ATP hydrolysis to ADP is assumed to be infinitely fast. The release of ADP occurs with rate $\omega_+$ and triggers a 30° rotation, and the reverse step occurs with rate $\omega_-$. The free energy $\mu$ liberated by the hydrolysis of one ATP → ADP at a given concentration relates to the entropy change times the temperature $T$, and local detailed balance (1) imposes
$$\mu = k_{\rm B}T \ln\frac{\kappa_+\omega_+}{\kappa_-\omega_-}. \qquad (2)$$
Henceforth we measure energies, $\mu$, in units of the thermal energy $k_{\rm B}T$. The steady-state probability to find the ATPase in even and odd states is given by [57]
$$P_{\rm even} = \frac{\kappa_+ + \omega_-}{\kappa + \omega}, \qquad P_{\rm odd} = \frac{\kappa_- + \omega_+}{\kappa + \omega}, \qquad (3)$$
respectively, where we defined $\kappa \equiv \kappa_+ + \kappa_-$ and $\omega \equiv \omega_+ + \omega_-$. The entropy production rate can be expressed with the rate of ATP consumption, $J = P_{\rm odd}\kappa_+ - P_{\rm even}\kappa_-$, via [57]
$$\dot\sigma = \mu J = \mu\,\frac{\kappa_+\omega_+ - \kappa_-\omega_-}{\kappa + \omega}. \qquad (4)$$
This completes the description of the "full" system.
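The steady state of the six-state ring can be checked numerically. The sketch below assumes that odd states are left with total rate $\kappa_+ + \omega_-$ and even states with total rate $\kappa_- + \omega_+$, as described above, and balances the two fluxes. The numerical rates are illustrative assumptions (chosen because they are compatible with the values $\lambda_1 \approx 1.0$ and $\lambda_2 \approx 148.4$ quoted later for $\mu = 20$); they are not taken from a parameter table in this text.

```python
import math

def atpase_steady_state(kp, km, wp, wm):
    """Steady state of the alternating six-state ring.

    kp, km: ATP binding / unbinding rates (odd -> even, even -> odd)
    wp, wm: ADP release / reverse rates (even -> next odd, odd -> previous even)
    Returns (P_even, P_odd, J, entropy production rate) in units of k_B T.
    """
    total = kp + km + wp + wm
    p_even = (kp + wm) / total      # balance: P_odd (kp + wm) = P_even (km + wp)
    p_odd = (km + wp) / total
    current = p_odd * kp - p_even * km          # rate of ATP consumption
    mu = math.log(kp * wp / (km * wm))          # local detailed balance
    return p_even, p_odd, current, mu * current

# Assumed illustrative rates for mu = 20 (not taken from the source figures).
MU = 20.0
RATES = dict(kp=math.exp(MU / 4), km=math.exp(-MU / 2), wp=1.0, wm=math.exp(-MU / 4))
p_even, p_odd, current, sigma = atpase_steady_state(**RATES)
print(p_even, p_odd, current, sigma)
```

With these assumed rates almost all probability sits in the even states, and the current is close to the slow rate $\omega_+$.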
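Lumping.- We now perform a coarse-graining to reduce the six states to three. Two sensible ways to lump the states are shown in Fig. 2b. Assuming Markovian dynamics, the effective forward "+" and backward "−" rates on the lumped space read [15]
$$W^{1+6}_+ = P_{\rm odd}\,\kappa_+, \qquad W^{1+6}_- = P_{\rm even}\,\kappa_-,$$
$$W^{1+2}_+ = P_{\rm even}\,\omega_+, \qquad W^{1+2}_- = P_{\rm odd}\,\omega_-, \qquad (5)$$
and satisfy $W^{z}_+ - W^{z}_- = J$. In terms of effective rates the coarse-grained entropy production reads [15]
$$\dot\sigma_z = (W^{z}_+ - W^{z}_-)\ln\frac{W^{z}_+}{W^{z}_-}, \qquad (6)$$
with $z = 1+6$ or $z = 1+2$, and using Eqs. (4-6) yields
$$\frac{\dot\sigma_{1+6}}{\dot\sigma} = \frac{1}{\mu}\ln\frac{\kappa_+(\kappa_- + \omega_+)}{\kappa_-(\kappa_+ + \omega_-)}, \qquad (7)$$
$$\frac{\dot\sigma_{1+2}}{\dot\sigma} = \frac{1}{\mu}\ln\frac{\omega_+(\kappa_+ + \omega_-)}{\omega_-(\kappa_- + \omega_+)}. \qquad (8)$$
Both ratios (7) and (8) are positive and bounded by 1 [15], i.e., $\dot\sigma_z \le \dot\sigma$ (see also [58]).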
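As a rough numerical illustration of how lumping underestimates the entropy production, the following sketch evaluates the effective rates as the exit rate of a lumped state weighted by the steady-state probability of the exit microstate, and compares the coarse-grained entropy production with $\mu J$. The rate values are assumptions for $\mu = 20$ chosen for illustration only.

```python
import math

def lumped_entropy_ratios(kp, km, wp, wm):
    """Ratio sigma_z / sigma for the two lumpings, assuming Markovian lumped rates."""
    total = kp + km + wp + wm
    p_even = (kp + wm) / total
    p_odd = (km + wp) / total
    mu = math.log(kp * wp / (km * wm))
    current = p_odd * kp - p_even * km
    # (forward, backward) effective rates: exit rate times probability of the exit state
    effective = {
        "1+6": (p_odd * kp, p_even * km),
        "1+2": (p_even * wp, p_odd * wm),
    }
    ratios = {}
    for name, (w_plus, w_minus) in effective.items():
        sigma_z = (w_plus - w_minus) * math.log(w_plus / w_minus)
        ratios[name] = sigma_z / (mu * current)
    return ratios

MU = 20.0  # assumed driving; the rates below are illustrative assumptions
ratios = lumped_entropy_ratios(math.exp(MU / 4), math.exp(-MU / 2), 1.0, math.exp(-MU / 4))
print(ratios)
```

For these assumed rates both lumpings recover only about half of the full entropy production, even though $\kappa \gg \omega$.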
A time-scale separation is manifested as a gap in the spectrum of the Markov generator, which separates fast from slow modes (see Part II in [59]). In our model $\kappa \gg \omega$ and $\kappa \ll \omega$ are the only kinds of time-scale separation, and in principle require two different types of lumping (for details see [60]). At high ATP concentration ($\kappa \gg \omega$) "lumping 1+2" (see dashed boxes in Fig. 2b) hides the fast degrees of freedom $\sim \kappa$. Conversely, at low ATP concentration one should rather lump 1+6 (see solid boxes in Fig. 2b). Note that whenever the entropy production rate is deduced from a master equation [2-4, 6-10, 12-23, 27, 28, 30, 31, 34-38] one explicitly (or implicitly) assumes the observed degrees of freedom to be formally infinitely slower than any possibly hidden ones.
How can we reconcile this? For convenience we focus on $\mu = 20$ and the lumping "1+2", which in fact represents a semi-Markov process of second order [36] (see also [27,37,38]). That is, the waiting time density $\psi_{\pm|\pm}(t)$ depends on both the previous and the next visited state, with the normalization $\sum_{j=\pm}\int_0^\infty \psi_{j|i}(t)\,dt = 1$ for $i = \pm$. In particular, for the given parameters we find a sum of two exponentials,
$$\psi_{j|i}(t) = a_{j|i}\,e^{-\lambda_1 t} + b_{j|i}\,e^{-\lambda_2 t}, \qquad (9)$$
with constant prefactors $a_{j|i}$, $b_{j|i}$, where $\lambda_1 \approx 1.0 \approx \omega$ and $\lambda_2 \approx 148.4 \approx \kappa$. The analytical expression for the waiting time density is immaterial for the present discussion but straightforward to determine. For times $t \gtrsim 0.15$ the jumps are essentially Markovian: the waiting time density is to a good approximation exponential and independent of the previous step, $\psi_{\pm|+} \approx \psi_{\pm|-}$, and the fast decaying mode is negligible, $e^{-\lambda_2 t} \approx 0$. Remarkably, using Eq. (9) one finds $\ln[\psi_{+|i}(t)/\psi_{-|i}(t)] \approx \mu = 20$ for all $t \gtrsim 0.15$ and $i = \pm$. Hence, only short times $t \le 0.15$ encode a violation of Markovianity and broken local detailed balance. At strong driving most of the jumps occur in the positive direction "+" and on average take equally long, $\approx 1/\lambda_1 \approx 1$. In fact, only the backward jump "−" can be faster on average, however, if and only if the preceding jump occurred in the forward "+" direction, i.e., a forward transition is followed immediately by a backward transition. In this case the prefactor of the fast mode $e^{-\lambda_2 t}$ in Eq. (9) outweighs that of the slow mode. These rare events lead to an "overestimation" of the effective backward transition rate, $W^{1+2}_- \gg \omega_-\kappa_-/\kappa$. Note that a locally equilibrated backward rate would need to satisfy $\ln(W^{1+2}_-) \approx \ln(\omega_-\kappa_-/\kappa)$, which is satisfied if in addition to the time-scale separation the individual rates satisfy $\kappa_\pm \gg \omega_\pm$ (see, e.g., [15]). By evaluating exactly the waiting time distribution to include the short-time behavior one is able to restore the entropy production from the two-step affinity via Eq. (10) [36], where the last equality follows from Eqs. (2) and (9) (here $\mu = 20$). Thus, by taking into account the tiny non-Markovian features in Eq. (9) one can in principle recover the entropy production. This, however, poses a serious practical problem at strong driving $\mu \gg 1$. Namely, to deduce Eq. (10) from an experiment we formally require a trajectory with statistically sufficiently many incidents of two consecutive backward steps not interrupted by a forward step. It thus seems that one is required to reliably observe rare events with a probability $\propto (e^{-\mu})^2$, which may not be feasible.
In the following we illustrate how an alternative coarse-graining, Milestoning, effectively restores Markovian dynamics in a thermodynamically consistent manner while it concurrently effectively squares the sample size by relying only on the evaluation of single rare backward jumps that occur with probability $\approx e^{-\mu}$.
Thermodynamic consistency of Milestoning.- We define three milestones (or cores) at locations highlighted by dotted black lines in Fig. 1a. These represent the three odd rotational states. We measure the passages across the milestones (see thick yellow lines in Fig. 1c). If the angle were measured continuously, the passages through the milestones would correspond to instantaneous events [32]. The coarse-grained process at any time reflects the last visited milestone (see blue line). As in Ref. [32] we dissect waiting times into the dwell and transition-path time periods. The dwell time represents all loops returning to the original milestone, while the transition-path time reflects the time of commuting between milestones. The waiting time can be shown to be the sum of the statistically independent dwell and transition-path times (see second main result in [32]). The main advantage of this decomposition is that the statistics of the transition-path time encode information about potentially hidden multidimensional pathways [62] (see also [32,63,64]).
If the gaps between revisitations of the same milestone (see vertical arrow in Fig. 1d) and transition-path times are negligibly short compared to the waiting time in a state, the resulting "Milestoned process" becomes, to a good approximation, Markovian [41]. Note that milestones may represent closed (see [41] and Fig. 1b) or open (see [48,49] and Fig. 1c) hypersurfaces.
Let $\phi_\pm$ denote the splitting probability that the next milestone will be visited in the forward "+" and backward "−" direction, respectively. One can confirm (cf. first main result in [32]) that
$$\ln\frac{\phi_+}{\phi_-} = \mu \qquad (11)$$
holds. That is, Milestoning transition probabilities exactly encode the entropy production per hydrolyzed ATP.
Since transition-path times obey a reflection symmetry [65] and because the dwell time statistics do not depend on the exit direction [32], the waiting time densities in the + and − direction coincide, i.e., $\psi_\pm(t) = \psi(t)$. In the presence of hidden dissipative mechanisms the symmetry may, counterintuitively, be lifted [66,67]. Denoting the mean waiting time by $\langle t\rangle = \int_0^\infty t\,\psi(t)\,dt$, the steady-state current becomes $J^{\rm M} = \phi_+/\langle t\rangle - \phi_-/\langle t\rangle = J$. Defining the Milestoning rates as $W^{\rm M}_\pm = \phi_\pm/\langle t\rangle$ and inserting them into Eq. (6) yields, using Eqs. (4) and (11), $\dot\sigma_{\rm M} = \dot\sigma$. Thus, Milestoning, in contrast to lumping, preserves the entropy production in the limit of a time-scale separation and beyond.
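The consistency of Milestoning can be checked directly on the six-state model. The sketch below is a minimal illustration, assuming milestones on the odd states: each excursion leaves the milestone, lands on an even neighbour, and then either reaches the next milestone or returns. The function name and the numerical rates are illustrative assumptions, not the authors' implementation.

```python
import math

def milestoning(kp, km, wp, wm):
    """Splitting probabilities and mean waiting time for milestones on odd states."""
    exit_odd = kp + wm            # total rate out of an odd state
    exit_even = km + wp           # total rate out of an even state
    # Probability that a single excursion ends on the next milestone (forward/backward)
    reach_fwd = (kp / exit_odd) * (wp / exit_even)
    reach_bwd = (wm / exit_odd) * (km / exit_even)
    success = reach_fwd + reach_bwd
    phi_plus = reach_fwd / success
    phi_minus = reach_bwd / success
    # Every excursion lasts 1/exit_odd + 1/exit_even on average; the number of
    # excursions is geometric with mean 1/success (Wald's identity).
    mean_wait = (1.0 / exit_odd + 1.0 / exit_even) / success
    return phi_plus, phi_minus, mean_wait

MU = 20.0  # assumed driving with illustrative rates
kp, km, wp, wm = math.exp(MU / 4), math.exp(-MU / 2), 1.0, math.exp(-MU / 4)
phi_plus, phi_minus, mean_wait = milestoning(kp, km, wp, wm)
current_full = (kp * wp - km * wm) / (kp + km + wp + wm)
current_milestoning = (phi_plus - phi_minus) / mean_wait
affinity = math.log(phi_plus / phi_minus)
print(affinity, current_milestoning, current_full)
```

In this sketch the affinity of the splitting probabilities equals $\mu$ and the Milestoning current equals the full current, so the entropy production is preserved for any choice of the four rates, not only for the assumed ones.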
Upon inspecting the waiting time density we find that it is to a good approximation memory-less for $\mu \ll 10$ as well as for $\mu \gg 20$, while the non-exponential behavior is most pronounced in the regime $10 \le \mu \le 20$ (see Fig. 3b). Thus, in the limit of either of the two time-scale separations, $\mu \ll 10$ and $\mu \gg 20$, the Milestoned dynamics is to a good approximation Markovian. In contrast to lumping, Milestoning restores local detailed balance (1) in both directions, parallel and anti-parallel to the driving, even at large asymmetries, which is the second main result of this Letter.
Notably, the regime $\mu \ll 10$ clearly fulfills both criteria (i) and (ii) for the emergence of Markovian dynamics [41] if the probability to reside within a core satisfies $P_{\rm odd} \approx 1$. Conversely, the opposite limit $\mu \gg 20$ does not obviously imply Markovian kinetics. To understand why it does so nevertheless, we point out that in this limit (a) $P_{\rm even} = 1 - P_{\rm odd} \approx 1$. If we were to choose the even (gray) states as cores instead of the odd (yellow) ones (see Fig. 2a), we would obviously restore the criteria for the emergence of Markovian dynamics [41]. It turns out further that (b) the waiting time density remains unaffected by the exchange of $\omega_\pm$ and $\kappa_\pm$, i.e., it does not depend on whether we choose the odd or even states as milestones. This explains why an exponential distribution emerges to a good approximation also in the limit $\mu \gg 20$. We also note that the kinetic hysteresis discovered in [32] almost vanishes as soon as Markovian dynamics emerge and the aforementioned criteria [41] are satisfied, which here follows from (a) by choosing the even states as milestones.
Conclusion.- We have shown that a clear time-scale separation, in contrast to the common belief, is only a necessary but not a sufficient condition for the validity of local detailed balance. By coarse-graining a detailed Markov model of a strongly driven molecular motor we demonstrated a clear time-scale separation between the observed and hidden degrees of freedom, and hence Markovian dynamics of the observable, and concurrently the non-existence of a local equilibrium against the driving. Our work demonstrates, for the first time, that Milestoning restores thermodynamic consistency in the steady state in the presence of strong driving even if the dynamics displays memory. A coarse-graining based on lumping may yield effectively Markovian dynamics that nevertheless violates local detailed balance. It will be interesting to revisit recent works on the thermodynamics of systems with slow hidden degrees of freedom that employed lumping [27,36-38] to inspect if and how these change under the thermodynamically consistent Milestoning, which will lead to correlated transitions [68] and/or dwell times [32]. Beyond the examples shown here as well as in Sec. II of [60], it will be interesting to investigate whether the two conditions for Markovianity together with Milestoning [41] generally guarantee the validity of local detailed balance.
Acknowledgments.- The financial support from the German Research Foundation (DFG) through the Emmy Noether Program GO 2762/1-2 to A. G. is gratefully acknowledged.

SUPPLEMENTAL MATERIAL

I. TIME-SCALE SEPARATION IN THE ATPASE MODEL

A. General discussion
The time-scale separation emerges if any of the two clusterings "Lumping 1+2" or "Lumping 1+6" has meso-states that relax much faster than the remaining transitions. Since two-level systems relax to equilibrium with a rate that is roughly given by the sum of both rates connecting the two states, we obtain a time-scale separation as soon as either $\omega \ll \kappa$ or $\omega \gg \kappa$, which we explain more thoroughly in the following.
Without loss of generality we focus on "Lumping 1+2" (see Fig. 2b in the main text) and rationalize why $\kappa \gg \omega$ actually corresponds to a proper separation of time scales. The other time-scale separation, associated with "Lumping 1+6" and $\omega \gg \kappa$, follows by analogy and is mathematically obtained by interchanging $\omega_\pm \leftrightarrow \kappa_\pm$ in the discussion below.
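The dynamics inside a mesostate 1+2 (see Fig. 2b in the main text) occurs with "internal" rates $\kappa_\pm$ within the lumped states. The lumped states are exited either with rate $\omega_-$ from state 1 (odd numbered) or with rate $\omega_+$ from state 2 (even numbered). The eigenvalues of the generator within one lumped state represent the zeros of the characteristic polynomial
$$(\lambda - \kappa_+ - \omega_-)(\lambda - \kappa_- - \omega_+) - \kappa_+\kappa_- = 0,$$
which leads to the eigenvalues
$$\lambda_\pm = \frac{\kappa + \omega}{2} \pm \sqrt{\frac{(\kappa + \omega)^2}{4} - \lambda_+\lambda_-},$$
where
$$\lambda_+\lambda_- = \kappa_+\omega_+ + \kappa_-\omega_- + \omega_+\omega_-.$$
In the main text we set $\lambda_2 = \lambda_+$ and $\lambda_1 = \lambda_-$. The two relaxation time scales are $1/\lambda_+$ and $1/\lambda_-$, which correspond to the two-scale exponential decay due to the two states within one meso-state. In general, lumping n states will lead to n exponentially decaying modes. A separation of time scales demands that all dynamical modes except for one exponentially decaying mode are quickly relaxing (here $\lambda_+ \gg \lambda_-$ or $1 \gg \lambda_-/\lambda_+$). We relate this condition to the rates via
$$1 \gg \frac{\lambda_-}{\lambda_+} = \frac{\lambda_+\lambda_-}{\lambda_+^2} \approx \frac{\lambda_+\lambda_-}{(\lambda_+ + \lambda_-)^2}$$
$$\phantom{1 \gg \frac{\lambda_-}{\lambda_+}} = \frac{\kappa_+\omega_+ + \kappa_-\omega_- + \omega_+\omega_-}{(\kappa + \omega)^2} \equiv R,$$
where in the last step of the first line we self-consistently re-used the first inequality implying $\lambda_+ \approx \lambda_+ + \lambda_- = \omega + \kappa$, and the very last step defines the ratio R. In other words, time-scale separation for "Lumping 1+2" demands $R \ll 1$. Let us now test whether (and when) this condition is satisfied in our model.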
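The condition $R \ll 1$ can be evaluated with a few lines of code. This sketch assumes the two-state generator of one lumped state with exits at rate $\omega_-$ from state 1 and $\omega_+$ from state 2, so that the two relaxation rates sum to $\kappa + \omega$ and multiply to $\kappa_+\omega_+ + \kappa_-\omega_- + \omega_+\omega_-$. The rate values are illustrative assumptions for $\mu = 20$.

```python
import math

def lump_relaxation(kp, km, wp, wm):
    """Relaxation rates (lambda_-, lambda_+) of lump 1+2 and the ratio R."""
    rate_sum = kp + km + wp + wm                   # lambda_+ + lambda_-
    rate_product = kp * wp + km * wm + wp * wm     # lambda_+ * lambda_-
    lam_plus = rate_sum / 2 + math.sqrt(rate_sum ** 2 / 4 - rate_product)
    lam_minus = rate_product / lam_plus            # avoids cancellation errors
    ratio_r = rate_product / rate_sum ** 2
    return lam_minus, lam_plus, ratio_r

MU = 20.0  # assumed driving with illustrative rates
lam_minus, lam_plus, ratio_r = lump_relaxation(
    math.exp(MU / 4), math.exp(-MU / 2), 1.0, math.exp(-MU / 4))
print(lam_minus, lam_plus, ratio_r)
```

For the assumed rates the two relaxation rates are close to 1.0 and 148.4, and R is well below one percent.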

C. Spectral criterion for time-scale separation
For the sake of completeness we discuss the time-scale separation also in a slightly alternative (but equivalent) interpretation using the spectrum of the Markov generator [59,69] (see also Ref. [70] for recent related work on local detailed balance). In this setting we consider the full system (in our case the six-state system from Fig. 2a in the main text). For simplicity we focus on the specific example with parameters given in Fig. 3 and $\mu = 20$, which is also used in Eq. (9) in the main text. Inserting all the rates in the full six-state system leads to six eigenvalues, where $\lambda_0 = 0$ corresponds to the steady-state distribution that does not change with time. The other five eigenvalues are $\lambda_{1,2} \approx 1.5 \pm 0.88\,i$, $\lambda_{3,4} \approx 147.9 \pm 0.88\,i$, $\lambda_5 = 149.4$. Consistent with the previous discussion these values confirm the gap in the eigenvalue spectrum $0 = {\rm Re}(\lambda_0) \le {\rm Re}(\lambda_1) \le {\rm Re}(\lambda_2) \ll {\rm Re}(\lambda_3) \le {\rm Re}(\lambda_4) \le {\rm Re}(\lambda_5)$, i.e., three eigenmodes (0th-2nd eigenvalue) relax much slower, with a rate $\le 1.5$, than the other three (3rd-5th eigenvalue), which relax with a rate $\ge 147.9$. We thus confirm the time-scale separation by means of the definition given in Ref. [69]. More generally, we explicitly illustrate in Fig. 4 the size of the gap between the second and third mode as a function of the driving $\mu$, which except for $\mu \sim 15$ is in fact always quite pronounced.
Note that a (coarse-grained) three-state Markov model can account only for three eigenvalues $\lambda_0, \lambda_1, \lambda_2$ (and hence time scales) and their corresponding eigenfunctions. If more time scales are required/desired one must increase the number of states.
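The spectral gap of the six-state ring can be reproduced without a numerical eigensolver, because the ring is invariant under a shift by two states. The sketch below assumes a Bloch-wave ansatz with phases $\theta = 2\pi n/3$, which reduces the problem to a quadratic equation per phase. The rates are illustrative assumptions for $\mu = 20$, not values read from Fig. 3.

```python
import cmath
import math

def six_state_spectrum(kp, km, wp, wm):
    """Relaxation rates of the six-state ring, sorted by real part."""
    rate_sum = kp + km + wp + wm
    eigenvalues = []
    for n in range(3):
        theta = 2 * math.pi * n / 3
        # Determinant of the 2x2 Bloch block; it vanishes for theta = 0
        det = (kp * wp * (1 - cmath.exp(-1j * theta))
               + km * wm * (1 - cmath.exp(1j * theta)))
        large = rate_sum / 2 + cmath.sqrt(rate_sum ** 2 / 4 - det)
        small = det / large            # the two roots multiply to det
        eigenvalues += [small, large]
    return sorted(eigenvalues, key=lambda z: z.real)

MU = 20.0  # assumed driving with illustrative rates
spectrum = six_state_spectrum(math.exp(MU / 4), math.exp(-MU / 2), 1.0, math.exp(-MU / 4))
for value in spectrum:
    print(f"{value.real:8.3f} {value.imag:+8.3f}i")
```

Under these assumptions the three slow modes have real parts of at most about 1.5 and the three fast modes of at least about 147.9, in line with the gap described above.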

D. Peculiarity of the violation of local detailed balance
In Eq. (9) in the main text we evaluate the waiting time in the coarse-grained state given that the preceding and the next move each head in either of the two directions (+ or −). Inspecting said waiting time density one immediately finds that local detailed balance is satisfied, to a very good approximation, if the second term is ignored. Thus, the second term alone encodes the violation of local detailed balance.
Since the second term decays much faster than the first term ($\lambda_2 \gg \lambda_1$), most state-changes occur after an effective local equilibration. There is just one sequence of events that is able to avoid a clear local equilibration, namely the forward jump "+" followed by the backward jump "−", which is the only sequence of transitions that allows the prefactor of the second term to outweigh the first one. More precisely, using Eq. (9) in the main text the corresponding waiting time density reads $\psi_{-|+}(t) = 2.089 \cdot 10^{-9}\,e^{-\lambda_1 t} + 6.738 \cdot 10^{-3}\,e^{-\lambda_2 t}$ with $6.738 \cdot 10^{-3} \gg 2.089 \cdot 10^{-9}$. Thus, at short times $t \lesssim 0.15$ (i.e., shorter than 15% of the mean waiting time) we find a violation of local equilibration, whereas for $t \ge 0.15$ a local equilibration is established, that is, $\psi_{-|+}(t) \approx 2.089 \cdot 10^{-9}\,e^{-\lambda_1 t}$. Therefore, the violation of local detailed balance leading to $\dot\sigma_{1+2}/\dot\sigma < 1$ in Fig. 3a (for $\mu = 20$) is caused almost entirely by the forward jumps (+) that are followed by a backward jump (−) within less than 15% of the total mean waiting time.
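The two prefactors quoted above can be recovered from the two-state dynamics inside the lump. This is a minimal sketch under stated assumptions: a forward jump enters lump 1+2 in state 1, a backward exit occurs from state 1 with rate $\omega_-$, and the rates take assumed illustrative values for $\mu = 20$ that are not given explicitly in this text.

```python
import math

def backward_after_forward_density(kp, km, wp, wm):
    """Prefactors of psi_{-|+}(t) = slow * exp(-lam1 t) + fast * exp(-lam2 t)."""
    exit_1 = kp + wm                     # total rate out of state 1
    exit_2 = km + wp                     # total rate out of state 2
    rate_sum = exit_1 + exit_2
    rate_product = kp * wp + km * wm + wp * wm
    lam2 = rate_sum / 2 + math.sqrt(rate_sum ** 2 / 4 - rate_product)
    lam1 = rate_product / lam2
    # Occupation of state 1 starting from state 1:
    # p1(t) = A exp(-lam1 t) + (1 - A) exp(-lam2 t), A = (lam2 - exit_1)/(lam2 - lam1),
    # rewritten with (lam2 - exit_1)(lam2 - exit_2) = kp * km to avoid cancellation.
    slow_weight = kp * km / ((lam2 - exit_2) * (lam2 - lam1))
    return wm * slow_weight, wm * (1.0 - slow_weight), lam1, lam2

MU = 20.0  # assumed driving with illustrative rates
slow, fast, lam1, lam2 = backward_after_forward_density(
    math.exp(MU / 4), math.exp(-MU / 2), 1.0, math.exp(-MU / 4))
print(slow, fast, lam1, lam2)
```

The assumed rates give prefactors close to the quoted $2.089 \cdot 10^{-9}$ and $6.738 \cdot 10^{-3}$, which illustrates how strongly the fast mode dominates this particular sequence of jumps.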
Since backward transitions are extremely rare at $\mu = 20$, we find that most transitions (> 99.9%) occur after a local equilibration.
Note that there are several other ways in which local equilibration is manifested. For example, using Eq. (3) in the main text one finds $P_{\rm even} \approx \kappa_+/\kappa \equiv P^{\rm eq}_{\rm even}$, whereas $P_{\rm odd} \not\approx 1 - \kappa_+/\kappa = \kappa_-/\kappa = P^{\rm eq}_{\rm odd}$. The second line in Eq. (5) implies that the forward transitions are sampled from local equilibrium, $W^{1+2}_+ \approx P^{\rm eq}_{\rm even}\,\omega_+$, but $W^{1+2}_- \not\approx P^{\rm eq}_{\rm odd}\,\omega_-$. Thus, local equilibration sets in before each forward transition but not before backward transitions.
It is worth mentioning that if in addition to the time-scale separation all individual transitions satisfy the mathematically stronger condition $\kappa_+, \kappa_- \gg \omega_+, \omega_-$, one restores local detailed balance in both directions (e.g., see Ref. [15]). In the following we discuss prominent examples from diffusion models which can never satisfy this stronger condition.

II. KRAMERS' THEORY AND LOCAL DETAILED BALANCE
In systems with continuous coordinates (for example, in any biophysical system), Markov-jump processes in the continuous-time limit (i.e., the short lag-time limit) are well known not to follow from a coarse-graining that is based on lumping [40,41] [see also the paragraph after Eq. (2.47) in Ref. [69]]. In fact, we argue that local detailed balance may not exist if the coarse-graining is based solely on lumping. To see this we re-investigate below Kramers' original work [71] (see also Ref. [72] for a review), which pioneered the microscopic understanding of local detailed balance and, more precisely, of its constituents, the rate velocities akin to transition rates. In stochastic thermodynamics Kramers' microscopic understanding is routinely used (directly or implicitly) to model stochastic pumps (cf. Arrhenius rates) [73-75], stochastic resonance [76], occasionally kinetic proofreading [77,78], and many others.
In the following we rationalize why Kramers' theory [71,79] is incompatible with lumping in continuous-time processes. To illustrate this we consider a particle at position x in a tilted double-well potential
$$U(x)/k_{\rm B}T = 8\,(x^2 - 1)^2 + x, \qquad (16)$$
where the last term introduces the tilt. The potential is illustrated in Fig. 5a, where the vertical solid line separates the two lumped meso-states which include the minima A and B (the vertical dotted lines will be used for Milestoning below). The tilt leads to a free energy difference between the right state B and the left state A of $\approx 2\,k_{\rm B}T$.
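To use Kramers' theory we determine the location of the two minima of the potential (16) at $x_A \approx -1.01527$ and $x_B \approx 0.983993$ alongside the local maximum at $x_{\max} \approx 0.0312806$. For an overdamped particle with diffusion constant set to unity, $D = 1$, and $k_{\rm B}T \equiv 1$, Kramers' work predicts the rates to be approximately given by [71,79]
$$w_{A\to B} \approx \frac{\sqrt{U''(x_A)\,|U''(x_{\max})|}}{2\pi}\,e^{-[U(x_{\max}) - U(x_A)]} \qquad (17)$$
and
$$w_{B\to A} \approx \frac{\sqrt{U''(x_B)\,|U''(x_{\max})|}}{2\pi}\,e^{-[U(x_{\max}) - U(x_B)]}, \qquad (18)$$
where $U''(x) \equiv \partial_x^2 U(x)$ is the second derivative that relates to the curvature of the potential. In the following we test Kramers' result first against lumping and then against Milestoning.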
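A short numerical check of the Kramers rates can be sketched as follows. It assumes a potential of the form $U(x) = 8(x^2-1)^2 + x$ in units of $k_{\rm B}T$, a form that reproduces the extrema quoted above, together with $D = k_{\rm B}T = 1$ and the standard overdamped Kramers formula. The helper names are hypothetical.

```python
import math

def potential(x):
    return 8.0 * (x * x - 1.0) ** 2 + x     # assumed tilted double well

def slope(x):
    return 32.0 * x * (x * x - 1.0) + 1.0

def curvature(x):
    return 96.0 * x * x - 32.0

def stationary_point(x, steps=50):
    """Newton iteration on U'(x) = 0 starting from x."""
    for _ in range(steps):
        x -= slope(x) / curvature(x)
    return x

def kramers_rate(x_min, x_top):
    """Overdamped Kramers rate out of the well at x_min over the barrier at x_top."""
    prefactor = math.sqrt(curvature(x_min) * abs(curvature(x_top))) / (2 * math.pi)
    return prefactor * math.exp(-(potential(x_top) - potential(x_min)))

x_a, x_top, x_b = stationary_point(-1.0), stationary_point(0.0), stationary_point(1.0)
w_ab = kramers_rate(x_a, x_top)
w_ba = kramers_rate(x_b, x_top)
print(x_a, x_top, x_b, w_ab, w_ba, math.log(w_ba / w_ab))
```

The logarithm of the rate ratio comes out close to 2, consistent with the stated free energy difference between B and A.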

A. Lumping is inconsistent with Kramers' theory
Let us first test Lumping against Kramers' rates. To test this numerically we use a thermodynamically consistent finite-element method from [80], which we also used in [81] to derive local-time distributions for diffusion processes. To this end we discretize our potential in N steps. The discretization $i = 1, \ldots, N$ for $N = 20$ is shown in Fig. 5a (see green symbols). For convenience we confine the grid between equidistant points $x_1 = -1.5$ and $x_N = 1.5$ with a grid spacing $dx = x_{i+1} - x_i = (x_N - x_1)/(N - 1) = 3/(N - 1)$. The finite-element method (FEM) is chosen to represent a Markov-jump process with rates between the states i and $j = i \pm 1$ for any $1 \le i, i \pm 1 \le N$ given by [80] (see also [81])
$$w_{i\to j} = \frac{D}{dx^2}\exp\!\left[-\frac{U(x_j) - U(x_i)}{2k_{\rm B}T}\right], \qquad (19)$$
where we set $D = k_{\rm B}T = 1$.
To evaluate the waiting time in B for a given grid with N states based on lumping, we measure the waiting time of the process until the right half of the potential (right of the vertical solid line in Fig. 5a) is left for the first time after entering said region. Using the rates from Eq. (19) leads to the waiting time density in Fig. 5b for N = 20, 100, 500. The following two observations can be made.
First, the waiting time distribution becomes shifted towards shorter times as N increases, which we mathematically expect, since any overdamped diffusion process (i.e., N → ∞) formally "wriggles" infinitely often across any position it visits, which in turn leads to vanishingly short waiting times in the limit N → ∞. Note that even underdamped systems can wriggle several times across a barrier, which is typically accounted for by the success probability to "actually" cross the barrier, also called the "transmission coefficient" (e.g., see Ref. [50]). It is worth mentioning that we believe that "dynamical coring" [54,55] tackles this problem by effectively removing all the fast re-crossings, which means the waiting time density then approximately selects the rate of the long-time tail of the lumped process.
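The shrinking of the lumped waiting time with grid refinement can be illustrated with a nearest-neighbour jump process on the grid. The sketch below rests on assumptions: a potential $U(x) = 8(x^2-1)^2 + x$, symmetric jump rates $\exp[-\Delta U/2]/dx^2$ with $D = k_{\rm B}T = 1$, a reflecting right end of the grid, and the standard recursion for the mean time to step one site to the left. It computes only the mean waiting time, not the full density.

```python
import math

def potential(x):
    return 8.0 * (x * x - 1.0) ** 2 + x     # assumed tilted double well

def lumped_mean_wait_in_b(n):
    """Mean time to leave the right half (x > 0) after entering it from the left."""
    dx = 3.0 / (n - 1)
    grid = [-1.5 + i * dx for i in range(n)]

    def rate(i, j):
        # assumed thermodynamically consistent nearest-neighbour rate
        return math.exp(-(potential(grid[j]) - potential(grid[i])) / 2.0) / dx ** 2

    first_b = next(i for i, x in enumerate(grid) if x > 0.0)
    # step_left holds the mean time to reach site i-1 from site i for the first time
    step_left = 1.0 / rate(n - 1, n - 2)          # reflecting right boundary
    for i in range(n - 2, first_b - 1, -1):
        step_left = (1.0 + rate(i, i + 1) * step_left) / rate(i, i - 1)
    return step_left

waits = [lumped_mean_wait_in_b(n) for n in (20, 100, 500)]
print(waits)
```

In this sketch the mean lumped waiting time decreases roughly in proportion to the grid spacing, which is the discrete signature of the ever faster re-crossings.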
Second, excluding the fast re-crossings and focusing on the long-time limit we find a non-normalized exponential decay that is faster than Kramers' result, which is due to the fact that any lumped state "senses" only one half of the barrier's curvature which enters Eqs. (17) and (18). The full curvature requires a full crossing of the barrier, which we discuss below in more detail when discussing Milestoning. Note that Fig. 5b shows the probability density on a semi-log scale, which renders exponential decays as straight lines. See Fig. 5d for the linear-scale plot with N = 100, which also includes Milestoning that is discussed in the following subsection.
These two aspects (re-crossings and sampling of only half of the barrier) are the key reasons why Kramers' theory is fundamentally incompatible with lumping. Note that we are not the first to show this incompatibility, since it follows from several previous findings, which can in particular be drawn from the paragraph following Eq. (2.47) in Ref. [69] and is also mentioned in Refs. [40,41], to name but a few. In contrast to our work pursued in the main text, this conflict arises merely from projecting a continuous-space dynamics onto a discrete one, which we also pointed out in our recent work [32].

B. Milestoning in Kramers' theory
Let us now study Kramers' rate theory in the context of Milestoning. To this end we locate the two milestones at x = −0.5 (A) and x = 0.5 (B) (see vertical dotted lines in Fig. 5a). The coarse-graining that is based on Milestoning simply maps the full dynamics onto the last crossed milestone. Thus, for any x ≤ −0.5 the coarse-grained state is "A" and for any x ≥ 0.5 the state is mapped to "B", whereas for any −0.5 ≤ x ≤ 0.5 the process is mapped to either "A" or "B", depending on which milestone was passed last. Note that we start to record the coarse-grained state as soon as the first milestone is crossed. One waiting time interval in state B corresponds to the time between the first entrance into B by crossing x = 0.5 until the other milestone A at x = −0.5 is crossed for the first time.
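The difference between the two mappings can be made concrete on a sampled trajectory. The following is a minimal sketch; the function names, the lumping boundary at x = 0, and the short example trajectory are illustrative assumptions.

```python
def lump(trajectory, boundary=0.0):
    """Lumping: the coarse state is fixed by the side of the boundary."""
    return ["A" if x < boundary else "B" for x in trajectory]

def milestone(trajectory, left=-0.5, right=0.5):
    """Milestoning: the coarse state is the last crossed milestone.

    Entries are None until the first milestone has been reached.
    """
    states, last = [], None
    for x in trajectory:
        if x <= left:
            last = "A"
        elif x >= right:
            last = "B"
        states.append(last)     # between the milestones the last state persists
    return states

# Hypothetical trajectory with two re-crossings of x = 0 before the transition
trajectory = [-1.0, -0.2, 0.1, -0.1, 0.2, 0.7, 0.3, -0.1, 0.6]
print(lump(trajectory))
print(milestone(trajectory))
```

In this example lumping registers several jumps caused by re-crossings of the boundary, whereas Milestoning registers a single transition from A to B.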
To measure the waiting time we use the finite-element method from Eq. (19). To deduce the waiting time density in state B we use the grid state of the right dotted line (x = 0.5) as the starting condition and the state left of the left dotted line (x = −0.5) as the target or absorbing condition. The resulting waiting time density is shown in Fig. 5c. We find for $N \gtrsim 20$ that the waiting time density to a good approximation approaches the waiting time density predicted by Kramers' approximation, i.e., the Arrhenius law. We show the waiting time density for N = 100 in Fig. 5d on a linear scale (see Milestoning and Kramers). For completeness we show the mean values of the waiting time including the reverse transition A → B in Tab. I. One can see that the mean waiting time determined by Milestoning approximates the Kramers result reasonably well.
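The comparison between Milestoning and Kramers' rate can be sketched at the level of mean waiting times. The code below is an illustration under assumptions: the potential $U(x) = 8(x^2-1)^2 + x$, symmetric nearest-neighbour rates $\exp[-\Delta U/2]/dx^2$ with $D = k_{\rm B}T = 1$, a reflecting right end of the grid, the start at the grid point nearest to x = 0.5, and absorption at the grid point at or just left of x = −0.5. The extrema used for the Kramers rate are the values quoted in the text.

```python
import math

def potential(x):
    return 8.0 * (x * x - 1.0) ** 2 + x     # assumed tilted double well

def curvature(x):
    return 96.0 * x * x - 32.0

def milestoning_mean_wait_in_b(n=100):
    """Mean first-passage time from the milestone at 0.5 to the one at -0.5."""
    dx = 3.0 / (n - 1)
    grid = [-1.5 + i * dx for i in range(n)]

    def rate(i, j):
        return math.exp(-(potential(grid[j]) - potential(grid[i])) / 2.0) / dx ** 2

    start = min(range(n), key=lambda i: abs(grid[i] - 0.5))
    target = max(i for i in range(n) if grid[i] <= -0.5 + 1e-9)
    step_left = 1.0 / rate(n - 1, n - 2)      # mean time for n-1 -> n-2
    total = 0.0
    for i in range(n - 2, target, -1):
        step_left = (1.0 + rate(i, i + 1) * step_left) / rate(i, i - 1)
        if i <= start:
            total += step_left                # sum of single-step times start -> target
    return total

x_b, x_top = 0.983993, 0.0312806              # extrema quoted in the text
kramers_b_to_a = (math.sqrt(curvature(x_b) * abs(curvature(x_top))) / (2 * math.pi)
                  * math.exp(-(potential(x_top) - potential(x_b))))
mean_wait = milestoning_mean_wait_in_b(100)
print(mean_wait, 1.0 / kramers_b_to_a, mean_wait * kramers_b_to_a)
```

Under these assumptions the product of the Milestoning mean waiting time and the Kramers rate is of order one, in contrast to the lumped waiting time, which keeps decreasing as the grid is refined.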

C. Why is the Arrhenius law consistent with Milestoning?
To see this we look at its derivation [71,79] based on a first-passage problem. In Milestoning the pair of vertical dotted lines corresponds to the starting and target points, respectively. In the case of an escape from A the dotted line at x = −0.5 would be the initial condition and x = 0.5 would be the absorbing target, whereas for an escape from B the lines exchange roles (target and starting point).
Eq. (17) relates to Eq. (5.2.174) in Ref. [79], which takes the starting point to be at the left minimum and the target position (absorbing point) to be in the well on the right side of the barrier (e.g., the right minimum). The exact location of the target point does not matter much (see Fig. 5.3c in Ref. [79]) as soon as the target is well to the right of the barrier (local maximum x ≈ 0). In other words, the location of the right dotted line will not substantially affect the Milestoning process as long as it is located to the right of the barrier and a few thermal energy units below its maximum value. This guarantees that the particle will revisit the milestone several times before hitting the other milestone, which is the key ingredient for the process to become approximately Markovian [41] and allows the particle to locally relax to the equilibrium Boltzmann distribution within the minimum B. Due to the equivalence between the first-passage problem proposed by Kramers and the Milestoning process, it is obvious that Milestoning is consistent with Kramers' theory.

D. Closing remarks
A few important aspects should be pointed out. First, despite the substantial quantitative difference in the waiting time density between the variants of coarse-graining (Lumping and Milestoning) there are a few strong similarities that we need to point out. If we evaluate both coarse-graining procedures on a single trajectory we will find that most of the time (that is, more than 99% of the entire time) we expect the microstate to be mapped to the same mesostate (A, B). The reason is that Milestoning and Lumping can only lead to a different coarse-grained process if the particle is located between −0.5 ≤ x ≤ 0.5, which rarely happens.
First, due to the high energy within said region the particle will only spend a very short time within that region. Second, whenever the particle is on the left half −0.5 ≤ x ≤ 0 it is much more likely to have last passed x = −0.5 rather than x = 0.5, which further decreases the likelihood at any time t that the two coarse-graining procedures (Lumping or Milestoning) map the microstate to different mesostates. Thus, the difference between Lumping and Milestoning can only be detected during fast transition-path events.
Therefore, if an experiment has a time resolution in the form of discrete time intervals that cannot resolve transition-path events, we do not expect the experiment to detect any recrossing events and thus to actually detect the violation of local detailed balance of a Lumped process.
Note that a careful reader might use Tab. I and evaluate the mean waiting time according to Lumping in A and B, respectively, and deduce from its inverse that Eq. (1) in the main text is approximately satisfied for any grid size N, tacitly assuming that the rates with Lumping are taken as the inverse of the mean, $w^{\rm Lumping}_{A\to B} = 1/\langle t\rangle_A$ and $w^{\rm Lumping}_{B\to A} = 1/\langle t\rangle_B$. We expect this "coincidence" to vanish for genuinely out-of-equilibrium systems. We were recently able to show that Milestoning with a time-scale separation arbitrarily far from equilibrium is consistent with Kramers' theory and satisfies local detailed balance (see Sec. III.C "The peculiar limit of local detailed balance" in [32]). Consistent with Ref. [32], our results show, for the first time, that Milestoning may be crucial to restore local detailed balance for the coarse-graining of Markov jump processes along individual trajectories.

III. LUMPING IS A SUBSET OF MILESTONING
There is one freedom that Milestoning allows which is not allowed in Lumping. Lumping assigns to any microstate a single coarse-grained mesostate, whereas Milestoning allows for a "neutral" region of microstates that are not uniquely attributed to a specific single coarse-grained mesostate [41]. Note that within such a "neutral" region Milestoning determines the coarse-grained state by the last visited non-neutral region. In this sense Lumping is the limit of Milestoning in which the "neutral region" is the empty set, which completes the proof that Lumping is a subset of Milestoning.
Note that the aforementioned neutral region in the main text represents the space between the two ellipses

FIG. 1. Variants of coarse-graining: the color of the full trajectory evolving from the blue star represents the instantaneous coarse-grained states A (blue) and B (yellow), respectively. (a) State lumping: The full set of states Ω is decomposed into subsets A and B. (b) Milestoning based on core sets: two metastable states represent the cores A and B, and the coarse-grained state corresponds to the last visited core. (c) Top: contour lines depicting potential iso-surfaces in the xy plane; milestones A and B resolve the metastable regions. Bottom: measured equilibrium probability density function (PDF) with the PDFs of the individual metastable states indicated in blue and orange, respectively.

FIG. 4. Relaxation time of the second and third eigenmode as a function of the driving $\mu$.

FIG. 5. Testing Kramers' theory with a tilted double-well potential. (a) Potential (solid blue line) and its discrete finite-element (Markov state) representation with N = 20 states. The vertical solid line separates the potential into lumped states A and B. Milestoning is represented by the two vertical dotted lines within states A and B, where the region between the dotted lines can belong to either A or B (depending on which dotted line was crossed last). (b) Waiting time density in state B predicted by Kramers, $\wp_{B\to A}(t) = w_{B\to A}\,e^{-w_{B\to A}t}$ using Eq. (18), versus the waiting time density to stay on the right half of the potential (right of the vertical solid line in (a)) for N = 20, 100, 500 grid points. (c) First-passage time density to pass the left dotted line at x = −0.5 for the first time if the particle starts at the right vertical dotted line at x = 0.5 in (a), which represents the waiting time density in B based on Milestoning if the milestones are chosen as the vertical dotted lines in panel (a). The thick line represents Kramers' theory from (b). (d) Direct comparison of Kramers' theory with Lumping and Milestoning with a finite-element method and N = 100 states. In the limit N → ∞ Lumping leads to a delta distribution of the probability density with the probability weight moving to t = 0.

TABLE I. Mean waiting time in A and B. Kramers' approximation follows from the inverse of the rates $w_{A\to B}$ and $w_{B\to A}$ from Eqs. (17) and (18), respectively, which are used as a reference. The relative deviation from Kramers is $|w^{-1} - \langle t\rangle|/w^{-1}$, where $w^{-1} = w^{-1}_{A\to B}, w^{-1}_{B\to A}$ and $\langle t\rangle = \langle t\rangle_A, \langle t\rangle_B$. The mean waiting time is evaluated for the two variants of coarse-graining (Lumping and Milestoning) and for several discretizations N = 20, 100, 500.