Statistics of Infima and Stopping Times of Entropy Production and Applications to Active Molecular Processes

We study the statistics of infima, stopping times and passage probabilities of entropy production in nonequilibrium steady states, and show that they are universal. We consider two examples of stopping times: first-passage times of entropy production and waiting times of stochastic processes, which are the times when a system reaches a given state for the first time. Our main results are: (i) the distribution of the global infimum of entropy production is exponential with mean equal to minus the Boltzmann constant; (ii) we find exact expressions for the passage probabilities of entropy production to reach a given value; (iii) we derive a fluctuation theorem for stopping-time distributions of entropy production. These results have interesting implications for stochastic processes that can be discussed in simple colloidal systems and in active molecular processes. In particular, we show that the timing and statistics of discrete chemical transitions of molecular processes, such as the steps of molecular motors, are governed by the statistics of entropy production. We also show that the extreme-value statistics of active molecular processes are governed by entropy production; for example, the infimum of entropy production of a motor can be related to the maximal excursion of the motor against the direction of an external force. Using this relation, we make predictions for the distribution of the maximum backtrack depth of RNA polymerases, which follows from our universal results for entropy-production infima.


I. INTRODUCTION AND STATEMENT OF THE MAIN RESULTS
The total entropy S_tot(t) produced by a mesoscopic process in a finite time interval [0, t] is stochastic and can, for a single realization, be negative due to fluctuations. The second law of thermodynamics implies that its average, taken over many realizations of the process, increases in time, ⟨S_tot(t)⟩ ≥ 0. Maxwell already formulated the idea of a stochastic entropy in the 19th century [1], and in the last decades definitions of entropy production of nonequilibrium processes were established using the theory of stochastic processes [2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19].
Little is known beyond the second law about the statistics of entropy-production fluctuations. The best insights so far into fluctuations of entropy production are provided by fluctuation theorems. They express a fundamental asymmetry of the fluctuations of entropy production: it is exponentially more likely to produce a given positive amount of entropy than to reduce entropy by the same amount. An example is the detailed fluctuation theorem, which can be written as p_S(S_tot; t)/p_S(−S_tot; t) = e^{S_tot/k_B}, where k_B is Boltzmann's constant. Here p_S(S_tot; t) is the probability density describing the distribution of the entropy production S_tot at a given time t. The detailed fluctuation theorem is universal and holds for a broad class of physical processes in steady state [3-5, 8-10, 14, 20-23]. Moreover, the detailed fluctuation theorem has been tested in several experiments [24][25][26][27][28][29][30][31]; for reviews see [32][33][34].
In addition to fluctuation theorems, an important question is to understand the extreme-value statistics of entropy production. In particular, because entropy must on average increase, it is interesting to understand the statistics of records of negative entropy production during a given time interval [0, t]. To address this question, we introduce here the infimum of entropy production for a single realization, S_inf(t) ≡ inf_{0≤τ≤t} S_tot(τ), which is the negative record of entropy production over the time interval [0, t].
In this paper we derive universal equalities and inequalities on the statistics of entropy-production infima. We show that the mean of the infimum of the stochastic entropy production in a given time interval [0, t] is bounded from below by minus the Boltzmann constant:

⟨S_inf(t)⟩ ≥ −k_B .   (1)

This infimum law for entropy production is illustrated in Fig. 1a) and expresses a fundamental bound on how much entropy can be reduced in a finite time. The infimum law follows from a universal bound for the cumulative distribution of entropy-production infima:

Pr ( S_inf(t) ≥ −s ) ≥ 1 − e^{−s/k_B} .   (2)

Here Pr(·) denotes the probability of an event, the left-hand side is the cumulative distribution of entropy-production infima, and s ≥ 0. Remarkably, as we show in this paper, the infimum law, given by Eq. (1), holds in general for classical and stationary stochastic processes.
The global infimum of entropy production, S_inf^∞ ≡ lim_{t→∞} S_inf(t), is the lowest value that entropy production will ever reach in one realization of the process; note that the global infimum is always smaller than or equal to the local infimum, S_inf^∞ ≤ S_inf(t). We show that the distribution of the global infimum of entropy production is exponential,

Pr ( S_inf^∞ ≤ −s ) = e^{−s/k_B} ,   (3)

where s ≥ 0, and that the mean value of the global infimum equals minus the Boltzmann constant:

⟨S_inf^∞⟩ = −k_B .   (4)

The shape of the distribution of the global infimum implies that the infimum lies with 50 percent probability within −k_B ln 2 ≤ S_inf^∞ ≤ 0, and that its standard deviation equals the Boltzmann constant. Whereas Eqs. (1) and (2) hold generally in steady states, the equalities given by Eqs. (3) and (4) are shown to be true for continuous stochastic processes.
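The infimum law and the exponential global-infimum distribution can be illustrated numerically. In the sketch below (our own toy model, not from the paper; k_B = 1) entropy production is modeled as a drifted Brownian motion whose variance rate is twice its drift, as for a uniformly dragged overdamped particle; parameter values and function names are ours:

```python
import numpy as np

# Minimal sketch (ours): entropy production modeled as
# S(t) = mu*t + sqrt(2*mu)*W(t), with kB = 1. We check the infimum law
# <S_inf(t)> >= -1 and the exponential long-time infimum (median -ln 2),
# for two driving strengths -- the infimum statistics do not depend on mu.
rng = np.random.default_rng(0)

def infimum_stats(mu, n_traj=20000, t=10.0, dt=0.005):
    """Mean and median of the running infimum of S over [0, t]."""
    S = np.zeros(n_traj)
    inf = np.zeros(n_traj)
    for _ in range(int(t / dt)):
        S += mu * dt + np.sqrt(2 * mu * dt) * rng.standard_normal(n_traj)
        np.minimum(inf, S, out=inf)
    return inf.mean(), np.median(inf)

for mu in (0.5, 2.0):
    mean_inf, median_inf = infimum_stats(mu)
    assert mean_inf >= -1.0                    # infimum law, Eq. (1)
    assert abs(mean_inf + 1.0) < 0.15          # near-saturation at long t
    assert abs(median_inf + np.log(2)) < 0.12  # exponential global infimum
print("infimum law verified for both driving strengths")
```

The tolerances absorb time-discretization bias (which makes the sampled infimum slightly less negative) and Monte Carlo noise.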
Related to the global infimum are the passage probabilities P_+^{(2)} (P_−^{(2)}) for entropy production to reach a threshold s_tot^+ (−s_tot^−) without having reached −s_tot^− (s_tot^+) before. This corresponds to the stochastic process S_tot(t) with two absorbing boundaries, a positive absorbing boundary at S_tot(t) = s_tot^+ and a negative absorbing boundary at S_tot(t) = −s_tot^−. If the process S_tot(t) is continuous and ⟨S_tot(t)⟩ > 0, we find

P_+^{(2)} = (1 − e^{−s_tot^−/k_B}) / (1 − e^{−(s_tot^+ + s_tot^−)/k_B}) ,   (5)

P_−^{(2)} = e^{−s_tot^−/k_B} (1 − e^{−s_tot^+/k_B}) / (1 − e^{−(s_tot^+ + s_tot^−)/k_B}) .   (6)

Interestingly, the relations (5) and (6) relate entropy-production fluctuations between two asymmetric values s_tot^+ ≠ s_tot^−. The asymptotic value of the passage probability P_+^{(2)} for s_tot^+ = +∞ is the probability that entropy production never reaches the value −s_tot^−. It is equal to the probability that the global infimum is larger than or equal to −s_tot^−. The relations for the passage probabilities given by Eqs. (5) and (6) thus imply Eqs. (3) and (4) for the global infimum. Notably, the infima and passage statistics of entropy production are independent of the strength of the nonequilibrium driving, i.e., of the mean entropy-production rate.
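Eqs. (5) and (6) can be checked algebraically: the two probabilities are normalized, they satisfy the optional-sampling condition ⟨e^{−S_tot(T)/k_B}⟩ = 1, and the s_tot^+ → ∞ limit reproduces the global-infimum statistics. A minimal sketch (ours; k_B = 1, function names ours):

```python
import math

# Passage probabilities of Eqs. (5)-(6), kB = 1.
def P_plus(s_plus, s_minus):
    return (1 - math.exp(-s_minus)) / (1 - math.exp(-(s_plus + s_minus)))

def P_minus(s_plus, s_minus):
    return (math.exp(-s_minus) * (1 - math.exp(-s_plus))
            / (1 - math.exp(-(s_plus + s_minus))))

for sp, sm in [(1.0, 0.5), (2.0, 2.0), (0.3, 3.0)]:
    pp, pm = P_plus(sp, sm), P_minus(sp, sm)
    assert abs(pp + pm - 1.0) < 1e-12                     # normalization
    # optional-sampling (martingale) condition <e^{-S(T)}> = 1:
    assert abs(pp * math.exp(-sp) + pm * math.exp(sm) - 1.0) < 1e-12

# the limit s_plus -> infinity reproduces Pr(S_inf >= -s) = 1 - e^{-s}:
assert abs(P_plus(50.0, 1.0) - (1 - math.exp(-1.0))) < 1e-12
```

These identities are exact, so the checks hold to machine precision.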
We also discuss stopping times. A stopping time is the time at which a stochastic trajectory satisfies for the first time a certain criterion. We discuss here s_tot-stopping times T_+, for which entropy production at the stopping time equals S_tot(T_+) = s_tot, with s_tot > 0. An example is the first-passage time of entropy production, at which entropy production reaches s_tot for the first time. This value of entropy S_tot(T_+) is a new record of entropy production, and first-passage times of entropy production are thus times at which a given record is reached.

FIG. 1. a) The infimum law implies that the average infimum of the entropy production (green solid line) is larger than or equal to −k_B (orange line). b) First-passage-time fluctuation theorem for entropy production with two absorbing boundaries. Examples of trajectories of stochastic entropy production as a function of time, which first reach a positive threshold s_tot (blue thick curves), and which first reach a negative threshold −s_tot (red thin curves). The probability distribution p_{T_+}(t; s_tot) to first reach the positive threshold at time t, and the probability distribution p_{T_−}(t; −s_tot) to first reach the negative threshold at time t, are related by Eq. (8). c) Waiting-time fluctuations: the statistics of the waiting times between two states I and II are the same for forward and backward trajectories that absorb or dissipate a certain amount of heat Q in isothermal conditions.
Analogously, we define a (−s_tot)-stopping time T_− associated to T_+, at which entropy production equals S_tot(T_−) = −s_tot. For example, if T_+ is the first-passage time of entropy production to first reach s_tot, then T_− is the first-passage time of entropy production to first reach −s_tot. Remarkably, we find that the mean stopping time T_+ equals the mean stopping time T_−:

⟨T_+⟩ = ⟨T_−⟩ .   (7)

A similar equality holds for all the higher-order moments of stopping times of entropy production. These results follow from the stopping-time fluctuation theorem

p_{T_+}(t; s_tot) / p_{T_−}(t; −s_tot) = e^{s_tot/k_B} ,   (8)

which we derive in this paper for classical and continuous stochastic processes in steady state. Here p_{T_+}(t; s_tot) is the probability density for the stopping time T_+, and p_{T_−}(t; −s_tot) is the probability density for the stopping time T_−. The stopping-time fluctuation theorem (8) is illustrated in Fig. 1b) for the example where T_+ and T_− are first-passage times of entropy production with two absorbing boundaries. Other examples of stopping times are waiting times, defined as the time a stochastic trajectory takes while changing from an initial state I to a final state II, see Fig. 1c). We show in this paper that for a nonequilibrium and stationary isothermal process the ratio of waiting-time distributions corresponding to forward trajectories (I → II) and backward trajectories (II → I) obeys

p_{T_+^{I→II}}(t; −Q) / p_{T_−^{II→I}}(t; Q) = e^{−Q/(k_B T_env)}   (9)

for all trajectories between I and II that exchange the amount Q of heat with an equilibrated environment at temperature T_env; if Q > 0, the system absorbs heat from the environment. Here p_{T_+^{I→II}}(t; −Q) denotes the probability density for the waiting time T_+^{I→II} to reach the state II while absorbing the heat Q. Equation (9) is a generalization of the local detailed-balance condition for transition rates, k_+^{I→II}/k_−^{II→I} = e^{−Q/(k_B T_env)} [35][36][37], to waiting-time distributions.
Indeed, transition rates are given by k = ∫_0^∞ dt t^{−1} p_T(t). Notably, Eq. (9) implies a symmetry relation for the normalized waiting-time distributions,

p_{T_+^{I→II}}(t; −Q) / P_+^{I→II} = p_{T_−^{II→I}}(t; Q) / P_−^{II→I} ,   (10)

where P = ∫_0^∞ dt p_T(t). Therefore, the mean waiting times ⟨T⟩ = ∫_0^∞ dt t p_T(t)/P for the forward and backward transitions are the same, ⟨T_+^{I→II}⟩ = ⟨T_−^{II→I}⟩.

We derive all these results on infima, passage probabilities and stopping times of entropy production in a new unified formalism that uses the theory of martingales [38,39], and apply our results to the dynamics of colloidal particles in periodic potentials and of molecular motors, which transduce chemical energy into mechanical work. The paper is structured as follows. In Sec. II, we briefly review the formalism of stochastic thermodynamics. In Sec. III, we discuss the connection between martingale processes and entropy production. In Sec. IV, Sec. V, and Sec. VI we derive, respectively, the infimum law (1) and the bound (2); the statistics of the global infimum of entropy production (3)-(4) and the equalities for the passage probabilities (5)-(6); and fluctuation theorems for stopping times of entropy production, which include first-passage times of entropy production (8) and waiting times of stochastic processes (9). In Sec. VII we apply our results to a drifted colloidal particle moving in a periodic potential. In Sec. VIII, we apply our results to discrete molecular processes such as the stepping statistics of molecular motors and the dynamics of enzymatic reactions. The paper concludes with a discussion in Sec. IX.

II. STOCHASTIC THERMODYNAMICS AND ENTROPY PRODUCTION
We first briefly review the basic concepts of stochastic entropy production based on path probabilities in discrete time. We then present a measure-theoretic formalism of stochastic thermodynamics, which defines entropy production in discrete and continuous time. Using measure theory we avoid problems with the normalization of path probabilities.

A. Entropy production for processes in discrete time
We consider the dynamics of a mesoscopic system in a nonequilibrium steady state, and describe its dynamics with the coarse-grained state variables ω(t) = (q(t), q̃(t)) at time t. The variables q(t) represent n degrees of freedom that are even under time reversal, and the variables q̃(t) represent ñ degrees of freedom that are odd under time reversal [40]. Notably, the variables q(t) and q̃(t) represent the dynamics of collective modes in a system of interacting particles; for instance, q(t) describes the position of a colloidal particle in a fluid and q̃(t) its effective momentum.
In a given time window [0, t], the coordinates ω(t) trace a path in phase space, ω_0^t = {ω(τ)}_{0≤τ≤t}. We associate with each trajectory ω_0^t a probability density P(ω_0^t; p_init), which captures the limited information provided by the coarse-grained variables ω and the fact that the exact microstate is not known; the distribution p_init is the probability density of the initial state ω(0). The entropy production associated with a path ω_0^t of a stationary process is given by [7,8,11]

S_tot(t) = k_B ln [ P(ω_0^t; p_ss) / P(Θ_t ω_0^t; p̃_ss) ] ,   (11)

where Θ_t ω_0^t = {ω̃(τ)}_{0≤τ≤t} is the time-reversed trajectory with ω̃(τ) = (q(t − τ), −q̃(t − τ)); p_ss is the steady-state distribution of the forward dynamics; and p̃_ss is the steady-state distribution of the backward dynamics. Equation (11) is well-defined for discrete-time processes, for which ω_0^t is a discrete sequence of states. The ensemble average of entropy production of a stationary process can be expressed as

⟨S_tot(t)⟩ = k_B ∑_{ω_0^t} P(ω_0^t; p_ss) ln [ P(ω_0^t; p_ss) / P(Θ_t ω_0^t; p̃_ss) ] ≥ 0 .   (12)

Entropy production is therefore the observable that quantifies the time irreversibility of mesoscopic trajectories [41]. In fact, by measuring entropy production an observer can determine within a minimal time whether a movie of a stochastic process is run forwards or backwards [42]. Microscopic reversibility implies that a mesoscopic system in contact with an equilibrated environment satisfies local detailed balance [8,11,43,44]. Local detailed balance manifests itself in a condition on the path probabilities conditioned on the initial state, and reads

P(ω_0^t | ω(0)) / P(Θ_t ω_0^t | ω̃(0)) = e^{S_env(t)/k_B} ,   (13)

where S_env(t) is the entropy change in the environment. If local detailed balance holds, then our definition of entropy production (11) equals the total entropy change, i.e., the sum of the system-entropy change ∆S_sys [14] and the environment-entropy change S_env:

S_tot(t) = ∆S_sys(t) + S_env(t) ,   (14)

with

∆S_sys(t) = −k_B ln [ p_ss(ω(t)) / p_ss(ω(0)) ] .   (15)
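Equation (11) and the resulting integral fluctuation theorem can be verified exactly in a small discrete-time example. The sketch below (our toy chain, not from the paper; only even variables, so the forward and backward steady states coincide; k_B = 1) enumerates every path of a three-state Markov chain:

```python
import itertools
import math
import numpy as np

# Toy model (ours): discrete-time three-state Markov chain. Entropy
# production per path follows Eq. (11) with p_ss the stationary
# distribution; exact enumeration then gives <e^{-S_tot}> = 1 and
# <S_tot> >= 0 (kB = 1).
T = np.array([[0.1, 0.6, 0.3],
              [0.2, 0.3, 0.5],
              [0.5, 0.2, 0.3]])      # row-stochastic transition matrix

# stationary distribution: left eigenvector of T with eigenvalue 1
w, v = np.linalg.eig(T.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

L = 6
mean_S, mean_expS, total_p = 0.0, 0.0, 0.0
for path in itertools.product(range(3), repeat=L + 1):
    p_fwd = pi[path[0]] * math.prod(T[a, b] for a, b in zip(path, path[1:]))
    p_bwd = pi[path[-1]] * math.prod(T[b, a] for a, b in zip(path, path[1:]))
    S = math.log(p_fwd / p_bwd)      # Eq. (11), even variables only
    mean_S += p_fwd * S
    mean_expS += p_fwd * math.exp(-S)
    total_p += p_fwd

assert abs(total_p - 1.0) < 1e-12
assert abs(mean_expS - 1.0) < 1e-12  # integral fluctuation theorem
assert mean_S > 0.0                  # second law, Eq. (12)
```

The identity ⟨e^{−S_tot}⟩ = 1 holds because the weighted sum is exactly the total probability of the time-reversed paths.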
Note that in deriving Eq. (15) we have used that p̃_ss(ω̃) = p_ss(ω), where ω̃ denotes the time reversal of the state ω [11]. For a system in contact with one or several thermal baths, the environment-entropy change is related to the heat exchanged between system and environment [45].

B. Entropy production for processes in continuous time
In discrete time, the expressions Eqs. (11) and (12) for entropy production are well-defined [46]. In continuous time, the path-probability densities P are not normalizable. In order to avoid this problem, we use a formalism based on measure theory to define entropy production in continuous-time processes [38,39,[47][48][49][50]. Measure theory studies probabilities of events in terms of a probability space (Ω, F, P). The set Ω of all trajectories ω is called the sample space; the set F of all measurable subsets Φ of Ω is a σ-algebra; and the function P is a measure, which associates probabilities to subsets Φ. In the following we identify the symbol ω with the full trajectory of state variables over all times, ω = {q(τ), q̃(τ)}_{τ∈(−∞,∞)}.
The concept of a probability measure P(Φ) generalizes the path-probability densities P(ω). The value P(Φ) denotes the probability to observe a trajectory ω in the set Φ; in other words, P(Φ) = Pr(ω ∈ Φ). An example of a measure is

P(Φ) = ∫_Φ p(x) dλ(x) ,   (16)

where p(x) is a probability density of elements x in R^n.
Here, λ denotes the Lebesgue measure and the Lebesgue integral is over the set Φ. One can also define a probability density R(ω) of a measure P(Φ) = ∫_{ω∈Φ} dP with respect to a second probability measure Q(Φ) using the Radon-Nikodým theorem [48]:

P(Φ) = ∫_{ω∈Φ} R(ω) dQ ,   (17)

where the integral is over the probability space (Ω, F, Q) [48,50]. The function R(ω) is called the Radon-Nikodým derivative, which we denote by R(ω) = (dP/dQ)(ω). In Eq. (17), the function R(ω) generalizes the probability density p(x) to spaces for which the Lebesgue measure does not exist, e.g., the Wiener space of trajectories of a Brownian particle.
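The role of the Radon-Nikodým derivative as a generalized density can be illustrated with two Gaussian measures on the real line (our example, not from the paper): R is simply the ratio of the two densities, and P-expectations become Q-averages weighted by R.

```python
import numpy as np

# Illustration (ours): P = N(1, 1), Q = N(0, 4). The Radon-Nikodym
# derivative R = dP/dQ is the density ratio, and <f>_P = <f R>_Q.
x = np.linspace(-15.0, 15.0, 300001)
dx = x[1] - x[0]
p = np.exp(-(x - 1.0) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)   # density of P
q = np.exp(-x ** 2 / 8.0) / np.sqrt(8.0 * np.pi)           # density of Q
R = p / q                                                  # dP/dQ

norm = np.sum(R * q) * dx        # <R>_Q = P(Omega) = 1
mean_P = np.sum(x * R * q) * dx  # <x>_P recovered through Q
assert abs(norm - 1.0) < 1e-6
assert abs(mean_P - 1.0) < 1e-6
```

This change-of-measure identity is the same mechanism used below when integrals over P ∘ Θ are rewritten as integrals over P weighted by e^{−S_tot/k_B}.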
We now consider probability measures of steady-state processes. A stationary probability measure is time-translation invariant and satisfies P = P ∘ T_t, where T_t is the map that translates a trajectory ω by a time t as q(τ) → q(τ + t) and q̃(τ) → q̃(τ + t). A stochastic process X(ω; t) provides the value of an observable X at time t for a given trajectory ω. We denote the average or expectation value of the stochastic variable X(ω; t) by ⟨X(ω; t)⟩_P = ∫_{ω∈Ω} X(ω; t) dP. In the following, the stochastic process X(ω; t) is sometimes simply denoted by X(t) and its average by ⟨X(t)⟩.
Entropy production S_tot(ω; t) is an example of a stochastic process. An appropriate definition of entropy production, which generalizes Eq. (11) to include continuous-time processes, can be written using the Radon-Nikodým derivative of the measure P|_{F(t)} with respect to the time-reversed measure (P ∘ Θ)|_{F(t)} [51]:

S_tot(ω; t) = k_B ln [ dP|_{F(t)} / d(P ∘ Θ)|_{F(t)} ] (ω) .   (18)

Here P|_{F(t)} denotes the restriction of the measure P to those events in the sub-σ-algebra F(t) ⊂ F that is generated by trajectories ω_0^t in the time interval [0, t]. The time-reversed measure P ∘ Θ is defined using the time-reversal map Θ, which time reverses trajectories ω as q(t) → q(−t) and q̃(t) → −q̃(−t). Note that Eq. (18) is well-defined for continuous-time processes that may contain jumps.

III. MARTINGALE THEORY FOR ENTROPY PRODUCTION
A fundamental, but so far unexplored, property of entropy production is that in steady state its exponential e^{−S_tot(t)/k_B} is a positive and uniformly integrable martingale process. A process is called a martingale if its expected value at any time t equals its value at a previous time τ, when the expected value is conditioned on observations up to the time τ (see Appendix A). The process e^{−S_tot(t)/k_B} satisfies this property, and therefore obeys (see Appendix B)

⟨e^{−S_tot(t)/k_B} | ω_0^τ⟩ = e^{−S_tot(τ)/k_B}   (19)

for τ < t, where the average is conditioned on a particular trajectory ω_0^τ from time 0 up to time τ. From Eq. (19) it follows that martingale processes have a time-independent average. Interestingly, for e^{−S_tot(t)/k_B} this implies the integral fluctuation theorem. Indeed, using Eq. (19) for τ = 0 and S_tot(0) = 0, it follows that ⟨e^{−S_tot(t)/k_B}⟩ = 1 for arbitrary initial conditions [2,8,14,52]. On average the total entropy S_tot(t) always increases, and therefore it cannot be a martingale. However, entropy production is a submartingale with the property

⟨S_tot(t) | ω_0^τ⟩ ≥ S_tot(τ) .   (20)

Equation (20) follows from Eq. (19), together with Jensen's inequality and the fact that e^{−S_tot(t)/k_B} is a convex function of S_tot(t). From Eq. (20) it follows that the average entropy production is greater than or equal to zero for any initial condition. Note that this statement is stronger than ⟨S_tot(t)⟩ ≥ 0, where the brackets denote the steady-state ensemble average.
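The martingale property can be seen explicitly in a biased random walk (our minimal example, not from the paper; k_B = 1): each step multiplies e^{−S_tot} by (q/p)^{±1}, and the conditional expectation of this factor is exactly one, independent of the history, while the conditional drift of S_tot itself is non-negative.

```python
import math

# Toy check (ours): biased walk with step + w.p. p and - w.p. q = 1 - p.
# One step changes S_tot by +/- ln(p/q), so e^{-S_tot} is multiplied by
# (q/p)^{+/-1}. The conditional expectation of the factor is
# p*(q/p) + q*(p/q) = q + p = 1  ->  e^{-S_tot} is a martingale,
# while <dS_tot> = (p - q) ln(p/q) >= 0  ->  S_tot is a submartingale.
for p in (0.55, 0.7, 0.9):
    q = 1.0 - p
    factor = p * (q / p) + q * (p / q)
    assert abs(factor - 1.0) < 1e-12       # martingale, Eq. (19)
    drift = (p - q) * math.log(p / q)
    assert drift > 0.0                     # submartingale, Eq. (20)
print("martingale and submartingale properties verified")
```

Because the one-step factor has unit conditional expectation regardless of the past, the product over any number of steps keeps average one, which is the integral fluctuation theorem.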
A key property of martingales is Doob's maximal inequality (see Appendix A) [38,39]. For e^{−S_tot(t)/k_B} this inequality provides a bound on the cumulative distribution of its supremum [53]:

Pr ( sup_{0≤τ≤t} e^{−S_tot(τ)/k_B} ≥ λ ) ≤ ⟨e^{−S_tot(t)/k_B}⟩ / λ ,   (21)

with λ > 0. Equation (21) is a stronger condition than the well-known Markov inequality, Eq. (A7), and holds for steady-state processes in discrete time and for steady-state continuous-time processes with jumps.
Another key property of martingales is Doob's optional sampling theorem. For entropy production, this theorem generalizes Eq. (19) to averages conditioned on stochastic stopping times T ≤ t (see Appendix A):

⟨e^{−S_tot(t)/k_B} | ω_0^T⟩ = e^{−S_tot(T)/k_B} .   (22)

The stopping time T = T(ω) is the time at which a trajectory ω satisfies for the first time a certain criterion, and therefore differs for each realization ω; stopping times are a generalization of passage times. Equation (22) holds for steady-state processes in discrete time and for steady-state continuous-time processes with jumps. Equation (22) implies that the expected value of e^{−S_tot(t)/k_B}, over all trajectories for which the value of entropy production at the stochastic stopping time T is given by the value s_tot, equals e^{−s_tot/k_B}.

IV. THE INFIMUM LAW
Using the martingale property of e^{−S_tot(t)/k_B} we now derive the infimum law for entropy production, which holds for nonequilibrium processes in a steady state. From Eq. (21) and the integral fluctuation theorem, ⟨e^{−S_tot(t)/k_B}⟩ = 1, we find the following bound for the cumulative distribution of the supremum of e^{−S_tot(t)/k_B}:

Pr ( sup_{0≤τ≤t} e^{−S_tot(τ)/k_B} ≥ λ ) ≤ 1/λ   (23)

for λ ≥ 0. Equation (23) implies a lower bound on the cumulative distribution of the infimum of S_tot in a given time interval [0, t]:

Pr ( S_inf(t) ≥ −s ) ≥ 1 − e^{−s/k_B} ,   (24)

with s ≥ 0 and S_inf(t) = inf_{τ∈[0,t]} S_tot(τ). The right-hand side of Eq. (24) is the cumulative distribution of an exponential random variable S with probability density p_S(s) = e^{−s}. From Eq. (24) it thus follows that the random variable −S_inf(t)/k_B is stochastically dominated by S, and this implies an inequality on the mean values of the corresponding random variables, as we show in Appendix C. From Eq. (24) we thus find the following universal bound for the mean infimum of entropy production at time t:

⟨S_inf(t)⟩ ≥ −k_B .   (25)

The infimum law given by Eq. (25) holds for stationary stochastic processes in discrete time and for stationary stochastic processes in continuous time for which e^{−S_tot(t)/k_B} is right continuous. For the special case of isothermal processes the total entropy change is S_tot(t) = ∆S_sys(t) − Q(t)/T_env = ∆S_sys(t) + (W(t) − ∆E(t))/T_env, with Q(t) denoting the heat absorbed by the system from the reservoir, W(t) the work done on the system, and ∆E(t) the internal-energy change of the system. We thus have a bound on the infimum of the dissipated part of the work W_diss, which reads

⟨ inf_{τ∈[0,t]} W_diss(τ) ⟩ ≥ −k_B T_env .   (26)

Here we have defined the dissipated work W_diss(t) = W(t) − ∆F(t), with ∆F(t) = ∆E(t) − T_env ∆S_sys(t). For a process that is isothermal and for which all states have the same energy and entropy we have ∆F(t) = 0, and thus

⟨W_inf(t)⟩ = −⟨Q_sup(t)⟩ ≥ −k_B T_env ,   (27)

with W_inf(t) the infimum of the work done on the system, and Q_sup(t) the supremum of the heat absorbed by the system over a time t.
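The bound of Eq. (24) can be probed by simulation. In the sketch below (ours, not from the paper; k_B = 1) entropy production is modeled as a drifted Brownian motion with variance rate twice its drift; the margin accounts for Monte Carlo noise, while time discretization only makes the sampled infimum less negative and hence the check conservative.

```python
import numpy as np

# Monte Carlo check (ours) of Pr(S_inf(t) >= -s) >= 1 - e^{-s}, Eq. (24),
# for S(t) = mu*t + sqrt(2*mu)*W(t) with kB = 1.
rng = np.random.default_rng(1)
mu, dt, n_steps, n_traj = 1.0, 0.002, 2500, 20000   # time window t = 5
S = np.zeros(n_traj)
S_inf = np.zeros(n_traj)
for _ in range(n_steps):
    S += mu * dt + np.sqrt(2 * mu * dt) * rng.standard_normal(n_traj)
    np.minimum(S_inf, S, out=S_inf)

for s in (0.25, 0.5, 1.0, 2.0):
    lhs = np.mean(S_inf >= -s)
    assert lhs >= 1.0 - np.exp(-s) - 0.02   # small Monte Carlo margin
print("cumulative infimum bound verified")
```

At long times the bound saturates, so the empirical curve approaches 1 − e^{−s} from above.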
Equation (27) implies that on average a homogeneous system in isothermal conditions can absorb no more than k B T env of energy from the thermal reservoir.

V. PASSAGE PROBABILITIES AND GLOBAL INFIMUM OF ENTROPY PRODUCTION
We now derive, using the theory of martingales, general expressions for the passage probabilities and the global infimum of entropy production in continuous steady-state processes without jumps.
A. Passage probabilities of entropy production with two asymmetric absorbing boundaries

We consider the stochastic entropy production S_tot(ω; t) of a stationary probability measure P in a time interval [0, T^{(2)}(ω)], which starts at t = 0 and ends at a stopping time T^{(2)}(ω). Here T^{(2)} is the first-passage time at which S_tot(ω; t) passes for the first time one of the two threshold values −s_tot^− < 0 or s_tot^+ > 0 (see Fig. 1(b) for the particular case s_tot^+ = s_tot^−). We define the passage probability P_+^{(2)} as the probability that the process first passes s_tot^+ before passing −s_tot^−, and analogously, P_−^{(2)} as the probability that the process first passes −s_tot^− before passing s_tot^+. These passage probabilities can be written as

P_+^{(2)} = P(Φ_+) ,   (28)
P_−^{(2)} = P(Φ_−) ,   (29)

with Φ_+ the set of trajectories ω that first pass the positive threshold s_tot^+, and Φ_− the set of trajectories ω that first pass the negative threshold −s_tot^−:

Φ_+ = { ω ∈ Ω : S_tot(ω; T^{(2)}(ω)) = s_tot^+ } ,   (30)
Φ_− = { ω ∈ Ω : S_tot(ω; T^{(2)}(ω)) = −s_tot^− } .   (31)

Note that if s_tot^+ is different from s_tot^−, then Φ_+ and Φ_− are not each other's time reversal. Therefore the probabilities of these sets are in general not related by local detailed balance. We also define the conjugate probabilities P̃_+^{(2)} and P̃_−^{(2)} of the sets Φ_+ and Φ_− under the time-reversed dynamics:

P̃_+^{(2)} = (P ∘ Θ)(Φ_+) ,   (32)
P̃_−^{(2)} = (P ∘ Θ)(Φ_−) .   (33)

For a steady-state process out of equilibrium, i.e., ⟨S_tot(t)⟩ > 0, S_tot(t) passes in a finite time one of the two boundaries with probability one. We thus have

P_+^{(2)} + P_−^{(2)} = 1 ,   (34)
P̃_+^{(2)} + P̃_−^{(2)} = 1 .   (35)

In addition, we derive, using Doob's optional sampling theorem, the following two identities:

P̃_+^{(2)} = e^{−s_tot^+/k_B} P_+^{(2)} ,   (36)
P̃_−^{(2)} = e^{s_tot^−/k_B} P_−^{(2)} .   (37)

Eq. (36) follows from the equalities

P̃_+^{(2)} = ∫_{ω∈Φ_+} d(P ∘ Θ)   (38)
= ∫_{ω∈Φ_+} e^{−S_tot(ω;+∞)/k_B} dP   (39)
= ∫_{ω∈Φ_+} e^{−S_tot(ω;T^{(2)}(ω))/k_B} dP   (40)
= e^{−s_tot^+/k_B} P_+^{(2)} ,   (41)

and Eq. (37) follows analogously. In Eq. (39) we transform an integral over the measure P ∘ Θ into an integral over the measure P, using the definition of entropy production, given by Eq. (18), and e^{−S_tot(ω;+∞)/k_B} = lim_{t→+∞} e^{−S_tot(ω;t)/k_B} (see Appendix B). In Eq. (40) we replace e^{−S_tot(ω;+∞)/k_B} by its value at the stopping time, e^{−S_tot(ω;T^{(2)}(ω))/k_B}, using Doob's optional sampling theorem, given by Eq. (22). Finally, in Eq. (41) we use the fact that for continuous processes S_tot(ω; T^{(2)}(ω)) = s_tot^+ for all realizations ω in the set Φ_+.
From Eqs. (34)-(37) we find the following explicit expressions for the passage probabilities:

P_+^{(2)} = (1 − e^{−s_tot^−/k_B}) / (1 − e^{−(s_tot^+ + s_tot^−)/k_B}) ,   (44)
P_−^{(2)} = e^{−s_tot^−/k_B} (1 − e^{−s_tot^+/k_B}) / (1 − e^{−(s_tot^+ + s_tot^−)/k_B}) .   (45)

For the case of symmetric boundaries, s_tot = s_tot^+ = s_tot^−, they reduce to

P_+^{(2)} = 1 / (1 + e^{−s_tot/k_B}) ,   (46)
P_−^{(2)} = 1 / (1 + e^{s_tot/k_B}) .   (47)

We can also discuss the limits where one of the two thresholds moves to infinity, whereas the other threshold remains finite. This corresponds to a process with one absorbing boundary. If the lower threshold s_tot^− ≫ k_B, the process ends with probability one at the positive threshold, P_+^{(2)} = 1 and P_−^{(2)} = 0, in accordance with the second law of thermodynamics. If however the upper threshold becomes large, s_tot^+ ≫ k_B, entropy production can still reach the positive threshold, since on average entropy always increases, but with a probability that depends on s_tot^−. In this case the passage probabilities are given by

P_+^{(2)} = 1 − e^{−s_tot^−/k_B} ,   (48)
P_−^{(2)} = e^{−s_tot^−/k_B} .   (49)

From these limits we can also determine the passage probabilities P_+^{(1)} and P_−^{(1)} of entropy production with one absorbing boundary. They denote, respectively, the probability to ever reach a positive boundary s_tot or a negative boundary −s_tot:

P_+^{(1)} = 1 ,   (50)
P_−^{(1)} = e^{−s_tot/k_B} .   (51)

The above arguments also hold for sets Φ_{+,I} and Φ_{−,I} of trajectories ω that are conditioned on an initial coarse-grained state I. They are defined as the subsets of, respectively, Φ_+ and Φ_− with the additional constraint that the initial state falls in the coarse-grained state I, ω(0) ∈ I. With these definitions, Eqs. (44)-(45) can be generalized to passage probabilities of entropy production conditioned on the initial state, see Appendix D. Note that this generalization holds for coarse-grained states that are invariant with respect to time reversal, i.e., I = Θ(I).
In Fig. 2 we illustrate the expressions for the passage probabilities, given by Eqs. (44) and (45), by plotting ln(P_+^{(2)}/P_−^{(2)}), which is given by

P_+^{(2)} / P_−^{(2)} = (e^{s_tot^−/k_B} − 1) / (1 − e^{−s_tot^+/k_B}) .   (52)

In situations for which P_+^{(2)} = P_−^{(2)} = 1/2, the stopping process is unbiased, and the probability to reach the threshold s_tot^+ equals the probability to reach the threshold −s_tot^−. Since entropy production is a stochastic process with positive drift, the passage probabilities can be equal only if the negative threshold lies closer to the origin than the positive threshold, s_tot^− < s_tot^+. Additionally, it follows from Eq. (52) that for P_−^{(2)} = 1/2 the negative threshold obeys s_tot^− ≤ k_B ln 2, as is illustrated in Fig. 2. This bound on s_tot^− can also be generalized to other values of the passage probability P_−^{(2)}, for which the lower threshold must satisfy s_tot^− ≤ −k_B ln P_−^{(2)}. The discussion of stopping events of entropy production with two boundaries is an example of thermodynamics of symmetry breaking, which is usually discussed at finite times [54]. Here we find the analogous relation ±s_tot^± ≥ k_B ln P_±^{(2)}, which is valid for stopping times.
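The unbiased case P_−^{(2)} = 1/2 admits a closed form (our algebra, with k_B = 1 and function names ours): setting the ratio of Eq. (52) to one gives s_tot^+ = −ln(2 − e^{s_tot^−}), which only has a solution for s_tot^− < ln 2, consistent with the bound stated above.

```python
import math

# Sketch (ours): closed-form threshold pair for an unbiased splitting,
# s_plus = -ln(2 - e^{s_minus}), checked against Eq. (45) with kB = 1.
def P_minus(s_plus, s_minus):
    return (math.exp(-s_minus) * (1.0 - math.exp(-s_plus))
            / (1.0 - math.exp(-(s_plus + s_minus))))

for s_minus in (0.1, 0.3, 0.6):
    s_plus = -math.log(2.0 - math.exp(s_minus))
    assert s_plus > s_minus                        # positive drift: s+ > s-
    assert abs(P_minus(s_plus, s_minus) - 0.5) < 1e-12

# the required positive threshold diverges as s_minus approaches ln 2
assert -math.log(2.0 - math.exp(0.693)) > 8.0
```

The divergence of s_tot^+ as s_tot^− → k_B ln 2 reflects that an unbiased splitting with a deeper negative threshold is impossible for a process with positive entropy-production drift.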

B. Global infimum of entropy production
The global infimum of entropy production S_inf^∞ is the lowest value of entropy production over all times t ≥ 0 during one realization of the process. The global infimum can be defined in terms of the local infimum S_inf(t) as

S_inf^∞ = lim_{t→∞} S_inf(t) .   (53)

Therefore, the global infimum is always non-positive and smaller than or equal to the local infima, S_inf^∞ ≤ S_inf(t). The statistics of the global infimum follow from the expressions for the passage probabilities (44)-(45). This can be most easily understood in terms of the cumulative distribution of the global infimum:

Pr ( S_inf^∞ ≥ −s ) = Pr ( S_tot(t) > −s for all t ≥ 0 ) .   (54)

The right-hand side of Eq. (54) is the survival probability for entropy production in a process with one absorbing boundary located at −s, with s ≥ 0. This survival probability is the passage probability P_+^{(2)} in the limit s_tot^+ → ∞, and therefore

Pr ( S_inf^∞ ≥ −s ) = 1 − e^{−s/k_B} .   (55)

The mean of the global infimum therefore is

⟨S_inf^∞⟩ = −k_B .   (56)

These properties of the global infimum hold for continuous processes in steady state. The infimum law, given by Eq. (25), thus becomes an equality at large times. Since S_inf(t) ≥ S_inf^∞, the equalities for the global infimum, given by Eqs. (55) and (56), valid for continuous processes, imply the inequalities for the local infima, given by Eqs. (24) and (25). Note however that Eqs. (24) and (25) are also valid for processes in discrete time and for processes in continuous time with jumps.
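The stated properties of the exponential global-infimum distribution (mean −k_B, standard deviation k_B, half of the probability mass in [−k_B ln 2, 0]) can be confirmed by direct numerical integration of the density (our check; k_B = 1):

```python
import numpy as np

# Numerical moments (ours) of the exponential density of -S_inf^infinity,
# p(s) = e^{-s} for s >= 0, in units of kB = 1.
s = np.linspace(0.0, 60.0, 600001)
dx = s[1] - s[0]
p = np.exp(-s)

mean = np.sum(s * p) * dx            # first moment -> 1, so <S_inf> = -1
second = np.sum(s ** 2 * p) * dx     # second moment -> 2
std = np.sqrt(second - mean ** 2)    # -> 1, i.e. one kB
mass = np.sum(p[s <= np.log(2)]) * dx  # mass of [-ln 2, 0] -> 1/2

assert abs(mean - 1.0) < 1e-6
assert abs(std - 1.0) < 1e-6
assert abs(mass - 0.5) < 1e-3
```

These are just the textbook moments of an exponential distribution; the point is that they reproduce exactly the numbers quoted in the text.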
Remarkably, the distribution of the global infimum of entropy production is universal. For any continuous steady-state process the distribution of the global infimum is an exponential with mean equal to −k B .

VI. STOPPING TIMES OF ENTROPY PRODUCTION
In this section we derive fluctuation theorems for stopping times of entropy production using the martingale property of e^{−S_tot(t)/k_B}. The stopping-time fluctuation theorem entails fluctuation theorems for first-passage times of entropy production and for waiting times of a stochastic process.

A. Stopping-time fluctuation theorem
We consider the statistics of s_tot-stopping times T_+ = T(ω) for which entropy production at the stopping time takes the value s_tot, i.e., S_tot(T_+) = s_tot with s_tot > 0. An example of such an s_tot-stopping time is the first-passage time T_+^{(1)}, which determines the time at which entropy production S_tot(t) reaches for the first time the value s_tot > 0. Another example is given by the first-passage time T_+^{(2)}, which is the time at which entropy production S_tot(t) passes for the first time a threshold value s_tot, given that it has not reached −s_tot before. The latter process is therefore equivalent to a first-passage problem with two absorbing boundaries. More generally, s_tot-stopping times T_+^{(n)} can be defined by multiple threshold crossings and the condition S_tot(T_+^{(n)}) = s_tot, with n the order of threshold crossings.
We derive the following general fluctuation theorem for s_tot-stopping times T_+ (see Appendix E):

P ( Θ_{T_+}(Φ_{T_+≤t}) ) = e^{−s_tot/k_B} P ( Φ_{T_+≤t} ) ,   (57)

where P(Φ_{T_+≤t}) is the probability to observe a trajectory ω that satisfies the stopping-time criterion at a time T_+ ≤ t, and Φ_{T_+≤t} denotes the set of these trajectories. The set Θ_{T_+}(Φ_{T_+≤t}) contains the time-reversed trajectories of Φ_{T_+≤t}; it is generated by applying the time-reversal map Θ_{T_+} to all the elements of the original set. The map Θ_{T_+} = T_{T_+} ∘ Θ time reverses trajectories ω with respect to the reference time T_+(ω)/2, and thus X(Θ_{T_+}(ω); τ) = X(ω; T_+(ω) − τ) for all stochastic processes X that are even under time reversal. The fluctuation theorem for s_tot-stopping times, Eq. (57), is valid for continuous and stationary stochastic processes. In our derivation, given in Appendix E, we use the martingale property of e^{−S_tot(t)/k_B}. The probability density p_{T_+} of the s_tot-stopping time T_+ is given by

p_{T_+}(t; s_tot) = (d/dt) P ( Φ_{T_+≤t} ) .   (58)

Entropy production is odd under time reversal, S_tot(Θ_{T_+}(ω); T_+(ω)) = −S_tot(ω; T_+(ω)), as shown in Appendix B 3. Therefore, we can associate to the s_tot-stopping time T_+ a (−s_tot)-stopping time T_−, defined on the time-reversed trajectories by

T_−(Θ_{T_+}(ω)) = T_+(ω) .   (59)

For example, the (−s_tot)-stopping time T_− associated to the first-passage time T_+^{(1)} is the first-passage time T_−^{(1)} at which entropy production first reaches −s_tot, see Fig. 3a). Analogously, the (−s_tot)-stopping time T_− associated to the first-passage time T_+^{(2)} is the first-passage time T_−^{(2)} at which entropy production first reaches −s_tot without having reached s_tot before, see Fig. 3b).
We can thus identify the distribution of T_− with the measure of time-reversed trajectories:

p_{T_−}(t; −s_tot) = (d/dt) P ( Θ_{T_+}(Φ_{T_+≤t}) ) .   (60)

This equation can be applied to all pairs of stopping times T_+ and T_− related by Eq. (59). From Eqs. (57), (58) and (60) follows the stopping-time fluctuation theorem for entropy production:

p_{T_+}(t; s_tot) / p_{T_−}(t; −s_tot) = e^{s_tot/k_B} .   (61)

The stopping-time fluctuation theorem for entropy production, given by Eq. (61), generalizes the results derived in [42] for first-passage times. Equation (61) implies two interesting results for stopping times of entropy production that are outlined below.

B. Symmetry of the normalized stopping-time distributions
The stopping-time fluctuation relation Eq. (61) implies an equality between the normalized stopping-time distributions p_{T+}(t|s_tot) and p_{T−}(t|−s_tot), which reads

p_{T+}(t|s_tot) = p_{T−}(t|−s_tot) .    (62)

The normalized distributions are defined as p_{T±}(t|±s_tot) = p_{T±}(t)/P_±, with P_± the corresponding passage probabilities. The symmetry relation Eq. (62) follows from the fact that the ratio of the stopping-time distributions in Eq. (61) is time independent. The stopping-time fluctuation theorem (61) thus implies that the mean stopping time, given that the process terminates at the positive boundary, equals the mean stopping time, given that the process terminates at the negative boundary:

⟨T_+⟩ = ⟨T_−⟩ .

This remarkable symmetry extends to all the moments of the stopping-time distributions. A similar result has been found for waiting-time distributions in chemical kinetics [55][56][57][58][59][60][61], for cycle-time distributions in Markov chains [60,62], and for decision-time distributions in sequential hypothesis tests [42,63]. These results can therefore be interpreted as a consequence of the fundamental relation for stopping-time fluctuations of entropy production given by Eq. (62).

C. Passage probabilities for symmetric boundaries
Equation (61) implies a relation for the stopping probabilities of entropy production:

P_+ / P_− = e^{s_tot/k_B} .    (66)

Stopping probabilities are the probabilities that the process satisfies the stopping criterion in a finite time; they are defined as P_± = ∫_0^∞ dt p_{T±}(t). Equation (66) follows directly from integrating Eq. (61) over time. The relations (50) and (51) for passage probabilities of first-passage times, with one and with two absorbing boundaries, are examples of stopping-time statistics that satisfy Eq. (66).
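The structure of these passage probabilities can be made concrete with a short calculation. The following sketch assumes the two-boundary expressions obtained by applying the optional-stopping theorem to the martingale e^{−S_tot/k_B} (consistent with, though not copied verbatim from, Eqs. (5) and (6)):

```python
import numpy as np

# Optional-stopping argument for two symmetric thresholds +/- s_tot (s = s_tot/k_B):
#   P_+ e^{-s} + P_- e^{+s} = 1   and   P_+ + P_- = 1.
# Solving this linear system gives the expressions below; their ratio is e^{s},
# in agreement with Eq. (66).
s = np.array([0.1, 0.5, 1.0, 3.0])
P_plus = (np.exp(s) - 1.0) / (np.exp(s) - np.exp(-s))
P_minus = (1.0 - np.exp(-s)) / (np.exp(s) - np.exp(-s))
print(P_plus + P_minus)      # normalization: all ones
print(P_plus / P_minus)      # equals exp(s) entry by entry
```

The ratio P_+/P_− = e^{s} holds identically in s, which is the content of Eq. (66) for symmetric boundaries.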

D. Fluctuation relation for waiting times
An interesting question, closely related to entropy stopping times, is the following: what is the waiting time T^{I→II} a process takes to travel from a state I to a state II? We derive here exact relations characterizing (±s_tot)-waiting times T^{I→II}_±. The (±s_tot)-waiting time T^{I→II}_± denotes the time a process takes to travel from a state I to a state II while producing a total entropy ±s_tot (see Appendix E 4).
Following Kramers [64], we define states as points in phase space, i.e., I = {q I } and II = {q II }. In the Appendix E 4, we also consider the more general case for which states consist of sets of points, which may also contain odd-parity variables.
We first derive a generalized fluctuation theorem which applies to trajectories starting from a given initial state I (see Appendix E 3), Eq. (69), with Γ_I the set of trajectories ω for which ω(0) ∈ I. We use Eq. (69) to derive a fluctuation theorem for waiting times (see Appendix E 4), Eq. (70), where s_env = s_tot − k_B ln[p_ss(q_I)/p_ss(q_II)] is the change in the environment entropy during the transition from state I to state II. Equation (70) relates the waiting-time distributions between two states to the environment-entropy change along trajectories connecting both states.
We normalize the distributions in Eq. (70) and find a relation for the normalized waiting-time distributions, Eq. (71), and for the associated passage probabilities, Eq. (72). Interestingly, the relations Eqs. (70), (71) and (72) are similar to the stopping-time relations (61), (62) and (66) discussed above. However, in Eqs. (70) and (72) the environmental entropy production appears instead of the total entropy production, because the trajectories are conditioned on passage through the initial and final states. For isothermal processes, s_env = −Q/T_env, with Q the heat absorbed by the system and T_env the temperature of the environment.

VII. APPLICATION TO SIMPLE COLLOIDAL SYSTEMS
Infima of entropy production fluctuations and stopping times can be calculated for specific stochastic processes. In this section we discuss these quantities for the dynamics of a colloidal particle with diffusion coefficient D that moves in a periodic potential V, with period ℓ, under the influence of a constant external force F [65,66] (see Fig. 4 for a graphical illustration). This process has been realized in several experiments using colloidal particles trapped with toroidal optical potentials [67][68][69][70]. We discuss how our results can be tested in this type of experiment.
We describe the dynamics of this colloidal particle in terms of a one-dimensional overdamped Brownian motion with periodic boundary conditions. The state of the particle at time t is characterized by a phase variable φ(t) ∈ [0, ℓ). In the illustration of a ring geometry in Fig. 4, φ is the azimuthal angle and ℓ = 2π. Equivalently, one can consider a stochastic process X(t) given by the net distance traveled by the Brownian particle up to time t: X(t) = φ(t) + ℓN(t), where N(t) is the winding number, i.e., the net number of clockwise turns (or minus the number of counterclockwise turns) of the particle up to time t [71]. The time evolution of X(t) obeys the Langevin equation

dX(t)/dt = v − V′(X(t))/γ + ζ(t) ,    (73)

where γ is a friction coefficient, v = F/γ is the drift velocity, and ζ is a Gaussian white noise with zero mean, ⟨ζ(t)⟩ = 0, and with autocorrelation ⟨ζ(t)ζ(t′)⟩ = 2D δ(t − t′). If the Einstein relation holds, D = k_B T_env/γ, with T_env the temperature of the thermal reservoir. Here V(x) is the periodic potential of period ℓ, V(x + ℓ) = V(x).
The steady-state entropy production after a time t is given by Eq. (74) [65,72]. For a drift-diffusion process, with V(x) = 0, the total entropy production reads

S_tot(t) = (k_B v/D) X(t) .    (75)

Equation (75) implies that the first-passage and extreme-value statistics of entropy production in the drift-diffusion process follow from the statistics of the position X(t) of a drifting Brownian particle on the real line. The drift-diffusion process is an example for which the infimum and first-passage statistics of entropy production can be calculated analytically. We also consider a Smoluchowski-Feynman ratchet whose dynamics is given by Eq. (73) with a nonzero potential, as illustrated in Fig. 4. For the potential (76) with V_0 = k_B T_env, the stochastic entropy production in steady state, given by Eq. (74), takes the explicit form reported in [72]. In the following we present analytical results for the infima and passage statistics of entropy production in the drift-diffusion process, and simulation results for the Smoluchowski-Feynman ratchet with the potential (76). The simulation results are for parameters corresponding to experimental conditions in optical-tweezers experiments [68][69][70], namely, a spherical polystyrene Brownian particle of radius 1 µm immersed in water at room temperature.

FIG. 4: Smoluchowski-Feynman ratchet. A Brownian particle (gray sphere) immersed in a thermal bath of temperature T_env moves in a periodic potential V(φ) (black shaded curve) with friction coefficient γ. The coordinate φ is the azimuthal angle of the particle. When an external force F = γv is applied in the azimuthal direction, the particle reaches a nonequilibrium steady state. In this example, ℓ = 2π.
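As a minimal numerical illustration of Eq. (75) (with illustrative parameters, not the experimental values quoted above), one can integrate the drift-diffusion limit with the Euler-Maruyama scheme and compare the mean entropy production with σ(t) = (v²/D)t:

```python
import numpy as np

# Drift-diffusion sketch (V = 0): integrate the Langevin equation by
# Euler-Maruyama and compute S_tot(t) = k_B v X(t)/D, Eq. (75).
rng = np.random.default_rng(2)
kB, v, D = 1.0, 1.0, 1.0
dt, n_steps, n_traj = 1e-3, 2000, 5000

X = np.zeros(n_traj)
for _ in range(n_steps):
    X += v * dt + np.sqrt(2 * D * dt) * rng.standard_normal(n_traj)

t = n_steps * dt
S_tot = kB * v * X / D
print(S_tot.mean(), v**2 * t / D)   # mean entropy production vs sigma(t) = v^2 t / D
```

The sample mean of S_tot(t) agrees with σ(t) = (v²/D)t up to statistical error.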

A. Infimum statistics
We now study the infimum properties of entropy production for the Smoluchowski-Feynman ratchet in steady state. In Fig. 5 we show that the cumulative distribution of the local entropy-production infimum S_inf(t) is bounded from below by 1 − e^{−s}, which confirms the universality of Eq. (2). We compare analytical results for the drift-diffusion process (V(x) = 0, dashed lines) with numerical results for the Smoluchowski-Feynman ratchet (with the potential V(x) given by Eq. (76), solid lines), for different values of the mean entropy production σ(t) = ⟨S_tot(t)⟩/k_B. The analytical expression for the cumulative distribution of S_inf for the drift-diffusion process is given by Eq. (78) (see Appendix F), where s > 0, erfc(x) = (2/√π) ∫_x^{+∞} e^{−y²} dy is the complementary error function, and σ(t) is the average entropy production in steady state at time t, which for the drift-diffusion process is σ(t) = (v²/D)t. Interestingly, the bound saturates for large values of the average entropy production σ(t), which illustrates the universal equality for the distribution of the global infimum of entropy production, given by Eq. (3). Remarkably, as shown in Fig. 5, the cumulative distribution of the infimum of entropy production of the Smoluchowski-Feynman ratchet is nearly identical for different shapes of the potential V(x). This equivalence between the infimum cumulative distributions holds even for small values of σ(t), where the shape of the potential V(x) affects the entropy-production fluctuations.
Secondly, in Fig. 6 we illustrate the infimum law, ⟨S_inf(t)⟩ ≥ −k_B, for the Smoluchowski-Feynman ratchet. We show the average local infimum ⟨S_inf(t)⟩ as a function of the mean entropy production σ(t): we compare analytical results for the drift-diffusion process without potential (dashed lines) with numerical results for the Smoluchowski-Feynman ratchet with the potential given by Eq. (76) (solid lines). The analytical expression for the drift-diffusion process is given by Eq. (79) (see Appendix F), where erf(x) = 1 − erfc(x) is the error function. In the limit of large times, the average global infimum is ⟨S^∞_inf⟩ = −k_B, in accordance with the universal equality (4). The results in Fig. 6 show that the mean local infimum of entropy production has the same functional dependence on σ(t) independent of the potential V(x). This points towards a universal behaviour of the statistics of local infima of entropy production.
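Both the infimum law and the exponential bound can be probed with a simple simulation of the drift-diffusion entropy model (an illustrative sketch; the discretized running minimum slightly underestimates the true infimum depth, so the bounds are respected a fortiori):

```python
import numpy as np

# Infimum-law sketch: track the running minimum of the drift-diffusion entropy
# model (v_S = D_S, units of k_B = 1) and test <S_inf(t)> >= -k_B together with
# the exponential bound Pr(S_inf(t) <= -s) <= exp(-s/k_B).
rng = np.random.default_rng(3)
vS = DS = 1.0
dt, n_steps, n_traj = 1e-3, 4000, 20000

S = np.zeros(n_traj)
S_inf = np.zeros(n_traj)                  # infimum is <= 0 since S(0) = 0
for _ in range(n_steps):
    S += vS * dt + np.sqrt(2 * DS * dt) * rng.standard_normal(n_traj)
    np.minimum(S_inf, S, out=S_inf)

print(S_inf.mean())                        # lies in (-1, 0], infimum law
probs = [np.mean(S_inf <= -s) for s in (0.5, 1.0, 2.0)]
print(probs, [np.exp(-s) for s in (0.5, 1.0, 2.0)])
```

At σ(t) = 4 the mean infimum is already close to its asymptotic value −k_B, and each empirical tail probability stays below the corresponding e^{−s}.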
B. Passage probabilities and first-passage times

Symmetric boundaries
We illustrate our universal results on passage probabilities and first-passage times for the Smoluchowski-Feynman ratchet.
For the drift-diffusion process (V (x) = 0), we recover the first-passage-time fluctuation theorem for entropy production, given by Eq. (8), from the analytical expressions of the first-passage-time distributions for the position of a Brownian particle (see Appendix G).
We also compute, using numerical simulations, the first-passage-time statistics for entropy production of the Smoluchowski-Feynman ratchet in steady state and with the potential V(x) given by Eq. (76). First we study the first-passage times T^{(2)}_± for entropy production with two absorbing boundaries at the threshold values s_tot and −s_tot (with s_tot > 0). Figure 7 shows the empirical first-passage-time distributions p_{T^{(2)}_+} and p_{T^{(2)}_−}; with these distributions we confirm the validity of the first-passage-time fluctuation theorem given by Eq. (8). Moreover, the functional dependency of the empirical passage probabilities P^{(2)}_± can be obtained using the method in [73,74]. As a result, the integral first-passage-time fluctuation theorem given by Eq. (66) is also fulfilled in this example (see bottom inset in Fig. 7). As a second case we consider two one-boundary first-passage problems for entropy production. We study the first-passage times T^{(1)}_±. Figure 8 shows empirical estimates of these first-passage-time distributions and confirms the validity of the first-passage-time fluctuation theorem given by Eq. (8). In the top inset of Fig. 8 we show that the passage probabilities P^{(1)}_± also satisfy the integral fluctuation theorem.

Asymmetric boundaries
We now discuss the passage statistics of entropy production with asymmetric boundaries. In Appendix G we discuss the drift-diffusion process, whereas here we discuss the Smoluchowski-Feynman ratchet with the potential given by Eq. (76). In Fig. 9(a) we show the distributions of the first-passage times T^{(2)}_± for entropy production with two asymmetric absorbing boundaries s^+_tot and −s^−_tot.

FIG. 9: a) First-passage-time distributions for entropy production with two asymmetric absorbing boundaries, shown as in Fig. 7. b) Passage probabilities for entropy production: P^{(2)}_+ (blue filled squares), to first reach the positive threshold s^+_tot given that it has not reached −s^−_tot before, and P^{(2)}_− , to first reach −s^−_tot given that it has not reached s^+_tot before. The inset shows the fraction of trajectories starting from X_I (X_II) at t = 0 that reach X_II (X_I) at a later time t > 0 without returning to X_I (X_II) before. The solid line is a straight line of slope 1. The data in the inset are obtained from 10^5 simulations starting from state I, and 10^5 simulations starting from state II.
In Fig. 9(b) we show numerical results for the entropy-production passage probabilities P^{(2)}_± with asymmetric boundaries.

C. Fluctuation theorem in waiting times
We illustrate the waiting-time fluctuation theorem, given by Eq. (9) (or Eq. (70)), on the Smoluchowski-Feynman ratchet. We compute, using numerical simulations, the waiting times along forward trajectories I → II and backward trajectories II → I between two states characterised, respectively, by the coordinates X = X_I and X = X_II, as illustrated in Fig. 10a). In agreement with the fluctuation theorem for waiting times, we find that the normalized distribution p_{T^{I→II}_+}(t|s_env) is equal to the normalized distribution p_{T^{II→I}_−}(t|−s_env) (see Fig. 10b)). Here the environment-entropy change is determined by the heat exchanged between system and environment, i.e., s_env = −Q/T_env = F(X_II − X_I)/T_env + (V(X_I) − V(X_II))/T_env. In the inset of Fig. 10b) we show simulation results for the ratio of passage probabilities, which are in agreement with our theoretical result Eq. (72), i.e., P^{I→II}_+/P^{II→I}_− = e^{−Q/(k_B T_env)}.

VIII. APPLICATIONS TO ACTIVE MOLECULAR PROCESSES
In contrast to the driven colloidal particle discussed in the last section, which is best viewed as a continuous stochastic process, many biochemical processes are described in terms of transitions between discrete states. Examples are the motion of a molecular motor on a substrate with discrete binding sites, or a chemical reaction that turns reactants into products. The statistics of waiting times of discrete processes can be obtained by a coarse-graining procedure applied to continuous processes. We apply our theory to a chemically driven hopping process with one degree of freedom and to the dynamics of RNA polymerases, described by two degrees of freedom.

A. From continuous to discrete processes
We can apply the theory developed above to systems which progress in discrete steps using a coarse-graining procedure [75,76]. We consider a single continuous mesoscopic degree of freedom X, which can be used to describe a stochastic cyclic process. The variable X can be interpreted as a generalized reaction coordinate or the position coordinate of a molecular motor, and obeys the Langevin equation (73) with the same noise correlations. The effective potential V (X) now describes the free energy profile along a chemical reaction or position coordinate of a molecular motor.
We coarse grain the continuous variable X by considering the events when the particle passes discrete points at positions X_M. The transition times between these points are examples of waiting times, similar to those in Kramers' theory [64]. An example is shown in Fig. 11, for which the points X_M are located at the minima of the potential V. We introduce the transition time T^{M→M+1} to reach the final state X_{M+1}, for the first time, starting from the initial state X_M and allowing several passages through X_M. Similarly, we define T^{M+1→M} for the reverse process. The entropy change associated with a transition is s_tot = Fℓ/T_env. Entropy production is therefore related to the position X_M by

S_tot(t) = −(Fℓ/T_env) N(t) ,

where N(t) = (X_M(0) − X_M(t))/ℓ is the number of steps in the negative direction minus the number of steps in the positive direction up to time t. The transition times are thus first-passage times of entropy production: T^{M→M+1} = T^{(2)}_+ and T^{M+1→M} = T^{(2)}_−. The corresponding change in entropy is s_tot = Fℓ/T_env, where we have used that the process is cyclic. The probabilities for forward and backward hopping are the passage probabilities for entropy production P^{(2)}_+ and P^{(2)}_−.

FIG. 11: Coarse-graining procedure from a continuous Langevin process (top) to a discrete Markov process (bottom). The horizontal axis denotes either a chemical coordinate, which quantifies the progress of a chemical reaction, or a position coordinate, which quantifies the position of a molecular motor. In the Langevin description this coordinate is described by the position of a Brownian particle (grey circle) moving in a periodic potential V(X) of period ℓ (blue curve) and driven by an external bias F. In the discrete process this coordinate is described by the state of a Markov jump process with stepping rates k_+ and k_−, which are related to the waiting times and the passage probabilities through k_± = P^{(2)}_±/⟨T^{(2)}_±⟩.
The passage probabilities and the statistics of transition times obey the universal equalities that we have derived in Sections IV, V and VI of this paper. They can also be related to the usually discussed transition rates k_± ≡ P^{(2)}_±/⟨T^{(2)}_±⟩, which satisfy the condition of local detailed balance, k_+/k_− = e^{Fℓ/(k_B T_env)} [35][36][37], as follows from Eqs. (62) and (66).
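A hypothetical numerical check of this chain of identities, using illustrative parameter values and the optional-stopping form of the two-boundary passage probabilities:

```python
import numpy as np

# Local detailed balance from passage statistics: with per-step entropy
# s = F*l/(k_B*T_env), the two-boundary passage probabilities and the equality
# of the conditional mean dwell times <T_+> = <T_-> give
# k_+/k_- = P_+/P_- = exp(F*l/(k_B*T_env)).  All parameter values illustrative.
kB, Tenv, F, ell = 1.0, 1.0, 2.0, 1.0
s = F * ell / (kB * Tenv)

P_plus = (np.exp(s) - 1.0) / (np.exp(s) - np.exp(-s))
P_minus = (1.0 - np.exp(-s)) / (np.exp(s) - np.exp(-s))
T_mean = 0.7                  # common value of <T_+> = <T_->; it cancels in the ratio
k_plus, k_minus = P_plus / T_mean, P_minus / T_mean
print(k_plus / k_minus, np.exp(s))   # both give e^{F l/(k_B T_env)}
```

Because ⟨T_+⟩ = ⟨T_−⟩, the common dwell time drops out of the ratio and local detailed balance follows from Eq. (66) alone.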

B. Chemically-driven hopping processes described by one degree of freedom
Enzymes or molecular motors that are driven out of equilibrium by a chemical bias or an external force are examples of such stepping processes. The thermodynamic bias of a chemical transition is often of the form Fℓ = ∆G − ℓF_mech, where ∆G denotes the chemical free-energy change associated with the chemical transition and F_mech is an externally applied force opposing the motion driven by positive ∆G.
We are interested in the statistics of chemical transition times T^{M→M±1}, which are the first-passage times of entropy production T^{(2)}_±. The resulting symmetry of the mean dwell times, implied by Eq. (62), could be tested experimentally for molecular motors that make reversible steps, for example, F_0F_1-ATP synthase [77] and RNA polymerase in the backtracking state [78,79].
We can also discuss the extreme-value statistics of the number of steps N. We denote by N_max(t) > 0 the maximum number of steps against the bias. The infimum law Eq. (1) implies an upper bound for the average of N_max(t),

⟨N_max(t)⟩ ≤ k_B T_env/(∆G − ℓF_mech) ,    (81)

for ∆G > ℓF_mech. The right-hand side of Eq. (81) is the inverse of the Péclet number Pe = vℓ/D. Moreover, Eq. (2) implies that the cumulative distribution of N_max(t) is bounded from above by an exponential,

Pr(N_max(t) ≥ n) ≤ e^{−n(∆G − ℓF_mech)/(k_B T_env)} ,    (82)

with n ≥ 0. Equation (82) states that the probability that a molecular motor makes more than n backsteps is smaller than or equal to e^{−n(∆G − ℓF_mech)/(k_B T_env)}. Therefore, our results on infima of entropy production constrain the maximal excursion of a molecular motor against the direction of an external force.
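Equations (81) and (82) can be illustrated with a biased hopping simulation, for which the exponential bound is saturated (an illustrative sketch; the parameter values are not taken from any experiment):

```python
import numpy as np

# Biased-hopping sketch for backstep statistics: forward/backward step
# probabilities with p/q = exp(a), where a = (Delta_G - l*F_mech)/(k_B*T_env)
# is the entropy produced per forward step (k_B = 1).  For this walk
# Pr(N_max >= n) = e^{-n a}, saturating Eq. (82), and
# <N_max> = 1/(e^a - 1) <= 1/a, consistent with Eq. (81).
rng = np.random.default_rng(4)
a = 1.0
p = np.exp(a) / (1.0 + np.exp(a))           # forward-step probability; q = 1 - p
n_traj, n_steps = 5000, 500

steps = np.where(rng.random((n_traj, n_steps)) < p, 1, -1)
pos = np.cumsum(steps, axis=1)
N_max = np.maximum(0, -pos.min(axis=1))     # deepest excursion against the bias

print(N_max.mean(), 1.0 / (np.exp(a) - 1.0), 1.0 / a)
print(np.mean(N_max >= 1), np.exp(-a))
```

The empirical mean of N_max matches 1/(e^a − 1), which lies below the bound 1/a of Eq. (81), and the tail probability matches e^{−a}.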
C. Dynamics of RNA polymerases: an example with two degrees of freedom

We now apply the general results of our theory to a more complex example of a biomolecular process, which cannot be described in terms of a single degree of freedom, namely, the dynamics of RNA polymerases on a DNA template. RNA polymerases transcribe genetic information from DNA into RNA. During transcription, RNA polymerases adopt two different states: the elongating state and the backtracking state [78]. Elongation is an active process where RNA polymerases move stepwise and unidirectionally along the DNA while polymerizing a complementary RNA driven by the chemical free energy of the incorporated nucleotides, as illustrated in Fig. 12a). In the elongating state the motion of RNA polymerase and the polymerization reaction are fuelled by the hydrolysis of ribonucleotide triphosphate (NTP), which provides the free energy ∆G_NTP per nucleotide [80]. Backtracking is a passive motion of the RNA polymerase on the DNA template that displaces the RNA 3' end from the active site of the polymerase, and leaves the enzyme transcriptionally inactive [81], as illustrated in Fig. 12b). Transcription is thus an active polymerization process that is interspersed by pauses of passive stepping motion.
The main properties of the dynamics of RNA polymerases are the following. In the elongating state, RNA polymerases can either continue elongation (green arrows in Fig. 12 a) and c)), or enter a backtracking state (blue arrows in Fig. 12 a) and c)). An RNA polymerase in the backtracking state diffuses on the DNA template until its active site is realigned with the 3' end of the RNA [78,82]. This diffusive motion is often biased either by the presence of an external opposing force F_mech or by an energy barrier ∆V_RNA related to the secondary structure of the nascent RNA [83,84], see Fig. 13.
The dynamics of RNA polymerase can thus be described as a continuous-time Markov jump process on a two-dimensional network [84][85][86], see Fig. 12 c). The state of a polymerase is determined by two variables: X denotes the position of the polymerase (in nucleotides) along the DNA template, and Y denotes the number of NTP molecules hydrolysed during elongation. When X = Y the polymerase is in the elongating state, and when X < Y the polymerase is in the backtracked state. The variable N = Y − X denotes the depth of the backtrack. We consider for simplicity a stationary process with steady-state distribution p_ss(X, Y) on a homogeneous DNA template. Therefore, the steady-state probability to find the polymerase in an elongating state is p_ss(0) = p_ss(X, X), and the steady-state probability to find the polymerase backtracked by N steps is p_ss(N) = p_ss(X, X + N).
Two relevant quantities in transcription are the pause density [78,85,87,88], i.e., the probability per nucleotide that an elongating polymerase enters a backtracked state, and the maximum depth of a backtrack [78,79], i.e., the maximum number of backtracked nucleotides. Our theory provides expressions for the statistics of these quantities in terms of infima and stopping-time statistics of entropy production.
If the probability flux from the elongating state into the backtracked state is smaller than the flux of the reverse process, which implies that the entropy S_tot decreases when the polymerase enters a backtrack, the pause density is equal to a passage probability P^{(2)}_− of entropy production. In addition, we consider that an elongating polymerase only moves forward. These conditions are necessary for backtrack entry to correspond to a first-passage process of entropy production with two absorbing boundaries of different sign. The probability P^{(2)}_+ then corresponds to the probability for the polymerase to attach a new nucleotide, as illustrated in the inset of Fig. 12 c). These probabilities obey Eqs. (5) and (6) [89], with −s^−_tot = ∆S_be + ℓF_mech/T_env and s^+_tot = (∆G_NTP − ℓF_mech)/T_env. Here, ∆S_be = −k_B ln(p_ss(1)/p_ss(0)) is the system-entropy change when entering the backtrack. If ∆G_NTP/(k_B T_env) ≫ 1, which holds in typical experiments [85], we have e^{−s^+_tot/k_B} ≪ 1, and thus we find simple expressions for the pause density and the probability to continue elongation. The backtrack entry of a polymerase, and therefore the pause density, is thus determined by a kinetic competition between an entropic term T_env ∆S_be < 0 and the work done by the force acting on the polymerase, ℓF_mech > 0. We can also discuss the dynamics of backtracked RNA polymerases.

FIG. 12: Dynamics of RNA polymerases. a) Elongation. b) Backtracking: the polymerase diffuses along the DNA template until its active site (red shaded area) aligns with the 3' end of the transcribed RNA. c) Network representation of a Markov jump process describing the dynamics of RNA polymerases. The X-coordinate gives the position of the polymerase along the DNA template, and the Y-coordinate gives the number of nucleotides transcribed. Green nodes represent elongating states and brown nodes represent backtracked states. The inset shows all possible transitions from or to an elongating state. The green solid arrow denotes active translocation of the RNA polymerase, and the dashed green arrow denotes its time-reversed transition. Because the heat dissipated during the forward translocation is much larger than k_B T_env, the rate of the backward transition is very small. The blue arrow corresponds to entry into a backtracked state, and the brown arrow to exit from a backtracked state into the elongating state. In the backtracked state, motion is biased towards the elongating state.

FIG. 13: The dynamics of RNA polymerase during backtracking. The motion of the polymerase is represented as a diffusion process in a tilted periodic potential with an absorbing boundary on the right (which corresponds to the transition from the backtracked state into the elongating state). The period of the potential is equal to the length of one nucleotide (nt). The potential is tilted towards the absorbing boundary due to an energetic barrier ∆V_RNA, while an external opposing mechanical force F_mech pushes the polymerase away from the absorbing boundary. This process can be coarse grained into a discrete one-dimensional hopping process with waiting times T^{(2)}_±.
During a backtrack, the dynamics of RNA polymerase is captured by a one-dimensional biased diffusion process with an absorbing boundary corresponding to the transition to elongation [79], see Fig. 13. During backtracking the bias is F = ∆V_RNA/ℓ − F_mech, where ∆V_RNA is an energy barrier associated with the secondary structure of the nascent RNA [83,84,88]. The waiting times of the corresponding Markov jump process are equal to first-passage times of entropy production. For a polymerase with backtrack depth N, the waiting time T^{(2)}_+(N) to decrease N by one, and the waiting time T^{(2)}_−(N) to increase N by one, are first-passage times of entropy production with two absorbing boundaries s^+_tot = −k_B ln(p_ss(N−1)/p_ss(N)) + Fℓ/T_env and −s^−_tot = −k_B ln(p_ss(N+1)/p_ss(N)) − Fℓ/T_env. If the bias dominates, the two boundaries have opposite signs and we can use the theory of stopping times developed here. The hopping probabilities for forward and backward steps in the backtrack then follow from our general expressions for the passage probabilities of entropy production, Eqs. (5) and (6).
The maximum backtrack depth N_max(t), at a time t after entering the backtracked state, is related to the infimum of entropy production S_inf(t). The infimum law (1) therefore provides a thermodynamic bound on the average of the maximum extent of a backtrack. Using Eq. (81) with ∆V_RNA ≈ 0.5 k_B T_env and ℓF_mech ≈ 0.4 k_B T_env [83,84,88], we estimate the upper bound for the maximum backtrack depth ⟨N_max(t)⟩ ≲ 10 nucleotides; we have used the parameter values F_mech = 5 pN, k_B T_env = 4 pN nm, and ℓ = 0.34 nm for the distance between two nucleotides. Similarly, from Eq. (82) we find that the cumulative distribution of the maximum backtrack depth is bounded from above by an exponential [90]. This is consistent with single-molecule experiments on RNA polymerase backtracking, which have reported that the distribution of the backtrack extent is exponential (see Fig. S3 in [78]).
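The numerical estimate above can be reproduced with a few lines of arithmetic, using the parameter values quoted in the text:

```python
# Back-of-the-envelope check of the backtrack-depth estimate quoted in the text,
# with F_mech = 5 pN, k_B*T_env = 4 pN nm, l = 0.34 nm, Delta_V_RNA ~ 0.5 k_B*T_env.
kBT = 4.0            # pN nm
F_mech = 5.0         # pN
ell = 0.34           # nm
dV_RNA = 0.5 * kBT   # pN nm

lF = F_mech * ell                      # ~1.7 pN nm, i.e. ~0.4 k_B T_env
a = (dV_RNA - lF) / kBT                # entropy per backstep, in units of k_B
N_max_bound = 1.0 / a                  # Eq. (81): <N_max> <= k_B T_env/(dV - l*F)
print(lF / kBT, a, N_max_bound)
```

The bound evaluates to roughly 13 nucleotides, of the order of the ⟨N_max(t)⟩ ≲ 10 quoted above.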

IX. DISCUSSION
The second law of thermodynamics is a statement about the average entropy production when many realizations of a mesoscopic system are considered. This fundamental law leaves open the question whether fluctuations of entropy production also obey general laws [42,71,[91][92][93][94][95]. In the present paper we have demonstrated that the infimum of entropy production and stopping times of entropy production exhibit statistical features that are universal and do not depend on the physical nature of a given system. The simplest example that illustrates these universal features is the case where entropy production follows a drift-diffusion process described by the Langevin equation

dS_tot(t)/dt = v_S + η_S(t) ,

where the constant drift velocity v_S > 0 corresponds to the average entropy-production rate. The Gaussian white noise η_S(t) has zero mean and autocorrelation ⟨η_S(t)η_S(0)⟩ = 2D_S δ(t). The diffusion constant D_S characterizes entropy-production fluctuations and obeys

D_S = k_B v_S .

These relations can be derived from Eq. (75). This process exhibits all the universal properties of entropy production discussed in this paper, see Section VII.
The infimum of entropy production up to a given time must be either zero or negative, even though entropy production is always positive on average. Here, we have shown that, when averaging over all realizations in a nonequilibrium steady state, the average infimum of entropy production is always greater than or equal to −k_B. Furthermore, the global infimum of entropy production is exponentially distributed with mean −k_B. This is an exact result for the distribution of global infima of a correlated stochastic process, which is interesting because only a few similar results are known [96][97][98].
Our results are of special interest for experiments, because entropy production reaches its global infimum in a finite time. The exact results on entropy infima could be verified experimentally, for example using colloidal particles trapped with optical tweezers [68,69,99]. Other experimental systems which could be used to test our results are single molecule experiments [77,100,101] or mesoscopic electronic systems such as quantum dots [31].
We have furthermore shown that the stopping-time statistics and the passage statistics of entropy production exhibit universality. We have found a remarkable symmetry for the stopping times of entropy production. In a network of states, this symmetry relation implies that, for each link between two states, the statistics of waiting times is the same for forward and backward jumps along this link. Measuring statistics of waiting times thus reveals whether the forward and backward transitions along a link in a network of states are each other's time reverse. If the corresponding waiting-time distributions are not equal, then the forward and backward transitions are not related by time reversal.
Our work is based on the finding that in steady state e −Stot(t)/kB is a martingale process. A martingale is a process for which the mean is unpredictable, or equivalently, represents a fair game with no net gain or loss. The theory of martingales is well developed because of its importance to fields such as quantitative finance [102,103] and decision theory [104]. In stochastic thermodynamics martingales have not yet attracted much attention [53]. Our work reveals the relevance of martingales to nonequilibrium thermodynamics. We also show that entropy production itself is a submartingale, and relate this fact to the second law of thermodynamics. Remarkably, the universal statistics of infima and stopping times all follow from the martingale property of e −Stot(t)/kB .
Our results on entropy production fluctuations provide expressions for the hopping probabilities, the statistics of waiting times, and the extreme-value statistics of active molecular processes. We have illustrated our results on active molecular processes described by one degree of freedom, and on the stochastic dynamics of DNA transcription by RNA polymerases. Our theory provides expressions for the probability of molecular motors to step forwards or backwards in terms of the entropy produced during the forward and backward steps, and relates waiting-time statistics of forward and backward transitions. Moreover, the infimum law provides a thermodynamic bound on the maximal excursion of a motor against the effective force that drives its motion. For the dynamics of RNA polymerases this implies that the statistics of the maximum backtrack depth is bounded by a limiting exponential distribution. This provides predictions about the mean and the second moment of backtrack depths that could be tested in future single-molecule experiments.
Cells and proteins often execute complex functions at random times. Stopping times provide a powerful tool to characterize the timing of stochastic biological processes. It will be interesting to explore whether our approach to the statistics of stopping times of nonequilibrium processes is also relevant to more complex systems, such as flagellar motor switching [105], sensory systems [106][107][108][109], self-replicating nucleic acids [110,111] or cell-fate decisions [112,113].

Definition of a martingale process
A sequence of random variables X_1, X_2, X_3, ... is a martingale when E(|X_n|) < ∞ for all n ∈ N, and the conditional expectation satisfies E(X_n|X_1, X_2, ..., X_m) = X_m for all m ≤ n. A martingale is thus a process for which the mean is unpredictable, or equivalently, represents a fair game with no net gain or loss. Note that here we use the standard notation E_P(X_n) = ⟨X_n⟩_P for the expectation of a random variable X_n with respect to the measure P.
For our purposes, we also need a general definition of a martingale process with respect to a filtered probability space. We consider a stochastic process X(ω; t), with t ∈ [0, ∞[, and a filtered probability space (Ω, F, {F(t)}_{t≥0}, P). For processes in continuous time, we consider that X is right continuous. A process X(ω; t) is a martingale with respect to the filtered probability space (Ω, F, {F(t)}_{t≥0}, P) when: X is adapted to the filtration {F(t)}_{t≥0}; X is integrable, i.e., E(|X(ω; t)|) < ∞; and the conditional expectation E(X(ω; t)|F(s)) = X(ω; s) for all s < t. The latter condition is equivalent to ∫_Φ X(ω; s) dP = ∫_Φ X(ω; t) dP, (A1) for any s < t and for any Φ ∈ F(s). A submartingale satisfies the inequality ∫_Φ X(ω; s) dP ≤ ∫_Φ X(ω; t) dP, for any s < t and for any Φ ∈ F(s). The Wiener process, also called Brownian motion, is an example of a martingale process.
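The defining property E(X_n | X_1, . . . , X_m) = X_m can be checked numerically on a simple example. The following sketch is our own illustration, not taken from the text: it simulates a symmetric ±1 random walk (a discrete-time martingale) and verifies that the conditional mean of the walk at a later time n, given its value at an earlier time m, equals that earlier value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Symmetric +/-1 random walk S_n: a discrete-time martingale.
# We test E(S_n | S_m) = S_m by grouping paths according to their value
# at the intermediate time m and averaging their value at the later time n.
# (For this Markov chain, conditioning on S_m alone is sufficient.)
n_paths, m, n = 200_000, 10, 30
steps = rng.choice([-1, 1], size=(n_paths, n))
paths = np.cumsum(steps, axis=1)
S_m, S_n = paths[:, m - 1], paths[:, n - 1]

for value in (-4, -2, 0, 2, 4):
    cond_mean = S_n[S_m == value].mean()
    # The conditional mean should equal `value` up to sampling noise.
    assert abs(cond_mean - value) < 0.15
    print(f"E(S_n | S_m = {value:+d}) = {cond_mean:+.3f}")
```

The grouping step makes the "fair game" interpretation concrete: whatever the walk's value at time m, its expected future value is that same value.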

Uniform integrability
Uniformly integrable processes play an important role in martingale theory. We call a stochastic process X uniformly integrable on the probability space (Ω, F, P) when lim_{K→∞} sup_{t≥0} E(|X(ω; t)| I_{|X(t)|>K}) = 0. The indicator function I_{|X(t)|>K} is defined as: I_{|X(t)|>K}(ω) = 1 if |X(ω; t)| > K, and I_{|X(t)|>K}(ω) = 0 otherwise. A bounded random process is uniformly integrable, and a uniformly integrable process is integrable. Uniform integrability can thus be seen as a condition less stringent than boundedness, but more stringent than integrability. For a uniformly integrable martingale process, the random variable X(ω; ∞) = lim_{t→∞} X(ω; t) exists [47], and it is finite for P-almost all ω ∈ Ω. A martingale is uniformly integrable if and only if [47]: X(ω; t) = E(X(ω; +∞)|F(t)), (A5) for t ≥ 0, and with X(ω; +∞) integrable, i.e., E(|X(ω; +∞)|) < ∞.

Doob's maximal inequality
Doob's maximal inequality provides an upper bound on the cumulative distribution of the supremum of a nonnegative submartingale process X(ω; t) in a time interval [0, t] [38], viz., Pr( sup_{0≤s≤t} X(ω; s) ≥ λ ) ≤ E(X(ω; t))/λ, (A6) for any constant λ > 0. Doob's maximal inequality, given by Eq. (A6), holds for nonnegative submartingales in discrete time, and for right-continuous nonnegative submartingales in continuous time [47]. Note that Doob's maximal inequality is a distinctive property of martingales, and a stronger result than Markov's inequality. Markov's inequality provides an upper bound on the cumulative distribution of a nonnegative random variable X: Pr(X ≥ λ) ≤ E(X)/λ, with λ > 0. Markov's inequality does not imply Doob's maximal inequality, but Doob's maximal inequality does imply Markov's inequality for nonnegative martingales.
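The inequality can be probed numerically. The sketch below is our own example, not from the text: it takes the nonnegative submartingale X_k = S_k², with S_k a symmetric random walk, for which E(X_n) = n, and checks the bound Pr(max_{k≤n} X_k ≥ λ) ≤ E(X_n)/λ.

```python
import numpy as np

rng = np.random.default_rng(1)

# X_k = S_k^2, with S_k a symmetric random walk, is a nonnegative
# submartingale with E(X_n) = n.  Doob's maximal inequality bounds the
# probability that its running maximum exceeds a level lam:
#     Pr( max_{k<=n} X_k >= lam ) <= E(X_n) / lam.
n_paths, n = 100_000, 50
S = np.cumsum(rng.choice([-1, 1], size=(n_paths, n)), axis=1)
X = S**2
running_max = X.max(axis=1)

for lam in (100.0, 200.0, 400.0):
    lhs = (running_max >= lam).mean()      # empirical tail probability
    rhs = X[:, -1].mean() / lam            # Doob bound, E(X_n)/lam
    assert lhs <= rhs
    print(f"lam={lam:6.0f}:  Pr(max >= lam) = {lhs:.4f}  <=  E(X_n)/lam = {rhs:.4f}")
```

The bound is far from tight here, which is typical: Doob's inequality trades sharpness for complete generality over the class of nonnegative submartingales.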

Entropy production of a filtered probability space
Entropy production S_tot(ω; t) is a stochastic process adapted to a stationary filtered probability space (Ω, F, {F(t)}_{t≥0}, P). We define S_tot(ω; t) in terms of the Radon-Nikodým derivative of the measure P|_{F(t)} with respect to the time-reversed measure P∘Θ_t|_{F(t)} [11,51]: S_tot(ω; t) = k_B ln [ dP|_{F(t)} / d(P∘Θ_t)|_{F(t)} ](ω), (B1) for t ≥ 0. For stationary probability measures P, Eq. (B1) is the same as Eq. (18). An analogous definition applies to entropy production for negative values of time, t ≤ 0. The definition of entropy production, given by Eq. (B1), thus requires absolute continuity of P with respect to P∘Θ_t. Note that it is possible to define entropy production for processes with strictly irreversible transitions [19], but Eq. (B1) does not apply to such processes. Stochastic entropy production (B1) is thus the stochastic process of a filtered probability space that characterizes time's arrow in the possible outcomes of a random process [42]. For mesoscopic processes in contact with an equilibrated environment the definition of entropy production, Eq. (B1), is thermodynamically consistent [7,8,11,43]. In statistical physics this latter condition is often called local detailed balance or generalized detailed balance.
We now discuss the mathematical properties of stochastic entropy production. We first show that entropy production is a process of odd parity with respect to the time inversion operator Θ t . Secondly, we show that e −Stot(ω,t)/kB is a uniformly integrable martingale with respect to the filtered probability space (Ω, F, {F(t)} t≥0 , P) generated by the dynamics of the mesoscopic degrees of freedom. This is a key property of entropy production and allows us to apply the techniques of martingale processes to entropy production.

Entropy production under time reversal
An interesting property of entropy production is the change of its sign under the time-reversal operation Θ_t, viz., S_tot(Θ_t ω; t) = −S_tot(ω; t), (B2) where we have used that Θ_t = Θ_t^{−1}.
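This odd parity can be made concrete in a minimal model. The sketch below uses an assumed toy model, not the paper's system: a biased ±1 random walk with up probability p and down probability q = 1 − p, for which the path-wise entropy production is S_tot(ω)/k_B = ln P(ω) − ln P(Θω) = (n_up − n_down) ln(p/q), and time reversal flips its sign.

```python
import math

import numpy as np

rng = np.random.default_rng(2)

# Biased +/-1 random walk: up with probability p, down with probability q.
# A path with n_up up-steps and n_down down-steps has probability
# p^{n_up} q^{n_down}; its time reversal swaps n_up and n_down, so
# S_tot(w)/k_B = (n_up - n_down) ln(p/q) and S_tot(Theta w) = -S_tot(w).
p, q, n = 0.7, 0.3, 40

def s_tot(increments):
    """Path-wise entropy production in units of k_B."""
    n_up = int(np.sum(increments == 1))
    n_down = len(increments) - n_up
    return (n_up - n_down) * math.log(p / q)

for _ in range(100):
    incr = rng.choice([1, -1], size=n, p=[p, q])
    rev = -incr[::-1]                      # time-reversed trajectory
    assert abs(s_tot(incr) + s_tot(rev)) < 1e-12
print("odd parity verified: S_tot(Theta w) = -S_tot(w)")
```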

The exponential of negative entropy production is a martingale in steady state
We now show that the process e −Stot(ω;t)/kB , adapted to a stationary and filtered probability space (Ω, F, {F(t)} t≥0 , P), is a martingale. This follows directly from the fact that e −Stot(ω;t)/kB is a Radon-Nikodým density process.
The process e^{−S_tot(ω;t)/k_B} is the Radon-Nikodým density process of the filtered probability space (Ω, F, {F(t)}_{t≥0}, P) with respect to the filtered probability space (Ω, F, {F(t)}_{t≥0}, Q): e^{−S_tot(ω;t)/k_B} = [dQ|_{F(t)}/dP|_{F(t)}](ω), with Q = P∘Θ the time-reversed measure and t ≥ 0. Consider two sub-σ-algebras F(τ) and F(t) of F, with τ < t. We first write the measure Q(Φ) of a set Φ ∈ F(τ) as an integral over the probability space: Q(Φ) = ∫_Φ e^{−S_tot(ω;τ)/k_B} dP = ∫_Φ e^{−S_tot(ω;t)/k_B} dP, for all sets Φ ∈ F(τ), which is identical to the condition (A1) that defines a martingale process. The process e^{−S_tot(ω;t)/k_B} is therefore a martingale process with respect to the measure P.
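A quick numerical consistency check of the martingale property is the constancy of the mean, E(e^{−S_tot(t)/k_B}) = 1 for all t, since e^{−S_tot(0)/k_B} = 1. The sketch below uses an assumed toy model of our choosing: a biased ±1 random walk with up probability p, for which S_n/k_B = X_n ln(p/q) and hence e^{−S_n/k_B} = (q/p)^{X_n}.

```python
import numpy as np

rng = np.random.default_rng(3)

# Biased random walk: up with probability p, down with probability q = 1 - p.
# The entropy produced after n steps is S_n/k_B = X_n ln(p/q), with X_n the
# position, so exp(-S_n/k_B) = (q/p)^{X_n}.  Its mean must stay at 1 for
# all n, as required for a martingale started at exp(-S_0/k_B) = 1.
p, q = 0.55, 0.45
n_paths, n_steps = 200_000, 30
X = np.cumsum(rng.choice([1, -1], size=(n_paths, n_steps), p=[p, q]), axis=1)
M = (q / p) ** X                    # exp(-S_tot/k_B) along each path

means = M.mean(axis=0)              # one sample mean per time step
assert np.all(np.abs(means - 1.0) < 0.03)
print("E[exp(-S_tot/k_B)] at n = 1, 10, 30:", np.round(means[[0, 9, 29]], 4))
```

A mild bias is used deliberately: for strong bias the heavy tails of e^{−S_tot/k_B} make the sample mean converge very slowly, even though the identity holds exactly.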

The exponential of negative entropy production is uniformly integrable
We show that e^{−S_tot(ω;t)/k_B} is uniformly integrable. We use the necessary and sufficient condition (A5) for uniform integrability. The process e^{−S_tot(ω;t)/k_B} is uniformly integrable since, by definition, the following condition is met: e^{−S_tot(ω;t)/k_B} = E( e^{−S_tot(ω;+∞)/k_B} | F(t) ), with [d(P∘Θ)/dP](ω) = e^{−S_tot(ω;+∞)/k_B} a positive and integrable random variable.

Appendix C: STOCHASTIC DOMINANCE
Consider two positive-valued random variables X ≥ 0 and Y ≥ 0. We define their cumulative distributions as F_X(x) = Pr(X ≤ x) and F_Y(y) = Pr(Y ≤ y). We say that X dominates Y stochastically when the cumulative distribution functions of X and Y satisfy the relation F_X(x) ≥ F_Y(x). If X dominates Y stochastically, then the mean value of X is smaller than or equal to the mean value of Y: ⟨X⟩ ≤ ⟨Y⟩. This follows directly from the relation ⟨X⟩ = ∫_0^∞ dx (1 − F_X(x)) between the mean and the cumulative distribution.
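As a standalone numerical illustration (the distributions are our choice, not from the text), take X exponential with mean 1/2 and Y exponential with mean 1; then F_X(x) ≥ F_Y(x) for all x, so X dominates Y stochastically and ⟨X⟩ ≤ ⟨Y⟩.

```python
import numpy as np

# X ~ Exp(rate 2) has mean 1/2, Y ~ Exp(rate 1) has mean 1.
# F_X(x) = 1 - exp(-2x) >= 1 - exp(-x) = F_Y(x): X dominates Y stochastically.
x = np.linspace(0.0, 20.0, 200_001)
dx = x[1] - x[0]
F_X = 1.0 - np.exp(-2.0 * x)
F_Y = 1.0 - np.exp(-x)

assert np.all(F_X >= F_Y)                  # stochastic dominance
# Means via <X> = int_0^infty dx (1 - F_X(x)), truncated at x = 20.
mean_X = np.sum(1.0 - F_X[:-1]) * dx
mean_Y = np.sum(1.0 - F_Y[:-1]) * dx
assert mean_X <= mean_Y
print(f"<X> = {mean_X:.4f}, <Y> = {mean_Y:.4f}")
```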

Appendix D: PASSAGE PROBABILITIES FOR ENTROPY PRODUCTION
We generalize the relations for passage probabilities of entropy production, given by Eqs. (44)-(45), to passage probabilities of entropy production for trajectories starting from a given macrostate I. We define a macrostate I as a subset of the phase space {q, q̇}. Here, we consider macrostates defined by a given set of constraints on the variables of even parity, I_even ⊆ R^n. The initial state ω(0) belongs to the macrostate I if q(0) ∈ I_even.
We define the passage probability P_{+,I} (and P_{−,I}) as the joint probability that the process is initially in the macrostate I, i.e., ω(0) ∈ I, and that entropy production passes the threshold s^+_tot (s^−_tot) before passing the threshold s^−_tot (s^+_tot). Formally, the passage probabilities are defined by P_{±,I} = P(Φ_± ∩ Γ_I), with Φ_+ and Φ_− the sets defined in Eqs. (30) and (31), respectively, and Γ_I the set of trajectories starting from the macrostate I: Γ_I = {ω ∈ Ω : ω(0) ∈ I}. We also define the conjugate passage probabilities P̃_{+,I} and P̃_{−,I} with respect to the time-reversed dynamics. Here, we have used that I = Ĩ, since we have defined the macrostate using constraints on variables of even parity only. For a nonequilibrium steady-state process, i.e., ⟨S_tot(t)⟩ > 0, S_tot(t) passes in a finite time one of the two boundaries with probability one. We thus have: P_{+,I} + P_{−,I} = P(Γ_I). In addition, we can derive, using Doob's optional sampling theorem, the following two identities: P_{+,I}/P(Γ_I) = P_+ and P_{−,I}/P(Γ_I) = P_−. The entropy-production passage probabilities conditioned on the initial state are thus the same as the unconditioned entropy-production passage probabilities, given by Eqs. (44) and (45).

Appendix E: STOPPING-TIME FLUCTUATION THEOREMS FOR ENTROPY PRODUCTION
In this Appendix we derive the entropy stopping-time fluctuation relations. We consider the definition of entropy production S_tot(ω; t), given by Eq. (18). An s_tot-stopping time T_+ is a stopping time for which S_tot(ω; T_+(ω)) = s_tot.
We first derive the fluctuation relation for entropy stopping times in the first subsection, and we consequently apply this fluctuation relation to first-passage times in the second subsection. In the third subsection we consider a fluctuation relation for entropy stopping times of trajectories starting in a macrostate I. In the last subsection we use this generalized fluctuation relation to derive a fluctuation relation for waiting times of stochastic processes.

Fluctuation theorem for entropy stopping times
The fluctuation theorem for entropy stopping times, given by Eq. (57) in Section VI, follows from the chain of identities (E1)-(E9). In Eq. (E2) we write the measures of the F(t)-measurable sets Θ_{T_+}Φ_{T_+≤t} and Φ_{T_+≤t} in terms of an integral over a probability space. In Eq. (E3) we have transformed the variables in the integral using the measurable morphism Θ_{T_+} and the change of variables formula. The change of variables formula relates two integrals under a change of variables, viz., with X a random variable, measurable in a probability space (Ω, F, P); Φ a measurable set in this probability space, i.e., Φ ∈ F; and φ : Ω → Ω′ a measurable morphism from the probability space (Ω, F, P) to another probability space (Ω′, F′, P∘φ), with the property that φ^{−1}(Ξ) ∈ F for each Ξ ∈ F′ (see exercise 1.4.38 in [50]). In Eq. (E4) we use the definition of the composition Θ_{T_+} = T_{T_+} ∘ Θ. In Eq. (E5) we use that P is a stationary measure and thus P = P∘T_{T_+}. In Eq. (E6) we use the Radon-Nikodým theorem, given by Eq. (17), in order to change the integral over the measure P into an integral over the measure P∘Θ, and then use our measure-theoretic definition of entropy production, given by Eq. (18), to write the Radon-Nikodým derivative in terms of entropy production S_tot(t). Since e^{−S_tot(t)/k_B} is a uniformly integrable martingale with respect to P, we use in Eq. (E7) Doob's optional sampling theorem, Eq. (A8). In order to apply this theorem we additionally need that e^{−S_tot(t)/k_B} is right continuous. In Eq. (E8) we use the fact that T_+(ω) is an s_tot-stopping time such that s_tot = S_tot(ω; T_+(ω)). In the last equality, Eq. (E9), we use that Φ_{T_+≤t} has nonzero measure and a/a = 1 for a ≠ 0. The fluctuation theorem Eq. (57) can be written as a ratio between the cumulative distribution of the s_tot-stopping time T_+ and a conjugate (−s_tot)-stopping time T_−.
We define the (−s_tot)-stopping time T_− associated to the stopping time T_+ by T_−(ω) = T_+(Θ(ω)). We thus have Pr(T_+ ≤ t) = P(Φ_{T_+≤t}) (E12) and Pr(T_− ≤ t) = (P∘Θ)(Φ_{T_+≤t}), (E13) and the fluctuation relation, given by Eq. (57), reads: Pr(T_+ ≤ t) = e^{s_tot/k_B} Pr(T_− ≤ t). Since the probability-density functions of T_+ and T_− are given by p_{T_±}(t) = d Pr(T_± ≤ t)/dt, we find the following fluctuation theorem for the distributions of entropy stopping times: p_{T_+}(t)/p_{T_−}(t) = e^{s_tot/k_B}, (E16) which equals Eq. (8).
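The density-ratio form of this fluctuation theorem can be verified exactly in a discrete toy model (our illustration; the biased random walk and thresholds are assumptions, not from the text). For a walk with up probability p and down probability q, entropy production is S_n/k_B = X_n ln(p/q), so the symmetric entropy thresholds ±s_tot correspond to absorbing boundaries at positions ±L with s_tot/k_B = L ln(p/q). Iterating the transition probabilities gives the first-passage-time distributions exactly:

```python
import numpy as np

# Biased random walk between absorbing boundaries at +L and -L.
# Iterating the transition probabilities over the interior sites gives the
# exact probability pT_plus(n) (pT_minus(n)) of first absorption at +L (-L)
# at step n; their ratio should equal exp(s_tot/k_B) = (p/q)^L at every n.
p, q, L, n_max = 0.6, 0.4, 3, 200
s_tot = L * np.log(p / q)                  # entropy threshold in units of k_B

sites = np.arange(-L + 1, L)               # interior states -L+1 .. L-1
prob = np.zeros(len(sites))
prob[L - 1] = 1.0                          # start at the origin

for n in range(1, n_max + 1):
    pT_plus = p * prob[-1]                 # mass absorbed at +L at step n
    pT_minus = q * prob[0]                 # mass absorbed at -L at step n
    new = np.zeros_like(prob)
    new[1:] += p * prob[:-1]               # up-steps within the interior
    new[:-1] += q * prob[1:]               # down-steps within the interior
    prob = new
    if pT_minus > 1e-12:
        assert abs(pT_plus / pT_minus - np.exp(s_tot)) < 1e-9
print(f"p_T+(n)/p_T-(n) = (p/q)^L = {np.exp(s_tot):.4f} for all n")
```

Because the check iterates exact probabilities rather than sampling trajectories, the ratio holds to floating-point precision at every step n.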

First-passage time fluctuation theorems for entropy production
We apply the fluctuation theorem, given by Eq. (E16), to first-passage times of entropy production. We consider here first-passage times with one absorbing boundary T (1) ± and first-passage times with two absorbing boundaries T (2) ± (see subsection VI A).
The first-passage time T (1) ± denotes the time when the process S tot (ω; t) passes first the threshold ±s tot . If S tot (ω; t) never passes the threshold ±s tot , then T (1) ± = +∞.
The first-passage time T^{(2)}_+ denotes the time at which the process S_tot(ω; t) passes first the threshold s_tot, given that it has not reached −s_tot before: T^{(2)}_+(ω) = T^{(1)}_+(ω) if T^{(1)}_+(ω) < T^{(1)}_−(ω), and T^{(2)}_+(ω) = +∞ otherwise. (E17) Analogously, we define T^{(2)}_−: T^{(2)}_−(ω) = T^{(1)}_−(ω) if T^{(1)}_−(ω) < T^{(1)}_+(ω), and T^{(2)}_−(ω) = +∞ otherwise. (E18) Note that other s_tot-stopping times can be defined, such as the times of second passage.
Since entropy production is a process of odd parity with respect to the time-reversal map Θ_t, see Eq. (B2), we have the relations (E19) and (E20) between the first-passage times of S_tot(ω; t) and those of the time-reversed trajectories. Using Eqs. (E12) and (E13) we thus find the first-passage-time fluctuation relations: p_{T^{(1)}_+}(t)/p_{T^{(1)}_−}(t) = e^{s_tot/k_B} (E21) and p_{T^{(2)}_+}(t)/p_{T^{(2)}_−}(t) = e^{s_tot/k_B}. (E22)

3. Generalized fluctuation theorem for stopping times of entropy production

We define macrostates as subsets I (II) of the phase space of configurations {q, q̇}. We also consider the subsets Ĩ (ĨI) of the corresponding time-reversed states, Ĩ = {(q, q̇) : (q, −q̇) ∈ I} (ĨI = {(q, q̇) : (q, −q̇) ∈ II}). We associate to each macrostate I a set of trajectories Γ_I: Γ_I = {ω ∈ Ω : ω(0) ∈ I}. Note that Doob's optional sampling theorem, given by Eq. (A8), applies to the set Φ_{T_+≤t} ∩ Γ_I, since Φ_{T_+≤t} ∩ Γ_I ∈ F(T_+). We can therefore follow the steps in Eqs. (E1)-(E8) to find the following generalized fluctuation relation.

Fluctuation relations for waiting times
Waiting times T^{I→II} denote the time a process takes to travel between two macrostates I and II (the time it takes for the system to change its macrostate from I to II). Here we derive a fluctuation theorem for waiting times along trajectories that produce a given amount of entropy. The entropy waiting time T^{I→II}_+ denotes the time a process needs to travel from the macrostate I into the macrostate II while producing a positive entropy s_tot, and given that the process has not returned to the macrostate I before. Analogously, the entropy waiting time T^{ĨI→Ĩ}_− denotes the time a process needs to travel from the macrostate ĨI into the macrostate Ĩ while producing a negative entropy −s_tot, and given that the process has not returned to the macrostate ĨI before.
In order to define waiting times, we first define stopping times T_I and T_II that denote the time when a process reaches a given macrostate I or II, respectively: T_I(ω) = inf{t ≥ 0 : ω(t) ∈ I} and T_II(ω) = inf{t ≥ 0 : ω(t) ∈ II}. The waiting time T^{I→II} denotes the time a process takes to travel from I to II for all trajectories ω for which ω(0) ∈ I, and the associated s_tot-waiting time T^{I→II}_+ is defined as T^{I→II}_+(ω) = T^{I→II}(ω) if S_tot(ω; T^{I→II}(ω)) = s_tot, and T^{I→II}_+(ω) = +∞ otherwise, for all trajectories ω for which ω(0) ∈ I. We also define a (−s_tot)-stopping time T^{ĨI→Ĩ}_−, denoting the time when a trajectory reaches the macrostate Ĩ and produces a negative total entropy −s_tot. Note that the waiting time T^{I→II}_+(ω) is defined on all trajectories ω ∈ Ω, but we are interested in those trajectories for which ω(0) ∈ I, and thus set ω ∈ Γ_I. We thus find the fluctuation theorem for waiting times between macrostates, Eq. (E35). If we set Γ_I = Γ_II = Ω, Eq. (E35) is equal to the entropy stopping-time fluctuation theorem, given by Eq. (E16). The quantity in the exponential on the right-hand side of Eq. (E35) has in general no particular meaning. For macrostates defined by variables of even parity, i.e., Ĩ = I and ĨI = II, we have the relation Γ_ĨI = Γ_II. We recognize then in Eq. (E35) the system entropy change ∆s_sys between the two macrostates I and II.
A particularly important example of the waiting-time fluctuation theorem, given by Eq. (E35), is for macrostates I and II that correspond to one single point in phase space, i.e., I = {q_I} and II = {q_II}. We then find the relation: p_{T^{I→II}_+}(t)/p_{T^{II→I}_−}(t) = e^{s_env/k_B}, (E37) with s_env = s_tot − ∆s_sys the change in the environment entropy, and ∆s_sys the system-entropy change, which here is given by ∆s_sys = −k_B log [p_ss(q_II)/p_ss(q_I)]. (E38)

1. Cumulative distribution of the infimum of entropy production

The cumulative distribution Pr(sup X(t) ≤ L) of the supremum sup X(t) of a stochastic process over a time interval [0, t] equals the survival probability Q_X(L, t) of the process in the interval (−∞, L) at time t [98,115]: Pr(sup X(t) ≤ L) = Pr(X(s) ≤ L; s ≤ t) = Q_X(L, t), (F1) with L > 0. For a general stochastic process, the survival probability in an interval can be calculated from the distribution p_T(τ; L) of first-passage times to reach an absorbing boundary located at L: Q_X(L, t) = 1 − ∫_0^t dτ p_T(τ; L). (F2) We use this relation to determine the cumulative distribution of extreme values of X(t).
The infimum of a drift-diffusion process with positive drift is equal to minus the supremum of a drift-diffusion process with the same but negative drift. We therefore consider the following two conjugate drift-diffusion processes: 1. X_+(t) with velocity v, diffusion D, and initial condition X_+(0) = 0; 2. X_−(t) with velocity −v, diffusion D, and initial condition X_−(0) = 0; with v > 0 in both processes. The dynamics of both processes follows from Eq. (73) for V(x) = 0 and for, respectively, a positive and negative velocity. The infimum value of X_+(t) equals minus the supremum of the conjugate process X_−(t). In the following we derive analytical expressions for the statistics of the supremum of X_−(t), and use these to obtain the statistics of the infimum of X_+(t). The survival probability of X_−(t) can be obtained from the distribution p_T of first-passage times to first reach the threshold L, see Eq. (F2). The first-passage-time distribution is given by p_T(τ; L) = (L/√(4πDτ³)) e^{−(L+vτ)²/(4Dτ)}, (F3) which yields the survival probability Q_{X_−}(L, t) = (1/2) erfc(−(L+vt)/√(4Dt)) − (1/2) e^{−vL/D} erfc((L−vt)/√(4Dt)). (F4) From the relation between the conjugate processes, we relate the cumulative distribution of the infimum of X_+ over the interval [0, t], inf X_+(t), to the survival probability of X_−(t): Pr(−inf X_+(t) ≤ L) = Pr(inf X_+(t) ≥ −L) = Pr(sup X_−(t) ≤ L) = Q_{X_−}(L, t). (F5) Using Eq. (F4) and the property erfc(x) + erfc(−x) = 2, we obtain an analytical expression for the cumulative distribution of the infimum of the position of a drift-diffusion process with positive velocity: Pr(−inf X_+(t) ≤ L) = 1 − (1/2) erfc((L+vt)/√(4Dt)) − (1/2) e^{−vL/D} erfc((L−vt)/√(4Dt)). (F6) Finally, for the stochastic process S_tot(t)/k_B = (v/D)X_+(t) the infimum distribution can be obtained by replacing in (F6) v and D by the effective values for the process S_tot(t)/k_B, given by v_eff = v²/D and D_eff = v²/D. Defining σ(t) = ⟨S_tot(t)⟩/k_B = (v²/D)t we obtain: Pr(−inf S_tot(t)/k_B ≤ s) = 1 − (1/2) erfc((s+σ(t))/(2√σ(t))) − (1/2) e^{−s} erfc((s−σ(t))/(2√σ(t))), (F7) which equals Eq. (78).
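The exponential tail of the infimum distribution in the long-time limit, Pr(−S_inf/k_B ≥ s) = e^{−s}, can be checked in a discrete toy model (our choice of model and parameters; the time horizon is truncated, so small deviations remain). For a biased ±1 walk with p > q, S_tot(n)/k_B = X_n ln(p/q), and the probability that the position ever reaches −m is (q/p)^m = e^{−m ln(p/q)}:

```python
import numpy as np

rng = np.random.default_rng(4)

# Biased random walk (up with prob p > q): the global infimum of the
# position satisfies Pr(inf X <= -m) = (q/p)^m, i.e. the entropy-production
# infimum -S_inf/k_B = -inf(X) ln(p/q) has the exponential tail exp(-s).
p, q = 0.6, 0.4
n_paths, n_steps = 60_000, 300
steps = rng.choice(np.array([1, -1], dtype=np.int8),
                   size=(n_paths, n_steps), p=[p, q])
X = np.cumsum(steps, axis=1, dtype=np.int32)
inf_X = np.minimum(X.min(axis=1), 0)       # include the start X(0) = 0

for m in (1, 2, 3, 4):
    empirical = (inf_X <= -m).mean()
    exact = (q / p) ** m                   # = exp(-s), with s = m ln(p/q)
    assert abs(empirical - exact) < 0.01
    print(f"Pr(inf X <= -{m}) = {empirical:.3f}   (q/p)^{m} = {exact:.3f}")
```

The horizon of 300 steps is long enough that, against the positive drift, later excursions below the recorded minimum are exponentially unlikely.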

Mean infimum of entropy production
We first determine the distribution of the infimum of X_+, and then compute its mean value. Note that the infimum of X_+ equals, in distribution, minus the supremum of the conjugate process X_−: inf X_+(t) = − sup X_−(t). (F8) The cumulative distribution of the supremum of X_− is given by Pr(sup X_−(t) ≤ L) = 1 − (1/2) erfc((L+vt)/√(4Dt)) − (1/2) e^{−vL/D} erfc((L−vt)/√(4Dt)), (F9) where we have used Eq. (F4) and the property erfc(x) + erfc(−x) = 2. The distribution of the supremum of X_− can be found by differentiating Eq. (F9) with respect to L, which yields: Pr(sup X_−(t) = L) = (v/(2D)) e^{−vL/D} erfc((L−vt)/√(4Dt)) + (1/√(πDt)) e^{−(L+vt)²/(4Dt)}. (F10)

First-passage-time statistics for two asymmetric boundaries
We consider the drift-diffusion process with two absorbing boundaries located at L_+ ≥ 0 and −L_− ≤ 0, and with the initial position X(0) = 0. The passage probabilities P^X_+ and P^X_− to first reach the positive and negative boundary, respectively, are given by P^X_+ = (1 − e^{−vL_−/D})/(1 − e^{−v(L_+ + L_−)/D}) (G3) and P^X_− = e^{−vL_−/D} (1 − e^{−vL_+/D})/(1 − e^{−v(L_+ + L_−)/D}). (G4) The corresponding entropy-production passage probabilities follow from the expressions (G3) and (G4) using the threshold values s^−_tot/k_B = vL_−/D and s^+_tot/k_B = vL_+/D. Equations (G3) and (G4) are in correspondence with Eqs. (5) and (6) for passage probabilities of entropy production.
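These passage probabilities can be cross-checked exactly on a discrete analogue (our toy model and parameters, not from the text): a biased random walk with absorbing boundaries at +L_+ and −L_−, for which s^±_tot/k_B = L_± ln(p/q). Iterating the absorbing chain until all mass is absorbed reproduces the formula P_+ = (1 − e^{−s^−_tot/k_B})/(1 − e^{−(s^+_tot + s^−_tot)/k_B}).

```python
import numpy as np

# Biased random walk absorbed at +L_plus or -L_minus, started at the origin.
# Summing the absorbed probability mass over time gives the exact passage
# probabilities, to compare with the drift-diffusion formula.
p, q = 0.6, 0.4
L_plus, L_minus = 4, 2

sites = np.arange(-L_minus + 1, L_plus)    # interior states
prob = np.zeros(len(sites))
prob[L_minus - 1] = 1.0                    # start at the origin

P_plus = P_minus = 0.0
for _ in range(5000):                      # iterate until mass is absorbed
    P_plus += p * prob[-1]
    P_minus += q * prob[0]
    new = np.zeros_like(prob)
    new[1:] += p * prob[:-1]
    new[:-1] += q * prob[1:]
    prob = new

s_plus = L_plus * np.log(p / q)            # thresholds in units of k_B
s_minus = L_minus * np.log(p / q)
predicted = (1 - np.exp(-s_minus)) / (1 - np.exp(-(s_plus + s_minus)))
assert prob.sum() < 1e-12                  # essentially all mass absorbed
assert abs(P_plus - predicted) < 1e-10
assert abs(P_plus + P_minus - 1.0) < 1e-10
print(f"P_+ = {P_plus:.6f}, predicted {predicted:.6f}")
```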
Notably, the first-passage-time fluctuation theorem does not hold in this simple form for asymmetric boundaries; the ratio of the first-passage-time distributions is instead given by Eq. (G6).
For asymmetric boundaries, the ratio of the first-passage-time distributions is time dependent, and converges to a finite value in the limit t → ∞ [71]. Consequently, for asymmetric boundaries the mean first-passage time for entropy production to reach the positive threshold s^+_tot is different from the mean first-passage time for entropy production to reach the negative threshold s^−_tot, ⟨T^{(1)}_+⟩ ≠ ⟨T^{(1)}_−⟩. When s^+_tot = s^−_tot = s_tot we recover the time-independent ratio, e^{s_tot/k_B}, as given by the first-passage-time fluctuation theorem.