Thermodynamics of computations with absolute irreversibility, unidirectional transitions, and stochastic computation times

Developing a thermodynamic theory of computation is a challenging task at the interface of non-equilibrium thermodynamics and computer science. In particular, this task requires dealing with difficulties such as stochastic halting times, unidirectional (possibly deterministic) transitions, and restricted initial conditions, features common in real-world computers. Here, we present a framework which tackles all such difficulties by extending the martingale theory of non-equilibrium thermodynamics to generic non-stationary Markovian processes, including those with broken detailed balance and/or absolute irreversibility. We derive several universal fluctuation relations and second-law-like inequalities that provide both lower and upper bounds on the intrinsic dissipation (mismatch cost) associated with any periodic process — in particular the periodic processes underlying all current digital computation. Crucially, these bounds apply even if the process has stochastic stopping times, as it does in many computational machines. We illustrate our results with exhaustive numerical simulations of deterministic finite automata (DFA) processing bit strings, one of the fundamental models of computation from theoretical computer science. We also provide universal equalities and inequalities for the acceptance probability of words of a given length by a deterministic finite automaton in terms of thermodynamic quantities, and outline connections between computer science and stochastic resetting. Our results, while motivated from the computational context, are applicable far more broadly.

The past decade also witnessed progress in the thermodynamics of computation. Although many initial studies of the energetic costs of computation concerned unit operations such as bit erasure [23][24][25][26], which are too primitive to be pertinent to the formal models of computation in theoretical computer science (TCS), very recent work has started to investigate the energetic costs of implementing computational machines central to TCS, which perform tasks such as string matching (justifiably more complex than bit erasure) [27][28][29]. Figure 1(a) shows the general model of a computational machine (henceforth called a computer) which implements the basic algorithm presented in Fig. 1(b).
An algorithm is a finite procedure for implementing a given task, which can be executed in various physical ways, e.g., by modifying the current in electrical wires or the structure of a DNA origami. Formally, an algorithm consists of the instructions to be performed (implemented by the dynamics of the computer), the local variables and memory arrays (stored by the computer), as well as mechanisms to decide when to repeat steps and when to halt. A computer executes an algorithm on a given set of inputs, starting from a certain initial state, potentially following unidirectional transitions in its state space, and halting at an arbitrary stochastic time that depends on the computation. Hence a general thermodynamic model of computers which implement arbitrary algorithms should be able to account for the energetic costs of implementing computational processes (i) at arbitrary stopping times, with (ii) unidirectional (possibly deterministic) transitions, and (iii) "absolute irreversibility" due to the computer being initialized to a designated start state.
However, most of the central results in stochastic thermodynamics do not directly apply to processes having the aforementioned three key ingredients of computational processes (stopping times, unidirectional transitions, and absolute irreversibility). In fact, a central assumption in much of stochastic thermodynamics is the condition of local detailed balance (LDB), which requires the system to have only bidirectional transitions, i.e., all transitions i → j between any two states, together with their reverses j → i, have a finite, non-zero probability of occurring in a finite time. LDB can be formally avoided by taking an inclusive Hamiltonian approach [30][31][32][33], which has recently been applied to computational machines [29]. However, in general there may be hidden nonequilibrium (driven) degrees of freedom which need to be included in the thermodynamic framework beyond a surrounding thermal bath. Moreover, even when LDB holds, the ratio of transition probabilities between i → j and the reverse j → i can grow exponentially with the entropy exchanged with the environment, so that the reverse transition is effectively never seen on relevant time scales, leading to effective unidirectionality. Until now, little is known about the stochastic thermodynamics of systems with broken LDB reflecting unidirectional transitions [34][35][36][37][38][39] or athermal and nonequilibrium environments [40][41][42][43][44]. In addition, the recently established martingale theory of thermodynamics (see [19] for a review) -- which formulates fluctuation theorems and second-law-like inequalities at generic stopping times -- has not yet been extended to systems with either unidirectional transitions or absolute irreversibility.

FIG. 1. Illustrations of computations with absolute irreversibility, unidirectional transitions, and stochastic computation times. An algorithm is executed by using a set of instructions, a finite control unit or local variables, a working memory to store the input or intermediate execution values, an address index, and mechanisms which specify when to loop and when to halt (the latter is of interest to us in thinking of stopping times). Both in computer-science models and in physical computers, the finite control unit generally corresponds to a circuit [as in (a), implementing the algorithm in (b)] or a DFA as in (c), here deciding whether input bit strings are divisible by four. In physical implementations of such devices, which solve a myriad of computational tasks, energetic costs are inevitable.
In this paper, we develop a nonequilibrium thermodynamic theory for computations with stochastic computation times that may have absolute irreversibility and unidirectional transitions [45]. Throughout this work, we focus on the intrinsic thermodynamic costs of computations, modeled by generic Markovian dynamics, with minimal or no details about their physical implementation. We derive fluctuation relations for key thermodynamic quantities applicable to all computational processes which can be modeled as discrete-time Markov chains (DTMCs), with both unidirectional and bidirectional transitions, and restricted initial probability distributions. These relations hold at both fixed and stopping times, and so simultaneously extend the martingale theory of stochastic thermodynamics to DTMCs with absolute irreversibility and unidirectional transitions.
The thermodynamic meaning of our results is established by introducing a generic physical implementation of the computer as a periodically driven process operated over a set of hidden degrees of freedom. This allows us to link dissipation from the underlying (physical) to the visible (computational) level, and to obtain quantifiers for the energetic costs of computations up to their halting time, or between halting times of consecutive computations, together with their statistics. Our results, while emphasized here for computational processes, can also describe a wide range of systems, including, e.g., biochemical processes with irreversible release of molecules.
We illustrate our results with deterministic finite automata (DFA). Loosely speaking, a DFA is a system with a finite state space, initialized to a special start state q_0 at t = 0, and is a logical computer in its own right, able to solve basic computational tasks such as string matching. More importantly, it constitutes the "finite logic" part [see Fig. 1(b), (c)] of the engineered computers in use today, and formally corresponds to the finite logic component of Turing machines (TM).

B. Summary of our contributions and roadmap
Consider a discrete-time computational task which is implemented by iteratively processing an input sequence of symbols w through a maximum of τ iterations.We refer to τ as the limit time of the computation.As an example, τ could be the length |w| of a bit string w, and the computation could involve iteratively processing each of those bits in sequence.
During such a computation, the state of the computer evolves in a stochastic manner, tracing a stochastic trajectory over a set of computational variables x_{[0,τ]} = x_0, ..., x_τ. The computation finishes either at the limit time or, if earlier, at a stochastic computation time T, which in general is a function of the precise input w. Formally, computation times T are specific examples of stopping times. A stopping time is the first time that a stochastic trajectory meets a specific predefined criterion [46]. In this work, we deal with stopping times which associate to each trajectory x_{[0,τ]} = x_0, ..., x_τ a stochastic time T ≤ τ that is always smaller than or equal to the limit time. As an example of a stopping time in computation, T could be the first time a DFA reaches a prescribed computational state A. T could also be defined as the first time a DFA reaches a state B ≠ A once having left state A. Stopping times thus provide a flexible yet rigorous mathematical toolbox to tackle computations that last a stochastic amount of time.
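As a rough illustration of these definitions (an invented toy, not one of the paper's simulations), the following sketch implements a hypothetical DFA whose state is the value modulo 4 of the bits read so far, and computes the stopping time T as the first iteration at which the automaton hits a prescribed target state, capped at the limit time τ = |w|.

```python
# Hedged sketch: a toy DFA whose state is the value (mod 4) of the bits read
# so far, so state 0 means "the prefix read so far is divisible by four".
def step(state, bit):
    return (2 * state + bit) % 4

def stopping_time(word, target=0):
    """First time t >= 1 at which the DFA sits in `target`, capped at the
    limit time tau = len(word), so that T <= tau always holds."""
    tau = len(word)
    state = 0                      # designated start state q_0
    for t, bit in enumerate(word, start=1):
        state = step(state, bit)
        if state == target:        # predefined stopping criterion met
            return t
        # otherwise keep processing symbols
    return tau                     # criterion never met: stop at limit time

print(stopping_time([1, 0, 0, 1, 0, 0]))  # prints 3: prefix "100" = 4
```

Each input word thus induces its own stochastic time T, exactly as in the discussion above.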
In Sec. II, we provide the elementary concepts of our framework, including a formal definition of DFAs, the description of computational processes (such as the running of a DFA) as Markov chains, and the physical implementation of those Markov chains. We also review the relevant thermodynamic quantities in this context.
In this paper we investigate the thermodynamic costs of computations, paying particular attention to those with such stochastic durations. As described below, this will lead us to concentrate on a specific thermodynamic quantity, which we call the intrinsic mismatch cost. The intrinsic mismatch cost associated with a stochastic trajectory x_{[0,T]} that takes place in the interval [0, T] of stochastic duration T is defined as

Σ(T) ≡ ∑_{t=0}^{T−1} [ ln(ρ_t(x_t)/r(x_t)) − ln(ρ_{t+1}(x_{t+1})/r′(x_{t+1})) ].   (1)

In Eq. (1), ρ_t(x) is the probability for the computer to be in state x at time t during the computation, whereas r(x) is an arbitrary reference probability distribution. On the other hand, ρ_{t+1}(x) and r′(x) are the distributions obtained by applying one iteration of the computer to ρ_t(x) and r(x), respectively. In particular, when the reference distribution minimizes the dissipation in the computer, it is called the prior distribution, and the associated quantity (1) is known as the mismatch cost of the computation up to time T. This cost provides a lower bound on the entropy production incurred by any digital synchronous computer that implements this dynamics over values of the computational variables x, without any precise assumptions about the continuous-time process implementing each successive iteration of the computation [47]. Note that while the distributions ρ_t and ρ_{t+1} are indexed by the iteration number, changing as the computation proceeds, the distributions r and r′ are the same for every iteration of the computation. Some of our most important results are fluctuation relations and inequalities for the statistics of Σ(T). These are three-fold universal, in that they are valid for all (i) reference distributions r(x) [although we will make specific choices for it to give a concrete thermodynamic meaning to Σ]; (ii) stopping rules T that halt before τ + 1; (iii) discrete-time Markov chains (DTMCs), even those with restricted initial conditions and/or unidirectional transitions, and even DTMCs that are implemented by underlying non-Markovian continuous-time processes.
To illustrate these results we first need to introduce more definitions. Here and throughout this paper, we use ⟨A(T)⟩ ≡ E[A(T)] for the expectation of any functional A of x_{[0,T]} over many realizations of the computation, each ending at a possibly different time T. In addition, we introduce the stochastic distinguishability at stopping times,

δ(T) ≡ ln[ρ_T(x_T)/ρ̃_{τ−T}(x_T)],   (2)

which quantifies the asymmetry of the computational process under time reversal through its distribution and plays a key role in identifying possible reductions in thermodynamic costs along processes with stochastic duration [16]. The distribution ρ̃_t(x) in Eq. (2) is the probability for an auxiliary computation running backwards in time to be in state x at time t. Such an auxiliary computation evolves with a transition matrix that is the Bayesian inverse of the original computation with respect to r, such that it leads to perfect retrodiction in time of the distribution r [48]. In Sec. III, we provide a formal definition of the auxiliary computational process, which allows us to address the thermodynamics of computations ending at a fixed time at the fluctuating level, and discuss how to incorporate explicitly the role of absolute irreversibility and unidirectional transitions, which is crucial because conventionally formulated computational machines have precisely those features.
In Sec. IV we present our main results for general computations starting and ending at stochastic halting times, which include fluctuation theorems and second-law inequalities that provide lower bounds on the average dissipation in a computation. A central result of our work is the derivation of an integral fluctuation relation at stopping times which applies to arbitrary computations:

⟨e^{−Σ(T)−δ(T)}⟩ = 1 − Γ_τ.   (3)

In this equation, Γ_τ quantifies absolute irreversibility at stopping times. It equals the functional e^{−δ(T)} averaged over the restricted set of trajectories that can take place in the auxiliary dynamics but have zero probability in the original dynamics, see Eq. (36). This contribution introduces an unavoidable source of irreversibility that limits possible reductions in the dissipation of the computer.
Among other things, Eq. (3) provides a second-law inequality at stopping times for the intrinsic mismatch cost,

⟨Σ(T)⟩ ≥ −⟨δ(T)⟩ − ln(1 − Γ_τ).   (4)

As we show in this paper, the right-hand side of Eq. (4), which sums up the net effects of time asymmetry and absolute irreversibility, gives a universal lower bound not only for the intrinsic mismatch cost of the computation but also for the underlying average entropy production incurred by the computer. Moreover, for the particular case of a stationary reference distribution (r = π, with π′ = π), ⟨Σ(T)⟩ equals the average (discrete-time) nonadiabatic entropy production [49,50].
We remark that Eq. (3) follows from a stronger result that we derive here. In particular, e^{−Σ(t)−δ(t)} is a supermartingale process, i.e., it decreases on average with time when conditioned on an earlier part of the trajectory of states:

⟨e^{−Σ(t)−δ(t)} | x_{[0,s]}⟩ ≤ e^{−Σ(s)−δ(s)},

where t ≥ s ≥ 0, see Eq. (31) [51].
Another contribution of this paper arises when we apply the general martingale theory for thermodynamics [19] to extend our results to multiple, ordered stopping times. In particular, for the case of two stopping times T_1 and T_2 with T_2 ≥ T_1 we obtain another central result:

⟨Σ(T_2) + δ(T_2)⟩ ≥ ⟨Σ(T_1) + δ(T_1)⟩,   (5)

which provides a powerful second-law inequality applicable to both starting and ending stochastic times of computations. As an example, in the case of a DFA, T_1 could be the first time that some particular state A is reached, and T_2 could be the first time state B is reached after state A has been reached. Alternatively, T_2 could be the second time state A is reached after the system has first reached state A, then left it, and then returned. See Sec. V for numerical illustrations of these ideas in a specific minimal model of a DFA processing binary strings. Moreover, from Eq. (5) a sandwich inequality for ⟨Σ(T)⟩ can be derived [52],

−⟨δ(T)⟩ − ln(1 − Γ_τ) ≤ ⟨Σ(T)⟩ ≤ ⟨Σ(τ)⟩ + ⟨δ(τ)⟩ − ⟨δ(T)⟩,   (6)

extending recent research on upper bounds and inverse thermodynamic uncertainty relations in stochastic thermodynamics [53][54][55].
In addition to the aforementioned fundamental results, in Sec. VI we combine the supermartingale property of e^{−Σ(t)−δ(t)} with the fluctuation relation (3) to derive universal equalities and inequalities for the probability that a computation is completed within a certain amount of time. This idea can be applied, e.g., to compute the probability that a sequence of τ ordered computational states visited by a DFA during its evolution reaches an accept state.
Sec. VII is then devoted to sketching how our theory can be applied to investigate the thermodynamics of multiple concatenated runs of a DFA, where after each run ends the system is reset to an initial start state and the next run begins. We conclude with Sec. VIII, where we present our main conclusions and further discuss future research directions motivated by our findings. Mathematical details of the derivations, proofs, and additional discussions are left to the corresponding appendices.
It is worth remarking that all our contributions, while originally motivated by problems arising in the context of the kinds of computational machines central to computer science theory, are applicable to any periodic process implementing a time-homogeneous DTMC, i.e., one that results in trajectories x_{[0,T]}.

II. MARKOVIAN COMPUTATIONS
In the following, we assume that the implementation of a task on a given computer is realized through a physical process which induces Markovian (discrete) dynamics over a set of relevant computational states. The actual physical process being modeled will be a generic physical, chemical, or biological system, whose dynamics can be described at a microscopic level over a set of hidden degrees of freedom [56,57], here assumed to be not directly accessible. In particular, it is customary to model a computation as a continuous-time Markov chain (CTMC) [57][58][59][60][61].
In "synchronous" physical computers -- such as all real-world digital computers -- this CTMC is driven externally following a periodic protocol induced, e.g., by the AC electric current powering the computer. Such underlying periodic driving can be ignored in modelling the computational process by coarse graining its evolution in time, which results in an effective model given by a time-homogeneous DTMC. Throughout this paper, we will work at this coarse-grained level, and consider computational processes as generic DTMCs with time-independent transition probabilities. In doing so, we will map the underlying physical process to the DTMC dynamics of the (symbolic) computational states to formulate the actual physical dissipation in a thermodynamically consistent manner.

A. Stochastic computational processes
We consider computational processes described by a DTMC that can take values over a discrete set of N ≥ 1 computational states x_t ∈ X, with t = 0, 1, 2, .... For simplicity, we assume that the transition probabilities between the computational states are time independent (however, our results can be extended to time-dependent transition probabilities).
We write P(x_{t+1} | x_t) for the conditional probability of jumping to state x_{t+1} given that the previous state was x_t, in a single time step or iteration of the computational process. (Note that in a DTMC x_{t+1} can be the same as x_t, allowing for time instances where the system dwells in a given state.) We write ρ_t(x) for the probability of being in state x at time t, given an ensemble of realizations of the Markovian process. The associated discrete-time master equation reads ρ_{t+1} = Wρ_t, where ρ_t is an N × 1 column vector and [W]_{i,j} = P(x_{t+1} = i | x_t = j) is the transition probability matrix. The transition matrix W has at least one fixed point, i.e., a distribution π(x) such that Wπ = π; if W is aperiodic and irreducible, π is the unique stationary distribution reached at long times, that is, lim_{t→∞} ρ_t = π. However, what follows does not require π to be unique.
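The discrete-time master equation can be iterated numerically. The sketch below (with a made-up 3-state column-stochastic matrix, not a model from the paper) propagates ρ_{t+1} = Wρ_t from a restricted initial condition and checks convergence to a fixed point π satisfying Wπ = π.

```python
# Toy column-stochastic transition matrix: W[i][j] = P(x_{t+1}=i | x_t=j).
# Illustrative numbers only; each column sums to 1.
W = [
    [0.5, 0.0, 0.2],
    [0.5, 0.7, 0.0],
    [0.0, 0.3, 0.8],
]

def iterate(W, rho):
    """One step of the discrete-time master equation rho_{t+1} = W rho_t."""
    n = len(rho)
    return [sum(W[i][j] * rho[j] for j in range(n)) for i in range(n)]

rho = [1.0, 0.0, 0.0]      # restricted initial condition: a delta on state 0
for _ in range(2000):      # this W is aperiodic and irreducible, so rho -> pi
    rho = iterate(W, rho)

# At the fixed point, one more iteration leaves the distribution unchanged.
assert all(abs(a - b) < 1e-12 for a, b in zip(iterate(W, rho), rho))
```

Note that the delta-function initial condition mirrors the computer's designated start state, the source of absolute irreversibility discussed above.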
Throughout the paper we will write τ for the limit time of a computation, i.e., the maximum time that can be spent executing a computation, and assume it to be fixed. The probability of a sequence x_{[0,τ]} = x_0, x_1, ..., x_τ is

P(x_{[0,τ]}) = ρ_0(x_0) ∏_{t=0}^{τ−1} P(x_{t+1} | x_t).   (7)

Here we allow for arbitrary initial distributions ρ_0(x_0) and transition probabilities P(x_{t+1} | x_t). In particular, some of the transitions might be bidirectional (i ↔ j) and others unidirectional (i → j).
Bidirectional transitions are characterized by conditional probabilities satisfying P(i | j) > 0 whenever P(j | i) > 0, while for unidirectional transitions we can have P(i | j) = 0 with P(j | i) > 0. Precisely because of the existence of unidirectional transitions, it is mandatory to relax the condition of local detailed balance, which is arguably among the most common assumptions adopted in the formulation of stochastic thermodynamics [62].
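The distinction between the two kinds of transitions can be read off mechanically from W; a minimal sketch (again with an invented matrix) classifies each pair of states.

```python
# W[i][j] = P(i | j); columns sum to 1.  Invented example mixing both kinds:
# 0 <-> 1 is bidirectional, while 2 -> 0 and 1 -> 2 are unidirectional.
W = [
    [0.5, 0.1, 0.2],
    [0.5, 0.6, 0.0],
    [0.0, 0.3, 0.8],
]
n = len(W)
bi, uni = [], []
for i in range(n):
    for j in range(i + 1, n):
        if W[i][j] > 0 and W[j][i] > 0:
            bi.append((i, j))      # i <-> j: both directions allowed
        elif W[i][j] > 0 or W[j][i] > 0:
            uni.append((i, j))     # only one direction has nonzero probability
print(bi, uni)  # [(0, 1)] [(0, 2), (1, 2)]
```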
One of the main quantities of interest in stochastic thermodynamics is the stochastic entropy production (EP), which equals the logarithm of the ratio between forward and time-reversed path probabilities of a thermodynamic process [3,63]. This quantity, however, generically depends on the details of the underlying physical process implementing the computation, and hence is not directly accessible unless certain simplifying assumptions hold, such as local detailed balance. Nevertheless, here we aim to obtain a thermodynamic description of computational processes deduced solely from the (discrete-time) dynamics of the visible variables x_t ∈ X defining the computation. While our analysis holds for arbitrary DTMCs, we focus on digital synchronous computers, which undergo time-homogeneous dynamics over discrete time, and which we connect to the underlying physical process generating them in a simple manner. This allows us to express and bound the entropy production of the computational task implemented by the DTMC, alongside the work and heat dissipated into the environment. For simplicity, we take the continuous-time physical process that implements the time-homogeneous DTMC to be periodic, and choose units so that the period of the physical process is 1.
As an example (and to help ground the reader's intuition), suppose that our time-homogeneous DTMC is implemented by a time-inhomogeneous CTMC. It is well known that, in general, this requires the CTMC to evolve over an enlarged version of the DTMC's state space Y ⊇ X, which includes "hidden states" in addition to the "visible" states of the DTMC [57,64]. In particular, this is true when the DTMC is the update function of a computational machine. Therefore our assumption that the continuous-time physical process is periodic implies that the time-inhomogeneous CTMC is periodic. As a result, the thermodynamics arising in any single iteration of the physical system (implementing the computational machine that starts at discrete time t in some state y(t) ∈ Y) is independent of t. In the following, for further simplicity (and to ensure a time-homogeneous DTMC), we also assume that the non-computational degrees of freedom in Y are reinitialized within every single iteration to their (possibly nonequilibrium) initial states.

B. Mismatch cost
Enlarging our original description over Y to include all relevant physical variables of the computer is crucial to define the associated entropy production and other relevant thermodynamic costs of computation. Here we show that this can be done in a standard way. In particular, suppose that we are interested in some generic thermodynamic average cost function that can be written as

C(τ) = S(Gϱ_0) − S(ϱ_0) + F(ϱ_0),   (8)

where S(ϱ) = −∑_i ϱ(i) ln ϱ(i) is the Shannon entropy, ϱ_0 is any initial distribution over the (extended set of) states of the system, G is the linear map that transforms that distribution into the associated final distribution ϱ_τ = Gϱ_0, and F is an arbitrary linear functional of the initial state. Note that C(τ) is an implicit function of ϱ_0. As a canonical example, in CTMC-based stochastic thermodynamics obeying local detailed balance, the EP generated during a process is given by Eq. (8) by setting F equal to the average entropy flow to the environment:

F(ϱ_0) = ∫_0^τ dt ∑_v ∑_{i≠j} K^v_{ij}(t) ϱ_j(t) ln[K^v_{ij}(t)/K^v_{ji}(t)],   (9)

where K^v_{ij}(t) is the rate matrix associated with thermal reservoir v, and the rate matrix of the CTMC is K_{ij}(t) = ∑_v K^v_{ij}(t) [65]. For different choices of F, C(τ) gives different thermodynamic quantities besides EP, such as the drop in nonequilibrium free energy of the system during the process [66], among many others.
For any cost C of the form in Eq. (8), and any physical process represented by G, the prior distribution is defined as the initial distribution ϱ_0 that minimizes C(τ). (It is called the prior because it is, formally speaking, a prior distribution for calculating the posterior probability of an initial state of a thermodynamic process given its final state [67].) We write the prior as ϱ_min. The associated average mismatch cost is

M(τ) = D(ϱ_0 ∥ ϱ_min) − D(Gϱ_0 ∥ Gϱ_min),   (10)

where D(ϱ_1 ∥ ϱ_2) = ∑_y ϱ_1(y) ln[ϱ_1(y)/ϱ_2(y)] denotes the Kullback-Leibler (KL) divergence between the distributions ϱ_1 and ϱ_2 for a discrete random variable. Note that M(τ) is implicitly a function of ϱ_0, ϱ_min, and the linear map G -- but nothing else; it depends on no other property of the process.
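As a quick numerical sanity check of Eq. (10) (with invented numbers, not a model from the paper), the mismatch cost is the drop in KL divergence under the map G, and is non-negative by data processing:

```python
import math

def kl(p, q):
    """Discrete KL divergence D(p || q); terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def apply_map(W, rho):
    """Linear map G represented by a column-stochastic matrix."""
    n = len(rho)
    return [sum(W[i][j] * rho[j] for j in range(n)) for i in range(n)]

# Illustrative single-iteration map and distributions (made up).
W = [[0.9, 0.3], [0.1, 0.7]]
rho0 = [0.5, 0.5]        # actual initial distribution
prior = [0.8, 0.2]       # stand-in for the prior rho_min (full support)

# Mismatch cost of Eq. (10): drop in KL divergence under G.
M = kl(rho0, prior) - kl(apply_map(W, rho0), apply_map(W, prior))
assert M >= 0.0          # data-processing inequality for the KL divergence
```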
As shown in Refs. [27,56,57,66], we have, for all ϱ_0,

C(τ) = M(τ) + R(τ),   (11)

where R(τ) is an additive, non-negative contribution independent of ϱ_0 called the "residual cost". In the specific case in which F is identified with the entropy flow, the residual cost is often called "residual EP". See Appendix A for a discussion of the residual cost, and of why we ignore it in this paper. Expressions analogous to Eqs. (8), (10) and (11) hold for other state spaces, e.g., real-valued states, density matrices, etc. Moreover, there are no assumptions of detailed balance or the like in the derivation of Eq. (10); it holds purely for mathematical reasons. For the trajectory-level version of the mismatch cost in Eq. (10), see Appendix B.
By the data-processing inequality for the KL divergence [68], M(τ) is never negative. Moreover, it can be shown that the prior ϱ_min in Eq. (10) has full support (see App. A in Ref. [27]), which ensures that the mismatch cost is finite. Note also that the mismatch cost formula (10) is based on evaluating ϱ at both the beginning and the end of the time interval [0, τ]. This means this general formula applies to any physical process that maps ϱ_0 to ϱ_τ, for any choice of C, i.e., any choice of the linear functional F. All the (messy) physical details of the process and the precise choice of F are buried in the prior and the residual cost.
In the following, unless explicitly stated otherwise, we will focus on the case in which the cost function C(τ) in Eq. (11) is the average entropy production of the continuous-time periodic process implementing the computation. If such a process is moreover Markovian, the entropy flow F will be given by the canonical CTMC expression in Eq. (9). However, Markovianity of the continuous-time periodic process is not a necessary assumption for our results (see also Appendix C).

C. Strictly positive lower bounds for dissipation in periodic processes
As described above, in real-world (synchronous, digital) physical computers, the underlying physical process implementing each iteration of the computer is identical. This is true whether that physical process is a CTMC, a quantum operation, and so on. As noticed in Ref. [28], this means that the prior ϱ_min for each iteration of the computer is the same [69]. Using also the fact that the non-computational variables are reinitialized in every single iteration (period) of the computational process, we can write the overall mismatch cost for any computation that takes exactly τ iterations in terms of computational variables as

⟨Σ(τ)⟩ = ∑_{t=0}^{τ−1} [ D(ρ_t ∥ μ) − D(ρ_{t+1} ∥ μ′) ],   (12)

where ρ_t(x) is the marginal of ϱ_t over the computational states x ∈ X only, μ(x) is the corresponding marginal of the prior ϱ_min at the beginning of (every) iteration, and μ′ = Wμ is the prior at the end of every iteration (i.e., μ evolved to the end of the iteration). More details about the derivation of Eq. (12) for the DTMC are provided in Appendix C, where we explicitly elaborate its relation to the underlying EP in the continuous-time periodic process.
We note that if ρ_0 = μ in Eq. (12), then the first difference of KL divergences being summed equals 0. However, unless W is degenerate (e.g., the identity matrix), Wρ_0 ≠ ρ_0, and therefore Wρ_0 ≠ μ. This in turn means that the second difference of KL divergences being summed in (12) does not equal 0 (so long as W is not logically invertible, i.e., not a permutation matrix). Therefore, in this case, the overall sum is strictly positive. This argument can be extended to prove that so long as W is not logically invertible (and ρ_0 is not a fixed point of the dynamics), the mismatch cost sum in Eq. (12) is not zero (see Appendix D).
Since the above reasoning is true for all actual priors μ, we can lower bound the sum (12) by minimizing over all distributions λ in the unit simplex Δ_X, whether or not they are a valid prior in some physical scenario:

⟨Σ(τ)⟩ ≥ min_{λ∈Δ_X} ∑_{t=0}^{τ−1} [ D(ρ_t ∥ λ) − D(ρ_{t+1} ∥ λ′) ],   (13)

with again λ′ = Wλ (see also Ref. [28]). The precise prior μ in Eq. (12) for the EP cost function will depend on the details of the precise physical process under consideration. On the other hand, the sum in Eq. (13) is independent of those details. We therefore obtain a strictly positive lower bound on EP, given in toto by ρ_0 and W. This strengthened second law arises solely from the fact that we have a periodic process with a non-logically-invertible W. Moreover, because the minimization in Eq. (13) is over all possible priors, it provides a lower bound on all costs that can be written as in Eq. (8). We therefore refer to it as the minimal dissipation.
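The minimization over the simplex can be carried out numerically. The sketch below (an invented two-state example, not from the paper's simulations) evaluates the per-iteration KL drops for candidate distributions λ on a grid over the interior of the simplex, and confirms that the minimum stays strictly positive when W is not logically invertible and ρ_0 is not a fixed point.

```python
import math

def kl(p, q):
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)

def apply_map(W, rho):
    n = len(rho)
    return [sum(W[i][j] * rho[j] for j in range(n)) for i in range(n)]

# Invented two-state map; column-stochastic, and not a permutation matrix,
# hence not logically invertible.
W = [[0.9, 0.8], [0.1, 0.2]]
rho0, tau = [0.2, 0.8], 3

rhos = [rho0]                       # marginals rho_0, ..., rho_tau
for _ in range(tau):
    rhos.append(apply_map(W, rhos[-1]))

def total_cost(lam):
    """Sum of per-iteration KL drops appearing in Eq. (13) for candidate lam."""
    lamp = apply_map(W, lam)        # lambda' = W lambda
    return sum(kl(rhos[t], lam) - kl(rhos[t + 1], lamp) for t in range(tau))

# Crude grid search over the interior of the unit simplex (two states).
best = min(total_cost([a, 1.0 - a]) for a in (i / 1000 for i in range(1, 1000)))
assert best > 0.0   # strictly positive minimal dissipation
```

A grid suffices here only because the state space has two elements; larger X would call for a proper convex optimizer.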
As an example, suppose that our DTMC is the dynamics of a noise-free, synchronous, digital computer, with update function f : X → X. Plugging this into Eq. (13) yields

⟨Σ(τ)⟩ ≥ min_{λ∈Δ_X} ∑_{t=0}^{τ−1} [ ∑_{x∈Ω_t} ρ_0(f^{−t}(x)) ln(ρ_0(f^{−t}(x))/λ(x)) − ∑_{x∈Ω_{t+1}} ρ_0(f^{−t−1}(x)) ln(ρ_0(f^{−t−1}(x))/λ′(x)) ].   (14)

The term Ω_t in Eq. (14) is the set of all states that have nonzero probability when the update function is applied to the actual distribution ρ_0 a total of t times. The term ρ_0(f^{−t}(x)) in Eq. (14) is the probability, under ρ_0, of the entire set of those states in X which, after t iterations of (the periodic processes underlying) the update function f of the digital computer, are in state x (and similarly for ρ_0(f^{−t−1}(x))). Suppose that f is not just a permutation of the states of the computational machine that lie in the support of ρ_0. Then Eq. (14) provides a strictly positive lower bound on the dissipation incurred by any physical device that implements the computation f.
In the sense that it only depends on the conditional distribution W and the initial distribution ρ_0, the bound for periodic processes in Eq. (13) is similar to the generalized Landauer bound. In particular, thermodynamic uncertainty relations and speed-limit theorems are also lower bounds on EP that depend on the initial distribution over states and the discrete-time conditional distribution of the dynamics. However, unlike the lower bound above, those other bounds depend on additional properties of the process beyond the initial distribution and the conditional distribution giving the dynamics (for example, current precisions or expected activities). In this sense, the minimal dissipation given in Eq. (13) is more powerful than those other lower bounds on EP (a closed form of this result in terms of the Jensen-Shannon divergence has also been reported very recently in Ref. [70]).
In this paper, we calculate mismatch costs by summing the cost over single iterations of a computational machine operating periodically, as in Eq. (12). In general this does not equal the standard mismatch cost for the entire computation, with an overall prior and a single drop in KL divergence between the initial time and the final time τ. We remark that, to the authors' knowledge, the necessary and sufficient conditions for the latter quantity to be larger than the one we use in this paper are not known. However, there is a particularly interesting case in which these two expressions coincide, namely, when EP is minimized at the stationary state of the DTMC, i.e., when the prior μ coincides with π. In that case we recover from Eq. (11) the well-known decomposition of EP into adiabatic and non-adiabatic contributions [49,50,71,72], where the mismatch cost reduces to the non-adiabatic EP (also called excess EP [19]) and the residual cost becomes the adiabatic EP (housekeeping heat [50,71]).

D. Deterministic Finite Automata
An important class of computational machines that can be described within our framework are the deterministic finite automata (DFA). There are several different, very similar definitions of DFA, some of which overlap with common definitions of "finite state machines". To fix the discussion, here we adopt the following definition. A deterministic finite automaton is a 5-tuple (Q, θ, q0, A, f) where: Q is a finite set of logical states; θ is a finite alphabet of input symbols; q0 ∈ Q is the start state; A ⊆ Q is the set of accept states; and f : Q × θ → Q is the update function, mapping the current input symbol and the current logical state to a next logical state.
A finite string of successive input symbols, i.e., an input string ω ∈ θ*, is sometimes called an (input) word. To operate a finite automaton on a particular input word, one begins with the automaton in its start state, and feeds that state together with the first symbol in the input word into the update function, to produce a new logical state. Then one feeds in the next symbol in the input word (if any), to produce a next logical state. Note that one can represent any given DFA's update function as a directed graph, where each edge (q1, q2) taking logical state q1 to state q2 is labelled by the input symbols that would cause that transition (see Fig. 1(c) and Fig. 2 for illustrations).
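The definition above translates directly into code; a minimal sketch (the `DFA` class and the divisible-by-three example, which also appears later in Sec. V, are our own illustrative constructions):

```python
from dataclasses import dataclass

@dataclass
class DFA:
    states: set
    alphabet: set
    start: str
    accept: set
    delta: dict                      # total update function: (state, symbol) -> state

    def run(self, word):
        """Feed the symbols of `word` one by one through the update function."""
        q = self.start
        for s in word:
            q = self.delta[(q, s)]
        return q

    def accepts(self, word):
        return self.run(word) in self.accept

# Example: binary encodings of numbers divisible by three (most-significant bit first);
# state q_r tracks the running value mod 3.
mod3 = DFA(
    states={"q0", "q1", "q2"},
    alphabet={"0", "1"},
    start="q0",
    accept={"q0"},
    delta={(f"q{r}", b): f"q{(2 * r + int(b)) % 3}" for r in range(3) for b in "01"},
)
```

For instance, `mod3.accepts("110")` is `True` (six is divisible by three), while `mod3.accepts("10")` is `False` (two is not).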
Our analysis of stochastic computational processes (as introduced above) in DFAs requires assigning probabilities to the input words (or to the symbols inside them) that are fed into the automaton, as well as identifying the computational states of the DTMC X, which may or may not coincide with the set Q of logical states of the DFA (typically X may contain additional variables, e.g., previously processed symbols). An important contribution of our work will be to show how one can do this analysis even though the dynamics of a DFA (its update function) is deterministic and often non-invertible (i.e., unidirectional), and given that the initial distribution over states of the DFA (though not over the input words) is a delta function centered on the start state (i.e., leading to absolute irreversibility).
Physically, the (probabilistically generated) input word ω may be encoded in a tape whose symbols are read by the DFA "head" one by one in each cycle of the computation, but are not modified by the automaton operation [73]. In this way the input tape behaves as an (energy-less) information reservoir [74], whose Shannon entropy is kept constant during the computation. More formally, we can consider that tape as forming part of the physical states of the computation in the extended state space Y (a Cartesian-product factor of Q, ω, and other physical variables depending on the implementation), but we will not generically include it within the computational states in X [75]. On the other hand, we will eventually incorporate some already-processed symbols explicitly into X, which are then assumed to be stored (and modified) in extra physical variables acting as a memory for the computer.
A typical question of interest in computer science is whether the DFA is in an accept state of the set A after the last symbol from the input word is processed. If that is the case, one says that the automaton accepts that input word. In this way any given automaton uniquely specifies a language of all input words that that automaton accepts, which is called a regular language. Importantly, any particular DFA can process input words of arbitrary length [76], and in general may enter and exit its set of accept states multiple times before the end of the input word. While the definition of whether an input word is accepted only depends on whether the ending logical state is an accept state, the statistics of whether, how often, and precisely when a given DFA enters an accept state (when fed words generated by some given distribution) can be of independent interest.

III. INTRINSIC THERMODYNAMICS OF COMPUTATIONS AT FIXED TIMES
The mismatch cost sum introduced in Eq. (12) depends only on the computational degrees of freedom involved in the original DTMC dynamics and provides a lower bound on the average entropy production generated by the machine implementing the computation. It is hence a particularly useful candidate for assessing the intrinsic (minimal) thermodynamic costs of computations. The prior µ(x) encodes the specific details of the physical implementation of the computational process. Concern for such details can even be avoided by considering the distribution ν(x) given by the infimum of Eq. (13), which still provides a useful (positive) bound on EP.
To begin, we construct a stochastic description based on thermodynamic quantities that can be computed by introducing an auxiliary process. This process is defined in terms of the "forward" discrete-time dynamics P(j | i), the initial distribution of that dynamics, ρ0(x), and a reference distribution r(x) over computational states. The reference r(x) is arbitrary, and in particular could be chosen to obtain stochastic versions of the mismatch cost sum in Eq. (12) [r(x) = µ(x)] and the minimum dissipation in Eq. (13) [r(x) = ν(x)].

A. Thermodynamic costs of periodic computations at the fluctuating level
We start by introducing the discrete-time dynamics of the auxiliary process W̃i,j, with transition probabilities defined from the ones in W by P̃(i | j) = P(j | i) r(i)/r′(j), where r′ = Wr is the reference distribution r evolved for one iteration, i.e., r′(j) = Σi P(j | i) r(i) [77]. This auxiliary process is a bona fide Markov chain [78]. Moreover, r′ transforms back into r in a single iteration under W̃. That is, W̃ corresponds to the Bayesian inverse of W with respect to the reference distribution r, leading to perfect retrodiction for the distribution r [61,79-81].
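In code, the Bayesian-inverse construction and its retrodiction property can be checked directly; a minimal sketch with an arbitrary 3-state chain (the matrix and the reference distribution are placeholder assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((3, 3))
W /= W.sum(axis=0, keepdims=True)            # column-stochastic: W[j, i] = P(j | i)
r = np.array([0.5, 0.3, 0.2])                # full-support reference distribution
r_prime = W @ r                              # r' = W r

# Bayesian inverse of W w.r.t. r: P~(i | j) = P(j | i) r(i) / r'(j)
W_aux = (W * r[None, :]).T / r_prime[None, :]
```

Its columns sum to one (a bona fide Markov chain), and a single iteration maps r′ back to r, i.e., W̃ r′ = r.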
To fully specify the auxiliary dynamics we must specify its initial distribution; here we will always set it to the distribution of the original dynamics at its limit time, i.e., ρ̃0(x) = ρτ(x). The joint distribution of a trajectory x[0,τ] under the auxiliary dynamics is then P̃(x[0,τ]) = ρ̃0(x0) Π_{t=0}^{τ−1} P̃(xt+1 | xt) [Eq. (16)]. Note that this choice of the initial distribution of the auxiliary process is not restricted by the choice of r in any way. Note as well that P̃(i | j) does not necessarily coincide with the transition probabilities induced by the time-reversed implementation of the underlying physical process; it is defined solely from the distribution r(x) and the original Markov chain transition probabilities. Using Eqs. (15) and (16), we can write the probability of a time-reversed discrete-time trajectory, Θx[0,τ] = xτ, xτ−1, . . ., x0, under the auxiliary dynamics as P̃(Θx[0,τ]) = ρ̃0(xτ) Π_{t=0}^{τ−1} P̃(xt | xt+1) [Eq. (17)]. The ratio between the path probability to observe a given trajectory of states and the path probability to observe its time reversal under the auxiliary dynamics, Σ(x[0,τ]) := ln[P(x[0,τ])/P̃(Θx[0,τ])] [Eq. (18)], provides us, for r = µ, a stochastic version of the mismatch cost sum in Eq. (12), and for r = ν, the minimal dissipation in Eq. (13). The functional Σ(x[0,τ]) is an example of a "Σ-entropic functional", as introduced in Ref. [19].
The specific choice of the transition probability of the auxiliary dynamics introduced in Eq. (15) is crucial for avoiding the divergences that would be induced by unidirectional links if we evaluated expressions like ln[P(i | j)/P(j | i)], which appear in most functionals associated with entropy production. This makes the functional Σ given by Eq. (18) suitable for tackling fluctuations of Markovian processes with unidirectional transitions, which are precisely the (idealized) dynamics of many computational processes.
Here and in the following, as shorthand, we will often write trajectory-level quantities such as Σ(x[0,τ]) simply as Σ(τ), with the precise trajectory left implicit. Following such shorthand notation, Eq. (18) can be decomposed as Σ(τ) = ∆S_sys(τ) + ∆Φ(τ) [Eq. (19)], where the change in stochastic Shannon entropy of the computer is ∆S_sys(τ) = ln ρ0(x0) − ln ρτ(xτ) [Eq. (20)] and the change in the nonequilibrium potential is ∆Φ(τ) = Σ_{t=0}^{τ−1} ln[r′(xt+1)/r(xt)] [Eq. (21)]. Such non-equilibrium potentials have been fruitfully employed in steady-state thermodynamics [42,50,82], and account for the excess entropy absorbed from the environment during the computation x[0,τ] whenever the state of the system ρt differs from the distribution r along its time evolution. Suppose that the initial distribution ρ0(x) has full support. Then if we average Eq. (18) over P(x[0,τ]) we get ⟨P̃(Θx[0,τ])/P(x[0,τ])⟩ = 1, which is an integral fluctuation relation [3], ⟨e−Σ(τ)⟩ = 1. Moreover, ⟨ln[P(x[0,τ])/P̃(Θx[0,τ])]⟩ ≥ 0 is a KL divergence, which can be rewritten in an appealing form as ⟨Σ(τ)⟩ = Σ_{t=0}^{τ−1} [D(ρt ∥ r) − D(ρt+1 ∥ r′)] [Eq. (22)]. We notice that for the choice r = µ we recover the expression for the mismatch cost sum in Eq. (12), while for r = ν we obtain Eq.
(13), as expected. Crucially, for the two choices r = µ and r = ν, the quantity ⟨Σ(τ)⟩ provides a lower bound on the total average entropy production incurred in the physical implementation of the computational process, and therefore we may refer to it as the intrinsic mismatch cost associated with a given computation. We remark that here and above averages are over trajectories of fixed length τ, that is, ⟨Σ(τ)⟩ = Σ_{x[0,τ]} P(x[0,τ]) Σ(x[0,τ]). For more general choices of r, the quantity Σ(τ) can still be defined (as long as the distribution r has full support over X); however, it cannot be guaranteed in general that ⟨Σ(τ)⟩ provides a lower bound on the underlying entropy production anymore. In particular, by taking r = π, the stationary state of the DTMC, Σ(τ) becomes the discrete-time non-adiabatic entropy production for a relaxation process, whose average reads ⟨Σ(τ)⟩ = D(ρ0 ∥ π) − D(ρτ ∥ π) [Eq. (23)]; thus we recover the expression for EP proposed by Spohn [83] (see also Ref. [42]). Remarkably, in this case ⟨Σ(τ)⟩ becomes non-extensive in time, contrary to the general case [cf. Eq. (22)]. As a consequence, the steady state π of the DTMC (whenever aperiodic and irreducible) becomes the natural candidate for the prior ν providing the infimum in Eq. (13) in the large-time limit. Therefore we expect the non-adiabatic entropy production in Eq. (23) to provide the minimum dissipation of the computation in many cases of interest. However, it is worth remarking that ensuring the average non-adiabatic entropy production to be a lower bound on the EP would require π to share support with the initial distribution ρ0 (which would often not be the case in the computational context) and π to also be an invariant state of the time-reversed (underlying) physical dynamics of the computer [42].
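For r = π this average is a drop in KL divergence to the stationary state and is therefore non-negative and non-extensive in time; a quick numerical check (the two-state chain and the restricted initial condition are illustrative assumptions):

```python
import numpy as np

def kl(p, q):
    """KL divergence D(p || q) in nats, ignoring zero entries of p."""
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

W = np.array([[0.9, 0.2],
              [0.1, 0.8]])                   # column-stochastic toy chain
evals, evecs = np.linalg.eig(W)
pi = np.real(evecs[:, np.argmax(np.real(evals))]); pi /= pi.sum()

rho = np.array([1.0, 0.0])                   # restricted initial condition
d = [kl(rho, pi)]                            # D(rho_t || pi), monotonically non-increasing
for _ in range(5):
    rho = W @ rho
    d.append(kl(rho, pi))
# <Sigma(tau)> = D(rho_0 || pi) - D(rho_tau || pi) >= 0, and it saturates as rho_t -> pi
sigma_avg = d[0] - d[-1]
```

The saturation of `d` at long times is what makes this choice of reference non-extensive: almost all of the KL drop is paid during the initial relaxation.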

B. The role of absolute irreversibility
In many models of computation in TCS, the initial distribution ρ0(x) over the states of the computational machine is restricted to a subset of computational states in X. For instance, almost any automaton (in particular, not just a DFA but also a TM) starts in a single, predetermined state, x0. Such a system may have a delta-function initial distribution, ρ0(x) = δx,x0. For such an initial distribution the quantity e−Σ(τ) = P̃(Θx[0,τ])/P(x[0,τ]) may become ill-defined, as there might be trajectories for which P(x[0,τ]) = 0 but P̃(Θx[0,τ]) > 0, e.g., trajectories in the auxiliary dynamics that only reach states different from x0. This phenomenon has often been referred to as absolute irreversibility [84,85].
Following the techniques in Refs. [84-86] one can circumvent the divergence associated with absolute irreversibility by restricting the averages to sets of trajectories for which the intrinsic mismatch cost, Σ(τ), is well defined. Adopting the language of modern probability theory [87], we call such sets filtrations (see also [19]). In particular, we denote by F the filtration containing all possible trajectories x[0,τ] taking place in [0, τ]. Similarly, we call F_AI the filtration containing all "absolutely irreversible" trajectories, that is, trajectories for which P(x[0,τ]) = 0 but P̃(Θx[0,τ]) > 0. On the other hand, we denote the complementary set of "absolutely continuous" trajectories as F_AC, such that F = F_AC ∪ F_AI. Using these definitions, an extended version of the integral fluctuation theorem (IFT) for the intrinsic mismatch cost can be shown, ⟨e−Σ(τ)⟩ = 1 − γτ [Eq. (24)], where 0 ≤ γτ ≤ 1 is the total probability that the time-reversed picture of an absolutely irreversible trajectory (i.e., belonging to F_AI) occurs in the auxiliary dynamics, γτ = Σ_{x[0,τ]∈F_AI} P̃(Θx[0,τ]) [Eq. (25)]. Applying Jensen's inequality ⟨e^x⟩ ≥ e^⟨x⟩ to the IFT (24) we obtain a lower bound on the intrinsic mismatch cost, implying a minimum dissipation due to the restricted initial condition, ⟨Σ(τ)⟩ ≥ −ln[1 − γτ] ≥ 0 [Eq. (26)], where the second inequality follows from γτ ≥ 0 and hence extends the applicability of Eq. (22) to systems showing absolute irreversibility. We remark that here absolute irreversibility arises because of the restricted initial distribution, but not because of the unidirectional transitions, since the latter have been flipped in the auxiliary dynamics according to Eq. (15). Fluctuation theorems similar to Eq. (24) have previously been derived within the canonical framework of stochastic thermodynamics for entropy production [84] and standard mismatch cost [66], as well as in the inclusive Hamiltonian framework for entropy production [29].
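The correction γτ and the IFT of Eq. (24) can be verified exactly for a small chain by enumerating all trajectories; a sketch (the two-state chain, the choice r = π, and the delta initial condition are illustrative assumptions):

```python
import numpy as np
from itertools import product

W = np.array([[0.7, 0.4],
              [0.3, 0.6]])                    # column-stochastic: W[j, i] = P(j | i)
tau = 3
rho0 = np.array([1.0, 0.0])                   # delta initial condition -> absolute irreversibility

evals, evecs = np.linalg.eig(W)
pi = np.real(evecs[:, np.argmax(np.real(evals))]); pi /= pi.sum()
W_aux = (W * pi[None, :]).T / pi[None, :]     # Bayesian inverse of W w.r.t. r = pi

rho_tau = rho0.copy()
for _ in range(tau):
    rho_tau = W @ rho_tau                     # auxiliary initial distribution is rho_tau

def P_orig(x):                                # P(x_{[0,tau]}) under the original chain
    p = rho0[x[0]]
    for a, b in zip(x, x[1:]):
        p *= W[b, a]
    return p

def P_aux_rev(x):                             # P~(Theta x): auxiliary weight of the reversed path
    p = rho_tau[x[-1]]
    for a, b in zip(x, x[1:]):
        p *= W_aux[a, b]
    return p

paths = list(product(range(2), repeat=tau + 1))
gamma = sum(P_aux_rev(x) for x in paths if P_orig(x) == 0.0)   # total weight of F_AI
ift = sum(P_aux_rev(x) for x in paths if P_orig(x) > 0.0)      # <e^{-Sigma}> over F_AC
```

By construction the two sums partition the normalized auxiliary path measure, so `ift` equals 1 − γτ, reproducing the extended IFT.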

IV. THERMODYNAMICS OF COMPUTATIONS AT STOCHASTIC STOPPING TIMES
We now extend our analysis to investigate the thermodynamics of computations which first reach a computational state of interest at a time that varies depending on the random input provided to the computer.In doing so, we extend the martingale theory for stochastic thermodynamics [19] to accommodate unidirectional transitions and arbitrary initial distributions leading to absolute irreversibility.
Consider a random sequence of τ bits, ω = (ω1, ω2, . . ., ωτ), sequentially fed into a computer (e.g., a DFA; see also Fig. 1), with τ ≥ 1 being the word length processed by the machine. While processing a specific sequence, the computer jumps between its computational states, as described in Sec. II D. We are interested in the thermodynamics of the (physical implementation of the) computer during the time from when it starts to a stopping time T, that is, until a stopping condition is met. For example, we will often consider that the stopping condition is simply that the computer has reached an accept state for the first time. Note that this stopping time generally takes a different value when processing different words. Since the words are generated by sampling a distribution, this means that the stopping time is a random variable. Generalizing from this case to give a fully formal definition, a stopping time is the earliest instance at which a particular condition concerning the entire trajectory generated by a stochastic process is met [Eq. (28)], where Ω ⊆ F denotes the set of trajectories satisfying the stopping condition. For example, Ω might be the set of trajectories of a given DFA that have reached an accept state at least once. Note that the definition in Eq. (28) involves a limit time τ. So the stopping time associated with each stochastic trajectory is a bounded random variable that obeys 0 ≤ T ≤ τ. As shorthand, from now on we will typically just write "T", leaving the precise trajectory x[0,T] implicit. It is also worth remarking that the computational machine does not necessarily stop functioning at T; this variable can just signal the time at which a specific computation is accomplished (e.g., accepting a word). We will therefore sometimes refer to T in this context as the computation time, which is a particular instance of a (bounded) stopping time.
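As an illustration, such a bounded stopping time can be sampled by direct simulation; the sketch below (a hypothetical divisible-by-three DFA, with the stopping condition "first visit to the accept state after at least one symbol", is our own assumption) estimates the distribution of T for random input words with limit time τ:

```python
import random

random.seed(1)
tau, p0 = 20, 0.5
# Hypothetical DFA: state r tracks the running value mod 3; the accept state is 0.
delta = {(r, b): (2 * r + b) % 3 for r in range(3) for b in (0, 1)}

def stopping_time(tau):
    """First time the DFA visits the accept state after >= 1 symbol, capped at tau."""
    q = 0
    for t in range(1, tau + 1):
        q = delta[(q, 0 if random.random() < p0 else 1)]
        if q == 0:
            return t
    return tau          # stopping condition never met within [0, tau]: T = tau

times = [stopping_time(tau) for _ in range(1000)]
```

Each sampled value is bounded, 0 ≤ T ≤ τ, as required of a stopping time, and its distribution depends on the word statistics through p0.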

A. Martingale theory with absolute irreversibility
Inspired by [16], we now introduce the stochastic distinguishability between the computational process and the auxiliary process. Stochastic distinguishability (with respect to time τ) evaluated at time t ≤ τ is defined as δτ(t) := ln[ρt(xt)/ρ̃τ−t(xt)] [Eq. (29)], where ρ̃τ−t(x) is the probability distribution of the auxiliary process defined in Eq. (15), evaluated at the conjugate time τ − t for the state xt. (Recall that the auxiliary dynamics has initial distribution ρ̃0(x) = ρτ(x), i.e., the distribution of the original dynamics at the limit time τ.) Stochastic distinguishability is a measure of the asymmetry between the original and the auxiliary dynamics and plays a crucial role in martingale theory for the stochastic thermodynamics of non-stationary processes [19].
An important property of Mτ is that the expectation of Mτ(τ) conditioned on a fixed trajectory ending at a time 0 ≤ t ≤ τ satisfies Eq. (31), where we introduced the quantity ατ(t) defined by Eq. (32), with ατ(τ) := 0 (see Appendix E for details). Combining Eq. (31) with the fact that ατ(t) ≤ 1, we establish that Mτ(t) = e−Σ(t)−δτ(t) is a supermartingale: i.e., its conditional expectation given a fixed trajectory of length t < τ monotonically decreases over time. Note that for t = 0 one has Σ(0) = 0, and hence Eq. (31) yields the IFT with absolute irreversibility [cf. Eq. (24)], ⟨e−Σ(τ)⟩ = 1 − γτ [Eq. (34)], where we have used Eq. (25) in the last equality. In addition, in the absence of absolute irreversibility, F_AI is the empty set and ατ(t) = 0 for all t ∈ [0, τ]. In such a case Mτ(t) in Eq. (31) becomes a martingale. Therefore in that limit we would be able to use the analysis in [16,19] on the thermodynamics of systems with stochastic stopping times. However, that analysis does not directly apply for generic initial states ρ0(x) without full support.
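The (super)martingale structure can be probed numerically by exhaustive enumeration. The sketch below (a two-state toy chain with a full-support ρ0, so that there is no absolute irreversibility, and r = π, for which Σ(t) reduces to ln[ρ0/π](x0) − ln[ρt/π](xt); all numbers are illustrative assumptions) checks the necessary martingale condition ⟨Mτ(t)⟩ = 1 at every time:

```python
import numpy as np
from itertools import product

W = np.array([[0.7, 0.4],
              [0.3, 0.6]])                    # column-stochastic: W[j, i] = P(j | i)
tau = 3
evals, evecs = np.linalg.eig(W)
pi = np.real(evecs[:, np.argmax(np.real(evals))]); pi /= pi.sum()
W_aux = (W * pi[None, :]).T / pi[None, :]     # Bayesian inverse of W w.r.t. r = pi

rho0 = np.array([0.9, 0.1])                   # full support: no absolute irreversibility
rho = [rho0]
for _ in range(tau):
    rho.append(W @ rho[-1])
rho_aux = [rho[tau]]                          # auxiliary process starts at rho_tau
for _ in range(tau):
    rho_aux.append(W_aux @ rho_aux[-1])

def prob(x):                                  # path probability under the original chain
    p = rho[0][x[0]]
    for a, b in zip(x, x[1:]):
        p *= W[b, a]
    return p

def M(x, t):                                  # M_tau(t) = exp[-Sigma(t) - delta_tau(t)]
    sig = np.log(rho[0][x[0]] / pi[x[0]]) - np.log(rho[t][x[t]] / pi[x[t]])
    delta = np.log(rho[t][x[t]] / rho_aux[tau - t][x[t]])
    return float(np.exp(-sig - delta))

avgs = [sum(prob(x) * M(x, t) for x in product(range(2), repeat=tau + 1))
        for t in range(tau + 1)]              # each average equals 1 in this full-support case
```

Repeating the same enumeration with a delta-function ρ0 makes the averages dip below 1, the supermartingale signature of absolute irreversibility.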

B. Integral fluctuation relations with absolute irreversibility at stopping times
Fortunately, the fact that Mτ(t) is a supermartingale rather than a martingale when our system has absolute irreversibility does not prevent us from analyzing its thermodynamics at stopping times. To carry out such an analysis, here we closely follow the derivation of Doob's optional stopping theorem for martingales, generalizing it to apply to supermartingales that are written as in Eq. (31).
As elaborated in Appendix F, this generalized form of the optional stopping theorem provides a fluctuation theorem at stopping times which is valid even in the presence of absolute irreversibility, ⟨e−Σ(T)−δτ(T)⟩ = 1 − Γτ [Eq. (35)], where T ≤ τ is the (stochastic) stopping time and Γτ ∈ [0, 1] is a contribution from absolute irreversibility; therefore ⟨e−Σ(T)−δτ(T)⟩ ≤ 1. Since ⟨·⟩ is an average over trajectories, and different trajectories have different stopping times, ⟨e−Σ(T)−δτ(T)⟩ involves averaging over (stochastic) values of T. This introduces statistical coupling between the time T and the value Σ(T).
The quantity Γτ appearing in (35) is an average of the functional e−δτ(T) evaluated at stopping times T for trajectories leading to absolute irreversibility, given in Eq. (36). To understand its meaning intuitively, first note that the second summation in Γτ runs over trajectories x[0,T] that belong to F_AI, that is, trajectories satisfying the stopping condition for the first time at T, but that have zero probability to occur in the original process, P(x[0,T]) = 0, due to the restricted shape of the initial distribution ρ0(x). We notice also the presence of the distribution ρ̃τ−T(x), which is due to δτ(T). That is, Γτ consists of the total probability of trajectories starting at the stopped point xT according to the distribution ρ̃τ−t(x), and not turning back to the set of states with ρ0(x) > 0 under the auxiliary dynamics. Recall also that the reference distribution r determining the precise meaning of Σ(T) appears in Eq. (36) only implicitly, through the definitions of P̃ and ρ̃t.
The inequality Γτ ≤ 1 is saturated when all trajectories are in the set F_AI = F, for which the sum in Eq. (36) runs over all trajectories, that is, Γτ = Σ_{t=0}^{τ} Σ_{x[0,t]∈F(t)} P̃(Θx[0,t]) ρ̃τ−t(xt)/ρt(xt) = 1. Moreover, we also have Γτ ≥ 0, since it is a sum of probabilities. Whenever the initial distribution ρ0(x) is not restricted in the state space, we obtain Γτ = 0, and recover the standard form of the fluctuation theorem at stopping times for non-stationary processes [16,19].
It is worth remarking here that our previous results for fixed times [Eqs. (24) and (25)] can be directly obtained from Eqs. (35) and (39) by letting T = τ, i.e., when all trajectories are stopped at the final time τ, as we also discuss below in more detail. Our results thus provide an extension of martingale theory to cover different versions of mismatch costs in physical scenarios with absolute irreversibility, where martingales can be transformed into supermartingales via the correction term ατ(t) in Eq. (32), and stopping-time fluctuation relations can be derived from them.
Moreover, using the fact that Mτ(t) is a supermartingale [cf. Eq. (33)], we can also readily apply Doob's optional sampling theorem [88] for supermartingales to obtain (see Appendix G) ⟨e−Σ(T2)−δτ(T2)⟩ ≤ ⟨e−Σ(T1)−δτ(T1)⟩ [Eq. (37)], where T1 and T2 are two stopping times, ordered such that P(T2 ≥ T1) = 1, but otherwise arbitrary. Taking T1 = T and T2 = τ, Eq. (37), together with the FT for stopping times [Eq. (35)] and fixed times [Eq. (34)], implies Γτ ≤ γτ [Eq. (38)], where we have used δτ(τ) = 0. This inequality shows that the absolute irreversibility term at stopping times, Γτ, is always smaller than its fixed-time counterpart γτ; that is, absolute irreversibility always implies greater dissipation at fixed times than at stopping times.
C. Second-law inequalities at stopping times: universal lower and upper bounds

If we apply Jensen's inequality ⟨e^x⟩ ≥ e^⟨x⟩ to the fluctuation theorem of Eq. (35), we derive a second-law inequality at stopping times, ⟨Σ(T)⟩ ≥ −⟨δτ(T)⟩ − ln[1 − Γτ] [Eq. (39)]. This sets a strict lower bound on the average dissipation incurred by a given computation up to an arbitrary stopping time T, in terms of its time-reversal-symmetry breaking (as quantified by ⟨δτ(T)⟩) and the absolute irreversibility (as quantified by Γτ). Moreover, Γτ ≥ 0 implies that −ln[1 − Γτ] ≥ 0. Therefore Eq. (39) also implies the simpler bound ⟨Σ(T)⟩ ≥ −⟨δτ(T)⟩ [Eq. (40)]. These inequalities suggest that ⟨Σ(T)⟩ might be negative whenever ⟨δτ(T)⟩ ≥ −ln[1 − Γτ] ≥ 0, as we discuss in detail further below. Any concave function [such as ln(x)] of a supermartingale yields another supermartingale by Jensen's inequality. Therefore the supermartingale property of Mτ(t) also implies that ln[Mτ(t)] = −Σ(t) − δτ(t) is a supermartingale. So Σ(t) + δτ(t) is a submartingale, i.e., it conditionally increases with time. If we now invoke Doob's optional sampling theorem for submartingales we get the inequality ⟨Σ(T1) + δτ(T1)⟩ ≤ ⟨Σ(T2) + δτ(T2)⟩ [Eq. (41)], where again T1 and T2 are two ordered stopping times with P(T2 ≥ T1) = 1. This inequality has several implications, the most immediate one being a second law for intervals between two ordered stopping times T1 and T2, ⟨∆Σ(T1, T2)⟩ ≥ ⟨δτ(T1)⟩ − ⟨δτ(T2)⟩ [Eq. (42)], where ⟨∆Σ(T1, T2)⟩ := ⟨Σ(T2)⟩ − ⟨Σ(T1)⟩. This inequality provides a result applicable to both stochastic stopping and starting times, bounding the entropy production incurred by computations that both start and end at stochastic times.
As an example, inequality (42) provides a bound concerning the stochastic interval between the first time that a DFA enters an accept state and the earliest subsequent time that it again enters an accept state, after having left the set of accept states in between. Then the time up to T1 can be interpreted as the time it took the DFA to accept a first sub-string of the full input word, and the time between T1 and T2 as the time it took the DFA to accept a second sub-string of the full input word, a sub-string which follows the first one. Again, the inequality in Eq. (42) suggests that ⟨∆Σ(T1, T2)⟩ might eventually become negative in such a case, whenever there is an increasing time-reversal asymmetry, i.e., for ⟨δτ(T2)⟩ > ⟨δτ(T1)⟩.
Moreover, for the choice T1 = T and T2 = τ, inequality (41) gives us the following upper bound for the intrinsic mismatch cost at stopping times: ⟨Σ(T)⟩ ≤ ⟨Σ(τ)⟩ − ⟨δτ(T)⟩ [Eq. (43)]. This implies that whenever ⟨δτ(T)⟩ ≥ 0, the intrinsic mismatch cost at stopping times is upper bounded by its fixed-time counterpart, suggesting a drop in the thermodynamic costs of the computation at stopping times. On the other hand, by taking T1 = 0 and T2 = T in Eq. (41), we obtain an alternative second law at stopping times, namely ⟨Σ(T)⟩ ≥ D(ρ0 ∥ ρ̃τ) − ⟨δτ(T)⟩ [Eq. (44)], to be compared with Eqs. (39) and (40). Here we have used that Σ(0) = 0 and ⟨δτ(0)⟩ = D(ρ0 ∥ ρ̃τ) [Eq. (45)]. This inequality provides an alternative lower bound on the intrinsic cost of the computation. We notice that, while we expect it to be less tight in general than Eq. (39), it has the advantage of relying on the KL divergence between the initial distribution ρ0 and the final distribution in the auxiliary dynamics, ρ̃τ, which we expect to be more easily computable than Γτ in Eq. (36). Remarkably, combining Eqs. (43) and (44) we find a sandwich inequality for the intrinsic mismatch cost at stochastic times, ⟨Σ(τ)⟩ − ⟨δτ(T)⟩ ≥ ⟨Σ(T)⟩ ≥ D(ρ0 ∥ ρ̃τ) − ⟨δτ(T)⟩ [Eq. (46)], which provides both upper and lower bounds on ⟨Σ(T)⟩.
The stopping-time fluctuation relation in Eq. (35) and the inequalities (39)-(44) for the intrinsic thermodynamic costs of computational processes with stochastic stopping times constitute our main results. In the following we further discuss their interpretation and some of their implications, while in Section V we investigate their application to CS setups with some illustrative examples.

D. Thermodynamic interpretation and implications
The second-law inequality (40), ⟨Σ(T)⟩ ≥ −⟨δτ(T)⟩ [as well as the stronger versions (39) and (44)], suggests that both the intrinsic mismatch cost and the underlying entropy production incurred in a given computation may be negative on average when evaluated at stopping times. To understand how this is possible in light of the data-processing inequality, we write ⟨Σ(T)⟩ explicitly as the functional (18) averaged over many trajectories, each stopped at a stochastic time T [Eq. (47)]. Here, p(T) denotes the probability that the stopping time takes value T. Similarly, ρt(x | T) denotes the conditional probability that the process takes the value x at time t given that the stopping condition is met at time T. Because ρt(x | T) differs from ρt(x) in general, the terms Σ_{xt} ρt(xt | T) ln[ρt(xt)/µ(xt)] and Σ_{xt+1} ρt+1(xt+1 | T) ln[ρt+1(xt+1)/µ′(xt+1)] are not KL divergences in general, and thus are not necessarily greater than or equal to zero (see also Ch. 8.3 in Ref. [19]). This implies that ⟨Σ(T)⟩ can in principle be negative. The second law at stopping times (40) permits ⟨Σ(T)⟩ ≤ 0 whenever ⟨δτ(T)⟩ ≥ 0, yet it is not clear when this would actually be the case.
The explicit expression for the stochastic distinguishability at stopping times is given in Eq. (48). Equation (48) also reveals that ⟨δτ(T)⟩ is not a KL divergence in general, and thus can in principle take any sign; yet so far only examples where ⟨δτ(T)⟩ ≥ 0 have been reported in the literature. We remark that ⟨δτ(T)⟩ is not a KL divergence unless T = τ, for which the process "stops" at the deterministic limit time τ, and the joint stopping-time probability distribution takes the form of Eq. (49), i.e., it takes the value, at time τ, of the solution of the master equation. Plugging Eq. (49) into Eq. (48) one gets ⟨δτ(T = τ)⟩ = D(ρτ ∥ ρ̃0) = 0, because ρ̃0 = ρτ. Analogously, for T = τ the intrinsic mismatch cost ⟨Σ(T = τ)⟩ takes the expression (22), thus retrieving non-negativity, ⟨Σ(τ)⟩ ≥ 0. Note that other examples of negative entropy production at stopping times, based on threshold criteria for work, were first reported in Ref. [16], and for free energy more recently [89]. Such a gambling-demon [16] effect is allowed whenever ⟨δ(T)⟩ > 0, which is not guaranteed for arbitrary stopping conditions but is possible for wise stopping strategies, as shown experimentally in Refs. [16,89].
We can obtain further insight into this effect by decomposing the intrinsic mismatch cost at fixed time τ into two terms, one associated with the interval [0, T] up to the stopping time T and one with the interval [T, τ] from the stopping time to the limit time τ, that is, ⟨Σ(τ)⟩ = ⟨Σ(T)⟩ + ⟨∆Σ(T, τ)⟩ [Eq. (50)], which follows from the fact that T is a single-valued function of the trajectory. Since ⟨Σ(τ)⟩ ≥ 0, the above decomposition implies that, whenever ⟨Σ(T)⟩ < 0, such a negative value must be compensated by an increased mismatch cost ⟨∆Σ(T, τ)⟩ ≥ ⟨Σ(τ)⟩ incurred in the interval [T, τ], if no external action is taken on the system at time T to physically stop the dynamics. These considerations remain valid in cases where the stopping condition is structurally imposed through the dynamical evolution of the computational process, e.g., using absorbing accept states to "stop" the computation, as is the case in some models of DFAs.
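The decomposition above can be checked by exhaustive enumeration on a toy chain (two states, r = π, stopping at the first visit to state 1; all illustrative assumptions, with Σ(t) in its r = π form ln[ρ0/π](x0) − ln[ρt/π](xt)):

```python
import numpy as np
from itertools import product

W = np.array([[0.7, 0.4],
              [0.3, 0.6]])                    # column-stochastic: W[j, i] = P(j | i)
tau = 4
evals, evecs = np.linalg.eig(W)
pi = np.real(evecs[:, np.argmax(np.real(evals))]); pi /= pi.sum()

rho = [np.array([1.0, 0.0])]                  # restricted initial condition
for _ in range(tau):
    rho.append(W @ rho[-1])

def kl(p, q):
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

def prob(x):                                  # path probability under the original chain
    p = rho[0][x[0]]
    for a, b in zip(x, x[1:]):
        p *= W[b, a]
    return p

def sigma(x, t):                              # Sigma(t) for r = pi
    return float(np.log(rho[0][x[0]] / pi[x[0]]) - np.log(rho[t][x[t]] / pi[x[t]]))

def T(x):                                     # first visit to state 1, else the limit time tau
    return next((t for t, s in enumerate(x) if s == 1), tau)

paths = [x for x in product(range(2), repeat=tau + 1) if prob(x) > 0]
avg_stop = sum(prob(x) * sigma(x, T(x)) for x in paths)   # <Sigma(T)>
avg_fix = sum(prob(x) * sigma(x, tau) for x in paths)     # <Sigma(tau)>, split by Eq. (50)
```

Here `avg_fix` reproduces the non-negative KL drop D(ρ0 ∥ π) − D(ρτ ∥ π) of the fixed-time cost, while `avg_fix - avg_stop` gives the compensating contribution ⟨∆Σ(T, τ)⟩ accumulated after the stopping time.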
The role of absolute irreversibility, as captured in the stronger inequality (39) with −ln[1 − Γτ] ≥ 0, makes the observation of negative average intrinsic mismatch cost more difficult, since it requires a higher time-reversal asymmetry in the dynamical evolution, leading to large distinguishabilities ⟨δ(T)⟩ > −ln[1 − Γτ] [and similarly for inequality (44)]. Remarkably, however, the examples explored in Sec. V show that dissipation can still be reduced at stopping times thanks to a positive time-reversal asymmetry ⟨δ(T)⟩ > 0, in agreement with Eq. (43) above. This reduction might be linked to the information needed to execute the stopping condition T, similarly to what happens in feedback-control scenarios [90,91]. However, a general relation between these two quantities remains unknown.
The second-law inequality at stopping times (40) can be further rewritten using Eq. (19) in a form reminiscent of Landauer's principle, ⟨∆Φ(T)⟩ ≥ −⟨∆S_sys(T)⟩ − ⟨δτ(T)⟩ [Eq. (51)], where the l.h.s. accounts for the excess entropy flow dissipated into the environment as a consequence of a drop in the Shannon entropy of the computational states, −⟨∆S_sys(T)⟩. Again, whenever ⟨δτ(T)⟩ > 0, the above inequality suggests that the entropy flow to the environment may eventually be reduced. Here it is also worth noticing that even in the case in which trajectories are stopped upon returning to the initial state (as in the DFA example in Section V), the average system entropy change at stopping times, ⟨∆S_sys(T)⟩ = ⟨S_sys(T)⟩ − S(ρ0), with ⟨S_sys(T)⟩ given in Eq. (52), is non-zero even when xT = x0 for all T, since in general the distribution ρT(x) ≠ ρ0(x), as corresponds to a relaxation process.
The second-law inequalities derived above can be applied not only to assess stochastic stopping times of a computation, but also stochastic starting times; see Eqs. (41) and (42). This extension allows us to apply our theory to computations that may "stop" at multiple consecutive times T1 < T2 < ... < Tn (see Sec. V for a particular example in a DFA), or to concatenations of simpler computations, each of which starts at a stochastic time, after the previous one is accomplished. We will further elaborate on the application of starting times to the computation of concatenated words with stochastic resetting in Sec. VII.

V. APPLICATION TO DETERMINISTIC FINITE AUTOMATA
In this section we analyze minimal yet insightful examples of computations executed by deterministic finite automata (DFA). A computational task for a DFA starts with the machine receiving a sequence of exogenously generated symbols, an input string or input word ω. As the DFA iteratively processes the symbols of the input string, it makes associated transitions among its possible states.
Here we first assume that the sequence of symbols fed to the DFA is produced in an independent, identically distributed (i.i.d.) manner, so that the time evolution over the DFA states while processing those strings can be modeled using a DTMC. Then we will move to the case of input symbols that are not produced in an i.i.d. manner, but by a Markovian source. In the following examples, we consider two minimal DFA models that process binary strings. In the first example, involving i.i.d. symbol sources, the DFA under consideration accepts strings which encode binary numbers divisible by four, e.g., 0 (zero), 100 (four), 1100 (twelve), etc. In the second example, involving non-i.i.d. sources, we use a DFA that accepts strings which encode binary numbers divisible by three. In all cases, we assume that the input string behaves as an information reservoir [74] whose symbols are not modified by the computation, hence not leading to further energy or entropy changes (see Sec. II D).
The state of the DFA when a stopping condition is reached (e.g., whether the DFA enters a designated accept state) defines a computation that the DFA performs on that string. However, this computation can be followed by further processing of input symbols up to a limit time τ (e.g., the DFA may exit the accept state in forthcoming iterations). In this sense our results for stopping times can be applied to various situations, for example: (i) computations generated by input words of fixed length τ, where we ask about the value of thermodynamic quantities when visiting the accept state for the first (or the n-th) time; and (ii) computations that may actually end when visiting the accept state for some reason (e.g., the accept state is an absorbing state of the DFA, or there exists an external mechanism, activated when the accept state is reached, that stops the dynamics). In particular, we can always modify a given DFA by removing all edges of the associated directed graph that leave an accept state. This turns the accept state into an absorbing state (or set of states, if there is more than one accept state).
A. Processing symbols from i.i.d. sources

As mentioned above, consider the DFA from Fig. 1, initialized to state q0 with certainty, whose computation starts by processing a stream of binary letters generated as an i.i.d. sequence of 0s and 1s, with p0 ≤ 1 the probability to observe a 0 and p1 = 1 − p0 the probability to observe a 1. Under this assumption, the time evolution of the DFA's states follows a DTMC over four computational states q0, q1, q2 and q3, with transition probabilities as indicated in Fig. 2(a). Altogether, the Markov chain associated with the DFA's dynamics is characterized by its initial state ρ0 (with † denoting matrix transposition) and the transition matrix W in Eq. (54). It follows that for t = 1 we obtain ρ1, whereas for larger times t ≥ 2 the distribution equals the stationary state π of Eq. (56), i.e., the dynamics already reaches the stationary state at the second iteration.
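The construction above can be sketched numerically. The following is a minimal sketch, assuming the standard divisible-by-four DFA update q′ = (2q + b) mod 4 with accept state q0 (the state labeling in Fig. 2 may differ); it checks that the chain started from q0 reaches the stationary distribution π = (p0², p0p1, p0p1, p1²) at the second iteration:

```python
import numpy as np

p0 = 0.9                          # probability of reading symbol 0
p1 = 1.0 - p0

# DFA for binary numbers divisible by four (hypothetical labeling): reading
# bit b updates the state, the value mod 4, as q' = (2q + b) mod 4.
W = np.zeros((4, 4))              # W[i, j] = P(next state j | current state i)
for q in range(4):
    W[q, (2 * q) % 4] += p0       # symbol 0
    W[q, (2 * q + 1) % 4] += p1   # symbol 1

rho = np.array([1.0, 0.0, 0.0, 0.0])   # start in q0 with certainty
rho1 = rho @ W
rho2 = rho1 @ W

# Stationary distribution pi, reached already at t = 2.
pi = np.array([p0**2, p0 * p1, p0 * p1, p1**2])
print(np.allclose(rho2, pi), np.allclose(pi @ W, pi))
```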
To compute the auxiliary dynamics for this DFA's DTMC, we would need to identify the reference distribution r(x) appearing in Eq. (15) with the prior µ(x) minimizing the mismatch-cost sum in Eq. (8), or with the ν(x) leading to its minimum in Eq. (13). For simplicity, here we assume r(x) = π(x), the stationary state of the DFA dynamics. This is a reasonable assumption as long as the induced DTMC is aperiodic and irreducible and π has full support over the computational states. As discussed before, since Σ becomes non-extensive in time in this case, there are reasons to expect minimal dissipation in the steady state (see also Refs. [92-94]).
The auxiliary dynamics starts in ρ̃0 = ρτ, which can take two possible values depending on the final maximum time of the computation τ: if τ = 1 we have ρ̃0 = ρ1 = (p0 p1 0 0)†, whereas for τ ≥ 2 we have ρ̃0 = π. Following Eq. (15) for the transition probability, with r = π the stationary distribution given in Eq. (56), we obtain the transition matrix associated with the auxiliary dynamics, Eq. (57), as illustrated in Fig. 2(b).

FIG. 2. (a) DTMC associated with the DFA of Fig. 1(c), with transition matrix given by Eq. (54), where p0 and p1 = 1 − p0 denote respectively the probability for a 0 and a 1 in the input string. (b) DTMC associated with the auxiliary dynamics for the stationary prior, with transition probability matrix obtained from Eq. (15) and given by Eq. (57).

On the other hand, since by construction the auxiliary dynamics of Eq. (15) always preserves the steady state for r = r′ = π, it follows that in the case τ ≥ 2 the auxiliary dynamics is stationary at all times t, that is, ρ̃t = π as given by Eq. (56). The intrinsic mismatch cost in Eq. (18), evaluated over a trajectory x[0,τ], reduces in this case to the (discrete-time) stochastic non-adiabatic EP, with xt ∈ X = {q0, q1, q2, q3} for all t, which only depends on the initial and final states. Having obtained the system probability distribution at all times for the original and auxiliary dynamics, we are now ready to compute thermodynamic quantities at stopping times. In particular, we consider the family of stopping times T = min(T1, τ), with τ fixing a time horizon and T1 ≥ 1 the first time the DFA returns to the accept state q0, hence accepting a word as a multiple of four (including "0"). From numerical simulations we obtained sample histograms for the stopping time T given by Eq. (59) for three different choices of the limit time τ, see Fig. 3. There we observe a first peak at T = 1 in all three plots, corresponding to the cases where the first incoming symbol is "0" and the word is then accepted. In order to allow longer accepted words we need τ > 2, such that T1 = 3 (accepting four, "100") or T1 = 4 (accepting twelve, "1100"), etc. Notice, however, that with the stopping condition given in Eq. (59) we do not capture the acceptance of some of the multiples of four, e.g. eight, "1000", since the stopping condition would already be verified at a previous prefix of the string, "100", corresponding to four. The same happens for any other accepted number to which an arbitrary number of zeros is appended at the end.
To assess the acceptance of such numbers, extra stopping conditions such as Tn, the n-th time the DFA returns to the accept state q0, are needed (see the example in Sec. V C). For all trajectories in which T = T1 < τ, i.e., the word is accepted before the limit time τ is reached, we have xT1 = q0, the accept state, which fixes Σ(T), and the stochastic distinguishability in Eq. (29) follows, where we used the fact that (by construction) τ > T1 ≥ 1 and hence ρ̃τ−T1 = π. If however T1 ≥ τ, the dynamics stops at the maximum time T = τ, independently of the state xτ. Note that the case τ ≥ 2 is independent of xτ because the system has already reached its stationary state, and thus Σ(τ) = − ln π(q0) = −2 ln p0 for all xτ ̸= q0. On the other hand, the stochastic distinguishability satisfies δτ(τ) = 0, since ρ̃0 = ρτ always.
Using the above calculations we obtain the average intrinsic mismatch cost at the stopping time (59) for all τ ≥ 2, which follows from decomposing the average over the two outcomes T = 1 and T > 1.

FIG. 3. Stopping-time statistics for the DFA recognizing binary numbers divisible by four, obtained from 10^4 Monte Carlo simulations of the DTMC sketched in Fig. 2, with initial state q0 and the accept state q0 made absorbing. Simulations are done by feeding the DFA with i.i.d. binary sequences with probability p0 = 0.9 of observing the letter 0.
Here we have used P(T1 = 1) = p0 and thus P(T1 > 1) = 1 − p0. In addition, using Eq. (61), we obtain the average stochastic distinguishability at the stopping time (59) for all τ ≥ 2, where again we have used P(T1 = 1) = p0. Notice that the above expressions also remain valid in the limit of large input word lengths, τ → ∞.
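These stopping-time averages are easy to check by Monte Carlo. A sketch, again assuming the hypothetical mod-4 update rule for the DFA of Fig. 2 and the stationary prior, comparing the simulated ⟨Σ(T)⟩ against the analytical value −2 ln p0 + p0 ln p0:

```python
import numpy as np

rng = np.random.default_rng(1)
p0, tau, n_traj = 0.9, 5, 100_000

def stopping_time(rng, p0, tau):
    """First return time to the accept state q0, capped at the limit time tau."""
    q = 0
    for t in range(1, tau + 1):
        b = int(rng.random() >= p0)       # 0 with prob p0, 1 with prob 1 - p0
        q = (2 * q + b) % 4               # hypothetical mod-4 DFA update
        if q == 0:
            return t
    return tau

# With the stationary prior, Sigma depends only on the initial and final
# states: -ln p0 if T = 1 (word "0" accepted at once), -2 ln p0 otherwise,
# since the chain is already stationary (rho_t = pi) for t >= 2.
T = np.array([stopping_time(rng, p0, tau) for _ in range(n_traj)])
sigma_mc = np.where(T == 1, -np.log(p0), -2 * np.log(p0)).mean()

sigma_exact = -2 * np.log(p0) + p0 * np.log(p0)
print(abs(sigma_mc - sigma_exact) < 5e-3)
```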
To tackle the contribution from absolute irreversibility at stopping times, it is convenient to first identify which trajectories contribute to Γτ in Eq. (36). These are trajectories that are stopped at T ≤ τ (whether or not they reach q0) and have zero probability of occurring in the original dynamics. Recall that the original dynamics is a Markov chain with initial state ρ0(x) = δx,q0. The set of absolutely irreversible trajectories at stopping times thus comprises two classes: (i) trajectories that do not start in q0 and reach q0 with T ≤ τ in the original dynamics, and (ii) trajectories of length τ that do not start at q0 and do not reach q0 in the original dynamics.
Let us now flesh out the list of such trajectories x[0,T], classified by the value of T, for the special case τ = 2:

• q2 q0 reaches the accept state at T = 1, yet it has zero probability of occurring in the original dynamics since ρ0(q2) = 0.

FIG. 4. Averages at the stopping time T = min(T1, τ) for the limit time τ = 2; the computation stops earlier if the accept state is reached in one iteration. Symbols represent analytical results for the averages at the stopping time of the relevant thermodynamic quantities: intrinsic mismatch cost (non-adiabatic entropy production) ⟨Σ(T)⟩ given by Eq. (63) for a prior equal to the stationary probability (blue filled squares); stochastic distinguishability ⟨δ2(T)⟩ given by Eq. (65) (blue dotted line); fixed-time non-adiabatic entropy production ⟨Σ(τ)⟩ evaluated over trajectories of the same length τ = 2 (open symbols); and the absolute irreversibility contribution − ln[1 − Γ2] given by Eq. (36) (blue dashed line). The blue solid line is the sum −⟨δ2(T)⟩ − ln[1 − Γ2], which in this example equals ⟨Σ(T)⟩, thus saturating the second law (39). The horizontal black thick line is set to zero as a reference value.
• q1 q2 q0 and q3 q2 q0 reach the accept state at T = 2, yet they have zero probability of occurring in the original dynamics since ρ0(q1) = ρ0(q3) = 0.
• q1 q2 q1, q1 q3 q2, q1 q3 q3, q2 q1 q2, q2 q1 q3, q3 q2 q1, q3 q3 q2, and q3 q3 q3 are stopped at T = 2 without reaching the accept state. They have zero probability in the original dynamics because their initial state differs from q0.

All the sequences listed above would halt the computation at the stopping time T = min(T1, 2); they have non-zero probability in the auxiliary dynamics but zero probability in the original dynamics. In order to calculate the absolute irreversibility correction term Γτ in Eq. (36), we thus need the probability of the above trajectories to occur in time-reversed order in the auxiliary dynamics. More precisely, one needs to compute P(Θx[0,T]) multiplied by ρ̃τ−T(xT)/ρ̃T(xT) = π(xT)/ρ̃T(xT), which in this case amounts to modifying their initial condition to π(xT), i.e., to computing the following path probabilities:

P(q0, q2 | q0) π(q0) = p0² p1
P(q0, q2, q1 | q0) π(q0) = p0² p1 p0
P(q0, q2, q3 | q0) π(q0) = p0² p1 p1
P(q1, q2, q1 | q1) π(q1) = p0 p1 p1 p0
P(q1, q2, q3 | q1) π(q1) = p0 p1 p1 p1
P(q2, q1, q2 | q2) π(q2) = p0 p1 p0 p1
P(q2, q3, q3 | q2) π(q2) = p0 p1 p1 p1
P(q2, q3, q1 | q2) π(q2) = p0 p1 p1 p0
P(q3, q3, q3 | q3) π(q3) = p1² p1 p1
P(q3, q3, q1 | q3) π(q3) = p1² p1 p0
P(q3, q1, q2 | q3) π(q3) = p1² p0 p1

Summing up all the contributions in Eq. (66) leads us to the absolute irreversibility contribution Γ2 [cf. Eq. (36) for the general formula]. Combining all the terms above, we observe that for the stopping time T = min(T1, 2) the second law at stopping times given by Eq. (39) is saturated over the stopping time given by Eq. (59) for τ = 2, as illustrated in Fig.
4 for different values of the probability of incoming zeros, p0. As can be appreciated in that figure, the positive sign of the term ⟨δ2(T)⟩ > 0 implies that the intrinsic mismatch cost at stopping times, ⟨Σ(T)⟩ = −2 ln p0 + p0 ln p0 [see Eq. (65)], is smaller than its value at fixed times, in spite of the presence of the absolute irreversibility contribution Γ2. For τ > 2 we have Γτ ≤ Γ2, which follows by combining the equality in Eq. (68) with the generic bound in Eq. (39). In any case, the inequality ⟨Σ(τ)⟩ ≥ ⟨Σ(T)⟩ holds for any limit time τ in this example. When p0 approaches 1 (words with a high number of zeros), the dynamics cannot escape from the initial state q0 and the steady state π becomes equal to the initial distribution ρ0. In this limit the DTMC dynamics becomes fully stationary: the intrinsic mismatch cost becomes zero for every trajectory, the time-reversal asymmetry is lost, and the absolute irreversibility is no longer present, leading to a drop of the three quantities on the right-hand side of Fig. 4. As we move away from that limit, the mismatch cost increases (both at stopping and at fixed times), signaling the energetic costs incurred by the computational task, which grow as p0 decreases. This can be justified by the fact that the dynamics on the DTMC spreads more easily over all computational states as p1 increases (see Fig. 4a), leading to a greater distinction between the initial and steady-state distributions. In this case we also observe non-zero stochastic distinguishability and an increasingly large absolute irreversibility term. In the limit p0 → 0 (words with a high number of ones), accepting a word becomes almost impossible, and hence the stopping occurs most probably at the maximum time T ≃ τ, leading again to zero stochastic distinguishability. We notice that in this limit π tends to localize at state q3, which would lead to ⟨Σ(τ)⟩ → ∞, a result that is not physically meaningful. The catch is that in this limit the fixed point π would no longer equal the prior µ or ν.

FIG. 5. Average intrinsic mismatch cost ⟨Σ(T)⟩ at the stopping time T = min(T1, τ) with uniform prior, evaluated for the DFA in Fig. 2a processing i.i.d. binary input data, as a function of the probability of input symbol 0. We used different limit times: τ = 5 (blue filled squares) and τ = 14 (red filled circles). The solid lines correspond to the lower bound predicted by Eq. (44) and given by D(ρ0 || ρ̃τ) − ⟨δτ(T)⟩ for τ = 5 (blue solid line) and τ = 14 (red solid line), while the dashed lines are the corresponding upper bounds in Eq. (43), ⟨Σ(τ)⟩ − ⟨δτ(T)⟩, for the same values of τ. Averages are estimated from 10^4 numerical simulations for each parameter value. The thick gray line is the average cost at stopping times for the stationary prior π, see Eq. (63) and Fig. 4, and the vertical dashed line is set to p0 = 1/2 as a reference value. Inset: ⟨Σ(τ)⟩ − ⟨Σ(T)⟩ (solid line) and ⟨δτ(T)⟩ (dashed line) as a function of p0 for τ = 14. The horizontal dotted line is set to zero as a reference value.
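The enumeration of absolutely irreversible trajectories and the saturation of the stopping-time second law for τ = 2 can also be verified programmatically. The sketch below assumes the mod-4 update rule and constructs the auxiliary dynamics as the dual chain W̃(y|x) = π(y)W(x|y)/π(x); both assumptions are consistent with the path probabilities listed above but are not taken verbatim from Eqs. (15) and (54):

```python
import numpy as np
from itertools import product

p0 = 0.9
p1 = 1 - p0

# Original chain W (assumed mod-4 labeling) and its stationary state pi.
W = np.zeros((4, 4))
for q in range(4):
    W[q, (2 * q) % 4] += p0
    W[q, (2 * q + 1) % 4] += p1
pi = np.array([p0**2, p0 * p1, p0 * p1, p1**2])

# Auxiliary dynamics taken as the dual chain for the stationary prior:
# W_aux[x, y] = pi[y] * W[y, x] / pi[x], which also leaves pi invariant.
W_aux = pi[None, :] * W.T / pi[:, None]

def path_prob(chain, init, path):
    """Probability of a state path under `chain` with initial law `init`."""
    p = init[path[0]]
    for a, b in zip(path, path[1:]):
        p *= chain[a, b]
    return p

# Gamma_2: trajectories stopped at T = min(T1, 2) that are impossible in the
# original dynamics (rho_0 = delta_{q0}) but whose time reversal has nonzero
# probability in the auxiliary dynamics started from pi(x_T).
rho0 = np.array([1.0, 0.0, 0.0, 0.0])
gamma2 = 0.0
for length in (2, 3):
    for path in product(range(4), repeat=length):
        stopped_at_1 = length == 2 and path[1] == 0      # accepted at T = 1
        stopped_at_2 = length == 3 and path[1] != 0      # stopped at T = 2
        if (stopped_at_1 or stopped_at_2) and path_prob(W, rho0, path) == 0.0:
            gamma2 += path_prob(W_aux, pi, path[::-1])

# Saturation of the stopping-time second law for tau = 2:
# <Sigma(T)> = -<delta_2(T)> - ln(1 - Gamma_2).
sigma_avg = -2 * np.log(p0) + p0 * np.log(p0)
delta_avg = -p0 * np.log(p0)
print(round(gamma2, 6), np.isclose(sigma_avg, -delta_avg - np.log(1 - gamma2)))
```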

B. Uniform prior
We now implement the analysis of Sec. V A in a different setting: the strings are still generated i.i.d., and we consider the same four-state DFA, but now with a uniform prior distribution over its states. The evolution of r under W after one iteration yields a distribution different from r. Because r changes after one iteration, we write Σ as in Eq. (19) for τ > 0, where the first term is the system entropy change ∆S_sys(x[0,τ]) and the second one the nonequilibrium potential ∆ϕ(x[0,τ]) in Eq. (21). Unlike for the stationary prior, this term is now extensive with time [cf. Eq. (58)]. Note that in this case Σ is no longer equal to the non-adiabatic EP associated with the stochastic trajectory x[0,τ]. Since the uniform distribution is not invariant under the map W, the intrinsic mismatch cost Σ associated with a stochastic trajectory x[0,τ] is extensive with time. This implies that, unlike for the case of the stationary prior (see Sec. V A), the averages of Σ at fixed times τ, as well as at stopping times with limit time τ [of the form of Eq. (59)], will crucially depend on τ. This is also the case for any other choice of prior distribution that differs from the stationary distribution.
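The extensivity of Σ for a non-stationary prior can be illustrated by evaluating the per-iteration mismatch-cost sum. This is a sketch assuming the standard relative-entropy form of the average mismatch cost re-applied at every period (cf. the structure of Eq. (8)) and the hypothetical mod-4 update rule:

```python
import numpy as np

p0 = 0.9
p1 = 1 - p0
W = np.zeros((4, 4))
for q in range(4):
    W[q, (2 * q) % 4] += p0
    W[q, (2 * q + 1) % 4] += p1

def kl(a, b):
    """Kullback-Leibler divergence D(a || b), ignoring zero-probability states."""
    mask = a > 0
    return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

def avg_mismatch(prior, tau):
    """Per-iteration mismatch-cost sum: the same prior is re-applied at
    every period of the repeated map W (an assumed sketch of Eq. (8))."""
    rho = np.array([1.0, 0.0, 0.0, 0.0])
    total = 0.0
    for _ in range(tau):
        total += kl(rho, prior) - kl(rho @ W, prior @ W)
        rho = rho @ W
    return total

pi = np.array([p0**2, p0 * p1, p0 * p1, p1**2])
uniform = np.full(4, 0.25)

# Stationary prior: the sum telescopes and saturates at D(rho_0 || pi) for
# tau >= 2; uniform prior: every iteration adds a constant, linear growth.
print([round(avg_mismatch(pi, t), 4) for t in (2, 5, 10)])
print([round(avg_mismatch(uniform, t), 4) for t in (2, 5, 10)])
```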
In Fig. 5 we show the intrinsic mismatch cost ⟨Σ(T)⟩ at the stopping time T = min(T1, τ) for the DFA with the uniform prior, for two different values of τ, and compare it with the case of the stationary prior, Eq. (63). We observe that the uniform prior leads to higher values of the intrinsic mismatch cost for high values of p0, while for low p0 the tendency can be inverted. However, when increasing τ sufficiently we always obtain a lower cost for the stationary prior, as expected from its non-extensivity. Indeed, we observe a tendency for the mismatch cost at stopping times ⟨Σ(T)⟩ to saturate when increasing the limit time τ, in contrast with the linear scaling of ⟨Σ(τ)⟩ with τ. In Appendix H we confirm this point by studying in more detail the scaling behavior of these two quantities as a function of τ.
We test the sandwich inequality in Eq. (46), comprising the upper and lower bounds to ⟨Σ(T)⟩ in Eqs. (43) and (44), respectively. As can be appreciated in Fig. 5, both inequalities provide useful bounds that become tighter for small τ and are simultaneously saturated at the point p0 = 1/2. This example also reveals that there is again a reduction of intrinsic costs at stopping times with respect to fixed times; that is, ⟨Σ(τ)⟩ ≥ ⟨Σ(T)⟩ holds over the entire range of the probability of symbol 0, p0, and of the limit time τ, as shown in the inset of Fig. 5. This reduction is guaranteed by a positive value of the stochastic distinguishability ⟨δ(T)⟩ > 0 in the range p0 ≥ 1/2 [cf. Eq. (43)] but, interestingly, it also holds for ⟨δ(T)⟩ < 0, as happens for p0 ≤ 1/2.

C. Beyond i.i.d. sources

So far we have analyzed the statistics of a DFA processing inputs generated by a source of i.i.d. bits, which induces a Markovian dynamics for the time evolution of the computational states. This is, however, one of the simplest possible computational processes: for example, the regular languages recognized by DFAs are often composed of correlated words. To illustrate the applicability of our theory to computing the thermodynamic costs of DFAs processing arbitrary strings from arbitrary languages, it is essential to consider DFAs processing non-i.i.d. sequences.
In processing a generic non-i.i.d. sequence, the dynamics over the computational states of a DFA is in general a non-Markovian process. However, one can extend the computational state space so that our formalism can be applied. For the analysis in this section it is important to stress the distinction between the "computational states" of the DTMC computational state space X and the states of the DFA. In particular, we refer to the states of the DFA (as in the usual TCS definition) as logical states of the DFA, and recall that "computational states" refers to the set of variables describing the entire state space of the computational process of interest, as introduced in Sec. II. Now consider that the process generating the input string is itself a DTMC, characterized by time-independent transition probabilities p(b_{i+1}|b_i) for the (i+1)-th bit to equal b_{i+1} = {0, 1} given that the i-th symbol (bit) of the string is b_i = {0, 1}. In this case the logical state of, e.g., a three-state DFA, z_t = {q0, q1, q2}, processing this input string is not a DTMC. However, by constructing the computational state space as the Cartesian product of z_t = {q0, q1, q2} and b_t = {0, 1}, we encode the current computational state x_t = {z_t, b_t} as the logical state of the DFA, z_t, together with the most recent input symbol fed to the DFA, b_t. One is then left with a DTMC over six possible computational states, to which our formalism can be readily applied to tackle the thermodynamic properties.
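The Cartesian-product construction can be sketched as follows, with an assumed conditional bit distribution p(b′|b) and the standard mod-3 DFA update z′ = (2z + b) mod 3 for binary multiples of three (the labeling of the paper's three-state DFA may differ):

```python
import numpy as np

# Markovian (non-i.i.d.) bit source: p[b, b'] = p(b' | b), values assumed.
p = np.array([[0.7, 0.3],      # after a 0: 0 w.p. 0.7, 1 w.p. 0.3
              [0.4, 0.6]])     # after a 1: 0 w.p. 0.4, 1 w.p. 0.6

def delta(z, b):
    """Hypothetical mod-3 DFA update for binary multiples of three."""
    return (2 * z + b) % 3

# Extended computational state x = (z, b): logical state plus the most
# recent input bit. The pair evolves as a DTMC over 6 states, with
# P[(z', b') | (z, b)] = p(b' | b) * 1[z' = delta(z, b')].
states = [(z, b) for z in range(3) for b in range(2)]
idx = {s: i for i, s in enumerate(states)}
Wx = np.zeros((6, 6))
for (z, b) in states:
    for b_next in (0, 1):
        Wx[idx[(z, b)], idx[(delta(z, b_next), b_next)]] += p[b, b_next]

# Stationary distribution of the extended chain by power iteration.
rho = np.full(6, 1 / 6)
for _ in range(200):
    rho = rho @ Wx
print(np.allclose(Wx.sum(axis=1), 1.0), np.allclose(rho @ Wx, rho))
```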
Using the above definitions we can compute Σ(τ) in Eq. (18) at arbitrary fixed times. Moreover, in order to evaluate thermodynamic quantities at stopping times, we again employ the family of stopping times T = min(T1, τ), with τ the fixed time horizon and T1 ≥ 1 the first time the DFA returns to the accept state q0, for either b = {0, 1}.
We show numerical results in Fig. 7, where ⟨Σ(T)⟩, together with the corresponding upper and lower bounds given by Eqs. (46), is plotted as a function of the probability p01 = 1 − p11 of obtaining symbol 0 after a symbol 1, for different values of p00 = 1 − p10. Again we obtain relevant bounds on the intrinsic mismatch cost at stopping times which, interestingly, become tightest when p00 = 1 − p01, i.e., when p00 = p11 and p10 = p01. This corresponds to the situation in which the input sequence is a Markovian process with homogeneous stationary probabilities, p0^st = p1^st = 1/2. The fact that our bounds become tight for homogeneous input sequences was also observed in the i.i.d. example (see Fig. 5) and leads us to conjecture that this phenomenon may be generic for correlated input sequences, possibly also non-Markovian ones.
As noted for the previous examples, however, a stopping time of the form T = min(T1, τ) only allows us to describe computation times for the DFA to reach the accept state for the first time. That corresponds to the acceptance of only some of the multiples of three, e.g. "0" (zero), "11" (three), "1001" (nine), but not of other multiples like "110" (six) or any other word that already contains an acceptable prefix. In order to explore the thermodynamic costs associated with these words, we now consider more general stopping times T = min(Tn, τ), where Tn is the n-th time the DFA returns to the accept state. Thus T2 is related to the acceptance of words like "110" (six) or "10010" (eighteen), while T3 corresponds to accepting words like "1100" (twelve), among many others.
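The generalized stopping times T = min(Tn, τ) can be sketched by simulating the n-th return to the accept state of a mod-3 DFA driven by a Markovian bit source (the transition probabilities and the convention that the first bit conditions on a fictitious previous 0 are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(7)
p = np.array([[0.7, 0.3],      # hypothetical p(b' | b): row b, column b'
              [0.4, 0.6]])

def nth_return_time(rng, n, tau):
    """T = min(T_n, tau): n-th visit to the accept state z = 0 of a mod-3
    DFA driven by a Markovian bit source, capped at the limit time tau."""
    z, b = 0, 0                            # start state; first bit conditions on b = 0
    visits = 0
    for t in range(1, tau + 1):
        b = int(rng.random() >= p[b, 0])   # draw the next bit from p(. | b)
        z = (2 * z + b) % 3
        if z == 0:
            visits += 1
            if visits == n:
                return t
    return tau

tau = 50
means = {n: float(np.mean([nth_return_time(rng, n, tau) for _ in range(5000)]))
         for n in (1, 2, 3)}
print(means)
```

As expected, the average stopping time grows with the return index n, since T_n ≥ T_{n−1} holds pathwise.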
In Fig. 8 we plot ⟨Σ(T)⟩ with T = min{Tn, τ} as a function of the return index to the accept state, n = 1, 2, 3, 4, 5. We notice that different behaviors are obtained depending on the choice of input-symbol probabilities p00 and p01, leading either to increasing values of the intrinsic mismatch cost or to a non-monotonic behavior. Interestingly, considering different stopping times allows us to test inequality (42) for two stopping times, which is shown in the inset of Fig. 8 for T1 = min(Tn−1, τ) and T2 = min(Tn, τ) as a function of n > 1. In particular, we observe that the mismatch cost between consecutive return times to the accept state can eventually become negative for specific choices of the probabilities p00 and p01, that is, ⟨∆Σ(Tn, Tn−1)⟩ < 0 for n = 3, 4, owing to a reduction in the associated stochastic distinguishability and despite having ⟨Σ(τ)⟩ ≥ 0 at fixed times.

VI. UNIVERSAL EQUALITIES AND INEQUALITIES FOR ACCEPTANCE PROBABILITIES
Our formalism can be further applied to address other issues in computer science theory, beyond the automata literature and besides second laws and fluctuation theorems at stopping times. Both in this Sec. VI and in Sec. VII we develop further theoretical predictions for key statistical properties of interest to computer science, which may inspire numerical and experimental illustrations in future work. An example that we develop in this section is the use of our formalism to establish universal equalities and inequalities concerning the probabilities of acceptance or rejection of sets of distinct bit sequences when a given DFA is implemented. In what follows we focus on a specific choice of such sets, namely: 1) the set of all strings or trajectories that end in an accept state before the limit time τ, vs. 2) the set of all trajectories that do not end in an accept state before the limit time τ. However, we emphasize that this formalism can be generalized to arbitrary pairs of sets of trajectories by specifying suitable filtrations, as done in martingale approaches.
Thus we will explore a class of simple examples of "acceptance" statistics for binary words of length τ ≥ 2 that are processed by a computer. We use the notation accept to signify that a computer reaches a prescribed accept state before the limit time τ, and reject otherwise. The probability P_a(τ) denotes the probability for the computer to have reached the accept state within [0, τ], and P_r(τ) = 1 − P_a(τ) the complementary probability. Recall that for simple computer architectures (e.g., DFAs processing i.i.d. binary strings), P_a(τ) and P_r(τ) can often be evaluated analytically or with Monte Carlo simulations. The approach we present below is complementary to Monte Carlo approaches for such simple computations; however, we highlight its usefulness in revealing how such accept/reject statistics are related to thermodynamic quantities. Note that here those thermodynamic quantities can be used as a calculational tool, determined completely by the computer update function and the distribution over input words. In particular, they need not correspond to any "real" thermodynamic quantities that one would measure in the laboratory. That is, our formalism provides a way to derive the relative probabilities of accepting or rejecting a string while sidestepping the technical difficulties found in traditional approaches to this issue [95,96]. On the other hand, one can also interpret the results presented below as a way of obtaining information about the intrinsic thermodynamic costs of computations by looking at the acceptance probabilities (of languages solved by machines), which might be calculated by other means, such as Monte Carlo approaches.
So we consider again a stopping time T = min(T1, τ), where T1 is the first time that the computer reaches the accept state and τ the limit that applies when the accept state is not visited earlier. Thus T < τ if a word of length τ − 1 is accepted by the computer, and T = τ otherwise. The probabilities that a word of length τ − 1 is accepted or not are then given by

P_a(τ) = P(T < τ),  (77)
P_r(τ) = 1 − P(T < τ) = P(T = τ),  (78)

respectively. We now make use of our formalism to derive bounds for P_a(τ) and P_r(τ) in terms of thermodynamic quantities.
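For a concrete DFA, P_a(τ) = P(T < τ) can be obtained exactly by making the accept state absorbing and tracking the absorbed probability mass. The sketch below does so for the hypothetical mod-4 chain of Sec. V and cross-checks against a Monte Carlo estimate:

```python
import numpy as np

p0 = 0.9
p1 = 1 - p0
W = np.zeros((4, 4))
for q in range(4):
    W[q, (2 * q) % 4] += p0
    W[q, (2 * q + 1) % 4] += p1

# Exact P_a(tau) = P(T < tau): make the accept state q0 absorbing after the
# first step and accumulate the probability mass absorbed before time tau.
W_abs = W.copy()
W_abs[0] = np.eye(4)[0]

def p_accept(tau):
    rho = np.array([1.0, 0.0, 0.0, 0.0]) @ W   # t = 1 with the original rule
    for _ in range(tau - 2):                   # absorb up to t = tau - 1
        rho = rho @ W_abs
    return float(rho[0])

# Monte Carlo cross-check of the same quantity.
rng = np.random.default_rng(3)
def accepted(tau):
    q = 0
    for t in range(1, tau):                    # first return by t = tau - 1?
        q = (2 * q + int(rng.random() >= p0)) % 4
        if q == 0:
            return True
    return False

tau = 6
pa = p_accept(tau)
mc = float(np.mean([accepted(tau) for _ in range(50_000)]))
print(round(pa, 4), round(mc, 4))
```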
Using our fluctuation theorem at stopping times with absolute irreversibility, Eq. (35), ⟨Mτ(T)⟩ = 1 − Γτ, we expand its l.h.s. into terms corresponding to accepted and rejected words as

⟨Mτ(T)⟩ = P_a(τ)⟨Mτ(T)|T < τ⟩ + P_r(τ)⟨Mτ(T)|T = τ⟩,  (79)

with ⟨A(T)|c(T)⟩ = E(A(T) | c(T)) the conditional average of the functional A over trajectories x[0,T] given that the condition c(T) is fulfilled at the stopping time T. Upon using P_r(τ) = 1 − P_a(τ), the decomposition (79) gives us the relation (80) between the acceptance probability and the averages of the supermartingale Mτ(T) at stopping times. Equality (80) generalizes analytical expressions obtained in previous works for absorption probabilities [13,19,97] by including the absolute irreversibility contribution Γτ.
As can be appreciated in Eq. (80), since Γτ ≥ 0, the role of absolute irreversibility is to decrease the acceptance probability P_a(τ) of a word of length τ − 1 by the DFA. This can be intuitively understood from the fact that starting the computation from a restricted set of initial states can only decrease the velocity at which the computational state space is explored, and hence the probability of reaching a generic stopping condition before time τ. Since P_a(τ) is a well-defined probability (i.e., 0 ≤ P_a(τ) ≤ 1), we further obtain from Eq. (80) that one of the two chain inequalities in (82) holds, which provides constraints on the values of Mτ(T) for generic T of the form T = min(T1, τ).
Analogously, we can exploit the second-law inequality at stopping times (44), namely ⟨Σ(T)⟩ ≥ −⟨δτ(T)⟩ + D(ρ0 || ρ̃τ), to derive universal bounds for the finite-time acceptance probability. Indeed, the average of the left-hand side of this inequality at the stopping time T = min(T1, τ) can be decomposed into two terms accounting, respectively, for accepted and rejected words of maximum length τ − 1:

⟨Σ(T) + δτ(T)⟩ = P_a(τ) C_a(τ) + P_r(τ) C_r(τ),  (83)

where we have introduced the conditional averages C_a(τ) and C_r(τ) in Eqs. (84) and (85). Note that in Eq. (85) we have used the fact that δτ(τ) = 0. We refer to these two conditional averages as the average thermodynamic costs associated with the acceptance and rejection of words of length τ − 1, respectively.
Combining Eq. (44) and Eq. (83) we obtain the lower bound (86) for the acceptance probability, valid whenever C_a(τ) > C_r(τ), and similarly the upper bound (87), valid in the complementary case C_r(τ) > C_a(τ). These bounds express a constraint on the acceptance probability of a word of maximum length τ − 1 in terms of the average costs associated with the accepted and rejected words, as defined in Eqs. (84) and (85), and the KL divergence between the initial distribution of the computational state and its final distribution under the auxiliary dynamics [see Eq. (45)]. Equation (86) provides a meaningful bound whenever its r.h.s. is non-negative and smaller than one, i.e., when C_a(τ) ≥ D(ρ0 || ρ̃τ) ≥ C_r(τ). On the other hand, the bound (87) is meaningful when C_a(τ) ≤ D(ρ0 || ρ̃τ) ≤ C_r(τ). We expect the first condition to be satisfied when the probability of accepted words is large enough, so that the associated cost C_a(τ) is larger than the cost of rejected words C_r(τ). Hence we expect the bound (86) to be helpful for parameter values of the DFA and of the distribution over input words for which the acceptance rate is high. On the contrary, when the probability of rejected words is large enough, we expect C_r(τ) to be larger than C_a(τ), and the bound (87) to be useful when the acceptance rate is low.
The above relations, Eq. (80) and Eqs. (86)-(87), concerning the acceptance probability of a word can also be applied to any finite-horizon stopping time of the form T = min(Tc, τ), where Tc represents the time at which a given arbitrary condition c is verified for the first time, e.g., the first time the accept state is reached twice, or the first time the accept state is reached after passing through any other arbitrary state (or sequence of states). There is thus ample flexibility in choosing the stopping condition, including the logical composition of any set of conditions, e.g., c = c1 ∪ c2 for the first time either condition c1 or condition c2 is verified, or c = c1 ∩ c2 for the first time both c1 and c2 are simultaneously verified.

VII. CONCATENATING RUNS OF A DFA WITH STOCHASTIC RESETTING
In this section we further elaborate on how our results can be applied to sequences of computations separated by a reset of the dynamics, which implements concatenated computational rounds. This is an interesting avenue where our results might be fruitfully combined in the future with the powerful analytical tools of the stochastic resetting framework [35,98-102]. Let us consider a random sequence of symbols fed into a computer, where ⊔ is a blank symbol that flags the beginning of a new computation. For the example sequence (88), a computation starts at the random starting time T_start = 5, just after the blank symbol arriving at t = 4, and ends at the stochastic ending time T_end = 10, just before the next blank symbol arrives, thus generating the input word "010111". During this computation, the computer computes from its start state and ends either in an accept state or in another logical state. Now, stochastic starting times can be reformulated as stochastic stopping times [see also our results concerning multiple stopping times, Eq. (41)]. In particular, here the starting time T_start is determined by the first appearance of a blank symbol ⊔. Whenever the probability p⊔ > 0 of a blank symbol is greater than zero, it is guaranteed that P(T_end < ∞) = 1, so that a finite limit time τ acting as a global upper bound on T_end can be enforced. This is the setting that corresponds to, e.g., stochastic starting times drawn from distributions with bounded support, say Bernoulli or binomial distributions. Under such mild assumptions, it is then possible to establish thermodynamic constraints for computations starting at stochastic times.
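The segmentation of a symbol stream into words by blank symbols can be sketched as follows, with assumed symbol probabilities; the arrival times of the blanks play the role of the stochastic starting times discussed above:

```python
import numpy as np

rng = np.random.default_rng(0)
p0, p1, pb = 0.55, 0.35, 0.10   # hypothetical probabilities of 0, 1 and blank

# Draw a symbol stream and locate the words delimited by blank symbols
# (written "_" here): the arrival time of each blank is a stochastic
# starting time, and the symbol before the next blank ends the word.
stream = rng.choice(["0", "1", "_"], size=400, p=[p0, p1, pb])
blanks = [t for t, s in enumerate(stream) if s == "_"]

# (start, end) index pairs of the non-empty words between consecutive blanks.
words = [(a + 1, b - 1) for a, b in zip(blanks, blanks[1:]) if b - a > 1]
print(len(blanks), words[:3])
```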
Supposing p⊔ > 0, we outline how the stopping-time fluctuation relations derived in our work can be applied to a computation on the example sequence (88). First, we let the computer process the sequence "000", which implies visiting the accept state at least once. At time t = 3, the computer may or may not be in the start state, depending on its update rules. Next, at t = 4, the state of the computer is reset to its start state from whichever state x3 it occupied at the previous time instance. Upon this, the computer processes the string "010111" before the arrival of the next blank symbol, during which the logical state may or may not reach the accept state. This leaves us with the ordered sequence of stopping times in Eq. (89). Here we have denoted by T^(i)_accept the first return time to the accept state during the computation of the i-th word. Similarly, T^(i)_blank is the stochastic arrival time of the i-th blank symbol. While the stochastic times T^(i)_accept have the same structure as the stopping times considered throughout our work, the times T^(i+1)_blank can be seen as stochastic starting times, which are also examples of stopping times to which our formalism applies.
FIG. 9. DFA of Fig. 1(c) with stochastic resetting to the start state. Along a stochastic computation, resetting takes place when a blank symbol is recognized by the DFA. For the model illustrated here, we have assumed that words are drawn from i.i.d. sequences with probabilities p0, p1 and p⊔ of 0, 1 and blank symbols respectively, with p0 + p1 + p⊔ = 1; more complex scenarios could be envisaged in future work.

Figure 9 provides an illustration of a DFA processing an i.i.d. sequence of bits interspersed with blank symbols. The DFA processing the symbols recognizes binary words that are multiples of four, as in the examples of Sec. V. Assuming
time-independent probabilities p0, p1 and p⊔ for the occurrence of 0, 1 and blank ⊔ symbols, respectively (with p0 + p1 + p⊔ = 1), the DTMC associated with this computation can be represented by a discrete-time stochastic resetting process (see Fig. 9). In such processes, resetting takes place from each logical state to the start state q0 at a stochastic starting time. This requires a description of the computation whose transition matrices include resetting events. For the DFA example considered here, such a transition matrix takes the form of Eq. (90) [cf. Eq. (54) for the case where no resetting takes place, corresponding to p⊔ = 0]. The DTMC described by the transition matrix (90) allows one to study multiple realistic computational scenarios where Σ at stochastic starting and stopping times can be efficiently tackled. For example, one may consider the processing of the input string by the DFA as a nonequilibrium stationary process with resetting and apply results from the martingale theory for stationary processes (see Ch. 7 in Ref. [19]). Alternatively, one can apply the formalism of this work to establish bounds on the intrinsic mismatch costs of the computation between the first and the n-th arrival of a blank symbol [see Eq. (42)], similarly to Sec. V C.
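A sketch of a resetting transition matrix of this form, assuming the mod-4 update rule of Sec. V for the symbol transitions (the actual Eq. (90) refers to the paper's labeling of Fig. 2):

```python
import numpy as np

def W_reset(p0, p1, pb):
    """Transition matrix of the (assumed) mod-4 DFA chain with stochastic
    resetting: a blank symbol, probability pb, sends every logical state
    back to the start state q0; pb = 0 recovers the no-reset chain."""
    W = np.zeros((4, 4))
    for q in range(4):
        W[q, (2 * q) % 4] += p0        # symbol 0
        W[q, (2 * q + 1) % 4] += p1    # symbol 1
        W[q, 0] += pb                  # blank: reset to q0
    return W

W = W_reset(0.6, 0.3, 0.1)

# Stationary distribution of the resetting chain by power iteration.
pi = np.full(4, 0.25)
for _ in range(500):
    pi = pi @ W
print(np.allclose(W.sum(axis=1), 1.0), np.allclose(pi @ W, pi))
```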

VIII. DISCUSSION
In this work we have shown how to extend stochastic thermodynamics to describe the minimal costs associated with a computer that halts at a stochastic time while processing strings of arbitrary length. Our formalism applies to computations described by discrete-time Markov chains over a set of computational states that may have restricted initial conditions, unidirectional links, and start and/or stop at a stochastic time. We obtain quantifiers, collectively dubbed the intrinsic mismatch cost of a computation, that lower-bound the entropy production incurred by the computer and that can be formulated at the fluctuating level. A key insight here is that these quantifiers, which provide a tool to probe the entropy production associated with computations at stopping times, can be entirely obtained from the DTMC evolution and the prior, without further details about their physical implementation. Notice that such an intrinsic cost is independent of the internal energy of the computational states x_t ∈ X, which can indeed be assumed to be equal for every computational state and constant over time. Still, non-zero entropy production, through the irreversible dissipation of heat into the environment, will in general be incurred by any physical computer which implements a given computation over such a set of states.
Building on the modern martingale formalism of stochastic thermodynamics, we also unveiled a plethora of universal fluctuation relations and inequalities that are valid for the broad class of computations analyzed in this work. We obtained a main fluctuation theorem, Eq. (35), valid for settings which include arbitrary stopping times, unidirectional transitions, and absolute irreversibility. In doing so, we have extended the martingale theory of stochastic thermodynamics to account for this additional source of irreversibility in generic situations, which we expect to have broad applicability in nonequilibrium thermodynamics.
The rigor and flexibility of our theory for stopping times allowed us to formulate and interpret several second-law-like inequalities [Eqs. (39)-(44)] at stochastic stopping times, as well as relations for the probabilities of acceptance/rejection of input data by a computer in terms of thermodynamic quantities [equality (80) and inequalities (86)-(87)]. In particular, the second-law inequalities (39) and (44) provide useful lower bounds on the minimum dissipation incurred by a generic computation stopping at an arbitrary stopping time, while Eq. (43) establishes formally how stopping times can be used to reduce the thermodynamic costs of a computation by means of time-reversal-symmetry breaking. Moreover, we have also shown the relevance of accounting for absolutely irreversible sequences in providing accurate bounds for the intrinsic mismatch cost of the computation. In this sense, the bound we derived in Eq. (39), which includes the absolute irreversibility term Γτ, is tighter than the alternative bound in Eq. (44). However, computing Γτ might be challenging depending on the setting considered, especially for large limit times τ. By contrast, the alternative bound in Eq. (44) is much easier to compute (as it only depends on two probability distributions), while still providing a meaningful bound in all examples explored here.
The framework developed in this paper can be readily applied to assess the thermodynamic costs of computations in a broad range of models of relevance in TCS, including, but not limited to, deterministic finite automata. Our results apply to every computation implemented by a synchronous digital computer and remain valid independently of how the computational variables are defined. In particular, they can include already-processed input symbols (as in non-i.i.d. DFAs), stacks (as in pushdown automaton models), or even entire words written on a random-access tape (as in Turing machines). Hence our results provide a tool to classify abstract computational machines by their intrinsic (unavoidable) thermodynamic costs. Applying our framework to more complex models of computational machines, such as pushdown automata or Turing machines halting at stochastic times, is a natural step following the investigation initiated here.
Our results are amenable to experimental testing using state-of-the-art techniques, in line with previous tests of Landauer's principle [25,103,104] and other experimental platforms in stochastic thermodynamics [105], in setups ranging from colloidal particles [106] and nanoscale devices [107,108] to biopolymers [91]. Regarding the determination of the prior, in some cases the experimentalist will have designed the system in sufficient detail that it is possible to calculate the prior, or at least approximate it with reliable numerical estimates. In other cases, the experimentalist can estimate the prior by repeatedly running the system and observing the resultant behavior. In any case, our results have nonzero lower bounds that apply no matter what the prior is.
It would also be very interesting in the future to extend the framework developed here by combining analytical tools from stochastic resetting (e.g. renewal theory and first-passage-time ideas [109]) with computer science methods. This would allow one to obtain tight bounds for the statistics of starting times and entropy production in specific models of DFAs and TMs processing regular languages, following the ideas sketched in Sec. VII. Also, we note that even if current digital devices are very close to periodic, they are not exactly so. In other words, they are some first-order perturbation away from being periodic, which suggests other avenues for future work. In general, the prior is a function of the physical process implementing the computation. For example, it is a function of the time-dependent rate matrix in the case of a CTMC. Given this, we might be able to use the envelope theorem (often used in game theory) to calculate how much the prior can change under first-order perturbations away from an exactly periodic process. That in turn might allow us to modify Eq. (13) to involve an infinitesimal first-order perturbation parameter ϵ characterizing how much the process differs from being exactly periodic.
Beyond the computational context, the results developed in this paper also provide new insights into the field of nonequilibrium thermodynamics. An important consequence of our work is the finding that the auxiliary dynamics introduced in Eq. (15) is suitable to treat processes at stopping times that may have unidirectional transitions and absolute irreversibility, hence making our framework applicable to generic situations where local detailed balance is broken. Similar auxiliary dynamics have been invoked in the literature, such as the so-called "dual", "dual-reversed" or "adjoint" dynamics, in the context of fluctuation theorems; see e.g. [20,42,49,82,110,111]. In particular, as shown above, if the process admits a well-behaved stationary solution, we can obtain from Σ the so-called non-adiabatic (or excess) entropy production [49,72]. Within such a scenario, our work is another brick in the wall of recent progress highlighting the role of non-adiabatic entropy and excess heat in characterizing the efficiency [112] and calorimetry [113,114] of active nonequilibrium systems.
We also expect our results to have potential applications outside statistical mechanics and computer science, e.g. in biological physics, for instance within the fields of biomolecular computation [115], enzyme kinetics [116] and information processing in biology [117]. As a minimal model, consider a Michaelis-Menten scheme for enzyme kinetics in which an enzyme E transforms a substrate molecule S into a product molecule P. A typical assumption is that the conversion of the substrate into a product takes place through an irreversible chemical reaction, E + S ⇌ ES → E + P, where the k_i's are suitable transition rates (k_1 and k_{-1} for substrate binding and unbinding, and k_2 for the irreversible catalytic step). Within this model, the enzyme's state during enzymatic cycles follows a continuous-time Markov jump process with one irreversible transition. In previous works, the presence of the irreversible transition has been circumvented by considering virtual processes with a very slow transition rate. However, our formalism can be readily applied to describe the stochastic thermodynamics of such enzymatic reactions, inasmuch as it does not require the presence of bidirectional transitions. Similarly, we expect our formalism to be suitable to describe fluctuations of biological populations in processes that include totally irreversible transitions, such as cell-fate decisions [118,119], cell death and apoptosis [120], among others. For such systems, our approach extends recent approaches to estimate dissipation developed within the field of active matter [121][122][123], which did not contemplate the presence of unidirectional transitions that are commonplace in biophysical modelling [124,125].
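A minimal Gillespie-type sketch of this enzymatic cycle is shown below. The rate values are assumed, the substrate concentration is absorbed into the binding rate, and the catalytic step is simulated as a strictly unidirectional transition, with no reverse rate required:

```python
import random

def gillespie_mm(k1=1.0, km1=0.5, k2=2.0, n_products=100, seed=0):
    """Simulate the cycle E + S <-> ES -> E + P as a two-state Markov jump
    process over the enzyme states {E, ES}. Rates are illustrative; [S] is
    assumed constant and folded into the binding rate k1."""
    rng = random.Random(seed)
    state, t, products = "E", 0.0, 0
    while products < n_products:
        if state == "E":
            t += rng.expovariate(k1)       # substrate binding: E -> ES
            state = "ES"
        else:
            rate = km1 + k2                # ES leaves via unbinding or catalysis
            t += rng.expovariate(rate)
            if rng.random() < k2 / rate:   # irreversible catalytic step
                products += 1
            state = "E"
    return t, products
```

Because the product-forming jump has no reverse counterpart, the standard log-ratio of forward and backward path probabilities diverges; the intrinsic mismatch cost discussed in this work remains finite and well defined for exactly this kind of trajectory.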
Then the residual EP of the physical process that implements G is as shown in [126]. Therefore, the EP for the process starting with distribution ϱ0 is given by Eq. (A5). Since the expected EP C(τ) is non-negative, by definition the residual cost of any island c, C_c^min, is non-negative. So the total residual cost, given by an expectation over all islands (which is the residual cost for the entire thermodynamic process characterized by G), is also non-negative. Like priors, residual costs of islands will in general differ from one cost function C to the next.
In general, as the iteration t of a periodic process changes, the distribution p_t(c) over the islands c will change, and therefore so will the associated total expected residual cost. However, since that total residual cost is always non-negative, all the lower bounds on EP in the main text that consider only mismatch cost apply. This is true even if the residual costs of the islands are strictly positive. In particular, Eq. (14) will still be a lower bound on the EP generated in the process.
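Schematically, the decomposition invoked in this argument can be restated as follows; this is a sketch in our notation, with p(c) the probability of starting in island c (cf. [126]):

```latex
\begin{equation*}
  C(\tau) \;=\;
  \underbrace{D(\varrho_0 \,\|\, q_0) - D(\varrho_\tau \,\|\, q_\tau)}_{\text{mismatch cost}}
  \;+\;
  \underbrace{\sum_{c} p(c)\, C^{\min}_{c}}_{\text{residual cost}\;\ge\;0},
\end{equation*}
```

so dropping the non-negative residual term always leaves a valid lower bound on C(τ) in terms of the mismatch cost alone.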
On the other hand, if we write down the formula for the minimal total residual cost analogous to Eq. (14), minimizing over the residual costs of each island, we simply get zero, by taking all those costs to equal zero. So unless we fix the physical details underlying the process, and therefore fix the residual costs of the islands to be strictly positive, our analysis of lower bounds is not changed by the existence of residual costs. This is why such quantities are ignored in the main text.
As a final comment, note that in general both the prior and the residual costs will vary with τ. However, the same mismatch cost formula bounds dissipation for any such choice of τ, once one plugs in the appropriate prior. This need not be true for the residual cost, in the sense that the islands might change for different choices of τ.

In this appendix we provide extra numerical evidence for the scaling behavior of the mismatch cost at stopping and fixed times, shown in Fig. 10. There we observe that the average intrinsic mismatch cost at fixed times indeed scales linearly with τ, that is, ⟨Σ(τ)⟩ ∼ τ, as illustrated in the inset of Fig. 10. Moreover, we obtain that ⟨Σ(τ)⟩/τ decreases monotonically with τ down to a saturating positive value, yet its scaling behavior is rather insensitive to the statistics of the input strings (see open symbols in Fig. 10). This makes us question whether the fixed-time average ⟨Σ(τ)⟩, or its rate per iteration ⟨Σ(τ)⟩/τ, is a suitable indicator of the thermodynamic costs of the computation. On the other hand, when considering the stopping-time average ⟨Σ(T)⟩, we observe a sublinear scaling at moderate values of the limit time, i.e. ⟨Σ(T)⟩ ∼ τ^α with |α| < 1, reaching a plateau at large τ (see filled symbols in Fig. 10).
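For concreteness, a Monte Carlo sketch of the stopping-time average is given below. It estimates ⟨Σ(T)⟩ with T = min(T1, τ) for a generic DTMC, using a uniform prior and one common trajectory-level form of the mismatch cost, Σ = ln[ρ0(x0)/q0(x0)] − ln[ρT(xT)/qT(xT)]; the choice of accept state and all names are illustrative assumptions:

```python
import numpy as np

def avg_sigma_at_stopping(W, tau, n_traj=10_000, seed=42):
    """Monte Carlo estimate of <Sigma(T)> with T = min(T1, tau), where T1 is
    the first return time to the accept state (taken to be state 0), the
    process starts in state 0, and the prior is uniform."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    rho = [np.zeros(n)]
    rho[0][0] = 1.0                       # actual initial distribution: start in q0
    q = [np.full(n, 1.0 / n)]             # uniform prior
    for _ in range(tau):                  # propagate both distributions under W
        rho.append(rho[-1] @ W)
        q.append(q[-1] @ W)
    total = 0.0
    for _ in range(n_traj):
        x, T = 0, tau
        for t in range(1, tau + 1):
            x = rng.choice(n, p=W[x])
            if x == 0:                    # first return to the accept state
                T = t
                break
        # Trajectory-level mismatch cost evaluated at the stopping time.
        total += np.log(rho[0][0] / q[0][0]) - np.log(rho[T][x] / q[T][x])
    return total / n_traj
```

Sweeping τ with this routine (and with the absorbing condition removed for the fixed-time average) reproduces the kind of scaling comparison reported in Fig. 10, up to sampling noise.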
We also find that ⟨Σ(T)⟩ is more sensitive than ⟨Σ(τ)⟩/τ to the value of p0 for all values of τ explored in our simulations. This reveals that the intrinsic mismatch cost at stopping times ⟨Σ(T)⟩ is a suitable quantity for characterizing the average performance of a computation accomplished at a stochastic time. For example, the sensitivity of ⟨Σ(T)⟩ to string statistics could be fruitfully exploited as a probe of the performance of a DFA in processing different regular languages, an exciting avenue that we leave for future work.
FIG. 2. (a) Discrete-time Markov chain (DTMC) associated with the DFA recognizing binary i.i.d. sequences that are multiples of four (see Fig. 1c). The transition matrix of this DTMC is given by Eq. (54), where p0 and p1 = 1 − p0 denote respectively the probability of a 0 and a 1 in the input string. (b) DTMC of the auxiliary dynamics associated with the stationary prior, with transition probability matrix obtained from Eq. (15) and given by Eq. (57).

FIG. 4. Illustration of analytical results for the second law at stopping times, applied to the discrete-time Markov chain model of the DFA recognizing binary strings whose length is a multiple of four (see Fig. 2a). Here the computation stops at T = min(T1, τ) for the limit time τ = 2, or earlier if the accept state is reached in one iteration. Symbols represent analytical results for the averages at the stopping time of the relevant thermodynamic quantities: intrinsic mismatch cost (non-adiabatic entropy production) ⟨Σ(T)⟩ given by Eq. (63) for prior equal to the stationary probability (blue filled squares); stochastic distinguishability ⟨δ2(T)⟩, given by Eq. (65) (blue dotted line); fixed-time non-adiabatic entropy production ⟨Σ(τ)⟩ evaluated over trajectories of the same length τ = 2 (open symbols); and the absolute irreversibility contribution − ln[1 − Γ2] given by Eq. (36) (blue dashed line). The blue solid line is the sum −⟨δ2(T)⟩ − ln[1 − Γ2], which in this example equals ⟨Σ(T)⟩, thus saturating the second law (39). The horizontal black thick line is set to zero as a reference value.

FIG. 5. Numerical results for the average intrinsic mismatch cost ⟨Σ(T)⟩ (symbols) at stopping times T = min(T1, τ) with uniform prior, evaluated for the DFA in Fig. 2a processing i.i.d. binary input data, as a function of the probability of input symbol 0. We used different limit times: τ = 5 (blue filled squares) and τ = 14 (red filled circles). The solid lines correspond to the lower bound predicted by Eq. (44), given by D(ρ0 || ρτ) − ⟨δτ(T)⟩, for τ = 5 (blue solid line) and τ = 14 (red solid line), while the dashed lines are the corresponding upper bounds in Eq. (43), ⟨Σ(τ)⟩ − ⟨δτ(T)⟩, for the same values of τ. Averages are estimated from 10^4 numerical simulations for each parameter value. The thick gray line is the average cost at stopping times for stationary prior π, see Eq. (63) and Fig. 4, and the vertical dashed line is set to p0 = 1/2 as a reference value. Inset: ⟨Σ(τ)⟩ − ⟨Σ(T)⟩ (solid line) and ⟨δτ(T)⟩ (dashed line) as a function of p0 for τ = 14. The horizontal dotted line is set to zero as a reference value.

FIG. 6. (a) Minimal DFA that accepts binary multiples of three, with three states zt = {q0, q1, q2} and the same start and accept state q0. (b) Associated DTMC, where the automaton processes input strings generated by a non-i.i.d. source of input symbols with probabilities depending only on the last processed symbol bt = {0, 1}. See Eq. (73) for the corresponding transition probability matrix.

FIG. 7. Numerical results for the intrinsic mismatch cost ⟨Σ(T)⟩ with uniform prior for the DFA in Fig. 6a processing Markovian input strings, as a function of the input symbol transition probability. Here T = min(T1, τ), with T1 the first return time to either of the states (q0, 0) or (q0, 1), and τ = 5 a prescribed limit time. Symbols are numerical estimates for ⟨Σ(T)⟩ obtained from 10^4 numerical simulations, for two different values of the transition probability p00 of the input string containing two consecutive zeroes (see legend). Solid lines are estimates from the numerical simulations of the quantity D(ρ0 || ρτ) − ⟨δτ(T)⟩, confirming the lower bound given by Eq. (44).

FIG. 10. Numerical results for the intrinsic mismatch cost with uniform prior as a function of the limit time τ. We show three different values of the probability of symbol 0: p0 = 0.75 (blue squares), p0 = 0.5 (red circles), and p0 = 0.3 (green diamonds). Filled symbols correspond to the stopping-time average ⟨Σ(T)⟩. We include for comparison the corresponding fixed-time ensemble average rate ⟨Σ(τ)⟩/τ without absorbing conditions (open symbols). The open symbols in the inset show the value of ⟨Σ(τ)⟩. Parameters of the simulations are as in Fig. 5.