Optimization of nonequilibrium free energy harvesting illustrated on bacteriorhodopsin

Harvesting free energy from the environment is essential for the operation of many biological and artificial systems. We use techniques from stochastic thermodynamics to investigate the maximum rate of harvesting achievable by optimizing a set of reactions in a Markovian system, possibly under various kinds of topological, kinetic, andthermodynamicconstraints. Thisquestionisrelevantfortheoptimaldesignofnewharvestingdevices as well as for quantifying the efficiency of existing systems. We first demonstrate that the maximum harvesting rate can be expressed as a constrained convex optimization problem. We illustrate it on bacteriorhodopsin, a light-driven proton pump from Archaea, which we find is close to optimal under realistic conditions. In our second result, we solve the optimization problem in closed-form in three physically meaningful limiting regimes. These closed-form solutions are illustrated on two idealized models of unicyclic harvesting systems.


I. INTRODUCTION
Many molecular systems, both biological and artificial, harvest free energy from their environments.Biological organisms require free energy to grow and replicate [1,2], and many species undergo selection for increased harvesting [3][4][5][6].Artificial harvesting systems have also been constructed and optimized in the field of synthetic biology [7][8][9][10][11][12][13][14].The optimization of free energy harvesting is thus a central problem both in biology and engineering.
As an example, consider a harvesting system such as a biological metabolic network that converts glucose to ATP [15].Suppose that the kinetic and thermodynamic parameters of one or more reactions can be optimized, either by natural selection or artificial design.What is the maximum rate of free energy harvesting that the network can achieve, and what are the kinetic and thermodynamic parameters that achieve it?These questions are relevant both for design of optimal harvesting devices and for quantifying the efficiency of existing systems.
In this paper, we use techniques from stochastic thermodynamics to derive bounds on maximum rate of free energy harvesting.
We consider a harvesting system in nonequilibrium steady state which is coupled to an external source of free energy, an internal free energy reservoir, and a heat bath.The setup is well-suited for studying the kinds of smallscale systems usually considered in stochastic thermodynamics [16], where assumptions of local detailed balance and steady state are justified.The steady-state assumption is reasonable in many molecular systems, where there is a separation of timescales between internal relaxation and environmental change.
We suppose that the system's dynamics can be separate into two kinds of processes, termed baseline and control.The baseline processes, which are held fixed, mediate the coupling to the external source of free energy.Control refers to all other processes which can be optimized for increasing the harvesting rate at which free energy flows into the internal reservoir.The particular separation of baseline/control generally depends on domain knowledge about the system and the scientific question at hand.For example, to study the efficiency of a particular reaction in a metabolic network, that reaction may be treated as control while the other reactions are baseline.The baseline/control separation is similar to the distinction in control theory between "plant" (a given system with fixed dynamical laws) and "controller" (the part that undergoes optimization) [17].
In our first set of results, we demonstrate that the optimization of the harvesting rate can be expressed as a convex optimization problem.The optimal solution of this problem determines both the maximum harvesting rate and the specific control processes that achieve that maximum.Importantly, the optimization can also account for various types of constraints on the possible network topology, kinetic timescales, and thermodynamic forces of the control processes.
We illustrate our results on bacteriorhodopsin (Fig. 1), a proton-pumping membrane protein.Bacteriorhodopsin is a photosynthetic system found in Archaea, with close relatives in bacteria [20,21].It is also used in many artificial lightharvesting systems [7][8][9]11].Using published thermodynamic and kinetic data, we develop a thermodynamically consistent stochastic model of bacteriorhodopsin.We demonstrate that, under normal operating conditions, the bacteriorhodopsin system is remarkably efficient.
Our main result is formulated as a convex optimization prob- Figure 1.Left: Bacteriorhodopsin is a biomolecular free energy harvesting machine [18], illustrated in its trimer configuration by D. Goodsell (CC-BY-4.0)[19].Right: during each turn of the bacteriorhodopsin photocycle, the molecule absorbs a photon and pumps a proton against the cell's membrane potential.lem which must be solved numerically in the general case.In the second part of this paper, we derive closed-form solutions of this problem for three physically meaningful regimes: the weakly-driven linear response regime, the irreversible deterministic regime, and the intermediate near-deterministic regime.These solutions illustrate how optimal harvesting reflects the "alignment" between free energy input and relaxation dynamics.We illustrate these closed-form solutions on two unicyclic systems, which may be interpreted as idealized models of two types of nonequilibrium harvesting devices.
We finish our paper with a brief Discussion.There we relate our approach to previous work, including maximization of power output in steady-state engines and flux balance analysis.We also propose directions for future research.

II. SETUP
We consider a system with n mesostates described by the distribution p = (p 1 , . . ., p n ) ∈ R n + .The distribution evolves according to the master equation ṗi = j R ij p j , where R ji is the transition rate i → j (R ii = − j R ji ).Usually p represents a probability distribution of a stochastic system with Markovian dynamics [22,23].However, under an appropriate choice of units, it may also represent chemical concentrations in a deterministic chemical reaction network with pseudo-firstorder reactions, such as an enzymatic cycle [24,25].
The system is coupled to a heat bath at inverse temperature β = 1/k B T , an internal free energy reservoir, and a nonequilibrium environment that serves as an external source of free energy.For example, in a metabolic network, one may consider an internal reservoir of ATP and an external source of glucose.The system has nonequilibrium free energy where S(p) := − i p i ln p i is the Shannon entropy and f i is the internal free energy of mesostate i [26].
As mentioned in the Introduction, we suppose that the dynamics of the system can be separated into baseline and control processes.We make one important assumption in our analysis: the control processes are only coupled to the heat bath and internal free energy reservoir, but not directly to the external source of free energy.This means that control can only increase harvesting by interacting with the baseline, rather than directly increasing the inflow of free energy from the external source.For example, in a metabolic network, control processes cannot directly increase the import of glucose, but they can affect the rate at which glucose is converted into ATP by optimizing intermediate reactions and mechanisms.Control processes may be driven by the internal reservoir (e.g., coupled to ATP hydrolysis) or undriven (e.g., enzymes).
To formalize the baseline/control distinction, we write the rate matrix as R = R b + R c , where R b ji and R c ji represent the transition rate of i → j due to baseline and control.Given distribution p, the increase of system free energy due to baseline processes is The increase due to control processes is defined analogously but using rate matrix R c , The harvesting rate is the rate at which free energy flows to the internal reservoir.The harvesting rate due to baseline processes is where g b ji is the free energy increase in the internal reservoir due to a single baseline transition i → j and ġb i is the rate of free energy flow to the internal reservoir due to internal transitions within i (assuming i is a coarse-grained mesostate).Similarly, the harvesting rate due to control processes is where g c ji is the free energy increase in the internal reservoir due to control transition i → j.For simplicity, we assume that control cannot exchange free energy with the internal reservoir due to internal transitions within i. Negative values of g b ji , ġb i , g c ji indicate driving done on the system by the internal reservoir.
For a concrete example of how (R b , R c , f , g b , ġb , g c ) are defined for a real biomolecular system, see the bacteriorhodopsin example below and SM-II [27].
As standard in stochastic thermodynamics, we assume that control processes obey local detailed balance (LDB) [16,28], Eq. ( 6) guarantees that the irreversibility of each control transition is determined by the amount of free energy dissipated by that transition [29].Observe that the right side accounts for free energy changes of the system (f i − f j ) and the internal reservoir (g c ji ), but not the external source.This formalizes the assumption that control processes do not exchange free energy with the external source.
We do not require that the baseline rate matrix obeys LDB, although it will often do so for reasons of thermodynamic consistency.

III. MAXIMUM HARVESTING RATE
Suppose that the combined rate matrix R = R b + R c has the steady-state distribution π, which satisfies R b π + R c π = 0.The total steady-state harvesting rate due to baseline and control is We seek to maximize this harvesting rate by varying the parameters of the control processes (R c , g c ) while holding the baseline parameters (f , R b , g b , ġb ) fixed.Finding this maximum would allow us to determine fundamental bounds on harvesting and to evaluate the efficiency of existing harvesting systems.
However, Ġtot is not a concave function of the parameters (R c , g c ), therefore maximization of ( 7) is not a convex optimization problem and is not generally intractable.In the following, we reformulate this maximization as a convex optimization with a physically interpretable objective.This allows us to solve the optimization numerically and, at least for some special cases, also in closed form.
To begin, we rewrite (7) as where we introduced the Schnackenberg formula for the entropy production rate (EPR) [22], where J c ji = π i R c ji ≥ 0 is the one-way probability flux due to control transition i → j.
Eq. ( 8) has an intuitive physical interpretation: the total steady-state harvesting rate is the rate of free energy increase in the system and internal reservoir due to baseline, minus the rate of dissipation (EPR) due to the control fluxes.The derivation of this result proceeds in two steps (see SM-I A [27] for details).The first step is to show that Σ(J c ) = −β[ Ḟc (π) + β Ġc (π)], which follows by combining (9) with (3) and (6).This states that the EPR due to control is proportional to the free energy loss in the system and internal reservoir due to control.The second step is to show that Ḟb (π) + Ḟc (π) = 0, which follows because the left side is the overall derivative of the nonequilibrium free energy F, as defined in (1), therefore it must vanish in steady state.The result (8) then follows by combining with (7) and rearranging.
Importantly, when expressed in the form (8), the harvesting rate is a concave function of the steady-state distribution π and the control fluxes J c (see SM-I B [27]).To find the maximum harvesting rate, we optimize (8) with respect to π and J c .Note that varying π and J c is equivalent to varying the control rate matrix via R c ji = J c ji /π i and control driving g c ji via (6).However, when performing this optimization, we must also ensure that π is the steady-state distribution induced by the fluxes J c .This condition can be expressed as a linear constraint on π and J c via R b π +BJ c = 0.Here J c is treated as a vector in R n 2 and B ∈ R Ḟb (p) + Ġb (p) − Σ(J )/β.(10) In this expression, Λ is the feasible set of distributions p and control fluxes J .At a minimum, Λ ensures the validity of the distribution p and the fluxes J via the linear constraints p i = 1, p i ≥ 0, and J ji ≥ 0. We write sup instead of max because the set of allowed fluxes is potentially unbounded.Eq. ( 10) implies a tradeoff between the total gain of free energy in the system and internal reservoir due to baseline (which depends only on p) and the dissipation due to control fluxes (which depends only on J ). Importantly, the feasible set Λ can include additional convex constraints, which may introduce topological, kinetic, thermodynamic, etc. restrictions on the control processes.Topological constraints restrict which transitions can be controlled; e.g., J ji = 0 ensures that control does not use transition i → j).Kinetic constraints restrict timescales of control processes, as might reflect underlying physical processes like diffusion; e.g., an upper bound on control transition rate R c ji = J ji /p i ≤ κ can be enforced via J ji ≤ p i κ.Thermodynamic constraints bound the affinity [22] of control transitions; e.g., The above examples all involve linear constraints.An example of a nonlinear, but still convex, constraint is an upper bound on the EPR incurred by control, Σ(J ) ≤ Σc max .Eq. ( 10) is our first main result.Importantly, G is defined purely in terms of the thermodynamic and kinetic properties of the baseline process, along with desired constraints encoded in Λ.Thus, G is the maximum steady-state harvesting rate that can be achieved by any allowed control processes, given a fixed baseline.In addition, Eq. ( 10) involves the maximization of a concave objective given convex constraints.This is equivalent to the minimization of a convex objective, thus Eq. ( 10) is a convex optimization problem that can be efficiently solved using standard numerical techniques [30].The optimization also identifies an optimal steady-state distribution p * and control fluxes J * that achieve the maximum harvesting rate G (or come arbitrarily close to achieving it).These fix the optimal control rate matrix via R c * ji = J * ij /p * i .Thus, our optimization specifies an upper bound on harvesting as well as the optimal control strategy that achieves this bound.
There is an important special case in which our optimization problem is simplified.Suppose that Λ does not enforce additional constraints on p and J (more generally, we permit topological constraints if the graph of allowed transitions is connected and contains all n states).Then, the objective is maximized in limit of fast control, J → ∞ and Σ(J ) → 0. We can then write (10) as an optimization over steady-state distributions: The optimal p * is unique as long as the baseline rate matrix is irreducible.The optimal control rate matrix is very fast (R c * → ∞) and obeys detailed balance for p For details, see SM-I C and SM-I D [27].

IV. BACTERIORHODOPSIN
We illustrate our results using bacteriorhodopsin, a lightdriven proton pump from Archaea [18].
We define a thermodynamically consistent model of the bacteriorhodopsin cycle using published thermodynamic [31] and kinetic [32] data (see SM-II [27]).The system operates in a cyclical manner, absorbing a photon and pumping a proton during each turn of the cycle (Fig. 1, right).Specifically, the transition M 1 → M 2 pumps a proton out of the cell.This stores free energy in the internal reservoir (the membrane electrochemical potential), where e is the elementary charge constant, ∆ψ is the membrane electrical potential, and ∆pH is the membrane pH difference.The other transitions in the cycle do not affect the free energy of the internal reservoir (g ij = 0 and ġi = 0).
During the transition bR → K, the system leaves the ground state by absorbing a photon at 580nm, thereby harvesting free energy from the external source and dissipating some heat to the environment at T = 293 • K.This transition is much faster (picoseconds) than the other transitions in the photocycle (micro-to milliseconds).As commonly done in photochemistry [33], we coarse-grain transitions O → bR and bR → K into a single effective transition O → K.
To explore the performance of bacteriorhodopsin under different conditions, we vary the membrane electrical potential ∆ψ between −75 and 350 mV, while using a realistic fixed ∆pH = −0.6 [34].We show the actual harvesting rate ( Ġtot in units of k B T /sec) at different potentials as a black line in Fig. 2 (a).At a plausible in vivo ∆ψ = 120 mV [34], the model exhibits a steady-state current of 11.5 protons/sec, each proton carrying 6.1 k B T of free energy, corresponding to Ġtot ≈ 70 k B T /sec.The largest output is achieved near the in vivo potential: at lower potentials, the cycle current saturates while the free energy per proton drops, and at higher potentials the pump stalls.
Next, we quantify the maximum harvesting rate that can be achieved by optimizing the parameters of individual transitions.This analysis is relevant for understanding limits on increasing bacteriorhodopsin output, whether via natural selection or biosynthetic methods [35][36][37][38].Interestingly, such transition-level optimization may be feasible in bacteriorhodopsin, as the properties of several transitions are known to be individually controlled by particular amino acid residues in the bacteriorhodopsin protein [35,[39][40][41].
For each reversible transition in the cycle, for instance N ↔ O, we define the baseline as the rest of the cycle without that transition.We then optimize control under the topological constraint that only the relevant transition (e.g., N ↔ O) is allowed.The analysis is repeated for all transitions, except for the (coarse-grained) photon-absorbing transition O ↔ K, which is in accordance with our assumption that control cannot directly exchange free energy directly with the external source.
Fig. 2 (a) shows G , the maximum Ġtot achievable by optimizing each reversible transition.In Fig. 2 (b), we also show the efficiency Ġtot /G ≤ 1 for each transition, that is the ratio of the actual and maximum harvesting rate.
Several transitions, such as K ↔ L,L ↔ M 1 ,M 1 ↔ M 2 , are remarkably efficient (≥ 85%) near in vivo membrane potentials.The reprotonation step N ↔ O is the least efficient (∼ 40%) and also has the slowest kinetics of the 5 transitions studied in Fig. 2.This suggests that N ↔ O is a bottleneck  whose optimization can have a big impact on the harvesting rate, while optimization of other non-bottleneck transitions has a more limited effect.
Observe that G for M 1 ↔ M 2 does not depend on ∆ψ.This is because G is a function of baseline properties, which do not depend on the membrane potential when M 1 ↔ M 2 is treated as control.Conversely, M 1 ↔ M 2 as control transition can be optimized by varying the membrane potential and/or scaling up the forward/backward rates.Our results show that this transition is very close to optimal at in vivo membrane potentials and kinetic timescales.
Optimal distributions p * are also obtained, with a typical one shown in Fig. 2 (c).We find a consistent shift toward state O, which accelerates the reset of the cycle and increases the flux across the photon-absorbing transition O → K.
In SM-II [27], we illustrate how the efficiency of bacteriorhodopsin transitions can be evaluated under other types of constraints, including constraints on thermodynamic affinity, dynamical activity, and kinetics.
As a final analysis, instead of optimizing individual existing transitions in the bacteriorhodopsin cycle, we ask to what extent the harvesting rate can be increased by any additional control processes.For example, this could involve an additional enzyme that shifts the cycle's steady state by permitting a new transition between distant states (e.g.L ↔ N ), possibly yielding an increase of the proton pumping rate.
In this case, we treat the entire bacteriorhodopsin system as the baseline, and we do not introduce any additional constraints on the steady-state distribution or the control fluxes.Then, as shown in Sec.SM-I D [27], the objective in (10) is achieved in the limit of fast control, and the maximum harvesting rate can be found by solving the simplified optimization problem (11).
For this setup, Fig. 3 (a) shows the baseline (actual) harvesting rate Ġtot and the maximum harvesting rate G at varying ∆ψ.Interestingly, both peak at around the in vivo values of the membrane potential.In Fig. 3 (b), we show that the actual bacteriorhodopsin cycle harvests approximately 50% of the fundamental bound given by G (at in vivo values of ∆ψ).This suggests that bacteriorhodopsin is remarkably close to optimal, relative to improvements that could be achieved by introducing any additional control processes.
We also show the actual steady-state distribution and the optimal distribution p * in Fig. 3 (c).The optimal distribution increases the probability of state O, similar to the optimal distribution found by optimizing the N ↔ O transition, shown in Fig. 2 (c).However, unlike Fig. 2 (c), where most of the extra probability is taken from state N , in Fig. 3 (c) the probability is drawn more uniformly from other states in the cycle, indicating the presence of distributed control.

V. LIMITING REGIMES
Our results are stated via an optimization problem that generally does not have a closed-form solution.In our second set of results, we identify closed-form expressions in three physically meaningful regimes.For simplicity, here we focus on the simplified objective (11).See SM-III [27] for detailed derivations, including analysis of the conditions under which each of these three approximation are be valid.
For convenience, we first rewrite (11) as where Ṡb (p) = − i,j R b ij p j ln p i is the increase of the Shannon entropy of p under R b and for convenience we defined . The objective (13) contains a nonlinear term − Ṡb (p)/β quantifying the decrease of information-theoretic entropy and a linear term i p i ϕ i quantifying the flow of thermodynamic free energy.
Next, we consider three regimes.Linear response (LR) applies when the optimal distribution p * is close to the steady-state distribution of the baseline rate matrix R b .Suppose that R b is irreducible and has a unique steady state π b with full support.We introduce the "additive reversibilization" of R b , The rate matrix A obeys detailed balance (DB) for the steadystate distribution π b and has the same dynamical activity [42] on all edges as R b .A may be considered as a DB version of R b , and it is equal to R b when the latter obeys DB [43,44].Let u α indicate the α-th right eigenvector of A normalized as i (u α i ) 2 /π b i = 1, and λ α the corresponding real-valued eigenvalue (λ 1 = 0).The LR solution for the maximum harvesting rate and the optimal distribution is where quantifies the harvesting "amplitude" for mode α.
Eq. ( 15) has a simple interpretation.In addition to the baseline harvesting rate Ġb (π b ), G contains contributions from the relaxation modes of A, with each mode weighed by its squared amplitude and relaxation timescale −1/λ α .All else being equal, G is large when slow modes have large harvesting amplitudes.The optimal p * shifts the baseline steady state π b toward mode α in proportion to that mode's harvesting amplitude and relaxation timescale, thereby optimally balancing the tradeoff between harvesting and dissipation.
The Deterministic (D) regime applies when the nonlinear information-theoretic term in ( 13) is much smaller than the linear thermodynamic term.We can then ignore the former, turning (13) into a simple linear optimization.This gives the approximate solution where i * = arg max i ϕ i is the optimal mesostate.This solution concentrates probability on a single mesostate, effectively ignoring the cost of maintaining this low-entropy distribution.
The Near-Deterministic (ND) regime lies between Linear Response and Deterministic ones.By perturbing p * around δ i * i , we can decouple the values of p i in the objective function (11).The maximal harvesting rate and optimal distribution in this regime are then given by The ND solution also has a simple interpretation.It perturbs the Deterministic solution by shifting probability towards states with high transition rates away from the optimal state (large R b ii * ) and small decreases in harvesting (ϕ i * − ϕ i ).This balances the benefit of harvesting against the cost of pumping probability against R b ii * .

VI. EXAMPLE: UNICYCLIC SYSTEMS
We illustrate our closed-form solutions using two simple models, both based on a unicyclic system with n states.The baseline dynamics involve diffusion across a 1-dimensional ring, with left and right jump rates set to 1.The baseline steady state is a uniform distribution, π b i = 1/n, with no cyclic current.We assume a uniform free energy function, We consider two different scenarios.In the first scenario, shown schematically in Fig. 4 (a), Θ of free energy is harvested each time the system carries out the transition 1 → 2, so and g b ij = ġb i = 0 otherwise.This scenario may be interpreted as an idealized model of a biomolecular harvesting cycle, such as a transporter.In the second scenario, shown schematically in Fig. 4 (b), free energy is harvested at a rate of θ per unit time when the system is located in one particular mesostate i * = 1, so and g b ij = ġb i = 0 otherwise.This scenario may be interpreted as an idealized model of error correction or self-assembly, where free energy can only be harvested when the system is in some particular functional mesostate.
For both scenarios, we evaluate the maximum harvesting rate G and the optimal distributions achievable by adding any possible control to the system, without constraints.We report exact values found by numerical optimization of Eq. ( 13), as well as the LR, ND, and D approximations described above.To calculate the LR values, we exploit the fact that the baseline unicyclic rate matrix is a circulant matrix with a simple eigendecomposition [45].Full details of the derivations for the two scenarios are provided in SM-IV [27] and SM-IV C [27], respectively.
We first report results for the first scenario from Fig. 4 (a), where free energy is harvested during the transition 2 → 1. Observe that baseline harvesting rate vanishes, Ġb (π b ) = 0, since harvesting free energy requires a cyclic current.In Fig. 5 (a), we plot the maximum harvesting rate G and its approximations as a function of the supplied free energy Θ.For small Θ, LR applies and the maximum harvesting rate is The optimal distribution in the LR regime, shown in Fig. 5 (c), builds up in equal increments starting from i = i * + 1 until the optimal state i * = 1, after which it drops sharply.For  large Θ, the D regime is relevant and the optimal distribution concentrates on the optimal state i * = 1, so At intermediate Θ, the ND regime applies, which gives The optimal distribution in the ND regime, shown in Fig. 5 and the rest to the optimal state p * i * .Next, we consider the second scenario from Fig. 4 (b), where free energy is harvested when the system is in the optimal mesostate i * = 1.Observe that the uniform baseline steady state assigns 1/n probability to the optimal state, thus in this scenario the baseline harvesting rate is Ġb (π b ) = θ/n.To facilitate comparison with the first scenario, we focus on the increase of the maximum harvesting rate relative to baseline, In Fig. 6 (a), we show ∆G and its approximations as a function of the free energy supply rate θ.For small θ, LR applies and the maximum harvesting rate is The optimal distribution in the LR regime, shown in Fig. 6 (c), is symmetric about the optimal state i * = 1.For large θ, the D regime is relevant and the optimal distribution concentrates on the optimal state i * = 1, so At intermediate θ, the ND regime applies, giving The optimal distribution in the ND regime, shown in Fig. 6 (b), allocates and the rest to p * i * .There are some similarities among the two harvesting scenarios.For both scenarios, in the LR regime, the increase of the harvesting rate scales quadratically in the supplied free energy (Θ or θ) and linearly in inverse temperature β.This scaling reflects the fact that the optimal strategy has to balance harvesting (Θ or θ contributions) with the thermodynamic cost of maintaining a low entropy p * (β contributions).In the ND and D regimes, G scales linearly in the supplied free energy and loses its linear dependence on β.Thus, outside of LR, the thermodynamic cost of maintaining a low entropy distribution has a minor effect on the optimal strategy.
There are also important differences between the two scenarios.For the first scenario, the optimal strategy maintains an asymmetric p * , thereby generating a net flux across the transition 2 → 1.In the LR regime, the cost of maintaining this asymmetric distribution grows with the system size n, therefore the maximum harvesting rate in Eq. ( 19) scales as ∼ O(n −1 ).This is shown in Fig. 5 (d), where we plot G and its LR approximation at various Θ and n.For the second scenario, the optimal strategy maintains a peaked but symmetric p * .Remarkably, the cost of maintaining this distribution does not depend on system size n.This is shown in Fig. 6 (d), where we plot ∆G at various θ and n.

VII. DISCUSSION
In this paper, we consider the problem of optimizing free energy harvesting in a nonequilibrium steady-state system.We demonstrate that this problem can be formulated as a constrained convex optimization problem, and we use this formulation to study optimal harvesting and efficiency in the bacteriorhodopsin proton pump.We also solve the convex optimization problem in closed-form for three limiting regimes, as illustrated on two unicyclic models discussed above.
A key step in our analysis is to separate the dynamics of the system into separate contributions from fixed baseline processes and optimizable control processes.We note that, in stochastic thermodynamics, the baseline/control separation has been previously used to study autonomous Maxwellian demons [46,47], counterdiabatic driving [48], and the cost of maintaining a nonequilibrium steady state [49,50].
We derive a simplified bound on the maximum harvesting rate in (13), which is achieved in the limit of fast dissipationless control.Interestingly, this expression involves a tradeoff between two terms, one information-theoretic and one thermodynamic.At first glance, this resembles information/freeenergy tradeoffs characteristic of Maxwellian demons and other "information engines" [51][52][53][54][55][56][57][58].However, there are important differences.In a typical information engine, there is no external source of driving and information serves as fuel, which can be converted into β −1 ln 2 of thermodynamic free energy per bit.In our case, there is an external source of free energy that in some cases can be harvested more effectively by reducing the system's statistical entropy, e.g. by concentrating it on favorable states.Here, a bit of information can increase the harvesting rate by a very large amount (much larger than β −1 ln 2/bit), and information acts more like a catalyst than a fuel [59,60].Loosely speaking, this is similar to how information encoded in the sequence of a metabolic enzyme is not itself fuel, but rather allows metabolism to harvest a large amount of fuel from elsewhere.
We finish by mentioning some connections to previous work and future directions.First, our approach may be related to prior work on optimizing power output and free energy transduction in steady-state molecular machines [29,[61][62][63][64][65][66].Here we consider the general problem of optimizing a set of control processes, given a fixed baseline and possible additional constraints on kinetics, topology, and thermodynamics.Previous work does not make the baseline/control distinction; instead, it is mostly concerned with the problem of optimizing system performance with respect to a small set of specific parameters or observables of interest, such as the location of free energy barriers [62,63,65,66], efficiency [66], and the size of fluctuations [61,64].
There is also an interesting relation between our work and flux balance analysis (FBA) [67][68][69].The goal of FBA is to identify deterministic fluxes in biological metabolic systems that optimize biomass production, or other similar metrics of performance.This can be formulated as a linear program, which may include linear constraints that enforce thermodynamically favored reaction directions [69] (interestingly, in Ref. [70], the authors propose a version of FBA that also accounts for the entropy production rate).Our setting and optimization are different from FBA and its variants.We seek to optimize free energy harvesting at the stochastic level, and our objective involves nonlinear information-theoretic contributions to free energy.In addition, our optimization involves both the steady-state distribution p and fluxes J , which allows us to optimize harvesting due to to internal transitions within coarse-grained mesostates, as in Fig. 4 (b).Nonetheless, investigating the relationship between our approach and FBA is an interesting direction for future work.
Another interesting direction for future work is to consider stochastic fluctuations of free energy harvesting.In particular, the thermodynamic uncertainty relation may be used to study tradeoffs between the entropy production rate, the average harvesting rate (the quantity Ġtot considered here), and the fluctuations in the amount of harvested free energy [64,71].For biomolecular systems, large fluctuations in harvesting can lead to starvation, suggesting that minimizing fluctuations may be of significant biological importance.
Finally, an interesting direction for future work is to consider free energy harvesting in a system embedded in a fluctuating environment.For example, one may imagine a harvesting system in an environment with fluctuating sugar sources, or with a fluctuating amount of available light.In this setting, it is natural to optimize the harvesting rate under the topological constraint that control fluxes cannot directly change the state of the environment, for instance using bipartite models of Markovian dynamics [72].It would be interesting to investigate how, under the optimal harvesting strategy, the information flow from the environment to the system varies with the abundance of free energy and complexity of the environment.

Supplemental Material: Optimization of nonequilibrium free energy harvesting illustrated on bacteriorhodopsin
Jordi

A. Derivation of (8) from LDB and steady-state assumption
Here we derive (8) in the main text, which reads as To derive this result, The first line follows from definitions made in the main text.Since R c is a rate matrix, i,j p i R c ji a i = i p i a i j R c ji = 0 for any a.This allows us to further rewrite (S1.2) as In the second line, we plugged in the condition of LDB, as expressed in (6) in the main text: Using the definition of one-way fluxes due to control given a generic distribution p, J c ji = p i R c ji , and the Schnackenberg expression of EPR Σ, we rewrite (S1.3) as In the main text, this equality is applied at the steady-state distribution π.However, Eq. (S1.5) is valid for all distributions p as long as (S1.4) holds.
To complete our derivation of (S1.1), recall that Rπ = (R b + R c )π = 0 by definition of the steady-state π.At the same time, from the definition of Ḟb (p) in (2), we have Combining the definition of Ġtot with (S1.5) and (S1.6) gives Eq. (S1.1), Finally, note that the entropy production rate Σ is always nonnegative: where the last inequality follows because the terms p j R c ij − p i R c ji and ln Let us briefly comment on the physical meaning of our derivation.First, we note that LDB holds when the control dynamics exhibit a separation of timescales, with fast equilibration within each mesostate and slower transitions between mesostates [73].
Second, recall our assumption that the control processes do not interact directly with the external source of free energy, but only with the internal reservoir and heat bath.This assumption is formalized in the particular form of the LDB condition (S1.4).It means that for the control processes, the combined "system+internal reservoir" may be treated as a single system coupled only to a heat bath at inverse temperature β.Hence, the rate of entropy production due to control is simply β times the decrease of the combined free energy of the system and internal reservoir, as in (S1.5).Moreover, owing to Second Law, their combined free energy cannot increase under control dynamics [26,29], B. Concavity of constrained optimization (10) Consider the objective Ḟb (p) + Ġb (p) − Σ(J )/β in (10) in the main text.In Sec.I C below, we show that the term Here we show that Σ(J ) is convex in J .This shows that the overall objective is concave in the pair (p, J ).
Consider any pair of feasible fluxes J ̸ = Y and λ ∈ (0, 1) and let To prove convexity, we write Here we used the log-sum inequality [74, Thm.2.7.1], This follows from nonnegativity of Σ(J ) and the fact that (11) has less constraints than (10).
We will show that G = G ′ as long as the feasible set Λ is not too constrained.In addition to the basic normalization and nonnegativity constraints on p and J , we allow Λ to encode a set of topological constraints like for i ̸ = j, where G is a symmetric matrix that determines the topology of the control dynamics (G ji = G ij = 1 when the transition i ↔ j is allowed, and otherwise G ji = G ij = 0).We assume that G is connected and contains all n states.Of course, we may have G ji = 1 for all i ̸ = j, in which case no topological constraints are imposed.We assume that Λ includes no other constraints.
Given these assumptions, we construct a sequence of control process (R c (κ), g c ) which satisfy LDB such that: 1.The combined steady state π(κ 2. The combined steady state obeys lim κ→∞ π(κ) = p * , where p * is the optimizer of (11).
3. The steady-state EPR vanishes lim κ→0 Σ(κ) = 0, therefore the steady-state harvesting rate obeys Our proof technique is related to the idea sketched out in Ref. [50], which studied the minimal cost of maintaining a nonequilibrium steady state.It is also related to constructions from the literature on counterdiabatic driving [75].

Construction of optimal control
We define a sequence of control processes parameterized by κ > 0. For each control process, the free energy exchanges with the internal reservoir are set to The control rate matrix R c (κ) is defined by scaling a given rate matrix B Here κ ≥ 0 is an overall rate constant and G is the matrix discussed in (S1.19).The particular choice of the topology encoded in G is arbitrary, given our assumption that it is connected and contains all n states which guarantees that R c is irreducible.It can be verified that the rate matrix R c (κ) defined in (S1.21) obeys LDB (6).It also satisfies detailed balance (DB) for the distribution p * , This implies that p * is the unique steady-state distribution, R c (κ)p * = 0.
We note that there are various other types of rate matrices that can be used in the construction, as long as they satisfy LDB and DB.However, the parameterization used in (S1.21) is common in the literature [76].

Achievability of the steady state p *
We show that π(κ), the steady-state of the combined rate matrix of R(κ) := R b + R c (κ), approaches p * in the limit of fast control (large κ).Since [R b + R c (κ)]π(κ) = 0 and R c (κ)p * = 0, we have Rearranging and applying the Moore-Penrose inverse B + to both sides gives Since B is irreducible by construction, B + B = I − p * 1 T .Plugging into (S1.24),we obtain which follows from where in the last step we used that the norm of any probability distribution is bounded by 1.This shows that lim κ→∞ ∥p * − π(κ)∥ = 0, meaning that the combined steady-state converges to p * .

II. BACTERIORHODOPSIN MODEL A. Details of model
Here we provide thermodynamic and kinetic details of the bacteriorhodopsin model analyzed in the main text.
The thermodynamic parameters are taken from Ref. [31], which reports in vitro measurements of internal free energies at 293 • K = 20 • C. Based on Fig. 7 in that paper, we use the following internal free energies for the 6 cycle states:  We use g ji to indicate the free energy transferred to the internal reservoir by the jump from state i ∈ {K, L, M 1 , M 2 , N, O} to state j ∈ {K, L, M 1 , M 2 , N, O}.Only transitions between neighboring states in the cycle are permitted: The values of g ji are zero for all transitions except for the transition M 1 ↔ M 2 , corresponding to proton transport across the membrane.This value is determined via Eq.( 12) as a function of the electrochemical potential (a.k.a.proton motive force), which includes contributions from electrical potential and pH difference across the membrane.To summarize: We emphasize that ∆ψ is the membrane electrical potential in volts, where ∆ψ ≥ 0 indicates that the inside is more negative.∆pH is the membrane pH difference, where ∆pH ≤ 0 indicates that the inside is more basic.No free energy is exchanged in the processes internal to each cycle state, so The dynamics are represented by the system's rate matrix.We parameterize the transition rate for the jump from state i ∈ {K, L, M 1 , M 2 , N, O} to neighboring state j ∈ {K, L, M 1 , M 2 , N, O} as where κ ji = κ ij is the relaxation rate for the transition i ↔ j, and ∆s tot ji = −∆s tot ij is the entropy produced during the transition i → j.This is a common parametrization which guarantees that the rates satisfy local detailed balance (LDB) [76].The entropy production for each jump i → j is where f i refers to internal free energies in joules (S2.30 Boltzmann's constant.As above g ji = −g ij is the increase of free energy of the internal reservoir, while m ji = −m ij is the decrease of free energy in the external source during the transition i → j.In fact, m ji = 0 for all i, j except the transition O ↔ K, which corresponds to photon absorption.The energy absorbed from the photon is hc/λ joules, where h is Planck's constant, c is speed of light, and λ is the photon wavelength.We use a physiologically plausible value of 580 nm [77].At this wavelength and temperature of 293 • K, Observe that the transition O → K is highly irreversible (r KO ≫ r OK ).In fact, our results are essentially the same regardless of whether the transition rate for this step is computed using (S2.33) or made absolutely irreversible.
The relaxation kinetics that enter into (S2.33)are taken from Table 1 in Ref. [32].We use the geometric mean of the upper and lower estimates of k −1 relax to get the following relaxation rates (in sec −1 ): For concreteness, here we provide the numerical values of entropy production for each jump i → j, at in vivo potential ∆ψ = 120 mV: The resulting rate matrix, again at the in vivo potential, is (in sec −1 )

B. Details of numerical analysis in Fig. 2
To calculate the curves plotted in Fig. 2, we first define the ordered set of transitions Then, for each value of the electrical membrane potential ∆ψ we perform the following: 1. Use the transition rates from (S2.33) to define the 'total' (baseline-and-control) rate matrix R and numerically solve for Rπ = 0 to obtain π, which is unique since R is irreducible (all states are connected, see Fig. S7).
2. Use the transition rates from (S2.33) and values of g from (S2.31) to calculate the total harvesting rate as The total harvesting rate is plotted as the thick black line in Fig. 2. (b) Define optimization parameters (p, J ) with p ∈ R n and J a vector in R n 2 constrained such that i p i = 1, we make all J ji = J ij = 0 for {i, j} ̸ = t.
(c) Use the transition rates and free energy values of f from (S2.30) to fix the "baseline harvesting rate" and the "increase of system free energy" (Eq.( 2)) functions: Ġb (p) and Σ(J ) into Eq.( 10) to solve for p * , J * and G (color-coded line in Fig. 2) using numerical optimization.
We emphasize that in the data shown in Fig. 2 we add no extra constraints.However, in the remaining of this supplementary material, we consider various additional linear constraints pertaining to activity, thermodynamic affinity and kinetic limitations.Such constraints are imposed at Step 3.b and determine the feasibility set Λ in Eq. ( 10).
We make three final observations regarding our bacteriorhodopsin model.First, our analysis assumes that control processes can be added without affecting membrane parameters, such as ∆pH and ∆ψ.In practice, this may be justified by homeostatic mechanisms.For instance, excess proton pumping produced by the introduction of control may be balanced by up-regulation of ATPase, which consumes the protons while making ATP.
Second, in order to consistent with LDB (6), changing the rate of control transition i → j, while keeping the internal free energy values fixed, may require changing the free energy used by that control transition g c ji .For transitions coupled to the membrane potential (such as M 1 ↔ M 2 ), this can be achieved by manipulating ∆ψ or ∆pH, which here act as external parameters (see (12)).However, for a transition that is not coupled to the membrane potential, manipulating g c ji may require, for example, the consumption of a chemical fuel such as ATP.The strength of driving can be modulated by regulating the nonequilibrium concentration of the chemical fuel.
Third, our model reproduces steady-state currents and parameter dependence (such as membrane potential) that agree with reported data, at least to a first approximation.At the same time, it is worth noting that the equilibrium constants reported in our source of kinetic data [32] differ somewhat from the free energy changes reported in our source of thermodynamic data [31].This may be due to different experimental or estimation methods, or because some reaction steps are not approximated well as elementary transitions.

C. Additional analysis: reprotonation step as control
In the analysis in the main text, we observe that near the in vivo membrane potential, the reprotonation reaction (N ↔ O) is the least efficient of all five transitions in T .In this section, we study this transition as control with additional constraints.
Our analysis can be motivated in the context of synthetic or natural selection for increased output in bacteriorhodopsin [35][36][37][38].Suppose that the genotype-phenotype map of bacteriorhodopsin is sufficiently modular such that the thermodynamic and kinetic properties of individual transitions in the cycle can be separately manipulated by mutations.In fact, in case of the reprotonation transition N ↔ O, there is evidence that a single amino acid in the bacteriorhodopsin protein specifically and effectively changes the kinetics of that transition [35].In this setting, we ask to what extend the existing N ↔ O transition is optimal.At the same time, we illustrate how underlying thermodynamic and kinetic constraints, e.g., as might arise from underlying diffusion timescales and biochemistry, can be incorporated when quantifying optimality.
In concrete terms, we let R b N O = R b ON = 0 and fix the flux parameters of the optimization problem (10) as Formally, we consider the following feasibility set in (10): along with other constraints discussed below.
To compare the result of optimization to the actual system, we also consider the actual transition as the control, defined via: 43) with all other R c ij = 0.The baseline-and-control dynamics correspond to the actual bacteriorhodopsin system described in Sec.II A. See also Fig. S8 (left).
For Sections II C and II D in this SM, it is useful to define the current (i.e., net flux) as where J * N O and J * ON are the optimal fluxes found by our optimization problem (10).In the setting of these sections, where the N ↔ O control transition is the missing step of the unicyclic bacteriorhodopsin system, J is the cyclic current in the presence of the baseline and the optimal control transition.

Constrained activity and affinity
We optimize the maximum harvesting rate with respect to the transition N ↔ O, while imposing an additional constraint on the "dynamical activity" of the fluxes N → O and O → N .This defines a smaller feasibility set in (10):

/sec
Harvesting rate for some value of a in units of sec −1 .This upper-bounds the dynamical activity of our control transition such that fluxes cannot be arbitrarily large.
The result of this analysis are shown in Fig. (S9).We begin by noting that, in some instances in the right plot in this figure, the optimized constrained control yields a lower harvesting rate than that of the actual system (black line).
For example, if a = 10 1 at around in vivo values, the actual harvesting rate is roughly double the one achieved by the constrained G a=10 1 (blue line).This is not too surprising since, from our computation of the actual harvesting rate at in vivo values, we read off (in sec −1 ): Thus, constraining the activity as in (S2.45) limits the ability of the optimized system to amplify the current above the actual in vivo current (see top left plot in Fig. S9).In contrast, increasing a approaches the optimal solution to the one shown for the N ↔ O transition in the main text (Fig. 2), here shown in red.Note also that for strongly constrained activity, such as a = 10 1 , the optimal distribution around the in vivo value shifts in the opposite way with Harvesting rate respect to the unconstrained case (blue against red bar-plots in the lower left of Fig. S9).
Finally, in the upper left of Fig. S9 we plot the current as a function of ∆ψ.We observe that the actual (baselineand-control) current effectively goes to zero (stalls) at large potentials, since there is not enough free energy in the cycle to push protons across the membrane.At low potentials, the actual current saturates at a larger value while the harvesting rate plummets.This is because at sufficiently low ∆ψ, transporting protons from inside to outside the cell actually drains free energy stored in the membrane potential, so the cycle operates as a dud.
Thermodynamic affinity constrained: Next, we add further constraints to the optimization problem by including thermodynamic affinity limitations, defined by the feasibility set: This linear constraint enforces a bound on the thermodynamic affinity of the N ↔ O transition, Results for this constraint are shown in Fig. S10 with the fixed choice of dynamical activity a = 10 2 sec −1 .
The combination of the constraint on dynamical activity and thermodynamic affinity is necessary to give physically meaningful results.To see why, imagine that only the thermodynamic affinity was constrained, as in (S2.47).Now consider some pair of a distribution p and flux vector J in our optimization problem (10), where J is restricted to Potential ∆ψ (mV) Harvesting rate

Constrained kinetics
Suppose that we optimize N ↔ O as control under a constraint involving the respective transition rates.In particular, assume that the possible rates are bounded by some value κ such that R c ON ≤ κ, R c N O ≤ κ (in units of sec −1 ).Then, this can be encoded in the following feasibility set: Note that Λ κ introduces constraints that mix both p and J .We also note that constraints on the transition rates are perhaps the most realistic ones when considering the constraints faced by by biomolecular machines [35].Harvesting rate curves for different κ are shown in the right plot of Fig. S11.

D. Proton-pumping as control
The proton-pumping steps (M 1 ↔ M 2 ) is perhaps one of the most experimentally accessible transitions to study.This is because it is the only transition that depends explicitly on the membrane potential ∆ψ, which can be experimentally tweaked and thus used as an external parameter.For example, the electrochemical membrane potential is frequently manipulated by introduction of ionphores, such as valinomycin [78].
In this section, we study this transition as control, possibly under constraints.Harvesting rate (k ).Middle: Log-linear plot of the optimized harvesting rates under thermodynamic affinity and dynamical activity constraints (Λ M1M2 A,a ) for values of A at fixed a = 50 sec −1 (green line) and a = 10 2 sec −1 (red line).Right: Log-log plot of the optimized harvesting rates as a function of the kinetic bound κ (Λ M1M2 κ ).All figures also show the value of the actual harvesting rate Ġtot at the in vivo membrane potential value (black dashed lines), which gives ∼ 70k B T /sec, and the maximum achievable harvesting rate that the system can achieve for any ∆ψ (gray line), ∼ 83.5k B T /sec.
In this analysis, the baseline rate matrix R b does not include the transition M 1 ↔ M 2 .In concrete terms, we let R b M2M1 = R b M1M2 = 0 and fix the flux parameters of the optimization problem (10) as Formally, we consider the following feasibility set in (10): along with other constraints discussed below.
To compare the result of optimization to the actual system, we also consider the actual transition as the control, defined via: with all other R c ij = 0.The baseline-and-control dynamics correspond to the actual bacteriorhodopsin system described in Sec.II A. See also Fig. S8 (right).
Recall that the proton-pumping transition is the only one that involves the membrane potential ∆ψ (see (S2.31)), and that it is no longer part of the baseline.For this reason, the optimization problem (10), which only involves baseline parameters, no longer depends on the choice of ∆ψ.We now proceed to study this problem for different choices of additional constraints on (p, J ).For this reason, in Fig. 2, the value for G when M 1 ↔ M 2 acts as control is a constant.In this case, it is interesting to compare the optimum harvesting rates with the values of Ġtot at in vivo membrane potential versus max ∆ψ Ġtot , the maximum value attained by Ġtot across all membrane potentials in Fig. 2 (right).Results are shown in Fig. S12.
Activity constraint: analogously to Sec.II C, we now define: for values of a in units of sec −1 .Solving (10) under Λ M1M2 a yields the blue curve in Fig. S12 for G as a function of a.
Thermodynamic affinity constraint: owing to the same reasoning as in Sec.II C, this constraint proves physically meaningful if combined with another constraint on the fluxes, such as the activity constraint (S2.51).In this case, we analogously define the feasibility set: In the inset of Fig. S12, we show the solution to (10) under for values of a range of A with fixed values of a = 50 sec −1 (green line) and a = 10 2 sec −1 (red line).
Kinetic constraint: analogously to Sec.II C, we define: for values of κ in units of sec −1 .Solving (10) under Λ M1M2 κ yields the orange curve in Fig. S12 for G as a function of κ.

III. CLOSED-FORM SOLUTIONS OF EQ. (11) IN DIFFERENT REGIMES
We consider a system of n states indexed by i ∈ {1, . . ., n} that evolves according to a Markovian master equation with rate matrix R = R b + R c , where R b and R c correspond to baseline and control processes respectively.Here we study the optimization (11) in three limiting regimes.
A. Linear Response (LR) regime, Eq. (15) We first consider (11) in the linear response regime, that is under the assumption that the optimal distribution is close to the baseline steady state p * ≈ π b , thereby arriving at (15).We assume that R b is irreducible and has a unique steady state π b with full support.Note that the assumption that π b has full support is satisfied when R b is "weakly reversible", meaning that R b ij > 0 whenever R b ji > 0.
First, we rewrite the entropic term in (13) as where in the last step we used that − i,j R b ij π b j ln π b i = 0 when π b is the steady-state distribution of the baseline.
We focus on the first term in (S3.55), and use the expansion where we defined Our variable to optimize is a probability distribution p, which needs to satisfy two constrains: i p i = 1 and p i ≥ 0.
The latter constraint holds automatically, given our hypothesis that p is very close to π b .The former constraint can be expressed in terms of z by setting We impose this linear constraint on z below, when redefining the optimization problem in terms of z rather than p.
Before we finish off with the entropic term, we introduce the following symmetric matrix: The matrix A is sometimes dubbed the "additive symmetrization" of R b .It will always obey the detailed balance condition (DB) for π b , π b j A ij = π b i A ji .Moreover, A = R b if and only if R b obeys DB.We note that only in this latter case would π b be an equilibrium distribution.We also define: Observe that M = D −1 AD, which implies that M and A are related via a similarity transformation so any right eigenpair (λ, u) of A, λ, D −1 u is an eigenpair of the symmetric matrix M .Also, since D and A are both symmetric, so is M .In particular, since A is the sum of two irreducible rate matrices, it is then itself an irreducible rate matrix which has a unique right eigenvector.It is easy to show that, in fact, π b is A's steady-state distribution with eigenvalue 0.Then, by similarity, M also has a unique eigenvector D −1 π b with eigenvalue 0.Moreover, since A is a rate matrix, it is negative semidefinite (λ α ≤ 0), and so M is also negative semidefinite.We indicate the right eigenvectors of A as u α and the eigenvectors of the symmetric matrix M as m α = D −1 u α .We assume that m α are normalized as ∥m α ∥ = 1, therefore ∥D −1 u α ∥ = 1.
Going back to (S3.56), we note that In order to keep track of the second term in (S3.55), let us first recall the free-energy term in (13) and the pay-off vector ϕ which, in terms of the baseline parameters, reads as and conveniently redefine it such that: For reasons that will become obvious below, we re-scale (S3.62) by setting We combine the above with the definition (S3.57) and the nonlinear term (S3.61).We then recast the optimization problem in terms of the variable z, which is constrained by (S3.58).Finally, under the LR regime, (11) is approximated by Eq. (S3.64) is a quadratic optimization problem, which can be solved using standard techniques from linear algebra.
For convenience we summarize these techniques in Lemma 1 in Sec.V of this SM.That theorem implies that where M + = α>1 λ −1 α m α m αT is the Moore-Penrose pseudo-inverse of M .In applying Theorem 1, we used result (S5.115) and the relation p * = π b + Dz * , where z * solves (S3.64).
We can rewrite (S3.65) using the eigensystem of M , Finally, as in the main text we define which can be interpreted as the harvesting amplitude of the eigenmodes of A. Plugging into (S3.65)gives Eq. ( 15) in the main text,

Region of validity of the LR regime
Note that the LR regime applies when ∥p * − π b ∥ ≪ 1.We can write where we used the triangle inequality and properties of the matrix norm.Therefore, the LR approximation is guaranteed to be valid when In other words, the harvesting amplitude on each eigenmode must be slower than relaxation modes, up to a factor that depends only on β, n, and the steady state π b .
B. Deterministic (D) regime, Eq. (16) In the Deterministic (D) regime, we assume that the nonlinear terms in (13) are small compared to the linear terms.
This allows us to approximate the optimal solution of (13) as the linear optimization where (ln p) T := (ln p 1 , . . ., ln p n ).The maximum in (S3.72) is achieved by (p * D ) i = δ ii * with i * = arg max i ϕ i , so as appears in the main text.

Region of validity of the D regime
We now discuss when the Deterministic regime approximation is valid.
We begin by making the assumption that the baseline harvesting rate is non-negative, Ġ(π b ) ≥ 0. In addition, for notational convenience, let K := max i (−R b ii ) indicate the largest escape rate.
We express the regime of validity in terms of the following two parameters: The parameter α reflects the balance between diffusion out of the optimal state versus Deterministic harvesting rate.
The parameter γ is the minimal steady-state probability of any state.
Assuming α < 1, we show that The RHS of (S3.76) vanishes for α → 0, so the LHS tightens as G /G D → 1.We emphasize that the D approximation only becomes accurate in relative, not additive, terms (i.e., it is not necessarily true that G − G D → 0).
To derive (S3.76), we first consider an upper bound on G .Observe that for any p, where we used R b ij p j ln p i ≤ 0 for i ̸ = j and S(p) ≤ ln n.Plugging into (S3.71)and using that G D ≥ ϕ T p for all p and γ ≥ ln n, we then have To derive a lower bound on G , define the distribution p i := (1 − α) δ ii * + απ b i , with α as in (S3.74).Plugging this distribution into the objective in (S3.71) yields where we used that we bound the first term on the right side of (S3.79) as where we used that ln α − γ < 0 in the last inequality.For the second term on the right side of (S3.79), We can plug (S3.80) and (S3.81) into (S3.79) to give a lower bound on G , where we used the definition of α (S3.74) and rearranged.Since ln α − γ < 0, we can further drop the α 2 term to give where in the second line we used the assumption Ġ(π b ) ≥ 0. Combining and using that α < 1 yields which can be rearranged to give (S3.76).
Interestingly, we can also derive a relative perturbation bound on the maximum increase of the harvesting rate, above the baseline harvesting rate.Specifically, using a similar derivation as above, we can show that C. Near-Deterministic (ND) regime, Eq. (17) In this regime, we consider a perturbation of the Deterministic regime: we assume that the optimal p * is close to a delta-function distribution centered at i This allows us to approximate the entropic term in (S3.71) as where we used ln(1 − x) ≈ −x when x ≈ 0. Plugging into (S3.71)gives where we recall that G D = ϕ i * , which does not depend on p.After maximizing with respect to {p i } i:i̸ =i * by taking derivatives and setting them to zero, one obtains Plugging (S3.88) into (S3.87)gives

Region of validity of the ND regime
The ND approximation applies when (p * ND ) i * ≈ 1, or equivalently when For convenience, we define the harvesting gap as the difference in the pay-off between the optimal state i * and the second optimal state, By combining and rearranging the above, we can simplify the validity condition for the ND approximation as Note that this condition also guarantees that (p * ND ) i ≥ 0 for i ̸ = i * .Expression (S3.90) implies that ND approximation is valid when the harvesting gap is much larger than the rates of escape from the optimal state i * .

IV. UNICYCLIC MODEL
Here we analyze the unicyclic model considered in the main text.Before proceeding, we briefly introduce some useful facts about the eigendecomposition of a unicylic rate matrix.

A. An Algebraic Aperitif: eigendecomposition of unicyclic rate matrix
Consider a unicylic rate matrix R b , which is a symmetric circulant matrix.Due to symmetry, its steady state is uniform: The eigensystem for R is obtained from the theory of circulant matrices [45], which, for odd n, yields: where ω a := e i2π(a−1)/n .These eigenvalues are all degenerate twice (except λ (1) = 0, whose eigenvector is simply )).An orthonormal choice of eigenbasis is given by the set

B. Transition harvesting cycle
We now analyze the unicyclic model using techniques developed in Sec.III.We consider a systems with n state arranged in a ring, where baseline transitions are symmetric with uniform kinetics: (see Fig. S13 here and Fig. 4 (a) in the main text).The baseline dynamics are equivalent to a random walk on a one-dimensional ring, and the baseline rate matrix is given by (S4.91).
We assume that a single transition, taken to be 1 → 2 without loss of generality, harvests g b 21 = −g b 12 = Θ joules of free energy.In addition, note that in the main text we chose to set the units of k = 1 without loss of generality.
However, we have kept k explicit in the rest of the SM.For this system, the baseline steady state is uniform, π b i = 1/n, and the harvesting vector is ϕ LR = ϕ = (kΘ, −kΘ, 0, . . ., 0) .(S4.94) Figure S13.A unicyclic system across the n states with symmetric back-andforth rates k, as discussed in the main text.An amount of Θ joules is harvested as free energy into the internal reservoir for each transition 1 → 2 (and vice versa for 2 → 1).
In the rest of this section, we analyze this model in the LR, D and ND regimes.We will also discuss the validity of these approximation in terms of the model parameters.

LR regime
We first consider the LR solution (S3.We can compute Ω a = β −1 v T m α for a = 2, . . ., n, using the eigenvector set given in (S4.93),which simply yields which is depicted in Fig. S14(a).(Note that, due to our choice of a complex-valued eigenbasis, we must be careful in adding the complex conjugate on the Ω a , following the ordering of the operator M + in (S3.68).)Now, component Figure S14.The optimal ∆p * LR (deviation of optimal distribution from the uniform baseline steady state) given (S4.99).The optimal distribution exhibits a clockwise cyclic current given by (S4.100),The magnitude of ∆p * LR decays as ∼ n −1 , so larger rings will result in optimal distributions that are closer to the baseline steady state.Thus, in the LR regime, the optimal distribution builds up in equal increments starting at i = 2 until the optimal state i = 1, after which it falls off a cliff.This is shown in Fig. S14.
We can also compute the probability current across the transition 1 → 2, which we leave as an exercise for the The expressions (S3.69) and (S3.70) suggest for which values of Θ the optimal solution will be in the LR regime.

C. State harvesting cycle
In this section, we consider the second unicyclic model, see Fig. S15 and Fig. 4 (b) in the main text.In this model, the baseline dynamics correspond once again to the unicycle with symmetric rates, as in Sec.IV B. However, harvesting is not coupled to transitions, but rather to internal fluxes within a single mesostate.Without loss of generality, we choose this mesostate to be i = 1.In other words, we let ġb 1 = θ > 0 and ġb 1<i≤n = 0.There is now no preferential direction for the probability current.The free energy harvested per unit time when the system is in mesostate 1 is given by the parameter θ, which carries the same units as G .Just as before, the baseline steady state is uniform, i.e.

LR Regime
In analogy with the previous example, the expression for the upper bound in free energy harvesting rate under the LR regime here is obtained as with Ω a = β −1 v T m α , where the vectors m α correspond to the eigensystem discussed in discussed in Sec.IV A, i.e. in (S4.93).On the other hand, here v = βθ 2 √ n (1, 0, . . . , 0).The latter is obtained by construction using (S3.57),(S3.62) and (S3.63).Thus, Figure S16.The optimal ∆p * LR (the deviation of the optimal distribution from the uniform baseline steady state) from (S4.109).Note that there optimal distribution does not lead to a cyclic current.
For large even n, it is possible to show that [80]  If we wanted to approximate the result above for a large odd n, we would need to add a term proportional to 1/4n 2 , which is of second order, hence the leading order is still captured by expression (S4.108).
We can also study the optimal distribution using (S3.Then, using ∥D∥ = 1/ √ n, condition (S3.70) will read as: Expression (S3.70), however, is not necessarily a tight bound.That is, the result hereby obtained is a sufficient condition for the LR regime to be valid, but not a necessary one.

D and ND Regimes
The optimal mesostate is i * = 1.Hence Next, by following a similar procedure as above, we denote ∆ϕ (S4.112) The harvesting rate attainable in this regime can be calculated using Eq.(S3.89), .113) n×n 2 is the incidence matrix with entries B k,ij = δ ki − δ kj , which guarantees R c π = BJ c .Combining, we arrive at the bound Ġtot ≤ G , where G = sup (p,J)∈Λ:R b p+BJ=0 Figure 2. (a) Comparison of the actual harvesting rate Ġtot at different electrical potentials ∆ψ, versus maximum rate G achieved by optimizing five intermediate transitions (color scheme from Fig. 1 Right).(b) Efficiency Ġtot /G computed while separately optimizing each transition, with colors as in (a).(c) The actual steady state π versus the optimal distribution p * when optimizing the N ↔ O transition (at ∆ψ = 120 mV).
Figure 3. (a) Comparison of the actual harvesting rate Ġtot at different electrical potentials ∆ψ, versus maximum rate G achieved by fixing the bacteriorhodopsin cycle as baseline and allowing any additional transitions as control.(b) Efficiency of the actual bacteriorhodopsin cycle with respect to the optimized cycle.(c) The actual steady state π and optimal distribution p * (at ∆ψ = 120 mV).

Figure 4 .
Figure 4. (a) Unicyclic system where free energy Θ is harvested by a single transition.(b) Unicyclic system where free energy per unit time θ is harvested when the system is in a particular mesostate.

Figure 5 .
Figure 5. (a) Maximum harvesting rate G for the unicyclic system from Fig.4(a), as a function of supplied free energy Θ.Exact value is found numerically, LR, D, and ND are calculated using approximations described in the text.Exact and approximate optimal distributions p * in ND (b) and LR (c) regimes are shown, with the optimal state i * = 1 located in the middle of the histograms.(d) G and its LR approximation for different Θ and system sizes n.

Figure 6 .
Figure 6.(a) Maximum harvesting rate G for the unicyclic system from Fig. 4 (b), as a function of supplied free energy rate θ.Exact value is found numerically, LR, D, and ND are calculated using approximations described in the text.Exact and approximate optimal distributions p * in ND (b) and LR (c) regimes are shown, with the optimal state i * = 1 located in the middle of the histograms.(d) G and its LR approximation for different θ and system sizes n.
sign.This can be seen as a statement of the Second Law.

Figure
Figure S7.Bacteriorhodopsin cycle.Using the irreversible character and fast speed of the bR → K transition, we coarse-grain the cycle from the original seven-state system (left) to a six-state one (right).Circled region in gray shows the states and transition that is coarse-grained into a single state.

3 .
For each transition t ∈ T that acts as control: (a) Define the baseline transition matrix R b by removing the chosen transition from R. As standard, the diagonal entries R b ii are determined by R b ii = − j R b ji .

Figure S8 .
Figure S8.Depiction of choices for baseline and control for Sections II C (left) and II D (right) of the SM.In each case, the baseline (shaded blue) does not include the transition of reprotonation (N ↔ O) and proton-pumping (M 1 ↔ M 2 ), respectively;instead, these are treated as control (shaded yellow).For both scenarios, as in the analysis done in the main text, note that no cyclic current circulates under baseline dynamics but does so under baseline-and-control (i.e. the actual) dynamics.

Figure
Figure S9.Right: actual harvesting rate (in black) and optimized harvesting rate for controlled transition N ↔ O at different values of activity constraint a. Dashed vertical line shows in vivo membrane potential (120mV).Left panel shows the respective currents (upper plot) and the distribution at in vivo value of ∆ψ (lower plot) for the actual system versus the optimized solutions under a = 10 1 and a → ∞, in units of sec −1 .

Figure
Figure S10.Right: actual harvesting rate (in black) and optimized harvesting rate for controlled transition N ↔ O at different values of maximum affinity A and maximum activity a = 10 2 sec −1 .Dashed vertical line shows in vivo membrane potential (120mV).Left panel shows the respective currents (upper plot) and the distribution at in vivo value of ∆ψ (lower plot) for the actual system and the optimized solutions under A = 0.1 and A = 0.8.
have non-zero transitions only for N → O and O → N , and satisfy the steady-state condition BJ = −R b p.These fluxes can be increased as J N O → J N O + α, J ON → J ON + α for α ≥ 0. This transformation doesn't change the current across the transition N → O (S2.44), so the steady-state constraint is still valid for the same p (note that BJ only depends on the currents, i.e., net fluxes).Finally, by choosing α large enough, we can make the EPR term in(10)

Figure S11 .
Figure S11.Right: actual harvesting rate (in black) and optimized harvesting rates for controlled transition N ↔ O at different values of kinetic constraint κ.Dashed vertical line shows in vivo membrane potential (120mV).Left panel shows the respective currents (upper plot) and the distribution at in vivo value of ∆ψ (lower plot) for the actual system and the optimized solutions under κ = 10 1 and unconstrained κ.
vanish and satisfy the affinity constraint (S2.47), since lim α→∞ | ln[(J N O + α)/(J ON + α)]| = 0.This shows that, lacking other constraints, the affinity constraint is vacuous, since it can be always satisfied by making the one-way fluxes sufficiently large.On the other hand, the combination of the activity and affinity constraints sets a bound on the fluxes.

Figure S12 .
Figure S12.Left: Log-log plot of the optimized harvesting rates as a function of dynamical activity bound a (Λ M1M2 a .84) where α = K/β(G D − Ġ(π b )) ≥ 0 and γ is defined as before.In fact, this perturbation bound on the increase holds without any additional assumptions, i.e., regardless of the sign of Ġ(π b ).

Figure S15 .
Figure S15.We consider a unicyclic system with transitions across the n states with symmetric back-and-forth rates k.A single state (labelled i = 1) leads to a flux of free energy at rate θ.