Limits of sensing temporal concentration changes by single cells

Berg and Purcell [Biophys. J. 20, 193 (1977)] calculated how the accuracy of concentration sensing by single-celled organisms is limited by noise from the small number of counted molecules. Here we generalize their results to the sensing of concentration ramps, which is often the biologically relevant situation (e.g. during bacterial chemotaxis). We calculate lower bounds on the uncertainty of ramp sensing by three measurement devices: a single receptor, an absorbing sphere, and a monitoring sphere. We contrast two strategies, simple linear regression of the input signal versus maximum likelihood estimation, and show that the latter can be twice as accurate as the former. Finally, we consider biological implementations of these two strategies, and identify possible signatures that maximum likelihood estimation is implemented by real biological systems.

Cells are able to sense concentration gradients with high accuracy. Large eukaryotic cells such as the amoeba Dictyostelium discoideum and the budding yeast Saccharomyces cerevisiae can sense very shallow spatial gradients by comparing concentrations across their lengths [1]. By contrast, small motile bacteria such as Escherichia coli detect spatial gradients indirectly by measuring concentration ramps (temporal concentration changes) as they swim [2], and can respond to concentrations as low as 3.2 nM-about three molecules per cell volume [3]. The noise arising from the small number of detected molecules sets a fundamental physical limit on the accuracy of concentration sensing, as originally shown in the seminal work of Berg and Purcell [4,5]. This approach was recently extended to derive a fundamental bound on the accuracy of direct spatial gradient sensing [6]. However, no theory exists for the physical limit of ramp sensing, which is what bacteria actually do when they chemotact. In this Letter, we present such a theory for different measurement devices, from a single receptor to an entire cell. We contrast two strategies: linear regression (LR) of the input signal (in line with Berg and Purcell) and maximum likelihood estimation (MLE) [7,8], a method from statistics to optimally fit a model to data, revealing an up to twofold advantage for the latter. Finally, we introduce a biochemical signaling network, similar to the E. coli chemotaxis system, that outputs an estimate of the ramp rate. Consistent with the derived theoretical bounds, we find that a mechanism emulating MLE yields twofold higher accuracy that one emulating LR. However, this improved performance has a cost: either storage of signaling proteins near the receptors, or irreversibility of the receptor cycle with concomitant energy consumption.
Sensing small numbers of molecules implies relative noise ∼n −1/2 , where n is the number of detected molecules. Berg and Purcell (BP) calculated how this noise affects the accuracy of concentration sensing [4]. They considered three types of measurement devices: a single receptor, a perfectly absorbing sphere, and a per- fectly monitoring sphere. Following their approach, we investigate ramp sensing by these three devices when presented with a concentration c(t) = c 0 + c 1 t, as schematized in Fig. 1.
A single receptor [ Fig. 1(a)] binds particles at rate k + c(t) and unbinds them at rate k − . Following BP, we assume that diffusion is fast enough that the receptor never rebinds the same particle. An ideal observer has access to the binary time series s(t) of receptor occupancy between −T /2 and T /2. The lengths of bound and unbound invervals have exponential distributions with means 1/k − and 1/k + c, respectively. Throughout, we assume that the ramp is shallow, c 1 T c 0 , and that the observation time is long compared to receptor kinetics, In BP, the true concentration c is estimated from the fraction of time the receptor is bound, , which is equal to the equilibrium occupancy in the limit of large times: where · represents an ensemble average. Following a similar strategy, we can estimate the ramp rate by performing the linear regression of s(t) to s 0 + s 1 t: from which the concentration and the ramp rate are estimated using (A18) as: The uncertainties of these estimates can be calculated from the time correlations of receptor occupancy (see Appendix A 1 a), yielding: where n is the total number of binding events in the time T . Note that the result for c 0 is precisely that of BP [4,8].
In [8], it was shown that the accuracy of concentration sensing could be improved using maximum likelihood estimation. In this scheme, the parameters of the model are chosen to maximize the probability ("likelihood") that the observed data was generated by the model. Can we also improve the accuracy of ramp sensing over LR by using this method? The time trace s(t) can be characterized by the series of binding (t + i ) and subsequent unbinding (t − i ) times, i = 1, . . . , n. The probability of the data within our model is [8]: where T b is the total bound time. The concentration and the ramp rate, c 0 and c 1 , are the model parameters.
Given the times of the events, the likelihood is maximized with respect to c 0 and c 1 by solving ∂P /∂c 0 = 0 and ∂P /∂c 1 = 0, from which the maximum likelihood es- ) is obtained. In general these equations have no simple solution, but we can obtain the average behavior by exploiting the fact that binding and unbinding are fast with respect to concentration changes, i.e. that the receptor remains adiabatically in equilibrium with the concentration c(t). We can thus simplify the sum and product in (5): where s(t) is the equilibrium occupancy at time t, given by (A18) with c =c 0 +c 1 t, wherec 0 andc 1 are the true parameters that generated the data. Applying this approximation to ∂P /∂c 0 , ∂P /∂c 1 , we confirm that c MLE 0 =c 0 and c MLE 1 =c 1 for T → ∞ (see Appendix A 1 b). For finite times, the errors in c MLE 0 , c MLE 1 can be estimated by the Cramér-Rao bound [9], which states that the variance of parameter estimates exceeds the inverse of the Fisher information, and approaches equality in the limit of long time series: where δc = (c MLE 0 −c 0 , c MLE 1 −c 1 ) and ∂ c = (∂/∂c 0 , ∂/∂c 1 ). Again we can use the adiabatic approximation to compute the Hessian of the log-likelihood on the right-hand side of (A21), to obtain: These variances are half the ones obtained from LR (4). The first result for constant concentrations is that of [8].
As observed there, the LR estimate adds the uncertainties from both bound and unbound interval durations. In contrast, the maximimum likelihood estimate relies only on unbound interval durations, since these carry all the information about the concentration. We now turn to ramp sensing by an entire cell, starting with the case of an idealized absorbing sphere [ Fig. 1(b)]. An ideal observer witnesses a time series of absorption events, described by the instantaneous current I(t) = n i=1 δ(t − t i ), where δ(t) is the Dirac delta function and {t i } are the absorption times. The average current of molecules impinging on the sphere is given by I(t) = 4πDac(t), where D is the diffusivity, a the sphere radius and c(t) the concentration far from the sphere [4]. Applying the same methods used for the single receptor, we calculated the uncertainty of ramp sensing for linear regression of I(t) as well as for MLE (see Appendix A 2). We found no difference between the two strategies, which both yield the same uncertainties as in (9), with n now the total number of molecules absorbed during time T : n ≈ 4πDac 0 T . For a monitoring sphere [ Fig. 1(c)], molecules are free to diffuse into and out of the sphere, and the observer records the number N (t) of particles inside the sphere as a function of time. On average this number is N (t) = (4/3)πa 3 c(t). Performing a linear regression of N (t) to N 0 + N 1 t, one can estimate the concentration and the ramp rate through c LR 0 := 3N 0 /4πa 3 and c LR 1 := 3N 1 /4πa 3 . Following [4], the uncertainty of these estimates can be calculated from the time autocorrelation of N (t) (see Appendix A 3), yielding: The first result was obtained in [4]. Maximum likelihood is difficult to implement in the context of the monitoring Binding of ligand to the receptor increases its activity u and causes species x to be produced. This production is downregulated by a feedback factor y which is itself catalyzed by x. Right: average network response to a step function in the concentration, c(t) = c0 + ∆c θ(t − t0) (solid curves) and to a ramp, c(t) = c0 +c1(t−t0)θ(t−t0) (dotted curves). In response to the step function, the network adapts precisely and x decays back to its original value after an initial increase. In response to a ramp, x shifts by an amount proportional to the ramp rate. The quantitative ability of the network to sense such ramps depends on whether receptors signal continuously or in a discrete burst upon ligand binding.
sphere because it requires a sum over all possible histories of particles exiting and returning to the sphere. Thus, whether the LR result can be improved upon remains an open question. Maximum likelihood estimation is in general the optimal way to sense ramps, and provides a twofold improvement over simple linear regression in the case of the single receptor. Could MLE be implemented in biological systems? To address this question, we now introduce a simple, deterministic biochemical network ( Fig. 2) that can approach the optimal performance limit set by MLE. The same network implements either LR or MLE depending on the receptor signaling mechanism: LR is implemented if each receptor signals continuously while a particle is bound; MLE is implemented if each receptor signals with a fixed-size burst upon binding a particle, and then releases the particle rapidly. The first case corresponds to integrating the fraction of time the receptor is bound, while the second corresponds to counting binding events. Accordingly, we will show that the shot noise (Poisson noise) due to the stochastic nature of binding and unbinding is twice as large in the first case as in the second. Let u(t) be the receptor activity, proportional to the instantaneous production rate of signaling molecules. For continuous signaling, this activity is simply proportional to receptor occupancy: u(t) = αs(t), whereas for burst signaling, u(t) is a series of fixed-size bursts at the times of binding: u(t) = β n i δ(t − t + i ). Without loss of generality, we set α = k − and β = 1 so that u(t) is equal to the mean rate of binding events in both cases, u(t) = k − k + c(t)/(k − + k + c(t)). For averaging times much longer than 1/k − and 1/k + c, we can approximate the fluctuations of u(t) by Gaussian white noise, 2 , with g = 2 for continuous signaling, and g = 1 + (k + c/k − ) 2 for fixed-size burst signaling (see Appendix B 1). For rapid unbinding, k − → +∞, we recover the same twofold difference as between (4) and (9), and for the same reason: in the case of continuous signaling, noise from the stochasticity of bound intervals adds to the noise from random arrivals.
To extract the ramp rate from receptor activity requires a network that "takes the derivative" of its input signal. An example is the E. coli chemotaxis system, which relies on precise adaptation via integral feedback [10,11]. A minimal deterministic version of such a network is schematized in Fig. 2 and described by the following differential equations: where for simplicity u(t) is the activity of a single receptor, and x is the concentration of signaling molecules it produces. f (y) is a monotonically decreasing function regulating the production of x. The role of y is similar to that of the receptor methylation level in E. coli: y precisely adapts the production rate of signaling molecules so that the steady-state value of x does not depend on the external ligand concentration. This property is illustrated by the graphs on the right side of Fig. 2, which show how the network responds to a sudden change in ligand concentration (solid curves). While the network output x is insensitive to the absolute concentration, it responds to steady ramps (dotted curves). When the input varies slowly in time, u(t) = u 0 + u 1 t (with u 1 u 0 k x , u 0 k y ), the system responds by shifting x away from 1 so that the change in y(t) tracks the change in u(t): with u 0 f (y 0 ) = 1 and γ = −f (y 0 )/f (y 0 ). Thus y provides a readout of the absolute concentration, and x provides a readout of the ramp rate. The accuracy of these representations is limited by the ligand binding shot noise δu(t). The effect of noise can be calculated by expanding the solution of (11) linearly around its average (see Appendix B 2): where ω 2 = k 2 x /4 − k x k y /γ (ω can be imaginary). From (12) we deduce the uncertainties of c 0 and c 1 : For a fixed k y , the optimal value of k x is the smallest one with a non-oscillating response kernel K(t): k x = 4k y /γ. Systems with oscillating kernels are undesirable because they detect oscillations rather than ramps. For k x = 4k y /γ, our results are consistent with those of Eqs. (4) and (9), namely uncertainties inversely proportional to the number of binding events, if we interpret γ/k y → T as the effective time of measurement, and u 0 as the rate of binding events. The factor g reflects the difference between the two mechanisms of receptor signaling. Despite its simplicity, our biochemical model may help analyze features of real biological systems. There are two separate aspects to the model: on the input side, different mechanisms of receptor signaling-continuous signaling (LR) versus burst signaling (MLE)-affect readout accuracy; on the output side, integral feedback provides a natural readout for sensing ramps.
Many receptors, including the well-studied chemotaxis receptors of E. coli, signal continuously rather than in bursts, and therefore do not employ MLE. In practice, how could cells implement MLE? A receptor could simply "store" a fixed amount of signaling molecules and release all of them upon ligand binding. Alternatively, receptors could signal continuously following a binding event but with a narrowly peaked distribution of durations. Our results can easily be extended to an arbitrary distribution of durations τ b , yielding g = 1 + (δτ b ) 2 / τ b 2 : the more peaked the distribution of τ b , the less noisy the readout. For equilibrium binding/unbinding, we find g ≥ 2 (see Appendix C), with an irreversible binding cycle driven by energy dissipation required to achieve g < 2. Interestingly, there are examples of such irreversible cycles in ligand-gated ion channels [12][13][14], where ions play the role of our output signal x. In these ion channels, peaked open-time distributions are interpreted as evidence that time reversibility is broken and energy is being consumed [15]. We speculate that the role of this irreversibility may be to reduce the variance of bursts, thereby increasing the accuracy of concentration or ramp sensing. Relatedly, a multiplicity of irreversible steps in rhodopsin signaling has been shown to explain the reproducibility of singlephoton responses in rod cells [16].
As for the mechanism of ramp sensing, the integral feedback system underlying E. coli chemotaxis is similar to our simple model. However, the receptor methylation level, which plays the same role as y in our model, adjusts the binding/unbinding rates k + /k − so that k − ≈ k + c, rather than adjusting the production rate k x f (y)u as in (11). In E. coli receptors increase their gain by responding cooperatively [17], and k − ≈ k + c is required to maximize this gain, which precludes the limit k − k + c required for MLE. Moreover, k + is physically limited by diffusion and receptor size, and should optimally be kept near the diffusion limit to maximize the number of binding events. It is worth noting that in E. coli the methylation and demethylation processes responsible for integral feedback are themselves subject to noise, giving rise to additional fluctuations [18]. For a receptor signaling in bursts, integral feedback could act by adjusting the number of released molecules upon binding if the receptor stores molecules, or the mean bound duration τ b if signaling is continous, or the channel conductivity in ligand-gated ion channels. We hope that our analysis will suggest experiments for testing these scenarios.
We thank Pankaj Mehta and Aleksandra Walczak for helpful suggestions. T. M. was supported by the Human Frontier Science Program and N.S.W. by National Institutes of Health Grant No. R01 GM082938.
Appendix A: Uncertainties in ramp sensing

Single receptor
We consider a single receptor, which particles bind to and unbind from stochastically. The unbinding rate is denoted by k − , and the binding rate by k + c(t), where c(t) is the concentration of particles. We assume fast diffusion and so neglect rebinding of particles [5]. Between the times −T /2 and T /2, the concentration changes slowly with time: c(t) = c 0 + c 1 t. We assume that variations are small: c 1 T c 0 , and that the integration time T is large compared to the waiting times:

a. Linear regression
We perform linear regression of the binary time series s(t) of receptor occupancy s(t) to s 0 + s 1 t. The time trace s(t) can be characterized by the series of binding (t + i ) and subsequent unbinding (t − i ) times, i = 1, . . . , n. The resulting estimates for c 0 and c 1 are: where are the results of the linear regression.
In the limit of long time series, these estimates give the correct answers: replacing s(t) by its ensemble average value, s(t) = k + (c 0 +c 1 t)/[k − + k + (c 0 +c 1 t)], wherẽ c 0 ,c 1 are the true values of the concentration and the ramp rate, one obtains: which leads to c LR 0 =c 0 and c LR 1 =c 1 on average. The expected error can be obtained from the covariance matrix δs t δs . To compute this covariance matrix we need the following quantity: where the approximation ≈ is valid provided that |t | c 0 /c 1 . In the limit of large times we have: from which we deduce (using Eqs. A1, A2 and c 1 T c 0 ): The above result for c 0 was originally obtained by Berg and Purcell [4].

b. Maximum likelihood
The probability ("likelihood") of a sequence of binding and unbinding events at times {t + i } and {t − i } between −T /2 and T /2, is: (A13) The log-likelihood is (up to a constant independent of c 0 and c 1 ): Given the times of the binding and unbinding events, the optimal strategy for estimating the concentration and the ramp rate is to maximize this log-likelihood with respect to c 0 and c 1 , i.e. to solve ∂ log P ∂c 0 = 0 and ∂ log P ∂c 1 = 0, . To solve Eq. A15, we exploit the self-averaging property of log P as T → +∞ (adiabatic approximation). The sums in Eq. A13 and A14 become: where s(t) is the equilibrium probability of the receptor being bound at time t: andc(t) =c 0 +c 1 t is the true concentration. We take the derivatives of the log-likelihood (Eq. A14, with replacements from Eqs. A16 and A17) with respect to c 0 and c 1 : It is straightforward to confirm that these expressions become zero for c 0 =c 0 and c 1 =c 1 . Thus, in the limit of long time series, the maximum likelihood estimates coincide with the true values of the concentration and the ramp rate. The expected error of these estimates is given by the Cramér-Rao inequality, which becomes an equality in the limit of large T : where δc = (c MLE 0 −c 0 , c MLE 1 −c 1 ) and ∂ c = (∂/∂c 0 , ∂/∂c 1 ).
The second derivatives of the log-likelihood are, to leading order: Exploiting the diagonal structure of the Hessian, we obtain: where n is the total number of binding events in the time series, n ≈ k − k + c 0 T /(k − + k + c 0 ). The above result for c 0 was first obtained in [6].

Absorbing sphere
We now consider as a measurement device a perfectly absorbing sphere of radius a. Solving Laplace's equation with absorbing boundary conditions at the surface of the sphere yields the average flux of particles impinging on the sphere [4]: where D is the diffusivity, a is the sphere radius, and c(t) = c 0 + c 1 t is the (time-dependent) concentration far away from the sphere. When the ramp arises from motion of the sphere up a spatial gradient (with velocity v), one must be careful: in general the flux of particles is not uniform around the sphere, and will differ between the front and the rear of the sphere (windshield effect) [4]. However, this effect can be neglected when measuring the change of concentration with time if the particle turnover rate D/a 2 is much larger than rate of moving one body length v/a, i.e. if D/a v. For bacteria the relevant numbers are D ≈ 10 3 µm 2 /s, a ≈ 1 µm, and v ≈ 30 µm/s, so that D/a ≈ 10 3 µm/s. Therefore, from the perspective of a swimming bacterium, the concentration appears to be changing with time, essentially uniformly in space, which is the case we consider.

a. Linear regression
For a constant concentration, the simplest estimate for the concentration is c LR 0 = n/4πDaT , where n is the total number of absorption events in time T . In the presence of a ramp, an estimate of both concentration and ramp rate can be obtained by performing the linear regression of to 4πDa(c 0 + c 1 t), where the {t i } are the absorption times, yielding: (A30) Since the particle absorptions are independent events, n is a Poisson variable, (δn) 2 = n = 4πDac 0 (as above, c 0 ,c 1 are the true values of the concentration and the ramp rate), and thus (δc 0 ) 2 /c 2 0 = 1/n. Concerning the ramp rate, we have and where we have used the fact that over a short time δt, the number of absorbed particles is a Poisson variable, which entails: Therefore: Finally, in summary: where n is the total number of absorption events in time T .

b. Maximum likelihood
The observations by an absorbing sphere in the interval (−T /2, T /2) are summarized by the sequence of times when particles are absorbed, {t i }, i = 1, . . . , n. The loglikelihood of this sequence reads: where I(t) c0,c1 is the expected average flux for concentration c 0 and ramp rate c 1 . Setting the derivatives to zero, d log P/dc 0 = 0, d log P/dc 1 = 0, yields to leading order: where we have used c 1 T c 0 . In the limit of long time series, the sums self-average and become: To obtain the uncertainties, we calculate the second derivatives of the log-likelihood and use the Cramér-Rao bound. After a calculation similar to Eqs. A22-A24, we find: Therefore, maximum likelihood estimation is equivalent to linear regression in the limit of long time series. (One can similarly show that the same conclusion applies to the sensing of spatial gradients: the estimate from linear regression derived in [6] is equivalent to maximum likelihood estimation.)

Linear regression
Following Berg and Purcell [4], we now consider a perfect monitoring sphere of radius a, which lets particles freely diffuse in and out of the sphere. This device records the number of particles N (t) inside the sphere at all times, with ensemble average N (t) = (4/3)πa 3 c(t). The particle turnover time, a 2 /D (where D is the diffusivity), is assumed to be small compared to the total time T of observation. Regressing N (t) to (4/3)πa 3 (c 0 + c 1 t) yields an estimate for the concentration and the ramp rate: where V = (4/3)πa 3 is the volume of the sphere. In the limit of long times, this estimate gives the true value of the concentration and the ramp rate: c LR 0 =c 0 , c LR 1 =c 1 . The uncertainties are given by the variances of c LR 0 and c LR 1 . For example, the uncertainty of the concentration estimate is: To compute δN (t)δN (t ) , we decompose N (t) as a sum over M independent particles in a large volume U V : is a binary variable for each particle, whose value is 1 if the particle is in the sphere, and 0 otherwise. We assume that M and U are large, such that M/U =c 0 +c 1 t. The ramp can arise from a change in volume U , or from creation/annihilation of particles. For definiteness, we will assume that the number of particles M stays fixed, while the confining volume U varies in time. Since the particles are independent of each other, we have δx i (t)δx j (t ) = 0, and therefore: where x(t) the binary indicator variable of a single particle, and δx(t) = x(t) − x(t) . In the limit of large time differences |t − t | D/a 2 , the autocorrelation function δx(t)δx(t ) decays to 0. Since T D/a 2 , Eq. A48 simplifies to: where we have used time translation invariance and time-reversal symmetry between t and t . The quantity x(0)x(t) is the probability that a particular particle was in the sphere at times 0 and t, i.e. P (x(0) = 1)P (x(t) = 1|x(0) = 1) = x(0) u BP (t), with x(0) = (c 0 +c 1 t)v/M and u BP (t) := P (x(t) = 1|x(0) = 1) where the integrals are over the sphere of radius a. A calculation familiar from electrostatic done in [4] yields: Combining the above results and using x(0) x(t) = (cV /M ) 2 , we obtain the Berg and Purcell bound: The calculation of the uncertainty of the ramp rate proceeds very similarly: where we have used t ≈ t. This approximation is justified by the fact that for |t−t | D/a 2 , δx(t)δx(t ) → 0. Finally we obtain: Appendix B: Uncertainties in a ramp-sensing biochemical network

Input noise
We first consider the most general case of the biochemical network shown in Fig. 2 of the main text. Ligands bind the receptor at a rate k + c(t). When a ligand binds to the receptor, it remains bound for a time τ b , and β signaling molecules are released. The distributions of both τ b and β can be arbitrary. For continuous signaling, particles are produced at a rate β while the receptor is bound, yielding β = ατ b , while for fixed-size burst signaling, β particles are realeased directly upon binding at time t + i . Both scenarios can be described by the instantaneous receptor activity: Continuous signaling u(t) = αs(t), where, in the first equation, s(t) = 1 when the receptor is bound and 0 otherwise. When there is only one bound state, τ b has an exponential distribution, with τ b = 1/k − and (δτ b ) 2 = 1/k 2 − . For averaging times much longer than the durations of bound and unbound intervals, u(t) can be approximated by a Gaussian variable. The mean of u(t) is: Fixed-size burst signaling u(t) = β τ b + 1/k + c(t) .
To estimate the variance of u(t) for fixed-size burst signaling, we calculate the variance of the number of binding events during a time ∆t τ b , 1/k + c(t) (but ∆t T ).
from which we obtain the uncertainty of the concentration readout, using Eqs. B10 and B15: state at time t ≤ t after entering it at time t = 0 is: which yields the probability distribution of intervals of activity: To exploit the property of detailed balance, we symmetrize Q AA by definingQ = {q ij p j /p i } i∈A,j∈A .Q being symmetric, it can diagonalized in an orthonormal base: λ k |u k u k |, with λ k > 0 and u k |u k = δ kk .
(C12) The diagonal form of Q AA thus reads: where v R,k (C15) and therefore: which establishes Eq. C2 with w k = (A 2 k /λ k )/( k A 2 k /λ k ).