Chemotaxis in uncertain environments: hedging bets with multiple receptor types

Eukaryotic cells are able to sense chemical gradients in a wide range of environments. We show that, if a cell is exposed to a highly variable environment, it may gain chemotactic accuracy by expressing multiple receptor types with varying affinities for the same signal, as found commonly in chemotaxing cells like Dictyostelium. As environment uncertainty is increased, there is a transition between cells preferring a single receptor type and a mixture of types – hedging their bets against the possibility of an unfavorable environment. We predict the optimal receptor affinities given a particular environment. In chemotaxing, cells may also integrate their measurement over time. Surprisingly, time-integration with multiple receptor types is qualitatively different from gradient sensing by a single type – cells may extract orders of magnitude more chemotactic information than expected by naive time integration. Our results show when cells should express multiple receptor types to chemotax, and how cells can efficiently interpret the data from these receptors.

As a white blood cell finds a wound, or an amoeba finds nutrition, they chemotax, sensing and following chemical gradients. Eukaryotic cells sense gradients of chemical ligands by measuring ligand binding to receptors on the cell's surface. In eukaryotic chemotaxis in shallow gradients, accuracy is limited by unavoidable stochasticity arising from randomness in ligand-receptor binding and diffusion [1][2][3][4]. Eukaryotic chemotaxis has been extensively modeled [1,[5][6][7][8], including extensions to collective chemotaxis [9][10][11][12] and stochastic simulation [13,14]. Modeling and experiment show eukaryotic chemotaxis is most accurate at ligand concentrations near the receptor dissociation constant K D [2,3,5]. Cells crawling through tissue and searching for targets at variable concentrations are exposed to a huge variation in environmental signals. Cells often express multiple receptors for the same signal, with K D values ranging over orders of magnitude [15,16]. For instance, during Dictyostelium's life cycle, Dicty expresses multiple different combinations of cAMP receptors CAR1-CAR4 [17], with ranges of K D from 25 nM to > 5000 nM. Larger-K D receptors are expressed later in development, when the cAMP background level rises; the change in receptor expression has been suggested to allow Dicty to deal with the new environments [18,19]. In addition, Segota et al. recently showed that to explain the high accuracy of Dictyostelium chemotaxis to folic acid over a broad range of folic acid concentrations, multiple receptor types (with K D values ranging from 2 nM to 450 nM [15]) and multiple measurements over time must be accounted for [4].
We argue that if a cell is sufficiently uncertain about its chemical environment, it should express multiple receptor types. We provide results for the optimal receptor K D s depending on environmental uncertainty. In addition, we show that integrating information from multiple measurements of the receptor binding state is more complicated in the manyreceptor-type case, and show that a standard approach significantly under-estimates gradient sensing accuracy.
We generalize the model of [5,20], considering a cell with receptors of R types with dissociation constants K i D , i = 1 · · · R, with receptors spread evenly over a circular cell. We find the fundamental limit set by the Cramér-Rao bound with which this cell can measure a shallow gradient g using a snapshot of current receptor occupation. If the concentration near the cell is locally c(r) = c 0 e r·g/L , i.e. g is the percentage change across the cell diameter L, this uncertainty is (Appendix B): (1) where N is the total receptor number, and f i is the fraction of receptors that are type i. When can a cell improve its accuracy by expressing multiple receptor types? If we try to maximize signal-to-noise ratio SNR = g 2 /σ 2 g , we see from Eq. 1 that SNR is a linear function of the f i , and will be maximized by choosing f = 1 for the type with the largest value of c0K i D (c0+K i D ) 2 , and f = 0 for the others. For the simplest case of two types A and B with dissociation constants K A , K B (K B > K A ) this means that the best accuracy occurs with all A receptors when c 0 < √ K A K B , and with all B receptors when c 0 > √ K A K B (Fig. 1, top). If the background concentration is completely known, it is never beneficial for a cell to express multiple receptor types simultaneously. However, if a cell is uncertain about the concentration it is likely to encounter, it may hedge its bets by expressing multiple receptor types, allowing it to chemotax effectively in more environments. Does a cell in concentration c 0 with probability p(c 0 ) benefit from multiple receptor types? What metric is appropriate? We could compute average signal-to-noise ratio, SNR = dc 0 p(c 0 )g 2 /σ 2 g , but increasing SNR at one concentration is little consolation to the cell that finds itself completely lost at another concentration -the utility of SNR saturates. We therefore optimize the mean chemotactic index CI, where CI = f (SNR), with f (x) saturating at 1 as x → ∞. Here, we use f (x) = 2x/π/L 1/2 (−x/2), where L 1/2 (x) is a generalized Laguerre polynomial, as in [21]; alternate definitions of CI lead to similar results.
In Fig. 1, we consider two receptor types with dissociation constants K A and K B , and numerically determine the fraction of A receptors that maximizes CI. We choose p(c 0 ) to be lognormal, p(ln c 0 ) ∼ e −(ln c0−ln c * ) 2 /2σ 2 µ -a generic option for large variability. When environmental uncertainty σ µ is small, we see the behavior predicted above -at small c * , the cell FIG. 1. Cells benefit from mixed receptor expression when their environment is uncertain-TOP: The signal-to-noise ratio g 2 /σ 2 g from Eq. 1 for 100% A receptors (yellow), 100% B receptors (red) or both in a 50-50 combination (blue). BOTTOM: Fraction of A receptors that maximizes CI for snapshot measurements (Eq. 1) as a function of p(c0) where p(ln c0) ∼ e −(ln c 0 −ln c * ) 2 /2σ 2 µ , characterized by the environment concentration c * and the standard deviation of the log of the concentration, σµ. The transition between All-A and All-B occurs at √ KAKB at low uncertainty. Dashed lines indicate receptor KD. KA = 1 nM, KB = 1000 nM, N = 5 × 10 4 and g = 0.05 in both plots.
should express all A receptors, while for c * > √ K A K B the cell switches to all-B. However, at larger σ µ , cells optimize CI by expressing equal amounts of A and B receptors (Fig. 1,  bottom).
Why is a 50-50 mix optimal even when c * ≈ K A ? This may seem like a natural response to uncertainty, but it is not obvious why, if the typical concentration, c * is close to K A , the cell would not prefer A-type receptors. We argue that in the limit of large σ µ , where p(c 0 ) becomes very broad, but CI(c 0 ) is locally peaked, the optimal fraction should not depend on c * or σ µ . The chemotactic index CI(c 0 ) for snapshot sensing is generally peaked when c 0 is around K A or K B , because the SNR decays when c 0 K A or c 0 K B (Fig.  1, top). As σ µ increases and p(c 0 ) becomes more weakly dependent on c 0 , we can approximate the integral defining CI as FIG. 2. Optimal receptor configuration. Best K i D values for receptors as a function of environmental uncertainty σµ. Marker areas are scaled to the fraction of that receptor type. Solid lines illustrate p(ln c0). c * = 1 nM, N = 5 × 10 4 and g = 0.05, ∆CI = 0.01 and a maximum of R = 7 types are considered.
(We've chosen √ K A K B as a typical value in the range K A · · · K B .) In Eq. 2, the parameters c * and σ µ of the environment distribution p(c 0 ) only appear in p( √ K A K B ), and the receptor fractions only appear in the term dc 0 CI(c 0 ). In this limit, p( √ K A K B ) becomes an irrelevant prefactor -the same fraction will optimize CI independent of c * and σ µ , and so we see a 50-50 mix for a broad range of parameters. We will see later that the 50-50 mix is no longer optimal when cells time average and CI(c 0 ) is no longer locally peaked.
The 50-50 mix between A and B receptors in Fig. 1 emerges when p(c 0 ) is so broad it is slowly-varying on the scale of CI(c 0 ). We caution that at these large levels of environmental variation, the difference between the optimal receptor configuration and simply choosing all-A or all-B receptors is small (Appendix F) -at sufficiently high uncertainties, no configuration is particularly successful. Fig. 1 shows when a cell should choose to express a combination of receptor types. Can we also find which receptors a cell would evolve to maximize gradient-sensing ability in a given p(c 0 )? We optimize CI by varying K i D and receptor fraction f i for different numbers of receptor types R and different widths σ µ , holding total receptor number N constant. We choose the configuration that maximizes CI-with an important caveat. By adding more types with arbitrary K D , we can always at least match the performance of a single type. If many configurations generate roughly the same near-optimal CI (all within ∆ CI = 0.01), we choose from these the configuration with the fewest receptor types R. To reduce the number of variables we vary, we use the symmetry of p(c 0 ), assuming ln K i D and f i are mirror-symmetric around ln c * . The resulting optimal K i D are shown in Fig. 2. We see that as the environment uncertainty σ µ increases, there is a transition between preferring a single receptor type and multiple receptor types, with the K D values for the multiple types being spread over the likely range of concentrations observed. Fig. 1 and Fig. 2 are based on Eq. 1, which gives the fundamental uncertainty for a cell sensing a gradient from a single snapshot of its receptors. If cells integrate measurements over time [3,5,6,20,22,23], they can improve gradient sensing. Our results in Appendix B give the estimatorĝ for the gradient vector g given the snapshot data. Defining a time-integrated estimatorĝ T = 1 T T 0ĝ (t)dt, as in earlier work [5,10,20], we find its variance, σ 2 g,T = |g T − g| 2 . To do this, we must consider the kinetics of binding and unbinding at each receptor, which happens with a receptor correlation time of τ i = 1/(k i − + c 0 k i + ). In the limit of large averaging times T τ i , we find (Appendix C): where β i = c 0 K i D /(c 0 + K i D ) 2 reflects the accuracy of measuring by type i -maximized at c 0 = K i D . For a single type, error is reduced by the effective number of measurements T /2τ , σ 2 g,T = σ 2 g × 2τ /T [5,20]. Eq. 3 can be cast in a similar form as i β i f i is the weight given to receptor type i. This equation shows that the variance -in a naive time average -is reduced by a weighted sum that depends on the receptor correlation times. In the single receptor type case, σ 2 g,T becomes arbitrarily small as τ → 0. However, because σ 2 g,T is proportional to a weighted sum of τ i /T , when receptor correlation times τ i decrease, error is limited by the slowest correlation time. If one receptor correlation time is significantly faster than another, Eq. 3 predicts reduced error merely by removing the slow receptors (Fig. 3). The naive time average, therefore, does not efficiently use the information available -if it did, the cell would not be able to gain accuracy by throwing away measurements. The core reason for the failure of naive time averaging is that the snapshot estimator weights receptors equally -which is appropriate to the amount of information they provide at that moment. Naively averaging this estimator weighs information from fast receptors (which gain more information as T increases) and slow receptors (which gain less information) similarly.
The failure of naive time averaging is reminiscent of a wellknown result for concentration sensing: it is not optimal for a single receptor to estimate c from a simple time-average of its occupation, T −1 T 0 n(t)dt. If the whole history of binding and unbinding events is used in a maximum likelihood estimate, the error σ 2 c is reduced by two [24]. We compute the accuracy limit for gradient sensing σ 2 g,T using the entire receptor trajectory (Appendix D), finding (again in the limit T τ i ): (entire receptor trajectory) (4) or more intuitively, For a single receptor type, Eq. 4 is a factor of two smaller than the naive time average Eq. 3, precisely as in concentration sensing. However, for multiple types, ERT error can be orders of magnitude better, as the time correlation factors τ i /T add "in parallel" -error is no longer limited by the slowest type. We illustrate the differences between these two errors in Fig. 3, computing SNR g 2 /σ 2 g for two receptor types. Which type provides more information depends on the relative off rates of the two types: Estimation using the entire receptor trajectory is much more accurate than naive averaging.-SNR g 2 /σ 2 g as a function of ρ = k B − /k A − ; ρ > 1 states B receptors have faster off rates. SNR is much larger using the entire receptor trajectory (ERT) method; even in its best case, the naive average only reaches half of the ERT SNR. We use two receptor types, KA = 1 nM, KB = 1000 nM, and N = 5 × 10 4 , g = 0.05, with fixed c0 = √ KAKB and fA = 0.5, with k A − T = 2. Fig. 3 shows that as ρ is varied, the naive time average SNR is always at least a factor of two lower than ERT. When the B receptor off rate is large (ρ 1), the naive average is worse than if only the N/2 B receptors were used (the "Naive (fast only)" yellow dotted line). In this limit, most information is from the B receptors, and using only B receptors reaches half the ERT SNR (Fig. 3).
How does time averaging affect bet-hedging? For a single environmental concentration, all-A is optimal when Eq. 5 in- For ρ ≤ 1, it is always best to use the lower-K D receptor -but trade-offs are more complex when B receptors are faster. The balancing point c bal varies from c bal → ∞ at ρ = 1 to c bal = 0 at ρ = K B /K A . ρ = K B /K A corresponds to the condition where the on rates of the two types are equal, k A -this would be the case if the on rates were diffusion-limited.
Dependence on receptor off rates is preserved when we study optimal receptor configurations in an uncertain environ- ment p(ln c 0 ) ∼ e −(ln c0−ln c * ) 2 /2σ 2 µ . In Fig. 4, we show how the optimal share of A receptors depends on ρ = k B − /k A − . Even with significant uncertainty, when ρ = 1, all-A is optimal. However, at higher values of ρ, a transition between all-A and all-B like Fig. 1 occurs when c * = c bal (white dashed line). By contrast with the snapshot results, in the time average at large uncertainties σ µ > 1, the 50-50 mixture is not optimal. Instead, at high uncertainties, for ρ = 10, 100, optimal receptor fractions are similar above and below c bal , and the fraction of A receptors exceeds 0.5, with f A decreasing as σ µ 10. Why? When cells time-average, CI is nonzero over a broad range of c 0 (see Fig. 4, top), and our argument from the snapshot case fails.
Can we produce a plot such as Fig. 2 showing the optimal receptor configuration if time-averaging is performed? No. When time averaging, error is minimized by making the correlation times as small as possible -taking k − → ∞. We cannot find a consistent set of optimal receptors unless k − is restricted by some biochemical constraint. This is because, as recognized for concentration sensing [24], only binding rates are sensitive to c -bound times should be minimized.
Discussion.-Our results show that cells can hedge their bets against an uncertain environment by expressing multiple receptor types -but that this behavior is only reasonable if the uncertainty in c 0 spans the range of observed K D (Fig. 2) -i.e. if cells typically explore environments where concentration varies over orders of magnitude. The idea that signalprocessing should be adapted to the likely range of concentra-tions is similar to classical results showing information transmission is maximized by tuning input-output relationships to the input probability distribution [25][26][27][28]. When cells chemotax to hunt bacteria, as Dictyostelium uses folic acid chemotaxis, it is intuitively plausible that observed c 0 span orders of magnitude, as bacterial hunting must function over both sparse and concentrated solutions, and over many distances to bacteria. However, at any fixed concentration, expressing multiple receptor types is always suboptimal to choosing the receptor that best fits your current concentration c 0 -so hedging is plausible in circumstances when the environment is uncertain over the timescale on which the receptor affinity is fixed. What other timescales could appear in the problem? Dictyostelium receptors are internalized in response to large increases in cAMP concentration [29,30], but this process takes several minutes -much longer than it would typically take Dicty to chemotax to a new mean concentration level. We also show that even if receptor numbers change in different environments, we see very similar results (Appendix E).
Our work has not so far distinguished between truly different receptors and receptors that are phosphorylated or otherwise modified to change their K D [31,32], which play a role in adaptation to different signal levels in bacteria [33]. If receptor modification is fast compared to the environment's change in concentration c 0 , i.e. can occur before the cell samples a new concentration from P (c 0 ), hedging will be less effective. Our work thus suggests interesting future directions for extension, based on recent studies of concentration sens-ing in time-varying environments [34]. Future work could also consider multiple ligand types, which could also limit accuracy by competing for receptors [35,36]. For cells to reach the lower bound of Eq. 4, they must compute estimates with some reaction network, possibly extending recent work showing how to compute the ERT estimate in concentration sensing [36,37]. One important difference here is that finding the ERT g requires spatially resolved measurements of bound and unbound times separately for each receptor type. Computation of the ERT estimate for concentration requires additional free energy expenditure [37] -it would be interesting to determine if the extravagant benefits of the ERT approach for gradient sensing with multiple receptor types (Fig. 3) comes with a commensurate cost. Though these are significant complexities, the huge gap between the fundamental bound of Eq. 4 and the naive average of Eq. 3 shows even very rough approximations to the ERT provide significant gains over a naive average.

ACKNOWLEDGMENTS
We thank Wouter-Jan Rappel and Emiliano Perez Ipiña for a close reading of the paper. We thank Allyson Sgro for useful discussions. BAC acknowledges support from the grant PHY-1915491.

NOTATION FOR APPENDIX
This appendix includes calculations where we have to distinguish the type i and the index n of each receptor. The intermediate calculations are more complex than the final results in the main paper, and to help keep the details straight, we will denote receptor types i with superscripts and receptor indices n with subscripts. A superscript to a power i like C i n never denotes exponentiation. To keep consistent with this, we've included formulas with β i where in the main text we've written β i .

Appendix A: Numerical details
To evaluate the integral CI = dc 0 p(c 0 )CI(c 0 ) numerically, we found some difficulties that arise because of how broad p(c 0 ) is. In particular, because the SNR most naturally varies on the log scale (see Fig. 1, top), it is easiest for us to evaluate this expectation in terms of ln c 0 , For our log-normal distribution, p(ln c 0 ) has the simple Gaussian form However, we note that even Eq. A1 can be tricky to evaluate when CI is nonzero only for a range of ln c 0 compared with the scale σ µ . We evaluate this integral using Matlab's Gauss-Kronrod quadrature (quadgk), setting waypoints to ensure that all nonzero ranges of both functions should be included. To find the receptor type fractions that optimize CI we use Matlab's fminbnd (golden section search) and fminsearch (Nelder-Mead), depending on the number of variables to be optimized over.
Code to reproduce the results in the paper can be found at: https://github.com/bcamley/hedging reproduce

Appendix B: Snapshot sensing
How precisely can a cell make a measurement of a chemical gradient -using only its current information about which receptors on its surface are occupied? We extend results from [5,20] on this accuracy to include multiple receptor types. We assume that the cell is in a shallow exponential gradient with direction φ and steepness g = L|∇C| C0 , where L is the diameter of the cell and C 0 the concentration at the cell center. The gradient g can also be written as g = (g x , g y ) = (gcos(φ), gsin(φ)). The concentration at a cell receptor with angular coordinate ϕ can then be written as Let there be R receptor types, where there are N i receptors of type i and N = R i N i total receptors. We describe the receptors as being uniformly spread across the cell, with angular positions ϕ i n (Fig. 5). Then each receptor of type i can be represented as a Bernoulli trial, i.e. we define a variable x i n that is one if a receptor n of type i is occupied, and zero otherwise. The probability of x i n = 1 is the probability of that receptor being occupied, where C i n is the concentration at the nth receptor of type i and K i D is the dissociation constant of receptor type i. The dissociation constant K i D is the ratio of unbinding and binding rates of the receptors of type i, i.e.
Assuming that all the receptors are independent of one another, we have the following likelihood function giving the probability of seeing receptor occupations x i n given gradient g The log-likelihood function is In the second term in this equation, we then assume that the receptors are numerous enough that we can replace the sum over receptor position by a continuous integral, We define z i 1 = N i n x i n cosϕ i n and z i 2 = N i n x i n sinϕ i n , which measure the spatial asymmetry in the occupancy of receptors of type i. Z 1 = i z i 1 and Z 2 = i z i 2 measure the total spatial asymmetry in receptor occupancy of the cell. In a shallow exponential gradient, we can neglect terms O(g 4 ) and higher for an estimation of the gradient g = (g x , g y ). Then, the log-likelihood function becomes (We note that past papers with similar derivations [12,20] have not always written the last term in this log-likelihood, which is an irrelevant constant.) Taking the derivative with respect to g x,y gives Because the log function is monotonic, we can set Eq. B6 equal to zero to findĝ x andĝ y , the parameters which for g x and g y which maximize the likelihood function. Carrying out this procedure, we find: which be solved forĝ x,y to determine estimatorsĝ x,y aŝ To determine the asymptotic variance on these estimators, we will need to compute the second derivative of the log-likelihood function. Applying an additional derivative to Eq. B6 gives From the log-likelihood function, we can also determine the Fisher information matrix, which controls the best possible measurement that the cell can make of the uncertain g [5,38]. In this case it is diagonal, and its inverse gives the variances ofĝ x and g y in the limit of many samples. As a result, we have expressions for the asymptotic variances forĝ x andĝ y and so The important parameter is σ 2 g , which is just the sum of the component variances As the sample size becomes large, the distribution ofĝ 1,2 converges to a normal distribution with means g x,y and variance σ 2 gx,gy . This also implies that the mean values of Z 1 and Z 2 are Appendix C: Naive time averaging A cell may improve its estimation of the gradient by time averaging. In the previous section, we determined an estimator g that is the best estimate of a cell's gradient, given a snapshot of its receptor information. Naively, a cell could improve its accuracy by making a measurement over a time T and determining the average of these estimateŝ Then the variance of this new estimator will be reduced, To understand how time averaging improves the cell's sensing accuracy, we need to compute ĝ(s) ·ĝ(t) , the correlation function ofĝ. This correlation function is related to the correlation functions in the estimates of each component of the gradient as (C3) And, by Eq. B10, the correlation functions for g x,y can be related to the correlation functions for Z 1,2 as The correlation functions for Z 1 can be written in terms of the single receptor correlation function: The kinetics of receptor binding and unbinding with multiple receptor types can be quite complicated [39,40], with ligands potentially diffusing from one receptor to another. However, if the binding and unbinding process is slow with respect to this diffusion -i.e. binding is reaction-limited, as is believed to be the case in eukaryotic chemotaxis [20,39], it is appropriate to think of the ligand-receptor binding having two states -one bound and one with ligand in the bulk. Then, there are two relevant rates, that of ligand binding to a receptor of type i exposed to concentration C is k i + C and the off rate is k i − -which results in an exponential single receptor correlation function for receptor n of type i. This limit is also appropriate if all ligand is internalized, as discussed by [24]. The parameter σ 2 x i n characterizes the fluctuations in the occupancy of the receptor, and is given by the variance of a Bernoulli trial. τ i n is the single receptor correlation time in the reaction-limited case. (Generalization to other limits is possible but not straightforward [40][41][42].) Because different receptors are independent, the mean of their product is just the product of their means Using Eq. C8 for terms where i = j and n = m and Eq. C11 otherwise, we can expand the correlation function of Z 1 in Eq. C7 as The second term in Eq. C13 is Z 1 2 , which can be solved as by Eq. B14. For the first term in Eq. C13, taking the sum to an integral gives i.e., Eq. C10 for a receptor in the ambient concentration C 0 . Therefore, for shallow gradients, the correlation function for Z 1 is Using the relation in Eq. C4, the correlation function of the estimatorĝ 1 can be found from Eq. C16: Similar expressions for the correlation functions of Z 2 andĝ 2 can be derived as and Thus, the correlation inĝ for shallow gradients is This now gives us enough information to compute the time-averaged variance, Then, using the result in Eq. C20, we have In the limit where T τ i for all τ i , Eq. C23 becomes where the parameter β i = C 0 K i D /(C 0 + K i D ) 2 reflects the accuracy of measuring only using receptor i, as in the main text. Instead of simply performing a naive average, a cell could also improve its sensing of the gradient by determining an estimate of the gradient from the history of its receptors over the measurement time -when they are bound and unbound (Fig. 6). We a time interval for receptor n of type i as time series {t i +;n , t i −;n }, where particles bind at times t i +;n, and unbind at times t i −;n, , where indexes the binding and unbinding events. Following [24], we compute the probability for a time series of binding and unbinding events. Define a function f * ;i −;n (t i −, ;n ) = p i −, ;n (t i −, ;n |t i +,1;n , t i −,1;n , . . . , t i −, −1;n , t i +, ;n ) which is the probability density for the event that receptor n of type i experiences an unbinding at time t i −, ;n given the previous time series data {t i +,1;n , t i −,1;n , . . . , t i −, −1;n , t i +, ;n }. Here, the time series has been written for a receptor that is initially unbound, and the indexing of the time series in the following equations will follow that notation. The procedure is the same for a receptor that starts in the bound state. Define the analogous function where η i b;n , η i u;n are the numbers of binding events and unbinding events, respectively. If a cell measures for a time interval T that is long compared to the relevant time scales (ie, T 1/k i − , T 1/C 0 k i + for all receptor types), then η i b;n ≈ η i u;n because the number of binding events can differ by at most one from the number of unbinding events. In this limit, the information about the gradient is dominated by the observed time series, and not the initial snapshot state of the receptors.
We assume that we are in the reaction-limited case, where we can treat the rate of binding to a receptor of type i exposed to concentration C as k i + C and the off rate as k i − -neglecting rebinding. (Neglecting rebinding, as discussed in more detail by [24], is also the appropriate limit to find the fundamental bound to accuracy, as cells may prohibit rebinding by degrading or internalizing ligand.) The functions f * ;i +;n (t i −, ;n ) and f * ;i −;n (t i −, ;n ), with this simple Markovian kinetics, do not depend on the whole time series, they only depend on the time of previous unbinding/binding event: These are probability density functions for exponential distributions with rates k i − and k i + C i n for unbinding and binding, respectively. From these equations, we can determine the likelihood function for the gradient parameters g x and g y given the observed time series {t i +;n , t i −;n } at receptors n of types i. Because each receptor is independent, the likelihood function is the product of the probability function in Eq. D1: Then, define the total time bound T i b;n and the total time unbound T i u;n for receptor n of type i (the indexing in these definitions assume the receptor starts unbound, but analogous definitions can be written for a receptor that is bound at t = 0, and in the limit of long times treated here, this assumption does not matter.) With these definitions, the likelihood function in Eq. D4 becomes The log-likelihood function is then Substituting the expression C i n = C 0 exp 1 2 g x cos(φ i n ) + g y sin(φ i n ) in Eq. D7 gives For shallow gradients, we can approximate the log-likelihood function by expanding to second order in the magnitude of the gradient. This results in Differentiating Eq. D9 with respect to g x and g y , we get Because g x and g y here do not depend on i and n, Eq. D11 can be equated to zero and solved to find the maximum likelihood estimator in terms of sums over functions of T i u;n and φ i n . However, we have not found the precise form very useful. From the derivatives of the log-likelihood function, we can compute the Fisher Information Matrix: Differentiating Eq. D10 and Eq. D11 with respect to combinations of g x and g y gives the following matrix .

(D13)
The expectation value T i u;n can be found in terms of the measurement time T and the probability P i n that a receptor is occupied (Eq. B2) By substituting Eq. D14 into Eq. D13 and taking the inner sums to an integral, we have the following expressions for each matrix element: cos(φ i n )sin(φ i n )dφ = 0.
Therefore, the Fisher information matrix is diagonal, and in shallow gradients it is We note that Eq. D18 has only been calculated in the large-T limit; in the limit of T → 0, we would expect the Fisher information to limit to the estimate from a single snapshot. For cells with a single receptor type, Eq. D18 implies that the asymptotic variances on g x and g y are 1/2 of their value determined from time averaging-the same factor as in concentration sensing [24]. However, in the multiple receptor type case, there is a more significant difference. The variance inĝ determined from Eq. D18 is the sum of the inverses of the diagonal elements = 16 As discussed in the main text, this shows that a slow receptor correlation does not act as a limiting factor when the entire receptor trajectory is considered.
Appendix E: Hedging allowing the number of receptors to change Within the main text, we have followed earlier work in keeping the number of receptors on the cell fixed [2-5, 14, 20, 43]. However, it is possible that when cells explore more complex environments, they should express different numbers of receptors depending on the typical concentration c * and the level of uncertainty σ µ . We address this possibility in Fig. 7.
Within the framework we have applied in this paper, accuracy always increases with increasing N -there are more measurements of the gradient, leading to increased accuracy (see Eq. 1,3,4 in the main text). If we allow the number of receptors to freely vary, and choose the number of receptors N and receptor fractions f , we would find that the receptor number would increase without bound. This is obviously unphysical. Cells are under many restraints in controlling how many receptors they have, both in terms of the energetic cost of synthesizing them, and in the opportunity cost in taking up space on the cell surface.
In modeling cells with varying receptor number, we chose to find the receptor configuration that maximized CI × e −N added /N penalty , where N added is the number of receptors expressed beyond the typical value N basal = 5 × 10 4 , and N penalty = 50N basal . This choice ensured that cells could easily express more than the basal level of receptors, but that expressing multiple orders of magnitude more receptors would be implausible -consistent with the observed variation in receptor number on the membrane. We found that, though the optimal receptor numbers varied depending on the environment (Fig. 7), the optimal receptor fractions closely agreed with those found assuming a constant number of receptors (Fig. 1).
Other choices for the penalty (e.g. optimizing CI+αN added ) gave different optimal receptor numbers but preserved the optimal receptor fractions and the transition between all-A, all-B, and the 50-50 mix. This suggests that the receptor fractions and the