Finding the last bits of positional information

In a developing embryo, information about the position of cells is encoded in the concentrations of "morphogen" molecules. In the fruit fly, the local concentrations of just a handful of proteins encoded by the gap genes are sufficient to specify position with a precision comparable to the spacing between cells along the anterior-posterior axis. This matches the precision of downstream events such as the striped patterns of expression in the pair-rule genes, but is not quite sufficient to define unique identities for individual cells. We demonstrate theoretically that this information gap can be bridged if positional errors are spatially correlated, with relatively long correlation lengths. We then show experimentally that these correlations are present, with the required strength, in the fluctuating positions of the pair-rule stripes, and this can be traced back to the gap genes. Taking account of these correlations, the available information matches the information needed for unique cellular specification, within error bars of ~2%. These observations support a precisionist view of information flow through the underlying genetic networks, in which accurate signals are available from the start and preserved as they are transformed into the final spatial patterns.


I. INTRODUCTION
During the development of an embryo, cell fates are determined in part by the concentrations of specific morphogen molecules that carry information about position [1][2][3]. For the early stages of fruit fly development, all of these molecules have been identified [4][5][6]. For patterning along the main body axis, spanning from anterior to posterior (AP), information flows from primary maternal morphogens to an interacting network of gap genes to the pair-rule genes [7,8], whose striped patterns of expression provide a precursor of the segmented body plan in the fully developed organism, visible within three hours after the egg is laid (Fig. 1). It has been known for some time that, at this stage in development, essentially every cell "knows" its fate [9], so it is natural to ask how this information is encoded, quantitatively, in the concentrations of the relevant morphogens.
The expression levels of the gap genes provide enough information to specify the positions of individual cells with an accuracy ∼1% of the embryo's length [11]. This matches the precision with which the stripes of pair-rule expression are positioned, and the precision of macroscopic developmental events such as the formation of the cephalic furrow [12]. Further, the algorithm that extracts optimal estimates of position from the expression levels of the gap genes also predicts, quantitatively, the distortions of the striped pattern in mutant flies with deletions of the maternal inputs [13]. At the moment when pair-rule stripes are fully formed, just before gastrulation, there are fewer than one hundred rows of cells along the length of the embryo, so it is tempting to think that positional signals with 1% accuracy define unique cellular identities. In fact, this is not quite correct [11]: if each cell makes independent positional errors drawn from a Gaussian distribution, then there is a small but significant probability that neighboring cells will get "crossed signals," driving errors in cell fate determination.
The small difference between 1% positional errors and unique cellular identities provides an interesting test case in the search for a more quantitative understanding of living systems. In physics, we are used to the idea that small quantitative discrepancies can be signs of qualitatively new ideas or mechanisms. But in complex biological systems one might worry that small discrepancies reflect experimental errors or over-simplifications in interpretation. If correct, these concerns would limit our ambitions for quantitative theory in the physics tradition. However, the small discrepancies need to be re-examined in light of dramatic improvements in experimental precision [14][15][16].
Here we take the small quantitative discrepancy in positional information seriously. On the theoretical side, we clarify the problem, defining an "information gap," and show that this gap can be closed if errors in the positional signals are spatially correlated over relatively long distances. Early work by Lott and colleagues [17] detected such correlations in mRNA levels of gap and pair-rule genes; subsequent work found that noise in different combinations of protein levels in the gap gene network is correlated significantly over the entire length of the embryo [18]. On the experimental side we re-examine these correlations, measuring the positions of stripes in the concentrations of pair-rule proteins. We find that the extent of these correlations is what is needed to close the information gap between positional errors and unique cellular identities, quantitatively.

II. DEFINING THE PROBLEM
In the early fly embryo, cells have access to the concentrations of morphogens, and these concentrations are continuously graded. From these concentrations, it is possible to decode an estimate of position, which we label as $x_n$ in cell $n$ [13]. We expect that these estimates are correct on average, so that $\langle x_n \rangle = nL/N$, where there are $N$ cells along the length $L$ of the embryo. However the signals are noisy, so decoding in one cell will have errors,
$$x_n = \langle x_n \rangle + \delta x_n , \qquad \langle (\delta x_n)^2 \rangle = \sigma_x^2 .$$
For simplicity, but guided by the experimental observations [11,13,21], we assume that $\sigma_x$ is the same for all cells and that the distribution of $\delta x_n$ is Gaussian. Here we are interested in the question of whether cells get signals that define the correct ordering along the axis, so that
$$x_1 < x_2 < \cdots < x_N .$$
To find the probability of a wrong ordering we can look at the distribution of the distance to the next cell, $y = x_{n+1} - x_n$. Since $x_{n+1}$ and $x_n$ are both Gaussian, their difference $y$ is also Gaussian, with mean $\langle y \rangle = L/N$. If the noise is independent in each cell, then the variance of this difference signal will be $\langle (\delta y)^2 \rangle = 2\sigma_x^2$. Incorrect ordering happens when $y < 0$, which then has probability
$$P(y < 0) = \frac{1}{2}\,{\rm erfc}\left(\frac{1}{2z}\right) ,$$
with $z = \sigma_x (N/L)$, as shown in Fig. 2. If positional errors are comparable to the spacing between cells, $\sigma_x \sim L/N$, the probability of an error is nearly 24%; for the experimental value $\sigma_x \sim (0.74)L/N$ [11], crossed signals will occur in ∼16% of cells. With $N \sim 74 \pm 5$ rows of cells along the AP axis [11], the probability that all signals come in the right order would be vanishingly small.
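As a rough numerical check of this argument, the sketch below evaluates the probability that two neighboring cells receive crossed signals. It assumes only what is stated above: the difference $y$ is Gaussian with mean $L/N$ and variance $2\sigma_x^2$, so that $P(y<0) = \frac{1}{2}{\rm erfc}(1/2z)$ with $z = \sigma_x N/L$. The function names are ours, chosen for illustration, and the "all ordered" estimate treats neighboring comparisons as independent, which is only a crude approximation.

```python
import math


def crossing_probability(z):
    """P(y < 0) for y ~ Normal(L/N, 2*sigma_x**2), with z = sigma_x * N / L."""
    return 0.5 * math.erfc(1.0 / (2.0 * z))


def prob_all_ordered_independent(z, n_cells):
    """Crude estimate of the chance that every neighboring pair is ordered.

    Assumption (ours): the n_cells - 1 neighbor comparisons are treated as
    independent events, which ignores the fact that adjacent pairs share a cell.
    """
    return (1.0 - crossing_probability(z)) ** (n_cells - 1)


if __name__ == "__main__":
    print(f"z = 1: P(crossed) = {crossing_probability(1.0):.3f}")
    print(f"P(all ordered), z = 0.74, N = 74: "
          f"{prob_all_ordered_independent(0.74, 74):.2e}")
```

With $z = 1$ this reproduces the value of about 0.24 quoted above, and for $N = 74$ cells the crude product is tiny, in line with the statement that correct ordering of all cells would be very unlikely if errors were independent.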
This failure to specify unique cellular identities can be given a simple information-theoretic interpretation. To specify one cell uniquely out of $N$ requires $I_{\rm unique} = \log_2 N$ bits of information [22,23]. On the other hand, if we have signals that represent a continuous position $x$ drawn uniformly from the range $0 < x \leq L$, and these signals have Gaussian noise with (small) standard deviation $\sigma_x$, as described above, then the amount of information the signal conveys about position is
$$I_{\rm position} = \log_2 L - \frac{1}{2}\log_2\left(2\pi e \sigma_x^2\right) , \tag{6}$$
where the first term is the entropy of the uniform distribution of positions and the second term is the entropy of the Gaussian noise distribution [23]. Combining these we can define an "information gap"
$$I_{\rm gap} \equiv I_{\rm unique} - I_{\rm position} = \log_2\left(\sqrt{2\pi e}\,\frac{\sigma_x N}{L}\right) .$$
As discussed below, we obtain a more accurate estimate of the information gap by averaging over measurements of $\sigma_x$ at multiple points along the embryo, defined by the pair-rule stripes, and we find $I_{\rm gap} = 1.39 \pm 0.08$ bits (Appendix A). Importantly this gap is measured per cell: it is not that the embryo is missing ∼1.4 bits of information, but rather that every cell is missing this information.
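To make the size of the gap concrete, the sketch below evaluates $I_{\rm unique}$, $I_{\rm position}$ and their difference for a single, uniform noise level. It assumes the rough values $\sigma_x/L = 0.01$ and $N = 74$ quoted in the text; the function name is ours. Because the measured $\sigma_x$ is slightly below 1% and varies along the axis, this simple estimate comes out somewhat larger than the refined value of 1.39 bits.

```python
import math


def information_gap_bits(sigma_over_L, n_cells):
    """Return (I_unique, I_position, I_gap) in bits.

    I_position = log2(L) - 0.5*log2(2*pi*e*sigma_x**2), written in terms of
    the dimensionless ratio sigma_x / L.
    """
    i_unique = math.log2(n_cells)
    i_position = -0.5 * math.log2(2.0 * math.pi * math.e * sigma_over_L ** 2)
    return i_unique, i_position, i_unique - i_position


if __name__ == "__main__":
    i_unique, i_position, i_gap = information_gap_bits(0.01, 74)
    print(f"I_unique   = {i_unique:.2f} bits")
    print(f"I_position = {i_position:.2f} bits")
    print(f"I_gap      = {i_gap:.2f} bits")
```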

III. EXTRA INFORMATION FROM CORRELATIONS: THEORY
To address this information gap directly, we use the fact that correlated noise can enhance information transmission. Correlated noise is typically viewed as a problem because it resists averaging, but when neighboring cells make correlated errors in position, the probability of receiving "crossed signals," as defined above, is reduced. Here we develop these considerations more formally.
Information is roughly the difference in entropy between the signal and the noise, where entropy measures the (log) volume in phase space that is occupied by a set of points. When random variables become correlated, the volume and hence the entropy is reduced, even if the variances of the individual variables are unchanged. In our example, with correlations, the full pattern of points $\{x_1, x_2, \cdots, x_N\}$ fills a smaller volume in the space $[0, L]^N$ of possible positions for all the cells, and thus the embryo as a whole has access to more positional information.
More formally, we can define the correlation matrix $C$,
$$C_{nm} = \frac{\langle \delta x_n \delta x_m \rangle}{\sigma_x^2} ,$$
with diagonal elements $C_{nn} = 1$. Assuming again that the noise $\delta x_n$ is Gaussian, the reduction in noise entropy for the entire set of variables $\{\delta x_n\}$ is given by the determinant of this matrix [23],
$$\Delta S = -\frac{1}{2}\log_2 \det C , \tag{9}$$
and this reduction in entropy is the gain in information.
Entropy is an extensive quantity, so that when $N$ is large we expect $\Delta S \propto N$. To make this concrete, consider correlations that decay exponentially with the distance between cells,
$$C_{nm} = \exp\left(-\frac{|n - m|\,L}{N\xi}\right) , \tag{10}$$
with correlation length $\xi$. This is what we would see if signals were encoded in the gradient of a single molecular species that has a lifetime $\tau$ and diffusion constant $D$, with $\xi = \sqrt{D\tau}$. Although this is over-simplified, it is useful for building intuition about how the range of correlations determines the additional information. Within this model it is straightforward to evaluate $\Delta S$ numerically, with results shown in Fig. 3A.
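A minimal sketch of such a numerical evaluation is given below. It assumes that the correlations decay exponentially with separation, $C_{nm} = \exp(-|n-m|L/N\xi)$, with the correlation length expressed in units of the cell spacing $L/N$, and that the extra information is $\Delta S = -\frac{1}{2}\log_2\det C$. The function names are ours, and the Cholesky routine is a plain textbook implementation rather than the procedure used for Fig. 3.

```python
import math


def exponential_correlation_matrix(n_cells, xi_in_cells):
    """C[n][m] = exp(-|n - m| / xi), with xi in units of the cell spacing L/N."""
    a = math.exp(-1.0 / xi_in_cells)
    return [[a ** abs(n - m) for m in range(n_cells)] for n in range(n_cells)]


def log2_det_cholesky(matrix):
    """log2 of the determinant of a symmetric positive definite matrix."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    log2_det = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][i] = math.sqrt(s)
                # det C is the product of the squared diagonal Cholesky entries
                log2_det += 2.0 * math.log2(chol[i][i])
            else:
                chol[i][j] = s / chol[j][j]
    return log2_det


def extra_information_per_cell(n_cells, xi_in_cells):
    """Delta S / N in bits for the exponential correlation model."""
    c = exponential_correlation_matrix(n_cells, xi_in_cells)
    return -0.5 * log2_det_cholesky(c) / n_cells


if __name__ == "__main__":
    for n_cells in (50, 100):
        print(n_cells, round(extra_information_per_cell(n_cells, 13.0), 3))
```

Running this for $N = 50$ and $N = 100$ at fixed $\xi$ gives nearly the same value per cell, which is the sense in which the entropy is extensive.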
We can also give an analytic theory for $\Delta S$ in the large $N$ limit, leading to Eq (15) and the red line in Fig. 3. If we define eigenvalues and eigenvectors of the matrix,
$$\sum_m C_{nm}\phi^\mu_m = \lambda_\mu \phi^\mu_n ,$$
then we have
$$\Delta S = -\frac{1}{2}\sum_{\mu=1}^{N}\log_2 \lambda_\mu .$$
In the limit of large $N$ at fixed $N/L$, the ends of the embryo are far away, and there is an effective translation invariance. This means that the eigenvectors $\phi^\mu_n$ are complex exponentials, $\phi^\mu_n \propto \exp(i q_\mu n)$, or equivalently that the matrix $C_{nm}$ is diagonalized by a discrete Fourier transform; allowed values of $q_\mu$ are in the interval $-\pi \leq q < \pi$. Then as $N \to \infty$ we find the eigenvalues
$$\lambda(q) = \sum_{n=-\infty}^{\infty} e^{-|n|L/(N\xi)}\, e^{-iqn} = \frac{\sinh(L/N\xi)}{\cosh(L/N\xi) - \cos q} ,$$
and the change in entropy
$$\lim_{N\to\infty}\frac{\Delta S}{N} = -\frac{1}{2}\int_{-\pi}^{\pi}\frac{dq}{2\pi}\,\log_2\lambda(q) = -\frac{1}{2}\log_2\left[1 - e^{-2L/(N\xi)}\right] . \tag{15}$$
In Fig. 3A we see that this analytic result agrees with numerical results at $N = 50$ and $N = 100$, which agree with one another, confirming that the fly embryo is large enough for the entropy to be extensive. We conclude that an information gap of ∼1.4 bits can be closed if correlations extend over distances $\xi \sim 13(L/N) \sim 0.18L$. Lott and colleagues saw significant correlations across this range of distances for all the genes that they probed [17], and combinations of gap gene protein levels have even longer correlation lengths [18]. Beyond the perhaps abstract information theoretic measures, we can evaluate the probability that all cells receive signals that are in the correct order, that is $x_{n+1} > x_n$ for all $n = 1, 2, \cdots, N-1$. If correlations extend over a distance $\xi \sim 13(L/N)$, then proper ordering will occur in more than 99% of embryos, as illustrated in Fig. 3B.
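The effect of correlations on ordering can also be explored by direct simulation. The sketch below is a Monte Carlo illustration under assumptions of our own: Gaussian noise with exponentially decaying correlations, generated as a stationary first-order autoregressive sequence (which has exactly the covariance $\sigma_x^2 \exp(-|n-m|L/N\xi)$), with hypothetical parameter names. It is not the calculation behind Fig. 3B, and the precise fraction it returns depends on $\sigma_x$, $N$ and $\xi$.

```python
import math
import random


def fraction_correctly_ordered(n_cells, sigma_over_L, xi_over_L,
                               n_embryos, seed=0):
    """Fraction of simulated embryos in which x_{n+1} > x_n for every n.

    Positions are in units of L. xi_over_L = 0 gives independent noise.
    """
    rng = random.Random(seed)
    spacing = 1.0 / n_cells
    a = math.exp(-spacing / xi_over_L) if xi_over_L > 0 else 0.0
    innovation = math.sqrt(1.0 - a * a)
    n_ok = 0
    for _ in range(n_embryos):
        noise = rng.gauss(0.0, 1.0)          # unit-variance noise in cell 1
        previous = spacing + sigma_over_L * noise
        ordered = True
        for n in range(2, n_cells + 1):
            # AR(1) update keeps unit variance and correlation a**|n - m|
            noise = a * noise + innovation * rng.gauss(0.0, 1.0)
            current = n * spacing + sigma_over_L * noise
            if current <= previous:
                ordered = False
                break
            previous = current
        n_ok += int(ordered)
    return n_ok / n_embryos


if __name__ == "__main__":
    print("independent:", fraction_correctly_ordered(74, 0.01, 0.0, 2000))
    print("xi = 0.18 L:", fraction_correctly_ordered(74, 0.01, 0.18, 2000))
```

Comparing the two calls shows the qualitative point of this section: with independent errors essentially no embryo has all cells in order, while long-ranged correlations make correct ordering the typical outcome.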

IV. EXTRA INFORMATION FROM CORRELATIONS: EXPERIMENT
Taking the information gap seriously, we predict that the noise in positional signals should be correlated over distances $\xi \sim 0.2L$. These distances are long compared to the separation between neighboring cells. The first indication that such correlations exist came from experiments marking the boundaries of gene expression domains as seen through measurements of mRNA for selected gap genes and the pair-rule gene eve [17]. At the same time, it was reported that fluctuations in the concentration of a single gap gene protein are correlated only over short distances [24]. Analysis of simultaneous measurements of protein concentrations for four gap genes demonstrated that different combinations or modes of the network have different correlation lengths [18]; the longest correlation lengths are a significant fraction of the length of the embryo. Finally, early analyses showed that errors in relative position are smaller than errors in absolute position [11]. All of this suggests that the noise in positional signals is spatially correlated. Can we make this statement more quantitative?
We analyze the experiments in Ref [13], which used immunofluorescence staining to measure spatial profiles of protein concentration for three of the pair-rule genes, eve, prd, and rnt (Fig. 1). The data include $N_{\rm em} = 109$ embryos, fixed and stained in the time window from 35 to 60 min after the start of nuclear cycle 14. This is the period of cellularization, and as in previous work, the progress of the cellularization membrane provides a time marker with an accuracy of up to one minute [16]. For each of the three genes, the seven peaks in the striped concentration profile can be found automatically, and their locations vary linearly with time throughout this period [25]. If we don't correct for this systematic dynamical behavior, the variance of stripe positions will be large and their fluctuations will be correlated, artificially. We consider the noise in position to be the deviation from the best fit linear relation for each individual stripe marker. The standard deviations then are consistently slightly below $\sigma_x \sim 0.01L$, and the distribution of fluctuations is well approximated by a Gaussian. These results agree with previous work [11,13,25], and are summarized in Appendix A.
Before analyzing correlations, we can use these data to make a more precise estimate of the information gap.
If each cell has access to a positional signal with errors $\sigma_x(n)$ that might vary with $n$, the average positional information available to a single cell is
$$I_{\rm position} = \left\langle \log_2\left[\frac{L}{\sigma_x(n)\sqrt{2\pi e}}\right]\right\rangle_n , \tag{16}$$
where $\langle \cdots \rangle_n$ denotes an average over cells, generalizing Eq (6). Rather than making inferences about single cells, we have direct access to the signals that mark the locations of the stripes in the expression of three pair-rule genes, for a total of 21 features spread across half the AP axis. The mean separation between the nearest stripes is $\Delta x = 0.023L$, just a few times larger than the spacing between cells. Rather than introducing a model that would interpolate, we take the stripe positions themselves as the signals $x_n$, now with $n = 1, 2, \cdots, 21$, and the average in Eq (16) becomes an average over stripes. The challenge in evaluating the positional information is that random errors in our estimates of the errors $\sigma_x(n)$ become systematic errors in estimates of information. This problem of systematic errors was appreciated in the very first efforts to use information theoretic concepts to analyze biological experiments [26]. The analysis of neural codes has been an important testing ground for methods to address these errors [27][28][29]; for a review see Appendix A.8 of Ref [23]. The approach we take here uses the fact that naive entropy estimates depend systematically on the size of the sample; if we can detect this systematic dependence then we can extrapolate to infinite data, as described in Appendix A. The result is that $I_{\rm gap} = 1.39 \pm 0.08$ bits/cell.
The idea of positional information is that cells have access to a signal that represents position along the axis of the embryo [2,21]. In the discussion above we have taken this idea very seriously, identifying the signal in each cell as $x_n$. But the signals we observe are the positions of stripes in three different pair-rule genes, and the different stripes for each gene are controlled by different enhancers responding to distinct combinations of transcription factors. We need to test the hypothesis that these multidimensional molecular concentrations encode a single positional variable.
We are looking at fluctuations in the positions of the stripes, $\delta x_n$. Fig. 4 shows the elements of the correlation matrix as a function of the mean separation $\Delta x_{nm}$ between stripes $n$ and $m$. We see that, within experimental error, the correlations really are a function of distance. There is no obvious pattern linked to the identity of the enhancers that control these different features, or to the identity of the transcription factors to which the enhancers respond: nearby stripes are highly correlated, the decay of correlations with distance is the same whether we are looking at correlations between the same or different genes, and different pairs of stripes with the same mean separation have the same correlation. This suggests that, as in the theoretical discussion above, we can think about an abstract positional signal that is transmitted to each cell and controls the placement of the pair-rule stripes. Correspondingly, there are strong indications that the correlations are inherited from the structure of the noise in gap gene expression (Appendix C).
Qualitatively, the correlations that we see in Fig. 4 decay over distances $\xi \sim 0.15L$, consistent with the scale needed to close the information gap, and with early measurements [17]. Quantitatively, the decay of correlations is not well described by a single exponential function of distance, so we cannot simply transcribe the predictions of the theory. Instead, we would like to make a direct estimate of the positional information from the data. Conceptually this is simple: we estimate the correlation matrix from the data, then compute the (log) determinant of this matrix following Eq (9). As with the information gap itself (above), the problem is that random errors in our estimates of individual matrix elements become systematic errors in the entropy. We follow the same strategy of identifying the dependence of this error on the number of embryos that we include in our analysis and extrapolating to large data sets, as described in Appendix B.
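The "conceptually simple" step can be sketched as follows: estimate the correlation matrix of stripe fluctuations across embryos, then take $-\frac{1}{2}\log_2$ of its determinant. The data layout (`samples[e][n]` for stripe $n$ in embryo $e$) and the function names are assumptions made for illustration. This is the naive estimate only; it does not include the extrapolation in the number of embryos that is needed to remove the systematic error.

```python
import math


def correlation_matrix(samples):
    """Sample correlation matrix; samples[e][n] is stripe n in embryo e."""
    n_emb, n_str = len(samples), len(samples[0])
    means = [sum(row[n] for row in samples) / n_emb for n in range(n_str)]
    dev = [[row[n] - means[n] for n in range(n_str)] for row in samples]
    cov = [[sum(d[n] * d[m] for d in dev) / n_emb for m in range(n_str)]
           for n in range(n_str)]
    return [[cov[n][m] / math.sqrt(cov[n][n] * cov[m][m])
             for m in range(n_str)] for n in range(n_str)]


def log2_det(matrix):
    """log2 |det| by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    total = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return float("-inf")       # singular matrix
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log2(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return total


def naive_extra_information(samples):
    """Naive (finite-sample) estimate of Delta S = -0.5 * log2 det C, in bits."""
    return -0.5 * log2_det(correlation_matrix(samples))
```

For example, two stripes whose fluctuations have correlation coefficient $1/\sqrt{2}$ give $\det C = 1/2$ and hence $\Delta S = 0.5$ bits.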
By definition, to see the extra information hidden in correlations we have to look at the positions of multiple stripes. We start with two neighboring stripes, and gradually work out toward all twenty-one stripes. We see in Fig. 5 that beyond $N \sim 10$ stripes, the information per stripe reaches a plateau at $\Delta S/N = 1.51 \pm 0.08$ bits/stripe. This agrees, within experimental error, with our estimate of the information gap $I_{\rm gap} = 1.39 \pm 0.08$ bits/cell.

V. DISCUSSION
There is strong evidence that, early in embryonic development, each cell acquires a distinct identity [9]; it is less clear how this information is encoded. In the fruit fly embryo, positional information along the anterior-posterior axis is orchestrated through a sequential cascade involving three primary maternal inputs, a select number of gap genes, and the pair-rule genes. The conventional perspective suggests that the information flow through this cascade entails a gradual refinement, with noisy inputs ultimately generating a precise and reproducible pattern [30,31], in the spirit of the Waddington landscape [32].
In contrast to the picture of noisy inputs and precise outputs, at least one maternal input itself exhibits a high level of precision, consistently reproducible across embryos [24,33]. Moreover, the expression levels of gap genes within a single cell prove sufficient to determine positions with an error smaller than the distance between neighboring cells [11,13]. Notably, this precision agrees with that observed in downstream events such as the pair-rule stripes. In parallel, crucial developmental events exhibit highly reproducible temporal trajectories [34]. These quantitative observations challenge the conventional view of refinement and error correction, supporting instead a precisionist perspective in which locally available information is processed and preserved with near optimal efficiency. Given that all relevant molecules are present at low copy numbers, this places significant constraints on the architecture of the underlying networks [34][35][36][37].
Despite their precision, local signals in the fly embryo do not quite provide enough information to uniquely specify all $N = 74 \pm 5$ cellular identities along the AP axis, $I_{\rm unique} = \log_2 N$: errors in the position that a cell can infer from molecular concentrations come from a distribution, and distributions have tails [11]. The result is that there is a substantial (∼22%) gap between the information provided by the gap genes, or the pair-rule stripes, and $I_{\rm unique}$.
Previous measurements have characterized the noise in local estimates of position for each cell individually. But there are many hints from previous work that this noise is correlated [11,17,18]. Extra information can be hiding in these correlations, and we have seen in §III that if correlations extend over distances $\xi \sim 0.15L$ then this would be enough to close the information gap. This prompts a more detailed examination of the noise correlations, which really do seem to be a function of distance independent of gene identity (Fig. 4).
The perhaps surprising conclusion of §IV is that the extra information contained in the correlations, $\Delta S/N$, matches the information gap $I_{\rm gap}$ to within a few percent of $I_{\rm unique}$, with the remaining difference essentially equal to our error bars:
$$\frac{\Delta S/N - I_{\rm gap}}{I_{\rm unique}} \approx \frac{1.51 - 1.39}{\log_2 74} \approx 0.02 .$$
This agreement supports, strongly, the precisionist view of information flow in this system. Historically, the lack of precise data on gene expression levels, with uncertainties extending to factors of two, led to skepticism regarding the relevance of more refined measurements to general mechanisms of genetic control. These expectations stood in contrast, for example, to our understanding of signaling in rod photoreceptors, where the quantitative reproducibility of responses to single molecular events provides important constraints on the underlying biochemical mechanisms [38].
The fly embryo has provided a laboratory within which to explore precision vs. noisiness in the function of an intact living system. We have seen reproducible protein and mRNA concentrations across embryos with an accuracy of 10% [16,24,33], and these concentrations encode position with an accuracy of ∼1% of the embryo's length [11,13,21]. The current study adds a layer to this understanding, demonstrating that the available positional information, including the subtle effects of correlated noise, matches the threshold for specifying unique cellular identities, and this match itself has an accuracy of just a few percent. Beyond the fly embryo, these results suggest a more general conclusion: quantitative measurements in living systems merit serious consideration, even at high precision, as in other areas of physics.

Appendix A
[...] tagged antibodies against those antibodies [13]. Independent experiments demonstrate that these classical staining methods, used carefully, yield fluorescence intensities that are linear in protein concentrations [16]. The data set used here, which contains a large number of wild type embryos, comes from Ref [13].
We briefly summarize the imaging protocol and describe the procedure for localizing the stripe positions. Images are taken in the midsagittal plane, showing a row of nuclei along the dorsal and ventral sides of the embryo. For consistency and to avoid geometric distortion, we focus on the dorsal profiles, as was done previously. In order to include the entire embryo in a single image, large field-of-view images with pixel size 445 nm are acquired with a 20× 0.7 NA objective on a Leica SP5 confocal microscope. Fluorescence intensity is averaged inside a sliding window of the size of a nucleus and the position of the window center is recorded. In a given embryo, positions of the 7 stripes are first roughly identified by finding local maxima in the profile of that embryo. To make this quantitative, we tried several methods. First, we used an iterative procedure in which the mean peak shape is used as a template [25]. Second, we fitted a model of seven Gaussians with variable amplitudes and widths to the entire profile. Finally, we fitted individual Gaussians to each stripe, using a window centered on the local maximum with a width of 5% of the embryo length. These methods give consistent results, and importantly global fits do not generate larger correlations than local fits. In the end we use the local Gaussian fits, as in Fig. A1A.
The age of embryos is estimated to 1 minute precision in nuclear cycle 14 by measuring the length of the cellularization membrane [11]. At 30 min into this cycle, the stripes of prd first start to become visible and the other two genes have well-defined stripes by that time, so we confine our attention to $t > 30$ min.
Stripe patterns are dynamic, with positions that depend on time. If we don't take account of this systematic variation, then across an ensemble of embryos with different ages we would see artificial correlations among fluctuations in stripe position. Stripe movement is small, however, and we can use a linear fit (separately for each of the 21 stripes) across the population of embryos,
$$x_n(t) = x_n(t_0) + s_n\,(t - t_0) . \tag{A1}$$
Results are shown in Fig. A1B and C. For each embryo we find an equivalent position of all the stripes at a reference time $t_0 = 45$ min [25]. With $x_n$ the positions of each pair-rule stripe, we have the mean and variance
$$\bar{x}_n = \langle x_n \rangle , \tag{A2}$$
$$\sigma_x^2(n) = \langle (x_n - \bar{x}_n)^2 \rangle , \tag{A3}$$
where $\langle \cdots \rangle$ denotes an average over our complete experimental ensemble of $N_{\rm em} = 109$ embryos. Results are shown in Fig. A1D, where we confirm that positional errors are almost all smaller than 1% of the embryo length.
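The time correction described above can be sketched in a few lines. The example assumes, for one stripe, a list of embryo ages and the measured stripe position in each embryo; it fits a straight line by ordinary least squares and shifts every measurement to the reference time $t_0 = 45$ min. Function names and the data layout are ours, for illustration only.

```python
import math


def fit_line(times, positions):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(times)
    t_mean = sum(times) / n
    x_mean = sum(positions) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    s_tx = sum((t - t_mean) * (x - x_mean) for t, x in zip(times, positions))
    slope = s_tx / s_tt
    return slope, x_mean - slope * t_mean


def correct_to_reference_time(times, positions, t0=45.0):
    """Remove the fitted linear drift, giving equivalent positions at t0."""
    slope, _ = fit_line(times, positions)
    return [x - slope * (t - t0) for t, x in zip(times, positions)]


def positional_error(corrected_positions):
    """Standard deviation of the time-corrected positions across embryos."""
    n = len(corrected_positions)
    mean = sum(corrected_positions) / n
    return math.sqrt(sum((x - mean) ** 2 for x in corrected_positions) / n)
```

Applied separately to each of the 21 stripes, the residual scatter of the corrected positions is what plays the role of $\sigma_x(n)$.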
Beyond measuring the variance, we can estimate the distribution of positional errors. Since the different stripes have slightly different $\sigma_x$, we normalize the positional errors for each stripe individually,
$$z_n = \frac{x_n - \bar{x}_n}{\sigma_x(n)} . \tag{A4}$$
With this normalization we can pool across all 21 stripes, and we estimate the distribution of $z$ as usual by making bins and counting the number of examples in each bin, with results shown at left in Fig. A2. Qualitatively the distribution is close to being Gaussian, but what matters for our analysis is the entropy of this distribution.
When we estimate a probability distribution and use this estimate to compute the entropy, the random errors in the distribution that arise from the finiteness of our sample become systematic errors in the entropy. The general version of this problem goes back to the very first efforts to use information theoretic concepts to analyze biological experiments [26]; for a review see Appendix A.8 of Ref [23]. Briefly, naive entropy estimates depend systematically on the size of the sample, and if we can detect this systematic dependence we can extrapolate to infinite data. At right in Fig. A2 we show the difference between the entropy of the estimated distribution $P(z)$ and the entropy of a Gaussian. We see that when we base our estimates on $N_{\rm em}$ embryos there is a term $\sim 1/N_{\rm em}$. Extrapolating $N_{\rm em} \to \infty$ we see that the entropy difference goes to zero within the small (< 0.01 bit) error bars. We conclude that, for the purposes of our discussion, it is safe to approximate the positional errors as being Gaussian.
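The extrapolation step itself is a straight-line fit in $1/N_{\rm em}$. The sketch below assumes that one already has estimates of some quantity (an entropy or an information) computed from subsets of different sizes; it fits estimate $= a + b/N_{\rm em}$ by least squares and returns the intercept $a$ as the infinite-data value. The function name is ours, and a purely linear dependence on $1/N_{\rm em}$ is an assumption that should be checked against the data.

```python
def extrapolate_to_infinite_data(sample_sizes, estimates):
    """Fit estimates = a + b / sample_size; return (a, b).

    The intercept a is the extrapolation to infinite data.
    """
    inv = [1.0 / n for n in sample_sizes]
    k = len(inv)
    u_mean = sum(inv) / k
    e_mean = sum(estimates) / k
    s_uu = sum((u - u_mean) ** 2 for u in inv)
    s_ue = sum((u - u_mean) * (e - e_mean) for u, e in zip(inv, estimates))
    slope = s_ue / s_uu
    intercept = e_mean - slope * u_mean
    return intercept, slope


if __name__ == "__main__":
    # Synthetic illustration only: values constructed to lie on a line.
    sizes = [20, 40, 80, 160]
    values = [1.0 + 3.0 / n for n in sizes]
    print(extrapolate_to_infinite_data(sizes, values))
```

In practice each point would be an average over many random subsets of the embryos, with the scatter across subsets providing the error bars.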
Finally we can use the same extrapolation methods to provide a better estimate of the "information gap" defined in the main text. Equation (16) defines the positional information contained in the local signals, $I_{\rm position}$, and the information gap is the difference between this and $I_{\rm unique} = \log_2 N$,
$$I_{\rm gap} = \log_2 N - \left\langle \log_2\left[\frac{L}{\sigma_x(n)\sqrt{2\pi e}}\right]\right\rangle_n . \tag{A5}$$
Fig. A2 shows the values of $I_{\rm gap}$ estimated from fractions of our data set and then extrapolated. The result is $I_{\rm gap} = 1.39 \pm 0.08$ bits (Fig. A2).

Appendix B: Entropy estimates
Fig. A3 shows estimates of the extra information $\Delta S/N$ [Eq (9)] based on measurements in different numbers of embryos, for $N = 10$ and $N = 20$ contiguous pair-rule stripes. We see the expected dependence on $1/N_{\rm em}$, and the steepness of this dependence is twice as large at $N = 20$ as at $N = 10$. This gives us confidence in the extrapolation $N_{\rm em} \to \infty$ [23,26-29].

Appendix C: Origin of the correlations
The precision of pair-rule stripe placement matches, quantitatively, the noise in optimal estimates of position based on the local expression levels of the gap genes [11,13]. To be consistent with this result, the correlations should also be visible in the gap genes. As noted above, Lott and colleagues saw correlations in expression boundaries for selected gap genes [17], and later measurements showed that combinations of gap gene expression levels have correlations extending over a significant fraction of the embryo [18]. Here we revisit these measurements and connect fluctuations in gap gene expression to positional noise. Notice that for the pair-rule genes we can work directly with the positions of the stripes, but for the gap genes we have to think more carefully about how positions are encoded in expression levels.
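We start with a brief review of ideas about decoding positional information [13]. Measurements of gap gene expression in multiple embryos provide samples from the conditional distribution $P(\{g_i\}|x)$, at all values of the position $x$ along the anterior-posterior axis; we focus on the $d = 4$ gap genes expressed in the middle ∼80% of the embryo, hunchback, giant, krüppel, and knirps. To a good approximation this distribution is Gaussian,
$$P(\{g_i\}|x) = \frac{1}{\sqrt{(2\pi)^d \det\Sigma(x)}}\exp\left[-\frac{1}{2}\sum_{i,j=1}^{d}\left(g_i - \bar{g}_i(x)\right)\left[\Sigma^{-1}(x)\right]_{ij}\left(g_j - \bar{g}_j(x)\right)\right] , \tag{C1}$$
where $\bar{g}_i(x)$ is the mean expression level of gene $i$ at position $x$ and
$$\Sigma_{ij}(x) = \left\langle \left(g_i - \bar{g}_i(x)\right)\left(g_j - \bar{g}_j(x)\right)\right\rangle$$
is the covariance matrix of fluctuations around these means. To decode the position of a cell from the local expression levels we need to construct
$$P(x|\{g_i\}) = \frac{P(\{g_i\}|x)\,P(x)}{P(\{g_i\})} .$$
But because nuclei are arrayed uniformly along the length of the embryo, $P(x)$ is uniform and hence the dependence on $x$ is captured in Eq (C1). A cell at the actual position $x_{\rm true}$ has expression levels
$$g_i = \bar{g}_i(x_{\rm true}) + \delta g_i ,$$
and, to lowest order in the noise, maximizing Eq (C1) over $x$ maps these fluctuations into a decoding error
$$\delta x = \sigma_x^2 \sum_{i,j=1}^{d}\frac{d\bar{g}_i}{dx}\left[\Sigma^{-1}\right]_{ij}\delta g_j , \tag{C9}$$
where the variance of positional noise is defined by
$$\frac{1}{\sigma_x^2} = \sum_{i,j=1}^{d}\frac{d\bar{g}_i}{dx}\left[\Sigma^{-1}\right]_{ij}\frac{d\bar{g}_j}{dx} .$$
Previous work has emphasized the scale of positional errors $\sigma_x$ [11,13,21]. But the optimal decoding of gap gene expression levels [13] maps the deviation of expression levels from the mean into a decoding error for each embryo individually, as in Eq (C9). An example is in Fig. A4, where the small fluctuations of expression levels around the mean (A) translate into proportionally small errors $\delta x$ (B).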
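The mapping from expression fluctuations to a positional error can be illustrated with a small linear-algebra sketch. It assumes the small-noise limit of the Gaussian model, in which the decoding error is $\delta x = \sigma_x^2 \sum_{ij} \bar{g}_i'\,[\Sigma^{-1}]_{ij}\,\delta g_j$ with $1/\sigma_x^2 = \sum_{ij}\bar{g}_i'\,[\Sigma^{-1}]_{ij}\,\bar{g}_j'$, where $\bar{g}_i'$ is the local slope of the mean profile. The function names and the toy numbers are ours; this is not the decoding pipeline of Ref [13].

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def decoding_error(mean_slopes, covariance, delta_g):
    """Return (delta_x, sigma_x) in the small-noise limit.

    mean_slopes[i] = d gbar_i / dx at the cell's position,
    covariance     = symmetric covariance matrix of expression noise,
    delta_g[i]     = deviation of gene i from its mean in this embryo.
    """
    u = solve_linear_system(covariance, mean_slopes)   # Sigma^{-1} gbar'
    sigma_x_sq = 1.0 / sum(s * ui for s, ui in zip(mean_slopes, u))
    # covariance is symmetric, so gbar'^T Sigma^{-1} delta_g = u . delta_g
    delta_x = sigma_x_sq * sum(ui * dg for ui, dg in zip(u, delta_g))
    return delta_x, math.sqrt(sigma_x_sq)
```

A useful sanity check is that if all expression levels deviate exactly as they would under a small displacement $\epsilon$ of the cell, $\delta g_i = \epsilon\,\bar{g}_i'$, then the decoder returns $\delta x = \epsilon$.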
For each embryo $\alpha$ we can take the positional errors $\delta x_\alpha(x)$ and compute the correlation function
$$C_\alpha(\Delta x) = \frac{1}{L - \Delta x}\int dx\, \delta x_\alpha(x)\,\delta x_\alpha(x + \Delta x) . \tag{C12}$$
Fig. A4C shows the mean and standard error of the normalized correlation function across all $N_{\rm em} = 38$ embryos in our experimental ensemble. Qualitatively, correlations in the positional noise encoded by the gap genes extend over distances similar to the correlation in positional noise of the pair-rule stripes (Fig. 4). Quantitatively, the gap gene correlations include an additional component with a short correlation length. One possibility is that this component is averaged away by interactions among neighboring cells during expression of the pair-rule stripes. Another possibility is that a modest fraction of the noise in gap gene expression reflects local noise in the measurements, as discussed previously [16]; this measurement noise has only a small impact on our estimates of the effective noise $\sigma_x$ but a larger impact on the shape of the correlation function. It seems likely that both effects contribute. Nonetheless, it is clear that relatively long ranged correlations, which are crucial to closing the information gap, are present already in the gap gene expression levels, as suggested in earlier work [11,17,18]. New experiments will be needed to give a reliable estimate of the information that is encoded in these correlations.
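On sampled data the correlation function becomes an average over all pairs of points at a given lag. The sketch below assumes that the positional error in one embryo is available on a uniform grid of positions, so that a lag of $k$ grid points corresponds to $\Delta x = k$ times the grid spacing; the function names are ours.

```python
def correlation_function(delta_x, max_lag):
    """Average of delta_x[i] * delta_x[i + lag] over all valid i, per lag.

    delta_x is the decoding error sampled on a uniform grid in one embryo;
    max_lag must be smaller than len(delta_x).
    """
    n = len(delta_x)
    result = []
    for lag in range(max_lag + 1):
        products = [delta_x[i] * delta_x[i + lag] for i in range(n - lag)]
        result.append(sum(products) / len(products))
    return result


def normalized_correlation_function(delta_x, max_lag):
    """Correlation function divided by its zero-lag value."""
    c = correlation_function(delta_x, max_lag)
    return [value / c[0] for value in c]
```

Averaging the normalized function over embryos, and taking the standard error across embryos, gives curves of the kind described for Fig. A4C.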

FIG. 1: Segmented Drosophila body plan. (A) Brightfield color image of a 5 mm long 3rd instar larva of the fruit fly Drosophila melanogaster [10] with clearly visible segments. (B) An optical section through an embryo stained for three of the "pair-rule" proteins, 50 min into nuclear cycle 14 (∼3 h after oviposition), showing striped patterns that align with the body segments; data from Ref [13]. (C) As in (B), from multiple embryos, illustrating the pattern reproducibility. Time in nuclear cycle 14 indicated at bottom right of each profile.

FIG. 3: Extra information from correlations, as a function of the correlation length. (A) Numerical results for $N = 50$ and $N = 100$ from Eq (9) with the correlation matrix in Eq (10); analytic results for $N \to \infty$ from Eq (15). Compare with the information gap from Appendix A (solid black line bracketed by dashed error bars). (B) Probability of at least two signals being "crossed," $x_{n+1} < x_n$, in a line of $N = 74$ cells, with $\sigma_x/L = 0.01$.

FIG. 4: Correlations between noise in peak positions of the eve, rnt, and prd stripe patterns, as a function of the mean separation between stripes. Error bars estimated from the standard deviation across random halves of the data. With three genes, each having seven stripes, we observe (21 × 20)/2 = 210 distinct elements of the correlation matrix $C_{nm}$. Solid red line is a smooth curve to guide the eye.

FIG. 5: Extra information in correlations per cell, $\Delta S/N$, computed from the observed correlations in pair-rule stripe fluctuations $C_{nm}$, including different numbers of contiguous stripes. Circles and error bars (blue) are the extrapolated estimates from Appendix B. Beyond $N \sim 14$ stripes there is a plateau $\Delta S/N = 1.51 \pm 0.08$ bits/cell, bracketed by the dashed lines. Square and error bars (red) are the best estimate of the information gap $I_{\rm gap} = 1.39 \pm 0.08$ bits/cell from Appendix A.
FIG. A1: Pair-rule stripe positions. (A) Concentration of Eve protein in a single embryo. Colored circles indicate regions that were fitted with a Gaussian function to calculate the stripe position. Each stripe is fitted individually, with fits shown in red. Red triangles indicate centers of each fitted peak. (B) Stripe positions as a function of time in nuclear cycle 14. Linear fits from Eq (A1) are shown as black lines. (C) Peak positions $x_n(t_0)$ corrected to $t_0 = 45$ min. (D) Positional error of the pair-rule stripes. Magnitude of the error $\sigma_x(n)$ is plotted against the mean position $x_n$ for each of the eve, prd, and rnt stripes. Errors in $x_n$ are standard errors of the mean; errors in $\sigma_x$ are standard deviations across random halves of the data. Dashed line marks the rough estimate $\sigma_x/L \sim 0.01$.
FIG. A2: (A) Positional errors are well approximated as Gaussian. An estimate of the distribution of normalized errors, Eq (A4). Open circles are means pooled across all stripes and embryos; error bars are standard deviations across random halves of the embryos; and the line is the Gaussian with zero mean and unit variance. (B) The entropy difference between this estimated distribution and the Gaussian, as a function of the (inverse) number of embryos we include in our analysis. Points (cyan) are examples from random choices out of the full ensemble of embryos; open circles with error bars are the mean and standard deviations of these points; and the line is a linear extrapolation [23,26-29]. (C) Estimates of the information gap, Eq (A5). Points (cyan) are examples from random choices out of the full ensemble of embryos; open circles (blue) with error bars are the mean and standard deviations of these points; and the line is a linear extrapolation to $I_{\rm gap} = 1.39 \pm 0.08$ bits.
FIG. A3: Entropy reduction by correlations among the pair-rule stripe fluctuations, estimated from different numbers of embryos $N_{\rm em}$; $N = 10$ stripes at left and $N = 20$ stripes at right. Points (cyan) are examples from random choices out of the full ensemble of embryos; open circles (blue) with error bars are the mean and standard deviations of these points; and the line is a linear extrapolation to the square.