Evolutionary adaptation is facilitated by the presence of lethal genotypes

The adaptation rate in theoretical models of biological evolution increases with the mutation rate but only to a point when mutations into lethal states cause extinction. One would expect that removing such states should be beneficial for evolution. We show here that, counter-intuitively, lethal mutations speed up adaptation on rugged fitness landscapes with many fitness maxima and minima, if strong competition for resources exist. We consider a modified stochastic version of the quasispecies model with two types of genotypes, viable and lethal, and show that increasing the rate of lethal mutations decreases the time to evolve the best-fit genotype. This can be explained by an increased rate of crossing fitness valleys, facilitated by reduced selection against less-fit variants.

Biological evolution is often thought to be a slow process occurring over timescales of 10k-100ks years [1].However, ecological speciation which causes a species to split into non-interbreeding sub-populations can occur in just tens of generations in animals and plants [2], and microevolution -a significant change in allele (gene variants) frequencies can occur over a few hours in microbes exposed to antibiotics [3][4][5][6].
The rate of biological evolution is affected, among others, by the mutation rate [7], the structure of the fitness landscape [8][9][10][11][12], its temporal dynamics [13][14][15][16], and spatial structure of the population [17][18][19].Here we focus on its dependence on the fitness landscape (FL) and the mutation rate, which has been studied in the quasispecies model [20].It has been shown that the time it takes biological evolution to reach the best-adapted genotype decreases with increasing mutation rate [21][22][23][24].From the physics viewpoint, this is expected because the mutation rate plays a similar role to the temperature in a quantum-spin system with an energy landscape given by the inverted fitness landscape [25].However, if the mutation rate is too high, the population becomes delocalized ("error catastrophe") and may even die out if the fitness landscape has a sufficient density of genotypes with null or negative fitness [26,27].In real fitness landscapes, such genotypes make up a substantial fraction of all genotypes [28].It is therefore not possible to make the mutation rate arbitrarily large and, consequently, there is a limit on how fast evolution can be on a given fitness landscape.
In some situations such as directed evolution of proteins [29,30], one would however wish to increase the rate of evolution.Since the presence of lethal genotypes is a limiting factor, one could think that reducing the fraction of such genotypes, e.g., by changing experimen-tal conditions, should be beneficial.
To explore how the presence of non-viable (lethal) genotypes affects the rate of evolution, we consider a stochastic version of the quasispecies model on a fitness landscape with viable and non-viable genotypes.In contrast to previous works on the quasispecies model, we divide up the genotype space into two regions: one for genes relevant for adaptation to a new environment, and another one for housekeeping genes crucial for metabolism, DNA replication, etc., which have already been highly optimized and therefore any mutation in them is likely to be lethal.This scenario is biologically more realistic than assigning zero fitness to random genotypes in the landscape.
To be specific, we consider a population of organisms replicating and dying stochastically.Each organism has a genotype i = 0, . . ., 2 L − 1 represented by a binary sequence of length L. The state of the system is described by a vector {n i }, where n i is the number of organisms of type i.An organism of type i replicates with rate f i and dies with rate N/K, where N = i n i , and K is the (soft) carrying capacity of the system.Upon replication, the organism either produces a copy of itself or a mutant.The mutant can be either viable, or non-viable (lethal).We assume that faithful replication occurs with probability 1 − µ − γ, a viable mutant is generated with probability µ, and a lethal mutant with probability γ.A lethal mutant is instantaneously removed from the system.The genotype of a viable mutant is obtained by inverting a randomly selected letter (0 or 1) of the binary representation of i.The genotype space is therefore an L-dimensional hypercube, with an additional node representing the lethal state connected to all other genotypes (see Fig. 1A).
The replication rates {f i } are drawn independently from the uniform distribution on [0, 1), except for f 0 = 0.5 and f 2 L −1 = 1 which we fix so that i = 2 L − 1 is always the fittest genotype, and the initial genotype has intermediate fitness.The system is initialized with K organisms of genotype i = 0 and simulated until at least one organism of the fittest genotype i = 2 L − 1 emerges.
For γ = 0, the model is essentially the stochastic quasispecies model with a maximally-rugged fitness landscape [31].For γ > 0, the fitness landscape contains a fraction γ/(µ + γ) of lethal genotypes.By construction, µ + γ must be smaller than one since it is the total mutation probability.We cannot therefore make either µ or γ too large.
We are interested in the large-population/largemutation rate regime because we want evolution to be fast.In this limit, low-fitness variants do not fix in the population, and therefore the rate of evolution decreases monotonously with population size [32].Since the presence of lethal genotypes reduces the population size, evolution is expected to be slower for γ > 0. We shall see that this expectation is not always correct, and that lethal genotypes, while not participating in the evolutionary process, have a significant effect on the rate of evolution.
To compare the model with and without lethal genotypes, we simulated 1000 copies of the model for K = 1000, µ = 0.04, L = 3, . . ., 8 and a range of γ = 0, . . ., 0.96, and measured the time T it took for a single organism of best-adapted genotype to evolve.We used the same sequence of randomly generated FLs for each γ. Figure 1B shows that the average adaptation time increases exponentially with L. However, the increase is slower in the model with γ > 0. In fact, for L > 4, the presence of the lethal genotype speeds up evolution.For L = 8 and γ = 0.75, evolution is three orders of magnitude faster than for γ = 0.
This behaviour is counter-intuitive since the total number of organisms in the system is significantly reduced for γ > 0; for example, for γ = 0.75 the population reaches only about 20% of the maximum capacity.Although the per-capita birth rate does not depend on population size, the average rate with which mutants are generated is lower because there are fewer births.Moreover, the effect shows up only on large-enough fitness landscapes; for L < 4, non-lethal model is actually faster.The lethal-genotype model is slower for any L also on a flat landscape (all fitnesses the same, results not shown), on which there is only genetic drift and no selection (except for non-lethality).The effect is also unrelated to the increased probability of extinction in the lethal model, since extinction becomes likely only for γ very close to 1 (Fig. 1C); in this limit Fig. 1B shows that T increases slightly with γ.
In what follows we shall focus on two cases: γ = 0 ("non-lethal") and γ = 0.75 ("lethal"), for which the extinction probability is effectively zero for K = 1000.The choice of µ = 0.04, γ = 0.75 may be interpreted as about 1 in 20 mutations being viable, the rest being lethal.Figure (1D) shows the distribution of adaptation times for L = 7.While the distributions of the adaptation time are very broad for both models (as expected due to the stochastic nature of the evolutionary process), we notice that the non-lethal model has a distribution skewed to the right, which contributes to the much longer average adaptation time.
To understand whether the two models differ only for certain fitness landscapes or all landscapes, we set µ = 0.04, and γ = 0 (no lethal genotypes) or γ = 0.75 (lethal genotypes present), and run the models 20 times on 1000 random landscapes (identical for both models).We then looked at evolutionary pathways in both models for 100 landscapes with largest/smallest differences in the adaptation time.We shall call such landscapes "slow" and "fast", respectively.Figure 2A shows an example pathway in the non-lethal model on a "slow" landscape.The first mutation increased fitness to approx.the same value as the fitness of the final genotype, but subsequent mutations led through a fitness valley.This is typical for "slow" landscapes (Fig. 2B).In contrast, evolutionary pathways on "fast" landscapes usually involve a more gradual fitness increase, with fewer and shallower fitness valleys (Fig. 2C).Interestingly, the average fitness profile along the evolutionary trajectory looks very similar for the lethal and non-lethal model (Fig. 2D).This suggests that the evolutionary speed-up provided by lethal genotypes does not rely on  following different pathways, but rather on the enhanced rate of crossing of fitness valleys.
To test this hypothesis, we investigated a simpler model with a one-dimensional fitness landscape (Fig. 3A).Mutations can only change genotype i to i ± 1 with probability µ/2 in either direction, and the fitness values are a flat fitness valley of depth δ separates the initial and the final genotype.This is meant to represent a single evolutionary pathway from Fig. 2 and is a special case of a more generic problem considered in Ref. [33], which significantly simplifies mathematical analysis.Numerical simulations confirmed that indeed the lethal-genotype model leads to much faster adaptation time on this fitness landscape (Fig. 3B).
To understand this, we considered a deterministic counterpart of the stochastic model.Neglecting stochastic fluctuations, the abundances {n i } of organisms evolve according to the following set of equations: where N (t) = i n i (t).Assume for a while that the final state has fitness zero, so it acts as an absorbing boundary.The system of equations admits then a steady state solution, with abundances determined by the nonlinear set of equations, with N = i n i .These equations will also correctly describe the quasi-stationary distribution for the case of non-zero fitness at i = L, as long as the transition rate from i = L − 1 to i = L (proportional to µ) is small.For small µ, we expect n i+1 n i , which enables us to write This means that the total population size N will be dominated by n 0 .Inserting N = n 0 into Eq.( 1) we obtain that and hence, for i > 0, and finally The above equation shows that (i) n i decreases exponentially with i, (ii) the rate of decrease is smaller for non-zero γ.This means that the abundance of genotype L−1 preceding the best-adapted genotype increases with increasing γ, even though the total population size decreases, as long as L is sufficiently large, γ is not too large, and our approximations remain valid.If the adaptation time T is now limited by the last mutational step of going from i = L − 1 to i = L, we can write that Indeed, it turns out that the above formula agrees very well with computer simulations of the 1d model (Fig. 3B), which supports our hypothesis that long fitness valleys slow down evolution in the full model, and that the presence of lethal genotypes facilitates valley crossing by increasing the abundance of less-fit but viable genotypes.
It is interesting to compare this result with the observation of Ref. [34] that decreased turnover increases the fixation probability of a mutation with a fixed selective advantage, thus decreasing the time to fixation if mutants occurs spontaneously and with a small rate.In our model, turnover, defined as the sum of per-capita birth and death rates, is indeed reduced for γ > 0: for strain i, per-capita turnover is f i + N/K, with N decreasing with increasing γ.However, there is no fixation of valley genotypes in our model, so the result of Ref. [34] does not directly apply.
We can now use the analytic expression (9) to make predictions for the full model with 2 L genotypes.First, we notice (Fig. 2D) that for a fixed fitness landscape with a sufficiently deep and wide valley, evolution quickly reaches a local fitness maximum one mutation away from the initial genotype.To use Eq. ( 9) we must find the distribution p l of fitness valley length l.We can then calculate the ratio of the adaptation times for the model with and without lethal genotypes as follows: To find {p l } numerically, we generated 10,000 random FLs, and found all fitness valleys, defined as consecutive runs of genotypes with monotonously decreasing fitness.Specifically, for each FL we found all such valleys along trajectories starting at the single-mutated genotype with maximum fitness and ending at the best adapted genotype.For each trajectory, we found the length (the number of links along which fitness decreases) of the shortest valley.We then took the minimum of all maximum lengths for each trajectory, and repeated this process for all trajectories and FLs.This gave the distribution of the minimum-length valley that evolution might encounter in our model.Such valleys would form a bottleneck limiting the rate of evolution on each specific FL.
Figure 4A shows the distribution of fitness valley lengths for different sizes L of the FL.Using the numerically obtained p l (L), we calculated the ratio (10) for different L, assuming δ = 0.3 (average depth of fitness valley from Fig. (2)).Figure 4B shows that this estimate approximately agrees with the ratio obtained from the adaptation times from Fig. 1.
Interestingly, an even simpler approach amounting to replacing L → L − 1 in Eq. ( 9) and calculating the ratio of adaptation times as qualitatively reproduces the data (Fig. 4B).This is because, as seen in Fig. 4A, a fitness landscape for binary sequences of length L always has a non-zero probability of having a fitness valley of length L − 2. These longest valleys dominate the adaptation time due to its exponential dependence on the fitness valley length.Even though the probability p L−2 decreases exponentially with L as ∼ exp(−0.5L)(Fig. 4A), the rate of decrease is not sufficient to overcome the exponential increase of T l (γ) with rate 2δ(1 − γ − µ)/(µ(1 − δ)) as long as µ is sufficiently small.However, due to the presence of shorter fitness valleys in many landscapes (Fig. 4A), the lethal-genotype model will perform better only on a subset of FLs which do not have such short valleys.
To summarize, we have observed a significant reduction of the adaptation time in the presence of lethal mutations.The effect is not caused by a reduced number of evolutionary accessible paths; unlike in the case of "holey landscapes" [35,36], lethal genotypes in our model are external to the hypercube fitness landscape and increasing γ does not affect the structure of the FL.The effect is also not due to evolution avoiding fitter genotypes for the sake of less-adapted ones, or vice-versa; we have seen that evolutionary trajectories are similar for both models.Rather, the presence of lethal genotypes decreases the effective depth of fitness valleys by reducing selection against less-fit genotypes.This increases the abundance of valley genotypes and hence the rate with which bestadapted mutants are generated.
It would be interesting to see if our conclusions hold for fitness landscapes more realistic than the maximallyrandom ("House-of-Cards") FL used here [37], fitness landscapes that evolve over time, and non logistic-growth like growth models, in particular for slowly expanding populations such as bacterial colonies [38].Moreover, in the sequential fixation regime (low mutation regime, small fitness differences) evolution can be faster for smaller population sizes [32], which would be of interest to check as well.Our model could be also extended to the case of "dormant states" which do not replicate but can be transformed back into viable states in a way similar to what happens in "gene bank" models [39].
Data and code.The software required to reproduce the results presented here can be found at https://github.com/Dioscuri-Centre/quasilethal.

Figure 1 :
Figure 1: (A) The genotype space of the model for L = 3. Green = viable states, red = lethal state.Lines represent mutations.(B) Mean adaptation time T decreases with increasing probability γ of lethal mutations, for L ≥ 5. Colors = different L. (C) The extinction probability Pext is low even for γ ≈ 0.99.(D) The distribution of adaptation time P (T ), for γ = 0 and γ = 0.75.L = 7 in both cases.

Figure 3 :
Figure 3: Evolution in a one-dimensional model with a wide fitness valley.(A) Schematic representation of the fitness landscape.(B) Time to reach genotype i = L. Points = simulation, line = theoretical predictions from Eq. (9): L = 3 (blue), L = 4 (yellow).

Figure 4 :
Figure 4: The linear-FL model predicts the adaptation time in the full model.(A) Probability of finding a fitness valley of length l, for different L. See the main text for the definition of l. (B) The ratio of adaptation times for the model with γ = 0.75 and γ = 0, for different L. Points = results from Fig. 1, black line = Eq.(10), blue line = Eq.(11).