Number of attractors in the critical Kauffman model is exponential

The Kauffman model is the archetypal model of genetic computation. It highlights the importance of criticality, at which many biological systems seem poised. In a series of advances, researchers have homed in on how the number of attractors in the critical regime grows with network size. But a definitive answer has proved elusive. We prove that, for the critical Kauffman model with connectivity one, the number of attractors grows at least, and at most, as $(2/\!\sqrt{e})^N$. This is the first proof that the number of attractors in a critical Kauffman model grows exponentially.


Introduction
The Kauffman model is a discrete dynamical system on a random directed graph of N nodes, each with K inputs but any number of outputs [1,2]. The state of each node is a Boolean function of the states of its neighbors, and all nodes are updated simultaneously. Despite its simplicity, the system captures essential features of many physical systems, such as gene regulation [3,4], chemical reaction networks [5], and economic networks [6]. It is particularly useful as a null model, whereby its baseline behavior can be contrasted against networks that have specific structure.
The long-term behavior of the Kauffman network falls into two regimes [7,8]. In the frozen regime, perturbations to the initial state die out and attractor lengths do not grow with system size. In the chaotic regime, perturbations can grow exponentially and attractor lengths grow with system size. The two regimes are separated by a critical boundary at which a perturbation to one node spreads to, on average, one other node [8-10]. The properties of this critical region are interesting to scientists across a broad range of fields [11,12].
For the critical K = 2 model, early computational evidence [2] indicated that the mean number of attractors scaled as $\sqrt{N}$, where N is the size of the network. But complete enumeration up to N = 32 suggested linear scaling [13]. Socolar and Kauffman found evidence for faster-than-linear scaling [14], noting that the correct scaling only emerges for large systems. Through a combination of simulation and arguments about modular structure, Bastolla and Parisi suggested a stretched exponential [15,16]. Then, in an analytical tour de force, Samuelsson and Troein [17] proved that the number of attractors grows faster than any power law.
The elusive scaling for the K = 2 model motivated interest in the scaling for K = 1. This model has a beautifully simple structure that makes it amenable to analytical work. The network is composed of loops and trees branching off the loops, as shown in Fig. 1. Because the nodes in the trees are determined by the nodes in loops, they do not contribute to the number or length of attractors, which are set solely by the m nodes in the loops. Every node is assigned one of four Boolean functions: on, off, copy or invert. However, the critical version of the problem requires that all Boolean functions in the loops be copy or invert [18-21].
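That the attractors are set solely by the loop nodes is easy to see in code. The sketch below is our own illustration (the helper names are ours, not the paper's): it brute-forces the attractors of a single loop in which every Boolean function is copy. Such dynamics simply rotate the loop's state each step, so the attractors of a 4-node loop are the 6 binary necklaces of length 4.

```python
from itertools import product

def step(state):
    # each node copies its predecessor around the loop
    n = len(state)
    return tuple(state[(i - 1) % n] for i in range(n))

def attractors(n):
    # follow every initial state to its cycle; collect distinct cycles
    found = set()
    for init in product((0, 1), repeat=n):
        trajectory = [init]
        pos = {init: 0}
        while True:
            nxt = step(trajectory[-1])
            if nxt in pos:                      # closed a cycle
                found.add(frozenset(trajectory[pos[nxt]:]))
                break
            pos[nxt] = len(trajectory)
            trajectory.append(nxt)
    return found

print(len(attractors(4)))   # prints 6
```

A loop containing a single invert function would instead return to its initial state only after two full rotations, doubling the attractor length; the same brute force applies.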
For the critical K = 1 model, Flyvbjerg and Kjaer found that the number of attractors is at least $2^{0.63\sqrt{N}}$ [18]. Drossel et al. obtained a slightly slower growth rate of $2^{0.59\sqrt{N}}$, but with a much simpler calculation [19]. We recently improved this to $2^{1.25\sqrt{N}}$, by defining a product between dynamics in different loops [20]. We obtained a similar result with a simpler calculation using results from number theory [21].
In this paper, we prove that the number of attractors in the critical Kauffman model with connectivity one grows at least, and at most, as $(2/\sqrt{e})^N$, where N is the network size and $2/\sqrt{e} = 1.213$. This is the first proof that the number of attractors in a critical Kauffman network grows exponentially. Our approach is as follows. We start by working out the exact probability that m of the N network nodes are in loops. For a given m, we write down the minimum and maximum number of attractors that the network can have. We then average these quantities over m. To our surprise, the two averages converge to $(2/\sqrt{e})^N$. Our proof is in four steps, each of which relies on a lemma. We prove the four lemmas after the derivation of our main result.

Proof that number of attractors is exponential

We now prove that the mean number of attractors scales as $(2/\sqrt{e})^N$.
Step 1. The probability that $m \in [1, N]$ of the network nodes are in loops is

$$P(m) = \frac{m\, N!}{(N-m)!\, N^{m+1}},$$

which we prove in lemma 1 below.
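For small N, the distribution in step 1 can be checked by exhaustive enumeration: every map $f$ from the N nodes to themselves defines a K = 1 network, and a node is in a loop exactly when iterating the map from it leads back to it. A quick sketch of this check (our own helper names):

```python
from itertools import product
from math import factorial

def num_loop_nodes(f):
    # a node is in a loop iff iterating the map from it returns to it
    n, m = len(f), 0
    for s in range(n):
        x = f[s]
        for _ in range(n):
            if x == s:
                m += 1
                break
            x = f[x]
    return m

def predicted(N, m):
    # count form of lemma 1: m * N!/(N-m)! * N^(N-m-1), always an integer
    return m * factorial(N) * N**(N - m) // (factorial(N - m) * N)

N = 4
counts = {}
for f in product(range(N), repeat=N):
    m = num_loop_nodes(f)
    counts[m] = counts.get(m, 0) + 1

assert all(counts[m] == predicted(N, m) for m in range(1, N + 1))
print(counts)   # {1: 64, 2: 96, 3: 72, 4: 24}, summing to 4^4 = 256
```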
For a given m, the maximum number of attractors occurs when all of the loops are even and of length 1. Then there are $2^m$ attractors, all of size 1. On the other hand, the largest attractor length is double the lcm of the individual loop sizes. This is precisely twice Landau's function. From our previous work [21], we know that the minimum number of attractors is $2^{m - 1.52\sqrt{m \ln m}}/2$, in which we made use of a bound on Landau's function [22]. Thus, for a network with m nodes in loops, the number of attractors lies in the range

$$2^{m - b\sqrt{m \ln m} - 1} \;\leq\; c \;\leq\; 2^{m},$$

where b = 1.52. By summing this over the distribution of m, we can write down bounds for the mean number of attractors c(N):

$$\sum_{m=1}^{N} 2^{m - b\sqrt{m \ln m} - 1}\, P(m) \;\leq\; c(N) \;\leq\; \sum_{m=1}^{N} 2^{m} P(m). \qquad (1)$$

Step 2. The quantity $\sqrt{m \ln m}$ is difficult to handle analytically, but we can bound it from above by a line in m. Let $\epsilon \in (0, 1]$ be an arbitrarily small constant. As we prove in lemma 2 below,

$$b\sqrt{m \ln m} \;\leq\; \epsilon m + w(\epsilon),$$

where b = 1.52 and the intercept $w(\epsilon)$ depends on $\epsilon$ but not on m. Inserting this into eq. (1) gives

$$2^{-w(\epsilon) - 1} \sum_{m=1}^{N} 2^{(1-\epsilon) m} P(m) \;\leq\; c(N) \;\leq\; \sum_{m=1}^{N} 2^{m} P(m). \qquad (2)$$

Step 3. It turns out that these sums can be calculated exactly for finite N. We show how in lemma 3 below. In the limit of large N, the exact form becomes

$$2^{-w(\epsilon) - 1} \left(1 - 2^{\epsilon - 1}\right) \sqrt{2\pi N}\, \left(2^{1-\epsilon}\, e^{2^{\epsilon - 1} - 1}\right)^{N} \;\leq\; c(N) \;\leq\; \sqrt{\pi N/2}\, \left(2/\sqrt{e}\right)^{N}. \qquad (3)$$

For small $\epsilon$, this simplifies to

$$2^{-w(\epsilon) - 1}\, \sqrt{\pi N/2}\, \left(\frac{2 - \epsilon \ln 2}{\sqrt{e}}\right)^{N} \;\leq\; c(N) \;\leq\; \sqrt{\pi N/2}\, \left(2/\sqrt{e}\right)^{N}. \qquad (4)$$

The large N and small $\epsilon$ approximations are rather accurate. For example, for $\epsilon = 0.01$ and $N = 10^5$, eq. (2) gives 18,170 for the log of the lower bound, eq. (3) gives 18,170, and eq. (4) gives 18,169. We can make $(2 - \epsilon \ln 2)/\sqrt{e}$ as close to $2/\sqrt{e}$ as we wish. Doing so makes the constant on the right side of the lower bound smaller. The essential point is that this constant is independent of N. We thus see that the number of attractors scales at least, and at most, as $(2/\sqrt{e})^N$.
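The figures quoted in step 3 can be reproduced in the log domain. The sketch below is our own evaluation (function names are ours): it computes the two sums in eq. (2) with `math.lgamma`, using P(m) from lemma 1 and the explicit intercept w(ϵ) from lemma 2.

```python
import math

def log_P(m, N):
    # ln P(m), with P(m) = m N! / ((N-m)! N^(m+1)) from lemma 1
    return (math.log(m) + math.lgamma(N + 1) - math.lgamma(N - m + 1)
            - (m + 1) * math.log(N))

def log_sum_exp(vals):
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

def log_bounds(N, eps, b=1.52):
    # natural logs of the lower and upper bounds on c(N) in eq. (2)
    ln2 = math.log(2.0)
    w = -(b * b / eps) * math.log(eps / b)      # lemma-2 intercept
    lower = log_sum_exp([(1 - eps) * m * ln2 + log_P(m, N)
                         for m in range(1, N + 1)]) - (w + 1) * ln2
    upper = log_sum_exp([m * ln2 + log_P(m, N) for m in range(1, N + 1)])
    return lower, upper

lo, up = log_bounds(100_000, 0.01)
print(round(lo), round(up))
```

For N = 10^5 and ϵ = 0.01 this reproduces the quoted log of the lower bound, about 18,170.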
Step 4. The rate at which these bounds converge to the same scaling can be determined explicitly. For any N, there is a choice of $\epsilon$ which maximizes the lower bound. Setting $\epsilon$ to this optimal value allows us to get rid of $\epsilon$ altogether. Our result, which we derive in lemma 4, is

$$\ln c(N) \;\geq\; N \ln\frac{2}{\sqrt{e}} \left[1 - 5.45 \left(1 - \frac{1}{W(e^{2} N)}\right) \sqrt{\frac{W(e^{2} N)}{N}}\,\right], \qquad (5)$$

where W is the principal branch of the Lambert W function. This approach is plotted in Fig. 2. Since W(x) asymptotically grows as ln x, the bracketed factor approaches 1, with a deficit shrinking as $\sqrt{\ln N / N}$. For example, for $N = 10^6$ and $10^9$, the factor is 0.978 and 0.999, respectively. Once again, we find that the number of attractors scales as $(2/\sqrt{e})^N$.

Proof of our four lemmas

Lemma 1: Distribution for m. We aim to directly count the number of network architectures which have m nodes in loops. We consider the case of labelled nodes. Take m of the N nodes, in $\binom{N}{m}$ ways, and arrange them in cycles, in m! ways. For the remaining N − m nodes not to create loops, they must be arranged in trees protruding from the loops.
Given a set of k trees, there are $m^k$ ways of attaching them to the m loop nodes. Calling the number of labelled, rooted k-forests of N − m nodes F(N − m, k), we sum over k ∈ 1, ..., N − m to find the total number of arrangements of nodes in trees attached to the loop nodes:

$$T(N, m) = \sum_{k=1}^{N-m} F(N-m, k)\, m^{k}.$$

From Moon [23], $F(n, k) = \binom{n}{k}\, k\, n^{n-k-1}$, which allows us to evaluate the sum above as

$$T(N, m) = m\, N^{N-m-1}.$$

Putting the pieces together and dividing by the total number of networks $N^N$, we have

$$P(m) = \binom{N}{m}\, m!\; \frac{m\, N^{N-m-1}}{N^{N}} = \frac{m\, N!}{(N-m)!\, N^{m+1}}.$$

For example, for N = 3, this gives 3/9, 4/9 and 2/9 for m = 1, 2, 3. Note in particular that the last term is $N!/N^N$. This makes sense, since there are $N^N$ ways of drawing a single-input network on N nodes, but N! of these are permutations, that is to say, have all N nodes in loops.
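Moon's formula and the evaluation of the sum can be sanity-checked with exact rational arithmetic. The snippet below is our own check (helper names are ours) of the identity $\sum_{k} \binom{n}{k}\, k\, n^{n-k-1} m^{k} = m (n+m)^{n-1}$, which with n = N − m tree nodes gives $m N^{N-m-1}$.

```python
from fractions import Fraction
from math import comb

def forest_count(n, k):
    # Moon: number of labelled rooted k-forests on n nodes, C(n,k) * k * n^(n-k-1)
    # (Fraction handles the n^(-1) that appears when k = n)
    return Fraction(comb(n, k)) * k * Fraction(n) ** (n - k - 1)

def attached_arrangements(n, m):
    # sum over k of forest_count(n, k) * m^k: attach each root to a loop node
    return sum(forest_count(n, k) * Fraction(m) ** k for k in range(1, n + 1))

for n in range(1, 8):
    for m in range(1, 6):
        assert attached_arrangements(n, m) == m * (n + m) ** (n - 1)
print("identity verified")
```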
Lemma 2: Bound on $\sqrt{m \ln m}$. Now we prove that

$$b\sqrt{m \ln m} \;\leq\; \epsilon m + w(\epsilon), \qquad (6)$$

where $\epsilon \in (0, 1]$ and b is a parameter which we will later set to 1.52. We want to find a line $\epsilon m + w(\epsilon)$, where $\epsilon$ is the slope and $w(\epsilon)$ is a constant that depends on $\epsilon$ but is independent of m. Let's first transform this equation to combine $\epsilon$ and b into one variable. With $\delta = \epsilon/b$ and $v(\delta) = w(\epsilon)/b$, we have

$$\sqrt{m \ln m} \;\leq\; \delta m + v(\delta). \qquad (7)$$

Our ansatz is $v(\delta) = -\ln\delta/\delta$. Squaring both sides and rearranging, the required inequality becomes

$$\ln m \;\leq\; \delta^{2} m - 2\ln\delta + \frac{\ln^{2}\delta}{\delta^{2} m}.$$

With $u = \delta^{2} m$, we can write this as

$$u \;\leq\; e^{u}\, e^{\ln^{2}\delta / u},$$

where u > 0. Since $\ln^{2}\delta/u$ is always positive, $e^{\ln^{2}\delta/u} > 1$. Since $e^{u} > u$, eq. (7) follows. Then

$$w(\epsilon) = b\, v(\epsilon/b) = -\frac{b^{2}}{\epsilon} \ln\frac{\epsilon}{b},$$

and eq. (6) follows. With b = 1.52, $w(\epsilon) = -2.31 \ln(0.66\,\epsilon)/\epsilon$.

Lemma 3: Sum over P(m). Here we calculate exactly sums of the form

$$S(\alpha) = \sum_{m=1}^{N} \alpha^{m} P(m),$$

where we are specifically interested in $\alpha = 2$ and $\alpha = 2^{1-\epsilon}$. Inserting P(m), we have

$$S(\alpha) = \frac{N!}{N} \sum_{m=1}^{N} m\, \frac{(\alpha/N)^{m}}{(N-m)!}.$$

The factor m can be traded for a differentiation with respect to the parameter $\alpha$, giving

$$S(\alpha) = \frac{N!}{N}\, \alpha\, \frac{d}{d\alpha} \sum_{m=1}^{N} \frac{(\alpha/N)^{m}}{(N-m)!}.$$

Increasing the range of the sum from m = 1, ..., N to m = 0, ..., N, and swapping m and N − m, the sum can be rewritten in terms of the upper incomplete gamma function $\Gamma(s, x)$:

$$\sum_{m=0}^{N} \frac{(\alpha/N)^{m}}{(N-m)!} = \left(\frac{\alpha}{N}\right)^{N} e^{N/\alpha}\, \frac{\Gamma(N+1, N/\alpha)}{N!}.$$

In the region we are interested in, $\alpha \leq 2$, $\Gamma(N+1, N/\alpha)/N!$ rapidly approaches 1 from below. Applying Stirling's approximation,

$$S(\alpha) \;\simeq\; \left(1 - \frac{1}{\alpha}\right) \sqrt{2\pi N}\, \left(\alpha\, e^{1/\alpha - 1}\right)^{N}.$$

Replacing $\alpha$ with $2^{1-\epsilon}$, and considering the large N limit, the sum has the asymptotic form

$$S(2^{1-\epsilon}) \;\simeq\; \left(1 - 2^{\epsilon-1}\right) \sqrt{2\pi N}\, \left(2^{1-\epsilon}\, e^{2^{\epsilon-1} - 1}\right)^{N}.$$

This gives both of the sums in eq. (1).

Lemma 4: Maximum over $\epsilon$. We can eliminate $\epsilon$ in eq. (3) by maximising the lower bound with respect to $\epsilon$. For large N, the dominant terms in the log of the lower bound are

$$\ln c(N) \;\geq\; N \ln\left(2^{1-\epsilon}\, e^{2^{\epsilon-1} - 1}\right) - w(\epsilon) \ln 2. \qquad (8)$$

The $\epsilon^*$ that gives the largest lower bound satisfies

$$\frac{b^{2}}{\epsilon^{*2}} \left(1 - \ln\frac{\epsilon^{*}}{b}\right) = \frac{N}{2} \left(2 - 2^{\epsilon^{*}}\right).$$

For example, for $\epsilon^* = 0.01$, N = 278,000. Since N is a decreasing function of $\epsilon^*$ for $\epsilon^* < 1/2$, the optimal $\epsilon^*$ must go to 0 as N goes to $\infty$.
For large N , we can replace 2 − 2 ϵ * with 1. Doing so allows us to solve for ϵ * in terms of N : where W is the principal branch of the Lambert W function.Inserting this into eq.( 8), we find ln c(N ) where the constant 5.45 is b/ log 2 (2/ √ e).Notice that our approach of setting ϵ to the value that maximizes the lower bound implicitly forces ϵ to be small for large N .In retrospect we could have just as well maximized the log of the lower bound in eq. ( 4) rather than in in eq. ( 3).

Discussion
The search for the number of attractors in the critical Kauffman network has spanned half a century. For the model with connectivity one, our work brings this search to a close, because our lower and upper bounds converge to $(2/\sqrt{e})^N$. If, as many believe [19,24], all critical Boolean networks behave in a similar way, then our result suggests that the Kauffman model with connectivity two would also exhibit exponential scaling.
Why has the critical behavior of Kauffman networks proved so elusive, despite the attention of many leading statistical physicists? Part of the answer lies in the admonition that "correct scaling only emerges for very large N" [14]. From eq. (1), for N = 75 nodes, $c_{min} = 213$ and $c_{max} = 2.12 \times 10^{7}$. We know that the true value of c lies somewhere in between, but until N is large, even the logarithms of these numbers are disparate. From eq. (5), we can write c as $x^N$, where x ranges between $x_{min}$ and $x_{max} = 1.213$. This is plotted in Fig. 3. Not until the network size N is of order 1,000 do the values converge.
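For reference, the N = 75 values come from summing eq. (1) directly. The snippet below is our own evaluation sketch (function names are ours), using P(m) from lemma 1 and b = 1.52.

```python
import math

def P(m, N):
    # lemma 1: probability that m of the N nodes lie in loops
    return m * math.factorial(N) / (math.factorial(N - m) * N**(m + 1))

def c_bounds(N, b=1.52):
    # eq. (1): average the per-m lower and upper attractor counts over P(m)
    c_min = sum(P(m, N) * 2 ** (m - b * math.sqrt(m * math.log(m))) / 2
                for m in range(1, N + 1))
    c_max = sum(P(m, N) * 2 ** m for m in range(1, N + 1))
    return c_min, c_max

c_min, c_max = c_bounds(75)
print(f"{c_min:.3g} {c_max:.3g}")
```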
That the scaling is exponential in N is surprising, as the maximum number of states in attractors is $2^m$ and the mean number of nodes in loops $\langle m \rangle$ scales as $\sqrt{N}$. Previous work has used Jensen's inequality to obtain the bound $\langle 2^{m} \rangle \geq 2^{\langle m \rangle} \sim 2^{\sqrt{N}}$. By computing the expectation exactly, we show that the mean number of attractors scales far more quickly, as $\langle 2^{m} \rangle \sim (2/\sqrt{e})^{N}$. This means that the average number of attractors is more than the average model can accommodate, and the expectation is dominated by rare configurations with a number of nodes in loops that is linear in N.
While surprising, the result has a familiar basis. In the large N limit, the distribution over m is approximately $P(m) \approx (m/N) \exp(-m^{2}/(2N))$. Taking the expectation $\langle 2^{m} \rangle$, the familiar completion of the square shows that the average grows exponentially in N, rather than in $\sqrt{N}$, in analogy with our exact result.
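This heuristic can be checked numerically. The sketch below (our own check) evaluates the log of $\langle 2^{m} \rangle$ under the approximate weight $(m/N)\exp(-m^{2}/(2N))$ and compares the per-node growth rate with the completion-of-square prediction $(\ln 2)^{2}/2$.

```python
import math

def log_avg_2m(N):
    # ln of sum_m 2^m (m/N) exp(-m^2/(2N)), via a log-sum-exp over m
    ln2 = math.log(2.0)
    vals = [m * ln2 + math.log(m / N) - m * m / (2 * N)
            for m in range(1, 10 * N)]
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

rate = log_avg_2m(2000) / 2000
print(rate, math.log(2) ** 2 / 2)   # rate approaches (ln 2)^2 / 2 ≈ 0.240
```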
Having analytical results for model systems is important for researchers working on applications. The use of Boolean networks across systems biology [25,26] has inspired computational problems around their control [27,28] and the efficient identification of attractors [29]. Many attractors in the Kauffman model are accessible from only a few states, making them difficult to locate through sampling. As a result, "purely numerical investigation of Kauffman networks will never produce reliable results" [30]. We hope that the analytical results we present here will be useful to the wider community developing tools and algorithms for Boolean networks.

FIG. 1: Kauffman network with connectivity K = 1. This typical network of N = 100 nodes has one 1-loop, one 3-loop and two 6-loops, for a total of m = 16 nodes in loops.

FIG. 2: The mean number of attractors c(N) grows as $(2/\sqrt{e})^N$. As the network size N increases, the slope of the lower bound approaches the slope of the upper bound, namely $\ln(2/\sqrt{e})$.

FIG. 3: Correct scaling requires a large system size. Writing the mean number of attractors as $x^N$, here we plot $x_{min}$ and $x_{max}$, where $x_{max} = 2/\sqrt{e} = 1.213$. The true x is somewhere in the orange area. The bounds are close only for networks of around $10^3$ nodes. Notice that the slopes of the lines in Fig. 2 are $\ln x_{min}$ and $\ln x_{max}$.