Binary-state dynamics on complex networks: pair approximation and beyond

A wide class of binary-state dynamics on networks---including, for example, the voter model, the Bass diffusion model, and threshold models---can be described in terms of transition rates (spin-flip probabilities) that depend on the number of nearest neighbors in each of the two possible states. High-accuracy approximations for the emergent dynamics of such models on uncorrelated, infinite networks are given by recently-developed compartmental models or approximate master equations (AME). Pair approximations (PA) and mean-field theories can be systematically derived from the AME. We show that PA and AME solutions can coincide under certain circumstances, and numerical simulations confirm that PA is highly accurate in these cases. For monotone dynamics (where transitions out of one nodal state are impossible, e.g., SI disease-spread or Bass diffusion), PA and AME give identical results for the fraction of nodes in the infected (active) state for all time, provided the rate of infection depends linearly on the number of infected neighbors. In the more general non-monotone case, we derive a condition---that proves equivalent to a detailed balance condition on the dynamics---for PA and AME solutions to coincide in the limit $t \to \infty$. This permits bifurcation analysis, yielding explicit expressions for the critical (ferromagnetic/paramagnetic transition) point of such dynamics, closely analogous to the critical temperature of the Ising spin model. Finally, the AME for threshold models of propagation is shown to reduce to just two differential equations, and to give excellent agreement with numerical simulations. As part of this work, Octave/Matlab code for implementing and solving the differential equation systems is made available for download.


Introduction
Binary-state dynamics on complex networks are frequently used as simple models of social interaction [1,2,3,4].Each node in a network is considered to be in one of two possible states (e.g., susceptible or infected, inactive or active) at each moment of time.Switching of nodes to the opposite state occurs stochastically, with probabilities that depend on the state of the updating node, and the states of its neighbors.Examples include models for competition between two opinions in a network-based population, such as the voter model [5,6] and the majority-vote model [5,7,8].Models for rumor propagation, adoption of new technologies, or the spreading of behavior are often based on generalizations of disease-spread [9,10,11] or individual-threshold [12,13,14,15] dynamics.Recently, data from online social networks has been mined to help increase the understanding of how individuals are influenced by their network neighbors [16,17,18,19,20,21]; further insight is provided by experiments where the network topology and social interactions are carefully controlled [22].Binary-state dynamics provide the simplest test for newly-developed models [23,24]; interest is often focussed on the critical parameters where the behavior of the model changes qualitatively, as at the paramagnetic/ferromagnetic phase transition that occurs in the Ising spin model at the critical temperature [25,26,27].
All these models, and many others, can be considered as special cases of a general formulation for stochastic binary-state dynamics.Letting k be the degree (number of network neighbors) of a node X in an unweighted, undirected, uncorrelated, static network, we denote by m the number of neighbors of X who are in state 1, i.e., infected/active/spin-up, depending on the model context.If node X itself is in state 0 (i.e., susceptible/inactive/spin-down) then the rate F k,m is defined by letting F k,m dt be the probability that X will switch from state 0 to state 1 in an infinitesimally small time interval dt.On the other hand, if node X is in state 1, then R k,m dt is the probability that node X will switch to state 0 within the time interval dt.We call the functions F k,m and R k,m the infection rate and recovery rate respectively; collectively they are also called the transition rates or spin-flip rates of the dynamics.These rates depend (only) on the degree k of node X, and the number m of neighbors of X who are in state 1; we also assume that the rates F k,m and R k,m do not vary in time.In Sec. 2 (see Table 1) we show that many stochastic binary-state social interaction models can be expressed in terms of suitable F k,m and R k,m functions [28,29].Deterministic threshold models [12,13,14]-where nodes change state (become active) only when the number m of active neighbors exceeds a pre-defined threshold level-may also be put into this form (see Sec. 7).
Approximations for macroscopic quantities of interest-such as the expected fraction of active nodes in the network at a given time-are needed in order to understand how the combination of network topology (e.g., the degree distribution P k ) and the microscopic dynamics (defined by the rates F k,m and R k,m ) affect the emergent dynamics at macroscopic level.In the limit of infinite network size, systems of deterministic differential equations may be derived to describe such macroscopic quantities, at various levels of approximation.The most common analytical approach for dynamics on complex networks is mean-field (MF) theory.Mean-field theories are typically derived under a number of assumptions, the most important of which for the current discussion is the assumed lack of dynamical correlations [30].Under this assumption, the neighbors of a node A are considered to be active (i.e., in state 1) with a probability that is determined by the overall fraction of active nodes (of same degree) across the network as a whole.In particular, the probability of a neighbor of A being active is assumed to be independent of the state of node A: thus there is assumed to be no correlation between the state of A and the state of A's neighbors.
Examples of MF theories for binary-state dynamics are found in [9,6,33,3], and the validity of the assumptions of mean-field theories on networks are considered in detail in [34,35].Meanfield theories, though relatively simple to derive and solve, are often quite inaccurate, especially on sparse networks (with low mean degree z) and near to critical points of the dynamics.Pair approximation (PA) theories improve upon the accuracy of MF theories-at the cost of extra complexity in the resulting system of differential equations-by including dynamical correlations at a pairwise level [36,37,38,39,40,41,42,43]. In addition to the fraction of active nodes in each degree class (as in MF), PA theories account for the fraction of edges in the network that link active nodes to active nodes (called I-I edges here), inactive nodes to inactive nodes (S-S edges), and inactive to active nodes (S-I edges).However, dynamical correlations beyond nearest neighbors are not captured by PA theories.If node A is inactive, for example, the probability that each of its neighbors is in the active state is approximated in PA theory by the relative preponderance of S-I edges over S-S edges in the whole network.Moreover, each neighbor of A is considered to be independent of every other neighbor of A. This means that if m is the number of A's neighbors that are in the active state, m is assumed to have a binomial distribution.
Higher-order accuracy, beyond PA level, has been obtained recently using compartmental models or approximate master equations (AME) [44,45,46,47].These approaches consider the state of nodes and their immediate neighbors, generating large systems of differential equations, which better capture the (non-binomial) distributions of the number of active neighbors of updating nodes.The increased complexity of the differential equations systems gives improved accuracy over PA and MF, particularly near critical points of the dynamics [46].
In this paper, we investigate the possibility of, for certain classes of dynamics, reducing the dimensionality of the full AME system to obtain a simpler set of differential equations, but without any loss of accuracy.Ideally, the reduced-dimension system would permit analytical solutions, but even if closed-form solutions are not possible, the smaller system is less computationally expensive to solve.We restrict our attention to static networks, but note that AME approaches have also been successfully used for models where the network structure co-evolves with the dynamics [44,48,49].We also assume here that the networks are generated by the configuration model [31,32,4], so having zero clustering (transitivity) in the infinite-size limit.These simplifications enable us to focus on the types of dynamics for which the AME can be exactly reduced to a simpler (e.g., pair approximation) system; we anticipate that complicating factors such as nonzero clustering can be studied in subsequent work, for example by employing the models of [50,51,52,53].
The remainder of the paper is organized as follows.In Sec. 2 we give examples of binarystate dynamics and in Sec. 3 we briefly review the approximate master equation approach of [46], showing how the equations of PA and MF theories can be derived as systematic approximations of the full AME.In Sec. 4 the AME and PA solutions are shown to give identical results for certain macroscopic quantities within the class of dynamics we call monotone.In Sec. 5 a more general class of dynamics (corresponding to equilibrium spin systems obeying detailed balance) is shown to have AME and PA solutions that are equal in the steady-state limit t → ∞, but not at finite t.Focussing on equilibrium models with up-down symmetry, this result enables us to derive-in Sec.6-an analytical result for the critical point of such dynamics, i.e., the bifurcation point marking the phase transition from paramagnetic (disordered) to ferromagnetic (ordered) behavior, as in the Ising model on complex networks [26,25].Finally, in Sec. 7, we show that (monotone) threshold models may be accurately approximated using an extended version of the AME, and that the large system of equations may be reduced to just two differential equations to determine the expected fraction of active nodes.Details of some mathematical derivations are contained in the Appendices.  4 of [17], where the probability of, for example, becoming infected (voting on a Digg story) initially increases with the number of infected neighbors, but then saturates, and eventually decreases, giving a function that is roughly symmetric about m = k/2.

Examples of binary-state dynamics
Here we briefly examine some of the binary-state models that can be described using transition rates F k,m and R k,m ; see Table 1 and the schematic Fig. 1.Note that in many of the models described here a single randomly-chosen node is updated at each time step, for which the time increment is dt = 1/N , where N is the number of nodes in the network (and we take the N → ∞ limit).In such a model, F k,m is the probability that a spin-down node, when selected for updating, flips to spin-up, while R k,m is the corresponding probability of a node flipping from spin-up to spin-down [54].
The susceptible-infected-susceptible (SIS) disease spread model [55,56,57] assumes that infected individuals transmit disease to each of their network neighbors at a rate λ.Thus if a susceptible node has m infected neighbors, the probability of it remaining susceptible for a time interval dt is (1 − λ dt) m .Therefore its probability of infection is 1 − (1 − λ dt) m , and in the limit dt → 0 this is λmdt, giving the F k,m rate for SIS dynamics in

Yes No
Ising Glauber Yes Yes Table 1: Infection and recovery rates for some examples of binary-state dynamics on networks: k is the node's degree, m is its number of infected neighbors, z is the mean degree k = k kP k of the network.The parameters of the various models are described in Sec. 2. Symmetric models are those with rates that obey condition (23); equilibrium models obey condition (17).
SIS model is a constant µ, and in the case µ = 0 recovery is impossible; this special case is called the SI model.The Bass model [58,59,60] for diffusion of technological innovations is an extension of the SI model, and may be similarly specialized to a network context.In the Bass model, nodes move from the susceptible (non-adopted) state to the infected (adopted) state with rate F k,m = c + dm.The parameter c represents individual action, independent of the influence of neighbors, while the parameters d quantifies the "herding behavior", whereby individuals copy the actions of their neighbors: in this case, by adopting the innovation when m of their neighbors have already done so.Transitions from the adopted state to the non-adopted state are not permitted, so R k,m ≡ 0. Kirman's ant colony model [61,62,63] for choice between stock market trading strategies has similar herding effects, but with switching permitted between both states (strategies).A node with m of its k neighbors infected has k − m neighbors in the susceptible state, so the rate for transitions of such a node from infected to susceptible is R k,m = c 2 + d(k − m).Note that mean-field theory (see Eq. ( 5) below) for the complete-graph case P k = δ k,N -where all nodes are connected to all other nodes-gives the standard population-level version of the Bass and Kirman models [58,61].
In the voter model [5,6], a node is chosen at random in each time step (dt = 1/N ) and it adopts the same state of one of its (randomly-chosen) neighbors.If m of the node's k neighbors are in the active state, the node thus becomes active with probability m/k, and becomes inactive with probability (k − m)/k, these being the fractions of its neighbors in the respective states.Many variants of the voter model have been studied [1,3].In the link update voter model, for example, an edge (rather than a node) is chosen at random in each time step, and one of the end-nodes of the chosen edge then copies the state of the other.Nonlinear voter models and other variants have been examined by a number of authors [64,65,29,42].In [42] for example, the current state of the node is incorporated into the probabilities for switching state, lending an effective inertia to the dynamics: see the transition rates for this case (on 4-regular random graphs) in Table 1.The language model of [65] has been examined in detail in [41]; in this model the two states represent the primary language choice of a person (node), and the probability of switching states is proportional to the fraction of speakers in the locality, raised to the power α, multiplied by the status parameter s or 1−s of the respective language (s = 1/2 for the symmetric case of equal-status languages, α = 1 then gives the voter model).
Many models of opinion dynamics are based on the classical Ising model of magnetic spins.Here each node is in either the spin-up or spin-down state and transitions occur according to dynamical rules which minimise the Hamiltonian of ferromagnetic interactions [3,66].Letting T represent the temperature of the heat bath, and J the ferromagnetic interaction parameter, the transition rates for Glauber [67] and Metropolis [68] dynamics are shown in Table 1.The majority-vote model [5,7] is an non-equilibrium spin model, with spins tending to align with the local neighborhood majority, but with a noise parameter Q giving the probability of misalignment.
As a final example, we consider threshold models [12,13,14,69], which are used to model diffusion of fads, collective action, or adoption of innovations [11].Each node has a (frozen) threshold level, which may depend on the degree of the node.In each time step, a fraction dt of the N nodes in the network are updated.When updated, a node compares the number m of its active neighbors to its threshold and activates (with probability 1) if m is greater than or equal to its threshold.Similar to the Bass model of innovation diffusion, usually no recovery is permitted in this model [11,18] (but see [15] for a recent extension including recovery).Note that a coordination game, modelling the diffusion of new behavior through a network, can also be written as a threshold model of this type [70,18,71].Threshold models, and the extension of the AME required to describe them, are considered in Sec.7 below.
3 Approximate master equations, pair approximations, and meanfield theory For completeness, we here briefly recapitulate the approximate master equations (AME) introduced in [46]; these were derived by generalizing the approach used in [44,45] , and the fraction of infected nodes in the whole network is found by summing over all k-classes: (Recall P k is the degree distribution, i.e., the probability that a randomly-chosen node has k neighbors).
The approximate master equations for the evolution of s k,m (t) and i k,m (t) are [46]: for each m in the range 0, . . ., k, and for each k-class in the network, with , and . Equations ( 1) and ( 2), with the time-dependent rates β s , γ s , β i and γ i (defined as nonlinear functions of s k,m and i k,m ), form a closed system of deterministic equations which can be solved numerically using standard methods; Octave/Matlab scripts are available from the author's webpage [72].Assuming a randomly-chosen fraction ρ(0) of nodes are initially infected, the initial conditions are , where B k,m (q) denotes the binomial factor For dynamics on a general network, with non-empty degree classes from k = 0 up to a cutoff k max , the number of differential equations in the system (1)-( 2) is (k max + 1)(k max + 2), and so grows with the square of the largest degree.Some approximation is therefore necessary if it is desirable to reduce the AME to a lower-dimensional system.One possibility is to consider the parameters p k (t) (resp.q k (t)), defined as the probability that a randomly-chosen neighbor of a susceptible (resp.infected) k-degree node is infected at time t.Noting that p k (t) can be expressed in terms of s k,m as k m=0 ms k,m / k m=0 ks k,m , an evolution equation for p k may be derived by multiplying equation ( 1) by m and summing over m.The right-hand-side of the resulting equation contains higher moments of s k,m , so a closure approximation is needed to proceed.If we make the ansatz that s k,m and i k,m are proportional to binomial distributions: ),, we obtain the pair approximation (PA), consisting of the 3k max +1 differential equations: for each k-class.The rates here are given by inserting the binomial ansatz into the general formulas, so that β s , for example, is (1 A cruder, mean-field (MF), approximation results from replacing both p k and q k with ω: , where ω = k z ρ k is the probability that one end of a randomly-chosen edge is infected, and z = k is the mean degree of the network.Using this ansatz in the master equations yields a closed system of k max + 1 differential equations for the fraction ρ k of infected k-degree nodes: with ρ k (0) = ρ(0).
In [46] we showed that the AME system (31) generally gives improved accuracy over the approximations (4) and (5).Moreover, the general equations ( 4) and ( 5) for pair approximation and mean-field theory reduce to previously-studied equations when specific rates F k,m and R k,m are selected from Table 1.In this paper we concentrate on finding dynamics for which the AME system can be reduced exactly to lower-dimensional systems, for instance to the PA equations.

Monotone dynamics
We first consider the case where R k,m = 0 for all k and m, with F k,m nonzero for at least some arguments.The zero recovery rate means it is impossible for nodes to switch from the infected state to the susceptible state and so we call this type of dynamics monotone [73].In this case the s k,m equations of the AME system are decoupled from the i k,m equations, with Eq. ( 1) reducing to and the fraction of infected nodes is given by ρ Thus several important macroscopic quantities (though not all) can be found by solving Eq. ( 6) for s k,m (t), without considering the i k,m (t) variables.Now consider whether or not the pair approximation ansatz )that was used to derive Eqs.(4)-could be an exact solution of Eq. ( 6).Inserting the PA ansatz into Eq.( 6) and dividing by (1 where the identities and have been respectively utilized on the left-hand side and right-hand side of Eq. ( 7).Equation ( 7) can be viewed as a condition on the forms of the infection rate F k,m for which the PA ansatz on s k,m is an exact solution of the corresponding approximate master equation.Note that all terms in Eq. ( 7) are linear in m, so F k,m is necessarily of the form for (possibly k-dependent) constants c k and d k .Using this form in (4) confirms that the solutions of the AME for s k,m are given by the PA ansatz, with ρ k and p k being the solutions of the reduced system where For the special case of z-regular random graphs or Bethe lattices (i.e., P k = δ k,z for integer z), system (11) can be solved analytically to give the explicit solution for the infected fraction as In Figure 2 we show results for SI dynamics (see Table 1), which are monotone, and of form (10) with c k = 0 and d k = λ for all k.Note the exact match between the AME and PA solutions for ρ(t) and for the fraction of S-I edges (see Eqs. (39) and ( 56)) in the network, and the excellent agreement with numerical simulation on networks of size N = 10 5 [74].
The parameters d k in (10) may be negative (provided that all F k,m values are non-negative).As an example, consider the Bass diffusion model on a z-regular random graph with c = 1 and d = −1/z.This might model, for example, the adoption by indie music fans of a new band, where the attractiveness of the band decreases as the number of neighbors m who have already "jumped on the bandwagon" increases; see [15] for another approach to this aspect of social contagion modelling, which they call "limited imitation contagion".The expected fraction of fans ρ(t) is given explicitly by Eq. ( 12); note for these parameters the entire network does not become infected, see Figure 3.
It is interesting to note that while, in the monotone case (10), the AME solutions for s k,m (t) are exactly reproduced by the PA equations, the corresponding i k,m (t) variables are not necessarily equal to their respective PA ansatz ρ k (t)B k,m (q k (t)).Focussing on the SI model, the effect of the non-binomial m distribution in i k,m is not visible in Figure 2, as the quantities shown there can be written in terms of s k,m only; indeed, it is necessary to examine connected triples of nodes to demonstrate this effect [39]. Figure 4 shows the fraction of connected triples (defined by choosing a node at random, then randomly choosing two of its neighbors) that are of type S-I-S : the chosen (middle) node being infected, while both chosen neighbors are susceptible.In the AME formulation, this fraction is given by 2 . The differences seen in Fig. 4-and the close fit of AME results to numerical simulations-indicate that the match between AME and PA results in Figs. 2 and 3 is due to the binomial m distribution s k,m for susceptible nodes, but the corresponding distribution i k,m for infected nodes is not binomial in m.

General dynamics and equilibrium spin models
We consider in this section the more general case of non-monotone dynamics, where none of the rates F k,m and R k,m are identically zero.Inserting the PA ansatz for s k,m and i k,m into the AME equations ( 1)-( 2) we find, after dividing by (1 where c k and c (2 k represent combinations of terms that depend on t and k, but are independent of m.Similarly, using the PA ansatz in the AME equation for i k,m yields where, like Eq. ( 13), the left-hand side is an exponential function of m (because of the presence of the B k,m functions, see Eq. ( 3)), while the right-hand side is linear in m.Bearing in mind that the transition rates F k,m and R k,m are time-independent, it is not generally possible to simultaneously solve equations ( 13) and ( 14) to obtain constant transition rates.We conclude that in this general case, AME solutions are not exactly equal, for all times, to the corresponding PA solutions.However, an important special case occurs if we restrict our attention to the steady-state limit t → ∞.The AME solutions and PA solutions can be identical in the limit t → ∞ (despite being different at finite t) if equations ( 13) and ( 14) are simultaneously satisfied in the steady state.The m dependence of the left-and right-hand sides then requires that and c (1) = 0, where the overbar denotes the steady-state limit of the corresponding variable, e.g., p k = lim t→∞ p k (t).
The conditions c (1) k = 0 can be shown to imply that p k /(1 − p k ) = β s /γ s , and the k-independence of the right-hand side implies that p k must, in fact, be independent of k: Similarly, an analysis of the conditions c (3) k = 0 shows that the steady-state values of q k must also be independent of k, i.e., q k = q for all k.
Replacing p k and q k in Eq. ( 15) with these simplifications, we can rewrite the condition on the rates as where and It is demonstrated in Appendix B that Eq. ( 17) is precisely the condition for microscopic reversibility (detailed balance) of the stochastic dynamics, i.e., for the dynamics to correspond to an equilibrium spin-flip model.Moreover, we show in Appendix C that the steady solution of the AME for transition rates obeying condition ( 17) can be fully specified in a simple manner, by the equation where p is a solution of the algebraic equation As an example of dynamics obeying condition (17), we introduce a modified model of susceptibleinfected-susceptible type, with for positive constants λ and µ.Like the standard SIS disease-spread model (see Table 1), recovery of infected nodes occurs at a constant rate µ.However, in constrast to the linear-in-m dependence of the SIS infection rate F k,m , here the probability of infection is assumed to grow exponentially with m.While admittedly artificial, such dynamics might find application in fitting data from social contagion experiments on networks, where the dependence of the rates of m is not yet clear [16,21,22].Figure 5 shows that the AME solutions and PA solutions indeed agree as t → ∞, despite being different for finite t.
Other important examples of equilibrium models obeying (17) are given by the Glauber and Metropolis dynamics for the Ising spin model (see Table 1 and Fig. 6), which both have a = e

Models with up-down symmetry
Models with up-down symmetry have dynamics that are invariant to the swap of state labels (susceptible to infected, and vice versa) for all nodes.In [28,29] this symmetry is called "Z 2symmetry"; it is characteristic of the voter model and other opinion dynamics models; see the fourth column of Table 1.In terms of the transition rates, the symmetry condition implies that R k,m = F k,k−m for m = 0, . . ., k and for all k. ( Note that for such dynamics, the AME system (1)-( 2) is invariant under the change of variables s k,m → i k,k−m and i k,m → s k,k−m .Since the expected fraction of degree-k nodes can be written as , this symmetry condition implies that a solution exists of the AME with ρ k (t) = 1/2 for all t.However, this may not be the only possible solution: depending on the initial condition ρ(0), and on the parameter regime, other solutions of the AME may also be found.
Focussing first on equilibrium spin models with up-down symmetry, which obey condition (17) in addition to (23), we investigate the stability of the symmetric solution with ρ k = 1/2.First, note that there is only a single parameter in these models: putting m = k/2 for even k, or, for odd k, m = (k − 1)/2 and then m = (k + 1)/2, into Eq.( 17) and imposing condition (23) immediately yields the necessary relation between the parameters of equilibrium spin models.Using the steady-state solution of Sec. 5, it is possible to show that a critical value a c of the parameter a exists: in the language of dynamical systems, this is a (pitchfork) bifurcation point [75].For parameter values a with a < a c , the symmetric solution ρ = 1/2 is stable, meaning that if ρ(0) is close to 1/2, the steady-state solution will be ρ = 1/2.In spin models (where the magnetization can be written as M = |2ρ − 1|) this regime is the paramagnetic (disordered) phase; for opinion models in this regime the two opinions coexist equally on the network.However, if the parameter a exceeds the critical value a c , then the symmetric solution ρ = 1/2 is unstable, and two other stable solutions, symmetric about ρ = 1/2, exist.This is the ferromagnetic (ordered) phase; for opinion models, one of the two opinions dominates the other.The critical value a c gives the phase transition point, and the behavior of ρ near a c can be determined from Eq. ( 21) (see Appendix D for details) in a very similar fashion to the analysis of the Ising model in [26,25], see also [76].The results of such analysis (see Appendix D) may be summarized as follows.If the degree distribution P k possesses a finite fourth moment k 4 = k k 4 P k , then the phase transition is of mean-field type, with critical parameter and with ρ − 1/2 ∼ ±(a − a c ) 1 2 as a → a c from above.Following [26,25] (see Appendix D), if the network has a scale-free degree distribution P k ∼ k −γ as k → ∞, then for exponents γ in the range 2 < γ < 3, the critical point is a c = 1, with ρ − 1/2 ∼ ±(a − a c ) 1 3−γ , while for exponents with 3 < γ < 5, we have a c given again by Eq. ( 25), but with near-criticality scaling of ρ − 1/2 ∼ ±(a − a c ) 1 γ−3 as a → a c .The case γ = 3 shows an infinite order transition at a = 1, as discussed in [25].
As mentioned at the end of Sec. 5, the Ising model (Glauber or Metropolis dynamics) is of type (23), with temperature T related to the parameter a via T = 4J/ ln a, so a = 1 corresponds to infinite temperature.A social influence model of this type might be given, for example, by transition rates with the properties Here all F k,m values for m = 0 to ⌊ k 2 ⌋ are free parameters; for example, the rates F k,m and R k,m could be given by Fig. 1(c), which is motivated by the data analysis in Figure 4 of [17].In any model satisfying (26) the parameter a of ( 23) is equal to 1, and by the results above the model is in the paramagnetic (disordered) phase for all network topologies.However, Eq. (25) shows that a = 1 is near the critical point of the system if the degrees in the network are very heterogeneous (so that k 2 ≫ k ), and indeed the model is poised precisely at criticality (a c = 1) on scale-free networks with infinite-variance degree distributions.
For symmetric models that do not obey the equilibrium condition (17), the PA and AME solutions are different, even as t → ∞, see Fig. 7 for an example using majority-vote dynamics.The solutions of the PA equation are therefore of limited usefulness, but it is nevertheless interesting that for z-regular graphs an expression for the critical point can be explicitly obtained.In Appendix E we show that the PA critical point of symmetric models on such networks occurs precisely when the steady-state PA paramater p has the value and the location of the critical point in parameter space is given by the solutions F z,m (for Figure 7: Majority-vote model on 3-regular random graph, with ρ(0) = 0.55 and noise parameter Q = 0.07.This non-equilibrium model does not obey condition (23) and the AME and PA solutions are not equal, even as t → ∞, in contrast to Figs. 5 and 6. m = 0, . . ., z) of the implicit equation Using the infection rate F k,m for the majority-vote model (see Table 1), for example, in Eq. ( 28), gives an explicit expression for the PA critical noise parameter Q: Equation ( 28) can similarly be used to obtain analytical expressions for the PA critical points in other models on z-regular random graphs, e.g.Fig. 12 of [42], Fig. 1 of [38], or Fig. 1 of [77].

Threshold dynamics
Threshold models are often used to model propagation of fads or collective action through a population [12,13,14,18,78,69].Each node has a (frozen) threshold level; these thresholds may be chosen at random (e.g., from a Gaussian distribution), or assigned in some other way (perhaps depending on the degree of the node).In the asynchronous-updating version of these models, a fraction dt of nodes are randomly chosen for updating in each time step [79].When chosen for updating, an inactive node becomes active (with probability 1) if m, the number of its neighbors who are active, exceeds the node's threshold [82].Once activated, nodes cannot subsequently become inactive, so the dynamics are monotone, cf.Sec. 4. Unlike in Eq. (10) of Sec. 4 however, the transition rate is not linear in m; in fact, it is given by to reflect the deterministic activation of a node (once it is chosen for updating) when m exceeds the threshold level M k .We have introduced here the vector k to encode two properties defining a class of node: their degree k (a scalar), and their type r, which together determine the threshold M k for such nodes.The types are assumed to be from a discrete set of possibilities: all nodes of type r = 1, for example, might have the same threshold M 1 , with all nodes of type r = 2 having a common threshold M 2 , with M 2 = M 1 .In this way, the set of all nodes may be partitioned into disjoint sets labelled by their degree and their type; in mathematical notation we combine these into a 2-vector, defining k = {k, r} for the k-class of nodes.All nodes in the same k-class have the same degree and the same type, and therefore all share the same threshold M k .We generalize the degree distribution P k to the distribution P k , which gives the probability that a randomly-chosen node has vector k (i.e., has degree k and type r).For example, if the thresholds of the nodes are randomly chosen, independent of their degrees, then the P k distribution can be written as P k = P k P r , where P k is the degree distribution and P r is the probability that a node is of type r.By taking the discrete set of types to be sufficiently large, it is possible to approximate a continuous distribution of types or thresholds with a desired level of accuracy.With this extended notation, the AME approach can be generalized in an obvious manner, essentially replacing the scalar degree k by the vector k as appropriate in Eq. ( 6), to yield with the rate β s given by . Note the sums here are over all k-classes, i.e., over all degrees k and all types r: k := k r .
In [81] (see also [83]) it is argued that for no-recovery threshold models of the type ( 30)-( 31) described above, the fraction ρ(t) of active nodes at times t can be found by solving just two differential equations: where and Figure 8: Threshold model on a 5-regular random graph, with all nodes having the same threshold M = 2; the initial condition is ρ(0) = 0.05.The AME solution is identical to the solution of the 2-dimensional system (32).
Here ρ k (0) is the fraction of nodes with vector k that are activated (infected) at time t = 0; as in [83], we generalize the usual infected fraction ρ(0)-that is used elsewhere in this paper-to allow for possible dependence on the degree or type of the nodes chosen to "seed" the contagion.
In Appendix F we demonstrate that Eqs. ( 30) and ( 31) reduce to ( 32)-( 34) through an exact solution of (31) given by The distribution of s k,m for m ≥ M k is, in general, not of the binomial form (35), but it can nevertheless be given explicitly as detailed in Appendix F. This means that the reduced-dimension system (32) is not precisely of the PA type ( 4), but it enables efficient and very accurate solution of threshold-dynamics models, see the example in Fig. 8.

Conclusions
We have pointed out that many stochastic binary-state dynamical systems on networks can be described using the transition rates F k,m and R k,m introduced in Sec. 1.The main results of this paper are the identification of dynamics for which the full approximate master equation system (31) can be reduced, without loss of accuracy, to a lower-dimensional system of equations, occasionally even yielding closed-form solutions (such as Eqs.( 12) and ( 28)).We showed in Sec. 4 that the pair approximation system exactly matches the AME results for nodes in one state (e.g., the susceptible state) if the dynamics are monotone (i.e., the recovery rate R k,m is identically zero) and the infection rate is linear in the number of infected neighbors ( The classical SI model and the Bass diffusion model are of this type-see Figs. 2 and 3-and for Bethe lattices (z-regular random graphs) Eq. ( 12) explicitly gives the expected fraction of infected nodes.Interestingly, the PA solution is not exactly equal to the AME for nodes in the infected state, see Fig. 4.
In Sec. 5 we showed that if both sets of transition rates F k,m and R k,m are non-zero, then the AME solutions cannot be given exactly by the PA ansatz for all time t.However, in the special case defined by Eq. ( 17), which corresponds to equilibrium stochastic dynamics, i.e., those obeying detailed balance (microscopic reversibility), the AME solutions reduce to the PA solution in the limit t → ∞, see Figs. 5 and 6.This property allows us to perform bifurcation analysis for systems with up-down (Z 2 ) symmetry (i.e., for dynamics obeying condition Eq. ( 23)), giving the explicit expression (25) for the critical point where the steady-state limit of ρ changes from the disordered (paramagnetic) state ρ = 1/2 to an ordered (ferromagnetic) state with ρ = 1/2.The Glauber and Metropolis dynamics for the Ising spin model are important examples that obey both conditions ( 17) and ( 23), and our results reproduce the critical Ising temperature that was found using very different methods (replica trick, recursion approach) in [26,25].The analysis of [26,25] for the critical behavior on scale-free networks can also be applied here, and gives results described in Sec.6 and Appendix D. We highlight the fact that the critical point a = 1 for networks with infinite-variance degree distributions is attainable in plausible social influence models of this type, as is the case a < 1, although for the Ising model a = 1 corresponds to infinite temperature and the regime a < 1 is unphysical.We also give an explicit expression (Eq.( 28)) for the critical point (pitchfork bifurcation point) predicted by pair approximation of models with up-down symmetry-including non-equilibrium dynamics that obey (23) but not necessarily (17)-on z-regular random graphs, with the caveat that this PA critical point is not identical to the AME critical point except for the class of equilibrium models that obey (17) as well as (23).Finally, in Sec. 7, we showed that the AME approach can be extended to include threshold models of adoption or fad diffusion, and that the AME system can be reduced to a system of only two differential equations, see Eqs. (32) and Fig. 8.
The exact agreement of AME and PA solutions-whether for all time as in Sec. 4, or in the steady-state as in Sec.5-implies that higher-order correlations (beyond nearest-neighbor) are correctly captured by PA in the cases we have identified.Indeed, the agreement of the theory curves with numerical simulation results in all these cases (for all time in Figs. 2 and 3, and as t → ∞ in Figs. 5 and 6) is excellent.We interpret this as indicating that for the classes of dynamics where PA is equal to AME, the results of PA are essentially exact.In contrast, in cases where PA and AME are not equal, as in Fig. 7 for example, it is necessary to solve the full AME system to obtain high-accuracy approximations.
The present work is focussed only infinite, uncorrelated networks, with negligible levels of clustering (transitivity).Generalizing the AME approach to clustered network models and/or to networks with degree-degree correlations remains a challenge, as the added complexities will lead to even larger differential equation systems than needed for configuration model networks.Nevertheless, we hope the insights gained here may assist in generating and analyzing pair approximations for dynamics on such networks.Another direction for further work is to consider the effects of finite N upon the various approximations used here.Although the local (nodelevel) dynamics are stochastic, the differential equations derived for the emergent dynamics are deterministic, because we assume the N → ∞ limit.On smaller networks, stochastic effects will be important even at the global level and different approaches-such as branching processes [84]-will be required to describe the variability of results across realizations.40) through (48).For each set, the central (Ego) node is shown along with some of its neighbors: white nodes are susceptible/inactive/spin-down and black nodes are infected/active/spin-up.See also Fig. 1 of [44].
where each node may be in the +1 (spin-up) or the −1 (spin-down) state.The networks have degree distribution P k and are generated by the configuration model [31,32,4].Dynamics are stochastic, and are defined by infection and recovery probabilities F k,m and R k,m , which depend on the degree k of a node, and on the current number m of infected neighbors of the node, see Sec. 2.
We now proceed to derive the approximate master equations for dynamics of this type, closely following the approach used in [44,45] for SIS dynamics.Let S k,m (resp.I k,m ) be the set of nodes which are susceptible (resp.infected), have degree k, and have m infected neighbors.To quantify the size of these sets, define s k,m (t) (resp.i k,m (t)) as the fraction of k-degree nodes that are susceptible (resp.infected) at time t and have m infected neighbors.Then the fraction ρ k (t) of k-degree nodes that are infected at time t is given by and the fraction of infected nodes in the whole network is found by summing over all k-classes: If a randomly-chosen fraction ρ(0) of nodes are initially infected, then the initial conditions for s k,m and i k,m are easily seen to be where B k,m (q) is the binomial distribution introduced in Eq. ( 3).Note that we can also calculate the number of edges of various types using this formalism.For example, the number of edges in the network which join a susceptible node to an infected node (we call these S-I edges for short) can be expressed in two equivalent ways: The first of these expressions, for example, follows from noting that in a sufficiently large network that there are N P k nodes of degree k, of which a fraction s k,m are susceptible and have m infected neighbors.Each such node contributes m edges to the total number of S-I edges.Similar expressions may also be given for the number of S-S and I-I edges in the network.We note that the equivalence of the two expressions in (39) is preserved by the evolution equations described below.
Next, we examine how the size of the S k,m set changes in time.We write the general expression to reflect all the transitions whose rate is linear in dt (all other state-transitions are negligible in the dt → 0 limit), see Fig. 9.Here W (S k,m → I k,m ) dt, for example, is the probability that a node in the S k,m set at time t moves to the I k,m set by time t + dt.It is clear from the definitions above that A node moves from the S k,m−1 set to the S k,m set if it remains susceptible, while one of its susceptible neighbors becomes infected.Note this means that an S-S edge changes to an S-I edge.If we suppose that S-S edges change to S-I edges at a (time-dependent) rate β s , we can write [85] W since nodes in the S k,m set have k − m susceptible neighbors, while those in the S k,m−1 set have k −m+1 susceptible neighbors.To calculate β s , we count the number of S-S edges in the network at time t, and then count the number of edges which switch from being S-S edges to S-I edges in the time interval dt; the probability β s dt is given by taking the ratio of the latter to the former, i.e., A similar approximation is used to define γ s , the rate at which S-I edges change to S-S edges due to the recovery of an infected node: and we then write Taking the limit dt → 0 of equation (40) gives the master equation for the evolution of s k,m (t) (see Fig. 9): where m is in the range 0, . . ., k for each k-class in the network (and adopting the convention s k,−1 ≡ s k,k+1 ≡ 0).Applying identical arguments, mutatis mutandis, to the set I k,m , we derive the corresponding system of equations for i k,m (t): for m = 0, . . ., k and for each k-class in the network, with time-dependent rates β i and γ i defined through s k,m and i k,m as The approximate master equations ( 46) and ( 47), with the time-dependent rates β s , γ s , β i and γ i (defined as nonlinear functions of s k,m and i k,m ), form a closed system of deterministic equations which, along with initial conditions (38), can be solved numerically using standard methods [72].Note that the evolution equations are completely prescribed by the functions F k,m and R k,m , and so this method can be applied to any stochastic dynamical process defined by transition rates F k,m and R k,m .For the SIS model, equations (46) and (47) were derived in [44] (see also [86] and [45]), with additional terms to study adaptive rewiring of the network.

B Microscopic reversibility in equilibrium models
We consider transition rates F k,m and R k,m for which detailed balance holds, i.e., for which the dynamics exhibit microscopic reversibility [3]. Figure 10 shows a pair of connected nodes: node 1-on the left-has degree k 1 , and node 2-on the right-has degree k 2 .We assume that m 1 neighbors of node 1, other than node 2, are in the active (infected) state; similarly, m 2 is the number of infected neighbors, other than node 1, of node 2.
Starting from the state at the top of Fig. 10, where both node 1 and node 2 are in the susceptible state, we consider possible cycles of state-transitions for the node pair, assuming that For microscopic reversibility, it is necessary that the product of the rates around a closed cycle are independent of the direction of rotation around the cycle [3,38,66].From Fig. 10, this means we must have Rearranging gives the condition Since k 1 and k 2 (and m 1 and m 2 ) are independent, this requires that = a for all k, and all m ≤ k, where a is a constant.Defining u m = F k,m /R k,m , this equation can be written as the first-order difference equation with the solution u m = a m u 0 .In terms of the rates, this means that detailed balance requires that Identifying b k as F k,0 /R k,0 yields Eq. ( 17).

C Derivation of equation (20)
Here we manipulate Eqs. ( 18) and (19) for equilibrium models to produce the solution (20) and the algebraic equation ( 21) for p. Starting with Eq. ( 19), we can solve for q, yielding q = pa 1 − p + pa .
Solving (18) for ρ k yields Inserting expression ( 54) into (55) gives the desired equation (20).Next, we consider the identity (39), for which the PA ansatz gives Since p k and q k are independent of k for the case considered here-see Eq. ( 16)-we obtain the simpler expression where ω = k z ρ k .Solving Eq. ( 57) for ω and using Eq. ( 54) gives and equating this to k k z P k ρ k , with Eq. ( 20), gives Eq. ( 21).

D Critical point of equilibrium models with up-down symmetry
Equilibrium spin models with up-down symmetry are discussed in Sec.6, and it is shown there that they constitute a special case of the class of models defined by relation (17).The up-down symmetry imposes condition (24) on the parameters in the solution (20), so that ρ k can be expressed in the form Recall from Sec. 6 that a symmetric solution of the AME with ρ k = 1/2 exists for all t; comparison with (59) shows this corresponds to the solution with p = 1/(1 + √ a).Inserting this into the expression (58) for ω-the probability that one end of a randomly chosen edge is infected-gives ω = 1/2 for the symmetric solution.
Next, we investigate the possibility of other solutions lying near the symmetric solution; in the language of dynamical systems, we construct the normal form of the bifurcation [75].We choose ω as the order parameter, and let ω = 1 2 + ǫ, for small ǫ, to explore the neighborhood of the symmetric solution.Equation (58) can then be inverted to give p in terms of ǫ, and inserting this into Eq.( 21) leads to a self-consistent equation for ǫ: After some rearrangement, this can be written as and, in the case a = e 4J/T , this is very similar to Eq. ( 13) of [25] and Eq. ( 28) of [26], which were derived-using very different methods (recursion method and replica trick)-for the Ising model.It follows that the structure of solutions of Eq. ( 61) near the critical point (bifurcation point) can be analysed-for the class of models obeying both ( 17) and ( 23)-using the same methods for (61) as used in [26,25].If P k has finite fourth moment, for example, the right hand side of (61) can be expanded as a Taylor series in small ǫ, giving where and c 3 is a coefficient involving moments of P k up to the fourth.Equation ( 62) possesses the symmetric solution ǫ = 0 for all parameter values, but solutions with non-zero ǫ can also be found, if leading order terms balance such that The ǫ-independent coefficient c 1 − 1 in this equation is negative for small values of the parameter a, but it is positive when a exceeds the value a c given by Eq. ( 25), while c 3 is negative at a = a c .Thus Eq. ( 63) has real-valued solutions for ǫ provided that a ≥ a c , and so a c marks the pitchfork bifurcation point (critical point).
The case P k ∼ k −γ can be examined in the same fashion as in [26,25].The small-ǫ expansion of (61) has leading-order terms of the form If γ is in the range 2 < γ < 3, the ǫ γ−2 term dominates both the linear and cubic terms, and the critical point is determined by the vanishing of coefficient c 0 : this occurs at a = a c = 1.For exponents in the range 3 < γ < 5, the linear term dominates the ǫ γ−2 term, and the critical point is again ( 25), but with a different scaling near criticality.For γ > 5 we recover the case ( 62).

E PA critical point of symmetric models on z-regular random graphs
For models with the symmetry ( 23), but not necessarily possessing the equilibrium property ( 17), we focus here on the steady-state of the PA equations (4) on z-regular graphs.In particular, we suppose that the dynamics (through the rates F k,m ) depends on a parameter Q, and we derive the condition (28) determining the critical value (bifurcation point) of this parameter.We begin by noting that the property (23) means that ρ = 1/2 is always a steady-state solution of (4).Using this value, and property (23), on the right-hand side of the first of the PA equations (4) gives the steady-state relation which is satisfied-as are the other steady PA equations-if q = 1 − p in the symmetric state.Using this to replace q in the second of the PA equations gives, after some manipulation, the implicit relation for the steady-state value of p: Next we consider the possibility of steady-states that are distinct from the symmetric state solution discussed above; we introduce the symbol σ as a convenient shorthand for the symmetric state: σ = {ρ = 1/2, p solves (66), q = 1 − p}.For a general-possibly non-symmetric-steady state, the first of the PA equations is We now treat ρ, p, and q as implicit functions of Q, the parameter defining the rates F z,m .Differentiating both sides of equation ( 67) with respect to Q, evaluating at the symmetric state σ, using the relations q| σ = 1 − p| σ and, from (57), leads eventually to the relation where all terms are evaluated at the symmetric state σ.For this equation to be true, we must have either ∂ρ/∂Q = 0 or the term in square brackets must vanish (note the third term is a sum of positive terms so cannot be zero).The symmetry-breaking bifurcation points correspond to the vanishing of the square bracketed term, and this gives the critical value of p: Finally, inserting this value into the general steady-state solution (66) gives the criticality condition (28) on the rate F z,m (= R z,z−m ) of the PA dynamics.

F Derivation of reduced-dimension equations for threshold models
Our goal here is to demonstrate that Eqs.(30) and ( 31) reduce to Eqs. ( 32)-( 34) through an exact solution of (31) given by (35).We proceed to insert the ansatz (35) into the AME (31) for m < M k (for which m values we have F k,m = 0).The left hand side of (31) then gives, after some rearrangement, ṡk,m = (1 where dots denote time derivatives.Using the identity on the right hand side of Eq. ( 31) yields the condition φ = β s (1 − φ) (73) on the function φ(t) for the ansatz (35) to be an exact solution of (31).From the initial condition on s k,m we also obtain Comparing (32) and (73) we see that it remains for us to show that We first prove a preliminary and rather general result, as follows.Multiplying the AME ( 31 Note that the second term on the right hand side is a telescoping series that reduces to and that the definition of β s enables us to express the first term on the right hand side as Therefore ( 76) can be solved for β s to yield ( We now use this result and the definition of β s to prove (75).For the infection probabilities (30), β s is given by using (35).It is straightforward to verify that this reduces to equations ( 32)- (33).
For completeness, note that the distribution of s k,m for m ≥ M k is, in general, not of the binomial form (35), which applies only to m values below the threshold M k .To obtain the values of s k,m for these m values, note that equation (31)

Figure 1 :
Figure 1: Schematic of the infection rate F k,m and recovery rate R k,m as defined in the text.Here-and in Figs. 9 and 10-white nodes are susceptible/inactive/spin-down and black nodes are infected/active/spin-up.Examples of the dependence of the rates on the number m of infected neighbors: (a) susceptible-infected-susceptible disease spread model, (b) Glauber dynamics for the Ising model, (c) a stylized "friend-saturation" model.Case (c) is motivated by the data analysis of social contagion for Digg stories in Fig.4of[17], where the probability of, for example, becoming infected (voting on a Digg story) initially increases with the number of infected neighbors, but then saturates, and eventually decreases, giving a function that is roughly symmetric about m = k/2.

Figure 2 :
Figure 2: Susceptible-infected (SI) dynamics (with transmission rate λ = 1) on a truncated scalefree network, with degree distribution given by P k ∝ k −2.5 for degrees in the range 3 ≤ k ≤ 20, with P k = 0 otherwise.The initial infected fraction is ρ(0) = 10 −2 .Here, and in all subsequent Figures, symbols show the results of numerical simulations on networks of size N = 10 5 , using time step dt = 10 −4 .Results are averages over 24 realizations; error bars indicate mean ± one standard deviation (but note error bars are smaller than the symbols in this example).

Figure 4 :
Figure 4: Fraction of triples that are of S-I-S type, for SI dynamics on a 3-regular random graph.

Figure 6 :
Figure 6: Glauber dynamics for the Ising spin model dynamics on a Poisson (Erdös-Rényi)random graph with mean degree z = 7.The interaction parameter J is set to 1, the temperature T is 2/ log(2.5)≈ 2.18, and the initial fraction of up-spins is 0.51.

Figure 9 :
Figure 9: Schematic of transitions to/from the S k,m and I k,m sets, as described in equations (40) through(48).For each set, the central (Ego) node is shown along with some of its neighbors: white nodes are susceptible/inactive/spin-down and black nodes are infected/active/spin-up.See also Fig.1of[44].

Figure 10 :
Figure 10: Rates for state transitions of a connected pair of node, assuming that the states of all their other neighbors remain unchanged.Node 1 (left-hand node) has degree k 1 and has m 1 infected enighbors that are not shown.Node 2 (right-hand node) has degree k 2 and has m 2 neighbors-other than node 1-who are infected.

m 1 and m 2
remain unchanged.The transition from the {S, S} state (top of figure) to the {S, I} state (right of figure), for example, occurs at rate F k 2 ,m 2 , since node 2 becomes infected while it has m 2 infected neighbors.The transition from {S, I} (right) to {I, I} (bottom of Figure) occurs at rate F k 1 ,m 1 +1 , since node 1 is required to become infected at a time when it has m 1 + 1 infected neighbors.The other rates are derived similarly.
) by (k − m)P k , summing over m = 0, . . ., k and over the k-classes gives:d dt k P k m (k − m)s k,m = − k P k m F k,m (k − m)s k,m m) 2 s k,m − (k − m)(k − m + 1)s k,m−1 .(76)

2 d
dt k P k m (k − m)s k,m k P k m (k − m)s k,m as β s = − d dt ln(1−φ) and comparing with (79) gives the result that k P k m (k− m)s k,m and (1 − φ) 2 are equal, up to a multiplicative constant.Using the initial conditions on s k,m and φ determines the constant, giving k P k m (k − m)s k,m = z(1 − φ) 2 .

Table
. The recovery rate in the for SIS dynamics, see Appendix A for details.Let s k,m (t) (resp.i k,m (t)) be the fraction of k-degree nodes that are susceptible (resp.infected) at time t, and have m infected neighbors.Then the fraction ρ k (t) of k-degree nodes that are infected at time t is given by ρ k