Critical mass effect in evolutionary games triggered by zealots

Tiny perturbations may trigger large responses in systems near criticality, shifting them across equilibria. Committed minorities are suggested to be responsible for the emergence of collective behaviors in many physical, social, and biological systems. Using evolutionary game theory, we address the question of whether a finite fraction of zealots can drive the system to large-scale coordination. We find that a tipping point exists in coordination games, whereas the same phenomenon depends on the selection pressure, update rule, and network structure in other types of games. Our study paves the way to understanding social systems driven by the individuals' benefit in the presence of zealots, such as human vaccination behavior or cooperative transport in animal groups.

One hallmark of complex systems at the critical point is that small perturbations may trigger large responses, shifting the system from one equilibrium to another [1][2][3][4][5][6]. Small perturbations can be responsible for the emergence/disruption of collective phenomena such as synchronization [7,8], active Brownian motion (flocking) [9], and cultural evolution [10,11]. One way to model perturbations is to assume the existence of a committed minority, whose fluctuations may trigger a system-wide response. Committed minorities also play a pivotal role in the emergence of consensus among opinions and in coordination problems [12,13]. Over time, committed minorities have spurred cascades of behavioral changes leading to a shift in the conventions held by the majority of the population (e.g., civil rights movements [14], riots and revolutions [15,16], and vaccine hesitancy [17]).
We refer to a committed individual, i.e., one having strong beliefs about something, as a zealot. Zealotry can elicit consensus in human/animal behavior [18][19][20], the polarization of opinions in the voter model [21], the majority rule [13], the naming game [22][23][24], the Social Knowledge Structure (SKS) [25], and the Cooperative Decision Making Model (CDMM) [26] models. Zealotry can also drive the emergence of cooperation in evolutionary games [27][28][29], and the attainment of the optimal equilibrium in Schelling's model of social segregation dynamics [30]. Conversely, zealous (i.e., stubborn) dissenters can disrupt flocking in Vicsek's active matter model [9]. Lastly, including zealots of opposite types hinders the polarization of opinions in the voter model [31] and majority rule [13]. One question about zealots is whether or not their catalytic role in driving the system across equilibria requires a critical mass. Specifically, an infinitesimal fraction of zealots is enough to shift the equilibria in the voter [21], CDMM [26], and Schelling's [30] dynamics, whereas finite fractions are needed in both the majority rule [13] and Vicsek [9] dynamics, as well as in the naming game theoretically [22,24] and experimentally [32].
In this Letter, we clarify the presence of critical mass effects induced by zealots by providing a comprehensive study based on evolutionary game theory. We delineate the conditions under which a critical mass of zealots enabling one opinion state, which one often identifies with cooperation in the context of social dilemma games [33], exists in well-mixed and networked populations of agents, depending on the game's type, selection pressure, update rule, and network structure.
Let us consider a well-mixed population of N agents playing an evolutionary two-strategy game. The state of an agent i, 1 ≤ i ≤ N, is determined by its strategy, s_i, which we refer to as either cooperation (s_i = C) or defection (s_i = D). The entries of the payoff matrix, A, correspond to the payoff gained by the row agent and fully characterize the game. A subclass of all the possible 2 × 2 payoff matrices representing different types of social dilemmas is

         C    D
  A = C ( 1    S )
      D ( T    0 ),    (1)

with T ∈ [0, 2] and S ∈ [−1, 1] [33][34][35]. For example, when agent i cooperates (i.e., s_i = C) and agent j defects (s_j = D), the former gets payoff a_CD = S, and the latter gets a_DC = T. Matrix A allows us to study the following games: the Harmony Game (HG), the Stag Hunt (SH), the Hawk and Dove (HD), and the Prisoner's Dilemma (PD) (see Fig. S1 for the values of T and S corresponding to each game), albeit in this study we focus on the latter three. In each round, the total payoff for agent i is given by π(i) = [1/(N − 1)] Σ_{j=1; j≠i}^{N} a_{s_i s_j}. Once all agents have played with all the others, an ordered pair (i, j) of agents is selected uniformly at random to update their strategies. Agent i, with strategy X and payoff π_X, will adopt the strategy Y of agent j, with payoff π_Y, where X, Y ∈ {C, D}, with a probability P_{X←Y} specified by the so-called Fermi rule [36,37]:

  P_{X←Y} = 1/(1 + e^{−β(π_Y − π_X)}),    (2)

where β ∈ [0, ∞[ is an inverse temperature parameter, also called the selection pressure. A large value of β makes P_{X←Y} more sensitive to the payoff difference, π_Y − π_X, corresponding to a strong selection pressure. After the strategy updating, the round is completed. Then, all agents reset their payoffs and play the next round. The state of the system is determined by the fraction of cooperative agents, f_C = N_C/N ∈ [0, 1], where N_C is the number of cooperators. The dynamics continues until the system reaches one of its absorbing states, i.e., f_C ∈ {0, 1} [33,38]. Committed individuals are modeled as a new class of agents called zealots.
A zealot always cooperates and is immune to strategy updating, while other agents may still imitate her and become cooperators.
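For concreteness, the update step described above can be sketched as follows. This is a minimal illustration with our own function names, not the authors' code, and the agent representation (a strategy list plus a zealot mask) is an assumption made for the example.

```python
import math
import random

def fermi_prob(pi_x, pi_y, beta):
    """Fermi rule: probability that the focal agent with payoff pi_x
    adopts the strategy of a model agent with payoff pi_y."""
    return 1.0 / (1.0 + math.exp(-beta * (pi_y - pi_x)))

def imitate(strategies, is_zealot, i, j, payoffs, beta, rng):
    """One update attempt: agent i may copy agent j's strategy.
    Zealots never update, but they can still be imitated (as agent j)."""
    if is_zealot[i]:
        return
    if rng.random() < fermi_prob(payoffs[i], payoffs[j], beta):
        strategies[i] = strategies[j]
```

At β = 0 the rule reduces to a coin flip; for large β it approaches a step function of the payoff difference.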
To examine the extent to which a finite fraction of committed cooperators triggers cooperation, we replace a fraction f_Z = N_Z/N ∈ [0, 1/2] of normal agents with zealots. Then, we measure the fraction of cooperators among the normal agents, f_C = N_C/(N − N_Z) (with a slight abuse of notation, N_C hereafter counts the cooperators among normal agents), in the stationary state. Figure 1 shows the average fraction of cooperators in the stationary state, f_C, for the HD, PD, and SH games for arbitrary choices of T and S (see SM for a comprehensive exploration of the (T, S) space). Figure 1 indicates that f_Z has little effect on f_C for the HD and PD games. However, for the SH game we observe at f_Z ≈ 0.15 a sharp transition from a state where cooperation is sustained almost exclusively by zealots (i.e., f_C ≈ 0) to a fully cooperative state (i.e., f_C = 1), denoting the existence of a critical mass effect akin to the one observed for the majority rule opinion model [13] and for other opinion dynamics models [21,22,32]. In contrast, the lack of a critical mass effect observed in the HD and PD games suggests the inability of zealots to initiate a positive feedback mechanism enabling large-scale invasion of cooperators. Furthermore, defectors end up exploiting zealots in the HD game, yielding a reduction in f_C as the fraction of zealots increases [39]. This trend is consistent with the theoretical result for infinite well-mixed populations, corresponding to the dashed line in Fig. 1 (see SM for the details).
To address why zealots trigger the onset of cooperation in the SH but not in the HD and PD games, we study the evolutionary dynamics in terms of the concentration of cooperators, f_C, in the thermodynamic limit. The following critical mass results are qualitatively the same for finite populations (Fig. S3) and for the replicator dynamics [38], for which a critical mass effect exists for all the dilemmas and selection pressure strengths allowed (see Fig. S5 and SM for details). Given an infinite population, the evolutionary dynamics is described by ḟ_C = P_{D→C} − P_{C→D}, where ḟ_C denotes the derivative of f_C with respect to time. Quantity P_{X→Y} is the probability per unit time that a pair of agents having strategies X and Y is selected for updating and the agent with strategy X then changes it to Y through Eq. (2). Therefore, one obtains

  ḟ_C = (1 − f_Z)(1 − f_C)[ x/(1 + e^{−βΔπ}) − (1 − f_Z) f_C/(1 + e^{βΔπ}) ],    (3)

where x = f_Z + (1 − f_Z) f_C is the overall fraction of cooperators and Δπ = π_C − π_D = (1 − S − T)x + S is the difference between the payoffs of a cooperator and a defector. We then search for the equilibria of the dynamics described by Eq. (3), i.e., the values f*_C such that ḟ_C vanishes at f_C = f*_C. This is effectively done in Fig. 2, where we plot ḟ_C as a function of f_C (i.e., we draw the phase portrait) for the three types of games and two selection pressure values, β = {10, 1}. The hue of the lines corresponds to different values of f_Z, including the case of zealots' absence (blue line). The solid square represents the fully cooperative absorbing state (f_C = 1), and the other equilibria correspond to the intersection(s) of the curves with ḟ_C = 0 (dashed lines). We first observe that the equilibrium f_C = 0 (i.e., absence of cooperation) exists if and only if zealots are absent. Therefore, zealots induce a positive fraction, which may be small, of cooperators among normal agents (see SM for the proof). Second, the equilibrium f_C = 1 (i.e., fully cooperative state) exists for any f_Z and is either stable (SH game) or unstable (HD and PD games).
Third, there is at most one intermediate equilibrium, f_C = γ, with 0 < γ < 1, whose position depends on the dilemma's type, f_Z, and the selection pressure, β. In the HD game (Fig. 2(a)), f_C = γ is always stable, but its position moves towards f_C = 0 as f_Z becomes larger, in agreement with the scenario portrayed in Fig. 1, in which defectors exploit zealots. In the PD game (Fig. 2(b)), the position of f_C = γ is slightly positive, such that zealots induce some cooperation, in agreement with the observation made in Refs. [27,28]. However, γ is small and insensitive to f_Z. In the SH game (Fig. 2(c)), f_C = γ is unstable when it exists, and the response to zealots is richer than in the HD and PD games. When f_Z < f*_Z ≈ 0.4, the dynamics is bistable such that a stable equilibrium where either C or D is the majority is attained, depending on the initial condition. This dynamics is qualitatively the same as that for f_Z = 0, in which case the stable equilibria are located at f_C = 0 and f_C = 1, and the unstable equilibrium is located at f_C = γ = 1/2. However, as f_Z increases, the system undergoes a saddle-node bifurcation at f_Z = f*_Z such that the left stable equilibrium and f_C = γ coalesce and disappear. When f_Z > f*_Z, the only equilibrium left is f_C = 1, confirming the existence of a critical mass effect (an extensive exploration of the (T, S) space is reported in Fig. S6). Finally, although zealots do not trigger the transition to full cooperation in the HD and PD games, they elongate the transient time needed to reach the equilibrium (Fig. S7 and [29]).
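The bifurcation scenario just described can be verified numerically. The sketch below (our own function names; it assumes the standard mean-field Fermi dynamics with overall cooperator fraction x = f_Z + (1 − f_Z) f_C and payoff difference Δπ = (1 − S − T)x + S) counts the interior equilibria of ḟ_C by locating its sign changes:

```python
import math

def fdot(fc, fz, T, S, beta):
    """Mean-field rate of change of the cooperator fraction among
    normal agents under the Fermi rule with cooperative zealots."""
    x = fz + (1.0 - fz) * fc              # overall cooperator fraction
    dpi = (1.0 - S - T) * x + S           # payoff difference pi_C - pi_D
    g_plus = 1.0 / (1.0 + math.exp(-beta * dpi))
    g_minus = 1.0 / (1.0 + math.exp(beta * dpi))
    return (1.0 - fz) * (1.0 - fc) * (x * g_plus - (1.0 - fz) * fc * g_minus)

def interior_equilibria(fz, T, S, beta, n=20000):
    """Interior equilibria of fdot on (0, 1), found as sign changes."""
    roots, prev = [], fdot(1e-9, fz, T, S, beta)
    for k in range(1, n):
        cur = fdot(k / n, fz, T, S, beta)
        if (prev > 0.0) != (cur > 0.0):
            roots.append(k / n)
        prev = cur
    return roots
```

For an SH game with T = 0.5, S = −0.5, and β = 10 (illustrative values, not necessarily those used in Fig. 2), the dynamics is bistable for small f_Z (two interior sign changes) and loses all interior equilibria above the critical mass.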
Are there other ways in which zealots trigger the critical mass effect? In Figs. 2(d), (e), and (f), we examine the case of weak selection pressure by setting β = 1. In the HD game (Fig. 2(d)), the results are opposite to those for strong selection pressure (Fig. 2(a)) in that f_C = γ moves towards f_C = 1 (as opposed to f_C = 0) as one adds zealots. In the PD game (Fig. 2(e)), the equilibrium f_C = γ exists for any f_Z > 0, is stable, and moves towards f_C = 1 as f_Z increases. Nevertheless, we do not observe a critical mass effect. In the SH game, decreasing β reduces the critical mass threshold, f*_Z, from 0.4 to 0.08 (Fig. 2(f)). In a nutshell, although reducing the selection pressure β does not spawn a critical mass effect in the HD and PD games, it still fosters higher levels of cooperation in all the dilemmas considered, and it reduces the threshold for the critical mass effect in the SH game. We note that, under the replicator dynamics, the critical mass effect with f_Z = 0.3 exists regardless of whether the selection pressure is strong or weak, and under all types of games (Fig. S5).
Finally, we consider agents interacting on a network to assess the role of heterogeneous numbers of interactions across agents. In Fig. 3, we report the fraction of cooperators among normal agents in the three types of games for random regular lattices (REG), Erdős–Rényi (ER), and Barabási–Albert (BA) networks [40][41][42]. All networks have the same number of nodes, N = 1000, and average number of neighbors, k = 6. Agents' payoffs are computed according to the so-called additive payoff scheme, i.e., the payoff of an agent i is π(i) = Σ_{j∈N_i} a_{s_i s_j}, where N_i is the set of neighbors of i [35]; we have confirmed that the results remain qualitatively the same when we normalize π(i) by agent i's degree (i.e., when we use the average payoff scheme). We consider a fraction f_Z of uniformly randomly selected nodes to be zealots, whereas the other nodes are initialized as defectors. In the HD game, BA networks, but not the REG and ER networks, attain almost full cooperation for f_Z = 0, in agreement with previous results [34,43,44]. An increase in f_Z allows defectors to exploit normal cooperators regardless of the network's type (Fig. 3(a)), in agreement with the results for well-mixed populations (Fig. 1). In the PD game (Fig. 3(b)), BA networks display a clear critical mass effect, with a sizable increase in cooperation for f_Z > 0.25. We set T = 1.5 and S = −0.5 such that cooperation would be absent in the BA networks without zealots, even though BA networks with the additive payoff scheme considerably enhance cooperation in the PD game [34,43,44]. Finally, in the SH game (Fig. 3(c)), networked populations shift the transition to f_C = 1 to slightly smaller values of f_Z. Our extensive exploration of the (T, S) space supports the generality of these results (Figs. S8 and S9).
In this Letter, we have studied how committed individuals induce qualitative and quantitative changes in evolutionary dynamics under different types of social dilemma. We have considered three evolutionary dilemmas, and observed that only the SH game displays a clear critical mass effect, whereas in the other two dilemmas we need to reduce the selection pressure, change the update rule, or consider heterogeneous networks to observe a critical mass effect. The presence of a critical mass in terms of the fraction of zealots in the SH game is in line with the observations made for other coordination dynamics such as the majority rule model [13], the naming game [22][23][24]32], and the voter model [21].
Given our analytical underpinning, extensions to evolutionary vaccination scenarios [45], where zealots contribute in part to herd immunization and thereby elicit defection (i.e., not taking immunization) among normal agents, in agreement with the phenomenology observed in the HD game, would be straightforward. In the evolutionary synchronization dynamics based on the so-called evolutionary Kuramoto dilemma [46] (which is akin to a coordination dynamics), the existence of a critical mass, in analogy with the SH game, is expected to enable the emergence of either global or chimera synchronized states. Such states would not be observable in the absence of zealots. Hence, the addition of zealots to populations of evolutionary oscillators could be seen as the evolutionary analog of pinning in control theory [47]. Our findings may also be useful for understanding why zealots succeed in suppressing oscillations in the rock-paper-scissors game [48], and whether or not a non-negligible fraction of zealots is required for the emergence of collective coordination in human [20] and animal [18] behavior, cooperative transport in ants [19,49], and cell migration [50]. For the latter two dynamics, a lengthened correlation length plays a role in coordination [19,50] similar to that of a low selection pressure in our framework. Finally, we have considered only a random placement of zealots on the nodes of the network. However, placing zealots on nodes ranked according to their topological properties can reduce the critical fraction of zealots needed to attain full consensus [51].
The authors thank A. Baronchelli

Payoff matrix
Let us consider a population of N agents playing an evolutionary game. The state of an agent i is characterized by its strategy, s_i. We assume two strategies, i.e., cooperation (s = C) and defection (s = D). An agent i plays with all her neighbors j and receives a payoff according to her own strategy, s_i = X, and her opponent's one, s_j = Y, with X, Y ∈ {C, D}. The entries of the so-called payoff matrix, A, correspond to the payoff, a_XY, gained by the row agent and fully characterize the game. For games with two strategies, the most generic form of the payoff matrix is

         C    D
  A = C ( R    S )
      D ( T    P ),    (1)

where mutual cooperation leads to the reward, R, whereas mutual defection leads to the punishment, P. A cooperator that plays with a defector gets S (the sucker's payoff), whereas a defector playing with a cooperator gets the temptation, T [1]. We consider three types of games, namely, the Hawk and Dove, for which T > R > S > P, the Stag Hunt, for which R > T > P > S, and the Prisoner's Dilemma, for which T > R > P > S. Without loss of generality, we set R = 1 and P = 0 to obtain the following reduced payoff matrix:

         C    D
  A = C ( 1    S )
      D ( T    0 ),    (2)

where T and S are parameters. Then, we restrict T ∈ [0, 2] and S ∈ [−1, 1] [2], which is sufficient for obtaining the three types of dilemmas (and a fourth dilemma which we do not analyze), as summarized in Fig. S1.

2 Analytical results

Evolutionary dynamics in finite populations
In finite populations, we can study the onset of cooperation in the presence of zealots using one-dimensional random walks [3][4][5]. Let us consider a population of size Ñ = N + z, where N is the number of normal agents and z = f_Z Ñ is the number of zealous agents. Our goal is to study the evolution of the number of cooperators among normal agents, N_C ≡ i (0 ≤ i ≤ N), given a fraction of zealots, f_Z. For the payoff matrix given by Eq. (2), the payoffs associated with cooperation and defection are equal to

  π_C = [(i + z − 1) + (N − i)S] / (Ñ − 1)    (3)

and

  π_D = (i + z)T / (Ñ − 1),    (4)

respectively. The evolutionary dynamics is equivalent to a one-dimensional random walk on a finite line whose position is identified with the number of normal cooperators, 0 ≤ i ≤ N. Figure S2 shows the possible state transitions in each strategy updating, where T_−, T_+, and 1 − T_+ − T_− are the probabilities that i decreases by one, increases by one, and remains the same after a single strategy updating, respectively. The probability that i increases (decreases) by one is equal to the product of the probability of choosing a pair of agents with different strategies, P_ch, and the probability that one of the agents updates its strategy, P_{D←C} (P_{C←D}). By keeping in mind that zealots do not update their strategy but can induce defectors to become cooperators, under the Fermi rule one obtains

  T_+ = [(N − i)(i + z)/Ñ²] · 1/(1 + e^{−βα})    (5)

and

  T_− = [i(N − i)/Ñ²] · 1/(1 + e^{βα}),    (6)

where α = π_C − π_D and we have used Ñ(Ñ − 1) ≈ Ñ², assuming a large Ñ. Then, the bias in the random walk reads

  T_+ − T_− = [(N − i)/Ñ²] · [(i + z)/(1 + e^{−βα}) − i/(1 + e^{βα})].    (7)

It is convenient to compute the payoff difference α = π_C − π_D using Eqs. (3) and (4), yielding

  α = [i(1 − S − T) + NS + z(1 − T) − 1] / (Ñ − 1).    (8)

In terms of α, Eq. (7) is rewritten as

  T_+ − T_− = [(N − i)/Ñ²] · [(i + z) − i e^{−βα}] / (1 + e^{−βα}).    (9)

The equilibria of the evolutionary dynamics are the values of i such that T_+ − T_− = 0. Given Eq. (9), finding the equilibria is equivalent to finding the solutions of

  (N − i)[(i + z) − i e^{−βα}] = 0.    (10)

Therefore, i = N, corresponding to the fully cooperative state, is an equilibrium. To find other equilibria, we start by rewriting the payoff difference, α, in terms of i as follows:

  α = λ_1 i + λ_2,    (11)

where

  λ_1 = (1 − S − T)/(Ñ − 1)    (12)

and

  λ_2 = [NS + z(1 − T) − 1]/(Ñ − 1).    (13)

In terms of λ_1 and λ_2, we obtain

  i e^{−β(λ_1 i + λ_2)} − (i + z) = 0.    (14)

Therefore, i = 0 is a solution of Eq. (14) if and only if z = 0. This means that zealots always induce some cooperation among normal agents. Furthermore, in agreement with the behavior observed in the absence of zealots for the SH and HD games, we expect to observe another equilibrium 0 < i < N. However, finding a closed-form expression for such an equilibrium (i.e., a solution of Eq. (14)) is not easy because Eq. (14) is a transcendental equation. After identifying all the equilibria, we examine their stability. Equilibrium i = N is absorbing and corresponds to full cooperation. Let us analyze the stability of the other equilibria by focusing on the sign of the random-walk bias given by Eq. (9). The sign exclusively depends on the numerator's sign because the denominator of Eq. (9) is always positive. Specifically, we examine the sign of

  Λ ≡ (i + z) − i e^{−βα}.    (15)

Let us concentrate on the case Λ > 0, i.e.,

  i + z > i e^{−βα}.    (16)

By substituting Eq. (11) into Eq. (16) and taking the logarithms of both sides, one obtains

  β(λ_1 i + λ_2) > ln[i/(i + z)].    (17)

By substituting Eqs. (12) and (13) into Eq. (17), one obtains

  β[i(1 − S − T) + NS + z(1 − T) − 1]/(Ñ − 1) > ln[i/(i + z)].    (18)

Similarly, Λ < 0 leads to

  β[i(1 − S − T) + NS + z(1 − T) − 1]/(Ñ − 1) < ln[i/(i + z)].    (19)

For i = 0, inequality (18) always holds true. This confirms that i = 0 is a reflecting state and that zealots always generate some cooperation.
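As a numerical cross-check, the bias can be evaluated directly. Below is a minimal sketch with our own function names, using the finite-population payoffs and the large-population approximation (N + z)(N + z − 1) ≈ (N + z)² discussed above:

```python
import math

def bias(i, N, z, T, S, beta):
    """Random-walk bias T+ - T- with i cooperators among N normal agents
    and z cooperative zealots (Fermi rule, total population M = N + z)."""
    M = N + z
    pi_c = ((i + z - 1) + (N - i) * S) / (M - 1)  # cooperator's payoff
    pi_d = (i + z) * T / (M - 1)                  # defector's payoff
    alpha = pi_c - pi_d
    t_plus = (N - i) * (i + z) / M**2 / (1.0 + math.exp(-beta * alpha))
    t_minus = i * (N - i) / M**2 / (1.0 + math.exp(beta * alpha))
    return t_plus - t_minus
```

The end points behave as expected: i = N is absorbing (zero bias), and i = 0 is reflecting whenever z > 0.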
In Fig. S3 we report T + − T − given by Eq. (9) as a function of the number of cooperators among normal agents, i, for three values of β, three types of game (i.e., HD, PD, and SH), and N = 10 5 . Figure  S3 suggests that the results are qualitatively the same as those for the infinite well-mixed populations reported in Figs. 2 and S4.  Figure S3: Random walk's bias, T + − T − (Eq. (9)), as a function of the number of cooperators among normal agents, i. Each column accounts for a different dilemma, namely, HD (panels a, d, and g), PD (b, e, and h), and SH (c, f, and i). The hue of the solid lines corresponds to the fraction of zealots, f Z , in the population, with the case without zealots being shown in cyan. The square at i = N represents the fully cooperative absorbing equilibrium, and the filling color denotes whether the equilibrium is stable (black) or unstable (white). The results are for the Fermi update rule. Each row corresponds to a different value of β, namely, β = 1 (top row), β = 10 (middle row), and β = 100 (bottom row). The population size is N = 10 5 .

Evolutionary dynamics in infinite populations
In this section, we analyze infinite populations following the approach introduced in Ref. [6]. We study the dynamics of the fraction of cooperators among normal agents, f_C, whose time derivative we denote by ḟ_C, as a function of f_C (i.e., we plot the phase portrait [7]), when we introduce a fraction, f_Z ≥ 0, of cooperative zealots into the population. For strategy updating, we consider the Fermi rule [8] and the replicator dynamics [3].

Fermi rule
The average payoff of a cooperator, π_C, and that of a defector, π_D, are given by

  π_C = x + (1 − x)S    (20)

and

  π_D = xT,    (21)

respectively, where x = f_Z + (1 − f_Z) f_C is the fraction of cooperators in the entire population. Thus,

  ḟ_C = P_{D→C} − P_{C→D},    (22)

where P_{X→Y} is the joint probability that one selects a pair of agents with strategies (X, Y) uniformly at random and that the agent with strategy X adopts strategy Y under the Fermi rule. These probabilities read

  P_{D→C} = (1 − f_Z)(1 − f_C) x / (1 + e^{−βΔπ})    (23)

and

  P_{C→D} = (1 − f_Z) f_C (1 − x) / (1 + e^{βΔπ}),    (24)

where

  Δπ = π_C − π_D = (1 − S − T)x + S.    (25)

By substituting Eqs. (23) and (24) into Eq. (22), one obtains

  ḟ_C = (1 − f_Z)(1 − f_C) [ (f_Z + (1 − f_Z)f_C)/(1 + e^{−βΔπ}) − (1 − f_Z) f_C/(1 + e^{βΔπ}) ].    (26)

At an equilibrium f*_C (i.e., ḟ_C = 0), one obtains

  (1 − f*_C) [ (f_Z + (1 − f_Z) f*_C)/(1 + e^{−βΔπ*}) − (1 − f_Z) f*_C/(1 + e^{βΔπ*}) ] = 0.    (27)

In the absence of zealots (i.e., f_Z = 0), we have two equilibria, f_C = 0 and f_C = 1. When f_Z > 0, f_C = 1 is still an equilibrium but f_C = 0 is not, and it is replaced by a new equilibrium located at 0 < f_C < 1. The behavior of ḟ_C against f_C for different values of β and the three types of dilemmas is displayed in Fig. S4. Note that panels (a) to (f) also appear in Fig. 2 of the main text.

Replicator dynamics
The replicator dynamics constitutes the infinite-size counterpart of the Moran rule [3,9]. Under the Moran process, we first select an agent, called the child, uniformly at random among the normal agents. Second, we choose an agent, called the parent, randomly with probability proportional to the agent's payoff. Third, the parent's strategy replaces the child's one. To ensure that the probability of choosing a parent is well defined, payoffs must always be positive. Since we have assumed S ∈ [−1, 1] for the payoff matrix (Eq. (2)), S may be negative. Therefore, we alter all entries of the payoff matrix by adding to them a constant ε > 1 such that they are all positive. This alteration preserves the type of dilemma. Therefore, we use

          C    D
  A′ = C ( R′   S′ )
       D ( T′   P′ ),    (28)

where R′ = 1 + ε, S′ = S + ε, T′ = T + ε, P′ = ε, and ε > 1. For the payoff matrix given by Eq. (28), the payoff of a cooperator is

  π′_C = x + (1 − x)S + ε = π_C + ε,    (29)

where x = f_Z + (1 − f_Z) f_C is the fraction of cooperators in the entire population. Similarly, for π′_D we get

  π′_D = xT + ε = π_D + ε,    (30)

where π_C = x + (1 − x)S and π_D = xT are the payoffs computed using the entries of the reduced payoff matrix (Eq. (2)). One way to introduce selection pressure in the replicator dynamics is to reformulate the payoffs as follows:

  π_C = 1 − w + w[x + (1 − x)S]    (31)

and

  π_D = 1 − w + w x T,    (32)

where w ∈ [0, 1] determines the intensity of selection. The payoffs displayed in Eqs. (29) and (30) and those in Eqs. (31) and (32) are equivalent under the identification w = 1/(1 + ε): π′_C and π′_D shown in Eqs. (29) and (30) are both (1 + ε) times larger (corresponding to a change in the time scale without changing other aspects of the replicator dynamics) than π_C and π_D given by Eqs. (31) and (32). Since the condition ε > 1 must hold to ensure that the Moran process is well defined, one must have w ∈ ] 0, 1/2 [. Using w as the parameter controlling the selection pressure and Eq. (2) as the payoff matrix, the payoffs are thus given by

  π_C = 1 − w + w[x + (1 − x)S]    (33)

and

  π_D = 1 − w + w x T,    (34)

and the average payoff reads

  π̄ = x π_C + (1 − x) π_D,    (35)

where π̄ is the payoff averaged over all agents, including zealots. The probability of choosing a cooperator or a defector as the parent is given by

  P_C = x π_C / π̄    (36)

and

  P_D = (1 − x) π_D / π̄,    (37)

respectively.
The replicator dynamics is driven by the difference between the rate at which defectors become cooperators, T_+, and the rate at which cooperators become defectors, T_−. Thus,

  ḟ_C = (1 − f_C) P_C − f_C P_D.    (38)

The first term on the right-hand side of Eq. (38) represents the probability that a defector is chosen as the child and a cooperator as the parent. The second term represents the probability that a cooperator is chosen as the child and a defector as the parent. By substituting Eqs. (36) and (37) into Eq. (38), one obtains

  ḟ_C = [(1 − f_C) x π_C − f_C (1 − x) π_D] / π̄.    (39)

By substituting Eqs. (33)–(35) into Eq. (39), we obtain

  ḟ_C = { (1 − f_C)(f_Z + (1 − f_Z) f_C)[1 − w + w(x + (1 − x)S)] − f_C (1 − f_Z)(1 − f_C)[1 − w + w x T] } / π̄,    (40)

where we have used 1 − x = (1 − f_Z)(1 − f_C). In Fig. S5, we show ḟ_C versus f_C for different values of f_Z in the case of the HD, PD, and SH games for three different values of the selection pressure, w ∈ {0.1, 0.3, 0.49}. We observe the presence of a critical mass effect in all the dilemmas regardless of w.
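A minimal numerical sketch of these dynamics follows (our own function names; forward-Euler integration is our choice for illustration, not the authors' method):

```python
def replicator_rhs(fc, fz, T, S, w):
    """dot f_C for the replicator (Moran) dynamics with cooperative
    zealots and selection intensity w (0 < w < 1/2)."""
    x = fz + (1.0 - fz) * fc                      # overall cooperator fraction
    pi_c = 1.0 - w + w * (x + (1.0 - x) * S)
    pi_d = 1.0 - w + w * x * T
    pi_bar = x * pi_c + (1.0 - x) * pi_d          # population-average payoff
    return ((1.0 - fc) * x * pi_c - fc * (1.0 - x) * pi_d) / pi_bar

def integrate(fc0, fz, T, S, w, dt=0.05, steps=40000):
    """Forward-Euler integration of the replicator dynamics,
    clipped to the physical interval [0, 1]."""
    fc = fc0
    for _ in range(steps):
        fc = min(max(fc + dt * replicator_rhs(fc, fz, T, S, w), 0.0), 1.0)
    return fc
```

With illustrative values T = 0.5, S = −0.5 (SH) and w = 0.49, a fraction f_Z = 0.3 of zealots drives the population from f_C = 0 to almost full cooperation, while an HD game with f_Z = 0 relaxes to its interior equilibrium.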
One obtains the equilibria of the dynamics given by Eq. (40) by solving

  (1 − f_C)[(f_Z + (1 − f_Z) f_C) π_C − f_C (1 − f_Z) π_D] = 0,    (41)

where π_C and π_D are given by Eqs. (33) and (34). The solution f_C = 1 corresponds to full cooperation. As expected, this equilibrium exists regardless of the value of f_Z, w, and the type of the game. The other solutions of Eq. (41) are the solutions of the following equation:

  (f_Z + (1 − f_Z) f_C) π_C − f_C (1 − f_Z) π_D = 0.    (42)

Solving Eq. (42) is not straightforward. However, we can extract some useful information under specific conditions. First, we verify whether f_C = 0 is a solution of Eq. (42) or not. By setting f_C = 0, one obtains

  f_Z [1 − w + w(f_Z + (1 − f_Z)S)] = 0.    (43)

Equation (43) has two solutions, i.e.,

  f_Z = 0    (44)

and

  1 − w + w[f_Z + (1 − f_Z)S] = 0.    (45)

The solution corresponding to Eq. (44) tells us that f_C = 0 is a solution of Eq. (41) only in the absence of zealots, which is the same as in the case of the Fermi rule. Next, Eq. (45) is equivalent to

  f_Z = (1 − w + wS)/(w(S − 1)).    (46)

The denominator is always negative. Therefore, f_Z ≥ 0 requires that the numerator be nonpositive, giving

  S ≤ 1 − 1/w.    (47)

Inequality (47) cannot be satisfied because w < 1/2 implies 1 − 1/w < −1 ≤ S. Hence, Eq. (45) yields no admissible fraction of zealots, and f_C = 0 is an equilibrium only when f_Z = 0. Let us now study the effects of the selection pressure, w, on the solutions of Eq. (42). In the zero selection limit, w → 0, Eq. (42) becomes

  f_Z + (1 − f_Z) f_C − (1 − f_Z) f_C = f_Z = 0.    (48)

Hence, in the zero selection limit, the only solutions to ḟ_C = 0 apart from f_C = 1 are those obtained in the absence of zealots. A finite selection pressure returns a different scenario. For w > 0, Eq. (42) can be rewritten as

  (1 − w) f_Z + w (f_Z + (1 − f_Z) f_C)[x + (1 − x)S − f_C (1 − f_Z)T] = 0,    (49)

which is a quadratic equation in f_C,

  A f_C² + B f_C + C = 0,    (50)

with coefficients A, B, and C depending on T, S, w, and f_Z. The solutions of Eq. (50) are given by f_C = (−B ± √(B² − 4AC))/(2A). Therefore, there are at most two other solutions to Eq. (41). There is no real solution f_C if B² − 4AC < 0, which is the case if and only if f_Z > f̂_Z, if such an f̂_Z exists. The value f̂_Z corresponds to the critical mass of zealots needed to observe only full cooperation. The values of f̂_Z are displayed in Tab. S1.

Table S1: Minimum fraction of zealots, f̂_Z, required to observe exclusively the fully cooperative equilibrium (i.e., f_C = 1) for the replicator dynamics in the strong selection limit (i.e., w = 0.49).
3 Numerical results in mean-field populations

Simulation setup
We provide here some details about the numerical implementation of our agent-based simulations. Given a population of N agents, we initialize the agents' strategies such that we have a fraction f_Z of zealots. Then, we start the evolutionary dynamics where, at each time step, we first perform a uniform random sampling of 5N pairs of agents and let them play and accumulate payoff according to Eq. (2). In this way, each agent plays ten times on average. After the agents accumulate their payoffs, they update their strategies, the payoffs are reset to zero, we measure the fraction of cooperators among normal agents, f_C, and a new round begins. We let the dynamics evolve for a transient of t_tr time steps during which we only store the value of f_C. Then, for t > t_tr, we evaluate the average and the standard deviation of f_C over a moving window of length t_tr. For the PD and SH games, we expect the dynamics to end in one of its two absorbing states, f_C = 0 or f_C = 1. However, to expedite the computation, instead of waiting until the system reaches exactly the absorbing state, we set a condition such that if f_C > 0.95 (or f_C < 0.05) and σ_{f_C} < 0.01, the evolution stops, because we assume that the system is in a metastable state and that it will eventually reach the absorbing state. In the case of the HD game, we know that the stationary state satisfies 0 < f_C < 1 and that reaching stationarity takes a long time. In such a case (or if the dynamics does not meet any of the aforementioned conditions), the dynamics runs "freely" and stops after t_max = 5 · 10⁵ steps. Then, we compute f_C over the last t_tr steps and use that value as the "final" level of cooperation.
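The protocol above can be condensed into a toy implementation. In the sketch below (our own function names and parameter values), we simplify the 5N sampled pairings by using the exact well-mixed average payoffs, and we perform one Fermi update per round; this is a sketch of the scheme, not the code used in the paper.

```python
import math
import random

def run_well_mixed(N=100, fz=0.45, T=0.5, S=-0.5, beta=10.0,
                   max_steps=60000, seed=1):
    """Well-mixed stochastic dynamics with cooperative zealots.
    Agents 0..nz-1 are zealots (C); normal cooperators are counted by i."""
    rng = random.Random(seed)
    nz = int(fz * N)
    n_norm = N - nz
    i = 0                                 # cooperators among normal agents
    for _ in range(max_steps):
        if i == n_norm:
            break                         # absorbing: all normal agents cooperate
        nc = i + nz                       # cooperators in the whole population
        pi_c = ((nc - 1) + (N - nc) * S) / (N - 1)
        pi_d = nc * T / (N - 1)
        a, b = rng.randrange(N), rng.randrange(N)
        if a == b or a < nz:
            continue                      # same agent, or the updater is a zealot
        sa = 'C' if a < nz + i else 'D'   # implicit ordering: zealots, C's, D's
        sb = 'C' if b < nz + i else 'D'
        if sa == sb:
            continue
        pa, pb = (pi_c, pi_d) if sa == 'C' else (pi_d, pi_c)
        if rng.random() < 1.0 / (1.0 + math.exp(-beta * (pb - pa))):
            i += 1 if sa == 'D' else -1
    return i / n_norm
```

For SH parameters well above the critical mass (f_Z = 0.45 here), the random walk is positively biased everywhere and the run is absorbed at full cooperation; without zealots and without initial cooperators, cooperation never appears.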

Exploration of the (T, S) space
In the main text, we focused on typical (T, S) values. To examine the generality of our results in terms of T and S, we numerically explore the (T, S) space in this section. We set N = 1000 and β = 10. In Fig. S6, we show the average fraction of cooperators among normal agents, f_C, as we vary T, S, and the fraction of zealots, f_Z, for the Fermi update rule. The figure indicates that, for the SH game, the transition from the equilibrium with little cooperation to that with full cooperation occurs suddenly as one increases f_Z. Note that this is the case for the entire parameter (T, S) region corresponding to the SH game (i.e., T < 1 and S < 0). In contrast, for the HD game, the effect of f_Z is small and gradual in the entire parameter region (i.e., T > 1 and S > 0). The results for the PD game (i.e., T > 1 and S < 0) are mixed. In Fig. S7, we display the time needed for the simulation to converge to the stationary state, ⟨t⟩, and its standard deviation, σ_t.

Numerical results in networked populations
We have also tested the effect of zealotry on networks for various values of T and S. The degree distribution is a delta function for REG networks, a Poisson distribution for ER networks, and a power law for BA networks [10]. Moreover, to exclude any effect other than those associated with degree heterogeneity, we fix the number of nodes, N = 1000, and the average degree, k = 6, to be the same for the three types of networks. It is worth noting that degree heterogeneity may boost cooperation [2,11], or decrease it when a cost is associated with each interaction [12]. For this reason, we consider two scenarios: one in which we divide the payoff accumulated by each agent by its degree (i.e., number of connections) to suppress the effect of degree heterogeneity, and another in which we do not carry out the division. The former approach is known as the average payoff scheme [13], while the other is usually called the additive payoff scheme [11]. Figure S8 shows the fraction of cooperators for the different networks under the additive payoff scheme. A tiny fraction of zealots is enough to spur almost complete cooperation in the HD game when played on a BA network. However, in agreement with the results for the well-mixed populations, an excessive fraction of zealots decreases the fraction of normal cooperators, making defection more appealing, as shown by the re-emergence of defection in the upper right corner of the (T, S) space. Moreover, BA networks considerably increase cooperation in the PD game, although this phenomenon occurs only when f_Z > 0.25. The SH game, instead, does not display any remarkable difference from the phenomenology observed in well-mixed populations. In contrast to the case of the additive payoff scheme, under the average payoff scheme the amount of cooperation over the (T, S) space does not depend remarkably on the network structure.
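The difference between the two schemes can be made concrete with a small sketch (plain adjacency lists instead of a graph library; function names are ours):

```python
def network_payoffs(adj, strat, T, S, scheme='additive'):
    """Payoff of every node on a network for the reduced matrix
    (R = 1, P = 0). adj maps a node to its list of neighbours;
    strat maps a node to 'C' or 'D'."""
    a = {('C', 'C'): 1.0, ('C', 'D'): S, ('D', 'C'): T, ('D', 'D'): 0.0}
    pi = {}
    for u, nbrs in adj.items():
        total = sum(a[(strat[u], strat[v])] for v in nbrs)
        pi[u] = total / len(nbrs) if scheme == 'average' else total
    return pi
```

On a star with a cooperative hub and defecting leaves, the additive scheme multiplies the hub's losses by its degree, whereas the average scheme does not, which illustrates why degree heterogeneity matters mostly under the additive scheme.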
Figure S9 shows the average fraction of cooperators, f_C, for different values of T and S, different network structures, and the average payoff scheme. As expected, the pattern of cooperation is roughly independent of the network structure, although ER and BA networks slightly promote cooperation compared to the well-mixed populations in an extensive part of the (T, S) space corresponding to the PD game for f_Z ≥ 0.25.