Second-order freeriding on antisocial punishment restores the effectiveness of prosocial punishment

Economic experiments have shown that punishment can increase public goods game contributions over time. However, the effectiveness of punishment is challenged by second-order freeriding and antisocial punishment. The latter implies that non-cooperators punish cooperators, while the former implies unwillingness to shoulder the cost of punishment. Here we extend the theory of cooperation in the spatial public goods game by considering four competing strategies, which are traditional cooperators and defectors, as well as cooperators who punish defectors and defectors who punish cooperators. We show that if the synergistic effects are high enough to sustain cooperation based on network reciprocity alone, antisocial punishment does not deter public cooperation. Conversely, if synergistic effects are low and punishment is actively needed to sustain cooperation, antisocial punishment is indeed detrimental, but only if the cost-to-fine ratio is low. If the costs are relatively high, cooperation again dominates as a result of spatial pattern formation. Counterintuitively, defectors who do not punish cooperators, and are thus effectively second-order freeriding on antisocial punishment, form an active layer around punishing cooperators, which protects them against defectors that punish cooperators. A stable three-strategy phase that is sustained by the spontaneous emergence of cyclic dominance is also possible via the same route. The microscopic mechanism behind the reported evolutionary outcomes can be explained by the comparison of invasion rates that determine the stability of subsystem solutions. Our results reveal an unlikely evolutionary escape from adverse effects of antisocial punishment, and they provide a rationale for why second-order freeriding is not always an impediment to the evolutionary stability of punishment.


I. INTRODUCTION
Cooperation is widespread in human societies [1][2][3][4][5][6][7]. Like no other species, we champion personal sacrifice for the common good [8,9]. Not only are people willing to incur costs to help unrelated others, economic experiments have shown that many are also willing to incur costs to punish those that do not cooperate [10][11][12][13][14][15][16]. Unfortunately, cooperation is jeopardized by selfish incentives to freeride on the selfless contributions of others. What is more, individuals that abstain from punishing such freeriders are often called second-order freeriders for their failure to bear the additional costs of punishment [17,18]. Several evolutionary models have been developed to study the effects of punishment on cooperation [19][20][21][22][23][24][25][26][27], and it has been pointed out that second-order freeriding is amongst the biggest impediments to the evolutionary stability of punishment [28][29][30][31].
In addition to second-order freeriding, the effectiveness of punishment is challenged by antisocial punishment. The fact that non-cooperators sometimes punish cooperators has been observed experimentally in different human societies [32][33][34][35][36][37], and it has been shown theoretically that this antisocial punishment can prevent the successful coevolution of punishment and cooperation [38,39]. In fact, if antisocial punishment is an option, prosocial punishment may no longer increase cooperation, deteriorating instead to a self-interested tool for protecting oneself against potential competitors [40]. While the punishment of freeriders is aimed at increasing cooperation, antisocial punishment can be a form of retaliation for punishment received in repeated games [32,41], or is simply aimed at cooperators without a retaliatory motive [36,37].
Given the potential drawbacks associated with punishment related to second-order freeriding and antisocial punishment, it has been rightfully pointed out that the maintenance of cooperation may be better achievable through less destructive means. In particular, rewards may be as effective as punishment and lead to higher total earnings without potential damage to reputation or fear of retaliation [42,43]. Although many evolutionary models confirm the effectiveness of positive incentives for the promotion of cooperation [44][45][46][47][48][49][50][51][52], in this case too the challenges associated with second-order freeriding and antisocial rewarding persist [53,54].
Here we use methods of statistical physics to show how the two long-standing problems, namely second-order freeriding and antisocial punishment, cancel each other out in an unlikely and counterintuitive evolutionary outcome, and in doing so restore the effectiveness of prosocial punishment to promote cooperation. We extend the theory of cooperation by considering the spatial public goods game with non-punishing cooperators and defectors, as well as with cooperators who punish defectors and defectors who punish cooperators. As we will show in detail, spatial pattern formation leads to unconditional defectors forming an active layer around punishing cooperators, which protects them against defectors that punish cooperators. This is a new evolutionary escape from adverse effects of antisocial punishment, which in turn also reveals unexpected benefits stemming from second-order freeriding.
In what follows, we first present the spatial public goods game with prosocial and antisocial punishment, and then proceed with the results and a discussion of their implications for the successful coevolution of cooperation and punishment.

II. PUBLIC GOODS GAME WITH PROSOCIAL AND ANTISOCIAL PUNISHMENT
The traditional version of the public goods game is simple and intuitive, and it captures the essence of the puzzle that is human cooperation [55,56]. In a group of players, each one can decide whether to cooperate (C) or defect (D). Cooperators contribute a fixed amount (equal to 1 without loss of generality) to the common pool, while defectors contribute nothing. The sum of all contributions is multiplied by a multiplication factor r > 1, which takes into account synergistic effects of cooperation, and the resulting amount of public goods is divided equally amongst all group members irrespective of their strategies. Defection thus yields highest short-term individual payoffs, while cooperation is best for the group as a whole.
Here we extend this game by introducing two additional strategies, namely cooperators that punish defectors (P_C), and defectors that punish cooperators (P_D). The former represent prosocial punishment, while the latter represent antisocial punishment. Technically, P_C players punish D and P_D players, while P_D players punish C and P_C players. In a group g of size G the resulting payoffs are

$$\Pi_{C}^{g} = \Pi_{PGG} - 1 - \beta \frac{N_{P_D}}{G-1}, \tag{1}$$

$$\Pi_{D}^{g} = \Pi_{PGG} - \beta \frac{N_{P_C}}{G-1}, \tag{2}$$

$$\Pi_{P_C}^{g} = \Pi_{PGG} - 1 - \beta \frac{N_{P_D}}{G-1} - \gamma \frac{N_{D} + N_{P_D}}{G-1}, \tag{3}$$

$$\Pi_{P_D}^{g} = \Pi_{PGG} - \beta \frac{N_{P_C}}{G-1} - \gamma \frac{N_{C} + N_{P_C}}{G-1}, \tag{4}$$

where

$$\Pi_{PGG} = r \, \frac{N_{C} + N_{P_C}}{G} \tag{5}$$

and $N_{C}$, $N_{D}$, $N_{P_C}$ and $N_{P_D}$ are, respectively, the number of non-punishing cooperators, non-punishing defectors, punishing cooperators and punishing defectors in the group g. In addition to the multiplication factor r we have two additional parameters, which are β as the maximal fine imposed on a player if all other players within the group punish her, and γ as the maximal cost of punishment that can apply. Importantly, the values of both β and γ are kept the same for prosocial and antisocial punishment so as to not give either a default evolutionary advantage or disadvantage. This public goods game is staged on a square lattice with periodic boundary conditions where L^2 players are arranged into overlapping groups of size G = 5 such that everyone is connected to its G − 1 nearest neighbors. Accordingly, each player belongs to g = 1, . . . , G different groups, each of size G. Notably, the square lattice is the simplest of networks that takes into account the fact that the interactions among us are inherently structured rather than random. By using the square lattice, we continue a long-standing tradition that began with the work of Nowak and May [57], and which has since emerged as a default setup to reveal all evolutionary outcomes that are feasible within a particular version of the public goods game [56]. We should note, however, that our observations are robust and are not restricted to this interaction topology. The only crucial criterion is that players have limited and stable connections with others, which allows network reciprocity to work.
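The payoff rules and the overlapping-group structure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the simulation code behind the reported results: the function names, the string labels "C", "D", "PC", "PD", and the list-of-lists lattice are assumptions made for the example, and the fine and cost terms assume that each punisher contributes β/(G − 1) to the fine and pays γ/(G − 1) per punished group member, in line with β and γ being the maximal fine and maximal cost.

```python
from collections import Counter

G = 5  # group size: a focal site plus its four nearest neighbors


def group_payoff(strategy, counts, r, beta, gamma, group_size=G):
    """Payoff of one player with the given strategy in a single group.

    counts maps the labels "C", "D", "PC", "PD" to the number of such
    players in the group (the focal player included).
    """
    n_c, n_d = counts.get("C", 0), counts.get("D", 0)
    n_pc, n_pd = counts.get("PC", 0), counts.get("PD", 0)
    pool_share = r * (n_c + n_pc) / group_size  # equal share of the public good
    others = group_size - 1
    if strategy in ("C", "PC"):
        # contribution of 1 plus fines from antisocial punishers
        payoff = pool_share - 1.0 - beta * n_pd / others
        if strategy == "PC":
            payoff -= gamma * (n_d + n_pd) / others  # cost of punishing defectors
    elif strategy in ("D", "PD"):
        payoff = pool_share - beta * n_pc / others  # fines from prosocial punishers
        if strategy == "PD":
            payoff -= gamma * (n_c + n_pc) / others  # cost of punishing cooperators
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return payoff


def neighbors(i, j, size):
    """Four nearest neighbors on a square lattice with periodic boundaries."""
    return [((i - 1) % size, j), ((i + 1) % size, j),
            (i, (j - 1) % size), (i, (j + 1) % size)]


def total_payoff(lattice, i, j, r, beta, gamma):
    """Sum of payoffs from the G groups that contain the player at (i, j).

    Assumes a lattice of linear size at least 3, so that neighbors are distinct.
    """
    size = len(lattice)
    strategy = lattice[i][j]
    total = 0.0
    for ci, cj in [(i, j)] + neighbors(i, j, size):  # centers of the five groups
        members = [(ci, cj)] + neighbors(ci, cj, size)
        counts = Counter(lattice[a][b] for a, b in members)
        total += group_payoff(strategy, counts, r, beta, gamma)
    return total
```

For instance, a lone punishing cooperator in a sea of defectors pays the full punishment cost in every one of its five groups, which is why isolated P_C players fare poorly before compact clusters form.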
Monte Carlo simulations are carried out as follows. Initially each player on site x is designated either as a non-punishing cooperator, non-punishing defector, punishing cooperator or punishing defector with equal probability. The following elementary steps are then iterated repeatedly until a stationary solution is obtained, i.e., until the average fractions of strategies on the square lattice become time-independent. During an elementary step a randomly selected player x plays the public goods game in all the G groups of which she is a member, whereby her overall payoff $\Pi_{s_x}$ is the sum of all the payoffs $\Pi_{s_x}^{g}$ acquired in each individual group, as described above in Eqs. (1)-(5). Next, a randomly selected neighbor y of player x acquires her payoff $\Pi_{s_y}$ in the same way. Lastly, player y imitates the strategy of player x with a probability given by the Fermi function

$$W(s_x \to s_y) = \frac{1}{1 + \exp\left[(\Pi_{s_y} - \Pi_{s_x})/K\right]}, \tag{6}$$

where K quantifies the uncertainty in strategy adoptions [58], implying that the strategies of better performing players are readily adopted, although it is not impossible to adopt the strategy of a player performing worse. In the K → 0 limit, player y imitates the strategy of player x if and only if $\Pi_{s_x} > \Pi_{s_y}$. Conversely, in the K → ∞ limit, payoffs cease to matter and strategies change as per the flip of a coin. Between these two extremes players with a higher payoff will be readily imitated, although the strategy of under-performing players may also be occasionally adopted, for example due to errors in decision making, imperfect information, and external influences that may adversely affect the evaluation of an opponent. Without loss of generality we use K = 0.5, in agreement with previous research that showed this to be a fully representative value [58][59][60]. Repeating all described elementary steps L^2 times constitutes one full Monte Carlo step (MCS), thus giving every player a chance to change its strategy once on average.
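The imitation rule can be illustrated with a short Python sketch. It assumes the standard form of the Fermi function, in which the adoption probability depends only on the payoff difference divided by K; the function names and the overflow guard are choices made for this example, not details taken from the original simulations.

```python
import math
import random


def fermi_probability(payoff_x, payoff_y, K=0.5):
    """Probability that player y adopts the strategy of player x."""
    if K <= 0:
        raise ValueError("K must be positive")
    z = (payoff_y - payoff_x) / K
    if z > 700.0:  # math.exp would overflow; the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def imitate(strategy_x, strategy_y, payoff_x, payoff_y, K=0.5, rng=random):
    """Return the strategy of player y after one imitation attempt."""
    if rng.random() < fermi_probability(payoff_x, payoff_y, K):
        return strategy_x
    return strategy_y


# Equal payoffs give a coin flip; a small K makes the choice nearly deterministic.
print(fermi_probability(1.0, 1.0))          # 0.5
print(fermi_probability(2.0, 1.0, K=0.5))   # better-performing x is likely imitated
print(fermi_probability(1.0, 2.0, K=0.5))   # worse-performing x is rarely imitated
```

With K = 0.5, a payoff advantage of 1 already yields an adoption probability close to 0.88, so selection is strong but not deterministic.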
We note that imitation is a fundamental process by means of which humans change their strategies [61][62][63][64]. The application of imitation-based strategy updating based on the Fermi function is thus appropriate and justified, although, as we will show, our results are robust to changes in the details that determine the microscopic dynamics of the studied public goods game. In terms of the application of the square lattice, we emphasize that, despite its simplicity, it fully captures the most relevant aspect of human interactions, namely the fact that nobody interacts randomly with everybody else, not even in small groups, and that our interaction range is thus inherently limited. Applications of more complex interaction topologies are of course possible, but this does not affect our results. This is because our key argument is based on the limited number of interactions a player has, but it does not in any way rely on the specific properties of the square lattice topology.
The average fractions of all four strategies on the square lattice are determined in the stationary state after a sufficiently long relaxation time. Depending on the proximity to phase transition points and the typical size of emerging spatial patterns, the linear system size was varied from L = 400 to 6000, and the relaxation time was varied from 10^4 to 10^6 MCS to ensure that the statistical error is comparable with the size of symbols in the figures. We emphasize that the usage of a sufficiently large system size is a decisive factor that allows us to identify the correct evolutionarily stable solutions. Using too small a system size may easily prevent this, for example if the linear size of the lattice is comparable to or smaller than the typical size of the emerging spatial patterns.

III. RESULTS
Before presenting the main results, we briefly summarize the evolutionary outcomes in a well-mixed population. In the absence of a limited interaction range the behavior is largely trivial and resembles that reported before for the traditional two-strategy public goods game [29,55]. In particular, if r exceeds the group size G then both cooperative strategies dominate while all defectors die out. Conversely, below this threshold both defector strategies dominate while all cooperators die out. This behavior is also in agreement with the well-mixed results published in [39]. In short, all the non-trivial evolutionary solutions reported below are due to the consideration of a structured population and remain completely hidden if a well-mixed population is assumed.
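The r = G threshold can be recovered from a one-line payoff comparison. As a sketch that ignores fines and punishment costs (i.e., the traditional two-strategy game), consider a focal player in a group in which k of the other G − 1 members contribute:

$$\Pi_C = \frac{r(k+1)}{G} - 1, \qquad \Pi_D = \frac{rk}{G}, \qquad \Pi_C - \Pi_D = \frac{r}{G} - 1.$$

The difference does not depend on k and is positive only if r > G. For the group size G = 5 used here, contributing is therefore individually costly for any r < 5 unless the interaction structure or punishment changes the balance.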
In what follows, we focus on two representative values of the multiplication factor that cover two relevantly different public goods game scenarios. First, we use r = 3.8, where spatial selection allows cooperators to survive even in the absence of punishment. This is the well-known manifestation of network reciprocity, where the limited interactions among players allow cooperators to organize themselves into compact clusters, which confers on them competitive payoffs in comparison to defectors [65]. Subsequently, we also use a sufficiently small value, r = 3.0, where cooperators can no longer survive solely due to network reciprocity and thus require additional support [58]. For both values of r we determine the stationary fractions of strategies in dependence on the punishment fine β and the punishment cost γ, and we pinpoint the location and type of phase transitions from the Monte Carlo simulation data.
In Fig. 1, we show the full β-γ phase diagram, as obtained for r = 3.8. As discussed above, we refer to this as the strong network reciprocity region. The presented results reveal that antisocial punishment is hardly viable, with P_D players surviving only in a tiny region of the β-γ parameter space. Conversely, as the fine β increases, punishing cooperators subvert non-punishing cooperators, first via a discontinuous phase transition from the two-strategy D + C phase to the two-strategy D + P_C phase, and subsequently via a continuous phase transition to the absorbing P_C phase. The discontinuous phase transition is due to indirect territorial competition, which emerges between C and P_C players competing against defectors [29], while the continuous phase transition is due to an increasing effectiveness of punishment that stems from the larger fines. The two representative cross-sections of the phase diagram in Fig. 2 provide a more quantitative insight into the nature of these phase transitions. In both cases the application of small punishment fines yields a punishment-free state, where traditional cooperators and defectors coexist due to network reciprocity. If the cost of punishment is considerable, as in panel (a), the D + C phase suddenly gives way to the D + P_C phase at a critical value of the punishment fine β_c = 0.229, and by increasing β further, a defector-free state is reached at β_c = 0.361. This succession of phase transitions remains the same if the cost of punishment is tiny, shown in panel (b), apart from a narrow intermediate region of β, where antisocial punishment replaces non-punishing defectors via a discontinuous D + P_C → P_D + P_C phase transition at β_c = 0.284. Interestingly, this phase transition is qualitatively identical to the preceding D + C → D + P_C phase transition: in both cases non-punishing strategies are subverted by their punishing counterparts on the grounds of increasing punishment fines. It can also be observed that the emergence of a stable P_D + P_C phase involves a slight decay of the fraction of P_C players, although they quickly recover to full dominance as the punishment fine is increased further.
To sum up, in the strong network reciprocity region antisocial punishment has a negligible impact on the evolution of cooperation. In a small region of the β-γ parameter space antisocial punishers can outperform non-punishing defectors to form a stable coexistence with prosocial punishers. But apart from this, and despite the fact that both forms of punishment are implemented as equally effective (the values of both β and γ are kept the same for prosocial and antisocial punishment), antisocial punishment fails and is evolutionarily uncompetitive. An exciting question now is what happens if network reciprocity alone is not strong enough to support the coexistence of cooperators and defectors. Although previous research has shown that prosocial punishment can be effective if the imposed fines are sufficiently high [66,67], this result was obtained in the absence of antisocial punishment. However, if cooperators can also be punished the situation changes significantly. A subsystem analysis of the public goods game entailing only punishing cooperators and punishing defectors actually reveals that at r = 3.0 cooperation is unable to survive regardless of the values of β and γ, and regardless of the fact that punishing cooperators also benefit from network reciprocity.
Quite remarkably, the evolutionary outcome of the full four-strategy public goods game can be very different. The full β-γ phase diagram presented in Fig. 3 reveals that punishing cooperators can actually dominate completely in a sizable region of the parameter plane. Of course, if the cost of punishment is too high in relation to the imposed fines, defectors dominate. More precisely, C and P_C players die out fast, with only D and P_D players remaining. In the absence of the two cooperative strategies the relation between D and P_D players is neutral since the latter do not need to bear the punishment cost. This yields a logarithmically slow coarsening without surface tension, as in the voter model [68,69]. In this case the probability to reach either the absorbing D or the absorbing P_D phase depends on the fraction of these two strategies [70] when all cooperators die out, which is typically higher for D and hence the D(P_D) notation in Fig. 3.
Returning to the absorbing P_C phase, since network reciprocity at r = 3.0 is weak, an additional mechanism must be at work that allows the dominance of cooperation despite the low multiplication factor and despite antisocial punishment. This mechanism is illustrated in Fig. 4, where we show a representative spatial evolution of the four competing strategies from a random initial state for parameter values that yield the absorbing P_C phase. It is important to emphasize that a sufficiently large square lattice must be used, since otherwise the evolutionary process can quickly lead to a misleading outcome, i.e., to a solution that is not stable in the large population size limit, or to a solution that is highly sensitive to the initial fraction of strategies, as reported in [39]. This is, however, just a finite-size effect because the evolutionarily stable solution can spread in the whole population if it has a chance to emerge somewhere locally. Due to the random initial state the number of non-punishing and punishing cooperators starts dropping fast because support from network reciprocity is lacking, both because the value of r is low, and even more so because compact clusters are not yet formed so early in the process. During this stage, if the population were small, an accidental extinction of P_C players would be very likely. Indeed, even with L = 800 they manage to just barely survive, as indicated in panel (c) by a white circle. At this point the temporary winners appear to be D and P_D players, which in the absence of cooperators are neutral, and hence perform a logarithmically slow coarsening [69].
However, the unlikely evolutionary twist is yet to come and reveals itself in panels (d-f). Since P_C players are weaker than P_D players, the only chance for the former to survive is if they form a compact cluster inside a D domain (C cannot survive either way because r = 3.0 is too small). Although one might suspect that this "hanging by a thread"-like survival of P_C players is merely temporary because the superior P_D players will eventually invade their cluster, this in fact never happens. On the contrary, punishing cooperators eventually rise to complete dominance (the final state is not shown in Fig. 4).
Crucial for the understanding of this counterintuitive evolutionary outcome is the realization that punishing defectors suffer from second-order freeriding of non-punishing defectors as soon as they both meet in the vicinity of punishing cooperators. More precisely, when D players meet P_D players in the vicinity of P_C players, then P_D players have to bear the additional cost of punishment while D players are of course free from this burden. The same argument is traditionally put forward when it is time to explain why punishing cooperators are uncompetitive next to non-punishing cooperators near defectors, and why in fact punishment is evolutionarily unstable. When antisocial punishment is present, however, this very same reasoning helps punishing cooperators to beat defectors that punish them. As a result, D players start invading P_D domains, but in parallel P_C players also invade D players from the other side of the interface. The thin active layer of D players thus acts as a protection, shielding P_C players from a direct invasion of P_D. As can be observed in panels (d-f), the shield is not passive, but expands permanently because D players become successful when meeting P_D players close to cooperators. (This process will be quantified via an effective invasion rate in the following section.) At the end, when P_D players die out, the D shield falls victim to the invasion of P_C players, which thus rise to complete dominance. A direct illustration of this mechanism is shown in Fig. 5, where a prepared initial state was used for clarity. The comparison of the evolution of a large but lonely P_C domain, and a tiny but D-protected seed of the same strategy illustrates nicely that the previously described "activated layer" mechanism is effective in overcoming the danger posed by the simultaneous presence of antisocial punishment.
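The payoff advantage behind this second-order freeriding can be made explicit with a short worked comparison. It is a sketch that assumes each antisocial punisher pays γ/(G − 1) for every cooperator (C or P_C) in the group, in line with γ being the maximal cost of punishment. A D player and a P_D player in the same group g receive the same share of the public good and the same fines from P_C players, so only the punishment cost distinguishes them:

$$\Pi_{D}^{g} - \Pi_{P_D}^{g} = \gamma \, \frac{N_{C} + N_{P_C}}{G-1} \ge 0.$$

The difference vanishes when the group contains no cooperators, which is why D and P_D are neutral far from cooperative domains, and it is strictly positive in every group that contains at least one C or P_C player. With G = 5, for example, a single P_C player in the group gives the non-punishing defector an advantage of γ/4 in that group.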
We show the above-described pattern formation in the animation provided in [71]. At this point, we also emphasize that the key mechanism that is responsible for the recovery of prosocial punishment is not restricted to the application of imitation-based strategy updating and is in fact robust to changes in the microscopic dynamics. For example, if we apply the so-called "score-dependent viability" strategy updating [39,72], the trajectory of evolution remains the same [73]. The only visible difference is that, in the latter case, the interfaces that separate the competing domains are rugged and strongly fluctuating, which in turn decelerates the evolutionary dynamics and prolongs the time needed to arrive at the same final outcome.
As we have pointed out, the snapshots in Fig. 4 illustrate clearly that the size of the lattice plays a decisive role in reaching the correct evolutionary outcome from a random initial state. For some parameter values that bring the population closer to a phase transition point, or because of the large fluctuations of strategy abundance during pattern formation, even a linear system size of L = 6000 can turn out to be too small. In such cases a prepared initial state, as depicted in panel (a) of Fig. 6, consisting of sizable patches of the four competing strategies, can help to determine the correct composition of the stationary solution. We have used this approach to determine the stability of the three-strategy D + P_D + P_C phase, which, according to the phase diagram in Fig. 3, also forms an important part of the solution. Figure 6 illustrates that such a solution can be observed even on a very small lattice, provided that suitable initial conditions are used. The alternating oscillations of red and blue indicate that this three-strategy phase is sustained by cyclic dominance. Indeed, due to using a different set of β and γ values from those used in Fig. 4, here P_D players beat P_C players because of the low value of r, P_C players beat D players because of prosocial punishment, and D players beat P_D players near cooperators because of second-order freeriding. However, the balance of these invasions is such that none of the strategies dies out, and hence the three-strategy D + P_D + P_C phase is stable. The two representative cross-sections of the phase diagram in Fig. 7 show that the average fraction of competing strategies changes similarly as in the canonical rock-paper-scissors model [74][75][76][77]. The stability of the three-strategy D + P_D + P_C phase hinges strongly on the continuous, albeit oscillating and sometimes nearly vanishing, presence of all three strategies. As the inset in the bottom panel of Fig. 7 shows, the average fraction of D players can be extremely low, and therefore this three-strategy phase can be stable only if the size of the lattice is large enough. We have used L = 5400 to produce the results presented in Fig. 7. If the lattice were smaller, a strategy could easily die out due to random fluctuations, in which case the evolution would terminate in a single-strategy phase.

FIG. 6. Representative spatial evolution of the four competing strategies from a prepared initial state towards the three-strategy D + P_D + P_C phase that is sustained by cyclic dominance. Note that blue and red colors dominate cyclically over the course of 16000 MCS from left to right. The colors used are the same as in Figs. 4 and 5. Depicted are snapshots of the square lattice, as obtained for β = 0.52, γ = 0.065, and r = 3. Since a prepared initial state is used, a small square lattice with linear size L = 100 can be used for demonstration.
The three-strategy D + P_D + P_C phase exists under conditions that are more favorable for the survival of the P_D strategy, specifically if the cost of punishment γ is lower. The invasion rates within this phase can be measured directly by monitoring the fractions of strategies when the evolution is initialized from straight domain interfaces [78,79]. While the meaning of w_1 and w_2 (see the diagram inserted in Fig. 8) is clear from the payoff differences, the determination of w_3 requires further clarification. Namely, if only P_D and D players were present along the interface, then we would of course measure a net zero invasion rate because the two strategies are neutral in the absence of cooperators. However, since we are interested in their relation when P_C players are present too, we use parallel interfaces of P_C and P_D players, separated by thin layers of D players that are five lattice sites wide. In this way, although P_C and P_D players do not interact directly, the setup properly describes the movement of the D layer that is followed by punishing cooperators. This "effective" invasion, which emerges only in the presence of the third party, is highlighted by a dashed arrow in the legend of Fig. 8. The decay of w_2 in the main panel of Fig. 8 highlights that the P_C → D invasions are relevant and in fact occur rather frequently, but also that their intensity deteriorates as the cost of punishment increases. Similarly, the P_D → P_C invasions are also a recurring phenomenon based on the positive value of the corresponding invasion rate w_1, which indicates that P_D players would dominate P_C players during a direct competition as a consequence of the small value of r. Nevertheless, this invasion rate also decays slightly as γ increases because the costs associated with the main public goods game come to play second fiddle to the costs of punishment that both these strategies have to bear. Lastly, the w_3(γ) function is also always positive, because even a small cost evokes the second-order freeriding effect (in this case of course associated with avoiding the costs of antisocial punishment), such that in the presence of P_C players D players can invade P_D players. Accordingly, as the value of γ increases, so does the w_3 invasion rate, as shown in Fig. 8, which illustrates directly that second-order freeriding on antisocial punishment is responsible for the evolutionary success of prosocial punishment, specifically for the survival or even for the complete dominance of the P_C strategy.
The differences between these three invasion rates, depicted in the inset of Fig. 8, allow us to understand how the relative abundance of strategies changes as a result. For example, w_3 − w_2 quantifies how the fraction of non-punishing defectors changes when we vary the cost of punishment. An increase in the value of γ will support strategy D, but the actual beneficiary will be its predator, which is strategy P_C. This seemingly paradoxical response of the population to the increase of the punishment cost is a well-known consequence of cyclic dominance, i.e., directly supporting a particular species will actually support its predator [80]. This in turn explains why P_C players rise to full dominance when we increase the value of γ, as well as why P_D players dominate when we decrease it. Namely, decreasing the punishment cost supports the P_C strategy, which is the prey of punishing defectors.
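The "supporting a species helps its predator" effect can be illustrated with a well-mixed, mean-field caricature of the cycle. This is only a sketch: it ignores space entirely, the rate values are hypothetical rather than the measured ones, and the function names are assumptions of the example. With P_D invading P_C at rate w1, P_C invading D at rate w2, and D invading P_D at rate w3, the interior fixed point of the rate equations has fractions proportional to (w2, w3, w1) for (P_D, P_C, D).

```python
def cyclic_rates(x, w1, w2, w3):
    """Mean-field time derivatives of the fractions x["PD"], x["PC"], x["D"]."""
    return {
        "PD": x["PD"] * (w1 * x["PC"] - w3 * x["D"]),   # gains from P_C, loses to D
        "PC": x["PC"] * (w2 * x["D"] - w1 * x["PD"]),   # gains from D, loses to P_D
        "D": x["D"] * (w3 * x["PD"] - w2 * x["PC"]),    # gains from P_D, loses to P_C
    }


def interior_fixed_point(w1, w2, w3):
    """Fractions at which all three mean-field derivatives vanish."""
    total = w1 + w2 + w3
    return {"PD": w2 / total, "PC": w3 / total, "D": w1 / total}


# Hypothetical rates: only w3 (the D -> P_D invasion) is raised in the second case.
low = interior_fixed_point(0.2, 0.3, 0.1)
high = interior_fixed_point(0.2, 0.3, 0.2)
print(low)
print(high)
```

Raising w3 strengthens D against P_D, yet in this caricature the fraction of D at the fixed point shrinks while that of its predator P_C grows, in line with the qualitative argument above.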

IV. DISCUSSION
We have shown how second-order freeriding on antisocial punishment restores the effectiveness of prosocial punishment, thus providing an unlikely and counterintuitive evolutionary escape from adverse effects of antisocial punishment. When the synergistic effects of cooperation are low to the point of network reciprocity failing to sustain it, cooperators that punish defectors can still rise to dominance because non-punishing defectors enable their evolutionary success by capitalizing on second-order freeriding and eliminating antisocial punishers as a result. If conditions for punishment are somewhat more lenient, we have shown that a three-strategy phase consisting of non-punishing defectors, punishing cooperators, and punishing defectors becomes stable. The relations within this phase, and its termination to an absorbing punishing cooperator or an absorbing punishing defector phase, can be fully understood in terms of invasion rates along straight interfaces that separate different strategy domains. We have demonstrated that these results are robust to changes in the microscopic dynamics, and we have emphasized that the only important property of the interaction structure is the limited interaction range rather than its topological details. Indeed, the mechanism relies solely on spatial pattern formation, and is the first stand-alone remedy against adverse effects of antisocial punishment, not relying on any additional strategic complexity or other assumptions limiting its general validity. Paradoxically, it turns out that antisocial punishment is vulnerable to the same second-order freeriding that is traditionally held responsible for preventing evolutionary stability of prosocial punishment.
We emphasize that these phenomena cannot be observed in well-mixed populations. Furthermore, a reliable study of competing subsystem solutions requires a careful finite-size analysis of the spatial system. Additionally, the usage of random initial conditions may be misleading, especially if using a small system size, because it does not necessarily allow for all possible subsystem solutions to emerge (before they could compete with one another). These difficulties can be overcome by using suitable prepared initial states, which allow the evolutionarily stable subsystem solutions to form before the competition between them unfolds. Since pattern formation and invasions of propagating fronts are general features of multistrategy complex systems, such an analysis is a must when determining the consequences of spatiality.
Our research also reveals that under conditions that favor cooperation, for example when the multiplication factor of the public goods game is sufficiently high for the spatial selection alone to sustain cooperation, antisocial punishment is overall uncompetitive. In fact, even though we have used a fully symmetrical implementation of prosocial and antisocial punishment throughout our paper, antisocial punishers could survive only in very small regions of the parameter space. We may thus conclude that under such conditions cooperators that punish defectors should not be afraid of retaliatory antisocial punishment by defectors.
In comparison to previous findings concerning the symmetrical implementation of prosocial and antisocial rewarding [54], we find that with punishment there is no lower bound on the multiplication factor that would be impossible to compensate with a sufficiently effective punishment system. But there is one with rewarding, i.e., below a critical value of the multiplication factor full defection is unavoidable, and this regardless of just how efficient the rewarding system might be. In case of punishment, ever lower values of the multiplication factor simply require ever higher fines at a given cost for cooperation to be sustained. Quite remarkably, the very same process puts a noose around antisocial punishers, which are defeated by second-order freeriding in their own ranks.
Although social preference models of economic decision-making predict that antisocial punishment should not occur [81,82], and despite the fact that antisocial punishment is also inconsistent with rational self-interest and the hypothesis that punishment facilitates cooperation, it is nevertheless remarkably common across human societies [32][33][34][35][36][37]. In the light of this fact, it was important to extend the theory of cooperation in the spatial public goods game with the option that non-cooperators can punish cooperators. Rather unexpectedly, the detrimental effects of such antisocial punishment on the coevolution of punishment and cooperation turned out to be minor simply by taking into account the fact that the interactions among humans are inherently structured, entailing a limited number of frequently used links, rather than being random or well-mixed.