How selection pressure changes the nature of social dilemmas in structured populations

When members of a population engage in dyadic interactions reflecting a prisoner's dilemma game, the evolutionary dynamics depends crucially on the population structure, described by means of graphs and networks. Here, we investigate how selection pressure contributes to change the fate of the population. We find that homogeneous networks, in which individuals share a similar number of neighbors, are very sensitive to selection pressure, whereas strongly heterogeneous networks are more resilient to natural selection, dictating an overall robust evolutionary dynamics of coordination. Between these extremes, a whole plethora of behaviors is predicted, showing how selection pressure can change the nature of dilemmas populations effectively face. We further show how the present results for homogeneous networks bridge the existing gap between analytic predictions obtained in the framework of the pair-approximation from very weak selection and simulation results obtained from strong selection.


Introduction
Complex networks are ubiquitous and known to profoundly affect the processes that take place on them [1][2][3][4][5]. From a theoretical perspective, some of the most complex processes studied to date, occurring on complex networks, are related to behavioral dynamics and decision-making, often described by means of social dilemmas of cooperation [6]. Among these, prisoner's dilemma (PD) provides the most popular metaphor of such dilemmas, given that its only Nash equilibrium is mutual defection, despite mutual cooperation providing higher returns [7]-thus the dilemma. We may also assume a dynamical (evolutionary) approach to game theory [8,9], where individuals revise their behavior based on the perceived success of other individuals, creating a gradient of selection (GoS) [10] that dictates the evolution of cooperation in time. In this context, such GoS will always favor free-riders irrespective of the fraction of cooperators and of the relative importance of fitness in the evolutionary process, a result that dictates the demise of cooperation in the population [9].
This result that translates the forecast stemming from a game-theoretical analysis based on Nash equilibrium into a population-wide, dynamical setting, assumes that populations are large and well mixed, that is, everyone interacts with everyone else with equal probability [8]. Yet, when members of a population interact along the links of an underlying complex network this scenario is altered, as the assumption of a well-mixed population no longer holds [6,[11][12][13]. Only recently has it become possible to analyze the population-wide evolutionary dynamics of a game played on an arbitrarily complex network, very much in the same manner that games were analyzed in well-mixed populations [14]. The so-called average gradient of selection (AGoS, see section 2 for details) has unraveled the fundamental changes in evolutionary game dynamics introduced in populations structured along the links of complex networks. While further analysis is required for other classes of dilemmas [6,[15][16][17], the results obtained from the PD [14] show that, at a population-wide level (what we call macro-dynamics) the effective game at stake can be very different from that in which pairs of individuals engage (what we call micro-dynamics), with direct implications both in the time-evolution and invasion of cooperation, analogous to what occurs in finite well-mixed populations [18]. In particular, homogeneous networks seem to favor the co-existence of strategies, whereas heterogeneous networks, in turn, favor their coordination [14], irrespective of the fact that the game individuals locally perceive and play is a PD-see figure 1. The scenarios illustrated in figure 1, however, do not take into account the role of selection pressure (also known as intensity of selection) in the overall evolutionary dynamics of a networked population [19][20][21][22][23][24]. Selection pressure provides relative significance of individual fitness in the evolutionary process, as opposed to an arbitrary or random adoption of strategies. This is important, as selection pressure can be very different depending on the processes at stake. Indeed, in many social interactions, errors in decision-making, perhaps induced by stress or exogenous confounding factors that often translate into a bounded rational behavior of the players [25], may lead to an overall weak selection environment. This contrasts with many situations in nature where selection may be strong [26], as well as in cases where cultural evolution is at stake [9]. Moreover, the fate of cooperation in social networks may depend on how the success of the others is locally perceived-which is related to the number of partners of each player and their social context [11,17,22,27]-turning selection pressure into a central variable in the ongoing challenge of understanding the impact of each social structure on global outcomes [6,[11][12][13][14][28][29][30][31][32][33][34][35][36][37][38][39][40][41][42][43][44][45][46].
Here, we study the role played by selection pressure in the overall evolutionary dynamics of a population. We find that, for a given class of interaction network, there is an optimum level of selection pressure for which cooperation is maximized. We stick here to paradigmatic examples of homogeneous and strongly heterogeneous classes of network structures, deferring to the appendix the study of other classes of population structure that enlarges the variety of topological features studied here, thereby providing an overall view of the role of selection pressure in a wider range of contexts.

Population structure
We take a population of individuals organized by means of a complex network of social interactions, where individuals are placed on the nodes and interactions are restricted to individuals connected by links. We focus on two classes of population structures (or networks): homogeneous, where every individual shares the same number of neighbors, and heterogeneous, where individuals have a variable number of partners. For the former class, we adopt homogeneous random networks (HR) that are built by randomizing the links of initial homogeneous regular ring networks [47], whereas for the latter we choose the strong heterogeneous case of scale-free network structures generated with the Barabási-Albert algorithm (BA) of growth and preferential attachment [48]. All networks have an average degree of four and population size is N = 1000. As already mentioned in the appendix we extend our analysis to other classes of population structures [1,49], in order to ensure a more complete picture of the effect under study: exponential (EX) [50], random (RN) [51,52] and the highly clustered scale-free networks generated with the minimal model algorithm (MM) [53]; the latter have a degree of heterogeneity similar to BA networks, while exhibiting a high cluster coefficient, in stark contrast with BA networks. We confirmed that our results, obtained from N = 1000, remain valid for larger population sizes.

Evolutionary dynamics
Individuals can assume one of two possible strategies: to cooperate or to defect. Each strategy obtains a payoff that depends on the strategy composition of its neighborhood, given by is the number of cooperators (defectors) in the vicinity of node i, whereas s i is 1 if individual i is a cooperator (C) or 0 if a defector (D). Finally, T (temptation), R (reward), P (punishment) and S (Sucker's payoff) are the game parameters that define all possible pair outcomes in a symmetric two-person-twostrategy interaction (from a game theory perspective). Here, we consider R = 1.0, P = 0.0, T = λ and S = 1 − λ, where λ > 1 stands both for the temptation to defect toward a C and for the fear of being cheated [17,54], thus combining two social tensions in a single parameter. Given that λ > 1.0, we have T > R > P > S, which means that the game at stake is a PD. Further social dilemmas can be considered for different rankings of the game parameters, in particular Stag Hunt (a coordination dilemma, for which R > T > P > S) and the Snowdrift Game (a co-existence dilemma, for which T > R > S > P). Evolutionary dynamics proceeds as long as individuals with higher fitness will tend to reproduce more or, in the context of cultural evolution and social learning, successful individuals will be imitated; in both cases, the fitter behavior will spread in the population. Here, we adopt the pairwise comparison rule [55][56][57] that allows us to explore in terms of a single parameter (β) the full range of possible selection pressures, from neutral evolution to pure imitation dynamics. At each time-step we allow one randomly chosen individual, i, from the population to imitate the strategy of a randomly chosen neighbor, j, with probability p(i, j) given by denotes the fitness of individual i (j) (here associated with the accumulated payoff obtained from playing with all neighbors). As stated above, β 0 defines the selection pressure: for β = 0 we obtain neutral evolution, in which case the evolutionary dynamics proceeds by random drift; at the other extreme, for very large values of β, imitation becomes deterministic, in the sense that any chosen neighbor who is more fit will be always imitated.

Computer simulations
The final fraction of cooperators (FFC) was computed by averaging over 2.5 × 10 5 evolutionary runs, after a transient of 5 × 10 3 generations starting with a population composed by equal fractions of strategies randomly placed on the network. We also computed the quasistationary distributions, corresponding to the fraction of time that the population spent in each non-absorbing state during the first 5 × 10 3 generations over 2.5 × 10 5 distinct evolutions that started with a random initial condition. The GoS is defined as is the probability of increasing (decreasing) the number of cooperators by one in a population with j cooperators. For structured populations this quantity becomes context dependent, as it may vary from node to node, becoming cumbersome to describe it analytically. Instead, here we (numerically) compute an average over all possible transitions that may increase or decrease the number of Cs at each time-step t on a given state with j Cs. [14].
For each individual i with k i neighbors, of whichn i have a different strategy, we compute the probability of changing behavior T i (t) (moving from C to D or from D to C) at time t. This is given by the product of two terms: the probability of selecting a neighbor with an opposite strategy (n i /k i ), and the average probability of effectively imitating one of those neighbors, We can now compute, for a given simulation p, the AGoS, The probability to increase the number of Cs T + p ( j, t) is given by the product of the probabilities of randomly selecting a D for update (N − j/N) and the average probability that this individual actually imitates a C: A similar reasoning can be adopted for [14]. The time-dependent AGoS, G A ( j, g), for a particular generation g is thus computed by averaging over N time-steps (one generation) G A ( j, g) = 1 where c j (g) accounts for the number of times the system was observed in state j at generation g. For a given network type, we run = 2.5 × 10 7 simulations (using 10 3 networks of each type) starting from random initial conditions.

Selection-driven macro-dynamics
As discussed in [14], the evolutionary game dynamics taking place on structured populations leads to the emergence of an effective game, at a macro (population-wide) level, characterized by exhibiting, typically, a pair of internal roots (of coordination and co-existence type) in the AGoS, more akin to the dynamics of N-Person games [10,[58][59][60][61][62][63]. However, as shown, while in homogeneous network structures the dynamics is dominated by the co-existence root, in heterogeneous populations the coordination root characterizes the overall dynamics (cf figure 1).
In figure 2(A), we show a typical profile of the AGoS at different moments of the evolutionary dynamics of a PD played in a population structured according to an HR network, for β = 1.0 and λ = 1.01. We observe a gradual stabilization of the overall shape of the AGoS such that, by the 100th generation, the internal fixed point (and the shape) of the AGoS has already stabilized. Hence, in figure 2(B), we show the location of the internal fixed points of the AGoS at the 100th generation, as a function of β, on top of the corresponding quasi-stationary distributions computed along the lines specified in section 2. Figure 2(B) shows that, whenever β < 0.3 the AGoS indicates no trace of internal roots, that indicates that we are in a PD (or defector dominance) regime, approaching random drift as β → 0 (leading to a flat quasi-stationary distribution). As soon as β > 0.3 the effective dilemma changes abruptly, associated with the appearance of two internal points. The dynamics becomes mainly dominated by the co-existence point (x R , whereas the coordination point x L essentially collapses to x = 0); a feature also reflected in the contour profiles displayed for the quasistationary distribution. It is noteworthy that there is excellent agreement between the AGoS results and those obtained with the quasi-stationary distribution. For β > 50, we observe the appearance of yet another pair of fixed points between x R and x = 0.0, in an overall dynamical scenario close to defection dominance. Hence, homogeneous networks are able to promote cooperation within a small window of selection pressures in which cooperators may co-exist with defectors.
In figure 3, we show the same results as in figure 2 but now for a population structured along the links of a BA network. Again, by the 100th generation, roots of the AGoS have stabilized, and hence in figure 3(B) the location of the internal fixed points (computed at the 100th generation) is shown as a function of β. Whenever β < 0.28 the macro-dynamics is dominated by a single coordination point x L that moves closer to x = 0.0 for increasing β. However, for β > 0.28, a pair of internal points appear below x = 0.1, that occurs simultaneously with a slight increase in the location of the coordination point (x L ) above which the population reaches full cooperation. For increasing β the co-existence point (x R ) almost reaches x = 1.0. The discussion of the detailed mechanisms giving rise to these roots near the monomorphic states falls beyond the scope of the present work, and will be deferred to future work, being related to a multitude of evolutionary deadlocks that may appear at the leaves of scale-free networks [31], which become more significant for high selection pressure.
In both cases (HR and BA, see figures 2 and 3) the positions of the internal points of the AGoS put into evidence the existence of an optimal selection pressure for which cooperation levels are maximized.  Figure 4 shows the result of standard computer simulations (e.g., see [6]) in which, starting from 50% Cs (x i = 0.5) placed at random in the population, one computes the stationary FFC as a function of β and λ. Figure 4(A) shows results for HR networks, whereas figure 4(B) shows the corresponding results for BA networks. For a wide range of values of λ, there is a value of β that maximizes the final fraction of Cs in both network types. We have confirmed that this behavior is independent of both λ and the population structure. For strong selection (β 1.0) individuals approach a more deterministic regime in which the update process consists of copying the strategy of their neighbors whenever these are doing better, however slightly. In such a regime, the population outcome is strongly dependent on the population structure. Between these two extremes one obtains a transition region, characterized by the existence of a local maximum of the FFC in β for a given λ.

Analysis of stationary states of the fraction of cooperators
On HR populations, as long as Cs succeed in forming compact clusters, they may prevent the invasion of Ds [6,13,19,57]. Given that the underlying game is a PD, for weak selection, clusters of Cs are easier to form, but errors in imitation also allow their destruction and/or invasion by Ds. As we increase β we not only decrease the incidence of errors of imitation, hindering the feasibility of forming clusters of Cs, but also hinder the feasibility of them getting destroyed or invaded as a result of errors of imitation. Hence, it is not surprising that the population evolves to a stationary regime in which Cs and Ds may coexist in the population. From a population-wide perspective, the macro-dynamics will be characterized by an effective co-existence dilemma as was shown in figure 2.
On BA heterogeneous networks the situation is more complex, although we can identify the basic mechanism that leads to the results shown in figure 4. Indeed, we observe a decoupling in the effective intensity of selection between interactions that involve at least one hub (nodes of high degree) and those that involve none (involving mostly leaves, that is, nodes of smallest degree). Because hubs are able to accumulate a large fitness-at least one order of magnitude higher than that of other nodes (see below)-they are seen as preferential role models. Hence, potential behavioral imitation between hubs and leaves occurs at an effectively higher selection pressure than interaction among leaves. This mechanism is discussed in further detail in section 3.3.
From an individual perspective this mechanism hints at how each individual should interact with his peers so that cooperation levels at the population-wide level are maximized: imitate deterministically the strategy of the most influential (high fitness) individuals and less so all those that are at a similar level of influence to you. In sum, for a greater good, social and context diversity must be taken seriously [27,64].

A route for cooperation in scale-free networks
To put into evidence the mechanism responsible for the intensity of selection that maximizes cooperation levels in scale-free networks, we divide the nodes of a BA network into three degree classes, similar to [64]. We classify nodes as high degree nodes (HDN) if k i > k max /3, medium degree nodes (MDN) if z < k i k max /3 or low degree nodes (LDN) whenever k i < z.
Taking into account that a node from a degree class can either be a C or a D we consider six possible classes of nodes (see figure 5 for details). We can thus collapse all interactions between nodes in the original network into interactions taking place on a meta-network of 6 (meta-) nodes, where we shall consider only links between nodes of different strategies. Figure 5 exemplifies such a meta-network. The values provided for each node correspond to the average payoff of nodes of the respective class computed for 10 3 configurations obtained by randomly distributing an equal fraction of Cs and Ds on BA networks. As an illustration, figure 5 shows that, for the parameters chosen, indeed, HDN can accumulate payoffs one order of magnitude higher than nodes of other classes.
We can take the average payoffs of each type of node and compute the average probabilities of imitation in each possible C-D link of the meta-network. Whenever any of these processes occurs under weak selection we expect to obtain a probability of imitation near 0.5, whereas deterministic imitation occurs whenever probabilities approach 1.0.
In figure 6, we show the probability of imitation between nodes with different strategies as a function of β. The lines in each panel correspond to the links in the meta-network identified using the node-numbers in figure 5. Black dashed lines display the corresponding FFC.
For low β all three types of transitions occur near random imitation ( p = 0.5), but as we increase β we observe a change of behavior, as some transitions reach a deterministic regime ( p = 1.0) before others, that is, for lower values of β. On the other hand, for high β all transitions group again and become deterministic. This trend is observed in all panels. In figure 6(A), we plot the probability of imitation for transitions between nodes of the same degree class but different strategy, whereas in figures 6(B) and (C) we show how the probability of imitation changes with the intensity of selection for transitions that involve at least one HDN. Clearly, these interactions reach a deterministic regime for values of β below those in figure 6(A). This is so because the higher fitness difference between hubs and leaves leads to an effect similar to what one would obtain for smaller fitness differences but high intensity of selection β. Note further that the values of β at which we observe the transition in these interactions fit quite well the sharp increase of FFC with β. For the other type of interactions available, involving an MDN and an LDN (see figure 6(D)), we observe that the transition to a deterministic regime occurs for higher values of β (compared with the other panels in figure 6), leading to a 'decoupling' between these transitions and those discussed before. In fact, it appears that the transition to a deterministic regime in these interactions is positively correlated with the decrease of the FFC.

Conclusions
In this paper, we have shown that the macro-dynamics of a population-whose individuals engage in a PD of cooperation-strongly depends on two aspects: (i) the network of interactions in which individuals are embedded and (ii) the intensity of selection-or selection pressure-associated with strategy revision. For scale-free social networks the macro-dynamics is akin to a coordination game, regardless of the intensity of selection, whereas for homogeneous networked populations it depends on the intensity of selection: for strong selection we observe a dynamics similar to a co-existence game while for weak selection regimes we recover a typical dynamics of a PD.
The fact that the game at the macro-level remains a PD, in which Cs have no chances, has been obtained previously for homogeneous random graphs, in the framework of pair-approximation [36,65,66]. Pair-approximation, however, was leading to an apparent contradiction with early results from computer simulations by Szabó and Fáth (see [6] for details), that were evidencing a window of opportunity for the PD game played on homogeneous networks under strong selection. The present results resolve the apparent contradiction, showing how the macro-game changes with β.
This work also provides evidence for the existence of an 'optimal' intensity of selection at which levels of cooperation on structured populations are maximized (see also [19][20][21]). Furthermore, we characterize the population-wide dynamics responsible for such an outcome, while discussing the fundamental topological and dynamical mechanisms that led to these results on both homogeneous and scale-free networks. In the appendix we further discuss these issues for other types of networks.
On scale-free networks the mechanism that induces the existence of an optimal level for cooperation results from high levels of social diversity [27] (here, network heterogeneity) that give rise to a decoupling in the effective intensities of selection at which the imitation between individuals of different degree classes occurs.  [53] as a function of the temptation (λ) and of the intensity of selection (β). Initially, the population was composed by equal frequencies of each strategy placed at random. We observe that for fixed λ there is an optimal value of β at which cooperation levels are maximized. These results show the same type of behavior identified in the other SF populations discussed in the main text.

Appendix. Results for additional network topologies
In figure A.1, we show the FFC for scale-free networks whose structure was generated with the Minimal Model (MM) algorithm [53], that unlike BA exhibit high values of Cluster Coefficient (CC) [1]. The behavior obtained is similar to that already identified and discussed in figures 3 and 4 of the main text, allowing us to conclude that the underlying mechanism that generates this optimal behavior for a specific intensity of selection (β) does not depend on the CC. Instead, these results seem to be a direct consequence of the strongly heterogeneous nature of the network of interactions that give rise to the decoupling mechanism discussed in section 3.3. Moreover, figure A.1 corroborates the idea [17,67] that high clustering combined with a heterogeneous network structure offers a clear enhancement of cooperation, when compared to the results obtained in the absence of such clustering. Figure A.2 shows the overall dynamics of a population that is organized by means of Erdős-Rényi random networks of interaction. We used random networks with 1000 nodes and average degree of four [1]. These random networks possess a low level of heterogeneity, exhibiting a single-scale degree distribution [49]. Figure A.2 (A) shows the internal points of the AGoS at the 100th generation as a function of the intensity of selection, where one observes that macro-dynamics is mainly dominated by a coordination point that, for strong selection regimes (β > 1.0) is located close to x = 0.0, in agreement with the FFCs shown in figure A.2 (B). Figure A.3 shows the overall dynamics of populations organized via networks exhibiting an exponential degree of distribution [49]. Such exponential networks with 1000 nodes and average  degree of four were generated using an algorithm similar to the Barabási and Albert [48], in which preferential attachment is replaced by random attachment [1,50]. These networks exhibit levels of heterogeneity that fall in between those accruing to random and to scale-free networks [49]. The results in both figures A.2 and A.3 show no evidence for the existence of a value of β at which cooperation is maximized; instead, we obtain a threshold value, above which cooperation is sustained. Furthermore, the analysis of these two cases puts into evidence the impact of degree heterogeneity on the overall cooperation dynamics, as cooperation successfully dominates for a broad range of intensities of selection and temptations to defect in populations structured by exponential networks.