Robustness of Cooperation in the Evolutionary Prisoner's Dilemma on Complex Networks

Recent studies on the evolutionary dynamics of the Prisoner's Dilemma game in scale-free networks have demonstrated that the heterogeneity of the network interconnections enhances the evolutionary success of cooperation. In this paper we address the issue of how the characterization of the asymptotic states of the evolutionary dynamics depends on the initial concentration of cooperators. We find that the measure and the connectedness properties of the set of nodes where cooperation reaches fixation is largely independent of initial conditions, in contrast with the behavior of both the set of nodes where defection is fixed, and the fluctuating nodes. We also check for the robustness of these results when varying the degree heterogeneity along a one-parametric family of networks interpolating between the class of Erdos-Renyi graphs and the Barabasi-Albert networks.


I. INTRODUCTION
Evolutionary dynamics has proved to be a useful theory to describe evolution of biological systems at all levels of organization [1]. Rooted in the basic tenet of Darwinism, the replicator dynamics [2,3,4] of evolutionary game theory provides an elegant mathematical description of how natural selection among (phenotypes) strategies takes place when the reproductive success of individuals (and then the future abundance, i.e. frequency, of strategies) depends on the current phenotypic composition of the population (frequency-dependent fitness). In this regard, one of the current theoretical challenges to the explanatory powers of evolutionary game dynamics is the understanding of the observed evolutionary survival of cooperative behavior among individuals when selfish actions provide a higher benefit (fitness). Perhaps the best suited (and most used) model to formally describe the puzzle of how cooperation arises is the Prisoner's Dilemma (PD), a two-players-two-strategies game, where each player chooses one of the two available strategies, cooperation or defection: A cooperator receives R when playing with a cooperator, and S when playing with a defector, while a defector earns P when playing with a defector, and T against a cooperator. When T > R > P > S, the game is a PD. Given this payoff ordering, in a well-mixed (unstructured) population where each agent interacts with all other agents (or a representative sample of the population composition), defectors are fitter and thus the fraction of cooperators asymptotically vanishes.
Among the various mechanisms that have been proposed to explain how natural selection can lead to cooperative behavior (like kin selection, group selection, direct or indirect reciprocity) [5], a simple one is based on leaving off the well-mixed population hypothesis, so that each individual only interacts with agents in its neighborhood, as specified by some graph or network of "social" interactions. Agent-based-modelling approaches [6] of this kind in Theoretical Biology [7], Economics [8] and Social Sciences [9] often benefit in a natural way from Statistical Physics methods, concepts and techniques (also scientists), so favoring fruitful (synergic) interdisciplinary (socio-, bio-, econo-) physics research [10], often termed Physics of Complex Systems [11,12].
Early pioneering numerical work [13] on the PD game in two-dimensional square lattices, made the observation that, unlike in unstructured populations, cooperators and defectors can coexist in the lattice indefinitely. In [13] each individual node played with its immediate neighbors each time step accumulating a payoff, then updated its strategy by imitating the one of highest payoff in its neighborhood, including itself (best-takes-over reproduction rule) and back again for very large times. When passing from a "mean field" (well mixed population) interaction description to a lattice structure of interactions, one has to specify various details (of varying importance) on both, a) the lattice characteristics, e.g. regular or not, randomness of various kinds, finite size effects, etc., and b) the specific form of the microscopic dynamics of reproduction process, e.g. deterministic rules or probabilistic ones, synchronous or asynchronous updating, what types of stochastic fluctuations are allowed, etc. The study of many, if not most, of important aspects of the issue have generated for more than a decade a wealthy literature, of a great interest from the Statistical Physics perspective (e.g., [14,15,16,17,18,19,20,21,22,23,24,25,26,27,28]; for a recent review, see [10], where an extensive list of references can be found).
Nowadays, the existence of cooperation-promoting feedback mechanisms that are rooted deep into the interaction structure is indisputably accepted. It has been termed spatial, or lattice reciprocity, in analogy to direct reciprocity (through iterated game strategies) and indirect reciprocity (through reputation, or scoring, of agents). Simply said, the clustering of cooperators in the lattice could provide high enough fitness to the cooperator nodes exposed to invasion, to the extent of preserving cooperators from evolutionary extinction, even when defection is blatantly favored by the one-shot (two-players) game analysis. For negligible values of P − S ≃ 0, when T − R increases from zero cooperation decreases slowly, and becomes zero at values of (T /R) − 1 well beyond zero. The region (in parameter space) of coexistence of strategists is the genuine battlefield where the competition between strategies adopts interesting, non-trivial aspects: The transition region between two clear-cut phases, i.e. all-cooperators (all-C) prevailing at T /R ≃ 1, and all-defectors (all-D) at higher values of T /R. More recently, a set of works have extended this perspective to a most intriguing and ubiquitous class of networks, say scale-free networks, a "focuss issue" nowadays.
There is an accumulated evidence that many real biological [29,30], social [31] and technological [32,33,34] systems are neither regular nor simplest random graphs (not to say well-mixed populations) of entities or agents, but they are described by some distinctive metric (path length based) and topological (structure and size of local neighborhoods) properties. They often show a so-called scale-free (SF) distribution density of degree, P (k) ∼ k −γ , where the degree k of a node is the number of connections it shares with its neighbors [35,36], so their connectivity patterns depart considerably from lattice homogeneity (lacking of a sharp characteristic scale of connectivity).
The ubiquity and importance of complex networks raised quite naturally the question of how natural selection works on top of different types of complex networks of agents [19,37,38,39,40,41].
In this case (as in other nonlinear dynamical processes in networks [42,43]) one has to deal with two sources of complexity, the evolutionary dynamics and the complex structure of the substrate, which are entangled. Interestingly, the sort of processes that evolutionary game dynamics is aimed to model may well be very relevant to understand real networked systems through the study of a variety of scenarios of co-evolution of both strategies (phenotype survival) and network (evolving topological features) [38]. Among other works exploring various aspects on the evolution in complex networks, see [44,45,46]. From here we focuss attention on fixed network settings and how degree heterogeneity influences evolutionary dynamics of PD.
Some recent extensive numerical works on PD (and closely related) games [39,40,41] on SF networks, using probabilistic updating rule (random neighbor pair-comparison, and update with probability proportional to fitness difference) have shown that the absence of a sharp characteristic scale of degree in the network greatly enhances the "lattice reciprocity" mechanisms of evolutionary survival of cooperation. For example, highly connected (hubs) cooperator nodes have the chance of high payoffs and resist well invasion by easily invading less connected neighbors, which in turn increase hub's payoffs and invading capabilities [40]; this positive feedback mechanism does not operate in the case of defector hubs and illustrates in a simple way one of the biasing effects of graph heterogeneity.
In a recent exploration of these heterogeneity based cooperation-promoting mechanisms, using the kind of implementation of replicator dynamics on graphs specified above in the previous paragraph, one observes generically [41] that fixation of the cooperation (as well as defection) strategy on certain nodes occurs after (often-not-large) sensible transients, so that any asymptotic trajectory of population states defines a partition of the network into three sets: the set C of nodes where cooperation is fixed, the set D of nodes where defection is fixed, and the set F of fluctuating nodes that experience forever cycles of invasion by the competing strategies. In other words, the observed stationary value of the average fractionc of cooperators (see definition in section 3), in any asymptotic (long-term) trajectory, has two additive contributions: a) the relative size µ(C) of the set of pure cooperators, and b) the overall fraction of timeT c spent by fluctuating nodes as cooperators, weighted by its relative size µ(F ), saȳ The analysis of global connectedness inside the sets C and D of fixed strategy nodes reveals that the lack of a significant characteristic scale of degree is neatly associated to a simply connected C set, while D is fragmented into many clusters in the wide transition region (coexistence of strategies) between asymptotic uniform (µ(C) = 1, all-C, and µ(D) = 1, all-D) equilibria. This structure of D in the strategies coexistence regime is similar to that exhibited by both C and D sets for the Erdös-Renyi random class of networks (i.e. with Poissonian distribution density of degrees, and thus a significant characteristic scale: the network average degree) [41]. All previous results [41], were obtained for a unbiased (50%) initial proportion of (randomly placed) cooperators, for all the analyzed stochastic trajectories.
In this paper, we are interested in exploring the robustness of these observations reported in [41] on the behavior of the partition sets, for a limiting one-parameter form of the PD game, say P − S = 0: the border with the Snowdrift game (see next section). Robustness against parameter P − S variation, and others, will be analyzed elsewhere [49]. In particular, we focus here on two aspects of robustness: First, the influence of varying initial fraction of cooperators on the network partition sets (C, D, F ) of asymptotic trajectories. The model, its dynamical rules and structural characteristics, as well as the necessary technical details, are the contents of section II. The results are described and analyzed in section III. Second, in section IV, we show how those observed behaviors of the partition vary along an interpolating family of networks whose heterogeneity can be one-parametric tuned, from ER limit to BA limit, that is, we check robustness against decreasing heterogeneity of the network. Conclusions and some prospective remarks can be found in ending section V.

II. THE MODEL
The Prisoner's Dilemma game is defined in its more general form by the payoff matrix: where the element a ij is the payoff received by an i-strategist when playing against a j-strategist, with i = 1 meaning cooperator, and i = 2 defector. The payoff ordering is given by T > R > P > S. Other payoff orderings have received other names, e.g. T > R > S > P corresponds to the so-called Snowdrift (or Hawks and Doves, or Chicken) game. Following several studies [13,39], the PD payoffs have been set to R = 1 (so the reward for cooperating fixes the payoff scale), T = b > 1, P = 0 (no benefit under mutual defection), and P − S = ǫ = 0. This last choice places us in the very frontier of PD game. It has the effect of not favoring any strategy when playing against defectors (while being advantageous to play defection against cooperators). Small positive values of the parameter ǫ ≪ 1 leads to no qualitative differences in the results [13,39,49], so the limit ǫ → 0 + is agreed to be continuous.
The dynamic rule is specified as follows: each time step is thought of as one generation of the discrete evolutionary time, where every node i of the system plays with its nearest neighbors and accumulates the payoffs obtained during the round, say P i . Then, individuals are allowed to synchronously change their strategies by comparing the payoffs they accumulated in the previous generation with that of a neighbor j chosen at random. If P i > P j , player i keeps the same strategy for the next time step, when it will play again with all of its neighborhood. On the contrary, whenever P j > P i , i adopts the strategy of j with probability Note that this dynamic rule, though stochastic, does not allow the adoption of irrational strategy, i.e. Π i→j = 0 whenever P j ≤ P i .
Let us now specify precisely the family of networks on top of which the evolutionary PD game is evolved. Strategists are located on the vertices of a fixed graph of average connectivity k = 4.
The heterogeneity of the networks is controlled by tuning a single parameter α, so that when α = 0 the networks are of the Erdös-Renyi class of random graphs, and when α = 1 they are of the Barabási-Albert (BA) [47] scale-free networks class. Let's first describe the algorithm to construct a BA network of size N. In this case, one starts from a fully connected set of m 0 nodes and at each time step a new node is linked to m = 2 nodes preferentially chosen, namely, the probability that node i receives one new link is proportional to its degree, k i / j k j . Avoiding multiple connections and iterating the preferential attachment rule N − m 0 times a SF network with an exponent γ = 3 is generated. On the other hand, random single-scale networks are built up following the standard recipe to generate ER networks [36]. Finally, networks with an intermediate degree of heterogeneity can be built following the recipe introduced in [48]. The algorithm combines the mechanisms of preferential (with probability α) and uniform random linking (1 − α) in such a way that starting from α = 0 and increasing its value, the networks generated are successively more homogeneous with a heavy tail whose exponent is equal to (α = 0) or larger than The time scale of microscopic invasion processes (updating rule) is controlled by β −1 , which is the highest connectivity of pair's nodes; this makes that very high payoff of a hub due to its very high k is sensibly balanced by β ∝ k −1 [39], with the side effect that the invasion processes from and to hubs are slowed down, if hub's (and neighbor's) payoff is much smaller than its con-

III. DEPENDENCE ON THE INITIAL CONDITIONS
The initial conditions for the stochastic trajectories that we consider here are such that an initial number ρ 0 N of nodes (0 < ρ 0 < 1) are randomly chosen as cooperators. In Fig. 1    degree k), see e.g. [24]. In other words, even the small amounts of heterogeneity of ER networks, are enough to allow for cooperation-promoting feedback mechanisms to work.
As stated in the introductory section I, it has been reported in [41] that for any asymptotic trajectory there is a partition of the network into three sets, namely the set C of pure cooperator nodes, the set D of pure defector nodes, and the set F of fluctuating nodes. From now on we denote by ρ c = µ(C) the measure (relative size) of the set of pure cooperators (averaged over initial conditions and network realizations), and by ρ d = µ(D) that of the set of pure defectors.
The behavior of ρ c and ρ d versus the game parameter b is plotted in Fig. 3 for different initial distributions as a function of the parameter b.
The first remarkable result is that in ER networks, the density of pure cooperators does not depend on ρ 0 for the whole range of b values, in sharp contrast to the above mentioned results for the tails of the average level of cooperation c (b) (see Fig. 2). As anticipated in the introduction (see Eq. 1), there are two additive contributions to the average fraction c of cooperators, namely the measure ρ c of the set of pure cooperators, and the overall fraction of timeT c spent by fluctuating nodes as cooperators, weighted by the relative size ρ f = µ(F ) of the fluctuating set: Though the first contribution is, for ER networks, independent of ρ 0 , the second one does indeed depend on initial conditions, as inferred from Fig. 2   is consistent with the lack of degree preference (correlation) of pure defectors, which cannot take distinctive advantage of degree inhomogeneity: The higher their instantaneous payoff, the more likely they invade neighboring nodes, which has the effect of diminishing their future payoff.
Finally, we analyze the connectedness of the pure strategists sets, as measured by the number of cooperator cores N cc , and defector cores N dc . For BA networks, and ρ 0 = 1/2, we had reported in [41] the result that for all values of b where C is not an empty set, it is connected, i.e. N cc = 1.
This result turns out to be independent of ρ 0 (see Fig. 4): There is only one cooperator core in BA networks, which contains always the most connected nodes, for any initial fraction of cooperators.
The grouping of pure cooperators into a single connected set C allows to keep a significant fraction of pure cooperators isolated from contacts with fluctuating nodes. This "Eden of cooperation" in- network, so that N dc = 1. Thus, N dc (b) must increase first and then decrease to 1. In Fig. 4 we show the computed N dc (b) curves for BA networks for several values of ρ 0 . It is remarkable that these curves almost collapse, in spite of the fact that the fraction ρ d of pure defectors does indeed depend on ρ 0 , a numerical fact for which we have not found a plausible explanation.
In Fig. 4 we also show for ER graphs N cc (b) and N dc (b), for different values of ρ 0 . Regarding the number of cooperator cores, we see that except in the small range 1.4 < b < 1.6, the different curves coincide, in fair agreement with the independence of ρ c on initial conditions. Note that in the small interval where they do not coincide, the fraction ρ c of pure cooperators is below 1%, for all values of ρ 0 . On the other hand, we see that for higher initial proportion ρ 0 of cooperators, the set D is more fragmented and also that N dc reaches its maximal values at higher values of b.

IV. INFLUENCE OF THE DEGREE OF HETEROGENEITY
In order to inspect how the results depend on the distribution of nodes' degrees, we have monitored the same magnitudes studied throughout this paper when the value of α varies between 0 and 1. As introduced above, this makes the networks less heterogeneous as α grows and approaches 1. Figure 5 shows, from left to the right, the average level of cooperation c , the density of pure cooperators ρ c and the density of pure defectors ρ d as a function of b for several values of α. In this case, the initial distribution of cooperators was set to ρ 0 = 1/2, i.e., the nodes have the same probability to cooperate or defect at t = 0. The results show that indeed the densities of pure strategists and the average level of cooperation do depend on α, that is to say, the figure confirms the role played by the underlying topology. The more homogeneous the graph is, the smaller the level of cooperation in the system. Moreover, the transition for different values of α is smooth and does not exhibit an abrupt crossover from one kind of behavior (α = 0) to the other (α = 1).
We have also explored how nodes where strategies have reached fixation are organized into clusters of cooperation and defection as a function of α. Figure 6 summarizes our computations for the number of cooperator cores. In this case, we have represented N cc as a function of 1 − ρ c (that grows with b) in order to have the same scale for different values of α until cooperation breaks down. The observed dependence with α is again smooth and no abrupt change in the behavior of this magnitude occurs. It is worth stressing that as soon as the underlying network departs from the limit α = 0 corresponding to a BA scale-free network (whose P (k) ∼ k −3 ), the number of CC slightly differs from 1. This means that some realizations give rise to more than one cluster of CC. The probability to have such realizations is very small, but in principle, they are possible. As α is further increased beyond zero, it is clear that pure cooperators do not organize anymore into a single cooperator core. We think that this deviation is due to the fact that when α > 0 the exponent γ of the underlying network, which still is a scale-free degree distribution, is larger that 3. It is known that this value of γ marks the frontier of two different behaviors when dynamical processes are run on top of complex heterogeneous networks [50,51].
This is the case, for instance, of epidemic spreading. For 2 < γ ≤ 3, the second moment of the degree distribution P (k) diverges in the thermodynamic limit, while it is finite if γ > 3. As the critical properties of the system are determined by the ratio between the first (that remains finite for γ > 2) and the second moment, the divergence of the latter when N → ∞ and 2 < γ ≤ 3, makes the epidemic threshold null. On the contrary, when the process takes place in networks whose γ > 3, the epidemic threshold is recovered, although no singular behavior is associated to the critical point [50,51]. We expect that a similar phenomenology is behind the results shown in Though the numerical results presented here correspond to network sizes N = 4000 (in section III) and N = 2000 (section IV), we have study also larger networks (up to N = 10 4 ), with no qualitative differences in the results. The increase of network size, while keeping constant the average degree k , turns out to be beneficial for cooperation, due to the fact that it has the effect of increasing the maximal degree, and thus the range of degree values. This further confirms how efficiently cooperation takes advantage from degree heterogeneity.
The robustness of these results against game parameters variation will be analyzed elsewhere [49], one should expect that the network partition (C, D, F ) along asymptotics stochastic trajectories is generic in evolutionary game dynamics in graphs, for the kind of stochastic updating rule considered here. Our results also suggest that more works are needed in order to fully characterize the behavior of the PD game in heterogeneous graphs. The use of real networks, with emphasis on the role of mesoscopic (community) structures is addressed in [46]. Of particular interest would be to perform the sort of analysis carried out here in scale-free networks with an exponent 2 < γ < 3, which will make it feasible to connect evolutionary dynamics with other dynamical processes taking place on top of scale-free networks. Our hope is that this sort of study might provide a deeper understanding of what is going on at the microscopic level and might help to comprehend what universal mechanisms drive the evolution of complex heterogeneous networks as well as the reasons behind their ubiquitous presence in nature.