Evolutionary dynamics of cooperation in neutral populations

Cooperation is a difficult proposition in the face of Darwinian selection. Those that defect have an evolutionary advantage over cooperators, who should therefore die out. However, spatial structure enables cooperators to survive through the formation of homogeneous clusters, which is the hallmark of network reciprocity. Here we go beyond this traditional setup and study the spatiotemporal dynamics of cooperation in a population of populations. We use the prisoner's dilemma game as the mathematical model and show that considering several populations simultaneously gives rise to fascinating spatiotemporal dynamics and pattern formation. Even the simplest assumption that strategies between different populations are payoff-neutral with one another results in the spontaneous emergence of cyclic dominance, where defectors of one population become prey of cooperators in the other population, and vice versa. Moreover, if social interactions within different populations are characterized by significantly different temptations to defect, we observe that defectors in the population with the largest temptation counterintuitively vanish the fastest, while cooperators that hang on eventually take over the whole available space. Our results reveal that considering the simultaneous presence of different populations significantly expands the complexity of evolutionary dynamics in structured populations, and it allows us to understand the stability of cooperation under adverse conditions that could never be bridged by network reciprocity alone.


Introduction
Methods of statistical physics, in particular Monte Carlo simulations and the theory of phase transitions [1][2][3], have been successfully applied to a plethora of challenging problems in the social sciences [4][5][6][7]. The evolution of cooperation in social dilemmas (situations where what is best for the society is at odds with what is best for an individual) is a vibrant example of this development. Many reviews [8][9][10][11] and research papers that reveal key mechanisms for socially preferable evolutionary outcomes have been published in recent years [12][13][14][15][16][17][18][19][20][21][22][23][24][25][26]. Since cooperative behaviour is central to the survival of many animal species, and since it is also at the heart of the remarkable evolutionary success story of humans [27,28], it is one of the great challenges of the 21st century that we succeed in understanding how best to sustain and promote cooperation [29].
It has been shown that phase transitions leading to cooperation depend sensitively on the structure of the interaction network and the type of interactions [30][31][32][33][34], as well as on the number and type of competing strategies [8][35][36][37]. An important impetus for the application of statistical physics to evolutionary social dilemmas and cooperation has been the seminal discovery of Nowak and May [38], who showed that spatial structure can promote the evolution of cooperation through the mechanism that is now widely referred to as network reciprocity [39,40]. A good decade later, Santos and Pacheco showed just how important the structure of the interaction network can be [13], which paved the way towards a flourishing development of this field of research.
But while research concerning the evolutionary dynamics of cooperation in structured populations has come a long way, models where different populations do not interact directly but compete for space at the level of individuals have not been considered before. Motivated by this, we consider a system where two or more populations are distributed randomly on a common physical space. Between the members of a particular population the interactions are described by the prisoner's dilemma game. But there are no such interactions between players belonging to different populations, and hence players are unable to collect payoffs from neighbours belonging to a different population. The populations on the same physical space, for example on a square lattice, are thus neutral. Nevertheless, all players compete for space regardless of which population they belong to, so that a player with a higher fitness is likely to invade a neighbouring player with a lower fitness.
As we will show, such a conglomerate of otherwise neutral populations gives rise to fascinating spatiotemporal dynamics and pattern formation that is rooted in the spontaneous emergence of cyclic dominance. Within a very simple model, we observe the survival of cooperators under extremely adverse conditions where traditional network reciprocity would long fail, and we observe the dominance of the weakest due to the greediness of the strongest when considering different temptations to defect in different populations.
In what follows, we first present the studied prisoner's dilemma game and the details of the mathematical model. We then proceed with the presentation of the main results and a discussion of their wider implications.

Prisoner's dilemma in neutral populations
As the backbone of our mathematical model, we use a simplified version of the prisoner's dilemma game, where the key aspects of this social dilemma are preserved while its strength is determined by a single parameter [38]. In particular, mutual cooperation yields the reward $R = 1$, mutual defection leads to punishment $P = 0$, while the mixed choice gives the cooperator the sucker's payoff $S = 0$ and the defector the temptation $T > 1$. We note that the selection of this widely used and representative parameterization gives results that remain valid in a broad range of pairwise social dilemmas, including the snowdrift and the stag-hunt game. All players occupy the nodes of an $L \times L$ square lattice with four neighbours each. To introduce the simultaneous presence of different populations, the $L^2$ players are assigned to $i = 1, 2, \ldots, n$ different populations uniformly at random. Each population $i$ contains an equal fraction of $C_i$ cooperators and $D_i$ defectors, who upon pairwise interactions receive payoffs in agreement with the above-described prisoner's dilemma game. Importantly, between different populations players are payoff-neutral with one another, which means that when $C_i$ meets $C_j$ or $D_j$ with $j \neq i$, its payoff does not change, and vice versa. In the next section, we first consider the model where all populations have the same temptation to defect ($T_i = T$ for all $i$), and then we relax this condition to allow different temptations to defect in different populations.
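The random initial state described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the reported results: the function name `init_lattice`, the representation of each site as a `(population, strategy)` tuple, and the independent, uniform draw of both labels at every site are assumptions made for the example.

```python
import random

COOPERATE, DEFECT = "C", "D"

def init_lattice(L, n, seed=None):
    """Return an L x L grid whose entries are (population, strategy) tuples.

    Assumption: the population label (1..n) and the strategy are drawn
    independently and uniformly at random for every site, so that each
    population starts with roughly equal fractions of C and D players.
    """
    rng = random.Random(seed)
    return [
        [(rng.randint(1, n), rng.choice((COOPERATE, DEFECT))) for _ in range(L)]
        for _ in range(L)
    ]

# Small example: a 20 x 20 lattice shared by three populations.
lattice = init_lattice(L=20, n=3, seed=1)
```

Storing the population tag together with the strategy at each site means that copying a site during imitation transfers both labels at once.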
We use the Monte Carlo simulation method to determine the spatiotemporal dynamics of the mathematical model, which comprises the following elementary steps. First, a randomly selected player $x$ acquires its payoff $\Pi_x$ by playing the game potentially with all its four neighbours. Next, player $x$ randomly chooses one neighbour $y$, who then also acquires its payoff $\Pi_y$ in the same way as previously player $x$. Finally, player $x$ imitates the strategy of player $y$ with the probability $w = \{1 + \exp[(\Pi_x - \Pi_y)/K]\}^{-1}$, where we use $K = 0.1$ as the inverse of the temperature of selection to obtain results comparable with existing research [8]. Naturally, when neighbouring players compete for space, the above-described microscopic dynamics involves not only the adoption of the more successful strategy but also the imitation of the associated population tag.
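The elementary steps above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names, the `(population, strategy)` tuple stored at each site, the dictionary `T` mapping a population label to its temptation to defect, and the periodic boundary conditions are all assumptions made for the example.

```python
import math
import random

K = 0.1  # value of K quoted in the text
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def payoff(lattice, x, y, T):
    """Payoff of the player at (x, y); periodic boundaries are an assumption.

    Only cooperating neighbours from the player's own population contribute:
    1 to a cooperator (R = 1) and T[pop] to a defector. All other pairings,
    including every encounter with another population, yield 0.
    """
    L = len(lattice)
    pop, strat = lattice[x][y]
    total = 0.0
    for dx, dy in NEIGHBOURS:
        n_pop, n_strat = lattice[(x + dx) % L][(y + dy) % L]
        if n_pop == pop and n_strat == "C":
            total += 1.0 if strat == "C" else T[pop]
    return total

def imitation_probability(pi_x, pi_y, K=K):
    """Fermi rule w = 1 / (1 + exp((pi_x - pi_y) / K))."""
    z = (pi_x - pi_y) / K
    if z > 700.0:  # exp(z) would overflow a float; w is 0 to machine precision
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def elementary_step(lattice, T, rng=random):
    """One update attempt: a random player x may copy a random neighbour y."""
    L = len(lattice)
    x, y = rng.randrange(L), rng.randrange(L)
    dx, dy = rng.choice(NEIGHBOURS)
    nx, ny = (x + dx) % L, (y + dy) % L
    w = imitation_probability(payoff(lattice, x, y, T), payoff(lattice, nx, ny, T))
    if rng.random() < w:
        lattice[x][y] = lattice[nx][ny]  # copies strategy and population tag
```

Because the whole tuple is copied, a successful imitation across a population boundary changes the population tag of the invaded site, which is how neutral populations compete for space in this sketch.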
In agreement with the random sequential simulation procedure, during a full Monte Carlo step (MCS) each player obtains a chance once on average to imitate a neighbour. The average fractions of all microscopic states on the square lattice are determined in the stationary state after a sufficiently long relaxation time. Depending on the proximity to phase transition points and the typical size of emerging spatial patterns, the linear system size was varied from $L = 400$ to $6600$, and the relaxation time was varied from $10^4$ to $10^6$ MCS to ensure that the statistical error is comparable with the size of the symbols in the figures.
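The measurement described above (discard a relaxation period, then average the fractions of the microscopic states) can be sketched as below. The names `state_fractions`, `cooperator_fraction` and `stationary_average` are hypothetical, each site is assumed to hold a `(population, strategy)` tuple, and `sweep` stands for any user-supplied callable that performs one full MCS in place.

```python
from collections import Counter

def state_fractions(lattice):
    """Fraction of each (population, strategy) state on the lattice."""
    counts = Counter(site for row in lattice for site in row)
    total = sum(counts.values())
    return {state: c / total for state, c in counts.items()}

def cooperator_fraction(lattice):
    """Overall fraction of cooperators, summed over all populations."""
    fractions = state_fractions(lattice)
    return sum(f for (_, strat), f in fractions.items() if strat == "C")

def stationary_average(lattice, sweep, relax, sample):
    """Discard `relax` sweeps, then average the cooperator fraction
    over the following `sample` sweeps.

    `sweep(lattice)` must perform one full MCS (L*L update attempts) in place.
    """
    for _ in range(relax):
        sweep(lattice)
    acc = 0.0
    for _ in range(sample):
        sweep(lattice)
        acc += cooperator_fraction(lattice)
    return acc / sample
```

In practice the relaxation and sampling lengths would have to be chosen according to the system size and the distance from the transition points, as stated in the text.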

Results
Naively, one might assume that simultaneously introducing several populations that bear the same serious conflict of competing strategies would not bring about any changes in the evolutionary outcome. As is well known, the Nash equilibrium of the prisoner's dilemma game is mutual defection [41], and since this applies to all populations, the overall outcome should be mutual defection too. This reasoning is actually completely correct in well-mixed populations, where the consideration of different, otherwise neutral populations really does not change the result: cooperators die out in all populations as soon as $T > 1$. But as we will show next, this naive expectation is completely wrong in structured populations, where excitingly different evolutionary outcomes can be observed due to the simultaneous presence of different populations.
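The well-mixed argument can be made explicit with a short calculation. The sketch below assumes replicator dynamics, which the text does not specify, and introduces $x_{C_i}$ and $x_{D_i}$ as illustrative notation for the fractions of $C_i$ and $D_i$ players in the whole system, with payoffs taken as averages over randomly met partners.

```latex
\begin{align}
\Pi_{C_i} &= x_{C_i}, \qquad \Pi_{D_i} = T\,x_{C_i}, \\
\Pi_{D_i} - \Pi_{C_i} &= (T - 1)\,x_{C_i} \;\ge\; 0, \\
\frac{d}{dt}\,\ln\frac{x_{D_i}}{x_{C_i}} &= \Pi_{D_i} - \Pi_{C_i} = (T - 1)\,x_{C_i}.
\end{align}
```

Under these assumptions the ratio of defectors to cooperators within each population grows for as long as that population's cooperators are present, irrespective of how many neutral populations share the system.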
As far as cooperation promotion is concerned, and before elucidating the responsible microscopic mechanism for such favourable evolutionary outcomes, we show in figure 1 how the fraction of cooperators depends on the temptation to defect $T$ for different numbers of populations $n$ that form the global system. For comparison, we also show the baseline $n = 1$ case, which corresponds to the traditional version of the weak prisoner's dilemma game on the square lattice, and where cooperators benefit from network reciprocity to survive up to $T \approx 1.037$ [43]. It can be observed that, as we increase $n$, the fraction of cooperators increases dramatically. In fact, the higher the value of $n$, the higher the stationary fraction of cooperators in the whole system.
The spatiotemporal dynamics behind this promotion of cooperation in a complex system consisting of two populations can be seen in the animation provided in [42], while a representative snapshot of the stationary state is shown in figure 2. In both cases cooperators are depicted blue while defectors are depicted red, and different shades of these two colours denote adherence to the two different populations. In figure 2, we have circled two crucial details that explain how the patterns evolve over time. The white circle marked 'I' highlights that dark red defectors can easily invade dark blue cooperators. However, the invaded space is quickly lost to light blue cooperators belonging to the other population. The latter, on the other hand, are successfully invaded by light red defectors from their own population, who are in turn again invaded by dark blue cooperators. In this way the loop is closed, revealing the spontaneous emergence of cyclic dominance in the form $D_1 \to C_1 \to D_2 \to C_2 \to D_1$, which determines the stationary distribution of strategies in our system. As is well known, cyclic dominance is crucial for the maintenance of biodiversity [36], which in our case translates to the survival of all four competing strategies, and thus to the sustenance of cooperation even at very high temptation values. This cyclic dominance can be observed directly if we launch the evolution from a prepared initial state, such that homogeneous domains of the competing strategies are separated by straight interfaces, as in the animation provided in [44] (in this animation a higher temptation to defect, $T = 1.5$, was used to yield clearer propagating fronts). It can be observed that propagating fronts emerge that are conceptually similar to those observed before in rock-paper-scissors-like systems [36].
Turning back to figure 2, the white ellipse marked 'II' highlights another important aspect of the spatiotemporal dynamics, namely the smooth interface separating the two cooperative strategies in the absence of defectors. This may be surprising at first because these strategies are payoff-neutral, and thus a voter-model-like coarsening with highly fluctuating interfaces would be expected [45]. However, while a $C_1$ cooperator does not benefit from the vicinity of a $C_2$ cooperator, other $C_1$ cooperators close by of course increase each other's payoffs (and vice versa for $C_2$ cooperators). As a consequence, the payoffs of $C_1$ and $C_2$ cooperators along the interface differ, so that one will likely invade the other. This process always aims to straighten the interfaces. If an interface cannot be straightened, for example around a small island, the latter will shrink due to an effective surface tension.
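The payoff difference along a curved interface can be made concrete with a small worked example. The sketch below is an illustration under stated assumptions, not part of the original analysis: the 5 x 5 configuration, the integer labels 1 and 2 for $C_1$ and $C_2$ cooperators, and the helper names are all hypothetical. It only quantifies the local payoff asymmetry around a single protruding site and does not simulate the full interface dynamics.

```python
import math

K = 0.1  # value of K quoted in the text

# 1 and 2 label cooperators of population 1 and 2; rows are listed top to bottom.
# A flat interface with a single C_1 site protruding into the C_2 domain.
grid = [
    [2, 2, 2, 2, 2],
    [2, 2, 2, 2, 2],
    [2, 2, 1, 2, 2],  # protruding C_1 site at row 2, column 2
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
]

def coop_payoff(grid, r, c):
    """Payoff of a cooperator: one unit (R = 1) per neighbour of its own population."""
    total = 0
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        inside = 0 <= rr < len(grid) and 0 <= cc < len(grid[0])
        if inside and grid[rr][cc] == grid[r][c]:
            total += 1
    return total

def fermi(pi_x, pi_y):
    """Probability that a player with payoff pi_x imitates one with payoff pi_y."""
    return 1.0 / (1.0 + math.exp((pi_x - pi_y) / K))

bump = coop_payoff(grid, 2, 2)   # supported only by the C_1 site below it
side = coop_payoff(grid, 2, 1)   # C_2 neighbour beside the protrusion
above = coop_payoff(grid, 1, 2)  # C_2 neighbour above the protrusion

w_retreat = fermi(bump, side)    # protruding site adopts the C_2 tag
w_advance = fermi(side, bump)    # C_2 neighbour adopts the protruding C_1 tag
```

In this configuration the protruding site collects a payoff of 1, while its $C_2$ neighbours collect 2 or 3. With $K = 0.1$, a payoff difference of one unit already makes `w_retreat` very close to 1 and `w_advance` very close to 0, which is consistent with the tendency of isolated protrusions and small islands to disappear.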
Lastly in terms of the results presented in figure 1, it remains to explain why the larger the number of populations forming the global system, the higher the level of cooperation in the stationary state, and this regardless of the temptation to defect. To that effect we provide in [46] an animation showing the spatiotemporal dynamics when $n = 3$, and in figure 3 a representative snapshot of the distribution of strategies on the square lattice in the stationary state. These results reveal that the increasing positive effect is due to the fact that the addition of one new population $i$ always yields one additional prey to the cooperators in other populations. At the same time, no new predators to them are introduced, i.e. $D_i$ defectors, who act as prey to cooperators in the other populations, are predators only to $C_i$ cooperators, but the latter find their prey in defectors from other populations too. The snapshot in figure 3 features two white ellipses, where it is highlighted that the plain red $D_3$ defectors are dominated by both $C_1$ (dark blue) and $C_2$ (light blue) cooperators (see also the animation in [46]).
Figure 1. The stationary fraction of cooperators $f_C$ in the whole system as a function of the temptation to defect $T$, as obtained for different numbers of populations $n$ that form the global system (indicated by the number along each curve). For reference the result of the classic one-population ($n = 1$) spatial prisoner's dilemma game is shown as well. These results indicate that the introduction of additional populations whose members are payoff-neutral between one another significantly improves the survival chances of cooperators.
Thus far, we have only considered cases where the temptation to defect was the same in all populations. By relaxing this restriction, the number of free parameters increases significantly, yet it is still possible to determine general properties of the spatiotemporal dynamics that govern the evolutionary outcomes in the presented system.
We begin by presenting results for the generalized two-population setup where $T_1 \neq T_2$. As we have shown above, the emergence of cyclic dominance between the four competing microscopic states in general dictates a stable coexistence. Increasing the temptation to defect in one population effectively increases the rate of the corresponding $D \to C$ invasion. The consequences of this fact, based on the fundamental principles of cyclic dominance [36], actually completely explain the evolutionary outcomes in figure 4. The first potentially surprising observation is that increasing the temptation to defect $T_2$ between $D_2$ defectors and $C_2$ cooperators will not only lower the stationary fraction of $C_2$ and increase the stationary fraction of $D_2$, but also elevate the fraction of $C_1$ cooperators. This is because $D_2$ defectors are prey to $C_1$ cooperators, and it is well known that a species engaged in cyclic dominance is promoted not by weakening its predator, but rather by making its prey stronger. This paradox is a frequently observed trademark of systems that are governed by cyclic dominance [47]. However, despite the described boost to the growth of $C_1$ cooperators, the overall fraction of all cooperators in the whole system decreases slightly as we increase $T_2$ towards very large values, as illustrated in the inset of figure 4.
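The statement that a species in a cyclic food web is promoted by making its prey stronger can be illustrated with the textbook mean-field rock-paper-scissors model. This is a deliberately simplified stand-in and not the four-strategy spatial system studied here: the three species A, B, C, the rates `a`, `b`, `c`, and the function name are assumptions made for the example, and only the qualitative dependence of a predator's abundance on the strength of its prey carries over.

```python
def rps_fixed_point(a, b, c):
    """Interior fixed point of the mean-field cyclic Lotka-Volterra equations

        dA/dt = A (a B - c C),  dB/dt = B (b C - a A),  dC/dt = C (c A - b B),

    where A invades B at rate a, B invades C at rate b and C invades A at rate c.
    Setting the brackets to zero and normalising A + B + C = 1 gives the
    densities returned below.
    """
    s = a + b + c
    return {"A": b / s, "B": c / s, "C": a / s}

# Equal rates: all three species are equally abundant.
base = rps_fixed_point(1.0, 1.0, 1.0)

# Make B, the prey of A, twice as aggressive towards its own prey C.
boosted = rps_fixed_point(1.0, 2.0, 1.0)
```

In this toy model the stationary density of A is proportional to `b`, the invasion rate of its prey B, so doubling `b` raises the share of A from one third to one half. The density of A does not depend on how strongly A itself invades, which is the mean-field version of the paradox described above.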
For a better demonstration of the acceleration of the $D_2 \to C_2$ invasion and the resulting boost to $C_1$ cooperators (dark blue), we provide an animation in [48], where an extremely high $T_2 = 100$ was used at $L = 400$ linear system size. As the animation shows, although $C_2$ cooperators (light blue) are invaded very efficiently by $D_2$ defectors (light red), the abundance of $D_1$ defectors (dark red) always offers $C_2$ cooperators an evolutionary escape hatch out of extinction. In agreement with the above-described cyclic dominance, $D_2$ defectors are quickly invaded by $C_1$ cooperators. Interestingly, $D_1$ defectors would also beat $C_1$ cooperators because $T_1 = 1.05$ is above the $T = 1.037$ cooperation survival threshold of a single population, yet the $D_2 \to C_2$ propagating front always comes to the rescue, bringing with it $D_2$ defectors as prey.
Figure 2. The key mechanism that is responsible for the emerging spatial pattern is highlighted by a white circle marked 'I'. Together with the animation provided in [42], it can be observed that dark red defectors invade dark blue cooperators, but light blue cooperators invade dark red defectors. Likewise, light red defectors invade light blue cooperators, but dark blue cooperators invade light red defectors. This spontaneous emergence of cyclic dominance in the form $D_1 \to C_1 \to D_2 \to C_2 \to D_1$ is responsible for the sustenance of cooperation even at very high temptation values that can be observed in figure 1. The white ellipse marked 'II' highlights the smooth interface between both cooperator strategies in the absence of defectors, which is surprising given that the two strategies are payoff-neutral and thus should be subject to voter-model-like coarsening. For clarity an $L = 400$ linear system size was used.
In comparison to the results obtained when the temptation to defect is the same in all populations (see figure 1), it may come as a surprise that cooperators die out if $T > 2.85$, and this despite the fact that qualitatively the same cyclic dominance emerges there. The explanation of this difference, illustrated in figure 4, is that in the symmetrical case the $D_1 \to C_1$ and $D_2 \to C_2$ invasion rates change simultaneously as we vary $T$. However, it is precisely this simultaneous change of invasion rates that may jeopardize the stable coexistence in models of cyclic dominance. As shown previously for a symmetric 4-strategy Lotka-Volterra system, the coexistence disappears if the difference between the invasion rates exceeds a threshold value [49]. For an illustration, the effective food-web of the four competing strategies in a two-population model is shown in the left panel of figure 5.
Figure 4. [...] The inset shows the overall fraction of cooperators in the system in the large $T_2$ limit.
Naturally, if we allow different temptation values in different populations the behaviour becomes even more complex, as we show next using a still relatively simple three-population system as an example. The effective food-web is shown in the right panel of figure 5. If we just vary $T_3$, while the temptation to defect in the other two populations remains fixed at $T_1 = T_2 = 1.05$, the $D_3 \to C_3$ invasion rate will influence invasions in several other cycles in the effective food-web. Examples include cycles of the form $D_3 \to C_3 \to D_j \to C_j \to D_3$ with $j = 1, 2$, all of which contain the elementary $D_3 \to C_3$ invasion that is directly affected by $T_3$. This is why it is almost impossible to predict the response of a system comprised of several neutral populations, even if only a single temptation to defect is varied. For the above $n = 3$ case, the results showing how different $T_3$ values affect the evolutionary outcome are presented in figure 6. It can be observed that upon increasing the value of $T_3$, the stationary fraction of $C_1$ and $C_2$ cooperators is not affected, even though they are the predators of $D_3$, which should in principle be promoted by large $T_3$ values. On the other hand, the overall fraction of all defectors in the system remains very low. But the most exotic reaction is that of the fraction of $C_3$ cooperators, which are of course the direct prey of $D_3$ defectors. While initially their fraction in the stationary state decreases to a shallow minimum across the intermediate range of $T_3$ values, it ultimately increases to complete dominance above a threshold value. In other words, while defectors survive when all $T$ values in the system are equal to 1.05, they die out if we increase one of them sufficiently, as happens in figure 6 when the $T_3$ value is sufficiently large. Due to the symmetry of the model, the same results are of course obtained if the value of either $T_1$ or $T_2$ is increased instead of the value of $T_3$.
To better understand and illustrate the seemingly paradoxical effect the increasing $T_3$ value has on the evolutionary outcome, we provide an animation from a prepared initial state in [50]. Here the square lattice is horizontally divided into two parts, where in the top half $C_1$ cooperators (dark blue) are framed by $D_1$ defectors [...]. Importantly, invasions through the horizontal border are not permitted because we want to compare the independent evolution of both sub-systems. Since $D_2$ defectors are not present, $C_2$ cooperators have no natural predator. As a consequence, the whole system will evolve into a pure $C_2$ (light blue) phase. However, the really interesting aspect of this animation is how the mentioned sub-systems reach this state. In the top half, $D_1$ defectors are less aggressive, and therefore their invasions are less salient. This has two important consequences. In the first place, their payoffs are not high enough for the other strategies to imitate them readily, and so the $C_1$-$D_1$ border fluctuates rather strongly. Secondly, $D_1$ defectors do not form a homogeneous front along this border. The latter would be essential for a fast invasion of $C_2$ cooperators (light blue), who are their predators. In other words, the effective invasion of $C_2$ cooperators can only happen via the invasion of $D_1$ defectors. The latter condition is completely fulfilled in the bottom half, where $D_3$ defectors are more aggressive. Here defectors form not just a more compact invasion front, but they also form a thick, uniform stripe, which is an easy target for $C_2$ cooperators. Consequently, the more aggressive defectors will die out much faster than their less potent $D_1$ counterparts in the top half of the square lattice.
Figure 5. The effective food-web of all competing strategies in a two-population (left panel) and three-population (right panel) system. We emphasize that the depicted relations between strategies exist only in a spatial system, where cooperators can invade defectors from other populations. If we consider solely pairwise interactions, the relation between $C_1$ cooperators and $D_2$ defectors (or $C_1$ and $C_2$) is of course payoff-neutral, as defined in the mathematical model.
The process just described is actually very common when the value of the temptation to defect in one population is significantly larger than the corresponding values in other populations. Of course, the extinction of the most aggressive defector frequently involves also the extinction of its cooperator prey. Sometimes, however, if the system size is large enough, it may happen that the prey of the more aggressive defectors manages to separate itself in an isolated part of the lattice and hang on until its predators die out. Such a situation is illustrated in figure 7, where the white ellipses and circles mark plain blue cooperator spots that got rid of their natural predators (plain red). In the absence of the latter, the arguably weakest cooperators become the strongest, and they eventually rise to complete dominance by invading defectors from the other two populations, who themselves continuously invade their cooperators. The whole evolutionary process can be seen in the animation in [51], where we have used prepared initial patches of the six competing strategies to make the spatiotemporal dynamics that leads to the described pattern formation better visible. Additionally, for a faster evolution, we have used a smaller $L = 180$ linear system size. In effect, the plain blue cooperators use the defectors from the other two populations as a Trojan horse to invade the whole available space. And despite starting as the weakest, they turn out to be dominant due to the greediness of their direct predators.
Figure 7. As in figures 2 and 3, different shades of blue and red depict cooperators and defectors belonging to different populations. White ellipses highlight the weakest $C_3$ cooperators (plain blue), who manage to survive despite the large $T_3$ value giving a huge evolutionary advantage to their direct predators $D_3$ (plain red). What is more, due to their greediness, $D_3$ defectors are actually the first to die out, thus paving the way for $C_3$ cooperators to rise to complete dominance by using $D_1$ and $D_2$ defectors (light and dark red) as a Trojan horse to invade the territory of $C_1$ and $C_2$ cooperators (light and dark blue). This is an example where the weakest ultimately dominate because of the greediness of the strongest. For clarity an $L = 360$ linear system size was used.

Discussion
We have studied the spatiotemporal dynamics of cooperation in a system where several neutral populations are simultaneously present. The evolutionary prisoner's dilemma game has been used as the backbone of our mathematical model, where we have assumed that strategies between the populations are payoff-neutral but compete freely with one another as determined by the interaction graph topology. Within a particular population the classical definition of the prisoner's dilemma game between cooperators and defectors has been applied. We have observed fascinating spatiotemporal dynamics and pattern formation that are unattainable in a single-population setup. From the spontaneous emergence of cyclic dominance to the survival of the weakest due to the greediness of the strongest, our results have revealed that the simultaneous presence of neutral populations significantly expands the complexity of evolutionary dynamics in structured populations. From the practical point of view, cooperation in the proposed setup is strongly promoted and remains viable even under extremely adverse conditions that could never be bridged by network reciprocity alone. The consideration of simultaneously present neutral populations thus allows us to understand the extreme persistence and stability of cooperation without invoking strategic complexity, and indeed in the simplest possible terms as far as population structure and overall complexity of the mathematical model are concerned.
The central observation behind the promotion of cooperation is that, if we put two payoff-neutral populations together, then only cooperators can benefit from it in the long run. While the advantage of mutual cooperation is readily recognizable already in a single population, and it is in fact the main driving force behind traditional network reciprocity, the extent of it remains limited because cooperators at the frontier with defectors always remain vulnerable to invasion. This danger is here elegantly avoided when a cooperative cluster meets the defectors of the other population. In the latter case the positive consequence of network reciprocity is augmented and cooperators can easily invade the territory of the foreign defectors. Importantly, this evolutionary success of cooperators in one population works vice versa for cooperators in the other population(s) too. Due to this symmetry, it is easy to understand that, as we have shown, the larger the number of populations forming the system, the more effective the promotion of cooperation.
We have also shown that the already mentioned positive impact can be enhanced further if we allow different temptations to defect in different populations. Counterintuitively, in a system where the population-specific temptation to defect values are diverse enough, the defectors whose temptation value is the largest die out first. And this turns out to be detrimental for defectors in other populations too. An extremely aggressive invasion leads to the fast depletion of the prey (in this case the cooperators from the corresponding population), which in turn leads to the extinction of the predators. However, the reverse situation is not valid: if the most vulnerable cooperators somehow manage to survive, they eventually rise to complete dominance, using defectors from other populations as Trojan horses to invade cooperators from other populations. This gives rise to the dominance of the weakest due to the greediness of the strongest, and it also reminds us that dynamical processes in different populations should not be too diverse because this jeopardizes the stability of the whole system.