The dynamics of cooperation in asymmetric sub-populations

Sacrificing personal benefits for a common good is at odds with the fundamental principle of Darwinian evolution: if only the fittest survives, then there should be no place for cooperation. But cooperative behavior actually abounds, and constitutes one of the most persistent and fascinating puzzles of nature. One solution to this puzzle is network reciprocity, where the collective dynamics of cooperators affords them protection against invading defectors. Commonly, however, such a competition does not unfold in isolation. Populations are often divided into sub-populations, with different evolutionary rules describing the interactions between them. Here we propose and study a paradigmatic model that captures the essence of this setup. Specifically, if two players belong to the same sub-population, they play the prisoner’s dilemma game. If not, they play either the harmony game, the snowdrift game, the stag-hunt game, or the prisoner’s dilemma game. Due to such an asymmetry in the interactions across sub-populations, a fascinating evolutionary dynamics sets up that greatly expands the survivability of cooperators. For instance, when the harmony game applies, cyclic dominance spontaneously emerges, wherein cooperators in one sub-population become predators of defectors in the other sub-population. One also may observe self-organized segregation, wherein both sub-populations maintain a mixed state of cooperators and defectors. As a general rule, we show that the lower the dilemma strength between sub-populations, the more abundant the cooperative strategy in the entire population. Results are confirmed by means of Monte Carlo simulations with pair approximation method, which reveals a rich plethora of novel and generally valid paths to cooperation.


Introduction
Cooperation is the act of sacrificing personal benefits for a common good, and while it is widespread in bacteria, bees, ants, birds and mammals [1], is in fact what distinguishes us most as humans. But, since this is clearly at odds with Darwin's famous thesis that only the fittest survive, the abundance of cooperation in nature constitutes an intriguing puzzle that ought to be unsustainable through the course of evolution. Rather unexpectedly, this most fascinating aspect of our biology turned out to be akin to collective behavior that physicists have observed decades ago in spin models and in many other many-particle systems [2]. Methods of statistical physics are now playing a key role in helping us to better understand cooperation and when it can proliferate [3][4][5][6][7][8][9]. Monte Carlo simulations and the theory of phase transitions in particular [10][11][12], random walk dynamics [13], as well as methods of network science [14][15][16][17][18][19][20], are enabling unprecedented insights into the details of microscopic and collective evolutionary dynamics behind cooperation in social dilemmas (those situations where what is best for an individual conflicts with what is best for the society).
Traditionally, the theoretical backbone for the mathematical description of social dilemmas is evolutionary game theory [21][22][23], which has revealed several, and now indeed famous, mechanisms for cooperation, including kin selection, direct and indirect reciprocity, network reciprocity, and group selection, as reviewed in reference [24]. Research in physics has long focused on the evolution of cooperation in single and isolated networks [25][26][27][28][29][30][31][32][33][34][35][36][37][38], although more recent years have seen the shift towards multilayer and interdependent networks [39][40][41][42][43][44][45][46][47][48][49][50][51]. A particularly interesting avenue in this regard concerns the interactions among populations, which relaxes the unrealistic assumption that competition within a particular population unfolds in isolation. In details, the whole population is divided into several sub-populations due to geographic distance or biological isolation, which gives rise to the possibility of each sub-population allowing different evolutionary laws. An initial study has been presented in reference [52], where a simple assumption was made that strategies between different sub-populations are payoff-neutral with one another, and yet it was shown that the complexity of evolutionary dynamics is significantly expanded. Besides, sub-populations with asymmetric interaction and among different species were also proposed [53,54], where payoff inequality plays an important role in the evolution of cooperation, was also proposed [41], Here we propose a paradigmatic novel model that mathematically accounts for the fact that the evolutionary dynamics within a particular sub-population is governed by one type of social dilemma, while different rules apply between sub-populations. More precisely, if two players belong to the same sub-population, they will play the prisoner's dilemma game, at variance, they will play either the harmony game, the snowdrift game, the stag-hunt game, or the prisoner's dilemma game, depending on the parameter values of the model. We study the model in detail by means of Monte Carlo simulations, obtaining the complete phase diagrams for the key parameters, and determining the elementary properties of the governing microscopic dynamics. Our research shows that the asymmetry in the dynamics between sub-populations gives rise to the spontaneous emergence of cyclic dominance, to self-organized segregation, and indeed to many other fascinating evolutionary outcomes, which together transcend established boundaries of cooperator survival in social dilemmas.

Methods
In the proposed model, we consider a two-player two-strategy game, which includes harmony (H) game, stag hunt (SH) game, snowdrift (SD) game and prisoner's dilemma (PD) game. All the agents are assigned to the vertices of an L × L square lattice with period boundary, and initially each node is set to be a cooperator (C) or a defector (D) with equal probability. Furthermore, the network is split into N (N = 1, 2, 3, . . .) randomly distribute sub-populations, each of which contains therefore almost the same fraction of cooperators and defectors initially. For the sake of illustration, we here focus on the case of N = 2, where four kinds of players can be identified: cooperators in population I (I C ), defectors in population I (I D ), cooperators in population II (II C ), and defectors in population II (II D ).
In our work, a PD game is played when two interactive players belong to the same sub-population, whereas a different game is played by two agents belonging to different sub-populations. As for the payoff matrix, two cooperators receive a reward R = 1, and mutual defection yields a punishment P = 0. When instead a cooperator encounters a defector, the former one receives a sucker's payoff S and the latter earns a temptation to defection T. For simplicity, we define the game in the same sub-population as game 1 (G1), where S = 0 and T = b > 1, namely, a weak PD game (T > R > P S) [55,56]. The corresponding payoff matrix M 1 is given by Similarly, the game between sub-populations is game 2 (G2) which is specified by the choice of the parameters (T, S): a PD game for (T > 1, S < 0, T > R > P S), a stag hunt (SH) game for (T < 1, S < 0, R > T > P > S), a snowdrift (SD) game for (T > 1, S > 0, T > R > S > P) and a harmony game (HG) for (T < 1, S > 0, R > T, S > P). The corresponding payoff matrix M 2 is given by We performed Monte Carlo (MC) simulations, where each individual has the ability of imitating the strategy and the population tag of its neighbors over time according to payoff difference. For simplicity, the four possible strategic states of a generic player x are represented as I C : S x = (1, 0, 0, 0) T , I D : S x = (0, 1, 0, 0) T , II C : S x = (0, 0, 1, 0) T , II D : S x = (0, 0, 0, 1) T , so that a unique payoff matrix M can be defined, given by The elementary step of the MC simulation is performed as follows. First, a randomly selected player, say i, obtains its payoff by interacting with its four nearest neighbors, where Ω i is the neighborhood of player i. Next, a randomly chosen neighbor of player i, say j, is considered, for which the payoff is calculated as P j = k∈Ω j S T k MS k . Finally, player i copies the strategy and population tag of player j with probability where K is the parameter accounting for the agent's irrationality. To confront our study with a well established framework, we here set K = 0.1 [45]. During one full MC step, each player has the chance of adopting the neighbor's strategy and tag once on average. The results presented below were obtained on square lattices with degree k = 4; the network size was varied from L = 200 to L = 600 (in order to avoid finite size effects) and the equilibration required up to 100 000 MC steps.

Results
We first study how the game (i.e. G2) between sub-populations affects the total frequency of cooperation in the whole population. To this aim, we need to explore the influence of T and S separately, which directly decide the game type of G2. In figure 1(a), it is clearly unveiled that the social dilemma will be easily resolved with increase of S, and the survival of cooperation will become easier. When S 0, G2 will be a PD game between sub-populations, whose dilemma strength is even stronger than that of weak PD version G1 (i.e. S = 0) (dilemma strength is usually used to measure survival status of cooperation, please refer to [7]). For S = 0 and T = 1, the critical threshold b C , marking the extinction of cooperation, equals to 1.07. For negative S, cooperation even vanishes from b = 1.0. However, positive S will effectively reserve the disadvantageous situation of cooperation, because G2 between sub-populations become SD game. Thus, it seems interesting to state that interactions between sub-populations decide the evolution of cooperation in the whole population, which goes beyond the traditional definition of network reciprocity [4]. If dilemma strength of G2 (between sub-populations) is stronger, it will diminish the benefits of network reciprocity of each single sub-population; on the contrary, the reciprocity of G2 (in SD game) will insert continuous cooperators to each sub-populations and thus avoid the excessive exploration of free-riders. We now turn our attention to the influence of T in figure 1(b). It is clear that smaller T makes G2 become harmony (H) game (whose equilibrium state (ES) and evolutionary stable state (ESS) are complete cooperation), and thus guarantees the total cooperation frequency even if the internal environment of each sub-population is extremely unfavorable. While for larger T (T 1.0), it is similar to the case of negative S in figure 1(a), stronger dilemma strength between sub-populations even destroys the whole environment of cooperation. Next, to give further explanations of these interesting phenomena, we examine how the game model between sub-populations affects the organization of cooperation (clusters) from the perspective of microcosmic snapshots. Figure 2 depicts a prepared initial distribution of four strategies on networks, where each strategy occupies one quarter of networks, and deep (light) blue and deep (light) red In each sub-population, the larger the value of b, the higher competition defectors have, thus no matter which model in G2, defectors can invade cooperators if they have same sub-population tags. While, different models in G2 lead to different results. For SH model, because cooperators in sub-population I (II) have a little advantages than defectors in sub-population II (I), resulting light (deep) blue area invades deep (light) red area. However, cooperators cannot form stable clusters with defectors in another sub-population before the fast invasion of defectors in the same population, which leads to a full defection state. When G2 is harmony game, although defection clusters are more competitive than cooperation clusters when they belong to the same sub-population, cooperators have much competition than defectors when they belong to different sub-populations. Accordingly, a cyclic dominance of four strategies I D → I C → II D → II C → I D arises on the network, cooperators can offset the invasion by invading another sub-population. For SD game, an interesting phenomenon is that the players organized into two large mixed clusters by themselves, i.e. I C + II D and II C + I D clusters. The boundary of each mixed cluster are full of defectors, so cooperators can be protected by another sub-population's defectors, thus averting the exploitation by defectors in the same sub-population. It is worthy mentioning that there is no difference between the two mixed clusters (I C + II D cluster and II C + I D cluster), random imitation occurs at the interface of them. With the increase of MC steps, one of the clusters will occupy the whole network, i.e. only left I C + II D or II C + I D clusters. For PD game, cooperators no longer have larger competition when they interactive with defectors belong to another sub-population, I C and II C will be exploited not only by I D , but also II D , which leads to a full defection state. In short, asymmetric interactions provide more beneficial circumstance for cooperators to survive on sub-population network. Different models in G2 lead to different dynamic characteristics of cooperation, especially SD game or H game generates more benefit for the evolution of cooperation.
In order to have an intuitive understanding of different phases according to different parameters, we obtained phase diagram of T-S panel for different temptation to defect of G1 in figure 3. The results both consist of four phases when b = 1.1 and 1.9, cooperators only (all C), defectors only (all D), cooperators and defectors which belong to different sub-populations (I C + II D or II C + I D ), four strategies coexistence phase (I C + II C + I D + II D ). For small b [panel (a)], all C phase occupies a large parameter space, especially in the harmony game area (left top corner). The coexistence phase of the four strategies has a narrow area in the middle of panel (a). In the SD region, participants on the network self-organize to produce stratification at the beginning of evolution stag as shown in the third line of figure 2. Finally, there will be left only I C + II D or II C + I D . For all D phase, it is always a combination of large T and small S, especially G2 is prisoner's dilemma game. When b increases to 1.9 [panel (b)], the number of the phase is same with that of panel (a), except that the greater the dilemma strength of G1, the larger the all D phase area, and the smaller the all C phase area. In particular, harmony game and snowdrift game which have lower dilemma strength in G2 are more conductive to the existence of cooperators. Compared panels (a) and (b), it is clear that large dilemma strength in G1 impedes the maintenance of cooperation on networks, but the small dilemma strength in G2 is beneficial for the existence of cooperation.
To evaluate the influence of T and S, we present a quantitative analysis of cooperators dynamic with small b = 1.  cooperation always decreases with the increase of T, the same conclusion as reference [7], the larger the dilemma strength, the lower level of cooperation. However, if there is no asymmetric interaction between two sub-populations, cooperators will diminish when T is larger than 1.1. Considering the suckers payoff S, larger S leads higher competition to cooperators when they encounter defectors. Thus, cooperation can be promoted effectively with the increase of S. Panel (b) depicts the frequency of cooperation as a function of S, it has the same results as panel (a) obtained. The larger value of S (T), the higher (lower) frequency of cooperation. When we consider a large value of b, an interesting result emerges in panel (c). For fixed S, cooperation usually decreases with the increase of T. But for S = 0.5 or 1, cooperation rate has a rise and fall phenomenon as T increases. That is because the phase transition happens at this stage, clearly description will be shown in figure 5. While before that, it seems to be meaningful to sum these interesting phenomena up to now. When we fix dilemma strength in G1, small dilemma strength in G2 can guarantee high cooperation level. However, with the heterogeneous interactions between asymmetrical sub-populations, large dilemma strength cannot always lead to low cooperation level, which depends on the phases of the system Similar to figure 4, here we also describe how cooperative evolves with four-strategy circumstance (see figure 5). The background colors of light yellow, white, light green and light blue represent the bistable state of all C, the coexistence of the four strategies, the bistable state of I C + II D or II C + I D and the bistable state of all D, respectively. In panel (a), the cooperation-dominated network will reach full I C or II C states in the light yellow region, there only shows one of two states. When T falls into the white area, I C , II C , I D and II D exist simultaneously in the network together. As T increases, the cooperation rate in the light green area   suddenly increases dramatically as the phase shifts to the I C + II D or II C + I D state (we only show II C + I D here). When we fix T = 1.5 [panel (b)], small S (light blue region) will cause all D phase to appear, which contains full I D or II D state. For larger S (light green area), cooperators can coexist with defectors that belong to the other sub-population (I C + II D or II C + I D ) through participating in SD game. Similar phase transition occurs on the network when b = 1.9 [panels (c) and (d)].
Finally, it is also interesting to validate the robustness of our observations on different topologies, i.e. we consider the evolution of cooperation on Erdős, Rényi (ER) network [57] with 10 000 nodes and average degree k = 4 in figure 6(a) more complex scenario which contains three sub-populations in figure 7. On ER network, similar results are obtained, i.e. cooperation increases with S, while decreases with T. In  In the case of three sub-populations, the dynamics on the network are similar to the situation of two sub-populations, which can be explored by the evolution snapshots of six type players with prepared initial distribution. From deep to light colors, they represent population I, II and III, respectively. Blue and red color represent defectors and cooperators, respectively. G2 is SD game between sub-population I and II, the interface of them exhibits the coexistence of cooperators and defectors, harmony game occurs between II and III, the interface exhibits cyclic dominance of four strategies, SH game occurs between III and I, the interface exhibits defectors dominant state. At last, as the evolution of different strategies, cooperators in sub-population II and defectors in sub-population I coexist on the whole network. particular, the rise and down phenomena of cooperation rate in figure 6(b) are caused by phase transition, which has been clearly described in figure 5. When we consider three sub-populations on the grid network, it contains six types of player initially, thus the payoff matrix can be rewritten as, where T 1 , T 2 , T 3 represent defector's payoff when they meet cooperators belonging to another sub-population, S 1 , S 2 , S 3 represent cooperator's payoff when they meet defectors belonging to another sub-population. Here, the network consists of three scenarios of interaction between different sub-populations, leading to a more complex dynamic. In figure 7, we consider a prepared structure, each sub-population can interact with another sub-population sufficiently. From left to right, the time steps are 0, 100, 200, 500 and 49 999, respectively. We consider different configuration of T-S value, SD game (T 1 = 1.5, S 1 = 0.5) between sub-population I and II, SH game (T 2 = 0.5, S 2 = −0.5) between sub-population I and III, harmony game (T 3 = 0.5, S 3 = 0.5) between sub-population II and III. The dynamics vary between different sub-populations, similar to the results in figure 2. Cyclic dominance of four strategies occurs in sub-population II and III, and SH game between sub-population I and II soon lapse into a full defectors state [as shown in the right column of panel (c)], but it will be invaded by the cycle dynamics. While, SD game between sub-population I and II is more competitive than the other two games, resulting in the coexistence of cooperators and defectors belonging to sub-population II and sub-population I finally.

Conclusion
To sum up, we have introduced a fundamental mathematical model that removes two crucial restrictions in the theoretical description of evolutionary dynamics of cooperation. In the first place, it takes into account the fact that the evolution within a particular population does not unfold in isolation. As such, our model consists of sub-populations that interact with one another. Secondly, it takes into account that the evolutionary rules within a particular sub-population are different than at the interfaces where players with different origin meet. This gives rise to an asymmetry in interactions between players that belong to the same sub-population and between players that belong to different sub-populations. In particular, we have considered a setup where two players that belong to the same sub-population play the prisoner's dilemma game, while they play either the harmony game, the snowdrift game, the stag-hunt game, or the prisoner's dilemma game, if they belong to different sub-populations. What game it is exactly is determined by only two key parameters of the introduced mathematical model, which thus smoothly interpolates amongst several different scenarios that may apply in real life. The removal of these restrictions reveals fascinating evolutionary dynamics that exposes new and generally valid paths to cooperation in social dilemmas. In particular, we have performed detailed and systematic Monte Carlo simulations of the proposed mathematical model, obtaining complete phase diagrams for the key system parameters. We have observed continuous and discontinuous phase transitions to absorbing and mixed states, and we have also determined the elementary properties of the governing microscopic dynamics that is responsible for these transitions. Highlights include observing the spontaneous emergence of cyclic dominance, wherein cooperators in one sub-population become predators of defectors in the other sub-population. This occurs when players from different sub-populations engage in the harmony game, where due to the lenient conditions in terms of the absence of a social dilemma the reverse invasions-cooperators invading defectors-across the two sub-populations become possible. This phenomenon is obviously different with the results in [41], which only considers intrapopulation imitation. Furthermore, in our research, interpopulation interactions depend on the density of each sub-population and clusters of players on the network, and evolve dynamically with different strategies. Each type of player can compete with other three types of players for living space, so the evolutionary dynamic occurs in any two types of players. In such a circumstance, network reciprocity provides an opportunity for cooperators to outperform or coexist with defectors from another sub-population. Similarly, when players from different sub-populations engage in SD game, especially for large temptation and sucker's payoff, cooperators and defectors will dominant population I and population II, respectively, i.e. I C + II D or II C + I D state, which is similar to polarized strategic probability densities raised in [41]. It corresponds to self-organized segregation among the two sub-populations, whereby both of them can sustain a mixed state of cooperators and defectors. We have shown that this is due to the fundamental differences in the microscopic dynamics that govern the prisoner's dilemma and the snowdrift game. While in the later case we have short-range order based on a checkerboard pattern, in the former we have long-range order that is due to the comparatively slow growth of compact cooperative domains. As a general conclusion, we have shown that the lower dilemma strength between sub-populations, the more abundant the cooperative strategy in the entire population.
We have confirmed all the main results of Monte Carlo simulations with the pair approximation method, obtaining indeed good qualitative agreements with this analytical approach. The latter in turn further generalizes our findings and significantly relaxes the conditions under which they may apply, as indeed the pair approximation method has very lenient restrictions in terms of the topology of the interactions. As depicted in appendix part, pair approximation only considers a nano structure, and the frequency of strategy pairs. Thus it can only predict the trend of cooperation, but it fails to adequately illustrate the formation of small clusters of cooperators (like figure 2). We have also confirmed the robustness of our results by considering a third sub-population and other network structure, where again the originally observed evolutionary dynamics prevailed.
Interactive diversity brings higher strategic complexity and is accompanied by more dynamical relations, which is conductive to the existence of diversity of strategies. Just like in reference [58], when individual behavior is democratically decided by vote, it is more conducive for people to adopt cooperative strategies. Reference [59] argued that under the influence of reputation and third-party punishment, considering various interaction factors is conducive to people's choice of prosocial behavior. As illustrated in references [60,61], interactive diversity appears when people face different interaction partners, here we argued that the asymmetric interactions can lead to complex types of strategies, and the diverse evolutionary dynamics of different strategies contribute to the coexistence of strategies and the maintenance of cooperators. In particular, when dilemma strength of G1 is fixed (even if the dilemma strength is very high), the smaller G2 dilemma strength is, the more favorable it is for cooperation. In addition to the theoretical contributions that transcend established boundaries of cooperator survival in social dilemmas, we believe our research also opens up the path towards more realistic experimental research on cooperation, which has in recent years gained steadily on momentum [62][63][64]. It also reinvigorates the importance of cyclic dominance beyond its traditionally considered key role in mathematical description of biodiversity [65][66][67][68][69][70], and it invites us towards further leveraging statistical physics to improve our understanding of cooperation in the broadest possible sense [71]. We hope that the newly proposed mathematical model and the interesting insights it affords towards transcending established boundaries of cooperator survival in social dilemmas will succeed in motivating future research along similar lines.

Appendix. Pair approximation
On a square lattice, each individual has four nearest neighbors. The focal player A can initially choose one of the four strategies, namely I C , I D , II C or II D . From the perspective of a mean-field approximation, the probability of the strategy of the focal site is equal to the frequency of that in the network. In particular, each strategy has the same fraction initially, that is F I C = F I D = F II C = F II D = 0. 25. In what follows, however, we outline the pair approximation method. The model contains the following elementary steps. Initially, the focal player A obtains the payoff by interacting with its four neighbors x, y, z and B. Secondly, a randomly chosen neighbor of player A, say B, obtains its payoff in the same way. Finally, A imitates the strategy of B according to the probability defined by equation (2).
As each strategy has initially the same probability, the links of different strategies satisfy P I C ,I C = P I D ,I D = P II C ,II C = P II D ,II D = P I C ,I D = P I D ,I C = P I C ,II C = P II C ,I C = P I C ,II D = P II D ,I C = P I D ,II C = P II C ,I D = P I D ,II D = P II D ,I D = P II C ,II D = P II D ,II C = 1/16. By using the pair approximation method, the temporal evolution of the different links can be explicitly described. For brevity, we only report three of the sixteen equations, because all others have analogous form.
Here N I C and N I D denote the number of cooperators and defectors in the neighborhood of the focal player A, respectively. P I D A is the payoff of the player A, calculated according to equation (1) when it acts as a defector, and P I C B is the payoff of the player B when it acts as a cooperator. W[] is the probability that the focal player A imitates the strategy of the player B, according to equation (2). After calculating the frequency of each link, the fraction of the four competing strategies can be determined as follows.
F I C = P I C ,I C + 2 * P I C ,I D + + 2 * P I C ,II C + 2 * P I C ,II D , F I D = P I D ,I D + 2 * P I D ,I C + + 2 * P I D ,II C + 2 * P I D ,II D , F II C = P II C ,II C + 2 * P II C ,I C + + 2 * P II C ,I D + 2 * P II C ,II D , F II D = P II D ,II D + 2 * P II D ,I C + + 2 * P II D ,I D + 2 * P II D ,II C .
The results of this pair approximation method are shown in figures 1 and 4, and are indeed in very good agreement with the results of our Monte Carlo simulations.