The Structure of Mutations and the Evolution of Cooperation

Evolutionary game dynamics in finite populations assumes that all mutations are equally likely, i.e., if there are n strategies, a single mutation can result in any strategy with probability 1/n. However, in biological systems it seems natural that not all mutations can arise from a given state. Certain mutations may be far away, or even be unreachable given the current composition of an evolving population. These distances between strategies (or genotypes) define a topology of mutations that so far has been neglected in evolutionary game theory. In this paper we re-evaluate classic results in the evolution of cooperation departing from the assumption of uniform mutations. We examine two cases: the evolution of reciprocal strategies in a repeated prisoner's dilemma, and the evolution of altruistic punishment in a public goods game. In both cases, alternative but reasonable mutation kernels shift known results in the direction of less cooperation. We therefore show that assuming uniform mutations has a substantial impact on the fate of an evolving population. Our results call for a reassessment of the "model-less" approach to mutations in evolutionary dynamics.

Copyright: © 2012 García, Traulsen. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The authors have no support or funding to report.


A Payoff matrix in repeated games with minimal memory strategies
To derive the payoff matrix for all strategies, we compute the normalized payoff, given by

$$\Pi = (1-\delta) \sum_{t=0}^{\infty} \delta^{t} \pi_{t},$$

where $\pi_t$ is the one-shot payoff obtained in round $t$ and δ is the probability that the game continues for another round. The entries of the complete payoff matrix are written in terms of auxiliary functions that correspond to different cases, as follows.

Function A: Distinct behaviour in the first round
Consider the case of ALLC playing against STFT. In the first round ALLC cooperates and STFT defects, getting one-shot payoffs S and T respectively. In subsequent rounds they both cooperate, and each player gets R per interaction. Function A arises whenever a strategy receives payoff x in the first round of the repeated game and payoff y in all subsequent rounds. It is given by

$$A(x, y) = (1-\delta)\,x + \delta\,y.$$

Note that when δ = 0 only one interaction takes place, so only x matters. Likewise, when δ = 1 the game is played forever and the first-round payoff vanishes.
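As a quick numerical check of function A, the sketch below sums a discounted payoff stream directly and compares it with a closed form. It assumes the normalization (1 − δ) Σ δ^t π_t, which is consistent with function C in this appendix, and the closed form A(x, y) = (1 − δ)x + δy implied by the two limits described above. The function names and the truncation horizon are illustrative choices, not taken from the original.

```python
def normalized_payoff(stage_payoff, delta, horizon=5000):
    """Truncated (1 - delta) * sum_t delta**t * stage_payoff(t), for 0 <= delta < 1."""
    total = 0.0
    weight = 1.0  # holds delta**t, updated iteratively
    for t in range(horizon):
        total += weight * stage_payoff(t)
        weight *= delta
    return (1.0 - delta) * total


def function_a(x, y, delta):
    """Payoff x in the first round, payoff y in every later round (assumed closed form)."""
    return (1.0 - delta) * x + delta * y


# Standard parameters quoted in Section B.
R, S, T, P = 3.0, 0.0, 4.0, 1.0
delta = 0.9

# ALLC against STFT: ALLC gets S in round one and R afterwards.
allc_vs_stft = normalized_payoff(lambda t: S if t == 0 else R, delta)
# STFT against ALLC: STFT gets T in round one and R afterwards.
stft_vs_allc = normalized_payoff(lambda t: T if t == 0 else R, delta)

print(allc_vs_stft, function_a(S, R, delta))
print(stft_vs_allc, function_a(T, R, delta))
```

The truncated sum and the closed form should agree up to floating-point error as long as δ is far enough below 1 for δ^horizon to be negligible.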
Function B: Distinct behaviour in the first two rounds
In the previous case one round was enough for the strategies to settle on one action. Now consider the case of TFT playing against SALLC. Here the strategies need two rounds before settling. In the first round they play C and D respectively. In the second round, TFT switches to defection and SALLC starts cooperating. From the third round on, both cooperate. More generally, a strategy gets payoff x in the first round, payoff y in the second round, and payoff z from then on. This means that the payoff has the form

$$(1-\delta)\left(x + \delta y + \delta^{2} z + \delta^{3} z + \dots\right).$$

Simplifying, we get an expression for function B:

$$B(x, y, z) = (1-\delta)(x + \delta y) + \delta^{2} z.$$

Function C: Actions cycle with period two
In the previous two cases, strategies settle on one action after one or two rounds. Actions can also cycle, with strategies repeating themselves every few rounds. Consider the case of TFT playing against STFT. In the first round they cooperate and defect respectively. In the second round each copies the other's previous action, so they defect and cooperate respectively. The strategies never settle on one action; they fail to coordinate, and the cycle repeats forever. The general form for this payoff is given by

$$(1-\delta)\left(x + \delta y + \delta^{2} x + \delta^{3} y + \dots\right).$$

When simplified, we get function C:

$$C(x, y) = \frac{x + y\delta}{1 + \delta}.$$
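The round-by-round behaviour described for functions B and C can be reproduced by simulating the strategies directly. The sketch below assumes each strategy is encoded as a triple (first move, reply to C, reply to D). This encoding and the names `STRATEGIES`, `play`, `discounted_payoff`, and `function_c` are illustrative assumptions chosen to match the behaviour described in the text, and the discounted sum is truncated at a finite number of rounds.

```python
# Assumed encoding: (first move, reply to opponent's C, reply to opponent's D).
STRATEGIES = {
    "TFT": ("C", "C", "D"),    # cooperates first, then copies the opponent
    "STFT": ("D", "C", "D"),   # defects first, then copies the opponent
    "SALLC": ("D", "C", "C"),  # defects first, then always cooperates
}

# First player's one-shot payoff for (own action, opponent's action).
R, S, T, P = 3.0, 0.0, 4.0, 1.0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def play(name_a, name_b, rounds):
    """Return the list of (action_a, action_b) pairs for the given number of rounds."""
    first_a, on_c_a, on_d_a = STRATEGIES[name_a]
    first_b, on_c_b, on_d_b = STRATEGIES[name_b]
    a, b = first_a, first_b
    history = []
    for _ in range(rounds):
        history.append((a, b))
        # Both players react simultaneously to the other's previous action.
        a, b = (on_c_a if b == "C" else on_d_a), (on_c_b if a == "C" else on_d_b)
    return history


def discounted_payoff(history, delta):
    """Truncated normalized payoff of the first player: (1 - delta) * sum delta**t * pi_t."""
    return (1.0 - delta) * sum(delta**t * PAYOFF[pair] for t, pair in enumerate(history))


def function_c(x, y, delta):
    """Closed form for a period-two cycle alternating payoffs x and y."""
    return (x + y * delta) / (1.0 + delta)


delta = 0.9
print(play("TFT", "SALLC", 4))  # settles on mutual cooperation from round three
print(play("TFT", "STFT", 4))   # alternates between (C, D) and (D, C)
print(discounted_payoff(play("TFT", "STFT", 2000), delta), function_c(S, T, delta))
```

Simulating the actions rather than hard-coding the payoff stream makes it easy to see which auxiliary function applies to a given pair: the history either settles after one or two rounds or enters a cycle.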
Function D: Actions cycle with period four
Finally, we can get longer cycles. Consider the case of TFT playing against TFT⁻¹. Both strategies start by cooperating in the first round, each getting R. In the second round TFT copies her opponent's last action, sticking to cooperation, while TFT⁻¹ reverses her opponent's action, switching to defection; TFT gets S and TFT⁻¹ gets T. In the third round TFT switches to defection, and TFT⁻¹ reverses the last action of her opponent, playing D; they both get P on mutual defection. In the fourth round TFT sticks to defection and TFT⁻¹ switches to cooperation, getting T and S respectively. In round five they both cooperate, starting the cycle again.
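A closed form for function D follows from the same geometric-series argument as function C. As a sketch, assuming the normalization (1 − δ) Σ δ^t π_t and writing w, x, y, z for the payoffs in the four rounds of the cycle (symbols chosen here for illustration):

```latex
\begin{align*}
D(w, x, y, z)
  &= (1-\delta)\left(w + \delta x + \delta^{2} y + \delta^{3} z\right)
     \sum_{k=0}^{\infty} \delta^{4k} \\
  &= \frac{(1-\delta)\left(w + \delta x + \delta^{2} y + \delta^{3} z\right)}{1-\delta^{4}} \\
  &= \frac{w + \delta x + \delta^{2} y + \delta^{3} z}{1 + \delta + \delta^{2} + \delta^{3}}.
\end{align*}
```

Under these assumptions, the example above corresponds to D(R, S, P, T) for TFT and D(R, T, P, S) for TFT⁻¹. As with function C, the series only converges for δ < 1.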

B Existence and location of stable mixtures
The accuracy of the theoretical prediction depends on non-homogeneous populations being transient [1]. In particular, for the monomorphic transition matrix to be accurate, we require that a new mutation only arises after the population has fixated [2]. A sufficiently small mutation rate guarantees such accuracy if there are no stable internal equilibria between any two strategies in the game. Since the mutation rate is sufficiently small, the dynamics are confined to the edges of the simplex. This means that we need only inspect all pairs of strategies in the game for possible coexistence. It turns out that there are internal mixed equilibria in the repeated prisoner's dilemma. We can circumvent this difficulty if internal equilibria are close enough to the boundary of a homogeneous state. For any stable mixed equilibrium $\hat{x}$ on an edge, we require that it lie sufficiently close to one of the two homogeneous endpoints of that edge (conditions 9 and 10). When either condition is met, the non-homogeneous states are again guaranteed to be transient, even in the presence of internal equilibria, given a sufficiently small mutation rate (see Figure 1 for an illustration).

Table 1 summarizes the stable mixed equilibria for the game matrix presented in Section A. It is easy to verify that all internal equilibria in Table 1 vanish if δ < 1/3. However, this is a meaningless region of the parameter space, because on average the game is played fewer than two times, rendering repetition ineffective. We therefore require that all internal equilibria be sufficiently close to the boundary of a homogeneous population, as per conditions 9 and 10. For the standard parameters that we analyze (R = 3.0, S = 0.0, T = 4.0, P = 1.0) and N = 50, this is guaranteed whenever

$$\delta \ge \frac{1}{75}\left(24 + \sqrt{2451}\right) \approx 0.980101.$$

This means that the choice δ = 0.99 guarantees the accuracy of the prediction for all conditions in Figure 2 in the main text.
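To make the pairwise check concrete, the sketch below locates the interior equilibrium of one pair of strategies and tests how close it lies to a homogeneous state. Several ingredients are assumptions rather than content of the original: the equilibrium is that of the infinite-population replicator dynamics of a 2×2 game, conditions 9 and 10 are taken to be x̂ ≤ 1/N and x̂ ≥ 1 − 1/N, and the ALLC versus STFT payoffs are derived from the first-round argument in Section A with A(x, y) = (1 − δ)x + δy. The pair is only an illustration and is not claimed to be the one that determines the bound on δ quoted above.

```python
import math

R, S, T, P = 3.0, 0.0, 4.0, 1.0  # standard parameters from the text
N = 50                            # population size from the text


def function_a(x, y, delta):
    """Assumed closed form: payoff x in round one, y in all later rounds."""
    return (1.0 - delta) * x + delta * y


def stable_interior_equilibrium(a, b, c, d):
    """Stable mixed equilibrium of the 2x2 game [[a, b], [c, d]] under replicator dynamics.

    Returns the equilibrium fraction of the first strategy, or None when the
    game has no stable coexistence (which requires b > d and c > a).
    """
    if b > d and c > a:
        return (b - d) / ((b - d) + (c - a))
    return None


def near_boundary(x_hat, n):
    """Assumed form of conditions 9 and 10: within 1/n of a homogeneous state."""
    return x_hat <= 1.0 / n or x_hat >= 1.0 - 1.0 / n


def allc_stft_equilibrium(delta):
    """Fraction of ALLC at the stable mixture of ALLC and STFT, if one exists."""
    a = R                        # ALLC vs ALLC: mutual cooperation
    b = function_a(S, R, delta)  # ALLC vs STFT
    c = function_a(T, R, delta)  # STFT vs ALLC
    d = P                        # STFT vs STFT: mutual defection (derived, not quoted)
    return stable_interior_equilibrium(a, b, c, d)


for delta in (0.2, 0.5, 0.99):
    x_hat = allc_stft_equilibrium(delta)
    status = None if x_hat is None else near_boundary(x_hat, N)
    print(delta, x_hat, status)

# Bound on delta quoted in the text, evaluated numerically.
print((24 + math.sqrt(2451)) / 75)
```

For this illustrative pair the mixture disappears below δ = 1/3, sits in the middle of the edge at δ = 0.5, and moves to within 1/N of the all-ALLC state at δ = 0.99, which mirrors the qualitative argument in the text.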