Double-dealing behavior potentially promotes cooperation in evolutionary prisoner's dilemma games

We investigate the effects of double-dealing behavior on cooperation in evolutionary games. Each individual in a population has two attributes: character and action. One's action may be consistent with one's character or not. We provide analytical results by a mean-field description of evolutionary prisoner's dilemma games (PDGs). Moreover, we give numerical results on different networks, ranging from square lattices to scale-free networks (SFNs). Two important conclusions have been drawn from the results on SFNs. Firstly, if only non-influential individuals (those with low degrees) have chances of becoming double-dealers, cooperation is certain to deteriorate. Secondly, when influential individuals (those with high degrees) adopt double-dealing behavior moderately, cooperation would be enhanced, which is in opposition to the traditional belief. These results help us to understand better the social phenomenon of the existence of double-dealers. In addition to the PDG, other types of games including the snowdrift game, the stag-hunt game and the harmony game have also been studied on our model. The results for these three games are also presented, which are consistent with the results for the PDG qualitatively. Furthermore, we consider our model under the co-evolution framework, in which the probability of an individual changing into a double-dealer and the individual strategy both could evolve during the evolutionary process.

Zimmermann et al [27]. In their work, unsatisfied individuals have chances of severing D-D links and creating new links by choosing other individuals randomly from the whole population. In a later work by Ebel and Bornholdt [28], all individuals have opportunities to change their neighbors for maximizing their own payoffs. Inspired by the seminal contributions of these authors, many co-evolutionary rules have arisen [29]- [33]. Further issues in co-evolutionary games, such as that the impact of co-evolutionary rules may depend strongly on the time scales associated with the strategy and structure evolution, have also been considered [34]- [36]. Moreover, some other features of individuals, such as individual noise level, could evolve with individual strategy to help reach a higher individual payoff [37,38].
In these previous studies, one question has not been taken into account. During the evolution, each individual has a character as a cooperator or a defector, yet she may adopt an action opposite to her character when participating in games. When this happens, we call the behavior double-dealing and the individual a double-dealer. The question then arises: how does double-dealing behavior affect cooperation in evolutionary games? This phenomenon is not at all strange to us. Double-dealers do appear in our lives, and sometimes their deeds are completely at odds with their reputation. First, we consider persons with the character of a cooperator who defect in games. Two reasons may lead to such double-dealing behavior. One is that some persons are really dissemblers who want to gain a good reputation and at the same time grab considerable benefits by defection. The other is that some persons are indeed ready to cooperate, but act erroneously by accident owing to a lack of caution in interactions. Despite the different causes, double-dealing behavior leads to the same result: double-dealers are regarded as cooperators by others, yet choose to defect when participating in games. Then we consider the inverse case, in which some persons who are regarded as defectors adopt the strategy of cooperation. Again there are two possible reasons. One is that persons with a bad reputation are guided by their conscience and thus change their decisions in favour of cooperation. The other is a lack of caution, since cooperation is not their original intention. The existence of double-dealers is a realistic and complex social phenomenon. The double-dealing behavior of individuals could have an immediate impact on the dynamics of strategy evolution: when a player learns from a double-dealer, she learns the double-dealer's character but not her actual action.
In this paper, we investigate the effects of double-dealing behavior on cooperation in the framework of evolutionary PDGs by considering different types of network structures, from regular lattices to complex networks. The results show that different network structures lead to different outcomes. Moreover, some surprising conclusions emerge from our study. In particular, in SFNs, moderate double-dealing behavior by influential individuals with high degrees could promote the cooperation level of the whole population. In addition, other types of games, such as the snowdrift game (SDG), the stag-hunt game (SHG) and the so-called harmony game (HG), are considered, too.
The rest of this paper is organized as follows. In section 2, we define our model for evolutionary PDGs, in which double-dealing individuals are introduced. In section 3, the mean-field description of the model is given. Numerical results and discussions are presented in section 4, where we apply our model to different networks, including two kinds of square lattices, random networks, regular random networks (RRNs) and SFNs. In section 5, other types of games, including the SDG, SHG and HG, are considered. In that section, we also present some preliminary results for the situation where the double-dealing behavior co-evolves with the strategy pattern. In the last section, we give a summary.

Model
In our model, we first assign each individual two attributes. One is the character as a cooperator or a defector, which is public information. The other is the actual action, either cooperation or defection, during interactions with others. The character and action of an individual may be consistent or opposite, with probabilities that are specified below. In the rest of the paper, the term 'a cooperator' (or 'a defector') means an individual with the character of a cooperator (or a defector).
One generation of the evolutionary dynamics contains two steps: playing games and updating strategies. In the first step, each individual i plays PDGs with all her neighbors simultaneously and accumulates the total payoff p_i, which is the sum over all interactions she participates in. For simplicity but without loss of generality, we set the payoff matrix of the PDG as T = 1 + r, R = 1, P = 0 and S = −r, in which r denotes the ratio of the costs to the net benefits of cooperation [39]. Initially, the character of each individual is randomly assigned as a cooperator or a defector. In the subsequent interactions, any individual may become a double-dealer with a given probability. Specifically, a cooperator defects with probability u when playing games with all her neighbors, whereas a defector cooperates with probability v. In the second step, each individual i randomly chooses one of her neighbors j and then replaces her own strategy s_i with the chosen neighbor's strategy s_j according to the probability [40]

$$W(s_i \leftarrow s_j) = \frac{1}{1 + \exp[(p_i - p_j)/K]}, \qquad (1)$$

where K characterizes the intensity of the noise related to the replacement of strategy; we set K = 0.1 throughout our work. It should be especially pointed out that in the update process, the 'strategy' of an individual means her character, not the actual action adopted by her. Since individuals' actions may not be consistent with their characters, it is possible that an individual learns the strategy of cooperation from a cooperator even though this cooperator neighbor defects in games and selfishly gains great benefit for herself. The opposite situation may also occur. Synchronous strategy update is used in the present work.
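The two steps of one generation can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: it assumes that the replacement probability takes the Fermi form 1/(1 + exp[(p_i − p_j)/K]), and the function names, the dictionary-based network representation and the overflow guard are all hypothetical choices made here.

```python
import math
import random

def pd_payoff(my_coop, other_coop, r):
    """Payoff to the focal player for T = 1 + r, R = 1, P = 0, S = -r."""
    if my_coop:
        return 1.0 if other_coop else -r      # R or S
    return 1.0 + r if other_coop else 0.0     # T or P

def fermi(p_i, p_j, K=0.1):
    """Assumed Fermi rule: probability that i adopts the character of j."""
    x = (p_i - p_j) / K
    if x > 700.0:          # guard against overflow in math.exp
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

def one_generation(neighbors, is_coop_char, u, v, r, K=0.1, rng=random):
    """neighbors: node -> list of neighbors; is_coop_char: node -> bool."""
    # Step 1a: each individual draws one action, used against all neighbors.
    action = {}
    for i, coop_char in is_coop_char.items():
        if coop_char:
            action[i] = rng.random() >= u     # a cooperator defects with prob. u
        else:
            action[i] = rng.random() < v      # a defector cooperates with prob. v
    # Step 1b: accumulate payoffs over all interactions.
    payoff = {i: sum(pd_payoff(action[i], action[j], r) for j in nbrs)
              for i, nbrs in neighbors.items()}
    # Step 2: synchronous update; the character is copied, not the action.
    new_char = dict(is_coop_char)
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue
        j = rng.choice(nbrs)
        if rng.random() < fermi(payoff[i], payoff[j], K):
            new_char[i] = is_coop_char[j]
    return new_char, payoff
```

Calling `one_generation` repeatedly and recording the fraction of `True` values in the returned character map gives the time series ρ t for one realization.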

Mean-field description
We consider an infinite well-mixed population in which a fraction ρ of individuals have the character of a cooperator and a fraction 1 − ρ the character of a defector. By assuming that the characters of a cooperator and a defector are passed to their neighbors according to their relative performance, that is, as compared to the average population payoff $\bar{P} = \rho P_C + (1 - \rho) P_D$, the replicator equation for the dynamics can be written as $\dot{\rho} = \rho(P_C - \bar{P}) = \rho(1 - \rho)(P_C - P_D)$ [2], where $P_C$ is the payoff of an individual with the character as a cooperator (C char ) and $P_D$ that of an individual with the character as a defector (D char ). First, we consider a C char . The payoff can be given by

$$P_C = (1-u)[\rho(1-u) + (1-\rho)v]R + (1-u)[\rho u + (1-\rho)(1-v)]S + u[\rho(1-u) + (1-\rho)v]T + u[\rho u + (1-\rho)(1-v)]P. \qquad (2)$$

The expression is composed of four parts separated by payoff matrix parameters. We explain the first part as an example. For a clear illustration, we present figure 1 to show the interactions of a focal C char in a well-mixed population. According to the description of our model, a C char defects with probability u and cooperates with probability 1 − u. We call her an individual with the action cooperation (C act ) or an individual with the action defection (D act ) according to her actual action. In the rest of the population, the fraction of C char s is ρ; multiplying it by 1 − u gives the fraction of the population made up of C char s who act as C act s. On the other hand, among D char s there also exist C act s, and their fraction of the population is v(1 − ρ). These two groups bring R to a focal player who cooperates. The other terms in expression (2) can be derived by the same method. Similarly, the payoff of a D char is given by

$$P_D = v[\rho(1-u) + (1-\rho)v]R + v[\rho u + (1-\rho)(1-v)]S + (1-v)[\rho(1-u) + (1-\rho)v]T + (1-v)[\rho u + (1-\rho)(1-v)]P. \qquad (3)$$

Given the payoff matrix set in our model, we obtain the following dynamical equation immediately:

$$\dot{\rho} = -r(1 - u - v)\rho(1 - \rho). \qquad (4)$$

This equation has two equilibria, $\rho^*_1 = 0$ and $\rho^*_2 = 1$. Linear stability analysis shows that, under the condition u + v < 1, the equilibrium $\rho^*_1 = 0$ is stable, whereas in the case of u + v > 1, the other equilibrium $\rho^*_2 = 1$ is stable.
Moreover, the equation is invariant under the transformation u → v, v → u. Therefore, in the u-v space, the dynamics are symmetric with respect to u = v. On the other hand, under the transformation u → 1 − v, v → 1 − u, a minus sign appears on the right-hand side of the equation. The equation therefore has an antisymmetry with respect to u + v = 1, which means that the stabilities of the two equilibria $\rho^*_1 = 0$ and $\rho^*_2 = 1$ are exchanged.
Figure 2. In both cases, ρ t reaches its steady state after a transient time of less than 1000 generations. We employ SFNs here, which will be described in detail later. The payoff matrix parameter of PDG is r = 0.2.
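The mean-field argument can be checked numerically. The sketch below builds the two payoffs from the verbal description (a focal player meets a C act with probability ρ(1 − u) + (1 − ρ)v) and integrates the replicator equation with a simple Euler scheme. The function names, step size and number of steps are assumptions made here for illustration only.

```python
def coop_action_fraction(rho, u, v):
    """Fraction of the population whose action is cooperation."""
    return rho * (1.0 - u) + (1.0 - rho) * v

def mean_field_payoffs(rho, u, v, r):
    """Return (P_C, P_D) for T = 1 + r, R = 1, P = 0, S = -r."""
    R, S, T, P = 1.0, -r, 1.0 + r, 0.0
    a = coop_action_fraction(rho, u, v)   # opponent cooperates
    b = 1.0 - a                           # opponent defects
    if_cooperating = a * R + b * S
    if_defecting = a * T + b * P
    p_c = (1.0 - u) * if_cooperating + u * if_defecting
    p_d = v * if_cooperating + (1.0 - v) * if_defecting
    return p_c, p_d

def rho_dot(rho, u, v, r):
    """Right-hand side of the replicator equation."""
    p_c, p_d = mean_field_payoffs(rho, u, v, r)
    return rho * (1.0 - rho) * (p_c - p_d)

def integrate(rho0, u, v, r, dt=0.01, steps=20000):
    """Explicit Euler integration, clipped to the interval [0, 1]."""
    rho = rho0
    for _ in range(steps):
        rho += dt * rho_dot(rho, u, v, r)
        rho = min(1.0, max(0.0, rho))
    return rho
```

With r = 0.2, `integrate(0.5, 0.3, 0.3, 0.2)` decays towards 0 (u + v < 1), while `integrate(0.5, 0.8, 0.5, 0.2)` grows towards 1 (u + v > 1), in line with the stability analysis.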

Numerical results
In this section, the model on different structural networks is studied in detail. We monitor the fraction of C char s ρ and the payoff P avg that is the average over the whole population. It is easy to confirm that the fraction of C act s, denoted by ρ act , is proportional to P avg . It implies that large ρ act leads to high P avg and vice versa. Hence, the information about ρ act could be exactly obtained from P avg . We run 1000 independent realizations and each realization runs for a total of 6000 generations. Each data point in steady state results from the last 2000 generations. In order to confirm the validity of our results under the above conditions, we plot ρ t , averaged over 1000 realizations at any given time, as a function of time, shown in figure 2 for two sets of parameters u = 0.3, v = 0.3 and u = 0.8, v = 0.15. In both cases, ρ t reaches its steady state after a transient time less than 1000 generations. Further simulations show that 4000 generations are sufficient for transient time for the following numerical simulations in this work, even at the parameters where the extinction of cooperation is approached. Let us start from square lattices with periodic boundary conditions and with N = 100 × 100 nodes. Square lattices with degree z = 4 and 8 are employed and the payoff matrix parameter r = 0.01 is set. Contour plots in figure 3 show ρ and P avg in the u-v space on these two square lattices. As shown in figures 3(a) and (c), the results are in good agreement with our analytical prediction above except that the network reciprocity sustains cooperation at u = v = 0. In figure 3(a), ρ = 0 below the diagonal of u + v = 1 and ρ = 1 above, which is symmetric with respect to u = v and antisymmetric with respect to u + v = 1. In figure 3(c) for z = 8, ρ is a decreasing function as u and v increase. Since ρ is not zero at u = v = 0, the antisymmetry about u + v = 1 relates to ρ and 1 − ρ, but not 0 and 1. Meanwhile, the symmetry about u = v remains. 
As shown in figures 3(b) and (d), P avg is only high for small u and large v in comparison with that at u = v = 0. In short, double-dealing behavior damages not only ρ but also ρ act on regular lattices.
Figure 3. Contour plots in the u-v space for the fraction of C char s ρ and the average payoff P avg , on two regular square lattices with degrees z = 4 and 8. (a, b) ρ and P avg for z = 4; (c, d) ρ and P avg for z = 8, respectively. The symmetric features are shown clearly, which are in good agreement with our analytical prediction. Clearly, both ρ and ρ act are damaged by double-dealing behavior on regular lattices. The payoff matrix parameter of PDG is r = 0.01.
Then we turn to random networks which are based on Erdős-Rényi (ER) networks [41]. The system size is N = 5000 and the average degree is z = 8. Links are created by randomly choosing two individuals each time until N z/2 links are connected. Results for ρ and P avg in the u-v space on ER networks are given in figures 4(a) and (b).
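The link-creation procedure described above can be sketched as follows. The text does not say how self-links and repeated links are handled; this sketch assumes that both are rejected, and the function name and the adjacency-list representation are hypothetical.

```python
import random

def random_network(n, z, rng=random):
    """Create n*z/2 links between randomly chosen pairs of individuals."""
    target = n * z // 2
    edges = set()
    while len(edges) < target:
        a, b = rng.sample(range(n), 2)        # two distinct individuals
        edges.add((min(a, b), max(a, b)))     # the set ignores repeated links
    neighbors = {i: [] for i in range(n)}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors
```

For example, `random_network(5000, 8)` gives a network with the size and average degree quoted in the text.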
It can be clearly observed that the symmetry about u = v breaks in figure 4(a), yet the antisymmetry about u + v = 1 still exists. To find the reason for the symmetry breaking, we carry out our model on RRNs, in which each individual has exactly the same degree. Results are shown in figures 4(c) and (d), presenting ρ and P avg , respectively. Both the symmetry and antisymmetry remain in figure 4(c). Comparing figure 4(a) with (c), we conjecture that it is most likely that the degree heterogeneity in ER networks breaks the symmetry about u = v.
Until now, we have considered our model on square lattices, ER networks and RRNs. The results are in agreement with the mean-field description, and no improvement in cooperation was found: double-dealing behavior of individuals is adverse to cooperation on these three types of networks. However, many real networks exhibit the scale-free property in their degree distribution [4]. In the following, we apply the model to a standard SFN suggested by Barabási and Albert [10]. The network starts with m 0 connected individuals. At each step, a new individual with m links is added to the population and preferentially links to those that already have large degrees, until the population size reaches N . In simulations, we set m 0 = m = 3 and the size of population N = 3000. Figure 5 gives the contour plots for ρ and P avg in the u-v space on SFNs. As can be noted in figure 5(a), the symmetries are similar to those on ER networks: the antisymmetry about u + v = 1 remains, whereas the symmetry about u = v breaks owing to the degree heterogeneity of the networks. Furthermore, in figure 5(a), we can see that for a small value of u, ρ changes non-monotonically as v increases. For instance, at u = 0, ρ first increases to a maximum at about v = 0.2 and then decreases as v continues to increase. On the other hand, for a given value of v, ρ is a decreasing function of u. Moreover, we are concerned not only with ρ but also with ρ act . Figure 5(b) shows P avg , which is proportional to ρ act . If we draw a line in figure 5(b) from the point u = v = 0, we find that P avg changes non-monotonically as the distance from u = v = 0 along the line increases, unless the slope is too small. This means that, when double-dealing behavior exists with small probabilities, both ρ and ρ act are promoted.
It is worth mentioning that u ≠ 0 means that C char s defect with a certain probability u, which can nevertheless affect the level of cooperation positively.
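The growth rule of the SFN used above can be illustrated with a short sketch of preferential attachment. The text only states that the network starts with m 0 connected individuals; the sketch assumes that these seed individuals are fully connected to each other and that a new individual never links twice to the same target. The function name and the stub-list sampling trick are illustrative choices, not taken from the source.

```python
import random

def barabasi_albert(n, m0=3, m=3, rng=random):
    """Grow a network of n nodes by degree-proportional attachment."""
    if m0 < 2 or m > m0:
        raise ValueError("this sketch needs m0 >= 2 and m <= m0")
    neighbors = {i: set() for i in range(n)}
    stubs = []   # every node appears once per link end, i.e. 'degree' times
    for a in range(m0):                       # assumed: fully connected seed
        for b in range(a + 1, m0):
            neighbors[a].add(b)
            neighbors[b].add(a)
            stubs += [a, b]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:               # m distinct targets
            targets.add(rng.choice(stubs))    # chosen with prob. ~ degree
        for t in targets:
            neighbors[new].add(t)
            neighbors[t].add(new)
            stubs += [new, t]
    return neighbors
```

Sampling uniformly from the list of link ends is a common way to realise degree-proportional selection without computing probabilities explicitly.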
Before further explaining for figure 5, let us discuss some special cases (u = 1, v = 0; u = 0, v = 1; u = v = 1/2). In the case of u = 1 and v = 0, all individuals adopt the action of defection, which is exactly the lowest level of ρ act . Since P avg is proportional to ρ act , the lowest P avg should be obtained. In the case of u = 0 and v = 1, in contrast, all individuals adopt the action of cooperation, which is the highest level of ρ act , and consequently the highest P avg should be achieved. As for another special case, u = v = 1/2, individuals adopt cooperation or defection with equal probability, that is, they act randomly, which will lead to a moderate average payoff between the maximum and the minimum. As shown in figure 5(b), P avg is the lowest at the lower right part (including the point u = 1, v = 0) and the highest at the upper left part (including the point u = 0, v = 1), whereas at u = v = 1/2, P avg is on a moderate level corresponding to moderate ρ act , which are in agreement with these special cases. In addition, not only on SFNs but also on other types of networks, square lattices, ER networks and RRNs, contour plots for P avg , which is proportional to ρ act , are in good agreement with these special cases (see figures 3 and 4).
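The link between P avg and ρ act used in this discussion can be verified directly. For T = 1 + r, R = 1, P = 0 and S = −r, a link contributes 2 to the total payoff if both ends cooperate, 1 if exactly one does (T + S = 1) and 0 otherwise, so the total payoff equals the summed degree of the individuals who cooperate. On a network with uniform degree z this gives P avg = z ρ act exactly; on heterogeneous networks the same identity holds with each cooperating individual weighted by her degree. The sketch below checks this; the function names are hypothetical.

```python
def payoff(my_coop, other_coop, r):
    """Payoff to the focal player for T = 1 + r, R = 1, P = 0, S = -r."""
    if my_coop:
        return 1.0 if other_coop else -r
    return 1.0 + r if other_coop else 0.0

def average_payoff(edges, coop, n, r):
    """Average accumulated payoff per individual."""
    total = 0.0
    for a, b in edges:
        total += payoff(coop[a], coop[b], r) + payoff(coop[b], coop[a], r)
    return total / n

def degree_weighted_cooperation(edges, coop, n):
    """Summed degree of cooperating individuals, divided by n."""
    total = 0
    for a, b in edges:
        total += int(coop[a]) + int(coop[b])
    return total / n
```

On a ring (z = 2), for instance, `average_payoff` returns twice the fraction of cooperating individuals, whatever the value of r.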
The phenomena in figure 5 can be explained as follows. First, we consider ρ for any given u. When v is small, there are only a few D char s who cooperate and get few payoffs, which is adverse to the spread of D char and so ρ increases. As v increases, more D char s cooperate. With high probability, two connected D char s cooperate at the same time and bring themselves more payoffs, which is advantageous to the spread of D char and so ρ decreases. In addition, the increase of ρ is fast at small v and the decrease is slow at large v. Then, we consider ρ act . For simplicity, we let u = 0. The change of ρ act against v can be explained by the equation ρ act = ρ + (1 − ρ)v. For small v, ρ act is mainly determined by the first term of the equation, ρ, so it first increases and then decreases. For large v, ρ act is determined by the second term, (1 − ρ)v. As ρ, which is nearly invariable for large v, is taken into account, we can see that ρ act increases to a high level due to a large value of v.
To further understand the results in figure 5, we modify the model by introducing a degree threshold k th that determines which individuals can be double-dealers. As is well known, hubs play a crucial role in maintaining cooperation in SFNs. In order to gain a clear picture of the different roles of hub double-dealers and non-hub double-dealers, we consider the following two situations. Firstly, we assume that only individuals whose degrees are less than k th have chances of becoming double-dealers; under this condition most individuals could become double-dealers, as a great number of low-degree individuals exist in SFNs. Secondly, we assume that only individuals whose degrees exceed k th are capable of becoming double-dealers. We set k th = 30 for simulations, and approximately 1% of individuals have degrees over k th . Figure 6 gives contour plots in the u-v space for ρ and P avg in these two modified models. In figure 6(a), ρ decreases with an increase of u for a given v. In figure 6(b), P avg also changes in a way similar to figure 5(b) along an oblique line from u = v = 0. Furthermore, we also note significant differences between these two groups of figures. The blue region in the plot for ρ in figure 6(a) extends remarkably, and figure 6(b) shows P avg also getting worse, in comparison with figures 5(a) and (b), respectively. The reason for the worse results is that double-dealing behavior is prohibited in hubs. In other words, allowing only non-hub individuals to become double-dealers harms cooperation.
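The two threshold situations amount to switching u and v off for the individuals who are not allowed to become double-dealers. A minimal sketch, with hypothetical function and mode names, could look as follows; following the wording above, an individual whose degree equals k th is eligible in neither situation.

```python
def may_double_deal(degree, k_th=30, mode="high"):
    """mode 'low': only degrees below k_th; mode 'high': only degrees above."""
    if mode == "low":
        return degree < k_th
    if mode == "high":
        return degree > k_th
    raise ValueError("mode must be 'low' or 'high'")

def effective_probabilities(degree, u, v, k_th=30, mode="high"):
    """Return the (u, v) actually applied to an individual of this degree."""
    if may_double_deal(degree, k_th, mode):
        return u, v
    return 0.0, 0.0     # action always matches character
```

In a simulation, these effective probabilities would replace the global u and v when each individual draws her action.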
When we turn our attention to figures 6(c) and (d), we find two significant differences in comparison with figures 5(a) and (b). We first focus on ρ and P avg against v. For large v, whatever the value of u, the red color corresponding to a high level of cooperation occupies the top of the two plots for ρ and P avg , the latter being proportional to ρ act . For a given u, ρ increases with v through the same mechanism as in figure 5(a), but for large values of v, ρ does not decrease, so the change is monotonic. This is the first difference between figures 6(c) and 5(a). It can be explained as follows. In figure 6(c), even for large v, only hubs can become double-dealers, so that few connected D char s cooperate simultaneously; the mechanism by which D char s increase their payoffs in figure 5(a) is therefore not supported. Thus ρ does not decrease for large v in figure 6(c). The first difference also gives a clue as to why ρ and ρ act get worse in figures 6(a) and (b) compared with figure 5. In figures 6(a) and (b), the threshold k th allows only non-hubs to perform double-dealing behavior, so hubs are excluded. As a result, the effects of hubs being double-dealers are cut off and consequently ρ and ρ act get worse. Moreover, as shown in figures 6(c) and (d), for small v (for instance v = 0), ρ and ρ act first increase and then decrease against u. This is the second difference from figure 5, where ρ and ρ act always decrease with an increase of u. In figure 5, an essential condition for promoting cooperation in character (or in action) is v ≠ 0; that is, the existence of double-dealing behavior of D char s is required, whereas in figures 6(c) and (d), even for v = 0, cooperation can be promoted by increasing u.
The second difference is of great significance and reveals a meaningful real-world phenomenon: double-dealing behavior by influential persons, whose roles are the same as those of hubs in SFNs, has positive effects on cooperation in society. In particular, contrary to the traditional belief, double-dealing behavior by positively influential persons could promote cooperation.
We give below some speculation on the reasons for the non-monotonic change of ρ and ρ act at v = 0 in figures 6(c) and (d). For small u, only a few C char hubs (C-hubs) practice defection and improve their payoffs by exploiting others; consequently, more D char s learn from C char s, leading to an increase of ρ. With an increase of u, more C-hubs practice defection, which makes it possible that two connected C-hubs both turn into double-dealers and receive lower payoffs. Thus, we suspect that it is the interactions among double-dealing C-hubs that cause the later decrease of ρ.
To confirm the above speculation, we extend the model further to SFNs with degree-degree correlation r k defined by Newman [42]. In the case of negative r k , individuals with high degrees tend to connect to those with small degrees, and vice versa. We generate networks with degree-degree correlation by employing the Xulvi-Brunet-Sokolov (XS) algorithm [43]. We set r k = −0.1 and 0.3 for negative and positive correlation, respectively. The results for ρ and P avg in the u-v space on SFNs with degree-degree correlations are shown in figure 7.
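The degree-degree correlation coefficient r k can be measured on any generated network as the Pearson correlation between the degrees found at the two ends of a link. The sketch below computes it from an edge list; the function name is hypothetical, and the rewiring procedure of the XS algorithm itself is not reproduced here.

```python
def degree_correlation(edges):
    """Pearson correlation of the degrees at both ends of each link."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    m = len(edges)
    if m == 0:
        raise ValueError("the network has no links")
    mean_prod = sum(degree[a] * degree[b] for a, b in edges) / m
    mean_deg = sum(0.5 * (degree[a] + degree[b]) for a, b in edges) / m
    mean_sq = sum(0.5 * (degree[a] ** 2 + degree[b] ** 2) for a, b in edges) / m
    variance = mean_sq - mean_deg ** 2
    if abs(variance) < 1e-12:
        raise ValueError("undefined when all link ends have the same degree")
    return (mean_prod - mean_deg ** 2) / variance
```

A star, in which the hub connects only to degree-one nodes, gives the extreme value −1, while networks whose hubs are linked to each other give positive values.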
As shown in figure 7, unless the ratio of u to v is too small, cooperation is promoted by double-dealing behavior in both cases of negative r k and positive r k , in comparison with that at u = v = 0. Furthermore, comparing the upper diagrams (negative r k ) and the lower diagrams (positive r k ), we find that the level of cooperation is remarkably higher in the former. It can be concluded that positive degree-degree correlation, under which the interactions among hubs are enhanced, restrains cooperation. Thus, our speculation for figures 6(c) and (d) is reasonable: it is the increasing interactions among double-dealing C-hubs that are responsible for the decrease of ρ for large u at v = 0.

Extensions of the model and results
Now we are interested in the effects of double-dealing behavior on other types of games. In this section, we will only focus on SFNs. By setting the reward R = 1 and the punishment P = 0 and then varying −1 < S < 1 and 0 < T < 2 as the payoff matrix parameters, one can cover other interesting games systematically, including the SDG, SHG and the so-called HG. Together with the PDG, these games occupy the four quadrants of the two-dimensional T -S parameter plane. Figure 8 shows the results for the effects of double-dealing behavior on those three types of evolutionary games.
Figure 7. Contour plots in the u-v space for the fraction of C char s ρ and the average payoff P avg on SFNs with degree-degree correlation r k . (a, b) ρ and P avg for a negative degree-degree correlation, r k = −0.1. (c, d) ρ and P avg for a positive degree-degree correlation, r k = 0.3. Comparing the top diagrams and the bottom ones, one can observe that the level of cooperation is remarkably higher in the networks with negative degree-degree correlation. The payoff matrix parameter is r = 0.2.
The top panels in figure 8 are the results for the SDG, the middle ones for the SHG and the bottom ones for HG. The fractions of C char s, ρ, are shown in contour plots in the u-v space. Except for the difference in game type, the left column depicts a similar situation to that in figure 5(a). The middle and right columns depict two additional situations considering a degree threshold, that is, only individuals whose degrees are lower/higher than the threshold could become double-dealers, which are similar to figures 6(a) and (c). We first discuss the results for the SDG. In contrast to figure 5(a) for the PDG, there are two significant differences. One is the absence of the optimal behavior of ρ against v at small u; the other is the appearance of the optimal behavior of ρ against u at small v. These observations state that, in the SDG, only double-dealing behavior of C char s is helpful for cooperation. It is our important finding that double-dealing behavior of D char s cannot improve cooperation, which is strongly against common sense. Considering that the middle and right columns show the effects of double-dealing behavior of low-degree and high-degree individuals, respectively, the results in the left column could be explained by the combination of the other two. Moreover, taking into account that most individuals have low degrees in SFNs, the left column will be decided mainly by the middle one. It should be mentioned that figure 8(c) for the SDG, in which only high-degree individuals could become double-dealers, is similar to figure 6(c) for the PDG. The middle panels of figure 8 give the results for the SHG. Interestingly, the same features could be found when comparing with the results for the PDG. The bottom panels show the results for the HG. Taking into  account that ρ is already extremely high at u = v = 0, we find that they are similar to the results for the SDG. 
Based on the above results, we conjecture that the impacts of double-dealing behavior are similar in the upper right and upper left quadrants, and at the same time, those in the lower left and lower right quadrants are similar, in the payoff matrix T -S parameter plane.
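The four quadrants of the T -S plane referred to here follow from the standard payoff orderings with R = 1 and P = 0. A small helper, with a hypothetical name, makes the correspondence explicit; points lying exactly on T = 1 or S = 0 are reported as boundary cases.

```python
def classify_game(T, S):
    """Classify a two-strategy game with R = 1 and P = 0 by its quadrant."""
    if T == 1 or S == 0:
        return "boundary"
    if T > 1 and S < 0:
        return "PDG"   # lower right quadrant
    if T > 1 and S > 0:
        return "SDG"   # upper right quadrant
    if T < 1 and S < 0:
        return "SHG"   # lower left quadrant
    return "HG"        # upper left quadrant
```

For the PDG parametrization used in the earlier sections, `classify_game(1 + r, -r)` returns "PDG" for any r > 0.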
Motivated by the work of Szolnoki et al [37,38], we are interested in the final distribution D(u) of the probability u for an individual acting differently from her character, in the case that u is allowed to co-evolve with her strategy. D(u) is defined as the percentage of individuals taking u in the steady state. In order to do this, we follow the line in [37,38]. We establish a set of u that contains 51 different values ranging from 0 to 1 and with step 0.02. Then we assign each individual a value of u randomly chosen from this set. For the evolution of u assigned to each individual, we introduce another updating rule in the second stage defined in section 2, the form of which is the same as that defined by equation (1). When the co-evolution reaches its steady state assured by the time sequence of ρ t as in previous simulations, we calculate the distribution D(u).
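The set of admissible u values and the measurement of D(u) can be sketched as follows. The names are hypothetical, and D(u) is returned here as a fraction of the population rather than a percentage; the co-evolutionary update of u itself is not shown.

```python
import random
from collections import Counter

# 51 values from 0 to 1 in steps of 0.02 (rounded to avoid float noise).
U_VALUES = [round(0.02 * k, 2) for k in range(51)]

def assign_u(n, rng=random):
    """Give each of n individuals a value of u drawn uniformly from the set."""
    return [rng.choice(U_VALUES) for _ in range(n)]

def distribution(u_of):
    """Fraction of individuals holding each admissible value of u."""
    counts = Counter(u_of)
    n = len(u_of)
    return {u: counts.get(u, 0) / n for u in U_VALUES}
```

Applying `distribution` to the list of individual u values at the end of a run gives the steady-state D(u) for that realization.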

Conclusion
We present a model in the framework of evolutionary games that considers the existence of double-dealers in a population, and we investigate the effects of double-dealing behavior on cooperation. We have applied our model to evolutionary PDGs on different networks, ranging from square lattices to complex networks. Our results reveal that double-dealing behavior harms cooperation in the population on square lattices, ER networks and RRNs. However, considering that networks in the real world usually exhibit a scale-free degree distribution, we also apply our model to SFNs. We find that if double-dealing behavior is confined to low-degree individuals, the level of cooperation decreases. In contrast, if high-degree individuals with the character of a cooperator defect moderately, cooperation can be promoted. These results challenge the traditional belief. In society, it is well accepted that positively influential persons should act in accordance with their characters. Moreover, keeping one's outward behavior consistent with one's character is respected as a moral standard by the public. In contrast, our results show that influential individuals who take actions opposite to their characters might promote cooperation. Furthermore, we modify our model by setting a degree threshold for individuals, apply it to SFNs with degree-degree correlation, and on this basis give an explanation for our results.
In short, double-dealing behavior would not be always bad for our society although it is against our traditional moral standard. Those influential individuals who are double-dealers (with moderate probabilities) could promote cooperation in society. In addition, double-dealing behavior studied in the present work is closely related to double moral, which has been studied very recently by Helbing et al [44]- [46]. Their studies are extremely inspiring and provide some interesting conclusions. Here, we would like to present some comparisons and discussions. In their work, four different behavioral strategies have been considered.
In addition to the classical strategies of cooperation and defection, they consider punishing cooperators (PC) and punishing defectors (PD) as additional strategies. PD are defectors who punish other defectors despite being non-cooperative themselves; thus their behavior is double moral. Since PD punish defectors, their actions are somewhat like the way D char s cooperate in our model, because their punishment actions are essentially favorable to cooperation. In particular, the parameter v in this work, which is the probability for D char s to become double-dealers, is somewhat like the parameter f in [44], which is the punishment fine. The results are also consistent. Our results in figure 5(a) show that, at u = 0, ρ first increases and then decreases against v, and, in [44], the inset of the right panel in figure 1 shows that there exists an optimal level of f for which the cooperation level reaches its maximum. Furthermore, in [44], they considered punishment actions by cooperators, which support cooperation in a sense, yet they did not consider double moral behavior by cooperators. In contrast, we have also studied the effects of double-dealing behavior by C char s and found that cooperation could be promoted to some extent by C char s' double-dealing behavior.