Social Pressure and Environmental Effects on Networks: A Path to Cooperation

In this paper, we study how the pro-social impact of vigilance by other individuals is conditioned by both environmental and evolutionary effects. To this aim, we consider a known model where agents play a Prisoner's Dilemma Game (PDG) among themselves and the payoff matrix of an individual changes according to the number of neighbors that are "vigilant", i.e., how many neighbors watch out for her behavior. In particular, the temptation to defect decreases linearly with the number of vigilant neighbors. This model proved to support cooperation under specific conditions, and here we check its robustness with different topologies, microscopic update rules and initial conditions. By means of extensive numerical simulations and a few theoretical considerations, we identify the situations in which vigilance by others is more effective in favoring cooperative behaviors and those in which its influence is weaker.


I. INTRODUCTION
The emergence and survival of cooperative and, more generally, pro-social behaviors in nature and human societies have long been among the most debated issues in the natural and social sciences [1][2][3]. Indeed, there is an apparent, yet paradoxical, contrast between the advantages of selfish strategies at the level of individuals, which should be expected to be mostly preferred by natural selection, and the ubiquitous presence of cooperation and altruism at the level of communities (not necessarily in humans) [4]. In order to solve this problem, over recent decades many different mechanisms have been proposed [5][6][7][8]. What emerges from this large body of work is that there is no single universal mechanism that enhances cooperation against defection, but that different phenomena have different explanations. In particular, if we limit our discussion to pro-social behaviors in human communities, it has been shown that indirect reciprocity [9], partner selection and punishment [10,11] or gossip [12] can foster cooperative strategies in various situations.
Another factor which has proven effective in favoring human cooperation is the vigilance by others: more precisely, people tend to adopt more altruistic behaviors when they are observed by peers [13,14], or even when they simply feel they are watched [15][16][17] (Monitoring Hypothesis). In Ref. [18], a game-theoretical model able to describe this effect was presented. In particular, the effect of vigilance was modeled as a reduction of the temptation to defect in a Prisoner's Dilemma Game played on complex networks. As a result, the higher the level of vigilance, the higher the final degree of cooperation throughout a population. In that work, the behavior of the model was tested only on a few kinds of complex networks (essentially random and scale-free), with just one evolutionary rule (replicator) and always with the same initial conditions (completely random). As we have stressed above, the effect of the various mechanisms that determine the dynamics of these phenomena is generally not universal: for instance, the same topological structure can foster cooperation in some cases and hinder it in others [7]. Therefore, in this paper, we aim to deepen and further clarify the results reported in Ref. [18], testing their robustness by changing different aspects and parameters of the original model. In practice, we will focus on three factors: the topology (that is, the network on which the population evolves), the evolutionary algorithm (the rule by which the individuals adapt their strategies), and the initial conditions. This is important because, in the real world, communities live in different environments and evolve in different ways, so that a test of this type allows for a better evaluation of the reliability of the model and the generality of its results.
The paper is organized as follows: in the next section, we will define the model. Then, in Section III, we will present the results of the simulations, and, where possible, of some theoretical analysis. Finally, in Section IV, we will discuss such results and sketch some perspectives.

II. MODEL
We consider a population of N individuals interacting through an evolutionary Prisoner's Dilemma Game (PDG) under vigilance pressure. The population is placed on a given network, which amounts to assigning links between the individuals that can interact directly: the topology of the system is determined by the distribution of these links. Every player is characterized by a strategy, C (cooperation) or D (defection), and, at each elementary time step, plays a round of the PDG with each of her neighbors, who do the same in turn. After each interaction, an individual i gets a payoff according to her payoff matrix (1), in which C_i, D_i are the strategies adopted by the player herself, and C_j, D_j the strategies used by the neighbor j; the fitness collected by i in a single step of the dynamics is the sum of the payoffs collected with each neighbor. Even though the average payoff per neighbor can also be used to define the fitness [19], the total payoff better singles out the role of the topology in the emergence of cooperation, and is more common in the literature [2,7]. Of course, to have a PDG, we need T_i > 1 for all i; furthermore, in order to reduce the number of parameters of the model, we fix the value of P and restrict ourselves to the weak Prisoner's Dilemma (wPDG), that is, the case P = 0 [5]. Indeed, the wPDG has often been used in the literature, since it makes the model simpler while preserving its main features [7,8].
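For concreteness, a payoff matrix consistent with the definitions in this section could be written as below. The normalization of the mutual-cooperation payoff to 1 is an assumption inferred from the condition T_i > 1, the symbols S (sucker's payoff) and P (punishment) are the conventional names for the remaining entries, and M_i and \mathcal{N}(i) are notation introduced here only for illustration; this is a sketch of the structure rather than a verified transcription of the parametrization in Ref. [18].

```latex
% Row: strategy of player i; column: strategy of neighbor j.
% ASSUMPTION: reward normalized to 1; S and P are the standard
% sucker's and punishment payoffs (the weak dilemma sets P = 0).
\begin{equation}
M_i \;=\;
\begin{array}{c|cc}
      & C_j & D_j \\ \hline
  C_i & 1   & S   \\
  D_i & T_i & P
\end{array}
\qquad\qquad
\pi_i \;=\; \sum_{j \in \mathcal{N}(i)} M_i(s_i, s_j)
\end{equation}
```

Here \pi_i is the total (not averaged) payoff of player i over her neighborhood \mathcal{N}(i), matching the fitness definition adopted in the text.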
Moreover, every player can be either in a vigilant state, that is, monitoring her neighbors' strategies, or not. Let the variable V_i be equal to 0 if player i is not vigilant, and equal to 1 if she is. A non-vigilant individual can become vigilant following a Watts threshold rule, Equation (2) [18,20], which compares the fraction m_i/k_i with a threshold θ_i: here m_i is the number of neighbors of node i that are already vigilant, k_i is the degree of node i, and θ_i ∈ [0, 1] is the personal threshold of node i above which she becomes vigilant. In this work, we consider this threshold constant and equal for every player: θ_i = θ for all i. In order to keep the model as simple as possible, we do not take into consideration any cost for becoming vigilant. This does not affect the generality of our results: as shown in Ref. [18], the cost of vigilance does not qualitatively change the behavior of the model, but simply shifts possible transitions towards non-cooperative states to lower values of the temptation.
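The threshold rule can be sketched in a few lines of Python. The function name, the dictionary-based data layout, and the use of a non-strict inequality (m_i/k_i >= θ, as in Watts' original cascade model) are assumptions made for illustration; the restriction of vigilance to cooperators, used in most of the paper, is not enforced here.

```python
def update_vigilance(vigilant, neighbors, theta):
    """Synchronous Watts-style threshold update (illustrative sketch).

    vigilant  : dict node -> 0 or 1 (the variable V_i)
    neighbors : dict node -> list of adjacent nodes
    theta     : common threshold in [0, 1]
    Returns a new dict; the input state is left unchanged.
    """
    new_state = {}
    for i, nbrs in neighbors.items():
        # Vigilant players stay vigilant; isolated nodes cannot be activated.
        if vigilant[i] == 1 or not nbrs:
            new_state[i] = vigilant[i]
            continue
        m_i = sum(vigilant[j] for j in nbrs)  # vigilant neighbors
        k_i = len(nbrs)                       # degree
        # ASSUMPTION: activation when the fraction reaches the threshold (>=).
        new_state[i] = 1 if m_i / k_i >= theta else 0
    return new_state
```

Because the new state is built from the old one, all players are updated synchronously, in line with the update scheme described for the strategies.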
The pressure due to vigilance makes the temptation to defect effectively lower than in the absence of any external control: indeed, it has already been shown that people feel uncomfortable adopting anti-social behaviors when they merely feel observed [16,17]. In terms of the payoff matrix, we can model this phenomenon by linking the temptation entry of matrix (1) to the number of vigilant neighbors through a linearly decreasing relation, Equation (3), where b is the value of the temptation in the absence of vigilance.
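As an illustration of how a vigilance-dependent temptation enters the fitness, the sketch below uses one possible linear form, T_i = b - (b - 1) m_i/k_i. This specific expression is an assumption chosen only because it decreases linearly from b (no vigilant neighbors) to 1 (all neighbors vigilant); it should not be read as the exact relation of Ref. [18]. The function names and the default values R = 1, S = 0, P = 0 are likewise hypothetical.

```python
def effective_temptation(b, m_i, k_i):
    """Temptation felt by a player with m_i vigilant neighbors out of k_i.

    ASSUMED linear form: b with no vigilance, 1 when every neighbor is vigilant.
    """
    if k_i == 0:
        return b
    return b - (b - 1.0) * m_i / k_i


def round_payoff(s_i, neighbor_strategies, T_i, S=0.0, P=0.0, R=1.0):
    """Total payoff of a player with strategy s_i ('C' or 'D') in one round."""
    total = 0.0
    for s_j in neighbor_strategies:
        if s_i == "C":
            total += R if s_j == "C" else S
        else:
            total += T_i if s_j == "C" else P
    return total
```

For example, a defector with b = 1.5 and two vigilant neighbors out of four would feel a temptation of 1.25 under this assumed form, and would collect 2.5 from two cooperating neighbors.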

Evolutionary Rules
After all the individuals have played a round of the game, they update their strategies synchronously, according to a given rule. In this work, we have studied two different update algorithms: unconditional imitation (UI), and a mixed update rule (MUR), inspired by Ref. [21], and compared their performance to replicator (REP), the update rule used in Ref. [18].
Replicator. With REP, we proceed as follows. Let s_i be the strategy the individual i is playing, and π_i her payoff. Following the proportional imitation rule, each individual i randomly chooses one of her k_i neighbors (individual j) and adopts her strategy with probability

$$ p^t_{ij} = \begin{cases} (\pi^t_j - \pi^t_i)/\Phi & \text{if } \pi^t_j > \pi^t_i, \\ 0 & \text{otherwise,} \end{cases} $$

where Φ = max(k_i, k_j)[max(1, T) − min(0, S)], so that p^t_ij ∈ [0, 1].
Unconditional Imitation. With the UI rule, every player imitates the strategy of the neighbor that has obtained the highest payoff, provided it is larger than her own (otherwise, nothing happens).
Mixed update rule. In this case, with probability q, the player simply imitates the strategy of one of her neighbors picked at random, and with probability 1 − q evolves according to the UI rule described above. While the REP rule is more representative of evolutionary phenomena in biology, the MUR better describes the dynamics underlying the decision-making processes of human beings: therefore, it depicts social phenomena more realistically [21,22].
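The three update rules can be sketched as follows. All function names and the dictionary-based layout are hypothetical; the REP probability uses the standard proportional-imitation form (payoff difference divided by Φ, zero when the neighbor does not do better), and ties in the UI rule are broken by taking the first best neighbor, which is an arbitrary choice made here.

```python
import random


def rep_probability(pi_i, pi_j, k_i, k_j, T, S=0.0):
    """Probability that i copies j under proportional imitation (REP)."""
    phi = max(k_i, k_j) * (max(1.0, T) - min(0.0, S))
    return max(0.0, (pi_j - pi_i) / phi)


def ui_strategy(i, strategy, payoff, neighbors):
    """Unconditional imitation: copy the best neighbor if she did better."""
    best = max(neighbors[i], key=lambda j: payoff[j], default=None)
    if best is not None and payoff[best] > payoff[i]:
        return strategy[best]
    return strategy[i]


def mur_strategy(i, strategy, payoff, neighbors, q, rng=random):
    """Mixed rule: random imitation with probability q, otherwise UI."""
    if not neighbors[i]:
        return strategy[i]
    if rng.random() < q:
        return strategy[rng.choice(neighbors[i])]
    return ui_strategy(i, strategy, payoff, neighbors)


def synchronous_update(strategy, payoff, neighbors, rule):
    """Apply `rule` to every player using the strategies of the old step."""
    return {i: rule(i, strategy, payoff, neighbors) for i in strategy}
```

A MUR step with q = 0.3 could then be written as `synchronous_update(strategy, payoff, neighbors, lambda i, s, p, n: mur_strategy(i, s, p, n, 0.3))`.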
In any case, whatever the update rule is, the strategies of the individuals are updated synchronously, and, after the update, the payoffs of the players are set again to zero. Finally, after revising their strategies and payoffs, players update their vigilance status, according to the rule given in Equation (2).

III. RESULTS
We ran many simulations of the model defined in the previous section with different parameter values, topologies, and update rules, in order to generalize the results presented in Ref. [18]. To characterize and analyze the behavior of the model, we consider the quantity ρ, that is, the final average cooperator density, measured after a transient of 100,000 generations and averaged over a time window of 100 generations, provided the system has reached a stationary state, defined by the slope of the average fraction of cooperators ρ being smaller than 10^−2.
If not, we let the system evolve for further time windows of 100 generations. In this way, it is easy to discern whether cooperation finally invades the system or is removed, or whether an intermediate configuration is reached.
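The measurement procedure can be sketched as below, operating on an already recorded time series of the cooperator fraction. Estimating the slope by an ordinary least-squares fit and comparing its absolute value with the tolerance are assumptions, as are the function names; in an actual simulation the series would be extended on the fly rather than stored in full.

```python
def slope(values):
    """Least-squares slope of `values` against the indices 0, 1, 2, ..."""
    n = len(values)  # requires n >= 2
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def final_density(rho_series, transient=100_000, window=100, tol=1e-2):
    """Average of rho over the first stationary window after the transient.

    Returns None if no complete window in the series is stationary.
    """
    start = transient
    while start + window <= len(rho_series):
        chunk = rho_series[start:start + window]
        if abs(slope(chunk)) < tol:
            return sum(chunk) / window
        start += window  # not stationary yet: move to the next window
    return None
```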
All of the simulations presented here have been carried out with systems of N = 1000 individuals, large enough for finite-size effects to be considered negligible [23]. Moreover, we confirmed the robustness of our results by running some simulations with smaller populations.
We will consider mainly monoplex networks, and the topologies used in this paper are: (i) Erdős-Rényi (ER) random networks [24]; (ii) Barabási-Albert (BA) scale-free networks [25]; (iii) regular two-dimensional lattices (with absorbing boundary conditions); and (iv) link-added small-world (LASW) random networks [26,27]. Unless explicitly indicated, the initial conditions are totally random, so that at the initial stage of the dynamics there are, on average, 50% of cooperators; analogously, the initial vigilant players are also picked at random: therefore, if only cooperators can be vigilant, at the beginning 25% of the population will be vigilant cooperators. Otherwise, in Section III D, the initial vigilant individuals will be 50% of the population, equally distributed among cooperators and defectors.
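One way to realize these random initial conditions is sketched below. It reads the description as two independent draws per player (cooperator with probability 0.5, candidate vigilant with probability 0.5), with vigilance discarded for defectors unless vigilant defectors are allowed; this reading, the function name, and the parameter names are assumptions.

```python
import random


def initial_conditions(n, rho0=0.5, vigilant_prob=0.5,
                       vigilant_defectors=False, seed=None):
    """Random initial strategies and vigilance states for n players."""
    rng = random.Random(seed)
    strategy, vigilant = {}, {}
    for i in range(n):
        strategy[i] = "C" if rng.random() < rho0 else "D"
        picked = rng.random() < vigilant_prob
        # Default case: only cooperators may be vigilant.
        if strategy[i] == "D" and not vigilant_defectors:
            picked = False
        vigilant[i] = 1 if picked else 0
    return strategy, vigilant
```

With the default arguments, about half of the players are cooperators and about a quarter are vigilant cooperators; with `vigilant_defectors=True`, about half of the whole population starts vigilant.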
Finally, we stress the fact that we aim to test the robustness of the outcomes presented in [18], so that in each of the next subsections, we will start usually from the original results and change only one feature of the model. Therefore, in Section III A, we will change, with respect to the analogous cases in [18], only the update rule, in Section III B the topology, and so on.

A. Influence of the Update Rule
Here, we check how the behavior of the system changes by varying the way the individuals evolve their strategies, compared to Ref. [18] (Section IIIA) where the REP rule is used, so we consider the same topology for comparison purposes, i.e., monoplex ER and BA networks with average degrees of z = 4 and z = 16.

Unconditional Imitation
In Figure 1, the final average cooperation density as a function of the temptation b is shown for different values of the threshold θ in the ER case, while, in Figure 2, we report the same results for a BA network.
As is easy to see, cooperation is supported much more strongly in the ER topology than in the BA one. Comparing these results with those presented in Ref. [18], we notice that, with the REP rule, cooperation is favored in both ER and BA networks. Therefore, we can conclude that the presence of hubs hinders the emergence of cooperative behaviors under a purely deterministic evolution algorithm, i.e., a small amount of noise is necessary for cooperation to overcome this barrier.
The fact that, on scale-free topologies, unconditional imitation hinders cooperation is further confirmed by considering a duplex BA-BA network, that is, a system in which the network of the game dynamics and that of the vigilance dynamics are separate. With replicator update, the system tends to fully cooperative final configurations up to large values of b [18]; on the contrary, we verified numerically that, with the UI rule, the final level of cooperation is very low already for b ≃ 1.

Mixed Update Rule
We now check the robustness of the model with respect to the MUR rule, which is more realistic for human interactions [22]. Apart from the case of q = 0.3 in ER networks (Figure 3), θ (vigilance) has no effect on cooperation, as shown in Figures 4, 5 and 6. On the other hand, the value of the parameter q (i.e., the level of non-strategic imitation) does have an effect, for both ER and BA networks. When the probability of following the non-strategic imitation rule is low (q = 0.3; Figures 3 and 5), we find some level of cooperation, but for higher values (q = 0.5; Figures 4 and 6) cooperation is hindered, as happened in Section III A 1.
It is worth stressing that increasing the weight of non-strategic imitation hinders cooperation. This can be explained by considering that, under the UI rule, cooperators connected with other cooperators have a very high fitness and are surely imitated by a linked defector. To clarify this idea, let us consider a defector j with four neighbors, among which there is only one cooperator i. Since cooperators tend to cluster, it is likely that the three defectors are connected to other defectors, getting in a single game round a fitness equal to four, whilst i will probably be linked to three cooperators, gaining 3b (for the sake of simplicity, we assume that every individual has exactly four links). Thus, if b > 4/3, player j will definitely become a cooperator if she evolves by the UI rule, while she remains a defector with probability 3/4 under the non-strategic update algorithm. Indeed, in all of Figures 3-6, we observe that the final cooperator density practically vanishes around b ≈ 1.2-1.3, consistent with the above considerations.

B. Other Topologies
Up to now, we have considered the most classical examples of complex topologies, that is, ER and BA networks. Here, we aim to check the behavior of the model on topological structures with different features. In particular, ER and BA networks differ mainly in that the former have no hubs (nodes with many more connections than the average), contrary to what happens in scale-free BA networks [28]. In any case, both have a small diameter (i.e., the average distance between two nodes picked at random scales as the logarithm of the system size) and a small clustering coefficient (i.e., the probability that two neighbors of a third node are also neighbors of each other is much smaller than 1). Therefore, it is worth considering networks whose diameter, clustering coefficient, or both differ from those of ER and BA networks. As already hinted, we consider here only the replicator evolution rule, in order to compare the role of topology with the original results in [18].
For this purpose, we took into consideration a Watts-Strogatz Small-World topology, which has the property of behaving locally as a regular lattice-like network (i.e., high clustering coefficient), but as a random network globally (small diameter). Moreover, we built such a network following a different procedure from the one presented in Refs. [26,29]: starting from a regular square lattice of N = 1000 nodes, each one with z = 4 neighbors, we added links between non-connected nodes with a probability p, as in the LASW model defined in Ref. [27]. In this way, by tuning the parameter p, we can explore the lattice (p = 0), and small-world (0 < p 2z/N ) topologies. Now, as illustrated in Figure 7, we see how, in the lattice, the system cannot sustain cooperation (left graph), but increasing the density of short-cuts, the cooperation is mostly enhanced, even better than in ER topology (middle and right graphs). Interestingly, the results do not depend on θ, apart from the fact that defection easily overcomes cooperation when θ = 1 already for small values of b.
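The link-added construction described above can be sketched in pure Python as follows. The function name, the use of a side × side lattice, and the periodic boundaries (chosen here so that every node has exactly z = 4 lattice neighbors) are assumptions and may differ from the boundary conditions actually used in the simulations; shortcuts are added independently with probability p to every pair of nodes that is not already connected.

```python
import random
from itertools import combinations


def lasw_network(side, p, seed=None):
    """Square lattice (side x side, periodic) plus random added links.

    Returns a dict node -> set of neighbors. No lattice link is removed,
    so p = 0 gives the plain lattice.
    """
    if side < 3:
        raise ValueError("side must be at least 3 for four distinct neighbors")
    rng = random.Random(seed)
    n = side * side
    neighbors = {i: set() for i in range(n)}
    # Regular lattice: link each node to its right and upper neighbor.
    for x in range(side):
        for y in range(side):
            i = x * side + y
            right = ((x + 1) % side) * side + y
            up = x * side + (y + 1) % side
            for j in (right, up):
                neighbors[i].add(j)
                neighbors[j].add(i)
    # Added links (shortcuts) between non-connected pairs.
    for i, j in combinations(range(n), 2):
        if j not in neighbors[i] and rng.random() < p:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors
```

For instance, `lasw_network(32, 0.001, seed=0)` would build a network of 1024 nodes with a small density of shortcuts; the size and the value of p are illustrative only.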

C. Different Initial Conditions
A simple mean-field analysis of the model suggests that the outcome of the dynamics should also depend on the initial conditions, in particular on the initial distribution of the vigilant players. Indeed, vigilance can have an effective influence on the evolution of the system only if the vigilant individuals are numerous enough to make the others vigilant too, following Equation (2). Now, in a mean-field approach, the probability that an individual with k connections initially has m vigilant neighbors is

$$ P(m \mid k) = \binom{k}{m} a_0^{m} (1 - a_0)^{k-m}, $$

where a_0 is the initial density of the vigilant individuals. Then, the average density of vigilant neighbors at the beginning of the dynamics can be easily computed:

$$ \left\langle \frac{m}{k} \right\rangle = \frac{1}{k} \sum_{m=0}^{k} m \, P(m \mid k) = a_0 . $$

Therefore, the effect of vigilance should become noticeable for θ < a_0: since we usually set initially half of the cooperators as also being vigilant, we expect a transition from high cooperation to defection for θ larger than a critical θ* such that

$$ \theta^{*} \simeq a_0 = \frac{\rho_0}{2}, $$

where ρ_0 = 0.5 is the initial cooperator density. Of course, we also expect the network structure to change this picture at least partially. In fact, the influence of the initial conditions is almost completely removed in non-trivial topologies, as we are going to show in the following.
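The mean-field argument can be checked numerically with a short sketch. It assumes that each neighbor is independently vigilant with probability a_0 (a binomial distribution) and that activation requires m/k >= θ; the function names are hypothetical.

```python
from math import ceil, comb


def prob_m_vigilant(m, k, a0):
    """Binomial probability of having m vigilant neighbors out of k."""
    return comb(k, m) * a0 ** m * (1 - a0) ** (k - m)


def prob_initial_activation(k, a0, theta):
    """Probability that a node of degree k starts with m/k >= theta."""
    # Smallest integer m with m >= theta * k (guarding against rounding).
    m_min = ceil(theta * k - 1e-12)
    return sum(prob_m_vigilant(m, k, a0) for m in range(m_min, k + 1))
```

Even for θ above a_0 this probability is not zero at finite k, which is one way to see why statistical fluctuations in the initial configuration can trigger the spreading of vigilance on networks.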
In Figure 8, we present the final cooperator distribution for a system on a square lattice evolving by the REP rule. As is easy to see, if the number and distribution of initial vigilant individuals are such that no other player can be activated, then vigilance has no effect and cooperation vanishes already for small values of the temptation b. On the contrary, as soon as the initial distribution allows, even through statistical fluctuations, some inactive player to have enough vigilant neighbors to get activated, the number of vigilant individuals quickly increases and the system ends up in a configuration with a higher level of cooperation, independently of the initial number of vigilant agents. This is also true in ER random networks, as shown in Figure 9: in the end, there is practically no effect of the initial vigilant density on the final fate of the dynamics. Indeed, as can be seen by comparing these results with Figure 1a of Ref. [18], ρ is always very close to the value of the case a_0 = 0.25, apart from some slight differences. The same picture holds for BA networks as well: with this topology too, the final level of cooperation does not depend on the initial distribution of the activated players, as reported in Figure 10.
Therefore, we can finally state that the dynamics turn out to be robust with respect to varying the initial conditions, so that what has been presented in the previous subsections can be considered as general results with respect to the initial configuration of the system.

D. Case of Vigilant Defectors
Until now, we have set that only cooperators can also be vigilant players. In fact, in a PDG, defectors also have interest in being connected with cooperators, so it is plausible to also consider a situation where someone who is not a cooperator can be vigilant. In practice, in human interactions, those who also adopt anti-social behaviors can force the others to behave fairly [12,30,31]. Therefore, we considered the case in which every player, independently from the fact that she is either a cooperator or a defector, can be a vigilant one.
As Figures 11 and 12 show, the fact that a defector also contributes to vigilance pushes cooperation dramatically both in ER and BA monoplex networks, having full cooperation in BA monoplex networks for all values of θ and b. Now, we show the results for a square lattice, where the effect is expected to be magnified with respect to the remaining topologies. Actually, as shown in Figure 7A, in this topology, cooperation is mostly hindered.
In Figure 13 we see that already a very small probability ϱ_0 of being an initial vigilant (A) helps cooperation to invade the population, already for not-too-high vigilance (θ 0.5), and almost the same for ϱ_0 0.05 (B and C). In addition, for θ = 0.6, the final cooperator density does not vanish even at higher values of b. This is, of course, an expected result, since allowing more individuals to activate as vigilant ones decreases much more the average temptation of every player, according to Equation (2). This outcome also holds in different topologies.

IV. DISCUSSION AND CONCLUSIONS
The model of vigilance first presented in Ref. [18], and further developed here, treats the pro-social effect due to the control (be it real or just perceived) by peers as a decrease in the temptation to defect: the more neighbors watch out for the behavior of a subject, the lower the probability that the latter adopts a selfish strategy. Even though the preliminary results of this approach turned out to be promising, before considering it a viable way to describe this phenomenon, it was necessary to test its full validity. Therefore, in this paper, we have aimed to ascertain that the main features of the model are basically robust: that is, we verified that, through the mechanism of vigilance proposed here, cooperation is actually fostered for a broad range of the parameters at stake and in different environmental configurations. In particular, we showed that the beneficial influence of vigilance works in more realistic configurations, allowing us to hypothesize that what has been repeatedly observed in experiments and field observations can actually be explained as a smaller temptation to defect in the presence of controllers.
The results which, in our opinion, allow us to consider the model realistic and particularly useful are the following:
• Vigilance needs the small-world effect (the presence of short-cuts connecting individuals physically far away from each other) to be efficient in fostering cooperation: indeed, in regular lattices (Figure 7) it does not help, and the small-world property is ubiquitous in most real social systems (only the smallest communities can be modeled by complete graphs, and Euclidean topologies are even more uncommon in human societies).
• Vigilance works not only when the individuals update their strategy by means of an essentially evolutionary rule (REP), but also when they evolve through more typically "social" mechanisms such as pure imitation (at least on ER networks); moreover, considering the mixed rule, which takes into account the intrinsic non-strategic component of human decision-making processes, we found that cooperation can tolerate the influence of irrationality only when this is low (q < 0.5), consistent with the results of Ref. [22].
• Concerning again the update rule, it is worth stressing that, in heterogeneous networks (scale-free), vigilance is beneficial for cooperation only with replicator update, whilst with strategic imitation (UI) the presence of hubs appears to be detrimental for the emergence of pro-social behaviors.
• The results do not depend sensitively on the initial conditions (at least in heterogeneous topologies): this is a fundamental feature of the model since it is usually hard to determine the initial conditions for real social systems; on the other hand, in complete graphs (i.e., in mean-field approximation), this is not true, but only small human communities can be described in this way, and, in such cases, different dynamical mechanisms are at work [32].
Therefore, we can restate the main outcome of this work: the reliability of the model and its potential are confirmed. Of course, further investigations are needed to validate the model definitively, in particular experiments explicitly aimed at checking whether this peculiar kind of phenomenon (decreased temptation in a PDG) actually takes place when subjects play in the laboratory. Studies of this kind are already planned for future work.