Emergence of cooperation in a coupled socio-ecological system through a direct or an indirect social control mechanism

The successful management of a common pool resource (CPR) by its social agents is key to its sustainability. In order to maintain the ecological resource of such a coupled socio-ecological system, cooperative effort from the whole society is required, and this necessitates a proper social control mechanism to regulate the social order. In this paper, we explore the effects of two social control mechanisms, ostracism and voluntary enforcement, on agents that actively exploit the CPR within societies of different social network structures (as encapsulated by their average network degree). By means of numerical simulations and analytical approximation, we infer the dynamical behavior of the system in terms of its phase diagram. The phase diagram is found to contain a plethora of phases and associated phase transitions, which we enumerate based on a social and an ecological parameter. At a large average network degree, the features of the phases are observed to be similar for the direct voluntary enforcement and indirect ostracism mechanisms. However, these features differ at low average network degree, where we uncover a new phenomenon of a non-equilibrium oscillatory phase attributed to the voluntary enforcement mechanism.


Introduction
It is well known that the social and economic progress of mankind depends crucially on the provision of ecological resources of a quality suitable for its sustenance, such as a hospitable climate, clean air, fresh water, and nutritious food. It involves a dynamical process of interdependency between humans and the Earth's ecosystem (which is being described as a coupled socio-ecological system (SES)), with a restraint on the excessive use of these ecological resources as the most promising way to a sustainable future. When the resources are freely available to be shared among the users (common pool resources (CPR)), rational individuals tend to maximize their resource exploitation to pursue their own interest which invariably leads to the dire consequence of resource degradation. Hardin described this social-ecological equilibrium as the tragedy of the commons [1]. On the other hand, Ostrom showed that many traditional societies can initiate collective and cooperative actions for sustainable practices without the involvement of any centralized governance [2]. There are several efforts made to investigate such coupled SES, whether by using specific models [3][4][5][6][7], or by empirical and experimental studies [8][9][10]. Despite all these efforts, the basic social structure and mechanism that underlies these societies remains unexplained, because our understanding on how cooperative actions emerge from them is very limited.
Different societies employ different social actions to regulate social order [11,12]. These social actions may be classified into a direct or an indirect approach. Previously, we have examined the indirect approach, in which norm-abiding individuals collectively compel the violator through social pressure and ostracism. Social ostracism enforces the societal norm through social disapproval directed by the norm conformists (cooperators) at the norm violators (defectors). The fear of such implicit social punishment can prevent individuals from adopting uncooperative behavior, or cause them to change from being a defector to a cooperator. Note that the cooperators bear no cost for these social actions, which have a stronger effect when more people obey the norm [13]. Specifically, societal disapproval is meted out by excluding defectors from social protection, help, and other benefits. Ostracized defectors would also suffer reflexive responses such as sadness and anger because their relational needs have been damaged [14]. Moreover, various studies have found empirical evidence of indirect pressure as a consequence of the ostracism mechanism [13,15-19]. It is notable that ostracism is very similar to another mechanism known as social exclusion [20][21][22]. In social exclusion, free-riders are shunned and ignored, and are excluded from the collective production gain and benefits of the community. Such punishment of the free-riders or defectors is enforced by cooperators at a cost, which constitutes a direct form of social control. This cost is, however, absent in ostracism, thus rendering ostracism an indirect social control mechanism.
Traditionally, social scientists assume a more direct approach to maintaining social order through the process of reward and punishment. This mechanism operates at a personal level, in which a certain group of people undertakes the role of voluntarily punishing the violator [23] or rewarding the conformist [24]. Since such actions are costly, the potential enforcers may be reluctant to punish the norm violators voluntarily [25,26]. This mechanism is in fact common in animal societies [27,28] and can also occur in human societies [29][30][31][32]. Depending on the culture of the society, there could be many varieties of social mechanisms. Nonetheless, these mechanisms are typically classified into either indirect collective control or direct individual control [33,34]. In this paper, we shall consider equity driven ostracism as the representative mechanism for collective social control and voluntary enforcement as the mechanism for individual social control. In addition to social mechanisms, social scientists also propose the idea of using network density as an effective form of social control [11,35-37].
Recently, game theory has also been employed to investigate how cooperation can emerge from the exploitation of CPR. In these studies, the game theoretic framework provides the context for agents to make rational decisions which can affect the conditions of the environment. These conditions then feed back on the well-being of the agents, thus further influencing their decisions. Such game-environment feedback was explored in a co-evolutionary two-player game modeled by the prisoner's dilemma in terms of replicator dynamics [38]. The consequence is an oscillating tragedy of the commons, which cycles between states where cooperators dominate in a depleted environment and states where defectors dominate in a replete environment, with the nature of the evolution dependent on the details of the payoff matrix. Another version of game-environment interaction is the collective risk social dilemma game, where a feedback mechanism is incorporated between the CPR and the cooperators' contribution [39]. Here, a collective target with respect to the CPR has to be reached by the combined actions of the cooperators and defectors before the environment is deemed sustainable.
Unlike these game theoretical approaches, our model depicts the drivers of cooperation more explicitly in order to bridge the gap between the empirical results and theoretical findings [40]. It is due to Tavoni et al., and hence we term it the TSL model. This model is general because it encompasses all important features of the CPR frameworks [3,5,41,42]. Nonetheless, we have modified it to include social interactions by embedding it in a social network with asynchronous pairwise interaction [40,43,44]. Asynchronous update is known to be appropriate for simulating social processes [45,46], and it is also useful for avoiding the generation of pattern artifacts [45,47]. Before analyzing the dynamical behavior and stability of the SES under the effects of network structure and external control parameters, we incorporate the social mechanisms of either ostracism or voluntary enforcement into the model.
The phenomenon we aim to explore in this study is that of phase transition. Phase transition is an important topic in physics, in both condensed matter systems and statistical mechanics, and is concerned mainly with transitions that happen in equilibrium systems. There are, however, interesting phase transition scenarios in non-equilibrium systems that mirror those of equilibrium systems. Examples are the epidemic spreading of diseases [48] and the emergence of traffic congestion [49]. In this paper, we investigate non-equilibrium phase transitions in a coupled social-ecological system that is subject to two forms of social control mechanisms. Through the phase diagram approach from statistical physics, we evaluate the phases of this system and their transitions under equity driven ostracism and voluntary enforcement with respect to specific network structures. To perform the theoretical analysis, we adopt techniques from dynamical system theory and the method of pair approximation. We encounter interesting coupled social-ecological dynamical behavior beyond equilibrium [50]. We observe the phases of cooperation, defection, mixed equilibria, and disequilibrium in the form of oscillatory behavior, and we study the details of the transitions between these phases. The motivation is to identify the form of social control that is beneficial to the system. Before we present our results, let us first describe our coupled socio-ecological model and the social control mechanisms.

Model
We consider a population of N agents having equal access to a CPR. In our framework, each agent is represented by a node, while its interaction with another agent in the network is denoted by a link. For simplicity, we consider the social network to take the form of a random graph. The resource is assumed to be served by a constant natural inflow c up to a finite resource capacity $R_{\max}$. It is depleted by a natural depreciation d. In addition, each agent extracts the resource for its own production purposes, which further reduces the availability of the resource. Putting these processes together yields the time evolution of the resource, in which extraction depends on the total effort $E$ set by the mean of the agents' efforts $e_i$. The production yield is determined via the Cobb-Douglas production function with decreasing returns, i.e., $F(E,R) = \gamma E^{\alpha} R^{\beta}$, with α, β and γ being constant parameters. If w is the opportunity cost, the payoff of agent i is then given by $\pi_i = \frac{e_i}{E} F(E,R) - w e_i$.
Next, we consider two different forms of social control mechanisms: ostracism and voluntary enforcement. In the former case, an agent can decide to be either a cooperator or a defector. Cooperators are individuals who follow the social norm by limiting their extraction of the resource according to an established agreement. Defectors, on the other hand, are agents who violate the social norm and the agreement. Their main motivation is to maximize their own personal benefit. Thus, the effort of a defector $e_d$ is greater than that of a cooperator $e_c$. The effort of each agent $e_i$ is therefore equal to either $e_c$ or $e_d$, depending on whether the agent is a cooperator or a defector, respectively. As the selfish behavior of defectors is socially undesirable, every defector naturally encounters social disapproval from cooperators in their social circle.
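The resource and payoff definitions of the model can be illustrated with a minimal numerical sketch. It assumes that the resource equation takes the form used in the original TSL model of Tavoni et al., $dR/dt = c - d\,(R/R_{\max})^2 - qER$, with the total effort $E$ equal to the sum of the individual efforts $e_i$. The extraction coefficient `Q = 1` and all function names are assumptions introduced here for illustration; the remaining parameter values are those quoted in this section.

```python
import math

# Parameter values quoted in the Model section.
D, R_MAX = 50.0, 200.0
GAMMA, ALPHA, BETA, W = 10.0, 0.6, 0.2, 15.0
E_C, E_D = 0.00966, 0.03652
Q = 1.0  # ASSUMPTION: extraction coefficient, not given in the text


def resource_rate(R, c, E):
    """ASSUMED TSL form: inflow - natural depreciation - extraction."""
    return c - D * (R / R_MAX) ** 2 - Q * E * R


def equilibrium_resource(c, E):
    """Positive root of resource_rate(R) = 0, which is a quadratic in R."""
    a, b = D / R_MAX ** 2, Q * E
    return (-b + math.sqrt(b * b + 4.0 * a * c)) / (2.0 * a)


def production(E, R):
    """Cobb-Douglas yield F(E, R) = gamma * E^alpha * R^beta."""
    return GAMMA * E ** ALPHA * R ** BETA


def payoff(e_i, E, R):
    """Payoff pi_i = (e_i / E) * F(E, R) - w * e_i."""
    return e_i / E * production(E, R) - W * e_i


def payoffs(n_coop, n_def, c):
    """Cooperator and defector payoffs at the equilibrium resource level."""
    E = n_coop * E_C + n_def * E_D
    R = equilibrium_resource(c, E)
    return payoff(E_C, E, R), payoff(E_D, E, R), R


if __name__ == "__main__":
    pi_c, pi_d, R = payoffs(n_coop=40, n_def=10, c=20.0)
    print(f"R* = {R:.2f}, pi_c = {pi_c:.4f}, pi_d = {pi_d:.4f}")
```

Under these assumptions, a defector earns more than a cooperator whenever the yield per unit effort $F/E$ exceeds the opportunity cost w, since both share the same resource level but the defector exerts more effort.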
In the TSL model, social disapproval is effected through ostracism in the form of a Gompertz function of $k_c$, the number of neighboring cooperators of a defector with k neighbors [20], with h, t and g being constant parameters. Thus, a defector needs to pay an additional social cost, which is captured by the defector utility $u_d$. Notice that the strength of the social cost is related not only to the number of cooperators but also to the production yield difference between cooperators and defectors. In this case, the societal reaction from the adjacent cooperators is stronger when the payoff dissimilarity between cooperator and defector is larger. This phenomenon is known as equity driven ostracism, which is based on observations from real-world common pool societies [7]. No such cost is borne by a cooperator, and thus the utility of a cooperator is given by $u_c = \pi_c$.
In the case of voluntary enforcement, there are three strategies: cooperator, defector, and enforcer. Cooperators and enforcers fulfill the agreement by avoiding over-exploitation of the resource. Defectors, as before, exert more effort and extract the resource beyond what was promised in order to achieve a higher gain. However, the role of monitoring and administering socially approved behavior in this society lies with the enforcers. Here, the defectors are punished socially by their neighboring enforcers, while the cooperators ignore the norm-violating behavior of their neighboring defectors. In punishing a neighboring defector, each enforcer bears a cost of $h_c$, while the punished defector incurs a sanction damage of $h_s$. This implies an enforcer utility of $u_e = \pi_c - h_c k_d$ and a defector utility of $u_d = \pi_d - h_s k_e$, with $k_d$ ($k_e$) being the number of neighboring defectors (enforcers) of an enforcer (a defector). The parameters $h_c$ and $h_s$ give the unit cost of enforcement and sanction damage, respectively. Again, the cooperators bear no social cost, and their utility is given by $u_c = \pi_c$.
Unlike ostracism, social punishment of the defectors is performed individually rather than collectively by the enforcers, who sacrifice part of their own utility in order to uphold social justice in the society.
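The utilities under the two mechanisms can be sketched as follows. The enforcer and defector utilities under voluntary enforcement follow the expressions stated in this section. For ostracism, the sketch assumes the Gompertz form $h\,e^{t\,e^{g k_c/k}}$ and the equity driven defector utility $u_d = \pi_d - h\,e^{t\,e^{g k_c/k}}(\pi_d - \pi_c)/\pi_d$ of the original TSL model; both forms and all function names are assumptions made for illustration.

```python
import math

T, G = -150.0, -10.0   # Gompertz parameters quoted in the Model section
H_C = 0.02             # unit cost of enforcement quoted in the Model section


def ostracism(k_c, k, h):
    """ASSUMED Gompertz form: h * exp(t * exp(g * k_c / k))."""
    return h * math.exp(T * math.exp(G * k_c / k))


def defector_utility_ostracism(pi_c, pi_d, k_c, k, h):
    """ASSUMED equity driven form: the social cost scales with the
    relative payoff gap (pi_d - pi_c) / pi_d, as in the TSL model."""
    return pi_d - ostracism(k_c, k, h) * (pi_d - pi_c) / pi_d


def enforcer_utility(pi_c, k_d, h_c=H_C):
    """u_e = pi_c - h_c * k_d: the enforcer pays h_c per neighboring defector."""
    return pi_c - h_c * k_d


def defector_utility_enforcement(pi_d, k_e, h_s):
    """u_d = pi_d - h_s * k_e: the defector loses h_s per neighboring enforcer."""
    return pi_d - h_s * k_e
```

With t=−150 and g=−10, the assumed Gompertz term is negligible when few neighbors cooperate and approaches h when nearly all of them do, so the social pressure switches on sharply as the cooperator fraction in the neighborhood grows.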
The coupled SES evolves as the agents update their strategies while utilizing the resource. At each time step, we employ the approach of asynchronous pairwise comparison. Specifically, a randomly chosen agent i compares its utility with that of a randomly chosen neighbor j. If the neighbor's utility is lower than its own utility, nothing happens and agent i maintains its previous strategy (strategy x). If the neighbor's utility is higher than its own utility, agent i copies the neighbor's strategy (strategy y) with a probability proportional to their utility difference normalized by $u_{\max}$, the maximal utility achievable. In order to prevent the system from being trapped in a state where all agents adopt the same strategy, we include a mutation mechanism where a random agent occasionally takes a random strategy. Note that mutation occurs very infrequently, on the order of one mutation every ten thousand time steps. Finally, we employ the following social and ecological parameters in our simulations of the evolution of the system: d=50; $R_{\max}$=200; γ=10; α=0.6; β=0.2; w=15; $e_c$=0.00966; $e_d$=0.03652; t=−150; g=−10; $h_c$=0.02. Note that we vary the values of $h_s$, h, and c during the simulations.
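The update rule described above can be sketched as a single asynchronous step. The sketch assumes a proportionality constant of one in the imitation probability, and treats a mutation as replacing the regular update in that step; both choices, the `utility` callback and the function names are illustrative assumptions rather than details given in the text.

```python
import random


def imitation_probability(u_self, u_neighbor, u_max):
    """Zero if the neighbor does no better; otherwise the utility gap
    normalized by the maximal achievable utility u_max."""
    if u_neighbor <= u_self:
        return 0.0
    return min(1.0, (u_neighbor - u_self) / u_max)


def async_step(strategies, neighbors, utility, u_max, strategy_set,
               mutation_rate=1e-4, rng=random):
    """One asynchronous update: a single agent revises its strategy in place.

    strategies: dict agent -> strategy label
    neighbors:  dict agent -> list of neighboring agents
    utility:    hypothetical callable(agent, strategies, neighbors) -> float
    """
    agents = list(strategies)
    if rng.random() < mutation_rate:
        # Rare mutation: a random agent adopts a random strategy.
        strategies[rng.choice(agents)] = rng.choice(strategy_set)
        return
    i = rng.choice(agents)
    if not neighbors[i]:
        return  # isolated node: nobody to compare with
    j = rng.choice(neighbors[i])
    p = imitation_probability(utility(i, strategies, neighbors),
                              utility(j, strategies, neighbors), u_max)
    if rng.random() < p:
        strategies[i] = strategies[j]
```

Because only one agent is revised per call, utilities are always evaluated on the current configuration, which is what distinguishes the asynchronous scheme from a synchronous sweep over all agents.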

Ostracism mechanism
In order to establish the phase diagram, the simulation was performed until the whole system has converged to a statistically stationary state. Spatial averaging has been performed at the boundary between phases of the phase diagram, which tends to be noisy. An examination of these stationary states revealed the presence of diverse stable phases across different values of the socio-ecological parameters. For the case where ostracism is the social control mechanism with ostracism strength h and resource inflow c as our respective social and ecological control parameter, we have uncovered four unique phases. They are the phase of pure cooperator (C); pure defector (D); coexistence of cooperator and defector (CD); and the bistable state (B). Previous analysis has shown that a larger resource inflow enhances resource availability which has the effect of tempting the agents to undertake defective behavior. This implies the emergence of the D phase at large resource inflow. On the other hand, a change in the ostracism strength modifies the social pressure to subdue the defective behavior, and a larger strength would be more effective against defection. Hence, the C phase is expected at a larger h. At intermediate c and h, the above two processes compete with each other and in consequence, we observe the B phase.
Let us begin by considering social interactions that are highly connected ($\langle k \rangle = 45$, N=50). The phase diagram of this situation is displayed in figure 1. We observe that the four phases occupy different regions in the phase diagram depending on the values of c and h. In the figure, the C phase is represented by the cyan region and the D phase by the blue region. The CD phase is given by the red region. Here, we find a single stable equilibrium state which consists of certain proportions of cooperators and defectors. The yellow region is special because it corresponds to the B phase, where there exist two stable equilibrium states instead of the single one found in the other three phases. In fact, the B phase could be a combination of two different stable equilibrium states.
We would like to highlight the existence of phase transitions between the phases. A phase transition is either continuous, which happens between phases with one stable equilibrium state (such as between the C and CD, or the CD and D phases), or discontinuous, which arises between a phase with one stable equilibrium state and a phase with two stable equilibrium states (such as between the C and B, CD and B, or D and B phases). In both the phase diagrams of figures 1 and 3, we have illustrated the transitions between the phases. We have used a solid line to represent a continuous phase transition and a dotted line for a discontinuous phase transition. Note that these lines are obtained analytically based on the pair approximation of the socio-ecological stability analysis conducted in section 5, and we observe good agreement between our theoretical analysis and the numerical results in the figure. Moreover, we notice the encroachment of two solid lines into the yellow region in figure 1. These two solid lines are related to the continuous phase transition of each of the two stable equilibrium states of the B phase (see figure 2).
In particular, the left solid line depicts a transition from the pure cooperator state to a coexistence of cooperators and defectors for one of the stable equilibrium states of the B phase, while the right vertical solid line marks the transition from a coexistence of cooperators and defectors to the pure defector state for the other stable equilibrium state. In general, we have different dynamical effects as we vary the two control parameters independently. Let us first consider fixing h and changing c. This is analogous to tracing the phase diagram horizontally. In the case of low ostracism strength, the system shifts from the C phase to the CD phase and then to the D phase as c increases, and all phase transitions are continuous (refer to figure 2(a) for h=0.1). Under this circumstance, the social pressure is too weak to prevent defector invasion, such that the fraction of cooperators reduces continuously with each increment of c. A further increment in h would lead to the same sequence of phase changes, albeit with a lower rate of conversion from cooperator to defector as c increases. When h is increased beyond a critical value, we observe the occurrence of bistability. Here, the spread of defection is contained effectively by the greater ostracizing effect of each neighboring cooperator. More importantly, the system now undergoes a discontinuous transition resulting in a collapse of cooperative behavior. In other words, the system now transits sharply to the other stable equilibrium state of the B phase, which possesses a majority of defectors (or even solely defectors). Once the transition takes place, the system does not return to the previous state when c is reduced, unless c is taken further down to another critical point, since there are not enough cooperators to ostracize the defectors.
These results illustrate the nonlinearity of the ostracism mechanism, whose effective operation depends on the number of cooperators surrounding a defector exceeding a threshold. We thus observe the occurrence of a hysteresis phenomenon, which results from the presence of two stable equilibrium states between the two critical points in the system (see figure 2(a), for h=0.34). Note that both stable equilibrium states typically consist of a coexistence of cooperators and defectors. However, one of the states has a majority of cooperators while the other has a majority of defectors. For very high ostracism strength, the point of continuous transition of the upper stable equilibrium state of the B phase (see figure 2(a), h=0.7) lies beyond the point of discontinuous transition where the equilibrium state with a majority of defectors transits sharply to the pure cooperator state. At the other end of the B phase, the continuous transition of the lower stable equilibrium state from a defector majority state to the pure defector state is observed to occur at a fixed c independent of h. On the other hand, the discontinuous transition where a cooperator majority state collapses to a pure defector state is found to happen at a much larger resource inflow $c'$ than this fixed c value, with $c'$ increasing as h increases. Next, let us discuss the situation of increasing h and fixing c, which is similar to tracing the phase diagram vertically from bottom to top. This vertical slice of the phase diagram is represented in figure 2(b), where we have plotted the analytical solution of $P_c$ against h. In the case of low resource inflow (c = 10), cooperativity increases continuously and monotonically as h increases.
For the case of intermediate resource inflow and beyond (i.e., $c \geq 20$), there occurs the interesting B phase. As shown in figure 2(b), when the whole society collapses into the D phase at c=50, cooperativity cannot be revived by varying h until a critical value is reached. This is because, no matter how strong h is, there are never enough cooperators to perform effective ostracism. The best way to escape from the D phase is to reduce the resource inflow (i.e., follow the horizontal path in the phase diagram). This result indicates that varying one parameter can be more effective than varying another in allowing the system state to escape from the basin of attraction of a particular stable phase (in this case, the D regime). Thus, a good understanding of which parameter to vary would greatly improve our management of CPR systems. Since the social system is coupled to the ecological system, any change in social behavior affects the amount of resource utilization, whether it is a gradual reduction or a sudden collapse. These social and ecological changes would influence the total payoff of the society.
Finally, we explore the effect of a change in network connection on the configuration of the phase diagram. If the average network degree were to decrease, we observe a corresponding reduction in the size of the B phase. In other words, the width of the hysteresis curve becomes smaller (see figure 3). This outcome is directly related to a decrease in social interactions within the society, as it affects the configuration of neighboring cooperators ostracizing a defector. When the average network degree is low ($\langle k \rangle = 5$), having one fewer cooperator would already cause a rather large reduction in the social pressure on the neighboring defectors. Conversely, any appearance of a cooperator would greatly strengthen the social pressure on the neighboring defectors. Therefore, the system would follow the same path whether we increase or decrease the temptation to defect through c. There is no hysteresis and no B phase, unless the ostracism strength is sufficiently strong to yield bistability. This explains the reduction in size of the B phase. A more rigorous analysis is performed via analytical approximation in section 5.

Voluntary enforcement
While we have shown that cooperation can emerge through the indirect social mechanism of ostracism, the question we address in this section is whether it can also emerge through a more direct mechanism. Here, we shall establish and explore such a direct mechanism, known as voluntary enforcement, which involves a more complex three-strategy system. In voluntary enforcement, we require two social parameters for its effective operation. The parameters are the cost of enforcement $h_c$ and the strength of sanction $h_s$. Again, we shall analyze the coupled socio-ecological dynamics of the system through the phase diagram, which is defined by the ecological parameter of resource inflow c and the social parameter of sanction strength $h_s$. As before, there is a competitive effect between c and $h_s$. An increase in c tends to enhance defective behavior, while a larger $h_s$ imposes a greater sanction on the defectors and hence damps the occurrence of defector invasion. We have fixed the cost of enforcement at a constant value, i.e., $h_c$=0.02. In our three-strategy system, we observe nine possible dynamical phases. They are the phase of pure cooperator (C); pure defector (D); pure enforcer (E); coexistence of cooperator and defector (CD); coexistence of cooperator and enforcer (CE); coexistence of defector and enforcer (DE); coexistence of cooperator, defector, and enforcer (CDE); the bistable state (B); and disequilibrium oscillatory dynamics (O).
We begin our analysis with a highly connected network ($\langle k \rangle = 45$). We notice that when both c and $h_s$ are small, the defector utility is low and hence there is no defector in this regime. Instead, we observe a coexistence of cooperators and enforcers. Without any defector, the utility of cooperators and enforcers is the same, which makes the two strategies indistinguishable. We can thus deem this CE phase (as indicated by the yellow region in figure 4) to be analogous to a pure cooperator phase. Since an increase in resource inflow would enhance defector utility, we expect an emergence of defectors to challenge the cooperator-enforcer (CE) dominance beyond a critical c. Once these defectors appear, they cannot be neutralized by the existing enforcers because the sanction is too weak. Moreover, it is very likely that the defectors are connected to the enforcers in a high average degree network. Each defector imposes an additional cost on the neighboring enforcers, which causes them to convert to either a cooperator or a defector. This leads to the CD phase (orange region in figure 4) with a coexistence of cooperators and defectors. If we were to further increase c, the utility of the defectors would surpass that of the cooperators, with the consequence of all cooperators becoming defectors. This gives the D phase, which is the light blue region in figure 4.
Let us now consider large $h_s$. In this case, the punishment from the enforcer is strong and it reduces the utility of the neighboring defector significantly. Nonetheless, the enforcer must also pay a social cost of $h_c$ to punish each neighboring defector. If the system is dominated by defectors, the total cost of enforcement for an enforcer is high and the enforcers cannot overcome the dominance of the defectors. On the other hand, if the system is dominated by enforcers, the tendency to defect succumbs to the total punishment from neighboring enforcers. Such a situation creates bistability, in which defector dominance and enforcer dominance can both be stable. The time evolution of these two different stable equilibrium states is shown in figure 5. This bistability region in parameter space is quite large and is represented by the brown region in figure 4. Interestingly, as we reduce the average network degree, the size of this B phase shrinks and we observe the emergence of a disequilibrium oscillatory O phase.
This brings us to examine the other extreme of low average network degree ($\langle k \rangle = 5$). As in the high average network degree case, we observe the occurrence of the yellow CE regime and, for low $h_s$, the orange CD phase and the light blue D phase (see figure 6). When $h_s$ becomes sufficiently large, the situation becomes more interesting. While there is again the competitive effect between a tendency to defect and a strong enforcement against defection, we observe an absence of a clear separation between defector dominance and enforcer dominance owing to the low average degree of network connection. This gives rise to a regime where cooperators, defectors and enforcers coexist without bistability. This is the CDE phase, which is indicated by the red region in figure 6. Note that such a competitive effect between $h_s$ and c can also lead to a disappearance of cooperators in the society if c is large enough. When this happens, we have the DE phase (light green region in figure 6). The competition between $h_s$ and c becomes even more interesting in a regime in which there is no longer any stable equilibrium state. In this non-equilibrium regime, which occurs for a range of c and $h_s$, we observe that the temptation to defect is large enough to lure a cooperator to become a defector. The number of defectors increases and defectors become a majority in the system. However, the sanction from the enforcers is large enough to reduce defector utility and deter the spread of the defectors. The number of defectors then decreases, with the enforcers gaining control of the situation. During this process, cooperators can sneak in to replace the enforcers because of their larger utility, which results from the enforcers having to pay the cost of punishing the defectors. As the number of enforcers decreases, the punishment becomes weaker and the population of defectors grows and dominates the system again.
This leads to a cyclical behavior in the dynamics, which we term a state of oscillatory disequilibrium (dark blue region in figure 6). Note that the time series of the six different phases in the scenario of low average network degree are shown in figure 7, with the one displaying cyclical behavior illustrated in figure 7(f). In the phase diagrams for $\langle k \rangle = 45$ and $\langle k \rangle = 5$, we have again included our analytical estimate of the parameter range (indicated by a line in the diagram) in which the phase transition occurs. The detailed analysis of our approximation is discussed in the next section. Due to the dynamical complexities of the three strategies, we have simplified our solutions by considering only the specific behavior of the transition from one phase to another at the boundary between the phases. Generally, we found the result to be either a continuous or a discontinuous transition, with the analytical solutions obtained being quite accurate only for certain phase transitions. As in section 3, a solid (dotted) line indicates a continuous (discontinuous) transition from one stable equilibrium state to another stable equilibrium state. Specifically, a point on the dotted line is the point of intersection between the stable and unstable solutions of the system for a given set of parameters.

The dynamics of two strategies via ostracism mechanism
We can express the updating mechanism described in section 2 in terms of a set of stochastic differential equations. Before we do that, we need to fix the notation of our analysis. Let $P_x$ be the probability of selecting a random agent with strategy x, and $q_{x|y}$ be the conditional probability of finding an agent with strategy x that is connected to another agent with strategy y. Then, the Bayesian identity dictates that $q_{x|y} P_y = q_{y|x} P_x = P_{xy}$, where $P_{xy}$ is the probability of having an xy link in the system.
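The identity above can be checked numerically on a small example. The sketch below measures $P_x$, $P_{xy}$ and $q_{x|y}$ on a k-regular ring lattice, where the identity holds exactly; the ring lattice, the convention of counting ordered pairs of linked agents for $P_{xy}$, and the function names are assumptions made for illustration (the model itself uses a random graph, for which the relation is an approximation).

```python
from collections import Counter


def ring_neighbors(n, k):
    """k-regular ring lattice: each node links to k/2 nodes on either side."""
    half = k // 2
    return {i: [(i + s) % n for s in range(-half, half + 1) if s != 0]
            for i in range(n)}


def pair_statistics(strategies, neighbors):
    """Return P_x, P_xy and q_{x|y} measured on a regular graph."""
    n = len(strategies)
    counts = Counter(strategies.values())
    P = {x: counts[x] / n for x in counts}
    # Count ordered pairs (i, j) of linked agents by (strategy_i, strategy_j).
    pairs = Counter((strategies[i], strategies[j])
                    for i in neighbors for j in neighbors[i])
    total = sum(pairs.values())
    P_xy = {xy: m / total for xy, m in pairs.items()}
    # q[(x, y)]: probability that a neighbor of a y-agent plays x.
    q = {(x, y): P_xy.get((y, x), 0.0) / P[y] for x in P for y in P}
    return P, P_xy, q
```

For instance, with ten agents on a ring of degree four, six of them cooperators and four defectors, `q[("C", "D")] * P["D"]` and `q[("D", "C")] * P["C"]` both equal `P_xy[("C", "D")]`.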
Since we have N agents in total, the number of agents with strategy x can be approximated as $N_x = N P_x$. We let $T^{+}(N_x)$ [$T^{-}(N_x)$] be the transition probability of increasing [decreasing] $N_x$ by one. From the transition probabilities, we can construct the master equation of the system as follows:
$$P^{\tau+1}(N_x) - P^{\tau}(N_x) = P^{\tau}(N_x-1)\,T^{+}(N_x-1) - P^{\tau}(N_x)\,T^{-}(N_x) + P^{\tau}(N_x+1)\,T^{-}(N_x+1) - P^{\tau}(N_x)\,T^{+}(N_x), \qquad (1)$$
where τ represents the simulation time step. We rescale this master equation with the population size N such that $P_x = N_x/N$ and $t = \tau/N$. We then obtain
$$P\left(P_x, t+\tfrac{1}{N}\right) - P(P_x,t) = P\left(P_x-\tfrac{1}{N}, t\right) T^{+}\left(P_x-\tfrac{1}{N}\right) - P(P_x,t)\,T^{-}(P_x) + P\left(P_x+\tfrac{1}{N}, t\right) T^{-}\left(P_x+\tfrac{1}{N}\right) - P(P_x,t)\,T^{+}(P_x). \qquad (2)$$
We next perform a Taylor expansion with respect to t ($P_x$) up to the $1/N$ ($1/N^2$) term on the left (right) of equation (2), and neglect the higher-order terms in order to obtain the following Fokker-Planck equation [51]:
$$\frac{\partial P(P_x,t)}{\partial t} = -\frac{\partial}{\partial P_x}\Big\{\big[T^{+}(P_x)-T^{-}(P_x)\big]\,P(P_x,t)\Big\} + \frac{1}{2N}\frac{\partial^2}{\partial P_x^2}\Big\{\big[T^{+}(P_x)+T^{-}(P_x)\big]\,P(P_x,t)\Big\}, \qquad (3)$$
which is equivalent to the associated nonlinear Langevin equation,
$$\frac{dP_x}{dt} = T^{+}(P_x) - T^{-}(P_x) + \sqrt{\frac{T^{+}(P_x)+T^{-}(P_x)}{N}}\;\eta(t), \qquad (4)$$
whose noise term $\eta(t)$ is linked to the distribution $P(P_x,t)$. In the thermodynamic limit $N \to \infty$, the diffusion term vanishes. The nonlinear Langevin equation simplifies to
$$\frac{dP_x}{dt} = T^{+}(P_x) - T^{-}(P_x), \qquad (5)$$
which takes a model-specific form, equation (6), for the cooperative strategy in our model. The condition of equilibrium, equation (7), is satisfied by setting $dP_c/dt = 0$. By means of a probability identity for $P_{cd}$, we can rewrite equation (7) in a form that can be expanded in terms of $r_i$, the probability that the fraction $n_c^i$ of cooperators occurs in the defector neighborhood. This expansion also involves the amount of ecological resource at equilibrium, which is obtained by setting $dR/dt = 0$. The probability distribution that a defector is connected to i nearest-neighbor cooperators is given by equation (15). By putting the approximation given by equation (15) into (9), we then solve for a set of stable and unstable fixed points. The stability of these fixed points is determined through equation (6). We first express the right-hand side of this equation in terms of $P_c$ using equations (13), (14) and $P_d = 1 - P_c$. This allows us to plot the curve of $dP_c/dt$ against $P_c$ (see figure 8).
The fixed points are determined from the intersection of the curve of $dP_c/dt$ against $P_c$ with the line $dP_c/dt = 0$. As shown in figure 8, the stability of a fixed point is evaluated by ascertaining whether a perturbation from the fixed point leads to convergence towards it (stable fixed point) or divergence from it (unstable fixed point). Note that a similar stability analysis is performed for the voluntary enforcement mechanism discussed in the next section.
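The graphical procedure above (locate the zero crossings of the drift, then check the direction of the flow on either side) can be sketched as follows. The cubic `drift` is a hypothetical stand-in for the right-hand side of equation (6), chosen only because it has three fixed points inside the unit interval; the function names are likewise assumptions of this sketch.

```python
def drift(p):
    """Hypothetical dP_c/dt with fixed points at 0.2, 0.5 and 0.9."""
    return -(p - 0.2) * (p - 0.5) * (p - 0.9)

def fixed_points(f, n_grid=400, tol=1e-10):
    """Scan [0, 1] for zeros of f and refine each sign change by bisection."""
    xs = [i / n_grid for i in range(n_grid + 1)]
    roots = []
    for a, b in zip(xs[:-1], xs[1:]):
        fa, fb = f(a), f(b)
        if fa == 0.0:
            roots.append(a)  # a grid point that is itself a fixed point
        elif fa * fb < 0.0:
            lo, hi = a, b
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    if f(xs[-1]) == 0.0:
        roots.append(xs[-1])
    return roots

def classify(f, p, eps=1e-4):
    """Stable if the flow points towards p from both sides, unstable if away."""
    left, right = f(p - eps), f(p + eps)
    if left > 0.0 > right:
        return "stable"
    if left < 0.0 < right:
        return "unstable"
    return "semi-stable"

roots = fixed_points(drift)
labels = [classify(drift, p) for p in roots]
for p, label in zip(roots, labels):
    print(round(p, 6), label)
```

With this illustrative drift, the outer two fixed points are stable and the middle one is unstable, which is the arrangement that underlies a bistable regime.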
Note that the set of stable fixed points forms the stable equilibrium states of the system. The point at which a continuous transition between phases occurs is found through the intersection between the stable solution and $P_c = 0$ (or $P_c = 1$). Specifically, this transition is between the CD phase and the D phase (or the C phase). On the other hand, the point of discontinuous transition is obtained from the intersection between the stable and unstable solutions, i.e., at the point of the fold bifurcation. At this point, the stable and unstable solutions collide and annihilate. Therefore, beyond the fold bifurcation, the system collapses abruptly to another stable fixed point, causing a discontinuous transition between the phases. Note that these analytical predictions are borne out by the numerical results, as shown in figures 1 and 3.
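The way a fold bifurcation is located from the merging of a stable and an unstable solution can be sketched with a toy drift. Here `drift(p, mu)` and the control parameter `mu` are hypothetical (they do not correspond to the model's payoff structure): the state p = 0 is always a fixed point, and for 0 < mu < 0.25 there is an additional unstable and stable pair at p = 0.5 - sqrt(mu) and p = 0.5 + sqrt(mu), which collide at mu = 0.

```python
def drift(p, mu):
    """Hypothetical drift with a fold bifurcation at mu = 0."""
    return p * (mu - (p - 0.5) ** 2)

def interior_roots(mu, n_grid=2000):
    """Zeros of the drift inside (0, 1), found from sign changes on a midpoint grid."""
    # Midpoints are used so that the boundary fixed point p = 0 is excluded.
    grid = [(j + 0.5) / n_grid for j in range(n_grid)]
    roots = []
    for a, b in zip(grid[:-1], grid[1:]):
        fa, fb = drift(a, mu), drift(b, mu)
        if fa * fb < 0.0:
            roots.append(a - fa * (b - a) / (fb - fa))  # linear interpolation
    return roots

def locate_fold(mu_values):
    """Return the midpoint of the step at which the root pair disappears."""
    previous = None
    for mu in mu_values:
        n_roots = len(interior_roots(mu))
        if previous is not None and previous[1] == 2 and n_roots == 0:
            return 0.5 * (previous[0] + mu)
        previous = (mu, n_roots)
    return None

mu_sweep = [i / 1000 for i in range(200, -101, -5)]  # from 0.2 down to -0.1
mu_fold = locate_fold(mu_sweep)
print("estimated fold at mu =", mu_fold)
print("branches at mu = 0.04:", interior_roots(0.04))
```

Past the estimated fold only the boundary state p = 0 remains, so a system sitting on the upper stable branch would drop to it abruptly; this is the toy analogue of the discontinuous transition described above. The resolution of the estimate is limited by the step of the parameter sweep.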

The dynamics of three strategies via voluntary enforcement
Let us now reapply the equilibrium condition of equation (5) and the solution given by equation (10) to voluntary enforcement. This involves an analytical approximation of a three-strategy system, whose dynamical evolution is given by equation (5) applied to each of the strategies. To solve these equations, we need to know the transition probabilities of each specific updating mechanism. In general, it is difficult to determine these transition probabilities, and this is also true for the mechanism of voluntary enforcement. Hence, our approach is to examine the transition between phases locally at the boundary. This simplifies the analysis, as we only focus on one specific phase transition within the phase diagram at a time. We first apply this approach to the boundary between the C and CD phases, and to that between the CD and D phases. In this restricted scenario, the system has no enforcer. We can capture this situation by employing equation (8), whose solutions are given by the two vertical lines in the phase diagrams of figures 4 and 6. Note that these vertical-line solutions represent continuous transitions, which are obtained from the intersection between the stable solution and $P_c = 0$ (or $P_c = 1$). They are only applicable at the boundary between the C phase and the CD phase, or between the CD phase and the D phase. Beyond these regimes, the vertical-line solutions are no longer relevant and should be ignored.
We next consider the interface between the CE phase and the phase where cooperators, defectors, and enforcers coexist. Here, we assume that there are only a few enforcers and a few defectors amidst a cooperator majority near the boundary. Because there are so few of them, we surmise that each enforcer is connected to only one defector, and each defector is connected to only one enforcer. The solution of the equation that describes this condition is represented by the curved solid line, which is a continuous transition separating the CE and B phases in figure 4, and the CE and CDE phases in figure 6. Note that beyond the boundary and as c increases, we observe an increasing dominance of the enforcers in both the systems of $\langle k \rangle = 45$ and $\langle k \rangle = 5$. Let us now introduce a small number of enforcers within a cooperator-defector majority. This situation corresponds to the boundary of the CD and D phases with the B phase in figure 4 (or with the CDE phase in figure 6). Note that the enforcers cannot invade the cooperators because they suffer the cost of punishing the neighboring defectors while having the same payoff as the cooperators. The enforcers can only affect the defectors by reducing their payoff through sanctioning actions. This suggests a relationship between the enforcer and defector strategies in which we have assumed that the defector is connected to only one enforcer due to the low population of enforcers, while an enforcer is surrounded by $(k-1)\,q_{d|e}$ defectors. The solutions of this relationship can take either of two forms in the phase diagram: a dotted-line boundary (see figure 4) or a solid-line boundary (see figure 6). The former represents a discontinuous transition, which is obtained from the intersection between the stable and unstable solutions, while the latter is a continuous transition, which is derived in the manner described in the paragraph above.
Finally, we estimate the boundary of the disequilibrium oscillatory phase. Near this phase, we expect a small number of defectors and cooperators. We therefore assume that the enforcer is connected to one defector while the defector is connected to $(k-1)\,q_{e|d}$ enforcers, and the equation for this circumstance gives the estimated boundary. In summary, the combination of all the estimates for each specific situation approximates the transition boundaries between the phases for the three-strategy system. Here, we observe good correspondence between our analytical predictions and the numerical results. In fact, we can also utilize this analytical approximation to understand the underlying mechanism that drives the dynamics of the disequilibrium oscillatory phase (refer to figure 9). Without loss of generality, let us consider the case where the parameters have been set at c = 50 and $h_s = 0.12$. Note that in the following analysis, we have adopted a quasi-static approach since the dynamics is no longer at equilibrium. From this perspective, we first plot the stable and unstable fixed point solutions (blue curve in figure 9) by fixing $h_s = 0.12$ while varying c. We have set the y-axis as $P_e$, with $P_e = 1 - P_d - P_c$. In the figure, we have marked the stable state at c = 50 by a circle (position 1), such that the system now resides in the CDE state quasi-statically. Since the payoff of the cooperator is higher than that of the enforcer (because the latter bears the cost of punishing the defectors), we expect a gradual transformation of strategy from the enforcer to the cooperator. This dynamics can be illustrated by assuming that 20% of the enforcers have changed to the cooperator strategy. We plot the stable and unstable solutions for this case (red curve), which shows a reduction in the number of enforcers at the stable fixed point. This stable state at c = 50 is marked by a circle (position 2).
Again, this state is quasi-static and it eventually exceeds the bifurcation point as the system progresses. The consequence is a collapse of this state to a state of total defection (position 3). The total defection state is unstable in the O phase. Upon the introduction of a small number of enforcers (in lieu of the mutation mechanism), the system moves to a DE state (position 4 on the green curve). The DE state is not stable, and with the appearance of a few cooperators, the system returns to the quasi-static condition of the blue curve. Under this condition, the system flows back to the original CDE state (position 1) and the cycle repeats itself.

Discussion and conclusion
By investigating the effects of a direct (voluntary enforcement) and an indirect (ostracism) social control mechanism on a society of interacting agents, we have developed an understanding of the cooperative dynamics that underpins the coupled SES. When the level of social interaction is high, as exemplified by a network of high average degree, the phase diagrams show that both mechanisms give rise to similar cooperative behavior. In particular, a bistable (B) phase is present in both cases. Notably, the B phase contains hidden systemic risk, since an onset of external pressure may tip the whole system into an alternative stable state. Once the system undergoes such a regime shift, it cannot easily revert to its original state because of hysteresis. This shift is very abrupt and can be disastrous. Thus, for a network of high average degree, the greater chance of being socially connected becomes a double-edged sword: defection is either easily induced through the temptation of resource abundance, or circumvented by the underlying social control mechanism, as long as c and h (or $h_s$) are sufficiently large. These dual possibilities lead to the observed B phase. On the other hand, if the strength of social control goes beyond (or falls below) the effect that results from resource inflow, we would observe the occurrence of the phase of social cooperation (or defection).
When social interaction (i.e., the average network degree) is reduced, the B phase region is observed to shrink. With the disappearance of the B phase, the abrupt phase transition and its associated regime shift are replaced by a continuous phase transition. This phenomenon occurs in the case of the ostracism mechanism. For voluntary enforcement, the situation is more interesting. Here, we observe that the disappearance of bistability is accompanied by the appearance of a new type of dynamical phase. This new phase is oscillatory (we term it the disequilibrium oscillatory phase), with a variation in the proportion of cooperators, defectors, and enforcers as each strategy tries to gain dominance over the others. It should be noted that the outcome exhibited by either of the social control mechanisms is a consequence of the fact that more meagre interactions tend to bias one phase over the other. In the case of direct social control, the inclusion of a specific enforcement entity that pays a social cost for its action creates additional dynamics. This dynamics is the source of the observed oscillatory behavior, rendering the direct voluntary enforcement mechanism more difficult to control than the indirect ostracism mechanism.
In conclusion, we have found that incorporating social interactions among agents that are coupled to the resource environment creates a scenario of multiple phases. The phases are observed to consist of stable states or even an oscillation between states, which is in direct contrast to the standard premise of a phase of general equilibrium in traditional system theory. Through a comparison of the outcomes of different social control mechanisms, we have gained new insights into the intricate dynamical interactions that exist between the agents as they are affected by their social network as well as by the social and ecological parameters of the coupled SES. Moreover, our analysis has shed light on the details of how phase transitions occur within the system, whether abruptly as a discontinuous transition or smoothly as a continuous transition. We expect that this new understanding would improve our ability to manage and control diverse coupled SES: by promoting good transitions and avoiding bad transitions among the phases, and by navigating carefully if the system is on the verge of a discontinuous phase transition.