Enforcement and inequality in collective PES to reduce tropical deforestation: Effectiveness, efficiency and equity implications

how inequality in wealth, framed as differences in deforestation capacity, affects policy performance. We find that introducing individual level sanctions can improve the effectiveness, efficiency and equity of collective PES, but there is no silver bullet that consistently improves all 3Es across country sites. Public monitoring reduced deforestation and improved the equity of the program in sites with a stronger history of collective action. External sanctions provided the strongest and most robust improvement in the 3Es. While internal, peer enforcement can significantly reduce free riding, it does not improve the program's efficiency, and thus participants' earnings. The sanctioning mechanisms failed to systematically improve the equitable distribution of benefits because punishments were ineffective in targeting the largest free-riders. Inequality in wealth increased group deforestation and reduced the efficiency of Community enforcement in Indonesia but had no effect in the other two country sites. Factors explaining differences across country sites include the history of collective action and land tenure systems.


Introduction
Tropical deforestation is the largest source of carbon emissions from Agriculture, Forestry and Other Land Use (AFOLU) activities (IPCC, 2019), and it also drives biodiversity loss (Gibson et al., 2011) and threatens the livelihoods of local communities (Angelsen et al., 2014). To meet the global climate, biodiversity and sustainable development goals, adequate policies for reducing deforestation need to be implemented at regional and local scales (Ostrom, 2010). Among the set of policy options to reduce deforestation are positive incentives ("carrots"), which aim to increase the welfare of forest users by incentivizing or rewarding their conservation activities, and disincentives ("sticks"), which aim to deter deforestation activities by punishing or increasing the cost of non-environmentally friendly behaviour (Börner et al., 2020).
Payments for Ecosystem Services (PES) programs are positive incentives that reward forest users conditional on conservation performance. They consist of voluntary agreements at the individual or group level, under which the providers agree to supply ecosystem services in exchange for payments (Wunder, 2015). PES are a commonly used tool in the efforts to reduce deforestation (Min-Venditti et al., 2017; Salzman et al., 2018) and a key component of Reducing Emissions from Deforestation and forest Degradation (REDD+) initiatives worldwide. Collective PES are characterized by assigning the payment to a group instead of an individual, based on their collective performance (Hayes et al., 2019; Pfaff et al., 2019). Collective PES are preferred when land is managed under collective ownership, when individual actions are hard to identify, or when spatial coordination of conservation activities is particularly important, such as in watershed or biodiversity management (Engel, 2016).
Although collective PES help solve the global collective action problem of forest conservation, they face a number of challenges to provide effective, efficient (i.e., cost-effective) and equitable outcomes (3E) (Angelsen and Wertz-Kanounnikoff, 2008) at the local level. First, they create a local collective action problem: the individual compensation from collective PES is only partly conditioned on individual behaviour (Hayes et al., 2019). Participants have an incentive to free ride on others' conservation actions, which can decrease the overall effectiveness of the policy as compared to an individual based PES (Gatiso et al., 2018; Hayes et al., 2019; Kerr et al., 2012; Midler et al., 2015; Narloch et al., 2012; Ngoma et al., 2020). Second, a related challenge is to balance conservation costs and benefits in a way that is equitable among program participants (Hayes et al., 2019; Hayes and Murtinho, 2018). Collective PES are likely to be implemented in communities with heterogeneous participants in terms of household labour, capital and physical access to forests, which can in turn affect policy performance as well as exacerbate existing inequalities (Andersson et al., 2018b).
Stronger monitoring and enforcement (introducing individual "sticks" with the collective "carrots") can help navigate these interrelated challenges, as it reduces the incentives to free ride. However, strong monitoring and enforcement involves additional implementation costs (Börner et al., 2014). Thus, higher program effectiveness and equity might reduce economic efficiency (Pascual et al., 2010; Wu and Yu, 2017), yet there are few empirical evaluations of such trade-offs. In this article, we compare how different monitoring and enforcement strategies perform in terms of the 3Es in a collective PES. We define effectiveness as the degree to which deforestation is reduced from a baseline level. Efficiency is the degree to which the monitoring and enforcement achieve conservation outcomes for the least cost, from the perspective of the community members. Equity has both a distributional and procedural dimension, and thus includes the distribution of earnings amongst PES participants as well as their fairness perceptions (Lliso et al., 2021; Loft et al., 2017; Pascual et al., 2010).
We conducted a framed field experiment (FFE) in three countries with high forest cover but different local governance contexts: Brazil, Indonesia and Peru. We compare three strategies to reduce the free rider problem in a collective PES: (i) Public monitoring of individual deforestation, (ii) monitoring with peer sanctions (Community enforcement) and (iii) monitoring with external sanctions (Government enforcement). We also evaluate whether inequality in wealth, framed as differences in deforestation capacity, affects the performance of a collective PES. Recent research suggests inequality might affect program and institutional performance (De Geest and Kingsley, 2021; Nockur et al., 2021). Even though a number of economic experiments have examined the effects of economic inequality on cooperation (De Geest and Kingsley, 2019; Hauser et al., 2019; Kingsley, 2016; Tavoni et al., 2011), few have tested it with actual natural resource users (Loft et al., 2020; Narloch et al., 2012; Vorlaufer et al., 2017), and none have examined the question across multiple countries.

Reducing the free-rider problem
Collective PES programs in which it is hard to exclude community members from the benefits of the collective payment are similar to the common-pool resource (CPR) problem; the benefit individuals receive from the group compensation is not proportional to the individual conservation actions (Hayes et al., 2019; Martin et al., 2014). To maximize their own net earnings, individuals can free ride by appropriating the common pool resource (i.e., deforesting), creating a negative externality on the rest of the group by reducing the collective payment.
A central strategy to reduce free riding is to increase its cost by introducing sanctions. The first type of sanction that we evaluate is the non-monetary sanction of publicly revealing individual deforestation decisions, which can induce guilt or pride (Masclet et al., 2003; Lopez et al., 2012). We also consider two monetary sanctions that can be classified at the opposite sides of a governance spectrum: (i) a centralized, external sanctioning institution, and (ii) a decentralized, internal sanctioning institution in which community members sanction their peers. The experimental literature indicates that in general, when faced with the threat of an external, centralized sanction, participants significantly increase cooperation (Cardenas, 2004; Gelcich et al., 2013; Lopez et al., 2012; Rodriguez-Sickert et al., 2008; Velez et al., 2010; Vollan et al., 2019). This is consistent with non-experimental evidence showing how law enforcement by authorities provides effective results to reduce tropical deforestation (Busch and Ferretti-Gallon, 2017; Tacconi et al., 2019). Even though the expected net benefit of free-riding decreases as the probability of the external sanction increases, experiments show that the probability of the sanctions does not greatly affect their overall effectiveness (Cardenas, 2004; Lopez et al., 2012).
Both monetary sanctioning strategies have potential shortfalls. External sanctions might undermine the legitimacy and liberty of participating communities, potentially crowding out motivations for cooperative behaviour (Cardenas et al., 2000; Kube and Traxler, 2011; Lopez et al., 2012). Furthermore, in many situations, external regulations and sanctioning are hard to implement, because of costly monitoring, lack of political interest, or corruption (Karsenty and Ongolo, 2012; Sundström, 2015). In turn, when individuals must regulate common-pool resource use on their own, they incur monitoring and enforcement costs. If these costs are too high, they erode the benefits of more cooperation (Ostrom et al., 1992). The effectiveness of each sanctioning strategy has been evaluated in the context of homogeneous populations in experimental games (see Vollan et al., 2019), but there is no research evaluating how they perform relative to each other in terms of the 3Es and with heterogeneous populations.

The effect of economic inequality in management of the commons
It has long been recognized that agent heterogeneity and inequality affect the level of cooperation in social dilemmas, but in ambiguous ways (Agrawal, 2001; Baland and Platteau, 1999). Broadly, three types of inequalities can affect collective action: inequality in wealth or endowments, inequality in interests or incentives, and inequality in identity (Baland and Platteau, 1996). Critical factors that determine the effect of inequality on commons outcomes include the incentive structure facing the participants (e.g., individual endowments) and the characteristics of the public good, such as whether it creates positive or negative externalities, or whether it offers the same returns to all participants (Dayton-Johnson and Bardhan, 2002).
Inequality has positive effects on collective action if the wealthiest agents face stronger incentives to cooperate, for example, by receiving a larger share of the benefits from the common pool. In such cases, elites have a stronger interest in collective action, and thus involve themselves more actively in setting rules and enforcing them (Baland and Platteau, 1999). Similarly, inequality in opportunity costs of conservation of a CPR (i.e., the returns to the best outside option) increases cooperation, as players with more valuable external options put less pressure on the common resource (Cardenas et al., 2002). Further, an increase in wealth inequality leads to reduced deforestation when the demand for the common resource is increasing at a decreasing rate with wealth (Alix-Garcia, 2008). In this case, more inequality entails less overall deforestation because the poor reduce their deforestation by more than the wealthy increase theirs.
Other evidence suggests that economic heterogeneity has negative effects on the commons. For example, there is less collective action in groups with unequal landholdings (Adhikari and Lovett, 2006; Varughese and Ostrom, 2001), and more deforestation in countries with higher inequality (Ceddia, 2019; Koop and Tole, 2001). Fairness and equity considerations are important determinants of people's behaviours and affect cooperation rates (Almås et al., 2010; Fehr and Schmidt, 1999). In experimental games, inequality in endowments or returns from the public good creates trade-offs between an efficient and an equitable distribution of benefits (Kingsley, 2016; Koch et al., 2021; Nikiforakis et al., 2012). Participants with higher endowments place higher value on efficiency while those with lower returns prioritize equity (Nikiforakis et al., 2012). Inequality in endowments also has negative effects on cooperation by creating distinct social identities (Martinangeli and Martinsson, 2020), decreasing levels of trust or social preferences amongst group members (Andersson and Agrawal, 2011), or reducing the positive effects of communication (Cardenas, 2003; Gangadharan et al., 2017).
In sum, the impact of inequality on the commons greatly depends on the type of inequality, the degree of inequality, the preferences and characteristics of the group, and the broader socioeconomic and institutional context. In observational studies, the effect of economic inequality on commons outcomes is hard to identify, because different types of inequalities interact simultaneously. For example, inequality in endowment coupled with inequality in the marginal benefits from the public good can have positive effects on cooperation, but negative effects when only one type of inequality is present (Hauser et al., 2019; Naidu, 2009). Experimental methods reduce such potential sources of bias. In this paper, we use experimental data to focus on how inequality in wealth, framed as the 'capacity to deforest', affects participation in a collective PES.

Framed field experiments and the study sites
Framed field experiments (FFEs) engage real stakeholders who have experience with the problem at hand. They recreate the decision-making situation in a controlled, hypothetical setting but with real (cash or in-kind) incentives, thus serving as a testbed of alternative real-world policy interventions (Shreedar et al., 2020). Participants bring their own experiences and values, which increases the external validity of the results (Anderies et al., 2011; Cardenas and Carpenter, 2008; Finkbeiner et al., 2018; Gelcich et al., 2013). FFEs never fully capture all the nuances of the actual field settings, but they offer the advantage of manipulation and random assignment of treatments in a controlled setting (Ostrom, 2006), and allow for replication and direct comparison among different groups or samples. While it is impossible to capture the precise magnitudes of the treatments that could be observed in natural environments, the significance and direction of the effects in field experiments are relevant to capture (Kessler and Vesterlund, 2015). Simplified experimental games help identify general principles and patterns of behavior.
An important question of collective PES is how they perform in different local governance contexts (Hayes et al., 2019). The three sites selected for the study, in Pará (Brazil), Central Kalimantan (Indonesia) and Ucayali (Peru), have characteristics that make them relevant for a comparison of the effects of a collective PES under different sanctioning institutions. At the country level, the selected villages share similar socioeconomic and institutional characteristics, such as drivers of deforestation and poverty levels (Sills et al., 2017). However, the country sites show differences in local reliance on forests and land tenure systems. Forests are owned communally in the Peruvian site, in the Indonesian site the land is owned by the state, while at the site in Brazil land is owned individually by colonist farmers. In the Peruvian and Indonesian sites, households have community level institutions for collective decision-making, while in Brazil there are no such institutions. Households control, on average, an area of ~2.0 ha for subsistence and commercial agriculture in the Peruvian and Indonesian sites, while in the Brazilian site, households control, on average, an area of 44.8 ha of forest and 38.7 ha of agricultural land, mostly pastures. In Brazil and Peru, land tenure is in most cases considered secure, in the sense that collective and individual boundaries of properties are legally recognized. By contrast, tenure is considered weak in the Indonesian site because villages and households do not have legal recognition of the land they manage and forest access is based on local customary laws, which give individuals a land claim when they have invested in that land (e.g., planting, clearing land) (Sills et al., 2014). Furthermore, deforestation activities by smallholders serve different economic purposes. In Indonesia, the production is mostly for subsistence consumption, while in Peru, and even more so in Brazil, it is conducted for market purposes. Average household deforestation is higher in Brazil (1.8 ha yr⁻¹) than in Peru (0.43 ha yr⁻¹) and Indonesia (0.04 ha yr⁻¹). Agricultural income share is higher in Peru (20.3%) than in Brazil (16.2%) and Indonesia (9.7%), while the livestock income share is much higher in the Brazilian site (47.4%) than in the Peruvian (6.4%) and Indonesian (4.7%) sites. Income inequality is highest in Brazil, while inequality in assets and land is highest in Indonesia (see Supplementary Information (SI), section B4 for a detailed description of the study sites).

The basic experimental set-up
The FFE was implemented with 720 participants in 24 villages between October 2019 and January 2020, equally split between the three country sites. Five experimental sessions were conducted in each village, summing up to 30 participants per village (see SI, section B4). The average age of the participants was 44 years, and 52% of them were men.
In the experiment, a group of six forest users shared access to a forest under a collective PES. In each round the participants simultaneously chose how many forest plots they would transform to agricultural land (croplands and pastures). Individual earnings depended on how many plots each participant had deforested and on how many forest plots were left standing once all participants had made their decisions. This framing is relevant for how collective PES operate on the ground: in many cases, benefits are distributed equally amongst participants, while cooperation and willingness to join varies amongst them (Hayes et al., 2019). We introduced the collective PES in the baseline stage; therefore, we did not evaluate the additionality of the collective PES as compared to a pure open-access situation, as the topic has been well explored in other experimental studies (Andersson et al., 2018a; Handberg and Angelsen, 2019; Kaczan et al., 2017; Moros et al., 2019; Ngoma et al., 2020). Rather, we focused on identifying and comparing strategies to mitigate the local free-rider problem identified in collective agreements.
The experiment consisted of four stages with six rounds each. In the first stage, we introduced the baseline with the collective action problem. With a total stock of forest plots equal to S, and given the maximum allowed number of plots to deforest x_i, the monetary pay-off during the baseline stage for participant i in round t, expressed relative to the return from one converted plot, was:

$$\pi_{it} = x_{it} + \delta \left( S - \sum_{j=1}^{n} x_{jt} \right), \qquad 0 \le x_{it} \le x_i$$

The two conditions necessary for creating a social dilemma are that: (i) the return of deforestation of forest land x_it is higher than the individual return of the collective PES (δ < 1), and (ii) the individual return from deforestation is lower than the group benefits from the collective PES (nδ > 1), with n being the number of forest users. Thus, the parameters must satisfy the condition δ < 1 < nδ. The levels of the parameters were set at S = 60 and δ = 0.4. We specified that each forest plot was equivalent to 0.5 ha. Considering individual pay-off maximizing users, the Nash Equilibrium, defined as the set of strategies where no one has an incentive to change their behaviour, occurs when everyone maximizes deforestation. However, from the perspective of the group, the best strategy is when there is no deforestation at all, as it yields higher returns than the Nash equilibrium. Thus, self-maximizing individual strategies lead to outcomes that are not socially optimal and lower individual earnings.
Inequality in wealth, or in the "capacity to deforest", was introduced by modifying the maximum number of forest plots that a participant could convert to agricultural land. Our inequality treatment was framed in terms of differences between households in the capital needed to establish agricultural plots. In half of the experimental sessions, the Unequal groups, three randomly chosen "low capacity" participants could deforest a maximum of four plots (equivalent to 2 ha), and three "high capacity" participants could deforest up to eight plots (4 ha). In the Equal groups, all participants had a "medium capacity" to deforest six plots (3 ha). To strictly focus on the effects of inequality in wealth (i.e., individual endowments), the same aggregate deforestation capacity was maintained in Equal and Unequal groups. Further, the marginal benefits of deforestation were kept constant and equal across participants. Hence, the cooperation incentives were the same for every participant.
A major rationale for implementing collective PES is that it reduces monitoring and enforcement costs as compared to individual PES. We thus assumed that the group deforestation was perfectly monitored, and PES was fully enforced at the group level. This also made the experiment easier for participants to understand. Throughout the experiment the PES payment was distributed equally among participants, as communities with collective PES often distribute the earnings to individuals on egalitarian principles, not based on individual contributions (Hayes et al., 2019; Robinson et al., 2016). Although payments can be subject to elite capture (Andersson et al., 2018b; Persha and Andersson, 2014), we retain the same return to be able to identify the effect of unequal wealth distribution.
The stock of forestland was reset in every round, to avoid effects due to accumulated forest loss. Each plot of agricultural land was worth 10 points, while each plot of forest gave 24 points to the group, equivalent to 4 points to each player. In other words, the collective PES covers the opportunity costs of conservation at the collective level, but not at the individual level, creating the social dilemma.
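The dilemma can be illustrated with a minimal sketch of the baseline payoff in points. It assumes 10 points per converted plot (the value shown on the deforestation stickers), 4 points per standing plot for each of the six players, and a stock of 60 plots; the function and constant names are illustrative and not taken from the experimental script.

```python
# Baseline payoff in points for one participant in one round (illustrative sketch).
S = 60                   # total forest plots, reset every round
AGRI_POINTS = 10         # private points per converted plot (assumed from the sticker value)
FOREST_POINTS_EACH = 4   # 24 group points per standing plot, split among 6 players


def baseline_payoff(own_plots, others_plots):
    """Points earned when a player converts own_plots and the rest convert others_plots."""
    standing = S - own_plots - others_plots
    return AGRI_POINTS * own_plots + FOREST_POINTS_EACH * standing


# Equal group: every player can convert up to 6 plots.
nash = baseline_payoff(6, 30)        # everyone converts the maximum
optimum = baseline_payoff(0, 0)      # nobody converts
free_rider = baseline_payoff(6, 0)   # one player converts while the others conserve

print(nash, optimum, free_rider)
```

Each converted plot adds 10 − 4 = 6 points to the converter's own earnings but removes 24 − 10 = 14 points from the group total, which is why full conversion is individually rational yet collectively worse than full conservation.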
In all sessions, each participant had a payoff table indicating his/her earnings as a function of his/her and others' decisions. Visual support was provided to explain the collective action dilemma, using a cardboard with 60 green squares. Each square represented a forest plot, and showed the group payoff of 24 points, and the individual payoff of 4 points. Whenever deforestation took place, yellow paper stickers indicating the individual payoff of 10 points replaced the green squares. Before the baseline stage started, the structure and procedures of the common-pool resource were carefully explained, and any questions raised were addressed (see section B6 of SI for the script).
Participants knew who the other members of the group were, thus bringing their expectations and relationships with each other to the experiment. Individual actions remained anonymous to avoid post-experimental effects, such as retaliation, and to better capture individual preferences without the confounder of social pressure. While some individual deforestation decisions in real life can be visible to neighbours and authorities, operating in an anonymous environment is relevant as some decisions are not fully open: for example, when farmers try to "hide" their deforestation by converting forest far from the forest edge.
To conserve anonymity and reduce spillovers throughout the stages, each participant was represented by a letter of the alphabet, only known to the participant and the experimenter, and the letter was changed in each stage. No verbal communication between participants was allowed for multiple reasons. First, communication cannot be assumed a priori in our research sites: the study sites do not have the same local institutions that allow members to discuss and collaborate. Further, verbal communication is a well-researched treatment found to increase cooperation in experiments (Chaudhuri, 2011; Ostrom, 2006), also in the context of collective agreements (Midler et al., 2015; Rodriguez et al., 2021). Experiments are most useful when they incorporate prior knowledge (Ludwig et al., 2011). The comparative impacts of increasing monitoring and enforcement in collective PES are less explored. Without communication we were able to clearly identify individual motivations to respond to different types of sanctions. Finally, no verbal communication reduced the risk of losing anonymity during the experiment by revealing own decisions or deforestation capacity.

The monitoring and enforcement treatments
Our treatments were implemented sequentially: in the second stage, after the baseline, we introduced Public monitoring. During this stage, once participants had chosen how many forest plots to deforest, the number of plots deforested by each was publicly revealed using their secret letter. The Public monitoring treatment allowed us to explicitly separate the effect of two key elements of environmental governance that are often merged in experimental games: monitoring and sanctioning (Andersson et al., 2014). This allowed us to evaluate whether there is an effect of just increasing the amount of information available to players through announcing individual conversion. One of the central mechanisms by which communication affects cooperation is by filling gaps in knowledge about future intentions of others and allowing participants to adjust their expectations (Cardenas et al., 2004). In that sense, the individual level monitoring introduced in stage two (and kept throughout the following stages) served as non-verbal communication, as participants could adjust expectations after seeing others' individual decisions and not just the aggregate.
For the third and fourth stages, we alternated between first introducing Community enforcement, followed by Government enforcement, or vice-versa (see Fig. S5 in SI). This allowed us to control for spill-over or learning effects from the two treatments. The Community enforcement treatment recreated a self-enforced collective PES, in which community members themselves could choose to sanction each other. This treatment captures the individual motivations to engage in self-enforcement. Self-enforcement involves some individual-level costs, which can be monetary or non-monetary, such as the time spent on monitoring activities, the effort of reporting a non-cooperative individual, or the cost of bringing it up in a community assembly.
The Community enforcement stage consisted of two steps. The first step was identical to the Public monitoring stage. In the second step, each participant chose whether or not to assign a punishment to other participants. Assigning a punishment had a cost of 10 points for the punisher but it subtracted 30 points from the punished participant. This punishment-cost ratio (3:1) follows common practice in experimental games (Chaudhuri, 2011; Vollan et al., 2019). To avoid excessive punishment, the maximum number of allowed punishments in each round was limited to three, and each punishment had to be assigned to a different participant. Information about the punisher and punished participants in each round was made public by using their secret letters. This procedure allowed retaliation and reputation building, while maintaining anonymity.
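The punishment step can be sketched as a simple adjustment of round earnings. The sketch below assumes that the limit of three punishments applies per punisher and that participants cannot punish themselves; both are interpretations of the rules described above, and the function and variable names are hypothetical.

```python
# Peer punishment step of the Community enforcement stage (illustrative sketch).
PUNISH_COST = 10       # points paid by the punisher per punishment assigned
PUNISH_PENALTY = 30    # points subtracted from the punished participant
MAX_PUNISHMENTS = 3    # assumed to be a per-punisher limit in each round


def apply_peer_punishment(earnings, punishments):
    """Return earnings after punishments.

    earnings: dict mapping participant letter -> points before punishment
    punishments: dict mapping punisher letter -> list of punished letters
    """
    result = dict(earnings)
    for punisher, targets in punishments.items():
        if len(targets) > MAX_PUNISHMENTS:
            raise ValueError("at most three punishments per round")
        if len(set(targets)) != len(targets):
            raise ValueError("each punishment must go to a different participant")
        if punisher in targets:
            raise ValueError("participants can only punish others")
        for target in targets:
            result[punisher] -= PUNISH_COST
            result[target] -= PUNISH_PENALTY
    return result


before = {"A": 200, "B": 180, "C": 160}
after = apply_peer_punishment(before, {"A": ["C"], "B": ["C"]})
print(after)
```

Because several punishers can pick the same target, one participant can be punished more than once in a round, while each punisher pays 10 points for every punishment assigned.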
The Government enforcement treatment recreated a policy-mix scenario, in which a collective PES is implemented along with an external enforcer who randomly monitors individuals and assigns sanctions to those who deforest. The treatment allows us to identify the benefits of a 'hybrid approach' to forest conservation (Lambin et al., 2014). Individual level enforcement can operate even if PES benefits are provided at the collective level, but it is likely to be more costly than enforcement at the aggregate level, and thus not fully enforced. During this stage, a probabilistic exposure to a third-party sanction was introduced, representing imperfect government enforcement (Cardenas et al., 2000; Velez et al., 2010). This is considered to be a better representation of the weak and costly forest enforcement that exists in most tropical forest countries (Robinson et al., 2010). The inspection probability for each participant was 1/3, and if inspected, they lost 15 points for each plot deforested. The sanction was non-deterrent as the expected benefit of deforestation was still higher than the one from conservation (i.e., it did not change the optimal strategy for a risk neutral participant). Government enforcement was costless to participants because in real-world scenarios smallholders cannot decide on the stringency and provision of government enforcement. For a detailed description of the payoff functions and optimal strategies in each stage, see SI (section B1).
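The non-deterrence claim can be checked with a short calculation for a risk neutral participant. The sketch assumes 10 points per converted plot and 4 points per standing plot for each player, as in the baseline description; the names are illustrative.

```python
# Expected private gain from converting one more plot under Government enforcement.
from fractions import Fraction

AGRI_POINTS = 10                   # points per converted plot (assumed baseline value)
FOREST_POINTS_EACH = 4             # individual share of the PES per standing plot
INSPECTION_PROB = Fraction(1, 3)   # probability of being inspected
FINE_PER_PLOT = 15                 # points lost per deforested plot if inspected


def expected_marginal_gain():
    """Expected change in own earnings from deforesting one additional plot."""
    expected_fine = INSPECTION_PROB * FINE_PER_PLOT
    return AGRI_POINTS - FOREST_POINTS_EACH - expected_fine


print(expected_marginal_gain())
```

The expected fine is 5 points per plot, so the expected private gain falls from 6 points in the baseline to 1 point. It stays positive, which is why full conversion remains the payoff-maximizing strategy under risk neutrality.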

Hypotheses
Given that non-monetary considerations can motivate cooperative behaviour (Lopez et al., 2012; Masclet et al., 2003), and that cooperation is often conditional on others' actions (Rustagi et al., 2010), at least two effects of the Public monitoring treatment are conceivable: (i) the display of own non-cooperative behaviour might induce some guilt and reduce the conversion in the following rounds; (ii) the conditional cooperators might reduce their willingness to cooperate on seeing some non-cooperative members (high converters), and thus increase deforestation.
We expect monetary sanctions to further increase cooperation, but the relative effectiveness of each enforcement strategy is difficult to predict a priori. Government enforcement is likely to be more effective and efficient than Community enforcement because it imposes a norm of zero deforestation by punishing any deforestation if inspected, and it incurs no cost to participants. Community enforcement offers, however, the opportunity to better target the largest free-riders (compared to random sanctioning by Government) and participants can be punished more than once. We conjecture that the effects of enforcement will differ across sites, given the difference in land tenure regimes and history of collective governance. These differences are particularly relevant for peer punishment, which is dependent on cultural and social norms (Bruhin et al., 2020; Eriksson et al., 2017; Herrmann et al., 2008).
The second category of hypotheses relates to the effect of inequality in wealth. Evidence from lab experiments suggests that without sanctions, inequality in individual endowment does not affect average cooperation when the aggregate endowment is the same between equal and unequal groups, as participants will move towards the non-cooperative outcome (Kingsley, 2016; Nockur et al., 2021; Reuben and Riedl, 2013). Once sanctions are introduced, participants with the highest capacity to deforest are expected to reduce their deforestation the most (Kingsley, 2016; Vollan et al., 2019). Thus, the introduction of monitoring and sanctioning should have heterogeneous effects depending on the individuals' capacity to deforest. Inequality in endowments can in addition attenuate the positive effects of punishments or increase their frequency (Bernhard et al., 2006; Kingsley, 2016), increase risk taking attitudes (Payne et al., 2017), as well as reduce the preferences for internal enforcement institutions as compared to external (De Geest and Kingsley, 2019). Thus, we expect inequality in deforestation capacity to decrease the positive effects of the enforcement mechanisms, in particular on efficiency.

Data analysis
We operationalized the 3E outcomes as follows. To evaluate effectiveness, we used the group and individual deforestation levels. For efficiency, following Cason and Gangadharan (2015), we calculated an index based on the realized earnings of participant i in each round t (π_it), the self-maximizing (Nash) strategy of the baseline stage (π_NE) and the socially optimal payoff (π_SO), such that:

$$\text{Efficiency}_{it} = \frac{\pi_{it} - \pi_{NE}}{\pi_{SO} - \pi_{NE}}$$

The realized earnings π_it have three components: the agricultural income from forest conversion, the payment from the standing forest (the same for all group members), and the costs of received sanctions and assigned punishments during the Community and Government stages. Under the Nash strategy participants convert their maximum, and it gives the minimum payoff for the group (π_NE). Under the socially optimal payoff, conversion is zero and the group outcome is maximized (π_SO). Both of the latter indicators are constant across rounds and stages. The efficiency of each treatment compares individuals' realized payoffs π_it to the socially optimal outcome π_SO. Higher earnings indicate higher efficiency. Our definition of efficiency considers only the enforcement costs and assumes no monitoring costs for the aggregate forest outcome. This is a reasonable assumption in the case that the PES implementer is shouldering those costs.
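As an illustration, the efficiency index can be sketched under the assumption that it takes the common normalized form (π_it − π_NE) / (π_SO − π_NE), so that 0 corresponds to the Nash payoff and 1 to the social optimum. The exact functional form used in the study is an assumption here, as are the benchmark values, which are derived for an Equal group from 10 points per converted plot and 4 points per standing plot.

```python
# Normalized efficiency index (illustrative sketch; functional form is assumed).
def efficiency(realized, nash, optimum):
    """Share of the gap between Nash and socially optimal earnings that is realized."""
    if optimum == nash:
        raise ValueError("Nash and socially optimal payoffs must differ")
    return (realized - nash) / (optimum - nash)


# Benchmarks for a participant in an Equal group (capacity of 6 plots, 60-plot stock).
PI_NE = 10 * 6 + 4 * (60 - 36)   # everyone converts the maximum
PI_SO = 4 * 60                   # nobody converts

print(efficiency(198, PI_NE, PI_SO))
```

Sanction costs and fines lower realized earnings, so under this form a treatment can reduce deforestation and still score low on efficiency, and the index can fall below zero when enforcement costs push earnings under the Nash benchmark.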
To measure equity at the group level and for each stage, we calculated a Gini coefficient of individual earnings (Cowell, 2011) and assessed the perceived fairness of each enforcement strategy using a post-experiment questionnaire (see SI, section B2).
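The Gini coefficient of earnings within a group can be computed as in the minimal sketch below. It uses the standard rank-based formula for a discrete distribution; the function name `gini` and the example earnings are illustrative assumptions, not values or code from the study.

```python
def gini(earnings):
    """Gini coefficient of a list of non-negative earnings.

    Uses the rank-based formula G = 2*sum(i*x_i)/(n*sum(x)) - (n+1)/n,
    with x sorted in ascending order and ranks i starting at 1.
    Returns 0.0 for an empty group or when total earnings are zero.
    """
    xs = sorted(earnings)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical earnings of a four-person group in one stage.
equal_group = [10, 10, 10, 10]   # everyone earns the same
skewed_group = [0, 0, 0, 40]     # one member earns everything

print(gini(equal_group))
print(gini(skewed_group))
```

A value of 0 indicates perfectly equal earnings. With n group members the maximum is (n - 1)/n, so small groups cannot reach 1.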
We used Wald tests, Friedman tests, and repeated measures ANOVA tests to compare group averages, and multilevel linear mixed effects models to evaluate individual level effects. We included random effects across participants and sessions in all regression models (Rabe-Hesketh and Skrondal, 2008) to control for the dependence of observations within experimental sessions and individuals across rounds. We present our main results as linear models, as they produce unbiased predictions in public good games data and their interpretation is more straightforward than probit and tobit models (Ai and Norton, 2003; Kent, 2020), but use ordered probit models as a robustness check (Moffatt, 2015). To control for potential learning effects and temporal trends, the order of enforcement (whether Community or Government enforcement was played first), the experimental round within stages (from 1 to 6), and a categorical variable (from 1 to 5) indicating the order of the experimental session within a village were included in all the models. Likewise, to control for behavioral preferences across participants, we included variables measuring risk (Binswanger, 1981) and social preferences (Fehr et al., 2013); see SI section B2 for a detailed description of elicitation methods.
We also measure and include trust as a control, given the empirical evidence indicating how trust shapes experimental outcomes (Andersson et al., 2018a; Pfaff et al., 2019). The distribution of covariates is balanced across treatments except for risk and social preferences, which are included as controls in all subsequent analyses (see SI, section B3).
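The individual-level specification described above can be sketched as a linear mixed model with random intercepts for sessions and participants. The notation below is an assumption introduced for illustration (the full specifications are in the SI): D_ist is the deforestation of participant i in session s and round t, PM, CE and GE are stage indicators for Public monitoring, Community enforcement and Government enforcement, and X_ist collects the controls listed in the text.

```latex
% Hedged sketch; symbols are illustrative, not the authors' notation.
D_{ist} = \beta_0 + \beta_1 PM_{t} + \beta_2 CE_{t} + \beta_3 GE_{t}
        + \gamma' X_{ist} + u_s + v_i + \varepsilon_{ist},
\qquad u_s \sim N(0, \sigma_u^2), \quad v_i \sim N(0, \sigma_v^2)
```

In this sketch the coefficients β_1 to β_3 capture treatment effects relative to the baseline stage, while u_s and v_i absorb the correlation of observations within sessions and within individuals across rounds.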

Effectiveness
Overall, the results lend support to the hypotheses that Public monitoring works as a social sanctioning mechanism and reduces deforestation, and that introducing monetary sanctions further increases PES effectiveness (Fig. 1). Group deforestation was high in the baseline stage: on average 15.9 and 16.8 forest plots were deforested in Equal and Unequal groups, out of a maximum of 36. Public monitoring significantly decreased group deforestation by 1.2 units in both the Equal (p < 0.04) and Unequal groups (p < 0.03), equivalent to 7.5% and 7.1% reduction respectively. In turn, Community enforcement decreased deforestation by 4.9 units or 30.8% (p < 0.001) in the Equal groups and by 5.7 units or 33.9% (p < 0.001) in the Unequal groups compared to the baseline. Government enforcement was the most effective, decreasing deforestation by 8 units or 50.3% (p < 0.001) in the Equal groups and by 7.5 units or 44.6% (p < 0.001) in the Unequal groups compared to baseline. Although group deforestation is higher in Unequal than Equal groups, the difference is not significant in any of the stages (SI, Table S1).
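The percentage reductions reported above follow directly from dividing each absolute reduction by the baseline group deforestation. The short sketch below reproduces that arithmetic from the figures in the text; the function and variable names are illustrative.

```python
# Baseline group deforestation (plots) and absolute reductions, from the text.
baseline = {"Equal": 15.9, "Unequal": 16.8}
reductions = {
    "Public monitoring": {"Equal": 1.2, "Unequal": 1.2},
    "Community enforcement": {"Equal": 4.9, "Unequal": 5.7},
    "Government enforcement": {"Equal": 8.0, "Unequal": 7.5},
}


def pct_reduction(units, base):
    """Reduction as a percentage of baseline deforestation, one decimal."""
    return round(100 * units / base, 1)


results = {
    treatment: {group: pct_reduction(units, baseline[group])
                for group, units in by_group.items()}
    for treatment, by_group in reductions.items()
}

for treatment, by_group in results.items():
    print(treatment, by_group)
```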
There are, however, important differences between the countries (Table 1). In Indonesia we observe no differences between the treatment effects of the Community and Government enforcement (Wald test, p = 0.59), and Public monitoring had no significant effects in Brazil (Wald test, p = 0.82). Furthermore, while inequality in deforestation capacity had no effect in Brazil or Peru, it significantly increased group deforestation in Indonesia by 0.4 units or 10%. We further examined whether the effectiveness of the enforcement mechanisms depends on (i) the inequality treatment and (ii) the order of the enforcement. Contrary to what was hypothesized, we found no significant interactions with inequality (SI, Table S4). Thus, in our study sites, inequality in deforestation capacity arising from wealth differences does not affect the overall effectiveness of the free-riding mitigation measures. We find, however, that the order of enforcement matters. When Community sanctions are introduced after Government enforcement, their effectiveness increases (SI, Table S5). In other words, previous exposure to external enforcement increases the effectiveness of internal sanctions. Further decomposing the treatment effects by participant type (i.e., deforestation capacity) reveals that overall participants with a high (low) deforestation capacity deforested more (less) than their medium-capacity counterparts (Table 2). Importantly, there are heterogeneous responses to treatment depending on the participant type. For example, the Public monitoring effect in Peru is dominated by the response of wealthy participants (Table 2, column 4). In general, wealthy participants responded more to the Community and Government enforcement, while the behavioural response from participants with low deforestation capacity was in general weaker. As a result, there were no significant differences in predicted deforestation levels among participant types during the Community and Government enforcement stages in any country (Fig. 2). The introduction of sanctions equalized individual deforestation levels.
We further examined the proportion of forest plots deforested from the maximum allowed (instead of the absolute number of plots) and found no significance in the interaction terms (SI, Table S6). Thus, the heterogeneous effects by participant type manifest in absolute changes in deforestation, not in relative changes. Country differences are again observed, and participants in Indonesia with low deforestation capacity converted a higher proportion than their medium-capacity counterparts, which explains why there are no significant differences in absolute deforestation levels between the two groups (Table 2, column 3).

Efficiency
Recall that the efficiency index is individuals' realized payoffs relative to the socially optimal outcome, cf. Eq. (2). Public monitoring of individual deforestation increased efficiency in Indonesia and Peru. Government enforcement was the most efficient treatment in all countries (Table 3). Community enforcement, on the other hand, did not increase efficiency compared to the baseline stage in any of the country sites (Table 3). Thus, the benefits of the disciplining effect of peer punishment were not sufficient to outweigh its cost. This result is not solely contingent on the fact that Government enforcement had no costs to participants during the experiment. Artificially introducing a cost to Government enforcement that resembles the cost of Community enforcement at the group level shows that Government enforcement remains more efficient than Community enforcement (see SI, Table S7). Moreover, in Unequal groups in Indonesia and for the total sample, Community enforcement decreased efficiency and thus participants' earnings (Table 3, columns 2 and 6). The lower efficiency observed in the Unequal groups during the Community stage is explained by the higher frequency of costly punishment in Unequal groups (16.9 per session on average) as compared to the Equal groups (11.7 per session on average), a statistically significant difference (SI, Table S8).
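To illustrate how costly peer punishment can leave efficiency unchanged or lower despite reduced deforestation, the sketch below computes a normalized efficiency index for hypothetical payoffs. The normalization (gains over the Nash payoff as a share of the maximum attainable gains) is an assumption about the form of the index, and every number is invented for illustration rather than taken from the experiment.

```python
def efficiency(pi_it, pi_ne, pi_so):
    """Normalized efficiency: 0 at the Nash payoff, 1 at the social optimum.

    Assumed form of the index; pi_it is realized earnings net of any
    sanction and punishment costs.
    """
    if pi_so == pi_ne:
        raise ValueError("Nash and socially optimal payoffs must differ")
    return (pi_it - pi_ne) / (pi_so - pi_ne)


# Hypothetical payoffs (arbitrary units).
PI_NE, PI_SO = 100.0, 200.0

# Baseline stage: partial cooperation, no sanctions.
baseline_net = 140.0

# Peer punishment stage: less deforestation raises gross earnings,
# but assigning and receiving punishments is costly.
gross_with_punishment = 165.0
punishment_costs = 30.0
community_net = gross_with_punishment - punishment_costs

print(efficiency(baseline_net, PI_NE, PI_SO))
print(efficiency(community_net, PI_NE, PI_SO))
```

In this invented example the gain from lower deforestation (25 units) is smaller than the punishment costs (30 units), so net efficiency falls below the baseline even though conservation improves.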

Equity and fairness
Overall, inequality decreased with the introduction of Public monitoring and Government enforcement, but not with the introduction of Community enforcement, as indicated by the Gini coefficients (Table 4). But there are differences across groups and sites. Both Public monitoring and Government enforcement decreased inequality in earnings in the Equal groups, while in the Unequal groups only Public monitoring had a significant effect in reducing inequality (Fig. 3). Across sites, in Brazil none of the enforcement strategies reduced inequalities. In Peru only Public monitoring reduced inequality. In Indonesia, Community enforcement increased inequality in Unequal groups, and in Equal groups both Government enforcement and Public monitoring reduced inequality (Fig. 3).
Why did the treatments not reduce inequalities significantly, despite deforestation rates being equalized across participant types? If we calculate the Gini coefficient of earnings without including the punishment costs, there are significant reductions in inequalities (Table S10 and Fig. S3, see SI). Thus, it is the punishment behaviour during the Community enforcement, as well as the random nature of sanctioning by the Government, that inhibits positive distributional effects of enforcement.
Participants perceived Government enforcement as fairer than Community enforcement. Half (51.1%) thought that Government enforcement was fairer than Community enforcement, while 24.6% favored Community over Government enforcement. The remaining participants considered both enforcements to be equally fair (21.3%) or that neither institutional arrangement was fair (3%). In Peru participants were more likely to mention that both types of enforcement were equally fair (41%), while in Indonesia and Brazil most participants thought Government enforcement was fairer, with 64% and 54% of the participants, respectively. The probability of choosing either Government or Community enforcement as fairer was independent of being a participant with high, medium or low deforestation capacity (see Table S11, in SI).

Solving the free-rider problem
Collective payments for forest conservation create a local collective action problem, as individual forest users have incentives to free ride on

Table 2
Treatment effects on individual deforestation interacted with deforestation capacity, by country.
Public monitoring of individual deforestation reduced group deforestation, albeit modestly. This is consistent with studies showing that monitoring activities can increase PES effectiveness (Martin et al., 2014) and forest protection in general (Slough et al., 2021a), but also that they are far from being sufficient to ensure perfect compliance (Wunder et al., 2018). In our study, the effect was significant only in the country sites that have a history of local collective action in terms of forest management and rule setting (Peruvian and Indonesian sites). This suggests that previous experience with collective agreements is an essential ingredient for getting a positive conservation impact of individual monitoring. The experimental literature has also demonstrated how previous communication or successful cooperation positively influences collective outcomes (Gangadharan et al., 2017; Rodriguez et al., 2019). While in our experiment the individual monitoring was anonymised, non-anonymised reporting, where the identity of the individuals is revealed, could have yielded even stronger effects. For example, public disclosure has stronger effects when noncooperating individuals are singled out (Spraggon et al., 2015).
Government enforcement is the most robust policy to increase the effectiveness and efficiency of the collective PES and was effective in all country sites and inequality contexts. In addition, previous exposure to external sanctions increased the effectiveness of Community enforcement. Introducing external sanctions allows participants to coordinate on particular norms that can serve as focal points (Gelcich et al., 2013; Nikiforakis et al., 2012). Moreover, we show that the random targeting of the largest free-riders inhibits the positive distributional effects of enforcement. Accurately identifying the largest free-riders is therefore necessary to strengthen the positive equity effect of external enforcement. An impartial, strong external enforcement might be difficult to implement in situations of weak governance and corruption, where private interests or lack of funding might conflict with the provision of public goods (Karsenty and Ongolo, 2012; Sundström, 2015). This is still a major challenge for effective environmental regulation. Nonetheless, most participants perceived Government enforcement as being fair, which indicates that effectiveness and efficiency considerations do not contradict equity and fairness ones. Emphasizing the potential win-win outcomes of external sanctions is particularly important considering that enforcement and sanctioning of PES non-compliance often lack political support (Wunder et al., 2018).
Community enforcement can deliver on conservation outcomes but potentially entails a significant cost to community members. Results from the Indonesian site show that, compared to the baseline stage, introducing costly peer punishment creates significant trade-offs between effectiveness on the one hand, and efficiency and equity on the other. One of the reasons for lower effectiveness and efficiency of peer punishment is the existence of antisocial and retaliatory punishments (Bruhin et al., 2020; Nikiforakis, 2008; Vollan et al., 2019). Indeed, the uneven distribution of costs and benefits of collective agreements can lead to within-community conflicts (Hayes et al., 2019). Community enforcement effectiveness and efficiency could be improved if collective PES implementers facilitate communication and increase social capital amongst PES participants (Koch et al., 2021). A large body of experimental evidence has shown the positive effects of communication on cooperation (Cardenas et al., 2002; Chaudhuri, 2011; Gangadharan et al., 2017; Hackett et al., 1994; Tavoni et al., 2011). But while communication typically increases effectiveness, it has limited positive distributional effects (Rodriguez et al., 2021). Given that strong community governance remains a major challenge (Dokken et al., 2014; Murtinho and Hayes, 2017), our study highlights the need to guarantee that communities have an arena to discuss strategies and define their monitoring and sanctioning rules in the implementation of collective PES. Non-experimental studies suggest stakeholder involvement and external support from intermediaries such as NGOs facilitate participation and cooperation in PES in general (Izquierdo-Tort et al., 2021; Murtinho and Hayes, 2017; Pham et al., 2010), and can reduce elite capture (Persha and Andersson, 2014).

The effect of inequality
Our study provides new evidence of how wealth inequality, understood as differences in the capacity to engage in deforestation, can negatively affect the effectiveness and efficiency impacts of environmental regulations. The effect of wealth inequality cannot, however, be generalized across study sites: it was only significant in Indonesia, where it both increased deforestation and reduced efficiency. Considering that Indonesia has lower tenure security compared to the other sites, our results are consistent with the theory of collective action: one of the eight design principles for successful management of the commons is to have clearly defined boundaries (Ostrom, 1993). Inequality, when coupled with insecure tenure, has negative effects on cooperation, but does not have significant effects in sites with clear land tenure (communal or individual). Other factors explaining the strong inequality effect in the Indonesia site include higher pre-existing inequality in landholdings and assets compared to the other two sites, and stronger customary rules of forest management. These factors also explain why there were no differences in the effectiveness of external and internal enforcement in this country site, coinciding with a similar experiment conducted in Namibia (Vollan et al., 2019). While the impact of inequality seems to depend on the country site, future research could examine how this effect is mediated by factors such as levels of trust and social preferences amongst participants. The heterogeneous findings across sites highlight the importance of considering different populations in inequality studies.
A result generalizable across country sites is that wealthy participants with high deforestation capacity tended to be more responsive to (the threat of) sanctions than their poorer counterparts. This result is particularly interesting considering that all participants faced the same incentives to cooperate and the same sanctioning costs. The lower responsiveness of poorer participants to sanctioning is consistent with being more averse to disadvantageous inequality than to advantageous inequality (Fehr and Schmidt, 1999). Evaluations of collective PES also show that wealthier residents are more likely to change their behaviours (Hayes et al., 2017).

Policy implications and limitations
Two important considerations for the external validity and policy implications of our results should be noted. First, the endowment inequality was created exogenously. Different results could be expected with endogenous inequality (i.e., with a real effort task), as the origin of wealth differences affects fairness perceptions (Almås et al., 2010). Future inequality studies could evaluate what happens when wealth inequality (i.e., differences in endowment) is stronger or when it interacts with other sources of inequality, such as the returns of collective PES or of the private good (e.g., Vorlaufer et al., 2017). Second, the experiment simulated a best-case scenario of perfect and costless monitoring conditions: PES was perfectly monitored, and everyone could observe others' deforestation and could punish all players at the same cost (Community stage) or with the same probability (Government stage). Arguably, conditions in the field are different; it might be costly to track individual deforestation, or power relations can modify enforcement costs amongst community members. Experimental evidence shows that external enforcement and collective PES maintain strong effects even with lower sanctioning probabilities than in this study (Andersson et al., 2018a; Lopez et al., 2012; Vollan et al., 2019), or when the sanctions are provided only at the collective rather than individual level (Cason and Gangadharan, 2013). On the other hand, under imperfect monitoring, the effectiveness and efficiency of peer punishments decrease (Boosey and Isaac, 2016; Grechenig et al., 2010; Shreedar et al., 2020), as do the acceptability and preference for a decentralized institution (De Geest and Kingsley, 2019). These findings, along with our results, point to the advantages of external enforcement as compared to internal enforcement mechanisms when implementing collective PES. Given the known positive effects of community monitoring in the management of common-pool resources (Buntaine and Daniels, 2020; Slough et al., 2021b), a combination of bottom-up monitoring with higher-level sanctioning could be a promising strategy to increase individual compliance in collective agreements. Yet, it could potentially decrease economic efficiency (earnings), as the PES participants incur the monitoring costs.
Overall, we showed how different sites respond to increased monitoring and enforcement in collective PES. The fact that we find heterogeneous responses to the treatments lends support to the external validity of our results; the sites with less history of collective action are less responsive to peer punishment and individual monitoring. Our findings are useful to policymakers and PES implementers as they consider options for designing more effective, efficient and equitable interventions, in particular the potential benefits of increasing monitoring and enforcement. Relevant criteria affecting the impacts of enforcement mechanisms include tenure regimes, histories of collective action, and previous exposure to centralized enforcement.

Conclusion
Collective payments are a promising conservation policy to reduce global deforestation, but their effectiveness is jeopardized by the fact that they entail incentives for individual free riding. As collective PES gain traction, policy makers and practitioners should consider strategies that can help solve the free-riding problem intrinsic to such payments and thus deliver effective, efficient and equitable (3E) outcomes. Our study is the first to show the implications of different monetary and non-monetary sanctioning strategies to limit free-riding, and to link these outcomes to different land tenure and institutional contexts. Compared to a situation of collective PES without any individual monitoring and enforcement, we show that introducing monitoring and enforcement significantly increases the benefits of collective PES.
Public monitoring of individual decisions has limited effectiveness as compared to the introduction of monetary sanctions, and a significant effect is only observed in sites with a stronger history of collective action. Community enforcement (internal, peer-to-peer sanction) increases effectiveness but can reduce the efficiency and equity of collective PES, especially when implemented in communities with unequal access to resources. We find important variations in impacts; for example, in Indonesia the reduction in deforestation from Community enforcement is higher than in the other two sites, and inequality in the access to forest resources significantly increases group deforestation. However, across the sites, external, Government enforcement provides the strongest and most robust results in terms of effectiveness and efficiency outcomes. Further, punishment that does not effectively target free-riders hampers the positive distributional effects of both enforcement strategies.
Finally, we find that implementing collective PES in groups with inequality in wealth can have negative effects on conservation and exacerbate the trade-offs between effectiveness, efficiency and equity outcomes. In addition to individual free riding, a challenge in designing and implementing PES is to manage such trade-offs, and our results suggest that these are particularly pronounced, and thus PES implementation more challenging, in contexts with unequal forest access. The results are relevant for both collective PES schemes and group-based incentive schemes in general.

Fig. 1. Aggregate group deforestation (number of plots) per round, per country. The Community and Government stages were played randomly in either rounds 13-18 or rounds 19-24. Vertical lines represent 95% confidence intervals.

Fig. 2. Predicted deforestation depending on participant's deforestation capacity, by treatment and country site. Vertical lines represent 95% confidence intervals.

Fig. 3. Average marginal treatment effects of Public monitoring, Community and Government enforcement on the Gini coefficient, for Equal and Unequal groups and by country. See SI (Table S8) for full model specification and regression results. Vertical lines represent 95% confidence intervals of the coefficients.

Table 1
Treatment effects on individual deforestation decisions, by country sites.

Table 3
Treatment effects on efficiency, by country.

Table 4
Average Gini coefficient in Equal and Unequal groups, by stage.
Notes: Gini coefficients are in general low because the collective benefits were large. We chose to have a high base collective payment for ethical reasons.