Increasing the impact of collective incentives in payments for ecosystem services

Collective payments for ecosystem services (PES) programs make payments to groups, conditional on specified aggregate land-management outcomes. Such collective contracting may be well suited to settings with communal land tenure or decision-making. Given that collective contracting does not require costly individual-level information on outcomes, it may also facilitate conditioning on additionality (i.e., conditioning payments upon clearly improved outcomes relative to baseline). Yet collective contracting often suffers from freeriding, which undermines group outcomes and may be exacerbated or ameliorated by PES designs. We study impacts of conditioning on additionality within a number of collective PES designs. We use a framed field-laboratory experiment with participants from a new PES program in Mexico. Because social interactions are critical within collective processes, we assess the impacts from conditioning on additionality given: (1) group participation in contract design, and (2) a group coordination mechanism. Conditioning on above-baseline outcomes raised contributions, particularly among initially lower contributors. Group participation in contract design increased impact, as did the coordination mechanism. © 2017 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Payments for ecosystem services (PES) programs offer contracts in which landholders are paid for specified environmental actions or outcomes. Recent decades have seen rapid growth in PES programs (Ferraro, 2011;Porras et al., 2008). In many lower-income settings, PES programs have become a preferred policy approach due to their potential to better balance conservation and livelihood outcomes relative to other types of conservation policies (Ferraro and Kiss, 2002;Leimona, 2009;Pagiola et al., 2005;Sims and Alix-Garcia, 2017;Wunder, 2007).
PES programs typically make payments to individuals for actions undertaken on land managed at an individual or household level (Kerr et al., 2014; Porras et al., 2008). Individuals can weigh the private costs of the actions required relative to the payment offered and thus make a rational participation decision 2 (Ferraro and Kiss, 2002). This approach is well [...] subsidies to similarly increase public goods provision (Bagnoli and McKee, 2016; Cadsby and Maynes, 1999; Segerson, 1988; Segerson and Wu, 2006; Xepapadeas, 1991), these mechanisms do not translate well to voluntary PES contexts due to their reliance on random fines or budget imbalances. We return to this point in Section Collective action and PES below.
Second, we find that participants who were initially less cooperative (i.e., those with lower prosocial tendencies) responded more to conditionality on additionality. We show that this response is not simply because those who initially give less have more room to improve. While we cannot test the precise mechanism, the response is consistent with a model in which individuals motivated by non-monetary incentives face partial motivational crowding out from the new external monetary incentive. Those with less non-monetary motivation (i.e., those who give less initially) have less scope for motivational crowding out and so respond more strongly to the new external monetary incentive. We note other potential explanations, such as heterogeneity in non-monetary preferences, below. This result has implications for collective PES targeting, as well as for the likely success of programs as they expand beyond highly motivated, first-mover communities.
Third, we find that giving participants a veto of conditionality on additionality raises the impact of that conditionality. It appears that just having a "voice" in rule-setting raises the social acceptability of the resulting rule, regardless of whether that voice is used to change the rule. This finding is relevant for community consultation during contract design. Finally, we find that an internal coordination mechanism, specifically the ability to monitor peers' contributions and issue sanctions, raises both baseline contributions and the impact of conditionality on additionality. We are not aware of any previous study that has explored how better internal group function interacts with collective PES contract design.
We proceed in the next section with a review of relevant literature. Section Setting provides environmental and institutional context for our field laboratory experiment. Section Model and empirical approach lays out our hypotheses in light of that context, and provides details of our experimental design. Section Results presents our experiment outcomes, supported by insights from survey, focus group, and interview data. Section Discussion and conclusion presents some implications.

Collective action and PES
Research specifically focused on collective PES is limited (Kerr et al., 2014). 5 However, relevant conceptual insights can be drawn from the substantial literature on common pool resource (CPR) management. 6 Ostrom (1990, 1999), Feeney et al. (1990), Baland and Platteau (1996) and others have described conditions in which groups successfully self-organize to manage collective resources. These conditions include: (1) a well-defined resource; (2) a small stakeholder group with shared norms and interdependencies; (3) governance seen as appropriate by those affected; and (4) matched scales between the resource and governing institutions (Agrawal, 2002). These conditions suggest guidelines for effective collective PES design. First, they suggest targeting cohesive, internally cooperative communities with shared norms and interdependencies. Second, they suggest that negotiating collective contracts in a way that utilizes and reinforces existing community governance mechanisms might strengthen those institutions and promote successful outcomes. Third, they suggest that involving participants in contract design may add legitimacy relative to a top-down approach (see also Del Corso et al., 2017; Marshall, 2005; Reed, 2008). We test these suggestions explicitly in our experiment.
A closely related problem is addressed by the non-point source pollution literature. 7 "Non-point source" describes pollution that cannot be traced to individual emitters due to monitoring costs or environmental stochasticity. Only ambient levels are measurable. Consequently, rules to reduce pollution must target groups of emitters. As with collective PES, an efficient outcome requires mechanisms to align individual incentives with the social benefits of cooperation. Segerson (1988) and Xepapadeas (1991) proposed tax and subsidy mechanisms with this aim. A number of subsequent laboratory experiments (Alpízar et al., 2004; Camacho-Cuena and Requate, 2012; Spraggon, 2002, 2004) demonstrated the potential for these and related approaches to achieve social optima. However, they are not balanced budget mechanisms, 8 creating a potentially high financial burden on the regulator (in cases of subsidies) or on individuals (in cases of taxes). Balanced budget versions (see Xepapadeas, 1991) require random fines that pose fairness concerns in the PES context. 9

Later studies have tested variants of these mechanisms in concert with other ways of inducing cooperation. Poe et al. (2004) and Vossler et al. (2006) combined Segerson's (1988) model with group discussions. Cochard et al. (2005) incorporated a within-group cost of pollution, an approach mirrored in Reichhuber et al.'s (2009) framed CPR game. Cason and Gangadharan (2013) combined a tax with peer-to-peer sanctions. Unlike our experiment, most of these studies test policy instruments designed to motivate socially optimal outcomes under assumptions of self-interested behavior. As a result, in cooperative environments they often find over-compliance. By contrast, we deliberately use a monetary incentive insufficient to motivate monetarily rational contributions. Such low levels of payment are common in PES programs, particularly in developing countries where budgets are especially limited (Porras et al., 2008). 10 The resulting reliance on prosocial tendencies leads to the possibility of motivational crowding (described in the next section), and thus an ambiguous prediction for the sign of the average impact of conditionality on additionality. Similarly, this reliance on prosocial tendencies increases the likelihood that social interactions will matter, motivating the further elements of our study's focus (participation in rule-setting and internal coordination mechanisms).

5 Empirical evidence includes a small number of program evaluations and case studies. Community payments in Ecuador reduced individuals' overuse of shared grazing lands, and increased community rule-setting around grazing lands (Hayes et al., 2015, 2017). Small community payments for reduced forest use in Madagascar improved attitudes towards monitoring and regulation-based conservation actions, but did not change observed behavior (Sommerville et al., 2010). Community payments led to greater support for a bird conservation program in Cambodia (Clements et al., 2010). Variation in policy design suitable for making causal claims is limited.

6 We consider collective PES to represent a public good (PG) problem. All community members benefit from the collective payments (if they are shared evenly or spent on public goods), and from the ecosystem services generated, yet securing those benefits requires costly individual effort. However, there are obvious parallels with CPR situations. Like collective PES, CPR problems require individuals to face private costs (restraint in resource use) for the sake of a social good (a more productive resource).

7 A helpful reviewer pointed out a further literature that could inform collective PES programs, namely that concerning threshold solutions to public goods (PGs) problems. Threshold solutions elicit efficient PG provision by paying only when a group meets a contributions target. This is combined with a money-back guarantee that eliminates the risk of free riding (Bagnoli and McKee, 1991; Cadsby and Maynes, 1999). This mechanism departs significantly from collective PES as implemented in Mexico, or elsewhere to our knowledge, and thus does not form the basis for our study.

8 A balanced budget measure is one where the total tax or subsidy is equal to the social value of the change in behavior.

Interactions between monetary and non-monetary incentives
Many activities may be motivated by either intrinsic incentives (e.g., interest, sense of duty) or extrinsic incentives (e.g., monetary reward, punishment). For intrinsically motivated activities, the addition of new extrinsic incentives can cause "motivational crowding," the displacement of intrinsic motivation and possibly even a net loss of total motivation (Bowles, 2008; Bowles and Polanía-Reyes, 2012; Cameron et al., 2001; Deci et al., 1999; Festré and Garrouste, 2015; Frey and Jegen, 2001). This is theorized to occur when extrinsic incentives question autonomy or undermine opportunities for positive peer recognition (Ryan and Deci, 2000). Theory and evidence in the social psychology and economics literatures suggest that motivational crowding may occur while extrinsic incentives are present, or after they have been removed (Deci et al., 1999; Frey and Jegen, 2001).
A number of studies have tested for motivational crowding in natural resource management contexts. Using field lab experiments, Cardenas et al. (2000) found that even a weakly enforced regulation crowded out social cooperation for firewood conservation. Velez et al. (2010) found that regulation crowded out pre-existing social cooperation in fishing communities. Experiments by Gelcich et al. (2013) found external sanctions to complement norms of group cooperativeness among fishers, but only for groups with high social capital. Most relevant to our study, Yañez-Pagans (2015) examined collective behavior under Mexico's national PES program and did not find net motivational crowding out from the existing payment. These studies suggest it is possible, but by no means certain, that the average impact of conditionality on additionality could be negative or diminished from what might be otherwise expected. Sanctions, in the form of withdrawn payments, may lower contributions if they crowd out pre-existing intrinsic motivation, during and/or after their imposition. In addition, we might expect a greater chance of such motivational crowding among higher initial contributors (those with higher intrinsic motivation, and thus more to lose in the presence of new external incentives). This motivates our focus on initial cooperativeness as a modifier of conditionality's impact.

Setting
Mexico offers an ideal location to study collective natural resource management and social interactions. An estimated 60% of forest cover is found in núcleos agrarios (Madrid et al., 2009), a form of local governance which administers important elements of communal land ownership and management. Usufruct rights are granted by the community to individuals for housing plots and agricultural purposes, while forests and large-scale pasturelands remain under collective use. Management activities in these areas are most often conducted through communal labor, coordinated by community governance institutions (Schroeder and Castillo, 2013). These institutions have a consistent, state-mandated form across communities (although they differ in quality). A general assembly of all land rights holders (asamblea), an executive committee (comisariado), and a supervisory council (consejo de vigilancia) oversee allotments of individual parcels, land management decisions, and the management of some community-allocated subsidies, including PES programs (Barnes, 2009).
Mexico faces ongoing deforestation, including in ecologically important primary forests, predominantly due to conversion to cropping and pasture (Muñoz-Piña et al., 2008). 11 Mexico also has a high level of biodiversity, ranking among the top five countries globally for endemism of vascular plants and vertebrate species (Mittermeier et al., 1998;Myers et al., 2000). A large rural population facing relative socio-economic disadvantage (Brandon et al., 2005) implies tensions between agricultural land uses and conservation. PES arguably represents a sensible policy response given the extent of private and community forestland and the need to balance social and environmental goals.
In 2003, CONAFOR introduced a national PES program, Pagos por Servicios Ambientales Hidrológicos (PSA-H), which aimed to improve water quality via forest conservation, while also reducing rural poverty (McAfee and Shapiro, 2010). The program has had many adaptations since, but has consistently provided participants with 5-year contracts for per-hectare conditional payments in locations deemed important for ecosystem services provision. 12 This PES program is one of the largest in the world, with about 3.4 million hectares enrolled from 2003 (Shapiro-Garza, 2013). The program targets communal and smallholder properties, with the majority of land enrolled being communally held and managed (Yañez-Pagans, 2015).
A sub-program of the PSA-H, Mecanismos Locales de Pago por Servicios Ambientales a través de Fondos Concurrentes 13 (FC), was first implemented in 2008 to facilitate locally-designed PES agreements between downstream ecosystem services beneficiaries (e.g., utilities and businesses) and upstream land owners. Federal funding is provided to cover half of the payments (FONAFILIO et al., 2012). Although the FC program supports the development of PES initiatives for multiple ecosystem services, most of the agreements are targeted towards water provision. Given that they are negotiated between upstream and downstream parties, contracts vary across locations. However, minimum standards and the overarching framework are imposed by CONAFOR.
Our model and experiment reflect key elements of the Mexican FC program, from the upstream (provider communities') perspective. We use a graduated sanctioning mechanism, reflecting the fact that if deforestation is detected (by satellite monitoring and annual field visits), payments are reduced proportional to the percentage of the total area lost. The finite 5-year contract period motivates our test of both treatment and post-treatment impacts. FC contracts are primarily negotiated by local intermediary organizations (socios locales) with upstream communities and are ratified by community asambleas, prompting our focus on participation in rule-setting. Further real-world motivation is explained in the model and experiment design sections below.

Model
Our model of collective PES follows the structure of a voluntary contributions (VC) game. Contributions $g$, by individuals $i$, with endowments $y$, help a community (of size $n$) fulfill a PES contract. In return, the contract pays the community an amount $a$, proportional to the sum of contributions made. 14 Individual payoffs, $\pi_i$, are given by:

$$\pi_i = y_i - g_i + \frac{a}{n} \sum_{j=1}^{n} g_j$$

As in standard VC games, this setup implies optimal individual contributions of zero (assuming $a/n < 1$), regardless of the contributions made by other group members.

Conditionality is achieved by levying a sanction, $f$, which scales with the discrepancy between total group contributions and a target, $T$. The marginal, rather than lump-sum, nature of this sanction is consistent with both the FC program (described above) and many PES programs elsewhere. (Individuals or communities that fail to uphold a contract on a proportion of enrolled land typically lose payments for that proportion.) The effect of the sanction is to reduce total payments, mimicking partial non-payment for a partially unfulfilled contract. (In the real program, and thus in our model, payments are always $\geq 0$.) Given that an individual's marginal payoff is negative in the baseline setup, the effect of the sanction is to lower a contribution's marginal loss, up to the level of the target. 15 We use two levels of target, $T$. Our higher $T$ requires additional contributions, relative to baseline, from all groups (i.e., motivates additionality). Our lower $T$ does not. 16 Given the sanction, the individual's payoff function becomes:

$$\pi_i = y_i - g_i + \frac{1}{n} \max\left\{0,\; a \sum_{j=1}^{n} g_j - f \max\left\{0,\; T - \sum_{j=1}^{n} g_j\right\}\right\}$$

which still implies optimal individual contributions of zero (assuming $(a + f)/n < 1$). Of course, within the model, such a result is a function of our parameter choices. Yet it is conceptually consistent with collective PES as experienced in our study sites.
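The payoff structure described above can be sketched numerically. This is a minimal illustration, not the authors' code: the payment rate `a = 2` is an assumed value chosen only so that (a + f)/n stays below one, while `n = 5`, an endowment of 10, and a sanction rate of 2 follow the experiment description. The function names are hypothetical.

```python
def payoff(i, contributions, endowment=10.0, a=2.0, f=0.0, target=0.0):
    """Monetary payoff to member i in one round of the collective contract.

    a is an ASSUMED payment rate; f and target switch the sanction on.
    """
    n = len(contributions)
    total = sum(contributions)
    shortfall = max(0.0, target - total)
    # Group payment net of the sanction; payments never fall below zero.
    group_payment = max(0.0, a * total - f * shortfall)
    return endowment - contributions[i] + group_payment / n


def marginal_return(i, contributions, **params):
    """Change in member i's payoff from contributing one more unit."""
    bumped = list(contributions)
    bumped[i] += 1
    return payoff(i, bumped, **params) - payoff(i, contributions, **params)


group = [4, 4, 4, 4, 4]
# Baseline: each extra unit changes own payoff by a/n - 1 (negative).
print(marginal_return(0, group))
# Below the target with a sanction: (a + f)/n - 1 (less negative, still below zero).
print(marginal_return(0, group, f=2.0, target=30.0))
```

Under these assumed parameters the sanction shrinks the private loss from contributing without making it positive, which is the feature the model relies on.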
12 Recipients are expected to fence forests to exclude stock, prevent illegal logging, collect garbage, create firebreaks, dig water infiltration ditches, and build retention walls to prevent soil erosion, among other activities.

13 Translation: local mechanisms for payments for environmental services through matching funds.

14 The public good we focus on is the collective contract payment, although we note that collective direct benefits from the forest itself could play the same role. Focus group discussions in our study sites indicated that communities perceived direct benefits from forest conservation including cleaner spring water and ecotourism benefits. These do not involve rival extraction choices, and thus we do not use a CPR game.

15 A helpful reviewer pointed out that a payment increase could achieve the same change in marginal incentives as this sanction. Whether that would have the same impact is an interesting empirical question (as sanctions and rewards differ in framing for a given incentive) but one that is beyond the scope of this paper. We consider a sanction consistent with the real-world policy approach typically taken for contract non-compliance.

16 We know that our higher T (and not our lower T) requires additional contributions by comparing it to control groups' contributions, and to treated groups' contributions immediately prior to treatment. T levels were selected based on piloting. This approach to choosing contract targets is feasible for real PES design.
Consequently, non-monetary returns to giving, $m(g_i)$, are necessary for positive contributions to occur. These could be altruism, reputation, or other prosocial concerns (Andreoni, 1993; Chan et al., 2002; Masclet et al., 2003). 17 Salience, $s$, may also matter. For example, reputational benefits may accrue if an individual's contributions are known to others. Following Levitt and List (2007), we assume an additive utility function with monetary and non-monetary components. Omitting the sanctions, we have:

$$U_i = y_i - g_i + \frac{a}{n} \sum_{j=1}^{n} g_j + m(g_i, s)$$

with the first order condition giving an optimum contribution, $g_i^*$, that satisfies:

$$\frac{\partial m(g_i^*, s)}{\partial g_i} = 1 - \frac{a}{n}$$

Greater salience can increase the utility-maximizing quantity of $g_i$. The imposition of a sanction will also increase $g_i$ under these assumptions, as seen in the resulting first order condition:

$$\frac{\partial m(g_i^*, s)}{\partial g_i} = 1 - \frac{a + f}{n}$$

This utility function features linear monetary but non-linear non-monetary returns to contributions. The former is a reasonable assumption for transfers at the scale of a laboratory experiment, although it might not be strictly true for transfers at the scale of significant PES for poor households. The relevant assumption, however, is simply greater relative curvature of non-monetary returns. This is supported by behavioral game theory and results from standard experiments. 18

To summarize, a collective sanction conditional on aggregate contributions of a group can increase the equilibrium amount of individual contributions $g_i$, provided that there are non-monetary returns. We expect this outcome even though marginal monetary returns remain negative for the individual. We can infer that non-monetary returns are present from contributions made in the initial round of our experiment. This is the basis for our research question I: does conditionality on additionality encourage greater collective contributions?
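As a worked illustration only, suppose non-monetary returns take the concave form $m(g_i, s) = s\theta\sqrt{g_i}$, where $\theta > 0$ is a hypothetical prosociality parameter. Neither this functional form nor $\theta$ comes from the paper; they are assumptions chosen to make the first order condition solvable in closed form, and the endowment cap $g_i \le y$ is ignored.

```latex
\begin{align*}
% Baseline (no sanction): additive utility with the assumed form of m
U_i &= y - g_i + \frac{a}{n}\sum_{j=1}^{n} g_j + s\theta\sqrt{g_i} \\
\frac{\partial U_i}{\partial g_i} = 0
  &\;\Longrightarrow\; \frac{s\theta}{2\sqrt{g_i^{*}}} = 1 - \frac{a}{n}
  \;\Longrightarrow\; g_i^{*} = \left(\frac{s\theta}{2\,(1 - a/n)}\right)^{2} \\[6pt]
% With the sanction binding (group total below T), the marginal monetary
% return rises from a/n to (a+f)/n
g_i^{*,\,\text{sanction}} &= \left(\frac{s\theta}{2\,\bigl(1 - (a+f)/n\bigr)}\right)^{2}
  \;>\; g_i^{*}
  \quad\text{whenever } f > 0 \text{ and } \frac{a+f}{n} < 1
\end{align*}
```

Under this assumed form the optimum rises with salience $s$ and with the sanction rate $f$, even though the marginal monetary return to contributing remains negative.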
As discussed in Section Interactions between monetary and non-monetary incentives, it is possible that the introduction of externally imposed incentives motivationally crowds out non-monetary returns. This makes the sign of net impacts from conditionality on additionality uncertain.

We next consider heterogeneity in non-monetary returns. Communities participating in FC differ considerably in the extent of their apparent cooperation for provision of local public goods (as we expect is the case for communities in any diverse PES setting). Consider two types of participant, differentiated by low (A) and high (B) non-monetary returns to cooperation, such that, at any given contribution, non-monetary returns are lower for type A than for type B. The functional form of $m(\cdot)$ determines an individual's response to the sanction. We do not speculate on the relative differences of the functional form for high and low types, except to note that, where $g_i^*$ is an optimum private contribution, the increase in $g_i^{A*}$ due to the sanction will be greater than the increase in $g_i^{B*}$. Alternatively, or in addition, it is possible that $m(\cdot)$ decreases in the presence of external sanctions due to motivational crowding out, as discussed above. Given that high types by definition have greater intrinsic (non-monetary) motivation, it is likely they would see a greater decrease in their non-monetary returns from a sanction's imposition than would low types. This implies that the increase in $g_i^{A*}$ due to the sanction will be larger than the increase in $g_i^{B*}$. This is the basis for our research question II: do initially lower contributors respond more positively to conditionality on additionality? We stress that we cannot test the specific mechanism for any differential impact, and either one or a combination of the above mechanisms (or others) could plausibly explain results.
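The crowding-out mechanism can be illustrated with a small numerical sketch. Everything here is assumed for illustration: a quadratic non-monetary return m(g) = theta*g - g^2/2, a sanction that shrinks theta by a fixed fraction (`crowding`), and the parameter values a = 2, f = 2, n = 5. The paper does not specify a functional form, so this shows one way the stated prediction could arise, not the authors' model.

```python
def optimal_contribution(theta, a=2.0, f=0.0, n=5, endowment=10.0, crowding=0.0):
    """Optimum g* when U = y - g + monetary share + theta_eff*g - g**2/2.

    Assumes the group total stays below the target, so the marginal
    monetary return to contributing is (a + f)/n.
    """
    theta_eff = theta * (1.0 - crowding)  # ASSUMED proportional crowding out
    marginal_cost = 1.0 - (a + f) / n     # net private cost of one more unit
    # First order condition: theta_eff - g = marginal_cost, clipped to [0, endowment].
    return min(endowment, max(0.0, theta_eff - marginal_cost))


def response_to_sanction(theta, f=2.0, crowding=0.1):
    """Change in the optimum when the sanction (and its crowding effect) is introduced."""
    before = optimal_contribution(theta)
    after = optimal_contribution(theta, f=f, crowding=crowding)
    return after - before


low_type, high_type = 3.0, 8.0  # hypothetical type A and type B motivation levels
print(response_to_sanction(low_type), response_to_sanction(high_type))
```

With proportional crowding, the type with more non-monetary motivation loses more of it when the sanction arrives, so its net response is smaller and can even turn negative. Setting `crowding=0.0` removes the difference between types under this particular functional form.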
We can also use the non-monetary component of utility, $m(g_i)$, to speculate on the impact of participation in rule-setting. Increased participation in an institution's design and implementation could engender greater buy-in and thus support for the institution's goal (Del Corso et al., 2017; Frey and Stutzer, 2006; Kroll et al., 2007; van Noordwijk and Leimona, 2010; Wahl et al., 2010; Walker et al., 2000). Extending once again our utility function and indicating choice in rule-setting by $c$, we have:

$$U_i = \pi_i + m(g_i, s, c)$$

In this case, non-monetary returns would be greater (for all $g_i > 0$), such that we would expect individuals' responses to an incentive to be more positive when they had the chance to participate in rule-setting. This is the basis for our research question III: do groups with a right to veto conditionality on additionality respond more positively to that conditionality?

Finally, we consider the impact of an internal coordination mechanism. Some group situations permit individuals to see their peers' actions and issue individually targeted punishments. If we assume that greater penalties ($p$) are levied on those who make "insufficient" contributions in the eyes of their peers, providing those contributions are salient, i.e.

$$\frac{\partial p(g_i, s)}{\partial g_i} < 0,$$

then the utility function becomes:

$$U_i = y_i - g_i + \frac{1}{n} \max\left\{0,\; a \sum_{j=1}^{n} g_j - f \max\left\{0,\; T - \sum_{j=1}^{n} g_j\right\}\right\} + m(g_i, s) - p(g_i, s)$$

This implies:

$$\frac{\partial m(g_i^*, s)}{\partial g_i} - \frac{\partial p(g_i^*, s)}{\partial g_i} = 1 - \frac{a + f}{n}$$

This gives a larger optimum, $g_i^*$, under both the collective sanction condition and the baseline condition ($f = 0$), than under the equivalent situations without the internal mechanism. The role of salience, $s$, is also apparent: publicly revealing contributions encourages greater cooperation both by increasing the non-monetary component of utility and via the penalty mechanism. This forms the basis for our research question IV: do groups in which individual members can monitor and respond to each other's contributions contribute more, in the baseline and in response to conditionality?

17 This is consistent with real experience under the FC program. Surveyed participants make considerable voluntary contributions of time, effort and resources to their communities. For example, 31% have held leadership positions, 61% attend at least one collective meeting per year, and 56% of households have a member who made a voluntary contribution of time to forest-related work activities.

18 The ultimatum game, for example, shows that perceived fairness can shift with relatively small changes in contributions, triggering dramatic changes in prosocial preferences. In the provision of public goods, group members often make peer comparisons to determine what is fair. Thus, even relatively small changes in contribution levels, such as moving from below what others contribute to above what others contribute, can greatly shift non-monetary returns to $g_i$.
To summarize, our model gives rise to four related research questions, which together aim to evaluate the effectiveness of collective conditionality in a context of social interactions. For convenience, we restate these questions here: I. Does collective conditionality on additionality encourage greater collective contributions? II. Do initially lower contributors respond more positively to conditionality on additionality than initially higher contributors? III. Do groups with a right to veto conditionality on additionality respond more positively to that conditionality than those without a right to veto? IV. Do groups in which individual members can monitor and respond to each other's contributions contribute more, either in the baseline and/or in response to conditionality, than those who cannot?
We investigate these questions using the framed field-laboratory experiment described in the next section.

Experiment
Framed field-lab experiments allow the study of potential institutions with relevant populations and with experimental control. Settings and policies are stylized, yet are incentivized and framed in ways analogous to reality (Harrison and List, 2015). They provide a means to simulate the impacts of policy changes that cannot otherwise be easily observed (or do not yet exist). The use of real payments increases incentive compatibility relative to hypothetical questionnaires.

The voluntary contributions (VC) game
We used a VC game, framed as a forest-conservation PES, to investigate the above four research questions. Key features are motivated by characteristics of the FC program. Each group had five participants, each of whom was endowed with 10 cards said to represent time that could be spent on community activities (reforestation, forest management) or on oneself and one's family. Participants were told that community activities improve the quality of community forests and thus generate a PES payment to the community, which is shared equally among the group members. 19 We employed visual aids and detailed examples. A verbal quiz was administered before the game commenced to ensure all participants intuitively understood the payoff function. In the absence of sanctions or internal mechanism treatments, this function is:

$$\pi_i = 10 - g_i + \frac{a}{5} \sum_{j=1}^{5} g_j$$

Focus group evidence suggests that this stylized scenario is realistic. Participants identified "the entire community", or community segments based on tenure status or particular communal working groups, as responsible for forest conservation. Similarly, benefits from PES payments and from forest products were widely perceived as accruing to the community collectively. As is typical in voluntary contributions games, payments are evenly divided. 20 The dominant individual strategy is always to contribute zero to the group; however, the social optimum is achieved if all participants in the group contribute their entire endowments. Group composition was quasi-anonymous. Three groups played in the same room, so that participants could not know which of the 15 people present were part of their group of 5.

Treatments
The game lasted for 12 rounds. We applied a collective target, $T$, to treatment groups in rounds 5-8 to investigate whether collective conditionality on additionality encourages greater collective contributions (question I). As described in the Model section, we used two levels of $T$, 20 or 50, which represent conditionality on non-additional and additional levels of collective contributions, respectively (see footnote 16). Groups were told that their contributions had to collectively meet or exceed $T$, otherwise a sanction would be applied to the group as a whole. The sanction amount was twice the collective shortfall. 21 This modifies the payoff function to:

$$\pi_i = 10 - g_i + \frac{1}{5} \max\left\{0,\; a \sum_{j=1}^{5} g_j - 2 \max\left\{0,\; T - \sum_{j=1}^{5} g_j\right\}\right\}$$

Pre-treatment rounds 1-4 provide a within-group baseline. Post-treatment rounds 9-12 test for persistent post-treatment impacts. Participants learned of their group's collective contributions, any sanctions, and their resulting payoffs, after each round, via their confidential payoff sheet. Experimenters explained these results verbally to each participant (but without revealing individual payoffs to other players) when updating payoff sheets.
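The round schedule and sanction rule described above can be sketched as follows. The function and variable names are hypothetical, and the sketch deliberately stops at the sanction amount rather than the final payoff, since the payment rate is not stated in this section.

```python
def sanction(total_contribution, target, rate=2):
    """Sanction on the group: `rate` times the collective shortfall below the target."""
    return rate * max(0, target - total_contribution)


def target_in_force(round_number, target, treated=True):
    """Target applying in a given round (1-12), or None when no target applies."""
    return target if treated and 5 <= round_number <= 8 else None


# Label each of the 12 rounds by phase: 1-4 pre, 5-8 treatment, 9-12 post.
phases = {r: ("pre" if r <= 4 else "treatment" if r <= 8 else "post")
          for r in range(1, 13)}

# Worked arithmetic: a group contributing 18 against a target of 20 is
# 2 short, so the sanction is 2 * 2 = 4; meeting the target costs nothing.
print(sanction(18, 20), sanction(25, 20))
print(phases[4], phases[5], phases[9])
```

Keeping the schedule separate from the sanction rule mirrors the within-group design: the same group is observed before, during, and after the target applies.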
We used a sorting procedure to investigate whether initially lower contributors respond more positively to conditionality on additionality (question II). The 15 participants per session were sorted into three groups of 5 participants each, based on their initial ("round zero", or $R_0$) contributions. Group type A comprised the 5 participants with the lowest $R_0$ contributions, B comprised the 5 middle-ranking participants, and C the highest. Participants were not told the basis for this sorting, did not know who their initial or resorted group members were, and did not know that the game would repeat after the first round.
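The sorting step can be sketched as a rank-and-cut procedure. The participant identifiers and contribution values below are made up, and the tie-breaking rule (ties keep their listed order, because Python's sort is stable) is an assumption; the text does not say how ties were resolved.

```python
def sort_into_groups(round_zero, group_size=5):
    """Rank participants by round-zero contribution and cut into A (lowest), B, C (highest).

    round_zero maps a participant id to that participant's R0 contribution.
    Assumes exactly three groups' worth of participants (15 in the experiment).
    """
    ranked = sorted(round_zero, key=lambda pid: round_zero[pid])
    labels = "ABC"
    return {
        labels[k]: ranked[k * group_size:(k + 1) * group_size]
        for k in range(len(labels))
    }


# Hypothetical session of 15 participants with made-up R0 contributions.
contributions = [0, 10, 3, 5, 7, 2, 8, 4, 6, 1, 9, 5, 3, 7, 2]
round_zero = {f"p{k}": c for k, c in enumerate(contributions)}

groups = sort_into_groups(round_zero)
for label, members in groups.items():
    print(label, [round_zero[pid] for pid in members])
```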
In order to investigate whether groups with a right to veto conditionality on additionality respond more positively to that conditionality (question III), we compared contributions under three different levels of participation in the choice of T. Levels were: (1) experimenters chose one of the two T options without input from participants; (2) a randomly selected participant chose one of the two T options; or (3) a randomly selected participant chose one of the two T options, which was then put to a (confidential) vote immediately before implementation. In the vote case, T was implemented only if it was endorsed by a majority of the group. 22 We investigated whether groups in which individual members can monitor and respond to each other's contributions contribute relatively more (question IV) using a subset of sessions that featured an exogenously specified "internal mechanism." This provided all players anonymized information about group members' contributions, plus the ability to show disapproval by issuing anonymous penalties to other players. Penalties cost one unit to send and two units to receive. Table 1 shows all treatment combinations.
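The vote and the peer-penalty rules lend themselves to a short sketch. The names are hypothetical, and two details are assumptions: "majority" is read as a strict majority of the five group members, and penalties are deducted directly from round payoffs in the same units.

```python
def target_adopted(votes_in_favour, group_size=5):
    """Vote treatment: the proposed target applies only with a strict majority in favour."""
    return votes_in_favour > group_size / 2


def apply_penalties(payoffs, penalties, send_cost=1, receive_cost=2):
    """Deduct penalty costs from payoffs.

    penalties is a list of (sender, receiver) index pairs, one pair per
    penalty issued: the sender loses 1 unit and the receiver loses 2.
    """
    adjusted = list(payoffs)
    for sender, receiver in penalties:
        adjusted[sender] -= send_cost
        adjusted[receiver] -= receive_cost
    return adjusted


print(target_adopted(3), target_adopted(2))
# Members 0 and 1 each penalize member 4, who kept more for themselves.
print(apply_penalties([12, 12, 12, 12, 14], [(0, 4), (1, 4)]))
```

In the second call the two senders each give up one unit to take four units in total from the low contributor, which is why such penalties are costly to use yet can still deter freeriding.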

Participants
Four hundred and thirty-five participants (and approximately 60 more in pilot sessions) from 15 FC-participating communities took part. Communities were clustered in four locations, one each in Jalisco and Colima, and two in Oaxaca (Fig. 1). We collaborated with three civil society associations (INDAYU and FARCO in Oaxaca, and MABIO in Colima) and directly with one núcleo agrario (Ejido San Agustín in Jalisco). Sites were selected based on discussions with CONAFOR staff and exploratory field visits in 2013, with primary criteria being collective properties enrolled in FC between 2008 and 2013, and a focus on hydrological services.
Treatments were applied approximately evenly in each of the four regions, with the order randomized within region. 23 While early participants could conceivably communicate with later participants, this possibility was minimized in three out of four locations by only undertaking one or two sessions in each community 24 and by requesting that participants not speak about game details to others. We followed up our experiment with a survey and focus group discussion with participants, as well as interviews with community leaders, to gather contextual data. Individual participants earned between MXN 180 and 300 (USD 12-20), depending upon performance.

Participant characteristics varied across sites (Table 2). Almost two-thirds of participants had lived in their current community all their lives. About half are women and about half farm as their primary source of income. The average time spent in formal education is just under eight years. In terms of forest work, participants' households average five days per month of paid work, and over seven days a month of unpaid work. Characteristics are balanced between the control and treatment group (joint orthogonality test: F(9, 383) = 1.10, p = 0.360), suggesting reasonable internal validity. 25

At least two potential sources of selection bias should be considered when evaluating external validity. First, participation is voluntary (as is the case in all field-lab experiments), with invitations to take part extended by our organizational partners to all eligible community members. Results could reflect the tendencies of more community-oriented participants, or those more interested in forest conservation. Second, communities themselves self-select to participate in the FC program, and thus may have above-average interest in forest conservation, collective organizational ability, and/or links to an interested civil society organization. Further, eligible communities vary considerably in their socio-economic profile. Inferences are drawn from relative behaviors across randomized treatments (using difference-in-differences with individual fixed effects to control for time-invariant participant characteristics, see Section Analysis), and thus while selection bias operating at both levels could alter the magnitude of responses, we consider it unlikely to qualitatively change our study's conclusions. However, caution is required when extrapolating, particularly to very different social and environmental contexts.

Notes: The treatment is a shift in conditionality, implemented in rounds 5-8 by imposing a target, T. There are two target levels (T = 20, T = 50), which approximate conditionality on non-additional and additional levels of collective contributions relative to the baseline rounds, respectively.

20 Equal division of payments represents a maximally collective arrangement. We recognize, however, that there are reasons why PES payments in practice could be distributed unevenly and thus be only partially collectivized. Elite capture could see resources directed to individuals with political power. Or, communities could choose to allocate payments based on individual effort, turning a collective PES program into an individual-based PES program (albeit one partially administered at a local level; in this case, the community makes use of information unavailable to the state). Even in cases where payments are spent on public goods, individuals within communities will derive differential utility from them. Such variation in payments does exist across FC communities. However, we focus on the "pure" collective case to provide more fundamental insights, and to avoid confounding responses to collective incentives with responses to allocation schemes (uneven distribution may make collective action more or less difficult, depending on the distribution mechanism and rationale, see Rodriguez, 2016). While modeling allocation scheme variants would be useful, it is beyond the scope of this paper.

21 As previously discussed, a scaling sanction is a stylized version of the approach used by CONAFOR in cases of partial contract non-fulfillment (see Section Setting).

22 To credibly contextualize these participation mechanisms, framing differed slightly between treatments. A lack of choice in (1) was described as a CONAFOR decision and justified on a basis of forest conservation. Randomly selected participant choice in (2) was described as a choice made by a local NGO, interested either in increasing rural incomes or forest conservation. The vote in (3) was described as a negotiation between a local NGO and the community. Impacts of framing differences, independent of the impact of the participatory mechanism, are discussed in Section Group Veto.

23 One exception is the internal mechanism treatments, which for logistical reasons were implemented later in the field season.

Analysis
To explore question I we estimate the average treatment effect (ATE) of conditionality using a difference-in-differences (DiD) regression in which g_ti is the contribution of individual i in round t, treatment is indicated by dummy variable T, and P1 and P2 are dummy variables for the policy and post-policy periods respectively. Sorted groups (A, B, and C) are pooled to give ATEs estimated by β1 (during policy) and β2 (post-policy). In this and subsequent individual-level panel regressions we use individual fixed effects, v_i, to control for any time-invariant differences between participants. 26 The idiosyncratic error term is ε_ti. Standard errors are robust to heteroskedasticity and clustered on group. Given the bounded range of the dependent variable (0-10), we additionally present random-effects Tobit models.

We explore the differential impacts of conditionality on additionality (question II) with the same difference-in-differences regression, augmented with an interaction term that includes the average of the underlying measure used for group sorting. This measure is participants' initial round (R0) contributions, g0.

To investigate group participation in rule setting (question III), we add to the first model dummy variables, j, representing the three rule-setting sub-treatments: experimenter choice, participant choice, and participant choice + vote.

We explore the impact of the internal coordination mechanism (question IV) with a triple-differences regression. The mechanism takes the form of anonymous revelation of contributions and the ability to send penalties, and applies in all rounds of certain sessions. Dummy variable M indicates the coordination mechanism.

Experimental results are supported by focus group data. These were transcribed from audio recordings from ten communities, and then coded and analyzed using NVivo qualitative data management software. Interviewer notes were consulted for sessions in two additional communities where audio recordings were not possible.
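For orientation, the first two regression specifications described in this section might be written as below. This is a sketch reconstructed from the verbal description only: the symbols g, T, P1, P2, β1, β2, v and ε follow the text, whereas the coefficient labels γ, δ and θ, and the notation for the group-average initial contribution, are assumptions introduced here for illustration.

```latex
% Question I: pooled DiD with individual fixed effects.
% beta_1 and beta_2 are the policy-period and post-policy ATEs (as in the text);
% gamma_1 and gamma_2 are assumed labels for the period main effects.
\begin{equation*}
g_{ti} = \beta_1 \,(T_i \times P_{1t}) + \beta_2 \,(T_i \times P_{2t})
       + \gamma_1 P_{1t} + \gamma_2 P_{2t} + v_i + \varepsilon_{ti}
\end{equation*}

% Question II: the same model augmented with interactions with the
% group-average initial contribution, written here as \bar{g}_0.
% delta and theta are assumed labels, not the authors' numbering.
\begin{equation*}
g_{ti} = \beta_1 \,(T_i \times P_{1t}) + \beta_2 \,(T_i \times P_{2t})
       + \delta_1 \,(T_i \times P_{1t} \times \bar{g}_0)
       + \delta_2 \,(T_i \times P_{2t} \times \bar{g}_0)
       + \gamma_1 P_{1t} + \gamma_2 P_{2t}
       + \theta_1 \,(P_{1t} \times \bar{g}_0) + \theta_2 \,(P_{2t} \times \bar{g}_0)
       + v_i + \varepsilon_{ti}
\end{equation*}
```

In this sketch the treatment main effect is not shown because, under individual fixed effects, any time-invariant participant-level term is absorbed by v_i. The rule-setting and triple-differences models would extend the first equation with further interactions on the sub-treatment dummies j and the mechanism dummy M.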

Conditionality on additionality
Average treatment effects

Conditionality on additionality has a positive and significant impact on collective contributions during the treatment rounds, R5-8 (Table 3). The ATE from imposition of the high collective target (T = 50, which requires additional contributions) is relatively large: 3.5-4.6 units out of an endowment of 10 units. We do not see net motivational crowding out while the treatment is in place: any decreases in intrinsic motivation were more than offset by the increased external incentive. The post-policy (R9-12) ATE is positive and significant but, not surprisingly, is considerably smaller (0.42-0.49 units). This persistent impact from the temporary treatment suggests that a durable shift in preferences for cooperative behavior was induced by the external incentive, a process known as "internalization" in the social psychology literature (Ryan and Deci, 2000).
Data from focus groups provide supportive evidence of the positive effect of conditionality on additionality. Acceptance was widespread among participants for levying sanctions on communities that do not comply with forest-conservation rules. Participants from a majority of communities stated that such community-level sanctions are fair and necessary, or, at the least, understandable given the collective administrative arrangements. Some focus group participants described conditionality and sanctions as reinforcing existing social attitudes and messages from government about the importance of forest conservation. These data also support the experiment's underlying setup: participants at multiple sites described conservation as a community responsibility, one that individuals should feel obliged to support.
We next test the impact of the low collective target, T = 20. This target is non-binding on almost all groups, meaning that it is usually met or surpassed even in the absence of the treatment. Hence, this target does not condition on additionality for most groups. For those groups with total contributions above 20 during pre-treatment rounds (R1-4), any impact from imposing T = 20 during the treatment rounds (R5-8) is likely due to a non-monetary incentive. Possibly, the target signals social desirability of contributions (see, e.g., Bowles, 1998). Groups with contributions below 20 in R1-4 (i.e. those for whom T = 20 represents conditionality on additionality) increase their contributions significantly (p < 0.01) in response to the treatment with T = 20 (Table 4, columns 1 and 3). Groups with contributions in R1-4 that are clearly above T = 20 (columns 2 and 4) do not change their behavior in response to this treatment. Broadly speaking, conditioning on non-additional collective contributions shows little to no impact on contributions. 27

27 Groups with baseline contributions just slightly above T = 20 show a small, significant increase in contributions due to the T = 20 treatment in one but not both models. At most, this could suggest a slight non-monetary impact on those groups who are not affected by conditionality on additionality, but are very close to being affected. We find significance in the 21-50 range, but not the 22-50 range or higher (tables not shown for space reasons). We do not believe this substantially detracts from our conclusion of little to no impact from conditioning on non-additional collective contributions.

Heterogeneity: ATEs by initial contribution
Question II asks whether conditionality on additionality has a greater impact on initially low contributing groups. The initial contribution, g0, made during the sorting round (and thus before individuals can be influenced by the behavior of others) proxies for underlying prosocial tendency. A group whose members contribute more in R0 (i.e. a type C group) may contribute more over the course of the experiment. Indeed, contributions in R0 are correlated with contributions in subsequent non-treatment rounds R1-4 (r = 0.387, p < 0.05). However, that does not in itself suggest which group type, A, B, or C, will respond most strongly to this conditionality.
A plot of average contributions of individuals belonging to the three group types, both treated and control, over time, suggests that initially low contributors (type A) respond more to increased collective conditionality than do initially high contributors (type C) (Fig. 2). Difference-in-differences regression shows that this discrepancy is statistically significant (a 0.26-0.36 unit reduction in the ATE per additional unit of R0 contribution, p < 0.05) (Table 5, columns 1 and 3). This result is consistent with the suggestion discussed in Sections Interactions between monetary and non-monetary incentives and Model. Initially high contributors may have greater intrinsic motivation (which explains their high contributions in R0), which is partially crowded out by the imposition of new external incentives. In contrast, those who contribute less initially (type A) can be expected to have less intrinsic motivation. They thus have less intrinsic motivation to lose, and consequently, less scope for motivational crowding out. As a result, they show a greater net increase in contributions due to conditionality on additionality. However, it should be noted that this result is not a test of this specific causal mechanism. Alternative explanations, including heterogeneity in preferences for money or heterogeneity in the curvature of marginal nonmonetary returns, are also plausible given this result (see Model section).

Fig. 2. Average contributions across rounds, under control and treatment, by sorted group type. Group type A represents those individuals that make the smallest initial contributions (solid lines), B represents those that make midrange initial contributions (dashed lines), and C those that make the largest initial contributions (dotted lines). For each group type, average contributions under control conditions are depicted in grey, and those under the treatment (Target = 50, i.e. conditionality on additionality) are depicted in black. The treatment is applied in rounds 5-8.
Putting aside the mechanism, we further investigate research question II with an eye towards possible ceiling effects. This heterogeneity result could reflect the fact that the 10 endowed units are a "ceiling" on contributions, one that is more likely to constrain those individuals who contributed more initially. We guard against this possibility by first removing from the sample all individuals who at any point in the game contributed all ten units of their endowment. 28 The remaining individuals were thus only those not constrained from contributing more should they have wished. Again, we see a negative and significant result on the interacted difference-in-differences variable (Table 5, columns 2 and 4), indicating that low contributors responded more positively to conditionality on additionality than their high-contributing counterparts.
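The ceiling-effect restriction described above amounts to a simple filter, sketched here under stated assumptions. The function name and the data layout (a mapping from participant to a list of per-round contributions) are hypothetical.

```python
def drop_ceiling_constrained(contributions, endowment=10):
    """Keep only participants who never contributed their full endowment.

    contributions: dict mapping participant id -> list of per-round contributions.
    A participant is dropped if they gave the full endowment in any round,
    including treatment rounds.
    """
    return {
        pid: rounds
        for pid, rounds in contributions.items()
        if max(rounds) < endowment
    }


# Made-up example: "p2" gave ten units once and is therefore excluded.
sample = {"p1": [3, 4, 9, 8], "p2": [5, 10, 6, 6], "p3": [0, 0, 2, 1]}
unconstrained = drop_ceiling_constrained(sample)
```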
Notes: Dependent variable is individuals' contributions. g0 avg. is the average contribution given by group members in R0 (sorting round). Treat × R5-8 × g0 avg. may be interpreted as the change in the impact of the treatment (Target = 50) during the treatment rounds, due to a one unit change in the average contribution given by group members in R0. The subsample "Max. 9" includes only individuals who never give the maximum of ten units and thus are assumed unconstrained by ceiling effects. Standard errors (in parentheses) are robust to heteroscedasticity and clustered at the group level for OLS regression. Fixed effects are at the individual level for OLS regression. * p < 0.10. ** p < 0.05. *** p < 0.01.

28 We removed individuals who gave ten units even just once, including those who gave ten units during a treatment round (when they may have been expected to give ten units). We thus consider this a strong control against ceiling effects.

For a final test of this result, we calculate a standard difference-in-differences variable, change, from the difference between average contributions over the policy and the pre-policy periods for treated individuals, minus the same difference for the n control individuals (on average). We generate a further variable, shortfall, from the difference in contributions between an individual's effective target (i.e. their share of the group target, T/5), and the individual's average baseline contributions. 29 Dividing change by shortfall yields the fraction of the distance to the target (T = 50) that the individual achieved. For a given rise in contributions, in absolute terms, this metric assigns larger fractions achieved to initially higher contributors. An increase of one unit, for example, gives a fraction of 1.0 for a player who contributed nine units, on average, during the baseline, but a fraction of 0.25 for a player who gave six units, on average, during the baseline. Mechanically, this would give a positive relationship between the fraction and initial contribution, g0, if there were no heterogeneous result. However, if initially low contributors increase by relatively more due to the treatment, we would expect to see a non-monotonic or overall negative relationship. Consistent with this expectation, we see a significant non-monotonic relationship between the fraction and initial contribution, g0 (Table 6, column 4). Again this suggests that initially low contributors are responding relatively more strongly to conditionality than initially high contributors, or at least more strongly than initially middling contributors. We return to this finding in our discussion section.
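The worked example in this passage (a one-unit increase yielding a fraction of 1.0 or 0.25) can be reproduced with a short sketch. The function and argument names are hypothetical, and the adjustment for baseline trends in the control groups is omitted for simplicity.

```python
def fraction_of_shortfall(change, baseline_avg, group_target=50, group_size=5):
    """Fraction of the distance to the individual target closed by `change`.

    change: rise in the individual's average contribution (policy minus pre-policy).
    baseline_avg: the individual's average pre-policy contribution.
    The individual target is the member's share of the group target (T / 5).
    The control-group trend adjustment is not applied in this sketch.
    """
    individual_target = group_target / group_size
    shortfall = individual_target - baseline_avg
    if shortfall <= 0:
        raise ValueError("baseline already meets the individual target")
    return change / shortfall


# One-unit increase for a baseline of nine units versus six units.
high_baseline = fraction_of_shortfall(1, 9)
mid_baseline = fraction_of_shortfall(1, 6)
```

The same absolute increase closes the whole gap for the player who started at nine units but only a quarter of it for the player who started at six, which is why the metric mechanically favors initially higher contributors.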

Other social interactions
The extent to which a group will successfully manage to fulfill a collective PES contract depends on group members' attitudes toward the PES institution, and on the relationships among group members. Attitudes toward the PES institution may be influenced by participating in the contract's rule-setting process. Additionally, group behavior may be a function of the extent to which members can see and influence each other's decisions. We test whether holding a veto on conditionality on additionality increases collective contributions (question III), possibly via a sense of ownership over the rule change. We then test if a mechanism by which members can influence other members allows a group to better respond to an external incentive (question IV).
To provide baseline context for examining these social interactions, we first show that peers influence each other's behavior. 30 Having peers who give more in the initial round encourages a participant to contribute more in subsequent

Notes: Change is the difference between an individual's average contribution in pre-policy (R1-4) and policy periods (R5-8). Shortfall is the difference between the required contribution (10 units for an individual, given that the group target is 50) and the individual's average contribution in pre-policy (R1-4) periods. g0 is the contribution given in R0 (sorting round). The subsample "Max. 9" includes only individuals who never give the maximum of ten units and thus are assumed unconstrained by ceiling effects. Standard errors in parentheses.

29 We adjust both of these distance measures for baseline trends by subtracting the average change in contributions over time in the control groups (terms on right).

30 We regress an individual's contributions against the difference between his/her initial contribution and that of his/her group members (g0_i − g0_group), controlling for his/her own initial contribution (g0_i). As previously discussed, participants in different sessions differ in their initial propensity to contribute. Consequently, sorted groups (A, B and C) are not necessarily equivalent across sessions, and a given initial contribution, g0, could be the lowest in a C group or the highest in a B group, for example. This variation helps to cleanly identify peer effects. Controlling for g0, β1 is the average effect of the initial difference between a participant's and his/her peers'

Group veto
Question III expressed our expectation that having a veto over conditionality on additionality raises the impact of that conditionality, should it be implemented. Our test of this notion captures a real life element of institutional variation in this setting. Communities participating in the Mexican national PES programs develop a forest management plan based on minimum standards set by CONAFOR combined with their selection of voluntary standards from a menu of options. In some cases, choices are made by community leaders alone (the comisariado), and in others, a vote is held by all the eligible members of the community (the asamblea). Likewise, communities have considerable freedom in deciding how PES payments will be allocated.
In line with our hypothesis, the ATE of conditionality on additionality (T = 50) is highest under the treatment allowing maximum participation, i.e., when a randomly selected participant is given the power to choose T and his/her choice is put to a vote (Table 8). This participant choice + vote treatment ATE is significantly above that of the participant choice treatment alone (Wald test: F(1, 71) = 7.17, p < 0.01).
Focus-group data support this finding. Participants in Colima suggested that internal decisions regarding contracts could and should be made through the asamblea, implying both local choice and a veto on any contract dimensions. Qualitative data also seem to help us interpret the lack of significant difference between the ATEs for the experimenter-chosen T and the random participant-chosen T (Table 8, columns 1 and 2, Wald test: F(1, 71) = 0.68, p > 0.1). Recall that our framing of

Notes: Dependent variable is individuals' contributions. This is averaged across pre-policy period rounds (R1-4) in the case of the OLS model (i.e. a cross section analysis). g0 is the contribution given in R0 (sorting round). g0 difference is the difference between g0 and g0 of other group members in R0. Standard errors (in parentheses) are robust to heteroscedasticity and clustered at the group level for OLS regression.

Rodriguez et al. (2015) also reported high levels of community confidence in CONAFOR at other upstream FC sites. This could be confounding the impact of less rule-setting participation in this subsample. Overall, the most striking result is that from the clean comparison between the remaining two treatments, participant choice, and participant choice + vote. Participation in the form of a simple veto on conditionality significantly increases the impact of that conditionality.

Internal mechanisms for cooperation
In proposing question IV, we speculated that a mechanism that facilitates internal cooperation may increase both baseline contribution levels and the response to conditionality on additionality. The mechanism we test is anonymized information about group members' contributions, coupled with a means to issue anonymous sanctions to other group members. Evidence of this mechanism's effectiveness can be seen in Fig. 3. Contributions increase across rounds in groups with the internal mechanism (IM) but decrease in those without. We confirm statistical significance using a triple-differences regression, which shows significantly higher contributions in IM groups in later rounds relative to non-IM treated groups (Table 9) (p < 0.05). We find a positive and significant increase in the ATE of 1.74-2.98 units out of 10 (p < 0.1), indicating that the collective conditionality on additionality is more effective in IM conditions. This positive treatment effect is strongest among initially low contributing groups. Overall, our findings match our expectation that communities that are able to coordinate through peer-to-peer interactions can better solve collective action problems and can therefore better respond to external policy demands.
Supporting the relevance of internal mechanisms in our field setting, focus-group data indicate the presence of intracommunity monitoring and sanctioning for environmental behaviors in many communities. Focus group participants consider it their responsibility not only to abstain from acts that damage forests (e.g., lighting fires, cutting protected tree species, littering), but also to report violations by others. Local sanctions varied widely amongst communities but included fines, additional work and even imprisonment. The rules that were described to us in focus groups include both formal state regulations and a large number of rules determined at a community level. While by definition, internal mechanisms are not under the control of policy makers, resources supporting collective PES can be guided toward communities with demonstrated ability to coordinate on key decisions.

Discussion and conclusion
Most PES programs to date are based on individual contracts, for environmental actions on land titled to individual households or firms (Kerr et al., 2014; Porras et al., 2008). The prominence of communally titled land, however, particularly in developing countries, is increasingly motivating PES contracts with collectives (groups of farmers, or communities). In addition to better matching existing tenure arrangements, collective PES may lower transaction costs and improve spatial congruence with biophysical systems or areas of habitat (Kerr et al., 2014; Swallow and Meinzen-Dick, 2009).
Collective PES may also help solve information problems. At an individual level, PES administrators do not know what actions or ecosystem services outcomes individuals would have provided without the program. Even when such baseline information is in hand, monitoring individuals' actions and outcomes within the program is costly. At a group or aggregate level, both baselines and actions under programs may be approximated at a reasonable cost. For example, a program can target areas with higher baseline deforestation risk (as determined by satellite monitoring) (Wünscher et al., 2008), or, it can pay for actions not otherwise common across a landscape (Pagiola et al., 2007). At larger scales, it is easier to determine both baseline outcomes and program outcomes, and thus it is easier to write contracts that are additional to those baseline outcomes. Analogous approaches have been proposed in other policy domains, such as the control of non-point source pollution (Rodriguez, 2016;Segerson, 1988;Xepapadeas, 1991). For PES, ways to increase additionality are valuable given that analyses of programs in developing countries to date have found, at best, modest evidence of positive environmental impact (Alix-Garcia et al., 2012;Muñoz-Piña et al., 2008;Pattanayak et al., 2010;Porras et al., 2008;Wunder et al., 2008).
Along with others in this small but growing literature (Hayes et al., 2017; Kerr et al., 2014; Swallow and Meinzen-Dick, 2009), we argue that there is great value in understanding what institutional features can contribute to successfully contracted collective action. We used a field-laboratory experiment with real PES participants to examine the impacts of collective conditionality on additionality in PES contracts. In collective settings where free riding is possible, positive contributions from individuals often rely on intrinsic motivations such as other-regarding preferences, possibly supported by social norms. External monetary incentives, even if strictly conditional, are in themselves insufficient because individual contributions are monetarily irrational. Given this, there exists the possibility of diminished or even net negative outcomes if an external intervention were to "crowd out" those critical intrinsic or non-monetary motivations (Bowles, 2008; Frey and Jegen, 2001; Rode et al., 2015). Such motivational shifts could also last beyond the life of incentives (Bowles, 2008; Fehr and Falk, 2002). We found that conditionality on additionality raised collective contributions, despite the fact that for an individual, contributions were monetarily irrational even with such conditionality. In this regard, our model simulates real contexts: many PES programs, including those at our study sites, provide payments that are in themselves too low to motivate monetarily rational contributions (Porras et al., 2008). To the extent possible in a stylized lab setting, we also see evidence for a durable shift in behavior, i.e. post-treatment impacts. This is relevant for PES programs where time-limited, non-renewable contracts are common.

0.598*** (0.116)
Treat × g0 avg.: 0.087 (0.143)
g0 avg. × R5-8: −0.121 (0.074); −0.143 (0.089)
g0 avg. × R9-12: −0.049 (0.088); −0.065 (0.089)
Treat × g0 avg. × R5-8: −0.369*** (0.116); −0.296*** (0.114)
Treat × g0 avg. × R9-12: 0.102 (0.115); 0.154 (0.111)
IM × Treat × g0 avg. × R5-8: −0.317* (0.173); −0.216 (0.231)
IM × Treat × g0 avg. × R9-12: 0.

Notes: Dependent variable is individuals' contributions. IM is a dummy variable indicating internal mechanism treatment (applied in all rounds). IM × Treat × R5-8 may be interpreted as the change in the impact of the treatment (Target = 50) due to presence of the internal mechanism. g0 avg. is the average contribution given by group members in R0 (sorting round). Standard errors (in parentheses) are robust to heteroscedasticity and clustered at the group level for OLS regression. Fixed effects are at the individual level for OLS regression.
We found that the increase in contributions was greater for those who initially contributed less (a result that is robust to ceiling effects). This smaller increase in contributions for the initially high contributors is consistent with those participants having more intrinsic motivation and thus more scope to be affected by motivational crowding out, as we describe in our theory model. It is important to note that this consistency does not rule out other explanations per se, as our experiment did not specifically test between alternative mechanisms that could lead to heterogeneous responses. Yet our results suggest that, whether through motivational shifts, differential preferences, or other mechanisms, there is relatively less impact from conditionality on additionality among higher contributors. As programs expand, and thus move beyond the most enthusiastic self-selecting communities, they will increasingly need to contract with communities with lower intrinsic environmental motivation and lower baseline contributions. For policy purposes, our finding supports prior suggestions that such targeting may increase program impact (see for example, Alpízar et al., 2013, 2004; Pfaff and Sanchez-Azofeifa, 2004; Robalino and Pfaff, 2013). This possibility has been raised for the context of Mexico (Muñoz-Piña et al., 2008) and is an ongoing design issue for PES broadly.
We investigated two further social interactions relevant for collective PES program design. The first concerns who gets to participate in decisions about the design and implementation of contract rules. We found that giving the community a veto over the imposition of greater conditionality raised the impact of that conditionality. This occurs even if the veto is not used and has no tangible impact on the rule that results. Second, we found that if a community can coordinate responses to external intervention, both baseline contributions and the impact of greater external incentives upon contributions increase. In our experiment, this mechanism took the form of information about others' contributions and an ability to penalize peers for unsatisfactory contributions. This mechanism mirrors features of the internal governance structures we see in our Mexican study sites. This finding supports the targeting of more-organized communities for PES participation. Yet we also note that PES policies may bring about improvements in community governance, not merely benefit from them (Hayes et al., 2015). In our Mexico study sites, governance is reportedly enhanced in some cases through FC programs' partnerships with civil society organizations (Popovici, 2017).
In sum, our results inform the design and targeting of collective PES. They suggest that policy makers should set contracts that are conditional on additional actions, prioritize contracts with communities with strong internal governance, and offer communities a role in making the rules that affect them. These insights likely apply to other settings with collective decision-making characteristics.