More strictly protected areas are not necessarily more protective: evidence from Bolivia, Costa Rica, Indonesia, and Thailand

National parks and other protected areas are at the forefront of global efforts to protect biodiversity and ecosystem services. However, not all protection is equal. Some areas are assigned strict legal protection that permits few extractive human uses. Other protected area designations permit a wider range of uses. Whether strictly protected areas are more effective in achieving environmental objectives is an empirical question: although strictly protected areas legally permit less anthropogenic disturbance, the social conflicts associated with assigning strict protection may lead politicians to assign strict protection to less-threatened areas and may lead citizens or enforcement agents to ignore the strict legal restrictions. We contrast the impacts of strictly and less strictly protected areas in four countries using IUCN designations to measure de jure strictness, data on deforestation to measure outcomes, and a quasi-experimental design to estimate impacts. On average, stricter protection reduced deforestation rates more than less strict protection, but the additional impact was not always large and sometimes arose because of where stricter protection was assigned rather than regulatory strictness per se. We also show that, in protected area studies contrasting $y$ management regimes, there are $y^2$ policy-relevant impacts, rather than only $y$, as earlier studies have implied.


Introduction
National parks and other protected areas are at the forefront of global efforts to protect biodiversity and ecosystem services. Understanding how these reserves affect environmental and social outcomes is thus crucial to building the evidence base for conservation policy. Recently, scholars have made advances in attempting to isolate the causal effects of protected areas separately from confounding factors that jointly affect where protected areas are placed and the outcomes of interest (e.g., Andam et al 2008, Sims 2010, Joppa and Pfaff 2011, Nelson and Chomitz 2011, Canavire-Bacarreza and Hanauer 2013). Although these analyses have shed light on protected area impacts, most treat 'protection' as if it were homogeneous. In practice, however, protected areas are heterogeneous in their management objectives and the human uses that they permit. Protected areas that severely restrict human uses have been a focal point of controversy in the 'fortress conservation' debates (Brockington 2002). Thus understanding how their environmental and social effects compare to those of less strictly protected, and often less controversial, areas is important. This understanding would also be a useful input into recent debates about 'degazetting' and 'downgrading' protected areas to less strictly protected or unprotected status (Mascia and Pailler 2011).
As noted by Andam et al (2013), theory is not clear about the way in which environmental impacts vary with the strictness of protection. Holding all other attributes equal, stricter protection implies a lower probability of anthropogenic disturbance and a higher probability of restoration. However, the social conflicts that strictly protected areas can engender may lead to lower overall enforcement or compliance with protected area regulations. Moreover, establishing strictly protected areas is politically more difficult on accessible, productive lands and may often be guided by aesthetics or other criteria (Dudley and Stolton 2012) that favor more remote, less productive lands. For example, Joppa and Pfaff (2009) show that the stricter the management category of tropical forest protected areas, the more likely the area is found on 'higher and steeper lands further away from roads and urban centers'. Thus the lands to which strict protection is assigned may be, on average, less likely to be disturbed and more likely to regenerate in the absence of protection compared to less strictly protected areas. This phenomenon is called the 'residual reserve problem' in the conservation planning literature (Pressey and Bottrill 2008). If less strict protection is more likely to be assigned to lands facing higher human pressures, less strictly protected areas may result in greater avoided losses even though they legally permit more disturbances than strictly protected areas [6].
[6] The same theoretical ambiguity is present in the context of social impacts. Strictly protected areas may preclude more productive uses than less strictly protected areas, but they also may be established on lower opportunity cost lands and potentially induce more tourism business opportunities, more infrastructure and more ecosystem services. Thus the net social effect of the management rules is difficult to predict based on theory alone.
Without theory to provide guidance, the effect of different management categories is ultimately an empirical question. One study (Nelson and Chomitz 2011) estimates the average effect of protected areas on fire density in the tropical forest biome conditional on the strictness of protection as measured by International Union for Conservation of Nature (IUCN) protected area categories from the World Database on Protected Areas (WDPA). Their results suggest that, for Latin America and Asia, the average effect was larger in multiple-use protected areas than in strictly protected areas (in Africa, the estimates are imprecise). However, the estimates may be sensitive to hidden bias: most of the protected areas lack true baseline (pre-protection) measures of the outcomes and the study uses measures of seven confounding variables to control for the assignment of all kinds of protection across all countries.
Country-level studies that consider the particular aspects of the country's protected area assignment process may provide more accurate estimates of impacts. Three recent studies examine the relative effectiveness of strictly protected areas, multi-use protected areas and indigenous reserves in the Amazon (see also Nepstad et al 2006). Pfaff et al (2012) claim that sustainable use areas and indigenous reserves generated more avoided deforestation, on average, than strictly protected areas. Soares-Filho et al (2010), using a different sample and (unusual) empirical design, claim that strictly protected areas were more effective than sustainable use areas (and indigenous reserves were the most effective). Nolte et al (2013) generate ambiguous results. In Thailand, Sims (2010) claims that wildlife sanctuaries and national parks were more effective than less strictly protected forest reserves in generating additional forest cover. In Costa Rica, Andam et al (2013) estimate that less strictly protected areas induced more regrowth on cleared forests than strictly protected areas, but this difference is not statistically different from zero.
We estimate the effects of differing levels of protection on avoided deforestation in four countries that are important for biodiversity conservation: Bolivia, Costa Rica, Indonesia (Sumatra) and Thailand. We improve upon the aforementioned literature in four key ways. First, with the exception of Andam et al (2013), previously published studies do not clearly define the estimated treatment effects (in particular, the counterfactual status is often vague): yet when comparing across two types of governance regimes, there are at least four policy-relevant treatment effects. Second, no study has used estimates of these effects to gain more insights into the mechanisms through which legal restrictions affect outcomes. Third, previous studies do not test the statistical significance of the differences in treatment effects across protection types (i.e., how confident can one be that the differences in the point estimates are not simply the result of sampling variability?). Fourth, by using consistent empirical designs and definitions of treatment effects, our study facilitates cross-national comparisons.
Table 1 summarizes the data from the four countries. The Bolivian and Costa Rican data span the entire country (Costa Rica data come from Andam et al 2008). The Thai data cover the north and northeastern part of the country (see ). The Indonesian data cover the island of Sumatra. All pixels are forested at baseline and thus the relevant outcome is whether the pixel is forested or deforested at the end of the study period. The supplementary information (SI, available at stacks.iop.org/ERL/8/025011/mmedia) file describes the data sources, definitions of 'forested' and 'deforested', and other data characteristics (SI tables 1 and 2).

Defining strictness of protection
We cannot observe de facto enforcement of regulations. Instead, we define the strictness of protection based on de jure criteria; i.e., what the law says the regulations are. As noted above, de facto enforcement may differ across protected areas. For example, according to one report (ICEM 2003), more than 500 000 people may have been living inside wildlife sanctuaries and national parks in Thailand, contrary to the regulations for these reserves. Following previous studies (e.g., Soares-Filho et al 2010, Nelson and Chomitz 2011, Joppa and Pfaff 2011, Pfaff et al 2012), we define the 'strictness' of protection in the Bolivia, Costa Rica and Sumatra analyses by using the six IUCN management categories (World Database on Protected Areas; WDPA). These categories use standardized definitions and imply decreasing strictness of regulations as the category number increases (IUCN 1994). We define protected areas as 'strictly protected' if they fall into categories I-IV, and as 'less strictly protected' if they fall into categories V-VI. The former category includes strict nature reserves, wildlife refuges and national parks, and the latter includes multiple-use protected areas and integrated management areas. These categories match closely the de jure protected area rules in the three countries (SERNAP 2007, Evans 1999, Jepson 2001). In Thailand, we define wildlife sanctuaries and national parks as 'strictly protected' (IUCN I-II) and forest reserves as 'less strictly protected'. Thailand's forest reserve area statutes do not match exactly with official IUCN categories, but they are, in practice, multiple-use reserves, i.e., IUCN category VI (ICEM 2003, Emphandhu and Chettamart 2003, Fujita 2003, FAO 2009). Thus the analyses in all four sites contrast protected areas that strongly restrict human use to protected areas that allow extractive uses.

Estimands
To precisely describe the relevant treatment effects, we introduce some notation. Let $D_s = 1$ if the forest parcel is strictly protected, $D_{ls} = 1$ if it is less strictly protected, and $D_s = D_{ls} = 0$ if it is unprotected. Let $Y = 1$ if the forest is deforested during the study period, and $Y = 0$ if it remains forest. Thus, for every parcel, there are three potential outcomes: (1) $Y_s$, the potential deforestation under strict protection, (2) $Y_{ls}$, the potential deforestation under less strict protection, and (3) $Y_0$, the potential deforestation under no protection. Given three treatment states and three potential outcomes, each country analysis has three observable outcomes and six counterfactual outcomes.
For each country, we estimate four average treatment effects on the treated (ATT) [7]:
(1) $ATT_{s,0} = E(Y_s - Y_0 \mid D_s = 1)$; the expected difference between deforestation on strictly protected forests and deforestation on strictly protected forests had they instead not been protected at all;
(2) $ATT_{ls,0} = E(Y_{ls} - Y_0 \mid D_{ls} = 1)$; the expected difference between deforestation on less strictly protected forests and deforestation on less strictly protected forests had they instead not been protected at all;
(3) $ATT_{s,ls} = E(Y_s - Y_{ls} \mid D_s = 1)$; the expected difference between deforestation on strictly protected forests and deforestation on strictly protected forests had they instead been less strictly protected; and
(4) $ATT_{ls,s} = E(Y_{ls} - Y_s \mid D_{ls} = 1)$; the expected difference between deforestation on less strictly protected forests and deforestation on less strictly protected forests had they instead been strictly protected.
All ATTs are measures of avoided deforestation.
[7] As noted by Andam et al (2008), the average treatment effect (ATE), which is the mean effect for a randomly chosen parcel from the population of forested parcels, is not policy-relevant given that protection is not assigned to randomly chosen parcels. The more policy-relevant treatment effect is the average effect of protection on the parcels actually assigned protection (ATT).
With the exception of Andam et al (2013), the studies cited in the introduction are not clear on what treatment effects they are estimating and how they test for differences across effects. Their text suggests they estimate $ATT_{s,0}$ and $ATT_{ls,0}$ and then informally contrast their magnitudes and p-values. The studies do not attempt to differentiate among three possible reasons for any observed differences: sampling error, the strictness of protection, or the underlying characteristics of the land protected (i.e., strictly and less strictly protected lands may differ in characteristics that affect potential deforestation in the presence and absence of a given level of protection; for example, holding protection strictness constant, protection placed on less-threatened forests would generate less avoided deforestation).
The other two treatment effects, $ATT_{s,ls}$ and $ATT_{ls,s}$, also address policy-relevant questions: how much different would deforestation on strictly protected forests have been had these forests instead been less strictly protected, and vice versa? Estimating the two additional treatment effects allows one to compare strict and less strict protected areas with similar covariate distributions. Thus by combining estimates of $ATT_{s,ls}$ and $ATT_{ls,s}$ with other data and statistical tests, one can also elucidate the relative contributions to $ATT_{s,0}$ and $ATT_{ls,0}$ from the strictness of protection and the location of protection.
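The four estimands can be made concrete with a toy example in which every potential outcome is known, something that is never true in real data. The sketch below is only an illustration of the definitions: the six parcels, their outcomes, and the helper name `att` are hypothetical and do not come from the study.

```python
import numpy as np

def att(y_a, y_b, treated):
    """Mean of (y_a - y_b) over the parcels flagged in `treated`."""
    treated = np.asarray(treated, dtype=bool)
    return float(np.mean(np.asarray(y_a)[treated] - np.asarray(y_b)[treated]))

# Hypothetical potential outcomes for six parcels (1 = deforested).
y_s = np.array([0, 0, 0, 0, 0, 1])    # under strict protection
y_ls = np.array([0, 1, 0, 0, 1, 1])   # under less strict protection
y_0 = np.array([1, 1, 0, 1, 1, 1])    # under no protection

# Hypothetical treatment assignment: parcels 0-2 strict, 3-4 less strict, 5 unprotected.
d_s = np.array([1, 1, 1, 0, 0, 0], dtype=bool)
d_ls = np.array([0, 0, 0, 1, 1, 0], dtype=bool)

att_s_0 = att(y_s, y_0, d_s)      # strict vs. no protection, on strict parcels
att_ls_0 = att(y_ls, y_0, d_ls)   # less strict vs. no protection, on less strict parcels
att_s_ls = att(y_s, y_ls, d_s)    # strict vs. less strict, on strict parcels
att_ls_s = att(y_ls, y_s, d_ls)   # less strict vs. strict, on less strict parcels

print(att_s_0, att_ls_0, att_s_ls, att_ls_s)
```

In this toy setting the first three effects are negative (avoided deforestation) and the fourth is positive, because it asks how much more deforestation the less strictly protected parcels experience relative to what strict protection would have produced on the same parcels.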

Empirical design
To estimate the counterfactual deforestation rates $E(Y_0 \mid D_s = 1)$ and $E(Y_0 \mid D_{ls} = 1)$, we use a quasi-experimental matching design like that used in Andam et al (2008) and others. The matching algorithms reweight the unprotected forest parcels to create a control group that, on average, is observably similar to the protected parcels in the distributions of important baseline covariates known to jointly affect the placement of protected areas and deforestation in the absence of protection (i.e., factors known to affect land use). The baseline characteristics selected for each country are described in the SI.
Under the assumption that there are no systematic unobservable differences among the matched protected and unprotected parcels in characteristics that affect deforestation in the absence of protection, the expected deforestation of the matched unprotected parcels represents the expected counterfactual deforestation of the protected parcels had they not been protected. Thus the difference between deforestation on the protected and matched unprotected parcels is an unbiased estimator of $ATT_{s,0}$ and $ATT_{ls,0}$ (for more conceptual detail, see Ferraro 2009). The same approach is used to estimate $ATT_{s,ls}$ and $ATT_{ls,s}$ (the matched control parcels are from the other category).
For each country analysis, we select the matching algorithm that achieves the best covariate balance after matching. The final algorithms chosen use (single) nearest-neighbor covariate matching with replacement, but each analysis uses a different weighting approach (see final row of table 1). Multiple measures of the differences in the covariate distributions between protected and unprotected parcels are used to measure covariate balance (SI tables S3-S18). We further control for bias that can remain after matching in finite samples by using a post-matching bias-correction procedure (Abadie and Imbens 2006). A final source of potential bias comes from spillovers, both positive and negative, whereby protection affects forest cover in matched control units (Andam et al 2008). We find no evidence of spillovers, on average, in our study sites (SI table S19).
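As a rough illustration of single nearest-neighbor covariate matching with replacement, the sketch below matches each treated parcel to its closest control parcel and averages the outcome differences. It is a minimal sketch, not the procedure used in the analyses: the inverse-standard-deviation scaling of covariates is an assumed weighting, no bias correction is applied, and the function name, covariates, and data are hypothetical.

```python
import numpy as np

def nn_match_att(x_treated, y_treated, x_control, y_control):
    """ATT from single nearest-neighbor covariate matching with replacement.

    Assumption: each covariate is scaled by its pooled standard deviation
    before computing Euclidean distances. No bias correction is applied.
    """
    x_t = np.asarray(x_treated, dtype=float)
    x_c = np.asarray(x_control, dtype=float)
    y_t = np.asarray(y_treated, dtype=float)
    y_c = np.asarray(y_control, dtype=float)

    scale = np.vstack([x_t, x_c]).std(axis=0)
    scale[scale == 0] = 1.0  # avoid dividing by zero for constant covariates

    # Squared scaled distances, shape (n_treated, n_control).
    diff = (x_t[:, None, :] - x_c[None, :, :]) / scale
    dist2 = (diff ** 2).sum(axis=2)

    # With replacement: the same control may be matched to several treated units.
    match_idx = dist2.argmin(axis=1)
    return float(np.mean(y_t - y_c[match_idx])), match_idx

# Hypothetical covariates: (distance to road in km, slope in degrees).
x_t = [[10.0, 5.0], [20.0, 8.0]]
y_t = [0, 0]                      # 1 = deforested
x_c = [[11.0, 5.0], [2.0, 1.0], [19.0, 9.0], [3.0, 2.0]]
y_c = [1, 1, 0, 1]

att_hat, matches = nn_match_att(x_t, y_t, x_c, y_c)
print(att_hat, matches)  # -0.5 [0 2]
```

To estimate a contrast between the two protection types rather than against unprotected land, the same function would be called with parcels from the other protection category as the control pool.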
To characterize the precision of our estimates, we calculate heteroskedasticity robust standard errors (Imbens and Wooldridge 2009, Abadie and Imbens 2006, Abadie et al 2004). These standard errors allow for heteroskedasticity both within and across treatment arms by calculating conditional variances via a secondary matching algorithm that matches units within treatment arms (treated units to treated units, etc). To test the null hypothesis that $ATT_{s,0} = ATT_{ls,0}$, we use Welch's t-test.
Table 2 presents the estimates of the four treatment effects for each of the four countries (see SI table S20 for the breakdown by treated and control units). If the estimated ATT is less than zero, the treatment reduced deforestation compared to the counterfactual condition; i.e., generated avoided deforestation. If $ATT_{ls,0} - ATT_{s,0}$ is greater than zero (first row, third column where t-statistic is presented in brackets), more strictly protected areas reduced deforestation by a greater amount than less strictly protected areas did (i.e., generated more avoided deforestation).
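The comparison of two estimated effects with unequal variances can be sketched as follows. This is a generic Welch-type calculation under stated assumptions, not a reproduction of the reported statistics: the estimates and standard errors are hypothetical, and treating the number of treated units in each arm as the sample size for the degrees of freedom is an assumption of this sketch.

```python
import math

def welch_t(est_a, se_a, est_b, se_b):
    """t statistic for H0: est_a == est_b, allowing unequal variances."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)

def welch_df(se_a, n_a, se_b, n_b):
    """Welch-Satterthwaite approximate degrees of freedom.

    Assumes each squared standard error equals (sample variance / n).
    """
    num = (se_a ** 2 + se_b ** 2) ** 2
    den = se_a ** 4 / (n_a - 1) + se_b ** 4 / (n_b - 1)
    return num / den

# Hypothetical estimates (proportions, not percentage points) and standard errors.
att_ls_0, se_ls = -0.02, 0.03
att_s_0, se_s = -0.07, 0.04

t_stat = welch_t(att_ls_0, se_ls, att_s_0, se_s)
df = welch_df(se_ls, 100, se_s, 100)
print(t_stat, df)
```

A positive statistic here corresponds to the sign convention in the text: $ATT_{ls,0} - ATT_{s,0} > 0$ means strict protection generated more avoided deforestation than less strict protection.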

Results
To illustrate the importance of estimating all four estimands and the statistical significance of the differences between $ATT_{ls,0}$ and $ATT_{s,0}$, consider the Bolivia results. Strict protection reduced deforestation on strictly protected forests by, on average, an additional 2.3 percentage points (p < 0.01) compared to their counterfactual deforestation with no protection ($ATT_{s,0}$): in other words, 2.3% of the strictly protected forests would have been deforested had they not been strictly protected. Less strict protection reduced deforestation on the less strictly protected forests by an additional estimated 1.3 percentage points on average (p > 0.10) ($ATT_{ls,0}$). The difference between these two estimates is small (one percentage point) and not statistically different from zero (p > 0.10). However, were strictly protected areas instead assigned less strict protection ($ATT_{s,ls}$), the deforestation observed would have been, on average, 2 percentage points higher (p < 0.05). In contrast, assigning stricter protection to less strictly protected areas ($ATT_{ls,s}$) would have had little impact on average (two-tenths of a percentage point; p > 0.10).
Comparing avoided deforestation on strictly protected areas ($ATT_{s,0}$) to the avoided deforestation on less strictly protected areas ($ATT_{ls,0}$), our estimates imply that more strictly protected areas reduced deforestation by a greater amount in three countries (see first row of each country panel). Costa Rica, Sumatra and Thailand's strictly protected areas experienced an estimated 10-13 percentage points less deforestation than less strictly protected areas. In Bolivia, the difference between the estimated treatment effects is small and statistically insignificant.
Table 2. Heterogeneous treatment effects. Effect on deforestation by level of protection. (Note: $ATT_{x,y}$ = average treatment effect on the treated for x category of protection compared to the counterfactual y, where s = strictly protected, ls = less strictly protected, and 0 = unprotected. Standard errors are Abadie-Imbens heteroskedasticity robust. The t statistic is based on Welch's t-test; H0: $ATT_{s,0} = ATT_{ls,0}$.)
3.2. How different, on average, would avoided deforestation have been on strictly protected areas had they instead been less strictly protected ($ATT_{s,ls}$)?

Strictness of protection
The estimates imply that assigning forests to stricter protection, instead of less strict protection, reduced deforestation in three of the four sites: Bolivia, Sumatra and Thailand (see second row of each country panel). In Costa Rica, the estimated effect is similar in sign and magnitude to the estimate in Bolivia, but we cannot reject the null hypothesis of zero ATT.
3.3. How different, on average, would avoided deforestation have been on less strictly protected areas had they instead been more strictly protected?
The estimates imply that assigning a forested area to less strict protection, instead of more strict protection, increased deforestation in three of the four countries: Costa Rica, Sumatra and Thailand (see third row of each country panel). In Bolivia, the estimated effect is near zero and we cannot reject the null hypothesis of zero ATT.

Selection or restriction?
Although we only observe the de jure strictness of the regulations, we can combine the results from table 2 with information on the characteristics of the protected and unprotected forested lands (tables S3-S17) to clarify whether the results in table 2 are driven by the strictness of protection or by selection (i.e., where protected areas are located). Recall the three possible reasons for the estimated difference between $ATT_{s,0}$ and $ATT_{ls,0}$: sampling error, differences in the levels of strictness (the treatment), or differences in the underlying productivity and accessibility of the land assigned to strict and less strict protection. Uncertainty about sampling error is captured in the t-test statistic in table 2. The underlying productivity and accessibility of the land is held constant for the $ATT_{s,ls}$ and $ATT_{ls,s}$ estimands, while only strictness is changed.
Combining this information yields insights into the relative contributions of legal restrictions and selection to protected area effects on deforestation. For example, holding site selection in Sumatra constant, changing strictly protected areas to less strictly protected status would reduce the protected area impacts substantially (forgo 21 percentage points of avoided deforestation), whereas moving less strictly protected areas to stricter protection status would increase impacts substantially (19 percentage points increase in avoided deforestation). Furthermore, strictly protected areas are, on average, located on no better lands in terms of productivity and accessibility (i.e., no more threatened). These observations imply that the strictness of protection is very important in explaining the differences between $ATT_{s,0}$ and $ATT_{ls,0}$. Following the same logic, we observe that, as in Sumatra, the strictness of protection is the main driver of differences between $ATT_{s,0}$ and $ATT_{ls,0}$ in Bolivia and Thailand. In contrast, a combination of selection and, to a lesser extent, strictness drives differences in Costa Rica.
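One way to formalize the selection-versus-restriction reasoning is the identity below. It is not stated in the text; it follows from the definitions of the estimands and linearity of expectation, and is offered only as a sketch of the logic.

```latex
\begin{align*}
ATT_{s,0} &= E(Y_s - Y_0 \mid D_s = 1) \\
          &= E(Y_s - Y_{ls} \mid D_s = 1) + E(Y_{ls} - Y_0 \mid D_s = 1) \\
          &= ATT_{s,ls} + E(Y_{ls} - Y_0 \mid D_s = 1), \\[4pt]
ATT_{s,0} - ATT_{ls,0} &= \underbrace{ATT_{s,ls}}_{\text{strictness}}
  + \underbrace{\bigl[ E(Y_{ls} - Y_0 \mid D_s = 1) - E(Y_{ls} - Y_0 \mid D_{ls} = 1) \bigr]}_{\text{selection}}.
\end{align*}
```

The bracketed term applies the same treatment (less strict protection versus none) to two different sets of land, so it is nonzero only when the lands differ in characteristics that affect deforestation. Its first component, $E(Y_{ls} - Y_0 \mid D_s = 1)$, is not one of the four estimands estimated here. As a rough check with the Bolivia point estimates ($ATT_{s,0} = -2.3$ and $ATT_{s,ls} = -2.0$ percentage points), the identity would imply a value of about $-0.3$ for that component, although estimates obtained from separately matched samples need not add up exactly.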

Discussion
Our results imply that the effects of regulatory strictness are heterogeneous within and across countries: although greater strictness of a protected area's IUCN management category is often associated with greater levels of avoided deforestation, the differences in the effects across management categories are not always large and can arise from differences in the way in which the strictness of protection is spatially assigned (i.e., site selection), as well as the strictness per se. Stricter is not necessarily better and more evidence is needed to guide policymakers in the choice of protected area management categories.
To help build the evidence base, our study can be extended in seven ways. First, although our results are sufficient to show differential effects conditional on the management category, they are not sufficient to form a global picture of such heterogeneity. Our sample was one of convenience rather than representation. Our study should be replicated in other nations and extended to include other environmental outcomes (e.g., degradation, regrowth; Andam et al 2013), biomes and time periods. Second, to understand the full effects of protected areas, the effects of management categories on social outcomes should also be investigated (Sims 2010). One can then explore the way in which the environmental and social impacts of different management categories vary conditional on observable characteristics of the landscape and the people.
Third, our study design should be extended to contrasts of community-managed, government-managed and co-managed ecosystems (Somanathan et al 2009). Fourth, there is a glaring lack of cost data in the literature on environmental impact evaluations. Even if more strictly protected areas generate greater avoided deforestation, they may be less cost effective. Fifth, the way in which ex post impact studies like ours can inform ex ante conservation planning exercises to site new protected areas needs further exploration (Pressey et al 2007). Sixth, our study only looks at de jure regulations rather than de facto enforcement, yet theory points to the strong effects that enforcement can have on outcomes (e.g., Albers 2010, Robinson and Lokina 2011). Future studies could improve on our design through collaborations among scholars and practitioners that help identify de facto enforcement levels. Without these finer resolution data, decomposing any observed heterogeneous effects into their constituent mechanisms will be difficult. Yet probing the heterogeneity of conservation actions and the mechanisms through which they operate is a crucial step in building the evidence base in environmental policy, an effort that Miteva et al (2012) call Conservation Evaluation 2.0.