National satellite-based humid tropical forest change assessment in Peru in support of REDD+ implementation

Transparent, consistent, and accurate national forest monitoring is required for successful implementation of reducing emissions from deforestation and forest degradation (REDD+) programs. Collecting baseline information on forest extent and rates of forest loss is a first step for national forest monitoring in support of REDD+. Peru, with the second largest extent of Amazon basin rainforest, has made significant progress in advancing its forest monitoring capabilities. We present a national-scale humid tropical forest cover loss map derived by the Ministry of Environment REDD+ team in Peru. The map quantifies forest loss from 2000 to 2011 within the Peruvian portion of the Amazon basin using a rapid, semi-automated approach. The available archive of Landsat imagery (11 654 scenes) was processed and employed for change detection to obtain annual gross forest cover loss maps. A stratified sampling design and a combination of Landsat (30 m) and RapidEye (5 m) imagery as reference data were used to estimate the primary forest cover area, total gross forest cover loss area, proportion of primary forest clearing, and to validate the Landsat-based map. Sample-based estimates showed that 92.63% (SE = 2.16%) of the humid tropical forest biome area within the country was covered by primary forest in the year 2000. Total gross forest cover loss from 2000 to 2011 equaled 2.44% (SE = 0.16%) of the humid tropical forest biome area. Forest loss comprised 1.32% (SE = 0.37%) of primary forest area and 9.08% (SE = 4.04%) of secondary forest area. Validation confirmed a high accuracy of the Landsat-based forest cover loss map, with a producer’s accuracy of 75.4% and user’s accuracy of 92.2%. The majority of forest loss was due to clearing (92%) with the rest attributed to natural processes (flooding, fires, and windstorms). The implemented Landsat data processing and classification system may be used for operational annual forest cover loss updates at the national level for REDD+ applications.


Introduction
Deforestation and degradation of tropical rainforest are important global issues due to their role in carbon emissions, Environmental Research Letters Environ. Res. Lett. 9 (2014) 124012 (13pp) doi:10.1088/1748-9326/9/12/124012 biodiversity loss, and reduction of other ecosystem services (Millennium Ecosystem Assessment 2003, Foley et al 2007. Of global gross forest cover loss from 2000 to 2012, 32% occurred within tropical rainforests . Almost half of rainforest loss was found in South America, primarily in the Amazon basin. Peru, where humid tropical forests cover about 60% of the total land area, possesses the second largest portion of Amazon rainforest after Brazil. While forest cover loss in Peru is substantially lower than in Brazil , humid tropical forest loss is considered the primary source of carbon emissions at the national scale (MINAM 2011). The large extent of remaining intact rainforest within the country (Potapov et al 2008) provides Peru an incentive for combating deforestation and conserving rainforest ecosystem services. The Peruvian government has developed an agenda to decrease deforestation rates toward net zero deforestation by 2021 (Piu and Menton 2013). Many national and international initiatives aimed at monitoring and reducing deforestation are underway in Peru, several of them specifically in support of the reducing emissions from deforestation and forest degradation (REDD+) national program. The international REDD+ negotiations, sponsored by the United Nations Framework Convention on Climate Change (UNFCCC) since 2005, have provided methodological guidance on REDD+ through conservation of forest carbon stocks, sustainable management of forests, and enhancement of forest carbon stocks in developing countries. Preparation activities (i.e. the readiness phase) for the implementation of REDD+ projects at the national level have been developed in Peru since 2009. The Ministry of Environment (Ministerio del Ambiente, MINAM) served as the primary governmental agency responsible for the national REDD+ implementation. One of the main tasks of this process, as requested by the UNFCCC, is to develop a robust and transparent national forest monitoring system (NFMS) for the monitoring and reporting of the REDD+ activities.
According to the Intergovernmental Panel on Climate Change guidelines (IPCC 2006), required data for estimating emissions from forest cover changes include forest extent, emissions factors (i.e. coefficients quantifying emissions per unit activity), and activity data (i.e. data on the forest cover changes). The only viable data source for timely monitoring and consistent quantification of forest cover change at national scales and annual time intervals is satellite imagery (Hansen and Loveland 2012). A number of satellite-based change detection methods for forest monitoring were prototyped over Peru, as local or national projects, or as a part of continental or global analyses. Landsat data and on-screen digitizing were used to map deforestation within the Peruvian Amazon for 1985-1990and 1990-2000time intervals (MINAM 2009. Conservation International performed the nation-wide analysis of forest extent and loss for the 1990-2000 interval using supervised image classification (CI 2008). They reported forest loss of 56 300 ha yr −1 for 1990-2000 decade. In 2014, the Dirección Generale de Ordenamiento Territorial (DGOT) within MINAM produced an estimate of deforestation from 2000 to 2009, using Landsat data and a rule-based linear mixture modeling algorithm (DGOT-MINAM 2014). Forest loss was estimated by DGOT as 91 100 ha yr −1 for 2000-2005 interval and 163 300 ha yr −1 for 2005-2009 interval. Gross forest cover loss within the country was also mapped within global mapping initiatives using moderate resolution imaging spectroradiometer (MODIS)  and Landsat data .
According to the IPCC guidelines on greenhouse gas emission inventory (IPCC 2006), the data on forest cover change provided by a NFMS should be clearly documented, complete at the national scale, consistent between time intervals, comparable between countries, and include uncertainty estimates. In determining the suitability of proposed methods for change detection for an operational NFMS, several factors should be taken into account. These factors include: (i) data continuity, cost and access; (ii) time and effort required for data processing and map characterization by the national team; (iii) repeatability across space and through time, meaning consistency of the national monitoring program as well as portability to other countries and regions. The goal of the current project was to develop and prototype an efficient methodology for initial forest cover and change assessment in the context of the development of a NFMS and implementation of REDD+ activities in Peru. The choice of data and methods was guided by several requirements: (i) wall-to-wall national coverage of free-of-charge satellite data; (ii) cost-effective and fast data processing algorithm; (iii) statistically validated results. Landsat data were selected for the national assessment as the only medium spatial resolution data available free-of-charge. The Landsat data processing and analysis algorithm was guided by our experience with global forest loss mapping  and applied at the national scale using an improved version of our data processing system. Areas of primary and secondary forest extent and total gross forest cover loss were estimated using probability-based sampling. Landsat and high spatial resolution RapidEye imagery were employed to characterize forest cover and change within sample blocks. Specifically, gross forest cover loss was quantified from high spatial resolution data for each sample block; sample-based estimates were subsequently incorporated with mapped forest loss information to reduce the standard error of the estimate. Area of gross forest cover loss was disaggregated in time (annually), in space (by regions), and by disturbance factors using wall-to-wall annual Landsat forest loss maps. The work was carried out as a collaboration between the University of Maryland (UMD) and MINAM, with input data processing performed by UMD, image interpretation and mapping performed by the MINAM REDD+ Project, and validation provided by an independent analysis. Our results depicted forest cover extent and change from 2000 to 2011 and are presented as a baseline of activity data for national REDD+ activities measuring and reporting.

Data
Two satellite datasets were used for the analysis: (i) wall-towall medium spatial resolution data for forest cover change mapping; and (ii) samples of high spatial resolution data for total forest and change area estimation and wall-to-wall map validation. The wall-to-wall dataset consisted of Landsat multispectral imagery and digital elevation data; the samplebased dataset consisted of Landsat and RapidEye imagery.

Landsat data
For this analysis, we used the entire archive of Landsat 7 ETM+ imagery with less than 75% cloud cover available at the United States Geological Survey National Center for Earth Resources Observation and Science (USGS EROS). The total number of Landsat scenes was 11 654. All images were acquired in the standard terrain-corrected L1T format, which provides systematic radiometric, geodetic and topographic accuracy. We used four reflectance bands: red (band 3), near infrared (NIR, band 4), shortwave infrared (SWIR, bands 5 and 7), and the emissive thermal infrared band 6 to calculate multi-temporal metrics (section 3.1).

Topography data
Elevation above sea level and slope were added as additional data layers for the classification. We used the void-filled seamless Shuttle Radar Topography Mission (SRTM) digital elevation model available at http://srtm.csi.cgiar.org, and calculated slope. Both inputs were re-projected and resampled to match the 30 m Landsat pixel size.

RapidEye data
Multispectral remote sensing imagery from the RapidEye constellation was acquired for 30 validation sample blocks of 12 × 12 km each. The majority of images were from the year 2011 with a small number of images from 2010 and 2012. The constellation has five identical earth observation satellites recording radiance in five spectral bands corresponding to the blue, green, red, red-edge and NIR part of the electromagnetic spectrum. The imagery has a spatial resolution of 5 m.

Methods
Our wall-to-wall gross forest cover loss mapping and sampling analysis was carried out within the humid tropical forest zone of Peru. The zone was delineated by MINAM (2012) and extends from the eastern slopes of the Andes into the Amazon basin. Landsat data from year 2000 to 2011 were automatically processed and composited by UMD to create a nation-wide set of multi-temporal spectral metrics. These spectral metrics were used as independent variables for wallto-wall gross forest cover loss detection for the 2000-2011 time interval; MINAM performed this task using a supervised decision tree classification algorithm. The total gross forest cover loss was attributed by date (year of change) and by disturbance agent by UMD. The map was also employed to create a stratification of high and low forest cover loss; sample blocks were randomly selected from each stratum. For each sample, forest cover 2000 and forest cover loss 2000-2011 were characterized using Landsat and RapidEye data. Sample-based attribution was performed by both organizations with input from independent experts. Sample-based estimates provided total gross forest cover loss area, while the wall-to-wall Landsat-scale map was used to disaggregate this area by change date, change factor, and by region. Estimates of primary and secondary forest cover area were produced only from sample data.
Our analysis was based on the following set of definitions. Forest was defined as areas with trees above 5 m and tree canopy cover above 30% within Landsat 30 m pixels, and included natural forests, secondary regrowth, and tree plantations. Forest cover loss was defined as any disturbance event leading to complete or nearly complete removal of tree cover within a Landsat pixel. Primary forests were defined as intact or mature secondary forests absent of visible signs of recent alteration by human activity. All other tree cover, whether regrowing forests or tree plantations was considered as secondary forests.

Landsat data processing
The Landsat data were transformed from L1T imagery to multi-temporal metrics using the same approach as was implemented for the global forest cover produce of Hansen et al (2013). The approach is easily applied to smaller areas, for example at national-scale. A precursor effort was implemented for the Democratic Republic of Congo (Potapov et al 2012), including a sample-based accuracy assessment (Tyukavina et al 2013).
All Landsat 7 ETM+ L1T images were reprojected from the local UTM projections to the Sinusoidal projection with central meridian 60°W and resampled to the common output raster grid. Using the common raster grid facilitated per-pixel data compositing, and the choice of equal-area projection allowed for easy area estimation. We converted the digital numbers of the reflective bands to top-of-atmosphere reflectance and the thermal band to brightness temperature using the standard protocol (Chander et al 2009). A set of quality assessment models was then applied to each pixel of an image, resulting in a quality data layer that identified pixels affected by clouds, haze, and cloud shadows (Potapov et al 2012). The quality assessment was performed using a set of bagged decision trees (Breiman et al 1984) derived from a large random set of training data collected throughout the tropical biome.
A radiometric normalization was applied to all images to reduce reflectance variations between image dates due to atmospheric conditions and surface anisotropy. We employed a low spatial resolution (250 m) cloud-free surface reflectance product from MODIS as a normalization target. To normalize each Landsat image, we calculated a mean bias between MODIS surface reflectance and Landsat top-of-atmosphere reflectance for each spectral band over the image area consisting of good quality land observations, and applied the bias value to adjust Landsat reflectance. Across-track reflectance anisotropy was corrected by modeling Landsat TOA-MODIS surface reflectance bias as a function of the Landsat scan angle (Potapov et al 2012). In addition to the reflectance bands, two indices were calculated for each Landsat image, the normalized difference vegetation index (NDVI, Tucker 1979) and the land surface water index (LSWI, Xiao et al 2004).
The analysis of the Landsat image time series uses multitemporal metrics (DeFries et al 1995, Hansen et al 2008, Potapov et al 2012 and allows for handling time-series data with variable observation density. Multi-temporal metrics capture spectral change within a standardized feature space not tied to any specific time of year. In so doing, they enable the extrapolation of rule sets, such as decision tree models, across large areas. The metrics feature space thus allows for the detection of forest change during 11 years of observations over the entire humid tropical forests within the country. Metrics were extracted at a per-pixel level from all cloud and cloud shadow-free observations from 1999 to 2011. Our metrics set included start/end image composites, rank-based metrics, and trend analysis metrics. To produce rank-based metrics, band reflectance values from 2000 to 2011 were ranked based on (i) band reflectance value or (ii) corresponding ranks of selected indices (NDVI, LSWI) and brightness temperature. Metrics were created from observations representing selected percentile values (minimum, 10%, 25%, 50%, 75%, 90%, maximum) and averages were calculated between these values for each band and rank method. A separate group of metrics was derived for per-band reflectance change during the analyzed time interval. These metrics included: (i) slope of linear regression of band reflectance versus image date, and standard deviation of band reflectance from 2000 to 2011; (ii) maximum positive and negative change in reflectance between consequent observations; and (iii) selected statistics (minimum, maximum and range) for temporal segments defined by per-band absolute maximum and minimum reflectance values.

Change detection
Multi-temporal metrics derived from cloud-free Landsat observation covered 99.8% of the humid tropical biome area. The remaining area (mountain ridge crests with permanent orographic clouds) was not processed by the change detection algorithm due to data limitations.
Gross forest cover loss from 2000 to 2011 was mapped using a supervised bagged classification tree model (Breiman et al 1984). A group of image analysts at MINAM performed visual interpretation of the training sites, mapping areas of stable forest cover and gross forest cover loss from 2000 to 2011. The composites of the first and last cloud free observations, along with maximum band reflectance composites, were used for data visualization. Analysts used a number of additional datasets, including freely available high-resolution images from GoogleEarth™ as reference materials to aid interpretation. All events resulting in forest cover loss (permanent or temporal) at the 30 m pixel scale, including agriculture clearings (even followed by forest regrowth within the same time interval), logging, fire, flooding, and storm damage were mapped together as the forest cover loss class.
The classification tree model related manually interpreted training with the entire set of spectral metrics for the 2000-2011 time interval. Classification was done iteratively, by examining the output map, correcting training, and rerunning the classifier again until a sufficiently accurate product had been created. The final product was extensively visually checked and manually corrected for remaining noise and local errors. As a result of manual correction, 51 thousand ha (3.1% of total loss area) of loss was manually added to the map product. These areas were not detected as change due to limited sensitivity of the classification model. Noise and other sources of committed error of forest cover loss were manually removed, totaling 31 thousand ha (1.9% of total loss area). Corrections were made predominantly over mountain areas where frequent cloud cover, topographic shadows, and low tree canopy cover density caused local errors.

Gross forest loss attribution by year and disturbance agent
To disaggregate change areas by forest clearing date we employed an analysis of annual NDVI profiles. For each year the minimal annual NDVI value was collected per pixel. For all change pixels, we analyzed inter-annual minimal NDVI difference, and the year representing the highest drop in NDVI was selected as the change date.
The goal of forest loss cause attribution was to separate natural disturbance and fires from anthropogenic forest clearings. The attribution process included two steps. First, all forest loss due to flooding and river meandering was mapped automatically using annual water masks collected from all cloud-free image observations. Second, visual analysis of change areas was performed and used to identify and label losses due to fires, landslides, and windstorms. The remaining forest cover loss was attributed to anthropogenic forest clearing within primary, secondary forests and forest plantations.

Sample-based area estimation and map validation
The sample-based analysis goals were to estimate area of gross forest cover loss, primary forest extent, and map accuracy. Collecting reference data to validate 11 years of land-cover change is challenging. There are no adequate field data that could be used for such a purpose. Commercial high spatial resolution data are not uniformly available for Peru, particularly for the earlier years of the study, precluding a probability-based sample design using exclusively high spatial resolution imagery. Landsat observations (single-date images) interpreted through time are a viable validation reference data set and have been used to assess forest loss area and map uncertainty . However, the preference is to include high spatial resolution data for determining omission errors related to the spatial scale of forest disturbance (Tyukavina et al 2013). For this study, a set of RapidEye images from the year 2011 was available at the national-scale, and a hybrid validation approach was employed: high spatial resolution (RapidEye) data for year 2011 was compared with Landsat data for the year 2000. The spatial resolution of the RapidEye data (5 m) enabled mapping of disturbance at sub-Landsat-pixel scale. Landsat data, including the 15 m panchromatic band, were used as the reference year 2000 imagery. The choice of the validation sampling design (figure 2) was strongly influenced by RapidEye data costs, as we were only able to purchase 30 images for circa year 2011 from a national coverage. A two-stage cluster sampling design was implemented with the first stage being a stratified random sample of clusters and the second stage being a simple random sample of pixels within clusters. The size of the clusters was guided by the RapidEye image tile size. Each cluster was a 12 km by 12 km block, and consequently the sample pixels were spatially constrained to the 30 selected sample blocks. The sampling frame over the country consisted of 5532 12 × 12 km blocks within the humid tropical biome (sample blocks with biome area coverage of less than 40% were excluded from sampling, reducing the total sampled area of the biome by less than 1%). To select 30 sample blocks (clusters), we employed a stratified random sampling design. Two strata were selected, high change (the blocks with the highest gross forest loss area comprising 50% of the total change within the sampling frame) and low change (the blocks of low change totaling the remaining 50% of total forest cover loss). The threshold separating the high-and lowchange strata was 9.8% of gross forest loss per block. The high-change stratum included 337 blocks and the remaining 5195 blocks comprised the low-change stratum. Guided by Neyman optimal sample size allocation rules (Cochran 1977), we selected 70% of the 30 total sample blocks (21 blocks) from the low-change stratum, and 30% (nine blocks) from the high-change stratum. Within each sampled block, 100 points (representing 30 × 30 m Landsat pixels) were selected by simple random sampling. The selected pixels within a block constitute the second stage sample.
Selected sample points (Landsat pixels) were visually analyzed using Landsat 2000 and RapidEye 2011 data ( figure 3). For each pixel, analysts recorded the fraction of gross forest cover loss (both from natural and anthropogenic factors). In addition, for 2000 the analyst recorded the forest type: primary or secondary forest. Primary forests were distinguished by the absence of visible disturbance and a spectral signature and texture similar to nearby surrounding intact forest tracts. Forest plantations, regeneration on old clearings, and fallows were considered secondary forests. Analysts also marked pixels located on the edge of forest cover loss patches, which we expected to have lower classification accuracy compared to 'core' forest loss and no-change areas. Reference results were provided for all except 20 points that had no cloud-free RapidEye data and were excluded from the analysis.
Sample-based results (per-pixel comparison of reference change fraction and map-based change detection) were used to produce estimates of forest loss area and map accuracy metrics, along with the standard errors associated with these estimates (a 95% confidence interval can be obtained by adding and subtracting two times the standard error of the estimate). Using primary/secondary forest reference data we were able to estimate the proportion of primary forests in 2000 as well as the proportion of primary forest loss. The estimation formulas are presented in the appendix.

Gross forest cover loss
Estimation of gross forest cover loss area within the Peruvian humid tropical biome (total area 78.6 million ha) for the 2000-2011 time interval was the main objective of the proposed methodology. The sample-based area, which is collected from a probability sample and attributed using high spatial resolution data, is considered the primary estimate for forest extent and change. Total gross forest cover loss 2001-2011 equaled 2.439% (SE = 0.162%) of the humid tropical forest biome area.
Sample-based estimates also allowed us to disaggregate total forest loss into primary forest clearing and secondary forest rotation. In 2000, Peruvian primary humid tropical forest cover totaled 92.63% (SE = 2.16%) of the humid tropical forest biome area, and secondary forest the remaining 7.37%. Forest loss comprised 1.32% (SE = 0.37%) of primary forest area and 9.08% (SE = 4.04%) of secondary forest area. Of total gross forest cover loss, 64.88% (SE = 6.75%) was found in the primary forests.
Our map-based gross forest cover loss area (2.07% of the humid tropical forest biome area) was found to be below the sample-based estimate (2.44%). Accuracy measures show overall map accuracy of 99.4% (SE 0.2%). However, the Landsat-based map underestimates forest cover loss, with producer's accuracy of gross forest loss class of 75.4% (SE 2.5%) and user's accuracy of 92.2% (SE 1.9%) (table 1). As we expected, edge pixels have the highest uncertainty of classification results. The overall accuracy of edge pixels (n = 138) was 73.0%. Excluding these pixels (remaining sample n = 2844), overall accuracy increased to 99.8%, producer's accuracy to 86.3%, and user's accuracy to 95.4%.

Annual and sub-national gross forest cover loss
While the sample-based method allowed us to estimate total gross forest loss area with high precision and known uncertainty range, the wall-to-wall Landsat change detection map allows for annual and sub-national disaggregation of the total loss (figure 4, table 2).
The annual gross forest cover loss indicated an increasing trend in loss over time with At the sub-national level, 80% of total gross forest loss was concentrated within five regions: San Martin, Loreto, Ucayali, Huánuco, and Madre de Dios. Visual analysis of the forest loss map and satellite data revealed different patterns of forest loss within these regions. We suggest that the primary cause of forest loss in San Martin and Huánuco regions is clearing for agriculture and pastures. These regions featured the highest rate of forest cover loss (8.99% and 9.76% of total tropical forest biome area, respectively). In Loreto and Ucayali, large-scale industrial clearing for oil palm plantations is also present, together with small-and medium-size agriculture clearing. Gold mining (Asner et al 2013) and large-scale agriculture clearing along the interoceanic highway are the leading cause of forest loss in the Madre de Dios region.
Annual loss rates varied between Peruvian regions (figure 5). Pronounced intensification of forest cover loss in Loreto and Ucayali region is most probably an outcome of the recent expansion of industrial oil palm plantations (Gutiérrez-Vélez et al 2011). In Loreto region, a single contiguous patch of new oil palm plantations established between 2007 and 2011 accounted for more than 11 thousand ha of forest clearing. San Martin region, where forest loss is mostly due to concentrated agriculture expansion, featured high fluctuations in annual forest loss area.

Disturbance types
Forest disturbance type mapping is important in differentiating the factors of forest loss (natural or anthropogenic disturbance types) as carbon monitoring programs seek to reduce human-induced conversion of natural forests. The inclusion  of natural factors of forest loss can be of value in assessing underlying factors such as climate change (Easterling and Apps 2005).
The majority of gross forest cover loss (92.2%) was attributed to clearing for agriculture and tree plantations. The rest was due to natural disturbance, mostly represented by flooding and river meandering (6.0%), and fires (1.5%). Windstorm damage represented 0.3% of total loss, and landslides 0.02%. Forest loss due to river course change was mostly present along the Amazon River and its major tributaries (figure 6.). Fires damaged forests along the Amazon and Ucayali rivers in Loreto and Ucayali regions. Windstorms were found mostly in remote forest areas of the Amazon basin in Loreto region.
While the total area of natural disturbance is small, the annual rate of such change fluctuated significantly (figure 7). The extreme drought year of 2005 was associated with the highest burned area extent (almost 30% of total burned area and another 25% attributed to the year 2006). A one-year delay in fire date attribution is typical for remote sensing products , so we may suppose that in total, the year 2005 was responsible for more than one-half of total burned area over the study period. Extreme windstorms accompanied the drought of 2005, and were also present in the last year of the study. Conversely, forest loss due to flooding and river meandering was the lowest in 2005, likely due to decreased stream flow resulting from the lack of precipitation. In general, the area of natural disturbance increased during the observation period.

Discussion
An effective algorithm for national-scale forest change detection is a key component of any NFMS and a requirement for a country to participate in REDD+ activities. Such algorithms should be capable of precise and timely estimation of activity data (primarily forest cover loss) and associated uncertainty for national carbon emission monitoring. Several different algorithms and approaches are in development for tropical countries (Asner et al 2009, Huang et al 2010, INPE 2010, Lehmann et al 2013, and the comparison of these methods should take into account several major factors, including cost, required effort, data processing and analysis speed, and consistency of results at national and international levels. Freely available, long-term data sources are preferred inputs to operational monitoring programs. The example presented here is based on Landsat data, available free-ofcharge for the entire globe. Higher spatial resolution Rapi-dEye data were purchased in a sampling mode, thereby minimizing data costs. When using large volumes of timeseries data, it is necessary to have a standard and automated processing scheme for deriving cloud-free time-series metrics binned to a standard gridded map projection. The system presented here employed automated radiometric normalization, quality assessment, and time-series feature derivation within the defined study period. The standard for such processing of land products is MODIS (Justice et al 2002), and new Landsat-based systems are now being developed (Roy et al 2010). Our approach follows the MODIS model, transforming single daily observations into a consistent set of national (continental, global) time-sequential metrics .
To date, most national-scale analyses have employed single image footprints as the basic unit of analysis (Asner et al 2009, Huang et al 2010, INPE 2010, Lehmann et al 2013. Per-scene data analyses have several major drawbacks (Hansen and Loveland 2012). First, they significantly increase latency of product derivation and increase amount of effort in generating products. Second, separate per scene analyses compromise the consistency between neighboring scenes. Third, employing raw uncalibrated Landsat data often requires analysts to apply advanced data processing techniques, including radiometric and geometric corrections. Such requirements increase overall effort and associated costs of the methodology, including greater initial capacity per analyst. Alternatively, we advocate the use of pre-processed data in the form of national time-series metrics and composites. Such data facilitate the timely and internally consistent derivation of national-scale forest change products. Two important aspects of our approach deserve to be highlighted. First, the only initial requirement for national-scale mapping using the presented method is expertise in visual interpretation of forest extent and change. Visual analysis of preprocessed image composites does not require special knowledge of data management techniques and could be easily performed by a forester or resource management specialist. Second, national-scale products can be derived and iterated with low latency, because the classification model derived from training data is implemented for the entire area at once. The ability to rapidly iterate classifications is important and quickly leads to a final map product. Iteration also illustrates the concept of versioning and as new input data are made available, new versions of maps can be made if improved accuracy is documented to justify an update. Standard pre-processed data inputs also foster product consistency between countries.
The computer and software progress in recent years allowed for national image processing to be performed within the countries by regional specialists. While global image analysis still requires cloud computing capabilities , national analysis based on preprocessed gridded data may be performed using personal workstations and standard image analysis software. We have prototyped this approach in other countries, initially in the Democratic Republic of Congo (Potapov et al 2012). This paper presents a first product with an operational in-country agency, the Peruvian MINAM.
A precise forest definition is an important part of the established forest change reporting system. The national forest program in Peru adopted the FAO criteria of forest, defined as minimum tree cover of 10% and height of mature woody vegetation of 2 m and above (MINAM 2014). Our minimum tree canopy cover threshold of 30% is higher than the minimum tree cover criterion proposed by the Peruvian NFMS program. Our use of a 30% forest cover threshold is based on our previous Landsat studies (Broich et al 2011 and is meant to facilitate interpretation of training and validation data. As the study area consisted of the humid tropical forest ecozone of Peru, there is limited extent of open canopied woodlands in the 10-30% canopy range. Using the percent tree canopy cover product of Hansen et al (2013), we quantified the difference in forest extent between the two definitions; the area of forest within the 10-30% cover range added only 0.4% forest extent to our definition. For humid tropical forests, the low end of forest cover definitions should not greatly impact forest extent estimation. However, for dry tropical environments, where open canopied woodlands and parklands are extensive, the application of differing low-end cover thresholds will greatly impact forest extent estimates. Spectral signatures and textures were interpreted in combination with landscape context to label primary forest cover. Primary forests are spatially extensive, have dark spectral signatures and marked texture. Mature secondary forests can be indistinguishable from primary forest due to similar structure and composition (assessed using Landsat imagery) and are included in our primary forest class. Younger secondary forests differ in their structure and composition, as suggested by Chokkalingam and De Jong (2001). The more homogeneous canopies of regrowing natural forests and plantations is readily visible in both Landsat and Rapi-dEye imagery. Our mapping of secondary forests in the Democratic Republic of Congo (Potapov et al 2012) relied on smoother canopy structure and a resulting brighter NIR reflectance.
While the Landsat-based wall-to-wall forest cover loss map is required for the national-scale analysis, the area estimate of gross forest cover loss is based on the Landsat/ RapidEye assessment. The higher spatial resolution of the RapidEye sensor (5 m/pixel compared to 30 m/pixel of Landsat) allowed for the precise detection of change at Landsat sub-pixel scale. As cautioned by Olofsson et al (2013Olofsson et al ( , 2014, estimating area from a map is subject to bias attributable to classification errors. The comparison of user's and producer's accuracies from the error matrix (table 1) shows that the Landsat-based map underestimated forest cover loss. The sample-based estimate of forest loss area is 17.5% higher (2.44% of the biome area) than the map-based estimate (2.07%). Such underestimation is possible for map products based on medium spatial resolution satellite data in areas of predominantly small-scale forest clearing (Tyukavina et al 2013). The estimate derived from the sample of high spatial resolution data (RapidEye) is assumed to provide a more accurate assessment of forest loss than does the Landsat-derived map; the final area of forest loss is thus estimated from the RapidEye imagery.
However, the wall-to-wall map is still beneficial as it is employed as auxiliary information in a model-assisted estimator (Särndal et al 1992, Stehman 2009) that results in a substantially more precise estimate of area of forest loss than would be obtained from just the RapidEye sample data. The standard error for the estimated percent area of forest loss would have been 0.60% had the map not been used in the estimation compared to the standard error of 0.16% which was reported for the estimate that incorporated the map information. While the final area estimation was made using high spatial resolution sample data, the Landsat-based map was used (i) to define the strata used to select the sample, (ii) as an input to reducing overall estimate uncertainty (see appendix) and (iii) to disaggregate the total loss area by year, province, and disturbance cause. For this disaggregation, we assumed that the relation of sample-based and Landsat-based forest loss area is uniform in time and in space. We believe that this is a valid assumption for Peru where small-scale forest disturbance is widespread. Our total area disaggregation approach, however, may not be suitable for precise subnational forest loss trends reporting in the context of REDD activities. The same sampling approach could be used to provide sub-national estimates, but the sample size would need to be substantially larger. Because we did not have an adequate sample size for regional estimates, using the wall-towall Landsat-based forest loss map was the only viable approach to provide change estimates per region. Our proposed methodology was implemented in Peru to map a baseline forest cover loss from 2000 to 2011. Updates of forest loss information are planned using more recent Landsat data. The multi-temporal metric approach yields standard image datasets for any temporal window-from one year to more than a decade. Moving forward, systematic annual updates are planned using the presented processing approach. Using the same data type for both baseline and monitoring products ensures consistency. Production and validation of the updated forest loss maps for Peru is now in the implementation stage.

Conclusion
An accurate, low latency approach for national-scale mapping of gross forest cover loss in the Peruvian humid tropical forest biome was implemented using Landsat time-series datasets. High spatial resolution sample data from RapidEye were used to produce final area estimates and uncertainties, and to validate the Landsat-based results. Validation confirmed the high accuracy of the Landsat map, which in turn was used to disaggregate change spatially (by region), temporally (at annual intervals), and by disturbance type (anthropogenic clearing and natural disturbance). Gross forest cover loss results are the key component of the NFMS and are important for the Peruvian REDD+ team as a baseline for on-going forest change monitoring. This approach is now being extended to the present in deriving a 14 year record of forest loss in Peru. The presented approach is based on, and consistent with, the global methodology used by Hansen et al (2013), and may be implemented at the national level for any country requiring basic information quantifying forest extent and change, whether for REDD+ or other forest management purposes.

Acknowledgments
The project was supported by Betty and Gordon Moore foundation, KfW, the Peruvian Ministry of the Environment (MINAM), Servicio Nacional Forestal y de Fauna Silvestre (SERFOR), Amazon Cooperation Treaty Organization (OTCA), US Government Inter-agency program SilvaCarbon, and USAID Forest Carbon, Markets and Communities Program (FCMC). SS was supported by NASA Carbon Monitoring Systems Program grant #NNX13AP48G.

Appendix. Technical details of estimation of forest loss and accuracy
The sampling design is a two-stage cluster sample with a stratified random sample of clusters (12 km × 12 km blocks) selected at the first stage and a simple random sample of 100 pixels selected within each sampled cluster at the second stage. The accuracy analysis is based on a population error matrix (table A1), and the cell entries of this error matrix must be estimated from the sample. Each pixel is classified as loss or no loss by the map, but the reference classification can be one of five values for proportion of area of forest loss for a sample pixel, 0, 0.25, 0.50, 0.75, or 1.00. For a given sample pixel, the proportion of area allocated to each cell of the error matrix was determined as follows. Let c hij = 0 if sample pixel j in cluster i of stratum h is mapped as no change and let c hij = 1 if sample pixel j in cluster i of stratum h is mapped as forest loss. Let r hij = proportion of area of forest loss according to the reference classification for sample pixel j in cluster i of stratum h (r is used for 'reference'). Suppose the sample pixel is mapped as no change (c hij = 0). Then the proportion of area of the sample pixel (P kl,hij ) allocated to row k and column l of the error matrix is: For example, if c hij = 0 and r hij = 0.25, then P 11,hij = 0.75 (proportion of agreement of no loss), P 12,hij = 0.25 (proportion of map omission error of loss), and P 21,hij = P 22,hij = 0. If the sample pixel is mapped as forest loss (c hij = 1), then: = = P P 0(no area mapped as no change), For example, if r hij = 0.25 and c hij = 1, then P 21,hij = 0.75 (proportion of map commission error of loss), P 22,hij = 0.25 (proportion of agreement of loss), and P 11,hij = P 12,hij = 0.
The accuracy estimates were produced using the SUR-VEYMEANS procedure of the Statistical Analysis Software (SAS version 9.3, SAS Institute, Cary, North Carolina, USA). The SAS program used to implement this analysis is available upon request to the corresponding author. For the sampling design implemented, the primary sampling unit is a cluster (indicated by the subscript i) and each cluster is assigned to a stratum h. The basic estimator used throughout is a ratio estimator (see documentation provided by SAS version 9. where h is the stratum index (h = 1, 2), i is the cluster index in stratum h for the sample size of n h (i = 1, 2, …, n h ) clusters sampled from the N h clusters available in stratum h, j is the index of the pixel (j = 1,…, m hi ), m hi is the number of pixels sampled in cluster i of stratum h from the M hi pixels available, x hij and y hij are defined to yield the parameter of interest (see Table A1. Structure and notation for a population error matrix based on a census of all pixels. The cell entries P kl represent the proportion of area with map class k and reference class l.

Map class No change Forest loss Row total
No change P 11 P 12 P 1+ Forest loss P 21 P 22 P 2+ Col. total P +1 P +2 1 subsequent explanation below), and w hij is the estimation weight (i.e., inverse of the inclusion probability) for sample pixel j in cluster i of stratum h. For the two-stage cluster sampling design, the inclusion probability of a pixel is the product of the first-stage inclusion probability (n h /N h ) and second-stage inclusion probability (100/160 000). The second-stage inclusion probability is derived from taking a simple random sample of 100 pixels from the 160 000 pixels available in each sampled cluster. The variance estimator for R is based on a Taylor series approximation (Särndal et al 1992): , then the denominator of the ratio estimator is M , an estimator of the total number of pixels in the population. To estimate forest loss as a proportion of primary forest area, we define y hij = P 12,hij + P 22,hij (reference proportion of area of forest loss) and define = x 1 hij if the pixel is in primary forest and = x 0 hij otherwise. To estimate the proportion of area of forest loss, we can incorporate the forest loss map information to construct an estimator that has smaller standard error than a direct estimator of this proportion from the error matrix (i.e., the direct estimator would be + P 2 ). For the two-stage cluster sampling design, many sample clusters did not have any sample pixels selected that were mapped as forest loss. Consequently, the estimator used is a difference estimator (Särndal et al 1992, chapters 6 and 8) as opposed to a poststratified estimator (Stehman 2013). The difference estimator is based on the sample differences e hij = y hij − x hij where y hij is the reference proportion of forest loss and x hij is the map proportion of forest loss for sample pixel j in sample cluster i of stratum h. The general form of the difference estimator of the total number of pixels of forest loss in stratum h is where X h = total number of pixels mapped as forest loss and Ê h is the estimated total of the differences e hij for stratum h. Ê h is estimated from the two-stage cluster sample as follows. For sample cluster i, the estimated total of the differences is where M hi is the number of pixels in all of cluster i (m hi will denote the number of pixels sampled in cluster i) and ē hi is the sample mean of the differences in cluster i of stratum h. Then the estimated total of the differences for stratum h is where N h = number of clusters in the stratum and n h = sample size for the stratum. Ŷ D h , was computed for each of the two strata constructed for the sampling design and the total number of pixels of estimated forest loss for all of Peru was the sum of the two stratum totals (i.e., sum the Ŷ D h , estimates for the two strata). The total estimated number of pixels of forest loss was then divided by the total number of pixels in the biome to scale the estimate of forest loss as a proportion of biome area. The variance estimator for Ŷ D for each stratum depends only on the variance of Ê h because X h is a known constant. The variance estimator is then ( ) V Êˆh is computed for each of the two strata and then the two estimated variances are added to provide the estimated variance for the entire biome, ( ) V ŶˆD . The standard error of Ŷ D is the square root of this estimated variance, and dividing this standard error by the total number of pixels in the biome yields the standard error of the estimated proportion of area of forest loss.