Mapping pervasive selective logging in the south-west Brazilian Amazon 2000–2019

Tropical forests harbour the highest biodiversity on the planet and are essential to human livelihoods and the global economy. However continued loss and degradation of forested landscapes, coupled with a rapidly rising global population, is placing incredible pressure on forests globally. The United Nations has developed the Reducing Emissions from Deforestation and forest Degradation (REDD +) programme in response to the challenges facing tropical forests and in recognition of the role they can play in climate mitigation. REDD + requires consistent and reliable monitoring of forests, however, national-level methodologies for measuring degradation are often bespoke and, because of an inability to track degradation effectively, the majority of countries combine reporting for deforestation and forest degradation into a single value. Here, we extend a recent analysis that enabled the detection of selective logging at the scale of a logging concession to a regional-scale estimation of selective logging activities. We utilized logging records from across Brazil to train a supervised classification algorithm for detecting logged pixels in Landsat imagery then predicted the extent of logging over a 20 year period throughout Rondônia, Brazil. Approximately one-quarter of the forested lands in Rondônia were cleared between 2000 and 2019. We estimate that 11.0% of the forest area present in 2000 had been selectively logged by 2019, comprising >11 500 km2 of forest. In general, rates of selective logging were twice as high in the first decade relative to the last decade of the period. Our approach is a considerable advance in developing an operationalized selective logging monitoring system capable of detecting subtle forest disturbances over large spatial scales.


Introduction
The ten countries reporting the highest forest losses over the last 15 years are all in the tropics (FAO 2015). Tropical forests are among the most biodiverse ecosystems on Earth, play a crucial role in the global carbon and hydrological cycles, and support human livelihoods and the global economy (Pan et al 2011, Lewis and Maslin 2015. Moreover, there is increasing recognition that tropical forests will be vital in nature-based solutions mitigating climate impacts and reaching targets in the Paris Climate Agreement (Houghton et al 2015, Griscom et al 2017. However continued loss and degradation of tropical forests, coupled with a rising global population and growing energy demands, are putting enormous pressure on forests globally .
In response to both the challenges and opportunities tropical forests present, the United Nations (UN) has developed the Reducing Emissions from Deforestation and forest Degradation (REDD +) programme. REDD + aims to mitigate climate impacts while maintaining the myriad of services forests provide (e.g. flood prevention, control soil erosion, maintain biodiversity, cultural traditions, etc) through sustainable forest management (UN-REDD 2018). An essential component in REDD + is consistent monitoring systems for national-level reporting of anthropogenic greenhouse gas emissions from activities affecting forests. Guidelines for estimating and reporting emissions from forest degradation are based on methods for land use change recommended by the Intergovernmental Panel on Climate Change (IPCC 2019) to facilitate a consistent framework for estimating reference levels (GFOI 2016). Yet the IPCC and REDD + lack specific methodological details on quantifying emissions from forest degradation (IPCC 2006, Pearson et al 2014. This is because degradation is notoriously difficult to quantify, as it includes a variety of forest disturbances (e.g. fire, selective logging, mining, fuelwood consumption, hunting, invasive species), and forest degradation often operates at spatial and temporal scales incompatible with reporting at the national level (Hosonuma et al 2012, Pearson et al 2014, Ghazoul et al 2015. Consequently, national-level methodologies for measuring degradation are often bespoke and most countries report emissions from both forest degradation and deforestation as a single, combined value (Hosonuma et al 2012, Pearson et al 2017. Advances in remote sensing have made satellite data the most practical and cost-effective way to monitor forests at large spatial scales. The spatial and temporal accuracy of deforestation monitoring has improved rapidly in the last decade (Hansen et al 2013, Reiche et al 2018, as have detection of the spatial extent, severity, and impacts of fires (Peres et al 2006, Matricardi et al 2010. Yet, detection of selective logging has shown little progress, despite being a key driver of both deforestation and forest degradation (Hosonuma et al 2012, Pearson et al 2017. Selective logging often marks the onset of anthropogenic disturbance affecting primary forests, with road networks and improved access to forested lands facilitating further degradation (e.g. fuel wood removal, spread of invasive species, illegal logging, mining, and fires) or forest clearance for pastures, agriculture, or settlements. Furthermore, because of the role tropical forests are poised to play in meeting climate targets and growing concerns about the impacts to other services, the amount of tropical forests logged at lower intensity and with better management practices is likely to grow.
Efforts to improve detection of selective logging have appeared periodically in the literature (e.g. Asner et al 2005, Souza et al 2005, Broadbent et al 2008, Matricardi et al 2010. In all cases the approach was either a proof-of-concept and not applied at scale or the intensity of selective logging was so high that detections are mapped as forest loss in the (Hansen et al 2013) data (e.g. Asner et al 2006; see also figures S1 and S2 (available online at stacks.iop.org/ERL/15/094057/mmedia). The majority of researchers have utilized spectral unmixing of before-after images to estimate forest disturbance between time steps (e.g. Souza et al 2013). Single-image analyses can miss forest disturbances occurring later and/or regions covered by clouds during scene acquisitions. More recently, advancements in data handling (e.g. Google Earth Engine) have enabled tracking individual pixels over a long period to detect forest disturbances (Bullock et al 2018). Google Earth Engine (GEE) has also allowed for more complex image mosaics to be produced, in which an image can be composed of individual pixels spanning any time period, minimizing information loss from clouds (Gorelick et al 2017). Recently, Hethcoat et al (2019) developed a method that used logging records to train supervised classification algorithms for detecting logging activities. Their methods were only applied at the scale of the logging concession and have not been demonstrated at larger spatial and temporal scales. The primary objective of this paper was to extend their methodology to a regional-scale assessment of selective logging. We trained a supervised classification algorithm for detecting selective logging using detailed logging records, then estimate the extent of logging between 2000 and 2019 throughout Rondônia, Brazil.

Methods
An overview of the methods described in the following sections is given in figure 1.

Study area
The Brazilian state of Rondônia covers 237 576 km 2 and is one of the most deforested regions in the Amazon (Pedlowski et al 2005, Tyukavina et al 2017. Historically, Brazil encouraged logging and land clearance as part of its settlement and development policies between 1970and 1990(Asner et al 2009. Widespread, unmanaged logging ravaged large portions of Mato Grosso, Pará, and Rondônia, accounting for more than 90% of timber harvest within the Brazilian Amazon (Asner et al 2009). In an effort to address some of the impacts rampant deforestation and logging had caused, Brazil adopted the CONAMA resolution (CONAMA 2009), which recognized advances in forestry research in the Brazilian Amazon and imposed a number of restrictions on logging, including limiting logging intensities to 30 m 3 ha −1 . While Pará and Mato Grasso have endured the highest rates of selective logging (Tyukavina et al 2017), the smaller size of Rondônia and the high availability of cloud-free imagery during the dry season (in contrast to cloudier regions of the Amazon basin) make it ideal for initially upscaling the methodology proposed by Hethcoat et al (2019). Workflow summarizing the methodology. The platform used for each step is in parentheses, with GEE being Google earth engine and R being the statistical software developed by the R core team.

Data and processing 2.2.1. Selective logging data.
Selective logging data from four lowland tropical forest regions in the Brazilian Amazon were used to build the detection algorithm described in section 2.3. The Jacundá and Jamari regions were inside the Jacundá and Jamari National Forests, in Rondônia, while the Saracá and Cikel regions were in the Saracá-Taquera National Forest and Paragominas municipality, Pará, respectively (figure 2). Forest inventory data from 19 forest management units (FMUs) selectively logged between 2010 and 2017 were used, comprising over 55 000 individual tree locations (see Asner et al 2004 for a description of typical logging practices in the Amazon). Data from three additional locations, one inside each National Forest (Jacundá, Jamari, and Saracá), comprised over 11 500 randomly selected point locations known to have remained unlogged during the study period (table S1).

Satellite data and processing.
All available Landsat 5, 7 and 8 surface reflectance data that coincided with the logging data were utilized in Google Earth Engine (GEE). At each FMU the Landsat archives were queried to find a single scene with the lowest cloud cover that was late into the dry season, but before the onset of the rainy season, to ensure the majority of logging was completed for that FMU (Hethcoat et al 2019). A linear spectral unmixing model, developed and validated over a range of forest disturbance types within the Amazon  (table S2). The normalized burn ratio (NBR) was also calculated (equation (1)), because it highlights changes in BG and NPV relative to PV and has demonstrated strong change detection capabilities in evergreen tropical forests (Grogan et al 2015, Shimizu et al 2017, Langner et al 2018. Spectral unmixing fractions for BG were zero for all logged locations, because of documented difficulties distinguishing BG and NPV with multispectral data in deterministic spectral unmixing algorithms (Asner 1998, Okin et al 2001, Asner and Heidebrecht 2002. Consequently, PV and NPV values were complementary and we only utilized PV fractions in the analyses. To reduce variations arising from differing atmospheric conditions and solar illumination, the PV and NBR values were spatially normalized in a self-referencing step (equations (2a) and (2b)) by subtracting the centre pixel value from the median value in a 150 m radius window (Langner et al 2018): Normalized PV and NBR values ranged between −1 and 1. This step was necessary to prevent highly inconsistent predictions along adjacent Landsat paths acquired at different dates ( figure S3). The spatially normalized PV and NBR values for the logged and unlogged observations were then compiled for algorithm training (section 2.3).

Building the detection algorithm
We built Random Forest (RF) models using the ran-domForest package (version 4.6) in the R program, version 3.5.1 (Liaw andWiener 2002, R Core Team 2018). We randomly allocated 90% of the data for training and withheld 10% for validation. In addition, to ensure training and validation datasets were independent, we assessed spatial autocorrelation of predictor variables and spatially filtered the data such that no observations in the validation dataset were within 90 m of an observation in the training dataset ( figure S4; see Supplementary Materials, sections S1 and S2 for further details on model specification and training).

Predicting selective logging through time
All available Landsat 5, 7, and 8 data over Rondônia were utilized in GEE. A cloud-free mosaic was constructed from the latest cloud-free pixel within the dry season (see table S4 for date ranges in each year). Clouds were masked using the QA band and an additional 300 m radius buffer was applied to cloudy pixels to minimize cloud shadows not identified by the QA mask. For the first year of analysis (2000) we only included pixels with forest cover >90% (Hansen et al 2013;Hansen data hereafter) to exclude open canopy forests, regenerating secondary forests, and areas generally not suitable for selective logging concessions that might result in false positives. At each time step, pixels identified in the Hansen data as being deforested in that year were removed. In addition, deforested pixels in the preceding year had a one pixel buffer removed from its edges to reduce spurious logging detections associated with deforestation. Pixels identified by the Moderate Resolution Imaging Spectrometer (MODIS) monthly burned area product (MCD64A1.006) were also removed. Thus, the pixels used to estimate the occurrence of logging in each year were in regions with tree cover exceeding 90% in 2000, that had not been deforested that year (or prior years), and had not burned.

Post-processing of logging predictions
In order to remove isolated logging detections amongst undisturbed forest we removed any detection with fewer than three other detections within a 7 × 7 pixel window neighbourhood. The window size and number of additional detections were chosen through extensive testing of different values over the Jamari region.

Evaluating map errors
Methods in Olofsson et al (2014) were used to assess agreement, calculate unbiased error estimates, and produce 95% confidence intervals of the mapped classes. We only assessed the accuracies of selective logging and undisturbed forest (i.e. logged and unlogged pixels) and did not consider deforestation and fires, as these have been estimated elsewhere (Hansen et al 2013, Turubanova et al 2018, Giglio et al 2018.

Results
We mapped logging across 44% of Rondônia, as the remaining 56% had already been deforested by 2000 or was below 90% canopy cover (e.g. rivers, lakes, savanna, cerrado, gallery forest). Of the forested lands present in 2000, 26.5% were deforested by 2019 (figure 3). We estimate that 11.0% of the forest area present in 2000 had been selectively logged by 2019, comprising >11 500 km 2 of forest (table 1). Logging detections were highest in the north central part of the state, in the region of the Bom Futuro National Forest (figure S7), a hotspot for logging and land clearance over the period (Pedlowski et al 2005). In general, the amount of selective logging was about twice as high in the first 10 years of the period than in the last 10 years, generally coinciding with logging restrictions implemented under the CONAMA resolution (CONAMA 2009).
The bias-adjusted confusion matrix, summarizing errors and confidence intervals for the proportions of mapped classes (Olofsson et al 2014), is shown in table 1 and is consistent with the results from model validation (section S2, figure S5, table S3). These findings reiterate a roughly 55% omission of logging detections and 5% commission error for unlogged forest (i.e. an estimated 5.4% of the area mapped as unlogged forest, roughly 4500 km 2 , was actually logged). For these reasons, and because we further limited our detection rate by excluding high intensity logging detected in the Hansen data (figures S1 and S2), our estimates of selective logging should be viewed as conservative and the annual amounts of selective logging are likely closer to double what is reported here.
We explored the results in more detail over two FMUs where we had general knowledge of logging but limited field data. First, an FMU selectively logged in 2018 with no data on logging locations showed some false detections in the years preceding 2018 (at roughly the expected rate), but the year of logging and an internal logging road constructed in 2015 are accurately identified ( figure 4). Similarly, the number of false detections over an area known to have remained unlogged (a forest reserve area associated with the logging concession) was approximately 2% over the 20 year period (figure 5).
Many of the detections were obviously logging road networks, both main access roads and smaller internal roads, which generally go undetected by the Hansen dataset ( figure 6). In addition, many detections preceded deforestation by a year or two (figure 7), demonstrated by logging detections occurring in areas later identified as deforested (i.e. subtle forest disturbance preceding total clearance was detected). Consistently, about 55% (±8% SD) of selective logging detections were within 1 km of deforestation activities occurring in the same year (figure 8). Thus, the majority of selective logging activities in Rondônia occurred in close proximity to deforestation presently detectable through the weekly Global Land Analysis & Discovery alerts system (Hansen et al 2016). This result is in line with the well documented

Discussion
We have demonstrated that the approach in Hethcoat et al (2019) to map tropical selective logging with Landsat data can be extended beyond the scale of a logging concession or forest management unit to regional-scale assessments of logging activities using historical data. This required changes to the original methodology, moving away from surface reflectance values and utilizing a spatial normalization step to mitigate abrupt changes in image mosaic values resulting from varying solar illumination and atmospheric conditions. We show that about 11% of the forested land present in 2000 was selectively logged by 2019, comprising >11 500 km 2 of tropical forest. Yet, our estimates of annual logging rates are likely underestimated for two reasons. First, only about half of the logging was actually detected in a given year (tables 1 and S3). We abandoned higher detection   despite the proportion of pixels where a tree was removed being closer to 10%. Robust methods are needed that incorporate these additional disturbances as true detections in the absence of field data. Some have utilized a buffer (often 180 m) around logging road networks or landing decks (Souza and Barreto 2000, Monteiro et al 2003, Matricardi et al 2010 to account for missed detections, yet these authors have acknowledged high commission and omission errors associated with this approach. We almost certainly underestimate the amount of selective logging for 2010 and overestimate it for 2011 because of two concurrent factors affecting the predictions for these years. First, the cloud-free window was earlier and narrower in 2010 than most other years (table S4). The cloudiness of 2010 has been documented in other forest mapping exercises in the Brazilian Amazon (Qin et al 2019). This would result in fewer detections, because the dry season period was about three weeks shorter than average and fewer pixels would have been logged over the shorter time period. Second, 2010 was a particularly high fire year within the Amazon (Aragão et al 2018), consequently large regions excluded from our analyses probably and the total area selectively logged over the period (5%) is just under the 6% they found for all forms of degradation. However, our 1% omission error of unlogged forest (table 1) translates to about 970 km 2 of unlogged forest being identified as logged over the 20 year period (i.e. <20 km 2 yr −1 ). Thus, our estimates are unlikely to be erroneously inflated and they reflect an improvement in the detection of selective logging.
Immediately noticeable in the detections of selective logging are an abundance of linear features (i.e. logging roads). Road building has big implications for primary tropical forests (Kleinschroth et al 2015, Kleinschroth and Healey 2017 and improving their detection is critical to understanding their lifecycle and the continued loss of intact forest landscapes (Potapov et al 2008. Roads create forest edges that can alter abiotic processes like microclimate (Williams-Linera et al 1998), change plant and animal species composition (Tabarelli et al 2012), increase fire susceptibility (Armenteras et al 2013), and ultimately weaken forest resilience (Murcia 1995, Kleinschroth andHealey 2017). Moreover recent work has shown that tropical forests globally may be nearing a tipping point where fragmentation will begin to increase dramatically (Taubert et al 2018). The tropics are estimated to have around 50 million forest fragments, encompassing nearly 50 million km of edge (Brinck et al 2017). Monitoring the emergence and spread of roads is critical to understanding the disturbance frontiers of intact forests and our method clearly improves early detection of cryptic roads.
Some important caveats are needed regarding our approach and results. First, like all studies in the tropics that exclusively use optical data, some areas were excluded from analyses each year because of clouds. Despite creating a mosaic of all available pixels in each year,~1% of Rondônia was affected by clouds annually (mean = 2600 km 2 ± 2400 km 2 SD) and was included in the subsequent year assuming no disturbance had occurred. Second, each mosaic consisted of only a single pixel per location and any selective logging that occurred after the date of the latest cloudfree pixel in the mosaic would remain undetected. Third, our approach cannot distinguish between logging and fire. We limited this by removing burned areas annually, using the MCD64 burn product, but the different scale of these datasets (500 m) and Landsat (30 m) is certain to result in commission and omission of burned area. Collectively, these factors will tend to cause underestimation of the area selective logging annually. Finally, our complete dataset on selective logging covered only a subset of the years (2011-2017) we mapped (2000-2019) and could not be used to properly validate annual maps from years without logging data (i.e. 2000-2010, 2018, 2019). Consequently, we only validated the final map against the validation data. Thus, if a logging detection was temporally inaccurate, it was technically regarded as correctly classified. Figures 4 and 5 were included to provide some perspective on this issue, where we show the false alarm rate (FAR) in regions where we had general knowledge that logging had occurred in a particular year (figure 4) and where we knew it had not occurred (figure 5). Both figures display very low FARs (the temporally inaccurate detections in figure 4 and any detection in figure 5) that suggest our results were not impacted.
Moving forward, we are exploring the sensitivity of the logging estimates to the choices of the value of the classification threshold used to detect logging (sections S1, S2, and figure S5) and the window size and the number of detections in the post-processing step (section 2.5). In particular, it is desirable to decrease the omission of logging by lowering the threshold and/or altering the window size and detection requirements in the post-processing step. However, such changes will also modify the commission error when predicting unlogged forest so both must be considered together. An additional decision affecting logging estimates was the exclusion of forests with canopy cover <90% as defined within the Hansen data. Brazil defines a forest as having >10% canopy cover and >5 m height (GFOI 2016), but we sought to restrict our analyses to continuous tropical forests (i.e. not secondary forest, cerrado, gallery forests, or otherwise modified forests) where commercial logging leases tend to occur.
Tropical forests store billions of tons of carbon. While the emissions estimates from selective logging are much lower than those from deforestation (Asner et al 2010), recent work has shown that taking full accounting of degradation activities suggests much higher emissions than previously thought Maxwell et al (2019). However, Maxwell et al (2019) simulated selective logging in proximity to road networks because large-scale maps are lacking. The extent of logged forest in the tropic is likely to be vast, yet they represent the next best alternative to the protection of primary forest (Edwards et al 2014). Given that financially viable pathways for global action on forest degradation will be linked to climate mitigation potential, with the aim of achieving secondary benefits for biodiversity and human livelihoods, reliable logging maps will enable a better accounting of the relationships between timber harvest and the full suite of goods and services tropical forests provide.

Acknowledgments
MGH was funded by the Grantham Centre for Sustainable Futures. JMBC was funded by the Natural Environment Research Council (Agreement PR140015 between NERC and the National Centre for Earth Observation). We would like to thank the Google Earth Engine developers and community for access to the datasets that made this work possible.