Mapping forest disturbances across the Southwestern Amazon: tradeoffs between open-source, Landsat-based algorithms

Local and cross-continental road building, increased economic teleconnections, growing agricultural demands, logging and mining practices, and general development processes are putting pressure on even the least densely populated regions of the Amazon, where local, regional, and global demand for food, fuel, and fiber is resulting in observable biophysical effects. It is essential, then, that stakeholders can both map and understand the effects of these forest disturbances on ecosystem services. Multiple remote sensing algorithms focused on detecting vegetation changes have been developed; the challenge now lies in understanding which algorithm best suits the user's study area and research objective. Using Google Earth Engine, we compared the performance of three algorithms, Continuous Degradation Detection (CODED), Landsat-based detection of trends in disturbance and recovery (LandTrendr), and Multivariate Time-series Disturbance Detection (MTDD), in detecting and characterizing forest disturbances in the Southwestern Amazon (Ucayali, Peru and Acre, Brazil) during the 2000–2020 period. In general, the results of all of the algorithms agreed with the reference data: overall accuracies were 94% (±0.6%, LandTrendr), 95% (±0.6%, MTDD), and 96% (±0.6%, CODED). Although the map products exhibit similar spatial patterns, they often differ on the specific disturbance extent. CODED works well in capturing disturbances associated with roads, MTDD excels at capturing entire disturbance patches, and LandTrendr excels both in terms of user friendliness and range of output options. Through three case study regions, we highlight land-cover change dynamics that have occurred in this remote, transboundary region over the last two decades. We also describe the strengths and weaknesses of each algorithm and demonstrate that it would be incorrect to assume that any one algorithm is the most accurate.
Our work, then, improves the capacity of the community to understand how well each algorithm is suited to mapping various forest disturbances, in support of sustainable decision making.


Introduction
Home to over 25% of the world's terrestrial species [1], almost 15% of the planet's running freshwater [2], the world's largest river basin, and nearly half of tropical forest carbon stocks [3], the Amazon rainforest provides ecosystem services to 33 million people, including 1.5 million Indigenous people from 385 different groups living within the biome boundary [4,5]. These ecosystem services, and the forests and rivers that provide them, are increasingly threatened as the Amazon rainforest rapidly approaches a tipping point [6,7]. Here we focus on the Southwestern Amazon (SWA), a region identified as critical to both the maintenance of continental scale atmospheric moisture flows and transportation infrastructure networks [7]. More specifically, we focus on the remote borderlands of Ucayali, Peru and Acre, Brazil (figure 1), where Indigenous and non-indigenous groups rely on the provisioning and regulating services provided by humid, tropical forest and its streams. The SWA is characterized by large swaths of intact forest and myriad patches of small-scale forest clearing, often in riparian zones. Forest disturbances are primarily driven by development activities such as road building, selective logging, cattle ranching, and expansion of farmland dedicated to export crops [8][9][10][11].
Multiple studies have highlighted the SWA's high biodiversity [13,14] and limited human footprint due to its remoteness [15][16][17][18][19]. Although deforestation in the entire Peruvian Amazon dropped 13% between 2016 and 2017, deforestation increased in Ucayali during this time [8], and save for 2018, has been trending upward since [20]. In Acre, Brazil, the deforestation rate increased from 25,700 ha in 2017 to 44,000 ha in 2018 and 68,200 ha in 2019 [21]. Total annual forest clearing in Acre state had not been that high since 2005 [21]. Moreover, both the Peruvian and Brazilian governments are targeting this region for development. On March 21, 2018, the Peruvian government issued Supreme Decree 005-2018-RE, stating its commitment to improved access to the interior regions of Peru and highlighting Ucayali for targeted development through the creation of 'sustainable infrastructure' (Decreto Supremo #005-2018-RE). In 2020, Brazil's President promoted a road connecting Acre and Ucayali, while in 2021 the Peruvian congress approved two roads between Ucayali and Acre on the grounds that integration with Brazil is of 'public necessity and national interest' (Bills 6486/2020-CR and 6916/2020-CR). The acceleration of deforestation and development planning makes this remote transboundary region an important location in which to quantify and understand forest disturbances.
Understanding forest dynamics and their relationship with ecosystem services and other environmental processes is crucial for Brazil and Peru's environmental and development ministries, regional governments, and SWA stakeholders grappling with sustainable development decision making as road proposals and fire intensity increase, Peru and Brazil face national leadership transitions, and climate change intensifies. Disturbed tropical forests are of particular interest not only because they represent the second largest anthropogenic source of global carbon dioxide emissions [22], but also because tropical forest growth after loss acts as an effective carbon sink due to high biomass accumulation rates [23]. Understanding forest regeneration, degradation, and clearing has never been more important given the prospect of an impending tipping point that threatens to transition the Amazon rainforest to a tropical savannah [6].
Multiple Landsat-based remote sensing algorithms focused on detecting vegetation changes have been developed. These algorithms have relied on a variety of time series-oriented approaches, including slope estimation [24], temporal segmentation and trajectory fitting [25][26][27][28], identification of anomalous departures from the trend [28][29][30][31], and machine learning classifications [23,32]. The application of spectral mixing models to Landsat time series has been a particularly common practice in studies focused on mapping forest disturbances in the Amazon region [33][34][35][36][37]. The challenge now lies in understanding which algorithm best suits the user's study area and research or mapping objective.
Here, we use Google Earth Engine (GEE) to examine the effectiveness of three forest disturbance algorithms: Continuous Degradation Detection (CODED; [37]), Landsat-based detection of trends in disturbance and recovery (LandTrendr; [26]), and a random forest machine learning algorithm based on [23], called Multivariate Time-series Disturbance Detection (MTDD; Wang 2021, personal communication). These algorithms were evaluated for the 2000-2020 period in our study region, where the transboundary setting adds an extra challenge to the configuration and adaptation of algorithms because they cannot rely on country-based datasets. With this comparative analysis, we address the following questions: (1) Do the three algorithms agree? and (2) What are the sources of agreement/disagreement among maps? We complement this analysis with three case studies to discuss the land-use and land-cover change dynamics that have occurred in this remote, transboundary region over the last two decades, and demonstrate the tradeoffs associated with each algorithm in capturing land-cover dynamics, in ease of use and parameterization, and in output options. Lastly, we provide the links to our GEE code associated with each algorithm in the Supplementary Material (SM) (available online at stacks.iop.org/ERC/3/091001/mmedia).

Definitions
It is nontrivial to adopt a set of definitions that harmonize with the fundamental logic behind CODED, LandTrendr, and MTDD. Here we broadly define forest disturbances as changes in the forest's state and function. These disturbances, then, can vary from high-impact events, such as fires or deforestation, to subtle and gradual processes, such as those caused by prolonged droughts, insects, or diseases [38,39]. We define degradation as a long-term process that does not lead to a change in land cover but negatively affects the forest's structure and function [40,41]. Because forest degradation encompasses a wide array of processes, a remotely sensed degradation signal can be subtle and difficult to identify. For example, in the satellite signal, degradation could appear as a slight, but not necessarily linear, decrease in vegetation greenness (a drought), or as a sharp drop in vegetation greenness followed by a slight increase over time (deforestation and recovery resulting in degraded forest). In contrast, we define deforestation as the permanent conversion of forested land to non-forested land [42]. Accordingly, we refer to both deforestation (i.e., high-impact event) and forest degradation (i.e., subtle and gradual process) as types of forest disturbances.

Input variables
We implemented CODED, LandTrendr, and MTDD in GEE using Landsat Thematic Mapper (TM), Landsat Enhanced Thematic Mapper Plus (ETM+), and Landsat Operational Land Imager (OLI) atmospherically corrected surface reflectance data from 2000 to 2020. The time series variables used as input by each of the algorithms are listed in table 1. CODED transforms Landsat data into spectral endmember fractions and uses them to calculate the Normalized Difference Fraction Index (NDFI; [36]). Because NDFI is often used in mapping forest degradation, we chose to also execute LandTrendr using NDFI. To accomplish this task, we added NDFI to the list of possible spectral indexes supported by the LandTrendr GEE library. MTDD utilizes two shortwave infrared (SWIR) bands and four spectral indices in its calculations: the Normalized Difference Vegetation Index (NDVI), two Normalized Difference Water Indices (NDWI1 and NDWI2), and the Soil-Adjusted Vegetation Index (SAVI).
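As a rough illustration of the index shared by CODED and our LandTrendr runs, NDFI can be computed from endmember fractions as sketched below. This assumes the commonly used formulation in which shade-normalized green vegetation (GV) is contrasted with the sum of non-photosynthetic vegetation (NPV) and soil, and it assumes the fractions have already been produced by spectral unmixing. The function name and the input values are hypothetical and are not taken from our GEE code.

```python
def ndfi(gv, npv, soil, shade):
    """Normalized Difference Fraction Index from endmember fractions (each in 0..1).

    Assumed formulation: NDFI = (GV_shade - (NPV + Soil)) / (GV_shade + NPV + Soil),
    with GV_shade = GV / (1 - Shade).
    """
    if shade >= 1.0:
        raise ValueError("shade fraction must be below 1 to normalize GV")
    gv_shade = gv / (1.0 - shade)           # shade-normalized green vegetation
    denominator = gv_shade + npv + soil
    if denominator == 0.0:
        return 0.0                          # no signal in any endmember
    return (gv_shade - (npv + soil)) / denominator


# Hypothetical pixels: closed canopy, partially opened canopy, bare ground.
intact = ndfi(gv=0.6, npv=0.0, soil=0.0, shade=0.4)
opened = ndfi(gv=0.3, npv=0.1, soil=0.1, shade=0.5)
cleared = ndfi(gv=0.0, npv=0.5, soil=0.5, shade=0.0)
```

Values near 1 correspond to intact canopy and values near -1 to exposed soil and dead vegetation, which is why a drop in NDFI through time can be read as a disturbance signal.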

Algorithm implementation
All of the code used in this analysis, the specific parameters used for each algorithm, and the instructions for running the code are provided in the SM. Because there is a rich literature on the creation and implementation of CODED [29,37], LandTrendr [26,43], and MTDD [23], and for the sake of brevity, we point the reader to those papers and the SM for in-depth information on each algorithm, and provide only a brief overview of each algorithm and our specific implementations below. CODED detects and characterizes disturbances based on NDFI change scores and then classifies them into degradation or deforestation. CODED outputs: (1) date of disturbance occurrence, (2) magnitude of NDFI change, (3) land cover-use type after the event (forest, non-forest), and (4) a stratified map (degradation, deforestation, stable forest, and non-forest). To create these products, CODED follows four general steps: (1) calculate NDFI time series, (2) detect disturbances based on change scores, (3) classify the post-disturbance land cover type, and (4) create a stratified map classified into deforestation, degradation, forest, and non-forest. See [29] and [37] for more details, and the SM for our specific CODED parametrization for the SWA.
LandTrendr detects and characterizes vegetation loss or gain by segmenting and fitting temporal trajectories of NDFI or another user-defined variable (other options include NDVI, Normalized Burn Ratio, Normalized Difference Snow Index, and individual Landsat bands, among others). Although LandTrendr allows the user to output myriad results (including oldest, newest, fastest, slowest, greatest, and smallest disturbance), after testing LandTrendr in the SWA, we found that the best results for our study area were obtained by parametrizing the algorithm to estimate the 'greatest' disturbance. Thus, we configured LandTrendr to output the following characteristics for the greatest disturbance of the study period: (1) date of occurrence, (2) magnitude of change, (3) duration, (4) pre-change spectral value, (5) rate of change, and (6) signal-to-noise ratio. The overall methodology to estimate these characteristics can be divided into six steps: (1) generating annual time series of the selected variable, (2) segmenting trajectories, (3) fitting trajectories to the identified segments, (4) simplifying models, (5) selecting the best model, and (6) determining vegetation changes based on filtering criteria. We further classified the disturbances as degradation or deforestation by applying a threshold to the magnitude of NDFI change (i.e., deforestation ≥ 0.73; degradation < 0.73). See [26] and [43] for more details, and the SM for our specific parametrization for the SWA.
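The post-classification of the LandTrendr output described above reduces to selecting the greatest disturbance and applying a single threshold to its magnitude. The following is a minimal sketch, assuming that magnitude is expressed as a positive NDFI loss and that the 0.73 boundary falls on the deforestation side. The function names, the "undisturbed" label, and the example segments are hypothetical; in practice LandTrendr also applies its own filtering criteria before a change is reported.

```python
DEFORESTATION_THRESHOLD = 0.73  # magnitude of NDFI change separating the two classes


def greatest_disturbance(segments):
    """Return the (year, magnitude) loss segment with the largest magnitude, or None.

    `segments` is a hypothetical list of (year, magnitude) tuples for one pixel,
    where a positive magnitude denotes a loss of NDFI.
    """
    losses = [segment for segment in segments if segment[1] > 0]
    if not losses:
        return None
    return max(losses, key=lambda segment: segment[1])


def classify_magnitude(magnitude, threshold=DEFORESTATION_THRESHOLD):
    """Label a disturbance by its NDFI change magnitude (assumed convention)."""
    if magnitude <= 0:
        return "undisturbed"
    return "deforestation" if magnitude >= threshold else "degradation"


# Hypothetical pixel with three loss segments during the study period.
pixel_segments = [(2004, 0.2), (2015, 0.9), (2018, 0.5)]
year, magnitude = greatest_disturbance(pixel_segments)
label = classify_magnitude(magnitude)
```

Because only the greatest disturbance is retained, earlier and smaller events at the same pixel (2004 and 2018 in this example) do not appear in the output map.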
The MTDD algorithm classifies areas into intact forest, degradation, and deforestation by training a random forest model with 66 metrics derived from annual time series (table 1). We built this MTDD algorithm in GEE based on [23]. Overall, the methodology of [23] consists of five main steps: (1) generating six annual time series, (2) calculating eleven descriptive statistics for each annual time series, (3) selecting training/validation points, (4) training a random forest classifier with the resulting 66 metrics, and (5) validating the classification (this validation is characteristic of machine learning methods and is independent from the overall validation described in section 2.4). See [23] for more details and the SM for our GEE implementation and specific parametrization for the SWA.
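The way six annual time series and eleven statistics combine into 66 metrics can be sketched as follows. This shows only the feature construction, not the random forest training, and the eleven statistics used here are illustrative placeholders: the statistics actually used are described in [23] and the SM. The variable names follow the inputs listed for MTDD, and all values are hypothetical.

```python
import statistics

# The six annual time series used by MTDD (two SWIR bands and four indices).
VARIABLES = ["swir1", "swir2", "ndvi", "ndwi1", "ndwi2", "savi"]


def summarize(series):
    """Eleven descriptive statistics for one annual series (placeholder choice)."""
    q1, _, q3 = statistics.quantiles(series, n=4)
    return {
        "min": min(series),
        "max": max(series),
        "range": max(series) - min(series),
        "mean": statistics.fmean(series),
        "median": statistics.median(series),
        "stdev": statistics.pstdev(series),
        "q1": q1,
        "q3": q3,
        "first": series[0],
        "last": series[-1],
        "net_change": series[-1] - series[0],
    }


def build_features(pixel_series):
    """Flatten per-variable statistics into one feature dictionary per pixel."""
    features = {}
    for name in VARIABLES:
        for stat, value in summarize(pixel_series[name]).items():
            features[f"{name}_{stat}"] = value
    return features


# Hypothetical pixel: 21 annual values (2000-2020) for each variable.
pixel = {name: list(range(21)) for name in VARIABLES}
features = build_features(pixel)
```

Each pixel's feature dictionary would then be passed, together with a training label, to a random forest classifier.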
To be able to compare across results, we ran the three algorithms over a pre-defined forest mask: all areas covered by forest for at least three consecutive years during the study period according to pan-Amazon MapBiomas annual land-use and land-cover maps [12].
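The forest mask rule can be expressed per pixel as a search for a run of consecutive forest years. The sketch below assumes each pixel's annual land-cover labels have already been reduced to forest/non-forest flags; the function name and example sequences are hypothetical.

```python
def in_forest_mask(annual_is_forest, min_years=3):
    """True if the pixel was forest for at least `min_years` consecutive years."""
    run = 0
    for is_forest in annual_is_forest:
        run = run + 1 if is_forest else 0   # extend or reset the current run
        if run >= min_years:
            return True
    return False


# Hypothetical pixels, one flag per year.
kept = in_forest_mask([True, True, False, True, True, True, False])
dropped = in_forest_mask([True, True, False, True, True, False])
```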

Validation
Due to the different methods and products resulting from each of the algorithms, we can only robustly assess the accuracies of two land-cover classes: disturbed forest and undisturbed forest. To evaluate the accuracy of the algorithms in detecting these classes, we collected reference observations selected from a stratified random sample. The four strata used were: (1) areas where none of the algorithms detected a disturbance, (2) areas where only one algorithm detected a disturbance, (3) areas where two of the three algorithms detected a disturbance, and (4) areas where all three algorithms detected a disturbance. Following the methods of [44], our final sample comprised 811 points. Based on the relative area of each stratum and the recommendations of [45], we allocated 586 samples to Stratum 1 and 75 samples to each of the remaining strata.
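The stratification by algorithm agreement can be sketched as a count of the algorithms that flag a pixel as disturbed. The stratum numbering and the sample allocation follow the text; the function name and the boolean inputs are hypothetical.

```python
# Sample allocation per stratum, as stated in the text (586 + 3 * 75 = 811).
ALLOCATION = {1: 586, 2: 75, 3: 75, 4: 75}


def agreement_stratum(coded, landtrendr, mtdd):
    """Stratum 1 = no algorithm detects a disturbance ... Stratum 4 = all three do."""
    detections = sum([bool(coded), bool(landtrendr), bool(mtdd)])
    return 1 + detections


# Hypothetical pixel flagged by CODED and MTDD but not by LandTrendr.
stratum = agreement_stratum(coded=True, landtrendr=False, mtdd=True)
total_samples = sum(ALLOCATION.values())
```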
We used Collect Earth Online (CEO, [46]) to facilitate sample interpretation. Three experts interpreted each of the 811 sampling units to determine whether each point was disturbed or undisturbed. If disturbed, the interpreter would also note whether the sample was degraded or deforested; and, if more than one disturbance occurred at the sample point, the interpreter would also record the year of the first disturbance and the year of the greatest disturbance. All samples in which one of the three interpretations differed underwent a second review by each of the experts. Ultimately, 18 samples were removed because of low confidence in the interpretation, the impossibility of assigning a label, or the lack of data.
Formulas for estimating accuracy and area are well known when the strata correspond to the map classes [45,47]. However, because our strata (agreement across map products) are different from our map classes (disturbed and undisturbed), a traditional error matrix displaying sample counts is not appropriate [47]. Instead, we follow the methods of [47] both to calculate confusion matrices in terms of proportions of area and to estimate areal proportions of each map class. See SM for more information regarding all aspects of our validation methodology.
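When strata differ from map classes, accuracy can be estimated by weighting each stratum's sample mean by its area proportion and, for user's and producer's accuracy, taking a ratio of two such weighted means. The sketch below illustrates this general form of stratified estimator under stated assumptions; it omits standard errors, and the stratum weights and sample labels are hypothetical rather than taken from our data. See [47] and the SM for the estimators actually applied.

```python
def stratified_mean(strata, indicator):
    """Sum over strata of (area weight) * (sample mean of a 0/1 indicator).

    `strata` is a list of (weight, samples); each sample is (map_class, ref_class).
    """
    total = 0.0
    for weight, samples in strata:
        hits = sum(1 for m, r in samples if indicator(m, r))
        total += weight * hits / len(samples)
    return total


def overall_accuracy(strata):
    return stratified_mean(strata, lambda m, r: m == r)


def users_accuracy(strata, cls):
    correct = stratified_mean(strata, lambda m, r: m == cls and r == cls)
    mapped = stratified_mean(strata, lambda m, r: m == cls)
    return correct / mapped


def producers_accuracy(strata, cls):
    correct = stratified_mean(strata, lambda m, r: m == cls and r == cls)
    reference = stratified_mean(strata, lambda m, r: r == cls)
    return correct / reference


# Hypothetical two-stratum example ("D" = disturbed, "U" = undisturbed).
example = [
    (0.8, [("U", "U"), ("U", "U"), ("U", "U"), ("U", "D")]),
    (0.2, [("D", "D"), ("D", "D"), ("U", "D"), ("U", "U")]),
]
oa = overall_accuracy(example)
ua = users_accuracy(example, "D")
pa = producers_accuracy(example, "D")
disturbed_area = stratified_mean(example, lambda m, r: r == "D")
```

Note that the second stratum contains samples mapped as both disturbed and undisturbed, which is the situation that makes a count-based error matrix inappropriate.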
We note that in the Case Studies subsection (3.2), we present the results of a post-disturbance classification for qualitative comparative purposes and a discussion of algorithm tradeoffs. There, the disturbance is subdivided into a 'degraded forest' class and 'deforested' class as described above. But, it is impossible to compare and assess the accuracy of the algorithms with regard to these specific classes. Unlike LandTrendr and CODED, which denote the year of disturbance in their output products, MTDD only outputs a map for the final year of the study period of interest and provides no information on the date of disturbance. Moreover, we cannot compare LandTrendr results with CODED because we parameterized LandTrendr to output a map of the 'greatest' disturbance and CODED to output the first disturbance as both of these parameterizations gave us the best results. Given the purpose of our study, though, we believe that it is useful to demonstrate both how these algorithms can be taken one step further and where each algorithm excels, should the user decide to implement these algorithms themselves.

Accuracy assessment and area estimation
The results of the accuracy assessment and area estimation are shown in table 2. CODED had the highest overall accuracy (0.956±0.006), followed by MTDD (0.949±0.006), and LandTrendr (0.936±0.006). The low standard errors (table 2, SE) are a good indicator that each algorithm performs well in the SWA with the parametrizations defined in the SM. As highlighted in table 2, CODED had the highest user's accuracy for the disturbance class (0.961±0.015) and MTDD had the lowest (0.85±0.022). According to the area estimations, between 2000 and 2020, 12.4% of our region of interest was disturbed and 87.6% remained undisturbed (∼3.1 million ha and 22.2 million ha, respectively).
With a ±1-year margin of error, CODED correctly assigned the date of the first disturbance to 102 (74%) of the 138 reference points that this algorithm accurately classified as disturbed. Similarly, LandTrendr correctly identified the date of the greatest disturbance in 73 (70%) of the 104 reference points accurately classified as disturbed.
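The date agreement reported above amounts to counting the correctly detected points whose mapped year falls within one year of the reference year. A minimal sketch, with a hypothetical function name and hypothetical year pairs:

```python
def date_agreement(year_pairs, tolerance=1):
    """Count (mapped_year, reference_year) pairs that differ by at most `tolerance`."""
    hits = sum(1 for mapped, reference in year_pairs
               if abs(mapped - reference) <= tolerance)
    return hits, hits / len(year_pairs)


# Hypothetical pairs of mapped and reference disturbance years.
pairs = [(2005, 2005), (2010, 2011), (2015, 2012), (2001, 2000)]
hits, share = date_agreement(pairs)

# The percentages in the text follow from the reported counts.
coded_share = round(100 * 102 / 138)
landtrendr_share = round(100 * 73 / 104)
```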
In terms of the spatial distribution of the disturbed areas, at a coarse scale, the three algorithms exhibit a similar spatial pattern as they generally detect disturbances around rivers, roads, and human settlements. However, at a finer scale, they often differ on the areal extent of the disturbed patches, which we discuss further through Case Studies below (figure 2).

Table 2. Confusion matrices expressed as proportions of area and accuracy estimations for CODED, LandTrendr, and MTDD. The proportion of area of each class is based on the reference classes which are the same for the three algorithms. Standard Errors (SE) for accuracy estimations and proportion of area are also shown.

Case studies
We have chosen these three regions because they represent three different but typical examples of the landscape dynamics that have occurred in this transboundary region over the last two decades and illustrate tradeoffs in the effectiveness of each algorithm.

Settlement project expansion
The forest disturbances observed in this region are due to human settlements and agricultural activities along formal settlement roads (figure 3). The temporal outputs of CODED and LandTrendr illustrate that many of the areas that remained degraded in 2020 (according to the MTDD results) were disturbed early in the study period. These areas were likely deforested by farmers between 2001 and 2005 and abandoned soon afterwards. Although this region in Acre has not been greatly affected by forest disturbances, extreme drought events [50,51], agricultural fires [52], and a proposed international road (through the Serra do Divisor National Park along the border between Brazil and Peru) have been a cause for concern in recent years. All three algorithms map these larger scale patch clearings similarly, the major difference being that CODED and LandTrendr often classify the borders of these agricultural patches as degradation, while MTDD identifies the entire patch as deforested. In addition, CODED detected a road crossing the settlement project, while LandTrendr and MTDD failed to identify it.

Post-road dynamics
The disturbances in this region are generally associated with roads and agricultural lands close to populated villages (figure 4). Most of the degradation is related to the expansion of agriculture and livestock areas outside the village of Nueva Italia and several indigenous Asháninka villages along the Genepanshea river. The deforestation in the region is associated with both forestry concessions and roads, specifically a 1980s Occidental Oil road rebuilt and extended by the Forestal Venao logging company to extract timber in the 2000s, and now rebuilt again for coca cultivation and to promote the UC-105 Trocha Nueva Italia-Breu road construction project. Importantly, as highlighted by the CODED and LandTrendr results, the intense deforestation to which this region has been subject has generally occurred during the last few years with the rebuilding of the road.
CODED best delineates deforestation due to roads, followed by LandTrendr. As in case study one, MTDD often classifies entire agricultural plots as deforested, whereas CODED and LandTrendr often classify plot edges as degraded, which may be related to the ways each algorithm deals with mixed pixels. An example of this type of disagreement can be observed in the central-west portion of the study area (73°54'W, 9°45'S), where a cleared patch was classified as degraded by CODED, as deforested in the center and degraded at the edges by LandTrendr, and as entirely deforested by MTDD.

Floodplain dynamics, agropastoral land expansion, and road building
In this case study region, forest disturbances can be associated with natural floodplain dynamics, expansion of agropastoral land, and road building (figure 5). Much of the road building here is associated with forestry concessions and agricultural expansion.
MTDD, which is sensitive to natural forest disturbances, does well at capturing degradation associated with natural floodplain dynamics, here specifically those of the Ucayali river and its tributaries. CODED and LandTrendr are more conservative in the detection of riverine disturbances, but the shape of the patches that they detect along the river seems to correspond to slash-and-burn agriculture practiced by long-term residents. As highlighted by the additional CODED and LandTrendr output, a majority of these disturbances have occurred in the last ten years.
As in the previous two case studies, all three algorithms perform well at mapping larger-scale patches of deforestation, which in this region are related to the expansion of cattle pasture, coca cultivation, and other upland agriculture. As evidenced in this case study region, CODED excels at mapping informal and logging roads, whereas MTDD is less sensitive to these disturbances, and LandTrendr falls in the middle. The class assigned to the detected roads also often differs among algorithms: while CODED and LandTrendr usually classify them as degradation, MTDD classifies them as deforestation.

Discussion
The 2000-2020 forest disturbance maps derived from the three Landsat-based algorithms tested in this study are similar at coarse scales and in overall accuracy (between 94% and 96%), but differ with respect to the extent of the disturbed patches and, importantly, the user's and producer's accuracy estimations (table 2 and figure 2). According to the user's accuracy of the disturbed class, which we prioritize because we focused on producing the maps best suited to the user's perspective, MTDD has the highest false detection rate (15%), followed by LandTrendr (7.3%) and CODED (3.9%). Based on the producer's accuracy, LandTrendr has the highest rate of disturbed reference data incorrectly classified (47%), followed by CODED (33.1%) and MTDD (28.7%).

[Figure caption] Year of the first (CODED) and the greatest (LandTrendr) disturbance. Study period: 2000-2020. Sources: all three algorithms are based on Landsat data; MTDD modeling also involved the use of ([4,48,49]; see SM for details); the forest mask was built based on [12].
The sources of agreement/disagreement among maps can be explained by each algorithm's fundamental methodology, which has profound effects on the results. CODED detects disturbances by searching for anomalous observations within spectral trajectories, LandTrendr segments and fits time series trajectories to find both abrupt and long-term changes in vegetation, and MTDD trains a random forest classifier with the statistical characteristics of the time series to determine forest conditions at the end of the study period. Although all three algorithms perform well in the SWA, each one outputs different information; thus, it is incumbent upon the user to determine which best suits their needs.
Each algorithm has its strengths and weaknesses. We present some of the aspects to consider when deciding if and when to use each algorithm in table 3. Depending on the study's objectives, approach, and training data availability, one algorithm may be better suited to the task at hand.
None of the algorithms is subject to major temporal constraints in the SWA because they rely on Landsat data that are freely available up to the present. Unlike MTDD, CODED and LandTrendr characterize disturbances on the annual scale. CODED and LandTrendr do not require training data, while MTDD does. For our ∼25 million ha region of interest, CODED's running time was approximately two days, whereas LandTrendr and MTDD ran in 4 h and 15 h, respectively.
In terms of results, the forest disturbance drivers observed in the three case studies are not equally captured by all three algorithms. As highlighted in our case studies, MTDD excels at capturing the entire disturbance patch, and is sensitive to degradation associated with floodplain dynamics as well as subtle disturbances (figures 3-5). CODED works particularly well in mapping deforestation associated with roads (figure 3). As for agricultural patches, CODED and LandTrendr often classify the edges as degradation, while MTDD commonly classifies the entire plots as deforested.
Moreover, the interpretation of some landscape dynamics and histories may be best accomplished by combining the algorithms' results, as MTDD provides the current forest condition and CODED or LandTrendr provide important information with regard to when those disturbance events occurred. Understanding these tradeoffs among the algorithms is key for making decisions regarding sustainable development as local and cross-continental road building, increased economic teleconnections, growing agricultural demands, logging and mining practices, and general development processes are putting increasing pressure on remote tropical forests and their traditional inhabitants and ecosystem services.

Table 3. Tradeoffs to consider when implementing CODED, LandTrendr, and MTDD.

CODED
• Online documentation based on [29,37]
• Characterizes individual disturbances and classifies them as degradation or deforestation
• Does not need training data if a forest mask is provided; otherwise, training data are needed to distinguish forest from non-forest
• Long processing time
• Often, the total extent of the disturbed patches is not completely captured
• Works particularly well for road detection
• Borders of agricultural plots are often classified as degradation

LandTrendr
• Straightforward and user-friendly online documentation based on [26,43]
• Characterizes individual disturbances (it is up to the user to design a post-classification to differentiate between degradation and deforestation)
• Does not need training data
• Very short processing time
• Often, the total extent of the disturbed patches is not completely captured
• Works well for road detection
• Borders of agricultural plots are often classified as degradation (based on our post-classification)

MTDD
• Documented in [23] and our SM
• Does not characterize individual disturbances, but it does make a distinction between degradation and deforestation
• Needs training data; the results are highly dependent on the quality of these data
• Short processing time
• Delimits more naturally shaped disturbed patches by capturing the total extent of the disturbed patches
• Very sensitive to degradation associated with floodplain dynamics and in general to subtle disturbances
• Agricultural plots are often totally classified as deforested

Conclusion
This study evaluated and compared three forest disturbance algorithms, CODED, LandTrendr, and MTDD, in the SWA. As part of this study, we added NDFI to the list of possible spectral indexes supported by the LandTrendr GEE library and fully developed an MTDD-based algorithm in GEE (see SM). Overall, the results of all of the algorithms agreed with the reference data (table 2) and the user's accuracy of the disturbed class was greater than or equal to 85%. Similar to [39] and [53], we found that although the disturbance maps exhibit a similar general spatial pattern, they often differ on the specific location of disturbances. The sources of these disagreements can be associated with the differences in the algorithms' fundamental methodologies.
As evidenced by the three case studies, remote regions are becoming increasingly teleconnected with mixed effects on the land cover and land use. In the last 20 years, the expansion of roads (planned and informal), agriculture, logging, mining, and settlements has threatened not only the SWA's forests, but also the remaining intact forests across the tropics. As climate change concerns and development pressures both intensify, environmental and development policy offices in tropical and temperate countries are under increasing pressure to monitor and manage their remote intact forests. This is particularly true in the SWA, where the onset of the fire season and a proliferation of Amazonian road proposals welcome Peru's new 2021 political regime, and anticipate Brazil's elections in 2022 even as the possibility of an Amazon tipping point approaches [7]. In the context of political and climate change, remote sensing algorithms provide an objective, efficient, and cost-effective tool to help monitor forests and inform management practices, but algorithms differ in their strengths and weaknesses. Each of the algorithms we analyzed can be utilized to provide information about specific land use and land cover changes, which may be key for making decisions regarding both climate change and sustainable development.
Thus, the goal of this work is to improve the capacity of the greater environmental community to recognize and understand the tradeoffs and products of these algorithms when using them as a tool for decision-making. With this work, we demonstrate that all three algorithms have strengths and weaknesses and that it would be incorrect to assume that any one algorithm is the most accurate. Depending on the study's area, objectives, approach, and training data availability, one algorithm may perform better than the others, and it is the responsibility of the user to understand how well each algorithm fits their specific needs.