Combining post-disturbance land cover and tree canopy cover from Landsat time series data for mapping deforestation, forest degradation, and recovery across Cambodia

ABSTRACT Mapping of deforestation, forest degradation, and recovery is essential to characterize country-level forest change and formulate mitigation actions. Previous studies have mainly used a simple forest/non-forest classification after forest disturbance to identify deforestation and forest degradation. However, a more flexible approach that is applicable to different forest conditions is desirable. In this study, we examined an approach for mapping deforestation, forest degradation, and recovery using disturbance types and tree canopy cover estimates from annual Landsat time-series data from 1988 to 2020 across Cambodia. We developed models to estimate both disturbance types and tree canopy cover based on a random forest algorithm using predictor variables derived from a trajectory-based temporal segmentation approach. The estimated disturbance types and canopy cover in each year were then used in a rule-based classification of deforestation, forest degradation, and recovery. The producer’s and user’s accuracies ranged from 59.1% to 72.9% and 60.8% to 91.6%, respectively, for the forest change classes of mapping deforestation, forest degradation, and recovery. The approach developed here can be adjusted for different definitions of deforestation, forest degradation, and recovery according to research objectives and thus has the potential to be applied to other study areas.


Introduction
Human-induced deforestation, forest degradation, and recovery in tropical forests critically affect local ecosystem services and benefits (Schaafsma et al. 2014), the global carbon cycle, and biodiversity (Mitchard 2018;Pan et al. 2011;Xu et al. 2021). In the past decades, increasing pressure from timber extraction, agricultural expansion, and other human activities have led to broad-scale deforestation and forest degradation (Lewis, Edwards, and Galbraith 2015). The spatial and temporal patterns of these forest changes differ considerably depending on the region and period (Baccini et al. 2017;Hansen et al. 2013). Therefore, the spatial and temporal monitoring of tropical forests is a vital step in quantifying these changes. Forest monitoring also has an important role in formulating mitigation actions within international frameworks such as Reducing Emissions from Deforestation and Forest Degradation and the role of conservation, sustainable management of forests and enhancement of forest carbon stocks in developing countries (REDD+) (Global Forest Observations Initiative 2020).
Medium-spatial-resolution optical satellite data have been widely used for regional and countrywide mapping of deforestation, forest degradation, and recovery because of their cost-effectiveness and wide area coverage (Finer et al. 2018;Mitchell, Rosenqvist, and Mora 2017;Miettinen, Stibig, and Achard 2014). Freely available satellite data, such as Landsat data, have enabled the development of long-term and large-scale spatial mapping in the last decade (Dupuis et al. 2020;Phiri et al. 2020;Wulder et al. 2015). One such example is the Global Forest Change (GFC) dataset generated by Hansen et al. (2013), which provides the annual global forest loss and recovery at 30 m spatial resolution. Although this dataset provides valuable information on forest change, forest loss in the dataset represents mostly deforestation, and thus further processing is required to separately map both deforestation and forest degradation. In addition, although a rigorous validation was implemented on the original dataset, as forest loss detection in the GFC data has been gradually improved, time series trend analysis might be problematic without rigorous validation for forest loss in the updated version of the dataset (Grassi, Cescatti, and Ceccherini 2021;Palahí et al. 2021;Shimizu, Ota, and Mizoue 2020). In terms of the global availability, the land cover map at 10 m resolution by the ESA WorldCover (Zanaga et al. 2021) using Sentinel-1 and Sentinel-2 data is promising; however, this map currently does not provide temporal change information. Therefore, accurate estimation of forest change requires the development of locally adjusted maps. Although many studies have developed methods for mapping deforestation, forest degradation, and recovery, the long-term trend in spatial and temporal patterns in the tropics has still not been investigated sufficiently; thus, suitable methods remain to be discovered especially at the country scale.
Many approaches using Landsat data have been proposed to detect deforestation and forest degradation in the tropics (e.g. Asner et al. 2005;Raši et al. 2011;Sannier, Fichet, and Makaga 2014;de Bem et al. 2020;Matricardi et al. 2010) caused by various drivers, such as agricultural development, logging, and fires (Geist and Lambin 2002;Hosonuma et al. 2012). However, the estimates of change areas still contain large uncertainty (Gao et al. 2020). This uncertainty mainly arises because it is difficult to monitor forest degradation despite its prevalence in tropical forests (Dupuis et al. 2020). Forest degradation is generally a process of gradual or subtle abrupt changes. Therefore, monitoring is more challenging than for deforestation (Gao et al. 2020). Furthermore, there are more than 50 definitions of forest degradation in the literature (e.g. "the reduction of the capacity of a forest to produce goods and services", FAO 2001) and the use of the term depends on the interests and goals of different projects (Morales- Barquero et al. 2014;Vásquez-Grandón et al. 2018). Although the indicators for assessing the degree of forest degradation, such as a forest degradation index (Modica et al. 2015;Wu et al. 2020), have been proposed in the literature, the optimal approaches also vary depending on the forest types, drivers, and data sources (Miettinen, Stibig, and Achard 2014). These situations have led to different approaches using Landsat data in previous studies, such as spectral mixture analysis (SMA, (Souza et al. 2013;Asner et al. 2005)), fragmentation of primary forests (Zhuravleva et al. 2013), and temporal changes of related indicators (Mon et al. 2012;Duarte et al. 2020). Time series analysis of Landsat data can provide important temporal information on forest conditions. This characteristic is suitable for detecting gradual and subtle changes from deforestation and forest degradation (Global Forest Observations Initiative 2020). Recent studies have successfully detected changes caused by selective logging, fire, and shifting cultivation using Landsat time series data (e.g. Shimizu et al. 2017;De Marzo et al. 2021;Vancutsem et al. 2021;Hethcoat et al. 2020).
Landsat time series analysis assesses the temporal change of indices, and it can be used in various ways for mapping deforestation and degradation. For example, Bullock, Woodcock, and Holden (2020b) used time series of spectral endmembers from SMA for detecting forest disturbance and estimating forest/non-forest land cover after disturbance to monitor deforestation and forest degradation. Other studies used the temporal changes of indices from Landsat time series for monitoring land cover change (Tang et al. 2020;Arévalo, Olofsson, and Woodcock 2020;Murillo-Sandoval et al. 2018), tree canopy cover (Potapov et al. 2019), and forest disturbances (Reygadas et al. 2021;Hamunyela et al. 2016;Aryal et al. 2021) and then related them to deforestation and forest degradation in the tropics. Previous studies have mainly used a simple forest/non-forest classification to identify deforestation and forest degradation; however, a more flexible approach is required because the definition and evaluation of both deforestation and forest degradation are complex. One possible approach is to use post-disturbance land cover classification. This approach allows us to investigate the drivers of deforestation (e.g. conversion to agriculture) and forest degradation (e.g. changes from natural forest to plantation). Additionally, the estimation of forest condition, such as tree canopy cover, after disturbance provides important implications for forest recovery (Hamunyela et al. 2020). Tree canopy cover estimation has proved to be successful in monitoring applications in previous studies because of its capacity to evaluate forest status (Potapov et al. 2019;Estoque et al. 2021). The continuous data provided by tree canopy cover can complement disturbance type classification, enabling the estimation of pre-and post-disturbance status. Such flexible characterization can provide useful information for both deforestation/ degradation and vegetation recovery; however, few studies have mapped both types of change for entire countries to the best of our knowledge.
The main objective of this study was to develop an approach for mapping deforestation, forest degradation, and recovery using both post-disturbance land cover and tree canopy cover (TCC) estimated from Landsat time-series data in 1988-2020 for the whole of Cambodia. Our specific objectives were to (1) examine the accuracy of both post-disturbance land cover and TCC estimation using Landsat time-series data; (2) demonstrate the utility of an approach for mapping deforestation, forest degradation, and recovery using post-disturbance land cover and TCC; and (3) map countrywide spatial and temporal patterns of deforestation, forest degradation, and recovery.

Study area
The study area was the entire country of Cambodia, which has a total land area of 18.1 million ha between 10-15°N and 102-108°E. There is an annual dry season (November to April) and a rainy season (May to October) with a mean precipitation of 1780 mm and a mean temperature of 27°C. The forest in Cambodia consists mainly of evergreen, semi-evergreen (i.e. mixed evergreen/deciduous), deciduous, and flooded forests (Ministry of Environment Cambodia 2020).
Forest cover in Cambodia has declined over the past few decades (FAO 2020) from nearly 70% in the 1970s to 46.86% in 2018 (Ministry of Environment Cambodia 2020). The main causes of deforestation are agricultural land expansion and plantation development, whereas the causes of forest degradation include logging in forest concessions (Tsujino, Kajisa, and Yumoto 2019). Recent studies show that large-scale land acquisitions for agro-industrial use have contributed substantially to forest loss in Cambodia (Magliocca et al. 2020;Davis et al. 2015). To increase forest cover, several actions have been planned by the Cambodian government, including establishing Community Forestry and Protected areas, which are demonstrated to partially compensate forest loss (Ota, Lonn, and Mizoue 2020).

Methods
Annual estimates of forest disturbance types and TCC at a resolution of 30 m pixel were essential components for mapping deforestation, forest degradation, and recovery in this study ( Figure 1). The main processing steps were to (1) preprocess Landsat time series to generate annual image composites; (2) extract predictor variables using the LandTrendr temporal segmentation algorithm (Kennedy, Yang, and Cohen 2010) and auxiliary data; (3) develop a random forest (RF) model to identify disturbance types and land cover classes in each year; (4) develop an RF model to estimate TCC in each year; and (5) identify deforestation, forest degradation, and recovery using a rulebased classification of disturbance type and TCC following disturbance. Finally, we assessed the accuracy of the two RF models and rule-based classification of deforestation, forest degradation, and recovery. In our approach, the results of temporal segmentation from the LandTrendr were used to derive the predictor variables of two RF models and then mapped disturbance types/land covers from RF models were subsequently used to predict deforestation, forest degradation, and recovery. Although our approach is different from the original algorithm, it can provide a locally adjustable way of mapping for the desired criterion.

Preprocessing of Landsat and topographic data
We used Landsat TM, ETM+, and OLI surface reflectance (SR) data (Masek et al. 2006;Vermote et al. 2016) acquired from October 1 to January 31 in 1988-2020 to minimize variations in vegetation phenology. We only selected SR data with a cloud cover of less than 70%. A quality assessment band generated by the CFmask algorithm (Zhu and Woodcock 2012;Zhu, Wang, and Woodcock 2015) was used to remove cloudy pixels. We transformed the spectral bands of Landsat OLI SR data to those of TM/ETM+ SR data using the coefficients of Roy et al. (2016) to compensate the spectral differences. We generated annual SR composites by calculating the median values in each year. Any gaps were filled with the mean SR values from the preceding and subsequent years at the same pixel locations. After preprocessing, we calculated the time series of the normalized burn ratio (NBR) (Key and Benson 2006) and the tasseled cap brightness (TCB), greenness (TCG), wetness (TCW), and angle (TCA) (Crist 1985) spectral indices using annual SR median composites. We used the digital elevation model from the Shuttle Radar Topography Mission (Farr et a. 2007) 1 arc-second (approximately 30 m) product to calculate topographic slope and elevation. We implemented all processing in Google Earth Engine (GEE) (Gorelick et al. 2017).

RF models of disturbance type/land cover classification and TCC
We used the same predictor variables for both RF models: one for identifying disturbance type and land cover classes and the other for estimating TCC. To derive predictor variables for each year, we followed the approach shown in (Shimizu et al. 2019;Shimizu and Saito 2021). This approach used LandTrendr temporal segmentation (Kennedy, Yang, and Cohen 2010), which fits several straight line segments for the time series of a spectral index, to extract variables representing spatial and temporal changes in each year. We used these extracted variables to model annual disturbance type/land cover and TCC estimation, unlike the original LandTrendr algorithm. As demonstrated in previous studies (e.g. Franklin et al. 2015), predictor variables from disturbance detection algorithms are useful to improve land cover classification. Based on the annual land cover classification, forest disturbances only occurred in forest areas, as described in the post-processing steps below. In this study, we used the five spectral indices for LandTrendr and derived a total of 83 variables related to vertex, magnitude, fitted, and original values for each spectral index (see Table S1 in Supplementary Materials).
We collected training data for the RF model that identified five disturbance types and five land cover classes (Table 1) using a combination of stratified random sampling and purposive procedures. For stratified random samples, we randomly selected pixel locations for stable forest (tree cover ≥ 10% in 2000), stable non-forest (tree cover < 10%), and forest loss (2001-2019) using the GFC data . For purposive samples, we manually selected forest  Note: Error-adjusted area, producer's accuracy (PA) and user's accuracy (UA) for each class, and overall accuracy (OA) are shown with 95% confidence intervals.
disturbances that occurred before 2000 and other disturbances/land cover classes that occupied small areas. To label the disturbance type/land cover class for each year, we used Landsat time-series data and high-spatial-resolution data from Google Earth Pro for visual interpretation (Figure 2). In a pixel time series, we randomly selected three sample years as training samples if there were no disturbances at the pixel locations. If a disturbance occurred in the time series, we selected three sample years (pre-disturbance, post-disturbance, and disturbance years) and two random years. This procedure was followed so we could balance the training samples across the study period.
We selected a total of 1000 simple random pixel locations to train the RF model for estimating TCC. For each location, we manually labeled TCC in the years when high-spatial resolution satellite data were available in Google Earth Pro (Figure 2). We used 3×3 regular spaced grid points within 30 m pixels to label TCC, following the method in (Potapov et al. 2019). We calculated the proportion of nine sample grid points that intersected the tree canopy. If high-spatial-resolution data were not available, we removed such locations from training samples.
We fine-tuned the two RF models and applied them to all years over the entire study area using GEE. Because it is difficult to tune RF parameters in GEE in this analysis, we tuned the mtry parameter with 10-fold cross-validation using the package 'randomForest' (Liaw and Wiener 2002) and 'caret' (Kuhn 2008) in R statistical software (R Core Team 2021) and then used the obtained parameter in GEE with a number of trees of 500. The outputs of the RF models for disturbance types/ land cover classes and TCC were a 10-class classification (Table 1 and Table S2 for detailed definitions) and tree cover (0-100%), respectively, for each year. For the results of disturbance types/land cover classes only, we applied post-processing steps to eliminate irrelevant land cover transitions. We removed disturbances that occurred in 2 consecutive years in the same pixel locations since two consecutive disturbances were unlikely. Then, we removed forest disturbances that had fewer than four spatially contiguous pixels in each year to reduce false disturbance detection. If the adjacent disturbances in spatially contiguous pixels had more than two disturbance types, we reclassified them to the majority disturbance types. If the land cover class changed without disturbances in a pixel time series, we allocated the majority of land cover classes to all years of the time series. If land cover classes before and after disturbance did not match the disturbance type, we reclassified such land cover classes to improve temporal consistency.

Mapping deforestation, forest degradation, and recovery
After mapping forest disturbance types/land cover and TCC annually from 1989 to 2019, we implemented a rule-based classification of deforestation, forest degradation, and recovery. In this study, we defined 'deforestation' as forest disturbances that resulted in conversion to non-forest, 'forest degradation' as a reduction in forest canopy cover compared with the original state, although the area remained forest, and 'recovery' as a state where forests reached the original canopy cover after forest disturbances. Forest disturbances were allocated to these three classes (i.e. deforestation, forest degradation, or recovery). To identify deforestation, we used forest disturbance types, which represented land covers following forest disturbances. Among the disturbance types, three (i.e. toAgriculture, toGrass/Shrub, and toWater/Urban) resulted in land cover conversion and thus were classified as deforestation. For other disturbances (i.e. toForest and toPlantation), we labeled these as either forest degradation or recovery based on the TCC after these disturbances. Since toPlantation in this study indicated the conversion of natural forests to rubber plantations (Table S2), we included this class as potentially recovered to forests, following the definition of forest by the FAO (2020). Forest recovery in this study required almost the same level of TCC before disturbance; thus, we defined a threshold value of 70% of original TCC as the forest recovery. When TCC reached 70% of the average values of the two years prior to disturbance (if two years were not available, TCC prior to disturbance) within 5 years after disturbance, we classified these as forest recovery. If TCC after disturbance did not reach this threshold, we regarded such cases as forest degradation. Since it required 5 years to identify forest degradation/ recovery, we mapped deforestation, forest degradation, and recovery only from 1990 to 2014.

Spatial and temporal pattern of deforestation and forest degradation
We analyzed the spatial and temporal patterns of mapped deforestation and forest degradation. We calculated the distance in the 8-neighbor path from non-forest area each year and from non-forest as of 1989 using the gridDistance function in the R package 'raster' (Hijmans 2020). The neighbors were determined by the queen's case adjacency through the centers of raster pixels in the calculation. We identified forest and non-forest areas using classified disturbance types and land cover. Because mapped disturbance types did not represent a land cover but represented the transition from natural forests, we regarded land cover at the disturbance year as forest. The land cover following disturbance year depended on each disturbance type. We assigned the Natural forest and Plantation classes as forest and other classes as non-forest in the distance calculation. We investigated temporal trends in the mean distance from non-forest area for both deforestation and forest degradation.

Accuracy assessment
We conducted accuracy assessments for (1) disturbance types/land cover classification, (2) TCC estimation, and (3) deforestation, forest degradation, and recovery classification. We first collected reference samples for disturbance types/land cover classification and deforestation, forest degradation, and recovery classification using stratified random sampling following good practices as recommended by Olofsson et al. (2014). We determined the total sample size (n = 1793) to achieve 95% confidence intervals of ±2% in overall accuracy (OA). The sampling strata were disturbance types/ land cover classes throughout the study period. To generate this strata map, we assigned recent disturbance types for pixel locations with disturbance and land cover classes for pixel locations where there was no disturbance throughout the period. We allocated 75 sample pixels to rare classes and the remaining samples proportionally to the other classes. The reference samples were labeled using the same visual interpretation protocols as training data collection. In the visual interpretation, we additionally labeled deforestation, forest degradation, and recovery from disturbance samples. Here, deforestation was labeled if post-disturbance land cover classes were non-forest. Forest degradation and recovery were identified whether tree canopy cover reaches pre-disturbance conditions within 5 years after disturbance. When pre-disturbance tree canopy cover could not be determined due to the availability of high-spatial-resolution data, surrounding forest areas that have not been disturbed were used to represent pre-disturbance canopy cover. In the visual interpretation, one interpreter labeled all reference samples. We generated population error matrices to calculate the OA and producer's accuracy (PA) and user's accuracy (UA) of each class for both classifications. For disturbance type/land cover classification, we also calculated accuracy metrics for three aggregated classes (i.e. stable forest, disturbance, and other land cover) using the ratio estimators described in Stehman (2014). For deforestation, forest degradation, and recovery classification, we only used disturbances that occurred in 1990-2014 from the same reference samples. Additionally, we compared the detection accuracy and mapped annual disturbance areas with those in the GFC data for 2001-2019. To assess the accuracy of the GFC data, map classes in the GFC data were treated as the disturbance (i.e. forest loss occurred in 2001-2019), stable forest (i.e. tree cover ≥ 10% in 2000), and other land cover (i.e. tree cover < 10% in 2000) classes. Forest disturbances that occurred prior to 2001 in the reference samples were re-labeled to the corresponding post-disturbance land cover classes to adjust different mapping periods from the disturbance type/land cover classification (i.e. 1989-2019).
To assess the accuracy of TCC estimation, we collected simple random samples for 400 pixel locations. We labeled TCC in the most recent years between 1989 and 2019 when multiple highspatial-resolution data were available at the same location. The performance was evaluated using the coefficient of determination (R 2 ) and root mean squared error (RMSE). We also compared and assessed TCC estimation with that in previous studies by Hansen et al. (2013), which provided TCC in 2000, and Potapov et al. (2019), which provided TCC in 2000-2017 to test the robustness of our results based on the two sets of samples: 3000 random pixel locations and 400 pixel reference samples. We used both datasets in 2000 and dataset from Potapov et al. (2019) in 2017.

Disturbance types/land cover classification and TCC estimation
A countrywide map of disturbance types and land cover classification 1989-2019 is depicted in Figure 3. The mapped disturbance area accounted for 15.1% of the total study area. When focused on the temporal dynamics of disturbances for Cambodia, we found a notable increase in mapped disturbance area after 2010 with the largest disturbance area occurring in 2012 (Figure 4). The example of disturbance types and TCC mapping in Figure 5 illustrates forest disturbance patterns that were mainly affected by the large-scale development of rubber plantations. As indicated in previous studies, there were booms driven by the establishment of rubber plantations (Fox and Castella 2013;Grogan et al. 2018) and other cash crops in the 2010s (Kong et al. 2019) in Cambodia, which concurred with the temporal area changes of forest disturbance types in the current study. Spatially, forest disturbances were concentrated in land concessions and cross-border regions, reflecting large-scale development in the concessions and cross-border investment (Davis et al. 2015;Grogan et al. 2018).
Although the disturbance class of the consolidated 3-class classification achieved high accuracies, we found low PA for some disturbance types in the 10-class classification. This result was partially explained by the omission and commission errors of disturbances in deciduous forests. Such errors occurred because low TCC in deciduous forests is confused with other land cover classes and the phenological noise of deciduous forests reduces detection accuracy (Grogan et al. 2015). Furthermore, forest disturbances that occurred at the end of the study period were inherently prone to classification errors since information on the land cover after disturbances was not available. These omissions might be further exacerbated because disturbance detection algorithms that use the entire time series, such as LandTrendr, are prone to errors at the end of time series (Bullock, Woodcock, and Holden 2020a). In this study, toGrass/Shrub indicated disturbance without agriculture or plantation development; however, it would usually take several years after disturbances to detect signs of these land cover changes, leading to low classification accuracies.
Several land cover classes also had low PA and UA in the accuracy assessment of the 10-class classification. Most of the misclassifications of the Plantation class were confused with the Natural forest class. Because the spectral responses of the rubber plantations are usually similar to those of natural forests, this result is not surprising. In addition, the collection of training samples for rubber plantations was sometimes difficult due to spectral similarities, especially when there was no conversion from natural forests to rubber plantations. The possible mislabeling in the visual interpretation might exacerbate estimation accuracies. On the other hand, the forest disturbance (toPlantation) class achieved moderate accuracy. This result might be explained by the use of predictor variables from the LandTrendr temporal segmentation, which reflected the different recovery patterns between natural forests and rubber plantations.
The accuracy assessments in this study might contain the variability of reference sample collection, which was not accounted for. As the previous studies indicated, reference labeling in visual interpretation has a certain degree of variability among different interpreters (Pengra et al. 2020;Stehman et al. 2022). Different interpreters might label different classes to the same reference samples; however, multiple interpreters with quality control protocols would mitigate the bias and precision caused by mislabeling errors (McRoberts et al. 2018). Therefore, it is desirable to include several interpreters and resolve inconsistency among interpreters in visual interpretation (Stehman and Foody 2019). Although the trained interpreter implemented visual interpretation in this study, there might be interpretation errors that were not accounted for, due to the lack of multiple interpreters. Other important uncertainty exists in the visual interpretation of forest degradation. Although the reduction of tree canopy cover after a disturbance was used for determining the forest degradation and recovery in visual interpretation, it was difficult to label parts of samples due to the lack of high-spatial-resolution data, especially for forest disturbances that occurred before 2010. Such forest disturbances were prone to cause misinterpretation. These uncertainties that were not accounted for should be considered to understand the results of this study.
The accuracy of disturbance (i.e. forest loss) detection of the GFC data was lower than that of disturbance type/land cover classification, with the PA and UA of 64.0% (±5.3%) and 79.4% (±5.1%), respectively ( Table 2). The mapped disturbance area from 2001 to 2019 was 5.2% larger Table 2. Accuracy assessment for forest loss detection in Hansen et al. (2013)  than forest loss area in the GFC data. Annual comparison achieved an R 2 of 0.71 and RMSE of 38,187 ha ( Figure 6). Although the accuracy of the GFC dataset in a specific country is not always consistent with that of the global scale (Palahí et al. 2021;Bos et al. 2019), the GFC dataset provides a good approximation of forest loss at the country-level (Galiatsatos et al. 2020). The PA and UA of forest loss in the GFC data were lower than those of forest disturbances in the disturbance type/land cover classification; however, the high similarity in trends and area of forest disturbance in this study indicates the plausibility of our mapping approach in Cambodia. The R 2 and RMSE for TCC were 0.79 and 19.0%, respectively. The comparison with the existing TCC dataset showed that high similarities were obtained for both the datasets of Hansen et al. (2013) (Figure 7). The accuracy assessment using the reference samples for TCC achieved the R 2 and RMSE of 0.44 and 33.9%, respectively, for TCC by Hansen et al. (2013) and R 2 and RMSE of 0.72 and 22.1%, respectively, for TCC by Potapov et al. (2019). This result indicates that the locally adjusted TCC estimation have higher accuracy compared with these existing datasets. The summary of TCC estimation in natural forest and forest disturbance classes across Cambodia demonstrated the time series variation of TCC estimates (Figure 8). Examples of each disturbance class show relatively stable  TCC estimation before disturbance events (Figure 8). The TCC before and after disturbance show different annual variation for the two disturbance classes, in which forests remained as forests after disturbance ( Figure S1). Generally, TCC after toPlantation had lower values compared with those in toForest. We assumed that this result was obtained because several years are usually required for rubber plantations to develop a canopy after clearing all vegetation in natural forests (Beckschäfer 2017). The spectral characteristics of young rubber trees are strongly affected by the bare soil and understory vegetation (Chen et al. 2018), which might also cause large variation in TCC after toPlantation.

Mapping deforestation, forest degradation, and recovery
The mapped deforestation, forest degradation, and recovery in 1990-2014 are shown in Figure 9. The OA of classification was 94.4% (±1.1%). Accuracy assessment yielded a relatively higher PA for recovery (72.9%) followed by deforestation (64.7%) and forest degradation (59.1%). The UA was the highest in deforestation (91.6%) followed by forest degradation (74.0%) and recovery (60.8%). Overall, the mapping approach using post-disturbance land covers and TCC provided moderate accuracy. Since the rule-based classification of deforestation, forest degradation, and recovery depended on disturbance classification and TCC estimation, the accuracy obtained in the rule-based classification was largely affected by these two models. Relatively low PA in deforestation stemmed from low PA in disturbance types/land cover classification, especially affected by toGrass/Shrub and toUrban/Water. Large uncertainties of PA and area estimates (i.e. 95% confidence intervals) for small area classes like forest disturbances are also reported in previous studies. To obtain more reliable estimates, the use of a buffer stratum in accuracy assessment might be a viable solution . Although the efficiency of the buffer stratum differs depending on the size of the buffer stratum and the area proportion of the larger stratum, this procedure would be a simple way to reduce estimation uncertainties. The accuracy difference between forest degradation and recovery was also affected by both land cover classification and TCC estimation. In 1990-2014, deforestation, forest degradation, and recovery occupied 64.0%, 22.0%, and 14.0% of the forest disturbance area, respectively (Table 3). In Cambodia, 46.86% of the country was covered with forests in 2018 based on visual interpretation (Ministry of Environment Cambodia 2020). The error-adjusted area of stable forest and disturbance (toForest and toPlantation) in this study (i.e. 45.4%) was similar to this figure, revealing that the automatic mapping approach might be effective for forest monitoring.
The TCC relative to the original states before disturbance could distinguish forest degradation and recovery in forests with different original TCC levels. In Cambodia, TCC is high in evergreen  and semi-evergreen forests but low in deciduous forests (Potapov et al. 2019). In areas with such different forest types, it is difficult to assess forest degradation and recovery using a fixed TCC threshold. In this regard, we provided a better mapping approach for different forest types. In addition, the approach here can be changed to suit different study objectives. Forest degradation/recovery in this study included rubber plantations, which were treated as forests. However, natural forests converted to plantations can be degraded in terms of environmental services, as reported in (Scheidel and Work 2018). Our approach is applicable to different definitions of forest degradation/recovery because the estimated disturbance type and TCC can be adjusted to the desired definitions. Distance from the non-forest area in each year was higher for forest degradation (mean: 424.5 m) compared with deforestation (mean: 156.4 m). In each year, 72.2% of forest degradation and 94.4% of deforestation areas were located within 500 m from a non-forest area. The distance from non-forest in each year was stable through time for both deforestation and forest degradation ( Figure 10). Because the distance from roads and other non-forest areas is an important indicator for forest loss in the tropics (Mon et al. 2012;Ehara et al. 2021;Oliveira et al. 2007), it was not surprising that disturbance frequently occurred near the forest edge (i.e. near non-forest). Interestingly, however, the distance from non-forest as of 1990 increased from the 2000s both for deforestation and forest degradation ( Figure 10). This result indicated that the disturbance areas were gradually moving into places previously covered with forests, in contrast to before the 2000s. Deforestation in Cambodia has been accelerated by large-scale industrial development and the expansion of agricultural land under rapid economic growth (Tsujino, Kajisa, and Yumoto 2019). The increased distance trend seems to reflect such deforestation and forest degradation patterns, which were not apparent before economic growth.

Conclusions
We developed an approach for mapping annual deforestation, forest degradation, and recovery using disturbance types/land cover classification and TCC estimation from Landsat time series in Cambodia. Overall, rule-based classification of deforestation, forest degradation, and recovery yielded a moderate accuracy. The mapped deforestation, forest degradation, and recovery provided an understanding of the spatial and temporal patterns of forest changes across the country. The mapped disturbance types and TCC were helpful to characterize post-disturbance conditions. In Cambodia, the government has mapped land use/land cover change based on manual interpretation using the segmentation of satellite images across the country (Ministry of Environment Cambodia 2020). Although such an approach might eliminate errors caused by automated classification algorithms, the consistent and automatic mapping approaches in the current study can assist with more frequent (e.g. annual) monitoring of land cover changes. While the definitions of forest degradation/recovery might differ according to research objectives, the combinations of disturbance types and tree canopy cover estimates in this study can be flexibly modified to suit specific requirements. Thus, our approach is potentially applicable for other study sites over large areas. Although the use of RF models in our study can improve the prediction accuracy, one limitation of the approach is the requirement of sufficient training samples to develop valid prediction models for mapping disturbance type/land cover classes and TCC. As the performance of RF models is affected by the derivation of variables, the investigations on effective predictor variables might reduce training samples that achieve high prediction accuracy.