Detecting tropical selective logging with C-band SAR data may require a time series approach

Selective logging is the primary driver of forest degradation in the tropics and reduces the capacity of forests to harbour biodiversity, maintain key ecosystem processes, sequester carbon, and support human livelihoods. While the preceding decade has seen a tremendous improvement in the ability to monitor forest disturbances from space, large-scale (spatial and temporal) forest monitoring systems have almost universally relied on optical satellite data from the Landsat program, whose effectiveness is limited in tropical regions with frequent cloud cover. Synthetic aperture radar (SAR) data can penetrate clouds and have been utilized in forest mapping applications since the early 1990s, but only recently have SAR data been widely available on a scale sufficient to facilitate pan-tropical selective logging detection systems. Here, a detailed selective logging dataset from three lowland tropical forest regions in the Brazilian Amazon was used to assess the effectiveness of SAR data from Sentinel-1, RADARSAT-2, and Advanced Land Observing Satellite-2 Phased Array type L-band Synthetic Aperture Radar-2 (ALOS-2 PALSAR-2) for monitoring tropical selective logging. We built Random Forests models aimed at classifying pixel-based differences between logged and unlogged areas. In addition, we used the Breaks For Additive Season and Trend (BFAST) algorithm to assess if a dense time series of Sentinel-1 imagery displayed recognizable shifts in pixel values after selective logging. In general, Random Forests classification with SAR data (Sentinel-1, RADARSAT-2, and ALOS-2 PALSAR-2) performed poorly, having high commission and omission errors for logged observations. This suggests little to no difference in pixel-based metrics between logged and unlogged areas for these sensors, particularly at lower logging intensities. In contrast, the Sentinel-1 time series analyses indicated that areas under higher intensity selective logging (>20 m³ ha⁻¹) show a distinct spike in the number of pixels that included a breakpoint during the logging season. BFAST detected breakpoints in 50% of logged pixels and exhibited a false alarm rate of less than approximately 5% in unlogged forest. Overall, our results suggest that SAR data can be used in time series analyses to detect tropical selective logging at high intensity logging locations (>20 m³ ha⁻¹) within the Amazon.


Introduction
Selective logging is the primary driver of forest degradation in the tropics (Hosonuma et al., 2012). Logging reduces the capacity of forests to harbour biodiversity, maintain key ecosystem processes, sequester carbon, and support human livelihoods (Baccini et al., 2017; Barlow et al., 2016; Lewis et al., 2015). However, large uncertainties remain in assessing the true impact of selective logging because the technological advances in detecting and monitoring logging at large scales are only just emerging (Hethcoat et al., 2019; Hethcoat et al., 2020; Langner et al., 2018; Lima et al., 2020; Shimabukuro et al., 2019). The ability to reliably map forest degradation from selective logging is a key element in understanding the terrestrial portion of the carbon budget and the role of land-use in turning tropical forests into net carbon emitters (Baccini et al., 2017). In addition, reliable forest monitoring systems are urgently needed for tropical nations and conservation groups seeking to report and/or mitigate carbon emissions through improved forest stewardship (GOFC-GOLD, 2017).
While the preceding decade has seen a tremendous improvement in the ability to detect forest disturbances from space (Hansen et al., 2013; Hethcoat et al., 2019; Tyukavina et al., 2017), forest monitoring at large spatial and temporal scales has largely relied on optical satellite data from the Landsat program. Yet, the effectiveness of optical data is limited in tropical regions with frequent cloud cover like the northwest Amazon and central Africa. Synthetic aperture radar (SAR) data can penetrate clouds and have been widely utilized in forest mapping applications since the early 1990s (see review by Koch, 2010 and references therein). However, the SAR data archives are spatially and temporally fragmented, and in many cases the data products require commercial licences for their use. Consequently, uptake by users has been more limited than optical data and the full potential of SAR has likely been under-utilized (Reiche et al., 2016). SAR backscatter, particularly at L- and P-band, is sensitive to changes in carbon stocks in forests with biomass <300 Mg ha⁻¹ (Koch, 2010; Mitchard et al., 2009; Saatchi et al., 2011). This enables accurate differentiation between forested and non-forested areas and has been well studied (e.g. Shimada et al., 2014). More recently, polarimetric and interferometric methods have been developed that utilize information in the SAR signal to detect forest changes (Deutscher et al., 2013; Flores-Anderson et al., 2019; Lei et al., 2018; Mathieu et al., 2013). Yet, the limited temporal and spatial coverage of SAR data has hampered widespread application and use of these techniques to monitor forest disturbances (e.g. single-pass interferometric SAR is only available with TanDEM-X data).
The launch of Sentinel-1A in mid-2014 represented the first continuous global acquisition strategy for open SAR data. Since 2015 a growing number of studies have used Sentinel-1 to map deforestation (Antropov et al., 2016; Delgado-Aguilar et al., 2017; Doblas et al., 2020; Hoekman et al., 2020), with others utilizing a fusion of optical and Sentinel-1 data (Reiche et al., 2018a, 2018b; Bouvet et al., 2018; Hirschmugl et al., 2020). Combining optical and SAR data has generally improved forest/non-forest mapping efforts over using either individually (Joshi et al., 2016; Reiche et al., 2015b; Zhang, 2010; Hirschmugl et al., 2020) and represents an important area of active development in forest remote sensing (Reiche et al., 2021). In contrast, no study has used Sentinel-1 to map selective logging and advancements in monitoring logging with SAR data are generally lacking, despite widespread recognition of both the need and the role it could play (Mitchell et al., 2017; Reiche et al., 2016). With the successful launch of SAOCOM 1A and 1B in late 2018 and early 2020, the planned continuation of the Sentinel-1 missions (with C and D), and the anticipated launch of NISAR in 2022, free C- and L-band SAR data will be available on a scale like never before. Accordingly, methods are needed that utilize open SAR data to make similar advancements in the detection of large-scale selective logging operations. Pixel-based methods for detecting changes in remotely sensed imagery often utilize differences between pixel values or other mathematically derived metrics in time or space, for example before and after some disturbance or in areas known to be disturbed and undisturbed within the same image (reviewed in Hussain et al., 2013). These differences can be used for classification, employed in machine learning, or analyzed temporally to map change.
Recently, the detection of selectively logged regions in single images has been demonstrated successfully with optical data from Landsat (Hethcoat et al., 2019). Simultaneously, time series methods have increasingly been used for monitoring changes in pixel values, in part because of the availability of vast archives of imagery on cloud computing platforms like Google Earth Engine (Gorelick et al., 2017), but also because of the recognition that seasonal or longer term trends in pixel values can be less susceptible to erroneously characterizing change (Bullock et al., 2018; Verbesselt et al., 2012; Zhu, 2017).
The primary objective of this paper was to assess the ability of Sentinel-1 to detect tropical selective logging. Detailed spatial and temporal logging records from three regions in Brazil were used to develop and test the effectiveness of two different detection techniques: (1) exploiting pixel-based differences between logged and unlogged locations in single images and (2) detecting change in a time series of pixels known to be logged. Selective logging records were used to build supervised machine learning models to classify selectively logged pixels. Machine learning methods have many applications in remote sensing and have been used with increasing frequency and success (Lary et al., 2016). We performed equivalent classification analyses with SAR data from the C-band RADARSAT-2 and L-band PALSAR-2 sensors to compare the performance of longer wavelength (i.e. L-band PALSAR-2) and higher resolution data (both RADARSAT-2 and PALSAR-2 have higher spatial resolution than Sentinel-1). Finally, we used all the available Sentinel-1 archives in a time series analysis to monitor pixel values for breakpoints at locations known to have been selectively logged. Given that forest disturbances from selective logging are often subtle and short-lived, detecting changes with SAR data over large regions will present technological and algorithmic challenges. However, a critical assessment of detection capabilities and a careful understanding of the performance of these data types are essential for advancing forest monitoring techniques in the tropics.

Study area and selective logging data
Selective logging data from three lowland tropical forest regions in the Brazilian Amazon were used in this study (Fig. 1). The Jacunda and Jamari regions are inside the Jacundá and Jamari National Forests, Rondônia, while the Saraca region is inside the Saracá-Taquera National Forest, Pará. Forest inventory data from 14 forest management units (FMUs) selectively logged between 2012 and 2017 were used, comprising over 32,000 individual tree locations. Unlogged data from three additional locations, one inside each study region, comprised over 11,500 randomly selected point locations known to have remained unlogged during the study period (Table 1). Forest inventory measurements were recorded by trained foresters and included the spatial location of each marketable tree species within the concession, its height, diameter, estimated volume, and if it was logged (only trees >50 cm in diameter are harvested). A field survey in 2016 relocated a subset of trees (n = 214) to estimate the geolocation precision of the logging inventory records (mean = 6.2 m; standard deviation = 6.6 m).

Satellite data and processing
All available C-band Sentinel-1A Ground Range Detected scenes in descending orbit and Interferometric Wide (IW) mode (VV and VH) were utilized in Google Earth Engine (GEE) over the study regions up through the date of logging. These had incidence angles of 38.7°, 38.7°, and 31.4° for Jacunda, Jamari and Saraca, respectively. GEE is a cloud computing platform hosting calibrated, ortho-corrected Sentinel-1 scenes that have been processed in the following steps using the Sentinel-1 Toolbox: (1) thermal noise removal; (2) radiometric calibration; and (3) geometric terrain correction (i.e. geocoding) using the Shuttle Radar Topography Mission (SRTM) 30 m digital elevation model (DEM). Furthermore, we (1) applied a radiometric slope correction in GEE detailed in Vollrath et al. (2020); (2) performed multi-temporal speckle filtering (using a median, 7 × 7 pixel window) detailed in Quegan and Jiong Jiong (2001); and (3) reduced pixel resolution to 20 m. The final equivalent number of looks was approximately 80 after speckle filtering and spatial averaging.
Single Look Complex C-band RADARSAT-2 scenes in Fine mode (HH and HV) were obtained from the Canadian Space Agency. Twelve ascending scenes from the same orbit path and frame number, with an incidence angle of 30.7°, coincided with selective logging records and were acquired between 2011 and 2012. Pre-processing of images was done with the Sentinel-1 Toolbox and included: (1) radiometric calibration; (2) multi-looking (by a factor of 2 in azimuth) to produce square pixels; (3) Radiometric Terrain Flattening; (4) multi-temporal speckle filtering (using a median, 7 × 7 pixel window) detailed in Quegan and Jiong Jiong (2001); (5) Geometric Terrain Correction using the SRTM 30 m DEM; and (6) reducing pixel resolution to 20 m. The final equivalent number of looks was approximately 105 after speckle filtering and spatial averaging.
Single Look Complex L-band PALSAR-2 scenes (HH and HV) were obtained from the Japan Aerospace Exploration Agency (JAXA). Ten ascending scenes from the same orbit path and frame number, with an incidence angle of 36.2°, coincided with selective logging records and were acquired between 2016 and 2018. Pre-processing of PALSAR-2 data followed the same steps as RADARSAT-2. The final equivalent number of looks was approximately 90 after speckle filtering and spatial averaging.
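The multi-temporal speckle filtering step can be illustrated with a minimal sketch. It assumes the commonly cited form of the filter, in which each image is divided by a local spatial estimate, the ratios are averaged across dates, and the average is rescaled by the local estimate of the image being filtered. The function names, the list-of-lists image representation, the edge clipping, and the requirement of strictly positive linear-scale intensities are assumptions made for illustration and do not reproduce the processing chain used in this study.

```python
from statistics import median


def local_median(image, half_width):
    """Median of each pixel's neighbourhood, clipped at the image edges."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [
                image[rr][cc]
                for rr in range(max(0, r - half_width), min(rows, r + half_width + 1))
                for cc in range(max(0, c - half_width), min(cols, c + half_width + 1))
            ]
            out[r][c] = median(values)
    return out


def multitemporal_filter(stack, half_width=3):
    """Filter a co-registered stack of intensity images (linear units, not dB).

    half_width=3 corresponds to a 7 x 7 window. Intensities are assumed to be
    strictly positive so that the local median is never zero.
    """
    n = len(stack)
    local = [local_median(img, half_width) for img in stack]
    rows, cols = len(stack[0]), len(stack[0][0])
    filtered = []
    for k in range(n):
        out = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # Average of locally normalised intensities over all dates.
                ratio_sum = sum(stack[i][r][c] / local[i][r][c] for i in range(n))
                # Rescale by the local level of the image being filtered.
                out[r][c] = local[k][r][c] * ratio_sum / n
        filtered.append(out)
    return filtered
```

In this sketch a homogeneous stack is returned unchanged, while an isolated bright pixel in one date is pulled toward its local level by the other dates.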

Data inputs for classifying selective logging
For each satellite data type (Sentinel-1, RADARSAT-2, and PALSAR-2) data were extracted at each pixel where logging occurred and at randomly selected pixels in nearby regions that remained unlogged. Thus, the data inputs for logged and unlogged observations came from a single scene for each study region (i.e. a space-for-time study design in contrast to images before and after logging from the same location). Selective logging at the study areas only occurred during the dry season, approximately June-October in a given year, and data were extracted from images acquired as late into the logging period as possible (Table S1) to ensure the majority of pixels had been subjected to logging, but also before the onset of the rainy season (Hethcoat et al., 2019). In addition, logging activities tend to be accompanied by surrounding disturbances (canopy gaps, skid trails, patios, and logging roads) resulting in forest disturbances beyond just the pixels where a tree was removed. Accordingly, seven texture measures were calculated for each polarization (sum average, sum variance, homogeneity, contrast, dissimilarity, entropy, and second moment) to provide a local context for each pixel (Haralick et al., 1973). These were calculated within a 7 × 7 pixel window, chosen as a trade-off between minimizing window size while still capturing the variability in selectively logged forests compared to unlogged forests. Finally, a composite band was calculated as the ratio of the co-polarized channel to the cross-polarized channel (i.e. HH/HV or VV/VH). Each dataset thus comprised a 17-element vector (2 polarization bands, their ratio composite band, and 7 texture measures for each polarization) for each pixel where logging occurred and randomly selected pixels that remained unlogged.

Table 1. Data used in the classification of selective logging from three study regions in the Brazilian Amazon. The forest management units (FMU), logging intensities, sample sizes (pixels), and overlap with satellite data coverage are shown for Sentinel-1 (S), RADARSAT-2 (R), and PALSAR-2 (P).
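As an illustration of the texture measures, the sum average of a grey-level co-occurrence matrix (GLCM) can be sketched as below. This is a simplified example, not the implementation used in this study: the function name, the single horizontal offset, the symmetric counting, and the use of integer grey levels starting at zero are all assumptions, and the exact definition used here is given in Eq. S1.

```python
from collections import Counter


def glcm_sum_average(window, offset=(0, 1)):
    """Sum average of a symmetric, normalised grey-level co-occurrence matrix.

    `window` is a 2-D list of quantised (integer) grey levels, e.g. a 7 x 7
    neighbourhood. `offset` is the (row, column) displacement between the two
    pixels of each pair. Both conventions are illustrative assumptions.
    """
    d_row, d_col = offset
    rows, cols = len(window), len(window[0])
    pairs = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = window[r][c], window[r2][c2]
                pairs[(a, b)] += 1
                pairs[(b, a)] += 1  # count both directions (symmetric GLCM)
    total = sum(pairs.values())
    # Sum over k of k * p_{x+y}(k), written as a sum over (i, j) of (i + j) * p(i, j).
    return sum((i + j) * count for (i, j), count in pairs.items()) / total
```

Because the measure weights each co-occurring pair by the sum of its grey levels, it behaves like a locally smoothed brightness, which is consistent with it decreasing where canopy is removed.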

Random forests for classification of selective logging
We built Random Forests (RF) models using the randomForest package in program R version 4.0.2 (Liaw and Wiener, 2002). The RF algorithm (Breiman, 2001a) is an ensemble learning method for classification. Each dataset was split into 90% for training and 10% was withheld for validation. In order to further ensure the independence of training and validation datasets, the validation data were spatially filtered such that no observations in the training dataset were within 90 m of an observation in the validation dataset. RF models have two tuning parameters: the number of classification trees grown (k), and the number of predictor variables used to split a node into two sub-nodes (m). We used a cross-validation technique to identify the number of trees and the number of variables to use at each node that minimized the out-of-bag error rate on each training dataset (Table S2). The importance of each predictor variable was assessed during model training, using Mean Decrease in Accuracy, defined as the decrease in classification accuracy associated with not utilizing that particular input variable for classification (Breiman, 2001b).
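The spatial filtering of the validation data can be sketched as follows. This is a minimal illustration under stated assumptions: points are given as (x, y) coordinates in a projected system with units of metres, the function name and the brute-force distance search are hypothetical, and the random split uses a fixed seed only for reproducibility.

```python
import math
import random


def split_with_buffer(points, validation_fraction=0.1, buffer_m=90.0, seed=0):
    """Randomly split points, then drop validation points near training points.

    Returns (training, validation), where no retained validation point lies
    within `buffer_m` of any training point.
    """
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    validation, training = shuffled[:n_validation], shuffled[n_validation:]
    kept = [
        v for v in validation
        if all(math.dist(v, t) >= buffer_m for t in training)
    ]
    return training, kept
```

With densely clustered tree locations, a filter of this kind can discard a large share of the withheld 10%, so the effective validation sample may be considerably smaller than the nominal split suggests.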

Model validation: Assessing accuracy
The RF models were validated using a random subset of the full dataset for each sensor (described in Section 3.1.2). By default, RF models assign an observation to the class indicated by the majority of decision trees (Breiman, 2001a). However, the proportion of trees that voted for a particular class from the total set of trees can be obtained for each observation and a classification threshold can be applied to this proportion (Hethcoat et al., 2019; Liaw and Wiener, 2002). We adopted such an approach, wherein the proportion of trees that predicted each observation to be logged, informally termed the likelihood a pixel was logged, was used to select the classification threshold. A threshold, T, was defined such that if likelihood > T the pixel was classified as logged (Fig. 2).
The confusion matrix then has the form:

$$
\begin{array}{l|cc}
 & \text{Detected } L & \text{Detected } UL \\
\hline
\text{Reference } L & D_L & N_L - D_L \\
\text{Reference } UL & D_U & N_U - D_U
\end{array}
$$

where $L$ and $UL$ refer to logged and unlogged classes, $N_L$ and $N_U$ are the numbers of logged and unlogged observations in the reference dataset, and $D_L$ and $D_U$ are the numbers of logged and unlogged pixels detected as logged, respectively. We defined the detection rate $DR = D_L / N_L$ and false alarm rate $FAR = D_U / N_U$ as the frequency that a logged or unlogged pixel was classified as logged, respectively. Thus, the DR is equivalent to 1 minus the omission error of the logged class and the FAR is the omission error of the unlogged class. In addition, we defined the false discovery rate (FDR):

$$
FDR = \frac{D_U}{D_L + D_U}
$$

The FDR is the proportion of all observations that were detected as logged that were actually unlogged, and is equivalent to the commission error of the logged class. The FDR is an assessment of the rate of prediction error (i.e. type I) when labelling pixels as logged and can be used in detection problems with rare events or unbalanced datasets, such as selectively logged pixels within the Amazon Basin (Benjamini and Hochberg, 1995; Hethcoat et al., 2019; Neuvial and Roquain, 2012). A high DR and low FDR are clearly desirable, but these cannot be fixed independently in two-class detection problems and both depend on the threshold value (Fig. 2). For example, if achieving a 95% detection rate led to an FDR of 50%, then half of all predictions of logging would be incorrect. This level of performance would make estimates of selective logging extremely uncertain. The value of the classification threshold (T) therefore represents a trade-off between true and false detections. In practice, a viable detection method would need to achieve a DR > 50% while limiting the FDR to 10-20% to have any value for widespread forest monitoring. The performance of each sensor was assessed by plotting the DR, FAR and FDR values as T varied from 0 to 1 to facilitate discussion of model performance.
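The three rates can be computed directly from the per-observation likelihoods and a threshold T, as in the following sketch. The function and argument names are hypothetical, and returning NaN for the FDR when nothing is detected is a convention chosen for this example.

```python
def detection_metrics(likelihoods, is_logged, threshold):
    """Return DR, FAR and FDR for one classification threshold T.

    `likelihoods` holds the proportion of trees voting "logged" for each
    observation; `is_logged` holds the reference labels (True = logged).
    An observation is labelled logged when likelihood > threshold.
    """
    d_l = sum(1 for p, y in zip(likelihoods, is_logged) if y and p > threshold)
    d_u = sum(1 for p, y in zip(likelihoods, is_logged) if not y and p > threshold)
    n_l = sum(1 for y in is_logged if y)
    n_u = len(is_logged) - n_l
    detected = d_l + d_u
    return {
        "DR": d_l / n_l,    # logged pixels detected as logged
        "FAR": d_u / n_u,   # unlogged pixels detected as logged
        # Undefined when nothing is detected; NaN is an illustrative choice.
        "FDR": d_u / detected if detected else float("nan"),
    }
```

Evaluating this function over a grid of thresholds between 0 and 1 yields curves of the kind used to compare sensors, for example `[detection_metrics(p, y, t / 100) for t in range(100)]`.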

Sentinel-1 classification of high intensity logging
Most of the selective logging in this study was low-intensity (<15 m³ ha⁻¹) and we anticipated the logging signal to be weak and difficult to detect. Consequently, we also considered a reduced Sentinel-1 dataset that included only those FMUs with logging intensities above 20 m³ ha⁻¹ (n = 3 sites) and the unlogged data (n = 3 sites) to assess if Sentinel-1 could be used for detecting selective logging activities near the legal limit within the Brazilian Legal Amazon. Unfortunately, RADARSAT-2 and PALSAR-2 imagery did not cover the highest intensity logging sites, so we could not perform equivalent analyses with these datasets. RF classification and validation were performed on this subset of the Sentinel-1 data in the manner detailed above for the full dataset.

Time series analyses
We tested whether a time series of Sentinel-1 data displayed discernible changes in pixel values after selective logging using the BFAST algorithm (Verbesselt et al., 2010) in program R (R Core Team, 2020). BFAST estimates the timing of abrupt changes within a time series (breakpoint hereafter) and has been successfully utilized with a range of data types (e.g. Landsat, MODIS, SAR, etc.). The metrics used in searching for breakpoints in the full Sentinel-1 time series (approximately 55 scenes from October 2016 to August 2018) were the two most important predictor variables identified from RF models. The limited temporal coverage of RADARSAT-2 and PALSAR-2 at our study sites precluded time series analyses with these datasets. BFAST was used to assess if a suitable model with one or no breakpoints was appropriate and included tests for coefficient and residual-based changes in the expected value (i.e. the conditional mean). Where breakpoints were identified, we determined if they coincided with the timing of selective logging activities (June-October) and regarded these as true detections. Breakpoints in unlogged areas and breakpoints outside the timing of logging activities were considered false detections. In addition, the relationship between the frequency of breakpoints within an FMU and its logging intensity was examined to understand potential thresholds in logging intensity above which variables could be used to monitor selective logging activities through time series analyses.

Fig. 2. Diagram representing the trade-off between the detection rate (DR) and the false alarm rate (FAR) associated with using a threshold T (vertical black line) to label pixels as logged and unlogged based upon the proportion of votes that each observation was predicted to be logged. The purple and yellow colors correspond to density plots for hypothetical logged and unlogged observations, respectively. Thus, the areas A and B are the portions of the observations from unlogged and logged pixels, respectively, that will be labelled as unlogged. Similarly, C and D represent the portions of the observations from logged and unlogged pixels, respectively, that will be labelled as logged.
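The rule used to label breakpoints as true or false detections can be written as a short sketch. It does not reproduce BFAST itself; it assumes a breakpoint date (or none) has already been estimated for each pixel. The function name and the specific season boundaries in the usage example (1 June to 31 October 2017) are assumptions for illustration.

```python
from datetime import date


def label_breakpoint(break_date, pixel_logged, season_start, season_end):
    """Label one pixel's breakpoint as a true or false detection.

    `break_date` is the estimated breakpoint date, or None if no breakpoint
    was found. A breakpoint counts as a true detection only when the pixel
    was logged and the break falls inside the logging season.
    """
    if break_date is None:
        return "no detection"
    in_season = season_start <= break_date <= season_end
    if pixel_logged and in_season:
        return "true detection"
    return "false detection"


# Usage with an assumed June-October 2017 logging season.
season = (date(2017, 6, 1), date(2017, 10, 31))
example = label_breakpoint(date(2017, 8, 15), True, *season)
```

Restricting the comparison to the logging season is what separates breakpoints plausibly caused by logging from those caused by other changes, such as the onset of the rainy season.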
Finally, we examined if the relationship between logging intensity and the rate of detections and false alarms was consistent between logging locations (i.e. a scattered subset of pixels in an area) and an entire region (i.e. all pixels within a bounding box). The timing of breakpoints was mapped for two 400 m × 400 m test regions within the Saraca study area (one logged and one unlogged). A limited number of small test regions were chosen because of the computationally expensive nature of the data export request in Earth Engine (e.g. two 1 km regions query >1 million records for export). Only breakpoints during the time period associated with logging were mapped (June-October).

Random forests classification of selective logging
The single image detection results for all sensors revealed that in order to obtain a sufficiently low false discovery rate (e.g. < 10%), the corresponding detection rates (DR) of selective logging were of almost no value (< 5-10%) for reliable forest monitoring. In general, the following results suggest that regions that have experienced selective logging do not show consistent differences from unlogged areas in the metrics we used for classification. The second analysis (section 4.2) therefore deals with detection of selective logging with time series data.

Sentinel-1
Random Forests detection performance for Sentinel-1 is shown in Fig. 3 (top). Both the detection and false alarm rates were close to 1 until the threshold exceeded ~0.4, meaning almost every pixel in an image would be detected as logged. This indicates little capability to distinguish logged and unlogged observations, with many unlogged observations misclassified as logged (Fig. S1). In general, the detection, false alarm, and false discovery rates (across the range of threshold values) were insufficient for reliable classification of selective logging with Sentinel-1 data at the intensities within our study areas (6-25 m³ ha⁻¹). For example, setting a threshold to achieve an FDR < 10% would yield a detection rate of ~5%, which would be of little practical value. Thus, attempts to strongly limit the false discovery rate (commission error of logged observations) would require a high threshold value and result in very few detections. Overall, this suggests that using single images from Sentinel-1 on their own to detect and map selective logging activities would be fraught with error with the classification approach used here.

RADARSAT-2
Random Forests performance for RADARSAT-2 is shown in Fig. 3 (middle). Both the false alarm rate and the detection rate rapidly declined as the threshold value was initially increased, again suggesting difficulty in distinguishing logged and unlogged observations. In contrast to Sentinel-1, RADARSAT-2 was less likely to label an observation as logged and very few observations had likelihood values above 0.5 (Fig. S2). It should be noted that the logging records that coincided with RADARSAT-2 data were from a single FMU that was relatively low intensity (10 m³ ha⁻¹). Consequently, the performance displayed here may not be a full appraisal of RADARSAT-2 capabilities at higher intensities. Overall, our results suggest that RADARSAT-2 data cannot be used to effectively monitor low-intensity selective logging activities using pixel-based differences between logged and unlogged areas. However, additional tests with data at higher logging intensities should be pursued.

PALSAR-2
Random Forests classification performance for PALSAR-2 is shown in Fig. 3 (bottom). In general, the performance of PALSAR-2 was marginally better at distinguishing logged and unlogged observations than RADARSAT-2 and Sentinel-1 (Table S5 and Fig. S3). However, the DRs and FDRs associated with higher threshold values would result in very high uncertainty and preclude reliable mapping of logging. Similar to RADARSAT-2, the selective logging data that coincided with PALSAR-2 imagery were from four relatively low-intensity FMUs (9-12 m³ ha⁻¹). It remains unclear if more data at higher logging intensities would improve classification performance with this sensor. For example, when the data from Sentinel-1 were restricted to just the low intensity sites used in the PALSAR-2 analyses, there was effectively no change in the rates of detection and false discovery compared to the results from all logging intensities with Sentinel-1 (Fig. S4 and Table S7). However, in contrast to C-band, L-band SAR is known to penetrate forest canopies and interact with forest structure (i.e. branches and stems). The marginal increase in performance compared to the C-band data hints that higher performance might be possible with sufficient data at higher intensities, but this needs further testing.

Sentinel-1 classification of high intensity logging
Detection performance of Sentinel-1 data for the highest intensity FMUs is shown in Fig. 4. Despite limiting the detection task to the most intensively logged FMUs (as well as unlogged observations), the detection rate and false discovery rate values were comparable to the results that used the full range of logging intensities. Instead, improvement in model performance was associated with better discrimination of unlogged observations (i.e. compare the commission and omission errors for the unlogged class between Tables S3 and S6). Essentially, the model was able to better identify unlogged forest, presumably because the more "confusing" observations (i.e. the low intensity FMUs) were absent and could not muddle the distinction between logged and unlogged observations (Fig. S5 and Table S6). Overall, our results suggest Sentinel-1 data cannot exploit classification of single image pixel-based differences to monitor selective logging activities with reasonable precision, even in the most intensively logged regions within the Amazon.

Sentinel-1 time series analyses
The two most important predictor variables from the Sentinel-1 RF model were the Sum Average metric (Haralick et al., 1973) on the VV and VH bands (Fig. S6, Eq. S1). A plot of VV sum average values through time at four randomly selected tree harvest locations at the Saraca site is shown in Fig. 5 and suggests selective logging decreased the value of this metric. In addition, histograms of the timings associated with all breakpoints at three FMUs, shown in Fig. 6, indicate that the breakpoints mainly occurred within the logging season for those FMUs logged above 20 m³ ha⁻¹. In contrast, the time periods associated with breakpoints at lower logging intensities were shifted toward the onset of the rainy season in late 2017 to early 2018; however, all FMUs showed an uptick in breakpoints associated with the rainy season (Fig. 6). This suggests that Sentinel-1 time series data could be used to detect and monitor selective logging activities from areas that have experienced logging close to the legal limit in Brazil (30 m³ ha⁻¹), particularly if the detection time-frame is narrowed to within the known logging season.
When the value of the VV sum average metric was monitored through time in pixels known to be logged and unlogged, the proportion of pixels with a significant breakpoint in their time series increased as the logging intensity of the FMU increased (Fig. 7A). Approximately 75% of logged pixels in high logging intensity FMUs had a breakpoint; however, roughly 10% of unlogged pixels showed a breakpoint in their time series (i.e. 10% false alarm rate). This false alarm rate was generally consistent through logging intensities approaching 15 m³ ha⁻¹ and suggests no signal in pixels logged at low to moderate intensities (Fig. 7A). When the breakpoints were assessed only over the time period associated with logging (to remove the false peak associated with the rainy season), the relationship showed a similar pattern whereby the FMUs logged at the highest intensities showed a large rise in breakpoints above a background false alarm rate that was relatively constant up through moderate logging intensities (Fig. 7B). At the highest intensities, the detection rate was >50% and the false alarm rate was almost zero. These results further support the idea that FMUs logged at low to moderate intensities do not show a distinct time series signal whereas FMUs logged at higher intensities do. Overall, this suggests that FMUs logged at intensities close to the legal limit within the Brazilian Legal Amazon (30 m³ ha⁻¹) should show a noticeable spike in the number of breakpoints within their time series above a background false alarm rate and this could be used to detect logging activities in the dry season.
Approximately 51% and 10% of pixels in the logged and unlogged test regions, respectively, had a breakpoint during the logging season (Fig. 8). These values are generally in agreement with our prior results from the subset of pixels where trees were removed (see Fig. 7B). While 51% of the pixels in the logged test region did not have a tree removed, selective logging is associated with forest disturbances that go beyond the individually logged pixels (e.g. canopy gaps, skid trails, logging roads, etc.) and additional detections are expected. Only about 5% of the pixels in the logged test region were actually logged; however, it is clear from the Planet imagery (Fig. 8C and D; Planet Team, 2017) that more than 5% of the forest patch was disturbed by logging activities. Given the false alarm rate was, at most, around 10%, the difference between detections and false alarms might represent a value comparable with the amount of forest disturbance expected at this intensity (i.e. about 40%).

Discussion
We present the first multi-sensor comparison of SAR data for monitoring a range of selective logging intensities in the tropics. We demonstrated that L-band PALSAR-2, C-band RADARSAT-2, and C-band Sentinel-1 data performed inadequately at detecting tropical selective logging when using single image pixel-based attributes for classification. However, when analyzing a time series of Sentinel-1 texture measures, logged pixels displayed a strong tendency for a breakpoint in their time series as the logging intensity of the FMU increased. Moreover, the timing associated with the identified breakpoint generally coincided with active logging at the highest logging intensities. Overall, our results suggest that Sentinel-1 data could be used to monitor the most intensive selective logging, but a time series approach would be required to detect change. A number of studies have used Sentinel-1 time series data to monitor deforestation (Reiche et al., 2018b), often in combination with optical data, but our study is the first to show it has some potential to be used to monitor selective logging.

Variable importance
In a number of cases the most important predictor variables from RF models involved the co-polarized channel (Fig. S1), despite the generally accepted view that the cross-polarized channel is best for detecting changes in forest cover (Joshi et al., 2016; Reiche et al., 2018a; Ryan et al., 2012; Shimada et al., 2014). The HH polarization of PALSAR-2 data has previously been shown to be sensitive to the early stages of deforestation, resulting from single-bounce scattering from felled trees (Watanabe et al., 2018). Our results support the idea that the co-polarized channel (for L- and C-band SAR) is useful and should not be ignored in forest disturbance detection analyses (e.g. Reiche et al., 2018a). While shorter wavelength SAR data, like C- and X-band, are known to be less sensitive to forest structure, because the radar signal mainly interacts with the forest canopy (Woodhouse, 2005; Flores-Anderson et al., 2019), the higher backscatter values in the co-polarized channel for all three sensors suggest predominantly rough surface backscattering from the forest canopy (as volume scattering generally results in roughly equal backscatter between co- and cross-polarized channels). This suggests that forest tracts subjected to more intensive selective logging than we studied (conventional logging permits with larger canopy gaps, large road networks, and many log landing areas) should possess a signal in the co-polarized channel that could be used to detect changes in canopy cover and should not be discarded (e.g. Reiche et al., 2018a).
Random Forests models offer an objective approach to selecting important variables for use in time series analyses. The Mean Decrease in Accuracy (MDA) variable rankings were used to select the sum average texture measures in the time series analyses. The detection rate was highest with the best variable (i.e. the one with the highest MDA), lower with the second best, and lower still with the third best metric, corroborating their rankings (see Figs. 7 and S6). SAR data often have fewer bands than optical data, so the choice of which metric to use in time series analyses may be more straightforward. However, many studies do not compare the results among metrics to select the optimal one, relying instead on speculation (e.g. Reiche et al., 2018a). Our findings suggest Mean Decrease in Accuracy is useful for variable selection, even if the Random Forests models themselves are of little practical use (e.g. Fig. 3).
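The idea behind an MDA ranking can be sketched as follows: shuffle one predictor at a time and record how much classification accuracy falls. This is a simplified illustration, not the procedure used in the study. Random Forests computes MDA on out-of-bag samples, whereas here a held-out split and a nearest-centroid classifier stand in so that the example needs no external libraries. The data and the names `metric_a`, `metric_b`, and `metric_c` are synthetic and hypothetical.

```python
import random


def fit_centroids(X, y):
    """Return per-class mean vectors (a deliberately simple stand-in model)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(centroids, row):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
    )


def accuracy(centroids, X, y):
    return sum(predict(centroids, r) == t for r, t in zip(X, y)) / len(y)


def mean_decrease_in_accuracy(centroids, X, y, n_repeats=20, seed=0):
    """Average drop in accuracy when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    drops = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in X]
            rng.shuffle(column)  # break the link between column j and the labels
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, column)]
            total += baseline - accuracy(centroids, X_perm, y)
        drops.append(total / n_repeats)
    return drops


# Synthetic data: one strong, one weak, and one uninformative predictor.
data_rng = random.Random(1)
names = ["metric_a", "metric_b", "metric_c"]
y = [i % 2 for i in range(400)]
X = [
    [2.0 * label + data_rng.gauss(0, 1),   # strongly informative
     0.5 * label + data_rng.gauss(0, 1),   # weakly informative
     data_rng.gauss(0, 1)]                 # pure noise
    for label in y
]

model = fit_centroids(X[:300], y[:300])
drops = mean_decrease_in_accuracy(model, X[300:], y[300:])
ranking = [name for _, name in sorted(zip(drops, names), reverse=True)]
print(ranking)
```

The top-ranked predictor would then be the first candidate for a time series analysis, with lower-ranked predictors run for comparison.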

Texture measures and detecting selective logging
In all cases the texture measures had the highest variable importance rankings (Fig. S6). This is consistent with previous results with optical data, where detection of selective logging relied on the contextual information embodied within their calculation (Hethcoat et al., 2019). Similar to their results, the predictions of logging in our test areas were spatially correlated, presumably a consequence of the spatial window used in the calculation. Again, however, extra detections are expected from the accompanying forest disturbances associated with logging. Yet, in the context of accuracy assessment, an issue that has not received much attention within the remote sensing literature is how to report selective logging detections in the absence of robust field data on canopy gaps, road networks, skid trails, log landing decks, etc. Others have shown that selective logging can be associated with 30-70% forest disturbance, despite the proportion of pixels where a tree was removed being closer to 10% (Asner et al., 2002, 2004; Putz et al., 2019), depending on the intensity and logging practices (reduced impact versus conventional). Clearly Fig. 8A shows false detections associated with the breakpoint detections, but some of the detections that do not occur at a tree location correspond with canopy gaps seen in the Planet imagery.

Fig. 7. The relationship between the proportion of observations within a Forest Management Unit (FMU) that had a breakpoint identified within their Sentinel-1 VV sum average texture measure time series and the logging intensity of the FMU. The proportion of all observations (A) and the proportion that had a breakpoint that coincided with the logging season (B) are shown separately. The circle size corresponds to the number of observations at each FMU and yellow, green, and purple colors represent the Saraca, Jamari, and Jacunda sites, respectively. See the supplementary material for the same analyses with the second and third best metric from Random Forests (Fig. S7).
While the texture information clearly helped with detection of selective logging, a coherent understanding of what the sum average metric means is still lacking, both in terms of characterizing forest disturbances from selective logging and in terms of the structural changes to forests associated with increasing and decreasing values. Attempts to generalize and interpret the meaning of textures have proven difficult over the years. However, some have suggested that high values in measures like variance, dissimilarity, entropy, and contrast were associated with visual edges, whereas average, homogeneity, correlation, and angular second moment were associated with subtle irregular variations from continuous regions like forests or water (Hall-Beyer, 2017). More work is needed to understand the interpretation of texture measures that are so often employed in remote sensing classifications.

Fig. 8 (caption, partial). Logged tree locations are black crosses and the date of the breakpoint for each pixel is colour coded by week, with white representing no breakpoint. Planet imagery (3 m) from 28 August, 2017 overlaid with and without breakpoint locations (C and D) for the logged area (trees in white). Approximately 51% and 10% of the pixels in the logged and unlogged regions had breakpoints, respectively.
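For readers unfamiliar with the sum average measure, the following minimal sketch computes it from a grey-level co-occurrence matrix (GLCM) using the standard Haralick definition, the sum over grey-level pairs of (i + j) p(i, j) with 1-based grey levels. The 3 x 3 window, four grey levels, and single horizontal offset are assumptions made for illustration; the window size and quantization used in the study are not stated in this section.

```python
def glcm(window, levels, dx=1, dy=0):
    """Symmetric, normalised grey-level co-occurrence matrix for one offset.

    window is a 2-D list of integer grey levels in the range 0..levels-1.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def sum_average(p):
    """Haralick sum average: sum of (i + j) * p(i, j), grey levels 1-based."""
    n = len(p)
    return sum((i + 1 + j + 1) * p[i][j] for i in range(n) for j in range(n))


# Two toy windows: a uniform one and one with a single differing pixel.
uniform = [[3, 3, 3], [3, 3, 3], [3, 3, 3]]
one_changed = [[3, 3, 3], [3, 0, 3], [3, 3, 3]]

sa_uniform = sum_average(glcm(uniform, levels=4))
sa_changed = sum_average(glcm(one_changed, levels=4))
print(sa_uniform, sa_changed)
```

Because every pixel in the window contributes to the value, a single changed pixel shifts the measure for all windows that contain it. This is consistent with the spatially correlated detections attributed above to the spatial window.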

Combining sensors for classification
We chose not to combine any of the data types used here, partly because the inconsistent spatial and temporal coverage precluded such an analysis, but also because we wanted to assess the detection capabilities of each sensor on its own. Methods that combine data from multiple sensors (other SAR platforms and/or optical data from Landsat or Sentinel-2) would likely perform better, in line with results for monitoring deforestation (Reiche et al., 2015a). Indeed, prior work with Landsat data has shown strong detection of selective logging at similar intensities (Hethcoat et al., 2019), yet this work sought to establish a baseline with the available SAR sensors. Advances in detecting subtle forest disturbances from spaceborne SAR will likely require time series, polarimetric, and data fusion approaches, particularly in light of our finding that pixel-based differences between logged and unlogged areas using SAR backscatter alone are insufficient for effective detection.

Longer time series in the tropics
Sentinel-1A began acquiring imagery regularly (approximately every 12 days) in late 2016 for most of Brazil, with Sentinel-1B following in late 2018. Consequently, a time series assessment was only possible for a single calendar year (roughly 2017) with the logging data sets we had access to. The BFAST algorithm is flexible and can be tuned with a baseline period if sufficient data are available, enabling assessments of longer and more variable time series (Verbesselt et al., 2010). The limited time series available is likely the reason many breakpoints for the less intensively logged sites occurred in December, presumably with the onset of the rainy season in earnest and an increase in backscatter associated with moisture (Hoekman et al., 2020). Our analysis, however, was limited to a simpler test of one or no breakpoints. Future work should explore how longer time series might improve detection of lower intensity logging, where seasonal patterns in backscatter can be established as a baseline to help reduce false alarms.
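A test of "one or no breakpoints" can be illustrated with a deliberately simplified stand-in. The sketch below is not the BFAST algorithm, which fits season and trend components and applies formal structural change tests. It searches for the single split that best separates a series into two constant-mean segments, and reports a break only if that split removes a large share of the variance. The threshold `min_reduction`, the minimum segment length, and the synthetic 30-acquisition series are all hypothetical.

```python
def best_single_break(series, min_segment=5):
    """Return (index, fractional SSE reduction) for the best mean-shift split."""

    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    sse_none = sse(series)
    best_idx, best_sse = None, sse_none
    for k in range(min_segment, len(series) - min_segment + 1):
        split_sse = sse(series[:k]) + sse(series[k:])
        if split_sse < best_sse:
            best_idx, best_sse = k, split_sse
    reduction = 0.0 if sse_none == 0 else (sse_none - best_sse) / sse_none
    return best_idx, reduction


def detect_break(series, min_reduction=0.5, min_segment=5):
    """Index of the first post-break observation, or None if no break."""
    idx, reduction = best_single_break(series, min_segment)
    if idx is not None and reduction >= min_reduction:
        return idx
    return None


# Roughly one year of 12-day acquisitions (30 observations), synthetic values.
stable = [(-1) ** i * 0.1 for i in range(30)]
disturbed = stable[:18] + [-1.0 + (-1) ** i * 0.1 for i in range(12)]

print(detect_break(stable), detect_break(disturbed))
```

With a longer record, a seasonal baseline could be fitted first and the break test applied to the residuals, which is the direction suggested above for reducing false alarms around the onset of the rainy season.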

Conclusion
Tropical selective logging is fundamentally connected to global climate, biodiversity conservation, and human well-being (Lewis et al., 2015). Selective logging is often the first disturbance to affect primary forest (Asner et al., 2009), with road networks and ease of access facilitating further disturbances (e.g. increased fires, hunting, or illegal logging). Efforts to detect and map selective logging with Sentinel-1, because of its global coverage and anticipated continuation missions (i.e. Sentinel-1C and D), are urgently needed to understand the capabilities this data stream might offer in advancing detection of tropical selective logging activities. With the successful launch of SAOCOM 1A and 1B in late 2018 and early 2020, the planned continuation of Sentinel-1 (with C and D), the opening of the ALOS PALSAR-1 archives, and the anticipated launch of NISAR in 2022, an immense volume of freely available C- and L-band SAR data will, hopefully, usher in a new era of forest monitoring from space. Our findings suggest that time series methods should be effective at detecting the most intensive selective logging in the Amazon with these data sets. Moreover, if a distinct dry season is characteristic of the study region, focusing on detection during this time frame can bolster detection accuracy by removing false positive detections associated with seasonal rainfall.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.