Application of a Soil Erosion Susceptibility Model Using Unmanned Aerial Vehicle Photogrammetry in a Timber Harvesting Area, South Korea

operation road were determined to be erosion-susceptible areas in the ES map of the XGB model.


Introduction
(7) In these studies, soil surface deformation (SSD) was detected and monitored in timber harvesting areas.13) A recent study revealed that the UAV system has the capability of detecting SSD in a timber harvesting area. (7)The UAV photogrammetric approach was utilized to acquire point cloud data (PCD).2D images were collected monthly by UAV surveys and processed in structure from motion (SfM).The airborne PCD were georeferenced to align all acquired PCD by correcting the coordinate data of stump installed ground control points (GCPs).Hence, to determine the PCD alignment, the elevation difference per pixel was calculated by subtracting the digital surface models (DSMs) acquired monthly.The height difference indicated the SSD acquired in the timber harvesting area, and 3D SSDs were validated precisely with ground truths to monitor the seasonal effects of timber harvesting.
Factors such as soil texture, precipitation, vegetation, topography, and surface cover are related to displacement generation, as revealed by the revised universal soil loss equation utilized in previous studies. (14,15)(21) These previous landslide studies sampled landslides from 3D images derived from remote sensing technology, which were used for independent variables such as terrain, vegetation, and land use maps to classify landslide-susceptible areas from meter-class resolution images.(24) Vegetation covering the soil surface is also related to landslide occurrence by mitigating rainfall energy and was used as the normalized difference vegetation index (NDVI) in landslide susceptibility models. (24,25)Moreover, classification maps such as soil texture and land use maps were utilized as nominal variables to assess erosion vulnerability caused by human activity, which may accelerate landslides.
)(28)(29)(30) Landslide modeling studies have utilized landslide samples for supervised learning techniques, such as random forest (RF), (27,31) extra gradient boost (XGB), (32) logistic regression (LR), (30) support vector machine, (26) and artificial neural network (ANN). (22,24,28)uring the classification process, the sampled landslide cells were trained and tested as target data, also known as independent variables, and the variables from the terrain, vegetation, and cover maps were prepared and tested as dependent variables.Moreover, to overcome the overfitting issues of the micro-target data, cross-validation methods were utilized in the training and testing processes to enhance the classification performance.Subsequently, hyperparameters were tuned to develop a model suitable for the dataset.The overall accuracy (OA) was derived from the classification results of the confusion matrix, while model performance was verified by using F1 and the receiver operating characteristic area under the curve (ROC-AUC) scores. (21,27,31)Related studies mapped the classification results to a GIS environment in the final stages.
Despite the challenging tasks of detecting micro-SSDs from UAV-derived data, (7) we performed additional steps to classify the erosion susceptibility (ES) pixels from the UAV photogrammetric DSM.In this study, we investigated the feasibility of classifying micro-SSDs from 3D soil surface data by developing a classification model with an ES threshold of 5 cm.Moreover, the processes of training and testing millions of 3D data were optimized by tuning the dataset quantities and splitting the datasets in the stratified cross-validation (SCV) process.Finally, the performance characteristics of the developed models were compared and subsequently mapped to the highest-performing model.

Study area
The timber harvesting site, situated in the research forest of Kangwon National University, Republic of Korea (37°46′34.4′′N, 127°49′41.1′′E; Fig. 1), was cleared before employing UAV photogrammetry to detect SDD in the canopy-opened site.In March 2022, timber was harvested on a 3-ha total area.The area has a temperate climate, with summer (June to August) being the wettest season.During the study period, the monthly precipitation averaged 300 mm during the heavy rainy season, and according to the climate, the wettest months were June to August. (33,34)ith elevations ranging from roughly 508 to 628 m above sea level, the site has an average slope of 47%.According to American soil taxonomy, the region's soil is dark brown sandy loam and the soil type corresponds to the Mui series (coarse loamy, mixed, Typic Humudepts).Logging trails were formed at the center and on the right side of the study area.The area's surface is covered by rocks, logging waste, forest soil, and slightly less flora from March through June.The steep slopes and abundant rainfall at the research site provide optimal conditions for runoff and SSD.

2D image collection from UAV system on timber harvesting site
Prior to conducting UAV surveys, GCPs must be installed in order to georeference the 3D data.Owing to the difficulty in finding undeforming objects in the forests, we prepared 40 × 40 cm Fomex texture plates for GCPs to install at stumps.A total of 29 GCPs were installed at recognizable stumps; hence, the Trimble R12i global navigation satellite system (GNSS) collected the center coordinates of the installed GCPs.Parallel to the GCP installation, we also prepared ruler-attached polyvinyl chloride pipes for validation points (VPs).A total of 24 VPs were installed to validate the DSM of difference (DoD), which is used as a dependent variable in ML models.
Aerial 2D images were collected using Matrice 300 (7 kg) of Da Jiang Industry for the platform and Zenmuse H20T (0.82 kg) for the sensor.To avoid risks of crashing with slopes during airborne 2D image collection without real-time kinematic, a GPS-based vertically parallel flight method was performed to acquire a high-resolution DSM of steep slopes. (7)The overlap, margin, and flight speed for automatic UAV flights were set at 90% for the side, 80% for the front, and 5 m/s, respectively.In particular, considering the slope degree of the study area, the flight height was set at 100 and 140 m.All of the UAV surveys were able to collect more than 200 images per survey for the entire study site.

Variable generation for ES model
To analyze 3D data using a ML model, it is necessary to generate terrain variables from the 3D DSM.In previous gully erosion and landslide susceptibility studies, terrain variables were generated from space-and airborne-derived DEMs, and the models developed from preceding studies were able to accurately classify macro-SSDs (e.g., gully erosion and landslides). (21,22,28,35,36)owever, the classification modeling in this study should be performed with centimeter-class resolution to classify micro-SSDs.Thus, the UAV photogrammetric method was utilized to acquire centimeter-class resolution.In this process, all the images collected by the automatic flight method were aligned with the embedded coordinate data from the collected images in Agisoft Metashape Professional version 1.5.1 (Agisoft LLC., Petersburg, Russia).Hence, images generate the PCD through the Sf M algorithm.The Metashape software provides photogrammetric options that can be customized for each process (Table 1).
In photo alignment, feature points were used to calculate the correlation between images through SfM, and tie points and depth maps were generated.In dense cloud generation, PCD are generated from tie points.
Each monthly acquired PCD must be aligned at the lowest possible spatial root mean square error (RMSE) for SSD calculation.(39) The coordinates collected from the centers of all 29 GCPs using GNSS surveyed during fieldwork were imported into the PCD in shapefile format.These GCPs were used to apply the coordinate data and validate the spatial error throughout the process.
The height difference per pixel was calculated using ArcGIS Pro (ESRI Inc., Redlands, CA, USA) from spatially aligned DSMs.In ArcGIS Pro, the raster calculator tool was used to calculate Z (height).In this method, the pixels from the pre-DSM of June 10, 2022 were subtracted from those of the post-DSM of July 9, 2022, and the resolution was considered during the calculation.
As DoD was calculated from the height difference per pixel, validation from DoD was required.To validate the precision of SSD in DoD, the height values at each installed VP were compared with the point values that appeared in the DoD file.The coordinate data acquired from VPs by GNSS were imported as points in shapefile format on the DoD map.The pixel values calculated using the DoD method were then compared with the ground truth data recorded from the field survey and calculated as the RMSE for their assessments.Moreover, the data conditions of slope degree, precipitation, and alignment error revealed that the most precise and understandable average erosion of the total monitoring was the average erosion height from DoD between October and September.According to the calculation of DoD from October to September, the average erosion height was revealed to be 5.13 cm.Thus, the threshold of the ES height for reclassification was set at 5 cm, which resulted from the average erosion level in DoD from October to September. (7)o predict ES from terrain 3D data, the DSM must be converted to its morphological features for analysis.This process was conducted using ArcGIS Pro, and the slope, aspect, TWI, profile curvature (PRC), plan curvature (PLC), and TRI were calculated using the DSM.However, unlike the other independent variables calculated using ArcGIS tools, TWI and TRI should be generated using semi-manual calculations.TWI indicates the water flow on upper slopes, which may contribute to erosion.Therefore, the equation for TWI is where α is the contributing upper slope pixel and β is the slope gradient of the neighboring pixels of the slope.Moreover, TRI indicates the morphological effect of water flow.In ArcGIS, the maximum, mean, and minimum of each input pixel by neighboring pixels of the input pixels in the DSM were antecedently calculated using the "focal statistics tool."Hence, the TRI equation is where FS Mean , FS Minimum , and FS Maximum are the mean, minimum, and maximum elevations of the focal statistics, respectively.All the terrain variables were masked and adjusted appropriately for exact pixel numbers and values at appropriate pixel locations.Subsequently, all maps were exported in the raster format to generate the data frame.
Georeferencing was conducted using GCPs in the Metashape environment.Initially, the GCPs were manually registered in the PCD, and the program automatically selected the corresponding 2D images.The exact centers of the GCPs were manually selected from the 2D images.The center points were manually adjusted and deleted, where relevant, before calculating the GCP centers (distance in cm), which represented the GCP georeferenced errors from June, July, September, and October.

Development of ES model using a classification algorithm
All dependent and independent variable maps were uploaded to the R environment for stacking.The stacked pixel data from the maps were extracted and transformed into the CSV format.Moreover, the datasets for ES analysis were tested by setting an ES threshold of 5 cm for each pixel of the data.The datasets generated from the stacking maps were transferred to the Python 3.10.6environment for faster analysis using the Numpy package.In total, 34088223 stacked pixel data points were imported from the total dataset (10% of the total data), scaled using the min-max method, and randomly sampled without replacement for training and testing.The sample data were trained and tested using the SCV in each of the three classification algorithms during the training and testing processes (Table 2).First, in the SCV process, three sets of trials (5, 10, and 100 splits) were trained and tested using the ML-and statistic-based models.Subsequently, classification models were developed, and the performance of each model was verified using precision and recall values.In the binary classification process, ML must include an algorithm for classification.To classify a large amount of 34-million-pixel DoD SSD data, which is convolutional and has small terrain features in contrast to landslides, strong and precise ML algorithms such as RF and XGB are required.In addition, to assess the applicability of ML models in ER analysis, a statistical algorithm, LR, was used to compare the performance characteristics of ML-based models.

Assessments of ES models
(42) The OA, recall, and precision were calculated using the confusion matrix derived from the classification results (Table 3).OA was calculated as the sum of true negative (TN) and true positive (TP) divided by the sum of TN, TP, false negative (FN), and false positive (FP).Recall and precision were calculated as F1 scores from the harmonic means.ROC was calculated for each classification result and performance using line plots to confirm whether the results were fitted correctly.The TP rate (TPR) and FP rate (FPR) were used to plot ROC with a size of 1 × 1, and AUC was calculated to determine the success rate of the ES models.Finally, all the classification results were applied to the coordinate data and mapped using ArcGIS Pro for visualization.

Dataset for training and testing
From the UAV surveys conducted on June 10 and July 9, 235 and 233 2D images were collected, respectively.The PCD for June and July were generated from the UAV-derived 2D images via a photogrammetric process using SfM.These PCD were georeferenced using 29 GCPs to align the June and July DSM and then filtered manually using the Agisoft software.The alignment of each PCD was processed from the GCP centers of the PCD with a spatial error of 11.1 cm.In the final data processing, DSMs of <2.7 cm resolution were generated from the preprocessed PCD for June and July (Fig. 2).From the validation of the DoD map, the precision of SSD was calculated for 24 VPs with an RMSE of 8.8 cm, which was the difference between the 3D data and ground truth measurements.
The DoDs of July and June were first calculated as numerical data from the process.Hence, to classify ES in the algorithm, the maps were reclassified according to the threshold of 5 cm [Figs.3 and 7(b)].The ES model used the reclassified DoD as the dependent variable.Parallel to the independent variable generation, six terrain variables (slope, aspect, TWI, PRC, PLC, and TRI) were generated from the DSM of June and stacked for ES analysis (Fig. 4).

Model comparisons
Classifications were performed with 10000 samples during training and testing attempts using 5-, 10-, and 100-split SCV sets.In the 5-split SCV set, the XGB model could not appropriately classify erosion-susceptible cells from the total target data.Moreover, because none of the erosion-susceptible cells accounted for most of the target data, overfitting issues occurred mainly in classifying erosion-susceptible cells as none.These issues constantly occurred in the 10-split SCV set, where the early classification performance from the AUC was lower than 0.6, and the performance was not enhanced at the end of the SCV process.Furthermore, the 100-split SCV set showed that the final model performance was slightly improved by 0.03 compared with the 5-split SCV set.
The SCV results of XGB from 10% of the total dataset showed a significantly different performance from the 10000 samples, and the AUC scores increased by almost 0.1.Moreover, the AUC results from the 5-, 10-, and 100-split SCV sets showed no significant difference in model performance; thus, the SCV process was almost meaningless (Fig. 5).
The 10% sample dataset of RF and LR also showed a significant difference in AUC compared with the 10000 samples.The AUCs from the 5, 10, and 100 splits of both models showed no significant differences in the training and testing of 10% of the sampled data (Fig. 6).However, the processing durations of RF and LR were significantly longer than that of the 10000-sample model at 14 and 21 h, respectively.
The OAs of all models were more than 64% for classification accuracy; however, by confirming the F1 score, overfitting issues were found in the developed models.The confirmed F1 scores for the RF, XGB, and LR groups were 0.53, 0.54, and 0.49, respectively.On the basis of the confirmed issues with the confusion matrix, the erosion-susceptible values were classified as much lower than the non-ES values, revealing that all the models had difficulties in classifying erosion-occurring cells.Despite a classification accuracy of 64% for the LR model, it was confirmed that the LR model had a lower F1 score than the RF and XGB models.The SCV durations for each model were 14, 2, and 21 h for RF, XGB, and LR, respectively.As expected, XGB had the shortest duration for the total analysis, whereas LR had the longest duration.Moreover, the differences in AUC between XGB and RF and XGB and LR were 0.03 and 0.07, respectively [Figs.5(a) and 6].XGB was confirmed appropriate for ES analysis and mapping by comparing the three developed models.

ES mapping of study site
On the basis of the model performance, mapping was conducted with the best ES model-XGB.Because overfitting issues were focused on none of the erosion-susceptible cells, none of the erosion-susceptible areas were determined more from XGB than the reference map, which is DoD (Fig. 3).The lower slope of the site was determined by erosion-susceptible areas much smaller than that in the reference map.However, the XGB model could determine the erosionsusceptible areas at the edges of the operational roads in the center and on the left side [Fig.7(a)].The left side of the map was also determined to show erosion-susceptible areas where the surfaces were covered by rocks [Fig.7(a)].The total erosion-susceptible area was calculated to be approximately 40 m 2 , which was different from the reference erosion area of approximately 104 m 2 (Fig. 7).Furthermore, in the ES analysis of the micro-SSD, XGB could determine the wheel tracks detected on the reference DoD (Fig. 8).

Dataset for training and testing
Herbaceous plants grew throughout the monitoring period.Despite the comparable alignment errors in the DSMs from July and June to the steep slope studies, the verifications of SSDs at each plot were only available in the DoD from July to June because of vegetation growth.Moreover, distortion occurred in the DSM on July 9, which was shown as a georeferencing error of 23 cm and revealed to be relatively high compared with the georeferencing error of 12 cm on June 10 th . (7)These spatial issues may not be comparable to the landslide susceptibility studies because the sampling methods of those studies were not derived from DoD but from landslide cells. (24,43)Thus, this relatively high spatial error should have caused AUC to remain at 0.6, even in the 100-split SCV set tests with 10% of the total data.Moreover, we assume that DoD quality is essential for enhancing model performance.
The use of independent variables in this study was based on a case study of landslide susceptibility.In the light detection and ranging (LiDAR) data-based landslide susceptibility case studies, variables were generated from a LiDAR-derived DEM. (22,44)The DEM generated six terrain-independent variables: slope, solar radiation, PRC, PLC, TWI, and upslope drainage area.It was difficult to compare the performance characteristics with validated scores (i.e., AUC), similar to the landslide susceptibility modeling studies, because the AUC from the ANN model was not shown in this study.However, this approach may be called the DEM and DSM generated from the PCD because the LiDAR and photogrammetric data have the same features, resulting in a DSM.Moreover, terrain-independent variables were also generated from the 3D model, comparable to the independent variables in this study.
)(47) However, owing to the classification of the ES of the 2.5-ha area of the photogrammetric data, utilizing the vegetation, land cover, and distance features has limitations in generating features from 2D images.Moreover, it is impossible to generate precipitation features because the measurement from the station covers almost a kilometer, which is incomparably overscaled from our site area of 2.5 ha.

Performance characteristics of ES models
The XGB models could not perform the classification well compared with the landslide susceptibility models categorized in the ML approach study. (42)The AUC performance from the recent landslide study was presented as AUC and OA of 0.86 and 78% for the k-nearest neighbor model, 0.87 and 79% for the ANN model, and 0.89 and 80% for the RF model, respectively, which are comparably higher than those of our XGB models.However, these preceding models consisted of 4-18 more variables than did the XGB model.Furthermore, the variables used in the previous study were mainly NDVI, land cover, lithology, distance maps from roads, and hydrology.However, the UAV photogrammetry process used in this study did not generate these variables.Thus, it is more appropriate to compare our attempts with the terrain variables derived from the LiDAR-DEM study. (22,44)The DEM-adopted studies attempted to investigate landslide susceptibility using slope, TWI, PRC, PLC, and solar radiation variables.The model performance is not shown for the variables derived from the LiDAR-DEM.
Although the resolution of the original data used in the ML studies was meter-class, MLbased spatial classification studies were not able to obtain excellent validation scores. (22,44,48)hus, the resolution of the 3D data may not be the reason for the low model performance.However, according to Lidberg et al., (48) who varied the resolution of the variables, the developed model showed a significant difference in terms of variable importance.In this study, we showed that the 24-m-resolution DEM-based TWI impacted the OA of the model by ~50%, whereas the 48-m-resolution DEM-based TWI showed 30%, which is lower than that of the 24-m-resolution DEM-based TWI.From the comparison of our model with that of Lidberg et al., (48) the results may indicate that the variables derived from the 3-cm-class DSM used in this study may have impacted the model performance.We suggest that further research should be conducted to confirm the applicability of micro-resolution DEM-derived variables in the training and testing of high-resolution ES models.
Consequently, the research direction of the ES analysis in this study was appropriate because the terrain variable selections and generations were similar to those in previous studies.Despite utilizing the SCV method, the limitation of this study was that the variables used in the classification process were insufficient for analyzing ES (AUC of 0.63).Thus, the AUC of XGB did not correspond to the acceptable model standard, which required an AUC of 0.7 or higher.It may also be speculated that the number of target erosion-susceptible cells was too small to be trained and tested using the algorithm, even when a powerful algorithm was used in the model.
The LR model, which is a statistical model, could not appropriately process binary classification from the trained target data.The average AUC from the LR model was 0.56, a poor model for ML studies. (42,46)This result may be attributed to the characteristics of the LR algorithm.Previous ML studies revealed that single-regulated training and testing in the LR algorithm may not be appropriate for verifying a large dataset.The RF models, which are ML models, could process binary classification better than the LR models because of the multidecision tree process in the RF algorithm.(51) However, a comparison of the LR and RF performance characteristics from previous studies revealed that the RF results showed a higher performance than the LR results.(53) From this and previous studies, it may be assumed that the LR algorithm processes the dataset only once by regulating the classification process in the sigmoid function.On the basis of this characteristic of the LR algorithm, we assumed that the LR model has a theoretical limitation in classifying a large spatial dataset compared with the RF model.
The duration issues were probably due to the characteristics of the RF and LR algorithms.In the theoretical approach of each algorithm in RF studies, the algorithm processes a multidecision tree at each bagging process without a boosting process, unlike the XGB model.Moreover, previous studies have shown that XGB has a higher model performance than RF.Thus, in comparison with this study and related studies, we assumed that the boosting process in the XGB algorithm may have resulted in a higher performance in terms of the duration and AUC of the XGB model than the RF model in large spatial data processing.

Erosion-susceptible areas classified by XGB model
Here, we were able to perform the entire process of data acquisition by UAV photogrammetry for the prediction of ES using ML-based classification models.The threshold was based on the average erosion obtained in the precise DoD map for September to October. (7)In a previous study, the DoD map for September to October was acquired with the lowest alignment error of 3 cm using 24 GCP centers.Moreover, the reliable average one-month erosion level was calculated as 5.13 cm on the total slope.Thus, a threshold of 5 cm was appropriately considered on the basis of the DoD for July to September. (7)The XGB model appropriately analyzes and detects high-ES regions related to topographic features similar to landslide models. (22,27,54)However, the AUC of our XGB model was lower than that of previous studies.We assumed that spatial distortion occurred during the data acquisition in July.
Moreover, the training and testing samples were derived from the total area, which differed from the landslide samples.The sampling method used to classify landslide susceptibility was based on selecting areas where landslides occurred and did not occur, which differs from that used in this study.Thus, we hypothesize that the sampling method can be improved by selecting specific erosion-occurred cells for appropriate samples in future studies.

Conclusions
In this study, we mainly investigated the feasibility of classifying ES, similar to the results from DoD.Although DoD was calculated from the high alignment error of the DSM to develop the ES model, ML models could be used to classify the ES from the terrain variables to the target data.The ML models, XGB and RF, performed better than the LR statistical model when comparing the AUC and time duration results.However, utilizing the SCV method for classifying the target data did not enhance the model at an acceptable performance level.Moreover, despite using 10% of the total data, the AUCs from XGB and RF were only more than 0.6, which is unsuitable for classification model performance and should be increased for necessary applications.We concluded that our method may not be practically used yet for immediate forest management or establishing efficient forest management practices.Regardless of the confirmed issues, XGB mapping sensitively classified the actual erosion-susceptible areas of the wheel tracks and the edges of the logging tracks at the study site.In future work, we assume that the model should be enhanced by adding vegetation maps from multispectral sensors and a lithology map from soil texture analysis.Moreover, the target data should be significantly improved by utilizing real-time kinematic to reduce alignment errors and distortions or enhanced sensors such as LiDAR.

Fig. 1 .
Fig. 1. (Color online) Location of timber harvesting area of experimental forest of Kangwon National University.

Fig. 2 .
Fig. 2. (Color online) DSM of June 10 and July 9 from UAV photogrammetry method.DSM derived from the UAV survey on (a) June 10 and (b) July 9.

Fig. 3 .Fig. 4 .
Fig. 3. (Color online) DoD calculated from the subtraction of DSM acquired in July and June.

Fig. 7 .
Fig. 7. (Color online) Comparison of XGB and target data.(a) DoD by mapping of enhanced XGB model.(b) DoD map of ES threshold at 5 cm.

Fig. 8 .
Fig. 8. (Color online) Identification of ES at wheel and logging tracks in the timber harvesting area from the XGB map.

Table 1
Parameters used in each process to generate 3D data.

Table 2
Hyperparameters utilized in classification models.