Assessing the reforestation effects of plantation plots in the Thai savanna based on 45 cm resolution true-color images and machine learning

In forest restoration projects, assessing progress is critically important. This study was conducted to quantify the restoration progress for approximately 30 year old plantation plots of Acacia and Pterocarpus in Sakaerat, Thailand. For 22.5 × 22.5 m areas in the plantation plots, as a continuous measure of forest restoration, dry evergreen forest-likelihood was determined using 45 cm resolution satellite images and an artificial neural network architecture. The Acacia plot achieved a moderate mean evergreen forest likelihood of 0.451 (April 2016) to 0.679 (May 2019), but the values for the Pterocarpus plot were 0.076 or smaller. Two other Pterocarpus plots at cooler and moister sites had mean evergreen forest-likelihood values of up to 0.884 (April 2016), which were significantly greater than that (0.451) for the Acacia plot, but the values dropped in May 2019. Throughout the period of the four image acquisition times from March 2014 to May 2019, the plantation plots had significantly smaller evergreen forest-likelihood than the dry evergreen forest did. The current approach would be a helpful option for stakeholders of forest management by applying sub-m-resolution images and machine learning followed by quantification of the forest restoration effects relying on the appearance and texture of tree canopies at multiple data acquisition times.


Introduction
Forest ecosystem conservation is one of the goals of sustainable socio-economic development [1]. For sustainable development, retaining or restoring forest ecosystem services, such as flood prevention, is an important challenge [2]. Assessing forest ecosystem restoration is important in driving forest management projects [3] because long-term management requires interim assessments for fine tuning of the projects [4]. Various methods are used for assessing forest ecosystem restoration. Taking a plant community census [5], observing animals [6] and measuring soil properties [7] are commonly applied in situ methods, and remote sensing of forests is also widely applied. Camarretta et al [8] systematically reviewed multiple aspects of remote sensing for forest restoration monitoring such as passive and active sensors. The observed aspects include plant community structure, diversity and complexity. The advantages of remote sensing, typically reliant on satellites, are the minimal cost and broad geographic coverage, while the disadvantages are the limited number of observed aspects and the lack of in situ data [9].
An important challenge for remote sensing of vegetation, including forests, has been to accurately classify different vegetation types. Accuracy has been improved by relying on certain factors such as spatial resolution. Sensors carried by some satellites can acquire high-resolution image data while covering wide areas, and the high resolution may overcome limitations caused by low resolution [10,11]. For example, the Landsat 8 satellite provides 30 m resolution images in which single pixels represent 30 × 30 m square areas. The 30 × 30 m square areas contain objects such as canopies with diameters of typically up to several meters. These objects may be precisely detected by sub-m resolution sensors such as that carried by the WorldView-3 satellite. Thus, by adding precise information on particular objects within the 30 × 30 m square areas, more precise differences among these areas can be detected when compared with the 30 m resolution sensors [12]. Li et al [13] found that true-color Google Earth images with 0.35-0.60 m resolution improved the accuracy of classifying different landcover types when used with 30 m resolution Landsat image data.
For accurate classification of different vegetation types, in addition to high-resolution images, machine learning has emerged as a helpful tool [14]. Today, various machine-learning architectures are available. Support vector machines, extreme gradient boosting, random forests and deep learning have been successfully applied in the classification of vegetation types [15]. Previous studies have reported successful combinations of high-resolution images and machine learning in the classification of vegetation types [16].
These successful examples of classifying vegetation types suggest the possibility of generating continuous measures for indicating the status of forests, such as the current stage of deforestation/reforestation.
Thus, the combination of high-resolution images and machine learning is expected to help the stakeholders of forest management by providing continuous measures of forest status. So far, a limited number of continuous measures of forest status have been developed. Using a remote sensing method of laser imaging detection and ranging in combination with machine learning, maps of the vegetation heights of a west Slovenian region [17] and Texas [18] were made. Recently, Wagner et al [19] efficiently detected Cecropia trees in Brazilian forests by using 30 cm resolution true-color images acquired by the WorldView-3 satellite and applied a machinelearning architecture [20] to quantify the disturbance intensity of forests based on the distribution of Cecropia.
This study was conducted to quantify the effects of plantation tree species on ecosystem restoration in Sakaerat, Thailand where the natural vegetation is tropical dry evergreen forest [21]. True-color images of the plantation plots at 45 cm resolution were split into 50 × 50 pixel images. For these images, tropical dry evergreen forest-likelihood was determined by relying on an artificial neural network architecture that generated an evergreen forest-likelihood measure. Values of dry evergreen forest-likelihood for the plantation plots and the dry evergreen forest were compared to values of the normalized difference vegetation index (NDVI) [22] as a well-established reference measure of live green vegetation. Then, unexpectedly poor restoration effects at a plantation site were found, and the causes were further analyzed by involving the above approach and the normalized difference water index (NDWI) [23] as a measure of moisture in plant biomass. To the best of the author's knowledge, this is the first study to report the intensity of forest ecosystem restoration by using a continuous measure generated by high-resolution images and machine learning based on the appearance and texture of canopies.

Methods
The entire workflow is shown in figure 1. The following subsections describe the details.

Site description
The Sakaerat biosphere reserve in Wang Nam Kiao district, Nakhon Ratchasima, Thailand (14 • 30 ′ N, 101 • 55 ′ E) has an area of 41 815 hectares [21,24] (figure 2). The climate is classified as savanna [25] and the dry season generally starts in November and ends in late April. At a site 389 m above sea level, there is a management office [26] where the mean annual temperature is 26 • C. Precipitation at the closest meteorological station was made available by the Royal Irrigation Department, Ministry of Agriculture and Cooperatives, Thailand (figure 3). The biosphere was established in 1967 for dry evergreen forest conservation and reforestation experiments. At that time, most of the area had already been disturbed by human activities, primarily slash and burn shifting cultivation [27]. The natural vegetation type is tropical dry evergreen forest mainly dominated by Hopea ferrea and Shorea species that form the upper story 20-40 m above the ground [28]. As the second figure in Trisurat and Duengkae 2 [24] precisely indicates, some parts of the biosphere reserve retain the dry evergreen forest, while the rest has plantation plots that were established on the areas subjected to anthropogenic disturbance. The plantation project started in 1982 and the plantation area was extended to 3762 ha in 2002 [29]. The introduced tree species were Acacia species, Eucalyptus camaldulensis, Pterocarpus macrocarpus, Xylia kerrii and others [30]. The trees were systematically planted in uniformly spaced rows.
In this study, dry evergreen forest-likelihood values of Acacia auriculiformis and P. macrocarpus plantation plots in figure 2 were quantified (table 1). The Acacia plot was established in 1989 at an inter-tree distance of 2 × 3 m. Pterocarpus plots 1, 2 and 3 were established in 1987, 1986 and 1986 at intertree distances of 2 × 4 m, 2 × 2 m and 4 × 4 m, respectively. As sources of reference images, dry evergreen forest, plantation plots with multiple tree species and bare ground were selected because a previous remote-sensing study suggested that they would have dry evergreen forest-likelihood values of approximately 1.0, 0.5 and 0, respectively [31]. The plantation plots of multiple tree species were established in 1986 by planting juvenile trees of A. auriculiformis, A. mangium and Lithocarpus lindleyanus at inter-tree distances of 2 × 4 m. Bare ground, devoid of vegetation, was located in and around the biosphere reserve. The plantation plots and vegetation types were on gentle slopes (< 5 • ).     Table 2 Reference images (Google) (dry evergreen forest, multiple species plantation, bare ground) The downloaded images consisted of red, green and blue grayscale layers and were saved as portable network graphics files. Every single image included the Acacia plot, the P. macrocarpus plot or one of the three reference vegetation types. The spatial extent of the Acacia plot was larger than the coverage of captured images. For this reason, the image of the entire plantation area was prepared by merging four rectangular images. The generated shape represented an area that included a large portion of the Acacia plot. This area will be, hereafter, referred to as the Acacia+ area. In the image, the area representing the plantation plot to be analyzed or the reference vegetation type was cropped using the Adobe Photoshop software while relying on maps of the plantation plots and visually perceivable dirt roads as the borders between the plantation plots. The cropped image area was divided to generate 50 × 50 pixel images. Among these images, those that contained logos of the distributors or Google were eliminated. The numbers of 50 × 50 pixel images used in this study are summarized in table 1.

Machine learning
The 50 × 50 pixel images representing bare ground were labeled as 0 because this is the most degraded vegetation type in the area and thus has no likelihood of becoming a natural evergreen forest.
The plantation plots of multiple tree species fall between the evergreen forest and the most degraded bare ground categories. Therefore, the images of the plantation containing multiple tree species were labeled as 0.5, while those of the dry evergreen forest were labeled as 1.0. A neural network architecture was constructed using Sony Neural Network Console version 1.2.0 (Sony, Tokyo, Japan). First, a deep-learning architecture [32] was prepared. In determining values of evergreen forest-likelihood for the 50 × 50 pixel sample images, the Acacia and Pterocarpus images to be analyzed were labeled as 0.5. The default settings were used in the training and validation, except that the number of epochs was 10 000 and the random architecture search was selected. For training or validation, half of the 996 (2016 and 2019) to 1006 (2018) reference 50 × 50 pixel images were used. The software ran the deep-learning architecture, then tried different architectures derived from the deeplearning architecture. After examining 1000 architectures, the most suitable architecture was specified in terms of the error. For determination of dry evergreen forest-likelihood for the Acacia or Pterocarpus sample images, 20 Acacia or Pterocarpus images were added to the 498-503 validation images.

Normalized difference vegetation/water index and land surface temperature
As well-established indices of live green vegetation and moisture in plant biomass, this study employed the NDVI [22] and the NDWI [23], respectively. Geo-tagged images acquired by the Landsat 8 Operational Land Imager were downloaded from the Earth Explorer site of the USGS (VA, USA). The images were level-2 products to which had been applied geometric and radiometric corrections. By using the following equations, the indices were determined for the 30 × 30 m resolution pixels acquired by the Landsat 8 satellite when there were no cloud effects over the Sakaerat area: where red and near infrared are reflectance measurements for red (640-670 nm wavelength, band 4) and near infrared (850-880 nm wavelength, band 5), respectively.
where near infrared and shortwave infrared 1 are reflectance measurements for near infrared and shortwave infrared 1 (1570-1650 nm wavelength, band 6), respectively. The images acquired on 11 Land surface temperature maps of the Sakaerat area were prepared using band 10 (10.6-11.19 µm wavelength) level-1 product images acquired by Landsat 8 and have undergone terrain correction and ground control [33] together with the NDVI images [34].  The above calculation was performed using QGIS 3.12 (Free Software Foundation, Inc., MA, USA). To locate the evergreen forest and plantation plots in the prepared NDVI, NDWI and temperature images, these images were overlapped by the Landsat red-green-blue and the 45 cm resolution true-color images using the QGIS software and referring to (dirt) roads, ponds and other markers. Within the areas representing the evergreen forest and plantation plots, values of NDVI, NDWI and land surface temperature were read and used.   cluster size was 10 pixels, the first critical distance was a 50 Euclidean distance and the other critical distances were 100 Euclidean distances. The computation stopped at 99% convergence when no more than 1% of the pixels were differently classified. The calculation was based on 24 Landsat image datasets acquired between 29 April 2013 and 28 February 2020 with no cloud effects. For a single data acquisition time, Shannon diversity was determined based on the following equation:.
where Pi is the proportional abundance for the ith pixel cluster.

T-test for comparison of means of evergreen forest-likelihood
Dunnett's T3 t-test was performed to evaluate the significance of the observed differences among means of evergreen forest-likelihood for the plantation plots and the dry evergreen forest. SPSS 10.0.1 (SPSS Inc., IL, USA) was used for the calculation.

Neural network architecture
Among 1000 neural network architectures, the best architecture was found as shown in figure 4. The computation first intakes 8-bit grayscale intensity values for the red, green and blue grayscale layers of the 50 × 50 pixel image. Average pooling [36] was first applied to average the grayscale intensity values for the column consisting of 12 straight pixels in the single grayscale image. This generates a 4 × 50 pixel image for each of the red-green-blue color components. Next, the hyperbolic tangent activation function was used for processing the grayscale intensity values to convert them to values between −1 and 1 in the hyperbolic tangent-S-shaped sigmoidal manner. Likewise, the generated grayscale intensity values were further processed by average pooling, leaky rectified linear unit or hyperbolic tangent function activation and batch normalization [37] before undergoing an affine transformation and generating a single value of evergreen forest-likelihood. A preliminary examination was made using the neural network architecture in figure 4 and the 50 × 50 pixel reference images extracted from the 45 cm resolution true-color images. Some changes were made in the number of intensity-averaged pixels in average pooling, the value of alpha in the leaky rectified linear unit step and the sequence of the components. However, no other architectures exceeded the architecture when examined with the 2014-2019 true-color 45 cm resolution image datasets. Overtraining [14] was quite negligible as very well-overlapping of the training and validation error curves indicate. Thus, the architecture was used throughout the determination of dry evergreen forest-likelihood for the 50 × 50 pixel sample images.
Another preliminary examination was made using 20 completely black or white images instead of 20 sample images. The error significance in determining evergreen forest-likelihood for the reference images was quantified using the following equation: Mean absolute change for a reference vegetation type = Σ evergreen forest-likelihood (with no white/black images) − evergreen forest-likelihood (with 20 white/black images) number of images for the reference vegetation type , and white images, respectively, seemingly because of the darkness of the evergreen forest canopies and the brightness of the bare ground surface, as described later.

Dry evergreen forest-likelihood for the forests
Reference images of the three vegetation types clearly differed in evergreen forest-likelihood ( figure 5). Differences in dry evergreen forest-likelihood were also seen among the reference images of the dry evergreen forest ( figure 6). The images determined to be the most evergreen forest-likely had shadow [38] uniformly distributed throughout them.  images of multiple tree species and bare ground have comparable variance with the others as figure 5 indicates.
Maps of dry evergreen forest-likelihood for the Acacia plot are depicted in figure 7. The dry evergreen forest-likelihood tended to be comparable to that of the reference plantation plot of multiple tree species (figure 5). The distribution pattern changed among the four data acquisition times, indicating various tree species that differently respond to precipitation and other changes. The values of evergreen forest-likelihood were significantly lower (p < 0.001) than those for the evergreen forest. The Acacia plot was still recovering the original components of the dry evergreen forest. Different parts of the Acacia+ area apparently had different dry evergreen forestlikelihood values. The southwestern part has some native evergreen tree species [31], which resulted in markedly greater evergreen forest-likelihood than the other parts. The interface between the southwestern part and the Acacia plot is still clear. Along the interface, there is a shallow valley, which must make it difficult for the evergreen tree species to expand their range beyond the very gentle slope and into the only slightly higher adjacent eastern area dominated by drought-tolerant tree species. This interface between the areas suggests the stagnant vegetative succession of the Acacia area. In 2016, the Acacia and Acacia+ areas had relatively low evergreen forestlikelihood largely due to loss of leaves (   some wide canopy gaps that reveal the soil. The soil-exposing appearance of the Pterocarpus 1 plot images is likely to have resulted in the low evergreen forest-likelihood. When compared to the Acacia, Acacia+, and multiple species plots, the Pterocarpus 1 plot was further behind in forest ecosystem restoration. In 2018, there were still many P. macrocarpus trees found in the Pterocarpus 1 plot.

Relationship between dry evergreen forest-likelihood and NDVI
A relationship between dry evergreen forestlikelihood and NDVI is depicted in figure 1. When the reference dry evergreen forest and the Acacia and Pterocarpus plots were compared, the dry evergreen forest was very stable in terms of changes in the indices over the four data acquisition times. Its NDVI values were close to 0.83, which is one of the greatest among tropical forests [41]. Thus, the dry evergreen forest was retaining stably high physiological activity throughout the period. A comparable value of NDVI, representative of the physiological activity of the trees, was indicated for the Acacia plot in 2019. However, in 2014/2018, Acacia plot trees exhibited lower physiological activity, which was still greater than that of the 2016 Acacia plot trees. This trend was similar to changes in evergreen forest-likelihood ( figure 7). The Pterocarpus 1 plot also indicated a similar relationship between the indices, the greatest evergreen forest-likelihood and physiological activity in 2019, which was greater than that in 2014. Among the compared forests, the 2016 Pterocarpus 1 plot was the least evergreen forest-likely in appearance and the physiologically most inactive, as recognized in figure 8. The NDWI revealed that, in April 2016, the Acacia plot was drier than at the other data acquisition times (figure 11) and the trees experienced reduced physiological activity, exhibited by the loss of many of their leaves ( figure 8). Values of the NDWI for 2014 and 2016 differed by approximately 0.2. This difference was much smaller, around 0.1, for the dry evergreen forest, again, indicating environmental stability. Moisture-wise, the Pterocarpus 1 plot experienced the most severe 2016 drought effect, as confirmed visually. The sub-zero value of the NDWI indicates that the drought was very harsh even for the Pterocarpus trees. Negative values of the NDWI seldom occur, even in summer in southern Italy where oak trees (Querques species), known for their fitness in dry environments [42], suffered considerably in the 2017 climate anomaly [43].

Plant community diversity before and after April 2016
Based on the clustering of the Landsat pixels, changes in values of the diversity indices did not show any evidence that the 2016 drought ejected droughtsusceptible trees ( figure 12). The results imply that the Acacia+ area, especially the Acacia plantation plot, has been halted at a stage of vegetative succession towards dry evergreen forest as the climax. From this viewpoint, the Pterocarpus 1 plot seemed to have a similar tendency, but the progress in vegetative succession was much slower (figures 8 and 9). The cause of the stagnant succession in the Acacia plot may be drought-deciduous tree species whose survival strategy is to retain green leaves only when soil moisture is adequately available, meaning that they will lose their leaves if the environment becomes too dry [44]. Because of this strategy, drought-deciduous tree species are strong generalists [45] that survive over a wide range of dry-moist gradients. Unfortunately, Thailand is subject to drought every few years [35]. The periodical dryness was thought to be a cause of the seemingly stagnant succession in the Acacia plot.

Is Pterocarpus macrocarpus useless in Sakaerat?
According to Marod et al [46], P. macrocarpus allelopathy has not been reported, and this species is likely to be easily outcompeted by shade-tolerant tree species during the ecological restoration in the region. The Pterocarpus 1 location might be unsuitable for further succession. To investigate this possibility the Pterocarpus 2 and 3 plots were analyzed (table 2). According to the 2016 values, the Pterocarpus 2 and 3 plots achieved ecosystem restoration to a greater degree than did the Pterocarpus 1 and Acacia plots. Figure 13 shows that, in 2016, some parts of the Pterocarpus 3 plot had greater evergreen forest-likelihood than did those of the most evergreen forest-likely parts of the evergreen forest ( figure 6). In 2019, however, the Pterocarpus 3 plot was revealed to be less evergreen forest-likely than in 2016 ( figures 13 and 14). Also, the evergreen forestlikelihood of the Pterocarpus 2 plot dropped in 2019. A possible explanation of these changes is that the Pterocarpus 2 and 3 plots contained some native and non-native tree species from the evergreen forest of Sakaerat.
The Pterocarpus 1 plot was extremely dry on 5 April 2016 (figure 15) when dryness in the Pterocarpus 2 and 3 plots was much milder. As figure 11 shows, the Pterocarpus 1 location is not necessarily disadvantageous for water acquisition, but another difficulty was heat ( figure 16). On 5 April 2016, the Pterocarpus 1 plot was 6 • C hotter than the Pterocarpus 2 and 3 plots. Heat flux could come up from the northeastern area [47]. The large distance from the dry evergreen forest is another disadvantage for the Pterocarpus 1 plot because seeds of H. ferrea and Shorea species are relatively large among those dispersed by birds [48] and large mammals that move long distances are rare in the Sakaerat area. The Pterocarpus plot 1 seems to be a too difficult place for the native evergreen tree species, which rely on sufficient moisture and cooler air, while excessive sunlight per se (figure 8) should not be a barrier [49].

Technical aspects
The 45 cm resolution true-color images used in this study were somewhat crude. The crudeness is especially indicated in figure 6 which shows obvious changes in brightness and other radiometric intensity among the data acquisition times. The 2018 images are dark and blueish, while the 2019 images are more greenish than those acquired in the other years. These differences, due to a lack of radiometric corrections, could be a disadvantage for the current neural network approach. To minimize such possible issues, using reference vegetation types for which their nature is known provides the solution. A technique for radiometric correction may be an effective option if there are standard color-invariant objects such as rooftops [50]. Other sources of error such as a lack of ortho-rectification could also have an impact though the images used in this study did not have perceivable geometric inconsistencies among them when they were overlapped with each other. Using the maps of plantation plots and referring to the dirt roads that border the plantation plots, the plots were easy to locate. But these conditions do not always exist. If geometric errors are significant, QGIS or similar software is available for correcting such errors [51].
A significant advantage of the current approach is determining dry evergreen forest-likelihood based on only the red, green and blue layers. The 45 cm resolution true-color images do not include any information on invisible bands. However, the precise information on the appearance and texture of forest canopies was shown to be helpful, as previously reported [19]. The 45 cm resolution true-color and satellite images were available at a coarse temporal resolution, but images at even more precise temporal and spatial resolution can be acquired at present. The combination of a drone and a digital camera is commonly available to acquire red-green-blue truecolor images of forests and other vegetation [12]. The application of the current approach to generate continuous measures of plant community status can be extended to other objects including crops in agricultural fields by using precise images and the capability of accurately discriminating among canopies, as suggested in recent review articles [52,53].
The use of multiple images acquired at multiple acquisition times is favorable for concluding the succession progress of a forest. As figures 13 and 14 show, the Pterocarpus 3 plot could be over-estimated if users see only the 2016 images. Values of mean evergreen forest-likelihood for the Acacia plot and the others analyzed in this study fluctuated from time to time. The intra-plot variation of vegetative succession and the frequency distribution profile for statistical analyses helped confirm the vegetative succession of the Pterocarpus 2 and 3 plots, which were concluded to be still under vegetative succession towards the climax. It is estimated to take 50 years or even more for a degraded tropical area to be completely restored in terms of its natural vegetation [54]. The plantation plots analyzed in this study are approximately 30 years old. After a few more decades, it will be worthwhile to monitor the Acacia and Pterocarpus plots to determine if they have acquired greater similarity to the dry evergreen forest.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary information files).