Identification of Tree Species in Forest Communities at Different Altitudes Based on Multi-Source Aerial Remote Sensing Data

Lin, Haoran; Liu, Xiaoyang; Han, Zemin; Cui, Hongxia; Dian, Yuanyong

doi:10.3390/app13084911

¹ College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan 430070, China

² Hubei Forestry Investigation and Planning Institute, Wuhan 430079, China

³ Hubei Academy of Forestry, Wuhan 430075, China

⁴ Hubei Engineering Technology Research Centre for Forestry Information, Huazhong Agricultural University, Wuhan 430070, China

⁵ Key Laboratory of Urban Agriculture in Central China, Ministry of Agriculture, Wuhan 430070, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(8), 4911; https://doi.org/10.3390/app13084911

Submission received: 17 March 2023 / Revised: 10 April 2023 / Accepted: 12 April 2023 / Published: 13 April 2023

(This article belongs to the Special Issue Spatial Information Technology in Forest Ecosystem)

Abstract

The accurate identification of forest tree species is important for forest resource management and investigation. A single remote sensing data source cannot quantify both the vertical and horizontal structural characteristics of tree species, which limits classification accuracy. Therefore, this study explores the application value of combining airborne high-resolution multispectral imagery and LiDAR data to classify tree species in study areas of different altitudes. Three study areas with different altitudes in Muyu Town, Shennongjia Forest Area were selected. Based on the object-oriented method for image segmentation, multi-source remote sensing feature extraction was performed. The recursive feature elimination algorithm was used to filter out the feature variables that were optimal for classifying tree species in each altitude study area. Four machine learning algorithms, SVM, KNN, RF, and XGBoost, were used to classify tree species at each altitude and to evaluate the accuracy. The results show that the diversity of tree layers decreased with increasing altitude across the study areas. The texture features and height features extracted from LiDAR data responded better to the forest community structure in the different study areas. Coniferous species showed better classification than broad-leaved species within the same study areas. The XGBoost classification algorithm showed the highest accuracy of 87.63% (kappa coefficient of 0.85), 88.24% (kappa coefficient of 0.86), and 84.03% (kappa coefficient of 0.81) for the three altitude study areas, respectively. The combination of multi-source remote sensing data with the feature filtering algorithm and the XGBoost algorithm enabled accurate forest tree species classification.

Keywords:

different altitudes; multispectral image; LiDAR; machine learning; tree species classification

1. Introduction

Forests are a major component of terrestrial ecosystems and are closely connected to human living environments, playing an irreplaceable role in water conservation and biodiversity protection [1,2]. Accurately identifying forest plants and obtaining plant information can establish the foundation for the research and utilization of forest ecosystems [3]. The traditional method of identifying tree species relies primarily on human surveys, which are difficult, inefficient, and imprecise due to the complex conditions in the forest understory [4]. Therefore, the rapid and accurate acquisition of forest plant information to unleash the multiple functions of forests is a prerequisite for achieving the multi-objective sustainable management of forest ecosystems.

Remote sensing technology has long been considered a good solution to the problem of tree species classification. Remote sensing data acquired from different platforms and sensors are extensively used in this field of research [5]. The emergence of high-resolution satellite and airborne remote sensing systems has provided image data with rich spatial, color, and texture information, leading to the improved fine classification of tree species [6]. Different tree species possess unique spectral information due to their distinct structures and morphologies [7]. Even under the same environmental conditions, trees in different growth stages or health states can differ in spectral information [8]. Ferreira et al. [9] classified eight species of tropical forest trees by quantifying the spectral differences of their canopies. Xu et al. [10] used only UAV multi-spectral features extracted from spectral image data to classify eight dominant tree species in the southern part of the Ronggu Turnip Nature Reserve, Yunnan Province, China, using a random forest classifier and proved that the spectral features of remote sensing images are useful for tree species classification. Dong Yuan et al. [11] selected the canopy spectra of five tree species from the USGS spectral library and extracted 11 vegetation indices for spectral difference analysis, showing that this method can identify tree species. However, hyperspectral data contain excess information, leading to the “Hughes phenomenon”, which can affect classification results. Therefore, it is necessary to process hyperspectral data for dimensionality reduction [12]. Although aerial remote sensing images provide rich spectral and texture information, they are insufficient for quantifying the vertical structural attributes of tree species.

Airborne light detection and ranging (LiDAR) technology offers fast data acquisition and high accuracy and has unparalleled advantages for measuring the vertical structure and physical and chemical characteristics of the forest canopy [13]. This technology can also be used as a new data source for tree species classification [14]. In recent years, researchers have explored many applications for monitoring forests using airborne LiDAR data. Gao et al. [15] combined airborne LiDAR and backpack LiDAR data to extract individual tree parameters and fit volumetric models for tropical plantation forests. The results showed that the combination of the two data sources had a strong ability to estimate individual tree parameters and fit models. Zhao Feng et al. [16] inferred individual tree heights with the help of aerial imagery and airborne LiDAR data. The average estimation accuracy was as high as 74.89% compared to the actual measured tree height. Despite the advantages of LiDAR data in quantifying the structural parameters of forest species and in individual tree segmentation, a study showed that the results are susceptible to the point cloud density [17]. Furthermore, due to its working principle, LiDAR technology is unable to provide spectral information on vegetation, which is a significant disadvantage for tree species classification.

With the development of remote sensing technology, fusing multi-source remote sensing data for tree species classification has become an important trend in vegetation classification research [18]. High-resolution remote sensing imagery and LiDAR data each have their own characteristics and advantages for tree species classification [19,20]. Many studies have demonstrated that combining the characteristics of the two data sources yields greater accuracy than using a single data source [21,22]. In general, the two datasets combine information on the vertical and horizontal characteristics of tree species, and through this combination, the differences between tree species can be more easily identified, thus improving the accuracy of tree species classification [23]. Additionally, most airborne integrated remote sensing systems carry both LiDAR and multispectral camera sensors, which can acquire point cloud and image data of the same area in real time, providing great convenience for later utilization. In existing studies, some scholars have achieved better classification results by combining machine learning classifiers after feature screening using both data sources [24]. Gaoxia et al. [25] used the above method to classify five dominant tree species in Changshu National Forest Park, Jiangsu Province, with an accuracy of 84%. They showed a higher accuracy of 91.3% in classifying forest types using this method. Similarly, Shen et al. [26] classified five tree species in a subtropical forest based on this method, with an individual tree segmentation accuracy of 82.9%, and the classification also provided relatively high accuracy. However, this method does not fully utilize the features of both data sources, and for denser forest types, it is difficult to perform accurate individual tree segmentation using LiDAR point cloud data [27]. Therefore, adequate feature fusion and screening are necessary to improve the classification accuracy of tree species.

Species diversity varies at different altitudes due to the environmental conditions and changes in the vegetation response [28]. In a study by Xu et al., it was shown that the species richness and diversity of forests peaked at medium altitudes and exhibited a decreasing trend with an increasing altitude. The impact of altitude on classification accuracy is expressed through differences in the tree species among altitudes. Therefore, it is important to study the use of multi-source remote sensing data to classify forest vegetation at different altitudes and obtain vegetation information.

To summarize the aforementioned study, it is evident that combining high-resolution remote sensing images and LiDAR data for tree species classification can provide higher accuracy in identifying tree species than using single data features alone. However, most existing studies have focused on single species or forests with simple stand structures. In multi-species forests with complex stand structures, the fine classification of tree species is challenging, and the classification accuracy is often low. Furthermore, classifying tree species in complex forest environments at different altitudes poses an even greater challenge.

Therefore, the purpose of this study was to explore the responsiveness of different forest community tree species to multi-source remote sensing data features in different altitude regions and to analyze the effectiveness and differences of different machine learning classifiers combined with optimal features for tree species classification in three altitude regions. This study aimed to address the problem of the insufficient ability of a single data source in classification applications and to clarify the impact of different species diversity caused by altitude on tree species classification. Optimal classification features were selected for each altitude, and species classification was performed using the optimal machine learning classifier. Three different altitude regions (1600 m, 1800 m, and 2000 m) in Muyu Town, Shennongjia Forest Area were selected as the study area. Airborne LiDAR and high-resolution multispectral image data were used as the data sources, and two data features were extracted to construct a multidimensional feature dataset. The RFE method was used to select the optimal features, and four machine learning algorithms were combined to carry out object-oriented tree species classification in the three altitude study areas. A tree species classification map was completed for different altitude study areas.

2. Materials and Methods

2.1. Study Area

The study area was located in Muyu Town, Shennongjia Forest Area, with an average altitude of 1200 m and a total area of about 454.19 km². This area has a typical northern subtropical humid monsoon climate, with an average annual temperature of around 11.6 °C. It is humid and rainy in summer and mild with less rain in winter. Precipitation increases with altitude. The geographical location and climatic conditions of the study area are very conducive to the growth of subtropical plants. The forest vegetation is dense, well preserved, and rich in plant resources, and the overall spatial level is rich [29]. In the area, the ecological vulnerability of high-altitude forests has increased due to pest infestation of Pinus armandii, a major typical high-altitude tree species (distributed above 1400 m). Studying the classification of high-altitude tree species can help address this problem. We established a long-term forest ecology monitoring station at an elevation of 1500 m, and based on this, we selected typical areas at elevations of 1600 m, 1800 m, and 2000 m as study areas. The main tree species in this study area are Pinus armandii, Larix gmelinii, Abies fargesii, Betula albosinensis, Betula platyphylla, Carpinus cordata, Cunninghamia, Fargesia spathacea, Quercus aliena var. acutiserrata, Cyclobalanopsis glauca, Quercus, and many other broad-leaved species. The geographical location of the study area is shown in Figure 1.

2.2. Data Acquisition and Pre-Processing

2.2.1. Remote Sensing Data Sources

The dataset used in this study was obtained from an airborne integrated sensor system that is capable of collecting multispectral imagery, LiDAR data, and IMU and GPS data. LiDAR data were acquired by a Leica ALS80-HP sensor (manufactured by Leica Geosystems AG, Heerbrugg, Switzerland), with an acquired point cloud density of 2.47 points/m². The simultaneous acquisition of multispectral data was carried out with the Leica DMC III multispectral imager, with a sensor size of 26,112 (across) × 15,000 (along) pixels and an average spatial resolution of 0.2 m. The dataset was collected in the Shennongjia forest area from 10 to 16 August 2020. The absolute altitude of the flight was 4000 m, with an along-track overlap of 60% and an across-track overlap of 25%. The data were acquired under clear skies with few clouds, and the main parameters of the two sensors of the airborne integrated sensor system are shown in Table 1.

2.2.2. Field Data

A total of seventeen plots (20 m by 20 m) were chosen in the study sites. Four plots were surveyed in the complex forest community structure study area at an altitude of 1600 m, six plots were surveyed in the moderately complex forest community structure study area at an altitude of 1800 m, and seven plots were surveyed in the simple forest community structure study area at an altitude of 2000 m. The location, DBH, and height of each individual tree with a DBH > 2 cm were measured from 25 September to 9 October 2021. The tree location was measured by a sonar rangefinder (POSTEX LASER, Haglöf Sweden AB, Solleftea, Sweden), the tree height was measured by an altimeter, and the DBH was measured with a diameter tape. Detailed information is provided in Table 2.

2.2.3. Data Pre-Processing and Data Alignment

The preprocessing of multispectral data includes two parts: image stitching and image registration. Image stitching was completed using INPHO 5.7 software, while image registration was carried out using ArcMap 10.5.

The multispectral images were processed using INPHO software to generate an orthoimage of the study area. The INPHO workflow for processing airborne multispectral data includes the following steps [30]: (a) the preparation of raw data; (b) data preprocessing; (c) digital aerial triangulation processing; (d) the construction of a digital elevation model (DEM); and (e) the construction of a digital orthophoto map (DOM). In the aerial triangulation step, the average error of each parameter was X: 0.084 m, Y: 0.081 m, Z: 0.068 m, omega: 3.0 mdeg, phi: 3.2 mdeg, and kappa: 3.7 mdeg. The final output was a multispectral orthoimage of the study area with an average resolution of 0.2 m.

The LiDAR data acquisition instruments consisted of a scanner, which was mainly used to record the distance between the sensor and the ground, a kinematic GPS receiver, which was used to record the spatial position of the aircraft center, and an IMU, which was used to record the flight attitude data. The data from these instruments were jointly processed with differential calculation results to obtain the spatial position and attitude of each observation time, as well as the imaging device.

The preprocessing of the airborne LiDAR data in this study included ground point classification, point cloud normalization, and the generation of a digital elevation model (DEM) and canopy height model (CHM). R software was used for the preprocessing of the airborne LiDAR data. The progressive morphological filter (PMF) proposed by Zhang et al. [31] was utilized to classify and output the ground points from the point cloud. The ground points were interpolated using the triangulated irregular network (TIN) algorithm to obtain a smooth DEM of the study area. The point cloud was normalized with respect to the DEM (i.e., by subtracting the DEM elevation at the same location from the elevation of each point). The normalized point cloud was then processed using point-to-raster methods and smoothing to generate the CHM of the study area. As the airborne multispectral data and LiDAR data were acquired through different sensors, georeferencing was required. In this study, the nearest neighbor resampling method was used in ArcMap 10.5 to produce multispectral data and LiDAR-derived CHM data with a spatial resolution of 1 m. Then, the geographic registration method was used to select 10 pairs of control points with a uniform spatial distribution on the two images of each altitude study area. Easily distinguishable features were mainly selected as control points. The average error of the final control point pairs was within one pixel, indicating that the registration was reliable [32].
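
The normalization and point-to-raster steps can be illustrated with a minimal Python sketch. It is not the R workflow used in the study: it assumes the DEM is already available as a regular grid with its origin at (0, 0), looks up the ground elevation by cell index instead of TIN interpolation, and omits smoothing. The function names and the `cell_size` parameter are hypothetical.

```python
import math


def normalize_heights(points, dem, cell_size=1.0):
    """Convert (x, y, z) returns to (x, y, height above ground).

    Assumption: dem[row][col] holds the ground elevation of the cell that
    contains the point, with row derived from y and col derived from x.
    """
    normalized = []
    for x, y, z in points:
        row = int(math.floor(y / cell_size))
        col = int(math.floor(x / cell_size))
        normalized.append((x, y, z - dem[row][col]))
    return normalized


def rasterize_chm(normalized, n_rows, n_cols, cell_size=1.0):
    """Point-to-raster step: keep the highest normalized return per cell.

    Cells without any return keep the value 0.0.
    """
    chm = [[0.0] * n_cols for _ in range(n_rows)]
    for x, y, height in normalized:
        row = int(math.floor(y / cell_size))
        col = int(math.floor(x / cell_size))
        chm[row][col] = max(chm[row][col], height)
    return chm
```

Taking the maximum per cell is one common point-to-raster choice for canopy height models; the source does not state which rule was applied.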

2.3. Sample Selection

In this study, the main dominant tree species in each altitude study area were selected based on the combination of field sample survey data and forest resource class II survey data. Tree species with 20 or fewer surveyed trees were combined into an "other" class. The samples were selected using multispectral orthophotos with a spatial resolution of 0.2 m, and the distribution and number of tree species samples in each study area are shown in Figure 2. For each altitude, 60% of the selected samples were used as training samples, and 40% were used as test samples. Each sample object covered a complete canopy, the samples were representative, and the test samples did not duplicate the training samples.
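
A 60/40 split of this kind can be sketched as follows. The source does not say whether the split was stratified by species or how samples were drawn, so the per-species stratification, the fixed random seed, and the function name are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_fraction=0.6, seed=0):
    """Split (sample_id, species) pairs into disjoint train and test id lists.

    Assumption: the split is done separately within each species so that
    every class appears in both sets in roughly the same proportion.
    """
    by_species = defaultdict(list)
    for sample_id, species in samples:
        by_species[species].append(sample_id)

    rng = random.Random(seed)
    train_ids, test_ids = [], []
    for species in sorted(by_species):
        ids = by_species[species]
        rng.shuffle(ids)
        cut = round(len(ids) * train_fraction)
        train_ids.extend(ids[:cut])
        test_ids.extend(ids[cut:])
    return train_ids, test_ids
```

Because every sample id is assigned to exactly one of the two lists, the test samples cannot duplicate the training samples.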

2.4. Method

2.4.1. Species Diversity Index

Differences in the tree species composition due to altitude differences can be reflected by diversity indices. In this study, the species diversity was calculated for each altitude study area based on the results of the sample plot surveys at different altitudes. The Shannon–Wiener index (H) was used to reflect the complexity of the community structure, with higher values indicating greater complexity. The Simpson index (D) was used to reflect the dominance of species in the community, with higher values indicating fewer dominant species. The Pielou index (J) was used to reflect the distribution of species in the community, with higher values indicating a more even spatial distribution of plants [33]. The importance value index was also used, with higher values indicating greater importance of plant species [34]. The calculation formulas were as follows:

Shannon–Wiener:

H = - \sum_{i = 1}^{S} P_{i} \times \ln P_{i}

(1)

Pielou:

J = H / \ln S

(2)

Simpson:

D = 1 - \sum_{i = 1}^{S} P_{i}^{2}

(3)

where P_{i} = N_{i} / N, N is the total number of individuals, N_{i} is the total number of individuals of the ith species, and S is the number of species in the sample site.
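
As a worked illustration, the three diversity indices can be computed from per-species individual counts as sketched below. The function name is hypothetical, and the Pielou index is computed with the common definition J = H / ln S, which is an assumption about the intended formula.

```python
import math


def diversity_indices(counts):
    """Return (H, D, J) for a list of individual counts, one per species.

    H: Shannon-Wiener index, H = -sum(P_i * ln P_i)
    D: Simpson index,        D = 1 - sum(P_i ** 2)
    J: Pielou evenness,      J = H / ln S  (assumed standard definition)
    """
    counts = [c for c in counts if c > 0]  # species with no individuals are ignored
    n_total = sum(counts)
    n_species = len(counts)
    proportions = [c / n_total for c in counts]

    shannon = -sum(p * math.log(p) for p in proportions)
    simpson = 1.0 - sum(p ** 2 for p in proportions)
    pielou = shannon / math.log(n_species) if n_species > 1 else 0.0
    return shannon, simpson, pielou
```

For example, a plot with 50, 30, and 20 individuals of three species gives D = 1 - (0.25 + 0.09 + 0.04) = 0.62, while a plot with equal counts for every species gives J = 1.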

The formula of the importance value was used:

importance value = (relative abundance + relative frequency + relative dominance)/3

(4)

2.4.2. Image Segmentation

In this study, an object-oriented classification method was used to segment the images. eCognition is mature software for object-oriented analysis that can perform multi-scale segmentation, automatically generate image objects from remote sensing images, and link these image objects according to specific structures that reflect the inherent scale of the surface landscape to some extent. The selection of the optimal segmentation scale is a research focus of object-oriented segmentation, but the optimal scale is relative and varies with different specific variables, thus often representing a range of values.

Therefore, optimal segmentation parameter experiments were conducted in this study, and a visual discrimination method was mainly used to determine the final optimal segmentation scale. The heterogeneity factor consists of the color and shape factors, which sum to 1, and is one of the important parameters for performing image segmentation. The shape factor defines the texture consistency of the segmentation result and is determined by the compactness factor and smoothness factor, which also sum to 1. This can effectively enhance the integrity of the object shape. The smaller the compactness factor, the smoother and less compact the resulting object boundary. The smaller the smoothness factor, the rougher the object boundary of the segmentation, which may result in jagged edges. In this study, the heterogeneity factor combination was selected by fixing one parameter and then continuously adjusting the other parameter, comparing the results visually to judge the segmentation effect. The image segmentation experiments were conducted for all the study areas by the above method. The optimal parameters for image segmentation in the different altitude study areas were finally determined as shown in Table 3. The optimal parameter combinations were used to segment the images at different altitudes and extract individual tree canopies, and the segmentation results were then used as the basis for subsequent tree species classification.

2.5. Feature Extraction and Selection

The feature extraction in this study was based on image segmentation, using the segmented individual tree canopy polygon as the study object. The value of each polygon on a specific feature was calculated as the mean of the pixel values contained within that canopy polygon.
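
The per-object averaging can be sketched as a zonal mean over a label raster. This is only an illustration: it assumes the canopy polygons have been rasterized into an integer label grid aligned with the feature raster, with 0 marking background, and the function name is hypothetical.

```python
from collections import defaultdict


def object_means(labels, band):
    """Mean feature value per canopy object.

    labels: 2D list of integer object ids (0 = background, an assumption).
    band:   2D list of feature values with the same shape as labels.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for label_row, band_row in zip(labels, band):
        for label, value in zip(label_row, band_row):
            if label == 0:
                continue  # skip pixels outside every canopy polygon
            sums[label] += value
            counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}
```

Repeating this for every spectral, index, texture, and height layer yields one feature vector per canopy object.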

2.5.1. Original Band Characteristics

The multispectral raw spectral bands completely represent the original features of the images. In this study, INPHO software was used to generate orthophotos of the study area with a resolution of 0.2 m, containing four raw spectral bands (blue, green, red, and near-infrared).

2.5.2. Vegetation Index Characteristics

Vegetation indices have indirect connections with many ecological factors in the environment and can reflect tree species information, such as canopy structure, leaf area index (LAI), chlorophyll, and net primary productivity (NPP) [10]. The differences in vegetation indices among the different vegetation types with respect to various parameters provide the possibility of using vegetation indices for tree species classification [35]. This allows researchers to select vegetation index features as variables for tree species classification.

In this study, we comprehensively reviewed previous research results and selected six vegetation index features. The calculations were performed using the Python platform, and each index is introduced in detail as follows:

Normalized difference vegetation index (NDVI):

NDVI = \frac{\rho_{NIR} - \rho_{Red}}{\rho_{NIR} + \rho_{Red}}

(5)

Soil-adjusted vegetation index (SAVI):

SAVI = \frac{(1 + L) (\rho_{NIR} - \rho_{Red})}{\rho_{NIR} + \rho_{Red} + L}

(6)

Ratio vegetation index (RVI):

RVI = \frac{\rho_{NIR}}{\rho_{Red}}

(7)

Infrared percentage vegetation index (IPVI):

IPVI = \frac{\rho_{NIR}}{\rho_{NIR} + \rho_{Red}}

(8)

Normalized greenness (Norm G):

NormG = \frac{\rho_{Green}}{\rho_{Green} + \rho_{Red} + \rho_{Blue}}

(9)

Normalized green–red ratio (Norm GR):

NormGR = \frac{\rho_{Green} - \rho_{Red}}{\rho_{Green} + \rho_{Red}}

(10)
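
The six indices can be computed per pixel (or per object mean) from the four band reflectances, as in the minimal sketch below. The function name is hypothetical, and the default soil adjustment factor L = 0.5 is an assumption, since the value used in the study is not stated.

```python
def vegetation_indices(blue, green, red, nir, L=0.5):
    """Compute the six vegetation indices of Equations (5)-(10).

    Inputs are reflectances of a single pixel or object.
    L is the SAVI soil adjustment factor (0.5 is an assumed default).
    Denominators are assumed to be non-zero.
    """
    return {
        "NDVI": (nir - red) / (nir + red),
        "SAVI": (1 + L) * (nir - red) / (nir + red + L),
        "RVI": nir / red,
        "IPVI": nir / (nir + red),
        "NormG": green / (green + red + blue),
        "NormGR": (green - red) / (green + red),
    }
```

Note that IPVI = (NDVI + 1) / 2 follows directly from the two definitions, so these two features carry the same information on different scales.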

2.5.3. Texture Characteristics

Currently, there are many methods for computing texture features. In this study, the gray-level co-occurrence matrix (GLCM) proposed by Haralick et al. [36] in 1973 was chosen to describe the texture features. Eight commonly used texture feature statistics were selected for the study of forest tree species classification in the different study areas. The formulas and descriptions of these feature statistics are presented in Table 4.
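
To make the GLCM idea concrete, the sketch below builds a symmetric, normalized co-occurrence matrix for one pixel offset and derives three statistics from it. The choice of contrast, homogeneity, and entropy is illustrative and may not match the eight statistics listed in Table 4; the function names and the default offset are assumptions.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix.

    image:  2D list of integer gray levels in [0, levels).
    dx, dy: pixel offset defining which neighbor pairs are counted.
    """
    matrix = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                matrix[i][j] += 1.0  # count the pair in both directions
                matrix[j][i] += 1.0  # so that the matrix is symmetric
    total = sum(sum(row) for row in matrix)
    return [[value / total for value in row] for row in matrix]


def texture_stats(p):
    """Contrast, homogeneity, and entropy of a normalized GLCM p."""
    levels = len(p)
    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }
```

In practice the statistics are usually computed in a moving window and averaged over several offsets before being summarized per canopy object.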

2.5.4. LiDAR Point Cloud Features

Compared to image data, the advantage of LiDAR data for forest monitoring is their ability to reflect the vertical structure information of the forest. Therefore, when utilizing LiDAR data, the height information is mainly used. Combining the results of previous studies, a total of 37 statistical variables related to height, density, etc. were selected in this study. The specific descriptions of the extracted LiDAR feature variables are shown in Table 5.
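
Height and density statistics of this kind can be derived from the normalized return heights inside one canopy object, as sketched below. The specific metrics, their names, and the 2 m minimum canopy height are assumptions chosen for illustration and do not reproduce the 37 variables of Table 5.

```python
import statistics


def height_metrics(heights, min_height=2.0):
    """Summary statistics of normalized LiDAR return heights for one object.

    Returns below min_height are treated as ground or understory (assumption)
    and are excluded from the height statistics.
    """
    canopy = sorted(h for h in heights if h >= min_height)

    def percentile(q):
        # Linear interpolation between the two closest ranks.
        pos = (len(canopy) - 1) * q / 100.0
        lo = int(pos)
        hi = min(lo + 1, len(canopy) - 1)
        return canopy[lo] + (canopy[hi] - canopy[lo]) * (pos - lo)

    return {
        "h_max": canopy[-1],
        "h_mean": statistics.fmean(canopy),
        "h_std": statistics.pstdev(canopy),
        "h_p50": percentile(50),
        "h_p95": percentile(95),
        "density_above_min": len(canopy) / len(heights),
    }
```

The sketch assumes at least one return lies above the threshold; an object with no canopy returns would need separate handling.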

2.5.5. Feature Selection

In machine learning algorithms for hyperspectral image classification, the number of training samples required rises sharply as the number of bands increases. When the number of training samples is limited, the classification accuracy will initially increase and then decrease with an increasing number of bands involved in the computation, which is known as the “Hughes phenomenon” [37]. Therefore, it is necessary to remove redundant information, reduce the number of feature dimensions, reduce the complexity of data processing, and enhance the generalization ability of the model. In this study, the recursive feature elimination (RFE) algorithm was chosen for feature selection. The RFE starts from the entire feature set and selects features in a backward sequence, removing the feature with the lowest score at each iteration until the optimal set of features is selected. The method is described as follows:

Step 1: Train the model on the full set of features M.

Step 2: Calculate the overall accuracy OA_{M-i} obtained after removing i features from M.

Step 3: Obtain the subset m formed by removing the i features from the feature set M.

Step 4: Repeat Step 1 to Step 3 until the optimal subset of features is obtained. The order in which features are removed during the iteration of the RFE algorithm depends on the importance of the features.

The feature selection process based on RFE in this study was implemented using the “feature_selection” module in scikit-learn, a machine learning library in Python (https://scikit-learn.org (accessed on 18 January 2022)).
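
The backward elimination loop can be illustrated with a library-free sketch. It is not the scikit-learn implementation used in the study: `importance_fn` is a hypothetical callback standing in for refitting a model on the current subset and returning one importance score per feature, and the toy scores and the feature names `H_mean` and `GLCM_entropy` are invented for the example.

```python
def recursive_feature_elimination(features, importance_fn, n_keep):
    """Drop the least important feature repeatedly until n_keep remain.

    importance_fn(subset) must return a dict mapping each feature name in
    subset to its importance, recomputed for that subset (hypothetical API).
    Returns (kept_features, eliminated_in_order).
    """
    remaining = list(features)
    eliminated = []
    while len(remaining) > n_keep:
        scores = importance_fn(remaining)  # refit on the current subset
        weakest = min(remaining, key=lambda name: scores[name])
        remaining.remove(weakest)
        eliminated.append(weakest)
    return remaining, eliminated


# Toy stand-in for a fitted model: fixed scores, independent of the subset.
toy_scores = {"NDVI": 0.30, "SAVI": 0.25, "Blue": 0.20,
              "H_mean": 0.15, "GLCM_entropy": 0.10}


def toy_importance(subset):
    return {name: toy_scores[name] for name in subset}


kept, dropped = recursive_feature_elimination(list(toy_scores), toy_importance, 3)
```

With a real classifier the scores change after every refit, which is what distinguishes recursive elimination from ranking the features once and truncating the list. scikit-learn exposes this procedure through the `RFE` class in `sklearn.feature_selection`.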

2.6. Classification Method

2.6.1. Random Forest

The random forest (RF) algorithm is a highly effective machine learning algorithm. It is composed of a series of decision trees, and the classification result is determined by the vote of all decision trees, which gives it a stronger generalization ability than a single decision tree [38]. RF is trained on the training samples to build a decision model consisting of multiple decision trees.

2.6.2. Support Vector Machine

The support vector machine (SVM) algorithm is a supervised machine learning algorithm based on statistical learning theory. Its principle is to map the original vectors to a higher-dimensional space and establish a maximum-margin hyperplane in this space, that is, a separating hyperplane lying midway between two parallel hyperplanes on either side, chosen so that the distance between these two hyperplanes is maximized. The kernel function is the inner product of two vectors in a certain feature space, expressed as K(x_{i}, x_{j}) = ⟨Φ(x_{i}), Φ(x_{j})⟩ [39]. The introduction of a kernel function can better achieve nonlinear mapping. Among many kernel functions, the Gaussian radial basis function (RBF) kernel is the most widely used one. Therefore, in this paper, the SVM algorithm with the Gaussian RBF kernel was adopted for classification, and the calculation formula is as follows:

K(x_{i}, x_{j}) = \exp\left(- \frac{\| x_{i} - x_{j} \|^{2}}{2 \delta^{2}}\right)

(11)
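
Equation (11) can be evaluated directly, as in the short sketch below. The function name and the default width `delta=1.0` are assumptions; the kernel width used in the study is not stated.

```python
import math


def rbf_kernel(x_i, x_j, delta=1.0):
    """Gaussian RBF kernel: exp(-||x_i - x_j||^2 / (2 * delta^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-squared_distance / (2.0 * delta ** 2))
```

The kernel equals 1 for identical feature vectors and decays toward 0 as they move apart, with `delta` controlling how quickly similarity falls off.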

2.6.3. K-Nearest Neighbor

The K-nearest neighbor (KNN) algorithm is an instance-based learning method and is often considered one of the simplest machine learning algorithms. The idea of the method is very simple and intuitive: if the majority of the K most similar samples in the feature space of a sample belong to a certain category, then that sample also belongs to that category and has the characteristics of the samples in that category [40].
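
The majority-vote rule can be written in a few lines. This sketch assumes Euclidean distance and k = 3, neither of which is specified in the text, and the function name is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest samples.

    Distance is Euclidean (an assumption). If two labels tie in the vote,
    the label that appears first among the nearest neighbors is returned.
    """
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]
```

Because the vote depends on raw distances, features on very different scales are usually standardized before KNN is applied.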

2.6.4. eXtreme Gradient Boosting

The eXtreme Gradient Boosting (XGBoost) algorithm is a novel machine learning algorithm. Its algorithmic process consists of two parts: learning and inference. The goal of the learning part is to minimize the loss function. Specifically, it requires the prediction error to be as small as possible while keeping the decision tree complexity as low as possible, and it calculates the prediction results of each sample to obtain the probability that the sample belongs to each category. The inference part is based on the decision tree sequence derived from the learning part. First, the sample is passed from the root node toward the leaf nodes of each decision tree in the sequence for logical judgment. At a non-leaf node, the sample is assigned to the left or right child node; at a leaf node, the leaf score is calculated and the sample is passed to the next decision tree. Second, the predicted values given by all decision trees are summed to obtain the probability that the sample is classified as 1. Finally, a threshold function is used to determine the final category to which the sample belongs [41].

2.7. Classification Accuracy Evaluation Method

In this study, a sample was selected by combining a field survey and visual interpretation. Forty percent of all samples were chosen as validation samples, and the accuracy was validated using the confusion matrix, which is a commonly used method for quantifying classification accuracy. The metrics used to measure the accuracy included the overall accuracy (OA), the producer’s accuracy (PA), the user’s accuracy (UA), and the kappa coefficient. The calculation formulas and descriptions of the indicators are shown in Table 6.
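
The four metrics can be computed from a confusion matrix as sketched below, using their standard definitions (Table 6 is not reproduced here, so agreement with its exact formulas is assumed). The function name and the convention that rows are reference classes and columns are predicted classes are also assumptions.

```python
def accuracy_metrics(matrix):
    """OA, per-class PA and UA, and kappa from a square confusion matrix.

    matrix[i][j] = number of samples of reference class i predicted as class j.
    Every class is assumed to have at least one reference and one predicted sample.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(k)]
    row_totals = [sum(matrix[i]) for i in range(k)]  # reference totals
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # predicted totals

    overall = sum(diagonal) / n
    producers = [diagonal[i] / row_totals[i] for i in range(k)]
    users = [diagonal[i] / col_totals[i] for i in range(k)]
    # Chance agreement expected from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    kappa = (overall - expected) / (1.0 - expected)
    return {"OA": overall, "PA": producers, "UA": users, "kappa": kappa}
```

For a two-class matrix [[45, 5], [10, 40]], the overall accuracy is 85/100 = 0.85, the chance agreement is (50 × 55 + 50 × 45)/100² = 0.5, and kappa is therefore (0.85 − 0.5)/(1 − 0.5) = 0.7.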

3. Results

3.1. Species Diversity of Tree Layers at Different Altitudes

In Table 7, it can be observed that the number of species in the study area at an altitude of 1600 m was 32. There were 23 species at the 1800 m altitude, and only 10 species at the 2000 m altitude. The number of species exhibited a decreasing trend with the increasing altitude.

Furthermore, the three diversity indices, namely, the Shannon–Wiener index, Simpson index, and Pielou evenness index, also showed gradual decreases with increasing altitude. The highest values were recorded in the study area at 1600 m, with values of 1.9725, 0.6999, and 0.5692, respectively. The values for the 1800 m altitude were 1.6126, 0.6591, and 0.5143, respectively. The lowest values were recorded at the 2000 m altitude, with values of 1.1382, 0.5133, and 0.4943, respectively.

In this study, the importance of each species in the communities of the three study areas was calculated according to the importance value calculation formula, and the top ten tree species ranked by the importance value in the different study areas are summarized in Table 8. It can be seen that the coniferous species Pinus armandii ranked first in the importance value in all three study areas, and the importance value increased gradually with the increase in altitude. Larix gmelinii was the second most important species, and its importance value also showed a trend of increasing with altitude. The importance of broad-leaved species Betula platyphylla showed a trend of increasing first and then decreasing. The importance of other broad-leaved tree species, such as Carpinus cordata, gradually decreased. It can be concluded that with the increase in altitude, the ecological environment is more suitable for the survival of coniferous species, such as Pinus armandii and Larix gmelinii, which are resistant to low-temperature conditions.

3.2. Results of the Importance of Tree Species Characteristics at Different Altitudes

The importance of the extracted features for each elevation was analyzed using the RFE feature selection method. Ultimately, 36 features were selected as the optimal feature variables for tree species classification in each elevation study area. The optimal feature variables selected for each elevation are shown in Figure 3.
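A minimal sketch of recursive feature elimination with scikit-learn is given below. It runs on synthetic data, and the base estimator (a random forest), the number of candidate features (60), and the elimination step are assumptions for illustration; only the target of 36 retained features comes from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic stand-in for the per-object feature table (assumed shape: 300 x 60).
X, y = make_classification(n_samples=300, n_features=60, n_informative=12,
                           n_redundant=10, n_classes=4, random_state=0)

# Assumed base estimator; RFE repeatedly drops the least important features.
estimator = RandomForestClassifier(n_estimators=50, random_state=0)
selector = RFE(estimator, n_features_to_select=36, step=2).fit(X, y)

# Indices of the retained features and their importances in the final model.
selected = [i for i, keep in enumerate(selector.support_) if keep]
importances = selector.estimator_.feature_importances_
ranked = sorted(zip(selected, importances), key=lambda t: t[1], reverse=True)
print(ranked[:5])
```

Because the final estimator is refitted on the retained features only, its importances sum to 1 over those 36 features, which is consistent in scale with the scores of roughly 0.04 to 0.06 reported for the top-ranked features.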

Analyzing the importance of the selected features, it was found that for the study area at an elevation of 1600 m (Figure 3a), IPVI, SAVI, NDVI, RVI, and Blue had higher importance scores of 0.059169, 0.057463, 0.055815, 0.055256, and 0.051124, respectively. In the study area at an elevation of 1800 m (Figure 3b), SAVI, B3_Mean, RVI, and Blue had higher importance scores of 0.050658, 0.050043, 0.046915, and 0.04652, respectively. In the study area at an elevation of 2000 m (Figure 3c), Red, NDVI, B3_Mean, and SAVI had higher importance scores of 0.049911, 0.045405, 0.04376, and 0.042695, respectively. It can be observed that SAVI played a crucial role in the classification of tree species in all three elevation study areas. Furthermore, the feature variables extracted from the LiDAR data were found to be of low importance and ranked relatively low for the classification of tree species in all three elevation study areas. Among the 36 features selected using the RFE algorithm that ranked high in importance for each study area, the numbers of features accounted for by LiDAR data were 18 for the study area at an elevation of 1600 m, 5 for the study area at an elevation of 1800 m, and 6 for the study area at an elevation of 2000 m.

From the screened optimal feature variables, several features with high importance for tree species classification at each altitude were selected, and the changes in their importance across the study areas were analyzed (Figure 4).

From the comprehensive analysis in Figure 4a, it can be seen that among the raw band features, except for the NIR band, the classification importance of the other three bands gradually decreased from the low- to high-altitude areas. All of the raw band features had the highest importance for tree species classification at 1600 m above sea level.

Figure 4b shows that the four screened vegetation index features all had their highest importance for tree species classification in the study area at 1600 m. However, at 1800 m, the NDVI was the least important of the four vegetation indices, whereas at 2000 m it was the most important, showing a trend of first decreasing and then increasing.
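For reference, the four vegetation indices named in Section 3.2 can be computed from red and near-infrared reflectance as sketched below. The formulas are the standard definitions (with the usual soil adjustment factor L = 0.5 for SAVI); it is an assumption that the study used exactly these forms, and the reflectance values are hypothetical.

```python
def vegetation_indices(nir, red, soil_factor=0.5):
    """Standard NDVI, RVI, SAVI, and IPVI from NIR and red reflectance."""
    ndvi = (nir - red) / (nir + red)
    rvi = nir / red
    savi = (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)
    ipvi = nir / (nir + red)  # equivalent to (NDVI + 1) / 2
    return {"NDVI": ndvi, "RVI": rvi, "SAVI": savi, "IPVI": ipvi}


# Hypothetical mean reflectance of one segmented object.
vi = vegetation_indices(nir=0.45, red=0.05)
print({name: round(value, 4) for name, value in vi.items()})
```

Since IPVI is a linear rescaling of NDVI, the two carry the same information for a tree-based classifier, so their importance scores can trade off against each other.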

The results in Figure 4c show that the four selected texture features were the least important for tree species classification in the study area at 1600 m above sea level. In addition, B1_Mean, B3_Mean, and B4_VA had the highest importance for tree species classification in the 1800 m altitude study area, but B2_CO had the highest importance for identifying tree species in the 2000 m altitude study area.

As seen in Figure 4d, the importance of the four LiDAR features screened for tree species classification at each altitude showed a decreasing and then increasing trend with increasing altitude. Zmean, Zmax, CHM, and H90 were all of the lowest importance for tree species classification at an elevation of 1800 m.

Due to the different altitudes, the tree species composition varies in each region. The results in Section 3.1 indicate that coniferous species gradually dominate with increasing altitude. Therefore, combined with the results in Figure 4a–d, it can be observed that the original band features and vegetation index features performed better in classifying broad-leaved tree species, while the texture features performed better in classifying coniferous tree species.

3.3. Classification Results of Optimal Characteristics of Tree Species at Different Altitudes

The filtered optimal features were combined with four machine learning classifiers to classify tree species in the three study areas of different altitudes. The results for each study area are shown in Table 9.

It can be seen that the best classification results were produced for tree species in all study areas using the XGBoost algorithm in combination with the filtered optimal features. The study area at an altitude of 1800 m had the highest accuracy, with an accuracy of 88.24% and a kappa coefficient of 0.86, followed by the study area at an altitude of 1600 m, and the study area at 2000 m had the lowest accuracy. The SVM algorithm had the second highest accuracy after XGBoost, followed by the RF algorithm, and finally the KNN algorithm. It is noteworthy that all algorithms achieved the highest classification accuracy in the 1800 m study area. In summary, the XGBoost classifier had the best results for the classification of the three altitude tree species. Therefore, this study next focused on analyzing the results of classifying each altitude tree species using the XGBoost classifier and classifying the three altitude study areas for mapping.
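The comparison procedure can be sketched as follows on synthetic data, using the 60/40 training/validation split described in the methods. Everything here is an assumption for illustration: the hyperparameters are library defaults, the data are synthetic, and scikit-learn's `GradientBoostingClassifier` is used as a stand-in for XGBoost so that the example depends on one library only. It does not reproduce Table 9.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 400 objects, 36 selected features, 4 species classes.
X, y = make_classification(n_samples=400, n_features=36, n_informative=10,
                           n_classes=4, random_state=0)
# 40% of the samples are held out for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0)

models = {
    # SVM and KNN are distance-based, so features are standardized first.
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "GB (stand-in for XGBoost)": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_val)
    results[name] = (accuracy_score(y_val, pred), cohen_kappa_score(y_val, pred))

for name, (oa, kappa) in results.items():
    print(f"{name}: OA={oa:.4f}, kappa={kappa:.4f}")
```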

3.4. Evaluation of Classification Accuracy Based on XGBoost for Different Altitude Tree Species

The XGBoost algorithm was combined with the screened optimal features to classify tree species in the three altitude study areas. The confusion matrices and classification maps for each study area are presented below.

3.4.1. Altitude 1600 m

As shown in Table 10, in the 1600 m study area, the user accuracy (UA) of coniferous species ranged from 81.82% (Larix gmelinii) to 100% (Cunninghamia lanceolata), whereas the UA of broad-leaved species ranged from 53.13% (Betula platyphylla) to 94.29% (Betula albosinensis), indicating a higher classification accuracy for coniferous species than for broad-leaved species. The producer accuracy (PA) likewise indicated better accuracy for coniferous species: the PA of coniferous species ranged from 97.37% (Cunninghamia lanceolata) to 100% (Pinus armandii), while that of broad-leaved species ranged from 75% (Betula albosinensis) to 89.47% (Betula platyphylla). The classification results are shown in Figure 5.

3.4.2. Altitude 1800 m

As shown in Table 11, in the 1800 m study area, the classification accuracy of coniferous species was also higher than that of broad-leaved species. The user accuracy (UA) of coniferous species ranged from 94.78% (Abies fargesii) to 98.21% (Pinus armandii), while the UA of broad-leaved species ranged from 57.14% (Carpinus cordata) to 86.84% (Cyclobalanopsis glauca), and the UA of the mixed category “other species” was 85.71%. This was mainly due to the confusion of Carpinus cordata, Betula albosinensis, and Cyclobalanopsis glauca with “other species.” The classification results for this altitude are shown in Figure 6.

3.4.3. Altitude 2000 m

As shown in Table 12, unlike in the previous two study areas, only two dominant conifer species, Pinus armandii and Larix gmelinii, were screened at an altitude of 2000 m, with user accuracy (UA) values of 100% and 81.82%, respectively. There was also serious confusion among the broad-leaved species. For example, the UA of Cyclobalanopsis glauca was only 33.33%, as it was easily confused with Fargesia spathacea, Quercus aliena var. acutiserrata, and other species. In addition, Larix gmelinii was easily confused with several broad-leaved species. Figure 7 shows the tree species classification results.

4. Discussion

4.1. Effects of Changes in Species Diversity at Different Altitudes on Classification Results

The community composition of the forest changes at different altitudes, leading to variations in species diversity and the importance of tree species [42]. In this study, the number of tree species decreased gradually with the increasing altitude, with 32, 23, and 10 tree species at the low, middle, and high altitudes, respectively. This suggests that high-altitude forests have a relatively low number of species and a simple community structure, mainly consisting of pure forests dominated by a few tree species. In contrast, low-altitude forests have rich species diversity and a complex community structure. A previous study showed that complex forest community structures increased the difficulty of fine classification of tree species [43]. However, the present study yielded different results, as among the three altitudinal study areas, the 1800 m study area had the best tree species classification results. This is because there are inherent differences among the dominant tree species in the 1800 m altitude study area, and each remote sensing feature extracted can effectively reflect these differences, resulting in the best differentiation between tree species. Although the composition of forest species in the study area at 1600 m is more complex than that in the 1800 m study area, the extracted features are easily confused with those of the dominant species. The main reason for the lowest accuracy of tree species classification in the study area at a 2000 m altitude is the small differences among the tree species themselves, resulting in little difference in the extracted remote sensing features, which cannot be easily distinguished.

4.2. Effects of Different Remote Sensing Features on Classification Results

Many studies have shown that the raw spectral features of high-resolution multispectral images can be used to classify tree species with high accuracy [44]. In this study, it was also observed from the selected optimal features at different altitudes (Figure 3a–c) that four original spectral bands were important for tree species classification at each altitude. However, the accuracy of tree species classification in the 2000 m altitude study area with fewer species was the highest, while for areas with more species, the classification accuracy was relatively lower. Using raw band features alone is insufficient to reflect the forest structure on the different altitude gradients, and the accuracy of tree species classification for different altitudes is also somewhat unstable.

The six vegetation indices used in this study primarily expressed differences in the interleaf and canopy structure among the tree species at each altitude and were shown to be important for tree species classification in a previous study [10]. This is consistent with the results of the current study, where vegetation indices played an important role in identifying the tree species at the three different altitudes. However, relying solely on vegetation indices cannot be applied to classify all types of forests. For instance, in this study, for the study area at 2000 m above sea level, where the forest structure is relatively simple, the extracted vegetation index feature values were relatively similar when the tree species were similar, leading to confusion and subsequently affecting the overall accuracy of tree species classification. Therefore, it is necessary to extract other remote sensing features to supplement the classification [45].

Studies have shown that the involvement of texture features in tree species classification using high-resolution aerial imagery determines whether the overall accuracy is improved or not [46], which is consistent with the results of this study. The optimal results at each altitude indicate that different texture feature metrics calculated in the NIR band are highly important for tree species identification in this study area and play a significant role in distinguishing the variability between species, thereby improving the accuracy of tree species classification. Therefore, it has been suggested that the vertical structure features of tree species should be incorporated to increase the classification accuracy of tree species [19,47].

LiDAR features can reflect differences in the vertical structure of forests. Hovi et al. [48] demonstrated that LiDAR data can be used to extract tree height information, which can be used to stratify forest stands into different layers and classify tree species in each layer, thus improving the overall tree species classification accuracy. However, in this study, the results of the optimal feature screening showed that the importance of the extracted LiDAR features for tree species classification at each altitude was low. This was mainly due to the low density of the LiDAR data points obtained in this study. Previous research has shown that the quality of the feature raster generated from LiDAR data has a direct impact on the accuracy of tree species classification, and point cloud density has a direct impact on the quality of the generated raster [49,50]. Therefore, for more accurate tree species refinement classification, higher point density LiDAR data should be used.

4.3. The Effect of Different Classifiers on the Results of Each Classification

Among the machine learning classifiers used in this study, the XGBoost algorithm showed the best classification results at all three altitudes, with classification accuracies of 87.63%, 88.24%, and 84.03% at the altitudes of 1600 m, 1800 m, and 2000 m, respectively. The main reason for the different performances of the four classifiers is their different capabilities. First, the classification accuracy of the KNN algorithm is strongly influenced by the training samples; in particular, the unbalanced number of samples among tree species in the different altitude study areas can easily lead to misclassification [43]. Secondly, RF is essentially composed of multiple decision trees, and its result is obtained by voting on the N individual classification results. However, for decision trees with multiple categories, errors generated at an upper layer can propagate to the lower layers, leading to less satisfactory classification results [51]. Thirdly, the SVM algorithm rests on a relatively simple idea: it constructs a separating hyperplane (the one with the largest margin) to classify the samples. Although this can reduce misclassification, it has also been shown that when the training samples are not uniform and the feature types change greatly among the different altitude study areas, training becomes more difficult and the classification effect is not as good as that of other classifiers [5]. In contrast, the XGBoost algorithm has better classification performance, a simpler model for training, and is less sensitive to the training data than the other three algorithms. Therefore, it exhibited the best classification accuracy [52].

5. Conclusions

This study demonstrated the effectiveness of combining multiple remote sensing data sources with machine learning algorithms for classifying tree species at different elevations in Shennongjia. The following conclusions were drawn: the diversity of the tree layer decreases with increasing elevation in the different elevation research areas. When classifying the three common tree species (Pinus armandii, Larix chinensis, and Betula platyphylla) in the three elevation gradients, the classification accuracy of the same species varies at different elevations. Among them, the average producer accuracy and user accuracy of Pinus armandii were 97.06%, 99.11%, and 100% at elevations of 1600 m, 1800 m, and 2000 m, respectively; the average accuracy rates of Larix chinensis were 90.91%, 90.28%, and 83.77%; and the average accuracy rates of Betula platyphylla were 84.65%, 85.00%, and 91.90%. The results indicate that coniferous species perform better than broad-leaved species when using remote sensing data features to classify tree species at different elevations. Redundant features can be eliminated through feature selection, and combining the optimal features with four machine learning classifiers can achieve high accuracy in tree species classification. Among them, using the XGBoost classifier in combination with the optimal features yielded the highest overall classification accuracy for tree species at elevations of 1600 m, 1800 m, and 2000 m, which were 87.63%, 88.24%, and 84.03%, respectively. These research results can provide guidance for tree species classification applications.

Author Contributions

Conceptualization, Y.D.; data curation, X.L.; formal analysis, H.L.; funding acquisition, Y.D.; investigation, H.L., X.L., and H.C.; methodology, X.L. and Z.H.; project administration, Y.D.; resources, H.C. and Y.D.; software, H.L.; supervision, Y.D.; validation, X.L.; visualization, H.L.; writing—original draft, H.L.; writing—review and editing, H.L. and X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Open Research Fund of Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences (No. 2022LDE006).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. De Frenne, P.; Lenoir, J.; Luoto, M.; Scheffers, B.R.; Zellweger, F.; Aalto, J.; Ashcroft, M.B.; Christiansen, D.M.; Decocq, G.; De Pauw, K.; et al. Forest Microclimates and Climate Change: Importance, Drivers and Future Research Agenda. Glob. Change Biol. 2021, 27, 2279–2297.
2. Ma, S.; Qiao, Y.-P.; Wang, L.-J.; Zhang, J.-C. Terrain Gradient Variations in Ecosystem Services of Different Vegetation Types in Mountainous Regions: Vegetation Resource Conservation and Sustainable Development. For. Ecol. Manag. 2021, 482, 118856.
3. Felipe-Lucia, M.R.; Soliveres, S.; Penone, C.; Manning, P.; van der Plas, F.; Boch, S.; Prati, D.; Ammer, C.; Schall, P.; Gossner, M.M.; et al. Multiple Forest Attributes Underpin the Supply of Multiple Ecosystem Services. Nat. Commun. 2018, 9, 4839.
4. Yan, S.; Jing, L.; Wang, H. A New Individual Tree Species Recognition Method Based on a Convolutional Neural Network and High-Spatial Resolution Remote Sensing Imagery. Remote Sens. 2021, 13, 479.
5. Wang, Y.; Wang, J.; Chang, S.; Sun, L.; An, L.; Chen, Y.; Xu, J. Classification of Street Tree Species Using UAV Tilt Photogrammetry. Remote Sens. 2021, 13, 216.
6. Dian, Y.; Li, Z.; Pang, Y. Spectral and Texture Features Combined for Forest Tree Species Classification with Airborne Hyperspectral Imagery. J. Indian Soc. Remote Sens. 2015, 43, 101–107.
7. Deepak, M.; Keski-Saari, S.; Fauch, L.; Granlund, L.; Oksanen, E.; Keinänen, M. Leaf Canopy Layers Affect Spectral Reflectance in Silver Birch. Remote Sens. 2019, 11, 2884.
8. Brilli, L.; Chiesi, M.; Maselli, F.; Moriondo, M.; Gioli, B.; Toscano, P.; Zaldei, A.; Bindi, M. Simulation of Olive Grove Gross Primary Production by the Combination of Ground and Multi-Sensor Satellite Data. Int. J. Appl. Earth Obs. Geoinf. 2013, 23, 29–36.
9. Ferreira, M.P.; Zortea, M.; Zanotta, D.C.; Shimabukuro, Y.E.; de Souza Filho, C.R. Mapping Tree Species in Tropical Seasonal Semi-Deciduous Forests with Hyperspectral and Multispectral Data. Remote Sens. Environ. 2016, 179, 66–78.
10. Xu, Z.; Shen, X.; Cao, L.; Coops, N.C.; Goodbody, T.R.H.; Zhong, T.; Zhao, W.; Sun, Q.; Ba, S.; Zhang, Z.; et al. Tree Species Classification Using UAS-Based Digital Aerial Photogrammetry Point Clouds and Multispectral Imageries in Subtropical Natural Forests. Int. J. Appl. Earth Obs. Geoinf. 2020, 92, 102173.
11. Dong, Y.; Dong, M.; Shan, Y. Tree Species Recognition Based on Hyperspectral Remote Sensing. J. North China Univ. Sci. Technol. 2020, 42, 11–16.
12. Dabiri, Z.; Lang, S. Comparison of Independent Component Analysis, Principal Component Analysis, and Minimum Noise Fraction Transformation for Tree Species Classification Using APEX Hyperspectral Imagery. IJGI 2018, 7, 488.
13. Kaasalainen, S.; Holopainen, M.; Karjalainen, M.; Vastaranta, M.; Kankare, V.; Karila, K.; Osmanoglu, B. Combining Lidar and Synthetic Aperture Radar Data to Estimate Forest Biomass: Status and Prospects. Forests 2015, 6, 252–270.
14. Zhao, K.; Suarez, J.C.; Garcia, M.; Hu, T.; Wang, C.; Londo, A. Utility of Multitemporal Lidar for Forest and Carbon Monitoring: Tree Growth, Biomass Dynamics, and Carbon Flux. Remote Sens. Environ. 2018, 204, 883–897.
15. Gao, S.; Zhang, Z.; Cao, L. Individual Tree Structural Parameter Extraction and Volume Table Creation Based on Near-Field LiDAR Data: A Case Study in a Subtropical Planted Forest. Sensors 2021, 21, 8162.
16. Zhao, F.; Pang, Y.; Li, Z.; Zhang, H.; Feng, W.; Liu, Q. Extraction of Individual Tree Height Using a Combination of Aerial Digital Camera Imagery and LiDAR. Sci. Silvae Sin. 2009, 45, 81–87.
17. Cățeanu, M.; Ciubotaru, A. The Effect of LiDAR Sampling Density on DTM Accuracy for Areas with Heavy Forest Cover. Forests 2021, 12, 265.
18. Jiaxin, K.; Zhaochen, Z.; Jian, Z. Classification and identification of plant species based on multi-source remote sensing data: Research progress and prospect. Biodivers. Sci. 2019, 27, 796–812.
19. Pu, R.; Landry, S. Mapping Urban Tree Species by Integrating Multi-Seasonal High Resolution Pléiades Satellite Imagery with Airborne LiDAR Data. Urban For. Urban Green. 2020, 53, 126675.
20. Chi, D.; Degerickx, J.; Yu, K.; Somers, B. Urban Tree Health Classification Across Tree Species by Combining Airborne Laser Scanning and Imaging Spectroscopy. Remote Sens. 2020, 12, 2435.
21. Shi, Y.; Wang, T.; Skidmore, A.K.; Heurich, M. Improving LiDAR-Based Tree Species Mapping in Central European Mixed Forests Using Multi-Temporal Digital Aerial Colour-Infrared Photographs. Int. J. Appl. Earth Obs. Geoinf. 2020, 84, 101970.
22. Hartling, S.; Sagan, V.; Maimaitijiang, M. Urban Tree Species Classification Using UAV-Based Multi-Sensor Data Fusion and Machine Learning. GIScience Remote Sens. 2021, 58, 1250–1275.
23. Peng, X.; Liu, H.; Chen, Y.; Chen, Q.; Wang, J.; Li, H.; Zhao, A. A Method to Identify Dacrydium Pierrei Hickel Using Unmanned Aerial Vehicle Multi-Source Remote Sensing Data in a Chinese Tropical Rainforest. J. Indian Soc. Remote Sens. 2022, 50, 25–35.
24. Man, Q.; Dong, P.; Yang, X.; Wu, Q.; Han, R. Automatic Extraction of Grasses and Individual Trees in Urban Areas Based on Airborne Hyperspectral and LiDAR Data. Remote Sens. 2020, 12, 2725.
25. Sha, G.; Xin, S.; Dai Jinsong, C.L. Tree Species Classification in Urban Forests based on LiDAR Point Cloud Segmentation and Hyperspectral Metrics Extraction. Remote Sens. Technol. Appl. 2018, 33, 1073–1083.
26. Shen, X.; Cao, L. Tree-Species Classification in Subtropical Forests Using Airborne Hyperspectral and LiDAR Data. Remote Sens. 2017, 9, 1180.
27. Yin, D.; Wang, L. Individual Mangrove Tree Measurement Using UAV-Based LiDAR Data: Possibilities and Challenges. Remote Sens. Environ. 2019, 223, 34–49.
28. López-Angulo, J.; Pescador, D.S.; Sánchez, A.M.; Mihoč, M.A.K.; Cavieres, L.A.; Escudero, A. Determinants of High Mountain Plant Diversity in the Chilean Andes: From Regional to Local Spatial Scales. PLoS ONE 2018, 13, e0200216.
29. The PLoS ONE Staff. Correction: Integrating the Effects of Latitude and Altitude on the Spatial Differentiation of Plant Community Diversity in a Mountainous Ecosystem in China. PLoS ONE 2017, 12, e0176866.
30. Wang, M.; Zhu, D.; Wang, D.; Yang, Y. Comparative Study of Inpho and apMatrix in UAV Remote Sensing Data Processing. J. Anhui Agri 2016, 44, 264–267.
31. Zhang, K.; Chen, S.-C.; Whitman, D.; Shyu, M.-L.; Yan, J.; Zhang, C. A Progressive Morphological Filter for Removing Nonground Measurements from Airborne LIDAR Data. IEEE Trans. Geosci. Remote Sens. 2003, 41, 872–882.
32. Guo, S. Application of ArcGIS Georeferencing and Spatial Analysis Tools in Analyzing Cases Involving Changed Use of Forestland. Anhui For. Sci. Technol. 2020, 46, 47–50.
33. Morris, E.K.; Caruso, T.; Buscot, F.; Fischer, M.; Hancock, C.; Maier, T.S.; Meiners, T.; Müller, C.; Obermaier, E.; Prati, D.; et al. Choosing and Using Diversity Indices: Insights for Ecological Applications from the German Biodiversity Exploratories. Ecol. Evol. 2014, 4, 3514–3524.
34. Zhang, Y.-h.; Dian, Y.-Y.; Huang, G.-T.; Liu, X.-Y.; Han, Z.-M.; Jian, Y.-F.; Li, Y.; Wang, X. Effects of spatial structure on species diversity in Pinus massoniana plantation of different succession stages. Chin. J. Ecol. 2021, 40, 2357–2365.
35. Kandare, K.; Ørka, H.O.; Dalponte, M.; Næsset, E.; Gobakken, T. Individual Tree Crown Approach for Predicting Site Index in Boreal Forests Using Airborne Laser Scanning and Hyperspectral Data. Int. J. Appl. Earth Obs. Geoinf. 2017, 60, 72–82.
36. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621.
37. Hsu, P.-H. Feature Extraction of Hyperspectral Images Using Wavelet and Matching Pursuit. ISPRS J. Photogramm. Remote Sens. 2007, 62, 78–92.
38. Rodriguez-Galiano, V.F.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J.P. An Assessment of the Effectiveness of a Random Forest Classifier for Land-Cover Classification. ISPRS J. Photogramm. Remote Sens. 2012, 67, 93–104.
39. Huang, G.-B.; Zhou, H.; Ding, X.; Zhang, R. Extreme Learning Machine for Regression and Multiclass Classification. IEEE Trans. Syst. Man Cybern. B 2012, 42, 513–529.
40. Noi, T.P.; Kappas, M. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery. Sensors 2017, 18, 18.
41. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
42. Chen, L.; Huang, J.-G.; Ma, Q.; Hänninen, H.; Rossi, S.; Piao, S.; Bergeron, Y. Spring Phenology at Different Altitudes Is Becoming More Uniform under Global Warming in Europe. Glob. Change Biol. 2018, 24, 3969–3975.
43. Yanshuang, W.; Xiaoli, Z. Object-oriented tree species classification with multi-scale texture features based on airborne hyperspectral images. J. Beijing For. Univ. 2020, 42, 91–101.
44. Li, X.; Li, H.; Chen, D.; Liu, Y.; Liu, S.; Liu, C.; Hu, G. Multiple Classifiers Combination Method for Tree Species Identification Based on GF-5 and GF-6. Sci. Silvae Sin. 2020, 56, 93–104.
45. Dai, P.; Ding, L.; Liu, L.; Dong, L.; Huang, Y. Tree Species Identification Based on FCN Using the Visible Images Obtained from an Unmanned Aerial Vehicle. Laser Optoelectron. Prog. 2020, 57, 36–45.
46. Alvarez-Taboada, F.; Paredes, C.; Julián-Pelaz, J. Mapping of the Invasive Species Hakea Sericea Using Unmanned Aerial Vehicle (UAV) and WorldView-2 Imagery and an Object-Oriented Approach. Remote Sens. 2017, 9, 913.
47. Chen, X.; Yun, T.; Xue, F.; Liu, Y. Classification of Tree Species Based on LiDAR Point Cloud Data. Laser Optoelectron. Prog. 2019, 56, 203–214.
48. Hovi, A.; Korhonen, L.; Vauhkonen, J.; Korpela, I. LiDAR Waveform Features for Tree Species Classification and Their Sensitivity to Tree- and Acquisition Related Parameters. Remote Sens. Environ. 2016, 173, 224–237.
49. Peng, X.; Zhao, A.; Chen, Y.; Chen, Q.; Liu, H. Tree Height Measurements in Degraded Tropical Forests Based on UAV-LiDAR Data of Different Point Cloud Densities: A Case Study on Dacrydium Pierrei in China. Forests 2021, 12, 328.
50. Wang, K.K.; Zheng, X.D.; Lai, X.D. Relationship Between Airborne LiDAR Point Cloud Density and DEM Product Accuracy. J. Geomat. 2021, 46, 78–82.
51. Cao, J.; Leng, W.; Liu, K.; Liu, L.; He, Z.; Zhu, Y. Object-Based Mangrove Species Classification Using Unmanned Aerial Vehicle Hyperspectral Images and Digital Surface Models. Remote Sens. 2018, 10, 89.
52. Xu, Y.; Zhen, J.N.; Jiang, X.P.; Wang, J.J. Mangrove species classification with UAV-based remote sensing data and XGBoost. J. Remote Sens. 2021, 25, 737–752.

Figure 1. Study area location.

Figure 2. Distribution of samples in different altitude study areas.

Figure 4. Comparison of importance values of optimal features at different altitudes: (a) Original band features; (b) Vegetation index features; (c) Texture features; (d) LiDAR point cloud features.

Figure 5. Classification results of tree species at altitude 1600 m based on XGBoost.

Figure 6. Classification results of tree species at altitude 1800 m based on XGBoost.

Figure 7. Classification results of tree species at altitude 2000 m based on XGBoost.

Table 1. Airborne integrated sensor system detailed parameters.

Airborne LiDAR: Leica ALS80-HP
Max. flight height	3500 m	Max. pulse frequency	1000 kHz
Min. flight height	100 m	Scan method	Sine, triangle, parallel
View field angle	0–72°	Max. scan frequency	200 Hz, 158 Hz, 120 Hz
Laser pulse width	400 ps	Density echo count	3
Airborne Multi-Spectrum
Along field of view angle	48.2°	Along Pixels	6708
Across field of view angle	61.7°	Across Pixels	8956
Spectral band	R, G, B, NIR	Color depth	14 bit
Focal length	45.0 mm	Pixel size	5.2
Frame rate	1.9 s	Storage	6.4 TByte

Table 2. Survey information on tree species in sample plots at different altitudes.

Altitude (m)	Species Number	Tree Number	Average DBH	Average Height
1600	32	400	16.18 cm	14.62 m
1800	23	502	17.88 cm	13.82 m
2000	10	439	18.92 cm	13.68 m

Table 3. Optimal combinations of segmentation parameters.

Altitude (m)	Band Weighting	Scale Parameters	Shape Factor	Compactness
1600	1, 1, 1, 1	300	0.4	0.7
1800	1, 1, 1, 1	350	0.5	0.7
2000	1, 1, 1, 1	260	0.4	0.7

Table 4. Descriptions of texture feature statistics.

Texture Feature	Abbreviation	Formula	Description
Mean	ME	$ME = \sum_{i} \sum_{j} i \cdot P(i, j)$	Indicates the average gray level of the image.
Variance	VA	$VA = \sum_{i} \sum_{j} (i - \mu)^{2} P(i, j)$	Reflects the degree of gray-level variation in the remote sensing image.
Homogeneity	HO	$HO = \sum_{i} \sum_{j} \frac{P(i, j)}{1 + (i - j)^{2}}$	Reflects local homogeneity in remote sensing images.
Contrast	CO	$CO = \sum_{i} \sum_{j} (i - j)^{2} P(i, j)$	Reflects the clarity of the image and the depth of texture grooves.
Dissimilarity	DI	$DI = \sum_{i} \sum_{j} |i - j| \, P(i, j)$	Represents local regional texture features in remote sensing images.
Entropy	EN	$EN = -\sum_{i} \sum_{j} P(i, j) \log P(i, j)$	A randomness measure representing the amount of information contained in an image.
Second moment	SM	$SM = \sum_{i} \sum_{j} P(i, j)^{2}$	Measures the uniformity of the gray-level distribution and the texture of the remote sensing image.
Correlation	CR	$CR = \sum_{i} \sum_{j} \frac{(i - \mu_{x})(j - \mu_{y}) P(i, j)}{\sigma_{x} \sigma_{y}}$	Indicates the similarity of image gray levels.

Note: $P(i, j)$ denotes the probability $P(g_{1}, g_{2})$ of the gray-level pair in the gray-level co-occurrence matrix; $\mu_{x}$, $\mu_{y}$ and $\sigma_{x}$, $\sigma_{y}$ are the means and standard deviations of the rows and columns, respectively.
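
As an illustration of the statistics in Table 4, the following is a minimal sketch that evaluates them on a normalized co-occurrence matrix with NumPy. It is not the software used in the study. The function name `glcm_features` is hypothetical, gray levels are assumed to be indexed from 0, the row mean is used as $\mu$ in the variance, and homogeneity is taken in its usual inverse-difference form, $P(i, j)$ divided by $1 + (i - j)^{2}$.

```python
import numpy as np

def glcm_features(P):
    """Texture statistics of Table 4 for a co-occurrence matrix P."""
    P = np.asarray(P, dtype=float)
    P = P / P.sum()                      # make sure the entries sum to 1
    i, j = np.indices(P.shape)           # gray levels indexed from 0 (assumption)
    mu_x, mu_y = (i * P).sum(), (j * P).sum()
    sd_x = np.sqrt((((i - mu_x) ** 2) * P).sum())
    sd_y = np.sqrt((((j - mu_y) ** 2) * P).sum())
    nz = P > 0                           # skip empty cells so log(0) never occurs
    return {
        "ME": mu_x,
        "VA": (((i - mu_x) ** 2) * P).sum(),   # mu taken as the row mean (assumption)
        "HO": (P / (1.0 + (i - j) ** 2)).sum(),
        "CO": (P * (i - j) ** 2).sum(),
        "DI": (np.abs(i - j) * P).sum(),
        "EN": -(P[nz] * np.log(P[nz])).sum(),
        "SM": (P ** 2).sum(),
        "CR": ((i - mu_x) * (j - mu_y) * P).sum() / (sd_x * sd_y),
    }

# Toy 2-level matrix: every pixel pair shares the same gray level.
features = glcm_features([[0.5, 0.0], [0.0, 0.5]])
```

For this toy matrix the contrast and dissimilarity are 0 and the homogeneity and correlation are 1, as expected for a texture in which neighbouring pixels never differ. The correlation is undefined for a single-gray-level window because both standard deviations are then zero.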

Table 5. LiDAR characteristics and variable descriptions.

Characteristic Variable	Abbreviation	Description
Canopy height model	CHM	Maximum height of the point cloud above ground.
Mean canopy height	Z_mean	Average height of the point cloud above ground.
Percentage of mean canopy height	Z_above_zm	Percentage of all canopy points that lie above the mean canopy height.
Percentage of crown base height	Z_above	Proportion of points that lie above the base height of the tree canopy.
Percentile of point cloud height	H_5, H_10, H_15, …, H_90, H_95	The 5th, 10th, 15th, …, 90th, and 95th height percentiles; the value of each raster cell is calculated from all points in the cell.
Point cloud return density	D_1, D_2, D_3, …, D_8, D_9	The points higher than 0.5 m in each raster cell are divided into 9 layers by height, and the number of points in each layer is divided by the total number of points in the cell.
Point cloud height distribution statistics	Z_max, Z_sd, Z_skew, Z_kurt, Z_en	Maximum height, standard deviation, skewness, kurtosis, and information entropy of all canopy points.
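
The per-cell variables in Table 5 can be sketched as follows for the returns falling in one raster cell. This is only an illustrative approximation under stated assumptions, not the processing chain used in the study: the function name `cell_height_metrics` is hypothetical, returns above 0.5 m are treated as canopy, the nine density layers are taken as equal-height slices between 0.5 m and the maximum height, the kurtosis is the non-excess form, and Z_en and Z_above are omitted.

```python
import numpy as np

def cell_height_metrics(z, min_height=0.5, n_layers=9):
    """Height metrics for the LiDAR returns falling in one raster cell."""
    z = np.asarray(z, dtype=float)       # heights above ground, in metres
    canopy = z[z > min_height]           # returns treated as canopy (assumption)
    mean, sd = canopy.mean(), canopy.std()
    dev = (canopy - mean) / sd
    metrics = {
        "Z_max": canopy.max(),
        "Z_mean": mean,
        "Z_sd": sd,
        "Z_skew": (dev ** 3).mean(),
        "Z_kurt": (dev ** 4).mean(),     # non-excess kurtosis
        # share of canopy returns above the mean canopy height, in percent
        "Z_above_zm": 100.0 * (canopy > mean).mean(),
    }
    # H_5, H_10, ..., H_95: height percentiles of the canopy returns
    for p in range(5, 100, 5):
        metrics[f"H_{p}"] = np.percentile(canopy, p)
    # D_1 ... D_9: returns per height layer divided by all returns in the cell
    counts, _ = np.histogram(canopy, bins=n_layers,
                             range=(min_height, canopy.max()))
    for k, c in enumerate(counts, start=1):
        metrics[f"D_{k}"] = c / z.size
    return metrics

# Ten synthetic returns; the first one (0.2 m) is below the canopy threshold.
m = cell_height_metrics([0.2, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```

With these synthetic heights each of the nine layers holds one return, so every D_k equals 0.1, and the median height H_50 is 5 m.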

Table 6. Classification accuracy evaluation indices.

Evaluation Indicator	Formula	Description
Producer’s Accuracy (PA)	$\mathrm{PA} = \frac{X_{ii}}{X_{+i}}$	Ratio of the number of correctly classified samples of category i to the total number of true reference samples of that category; it reflects the probability that a reference sample is correctly classified.
User’s Accuracy (UA)	$\mathrm{UA} = \frac{X_{ii}}{X_{i+}}$	Ratio of the number of correctly classified samples of category i to the total number of samples assigned to that category in the classification results.
Overall Accuracy (OA)	$\mathrm{OA} = \frac{\sum_{i = 1}^{n} X_{ii}}{M}$	Percentage of correctly classified samples out of the total sample size M; the number of correctly classified samples of each category lies on the diagonal of the confusion matrix.
Kappa	$\mathrm{Kappa} = \frac{M \sum_{i = 1}^{n} X_{ii} - \sum_{i = 1}^{n} X_{i+} \cdot X_{+i}}{M^{2} - \sum_{i = 1}^{n} X_{i+} \cdot X_{+i}}$	A statistic that reflects the degree of agreement between the classification results and the actual ground feature categories and provides an objective evaluation of classification accuracy.

Table 7. Species diversity of the tree layer in study areas at different elevations.

Altitude (m)	Species Number	Shannon-Wiener	Simpson	Pielou
1600	32	1.9725	0.6999	0.5692
1800	23	1.6126	0.6591	0.5143
2000	10	1.1382	0.5133	0.4943
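
The three indices in Table 7 are linked: under the standard definition of Pielou evenness, $J = H' / \ln S$, the Pielou column follows from the Shannon-Wiener index $H'$ and the species number $S$. The short check below assumes that definition with the natural logarithm (the index formulas themselves are given in Section 2.4.1 and are not reproduced here); the function name `pielou_evenness` is illustrative.

```python
import math

# (species number S, Shannon-Wiener H', reported Pielou J) from Table 7
table7 = {
    1600: (32, 1.9725, 0.5692),
    1800: (23, 1.6126, 0.5143),
    2000: (10, 1.1382, 0.4943),
}

def pielou_evenness(shannon, n_species):
    """Pielou evenness J = H' / ln(S), assuming the natural logarithm."""
    return shannon / math.log(n_species)

recomputed = {alt: pielou_evenness(h, s) for alt, (s, h, _) in table7.items()}
for alt, (s, h, j_reported) in table7.items():
    print(alt, round(recomputed[alt], 4), j_reported)
```

The recomputed values agree with the reported Pielou column to within about 0.0001, the residual being attributable to rounding of the published Shannon-Wiener values.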

Table 8. Top ten tree species ranked by importance value at different elevations.

Altitude 1600 m		Altitude 1800 m		Altitude 2000 m
Species	Importance Value	Species	Importance Value	Species	Importance Value
P.arm	0.391	P.arm	0.416	P.arm	0.502
C.cor	0.083	B.pla	0.154	L.gme	0.151
B.alb	0.074	C.cor	0.049	Q.ali	0.068
B.pla	0.044	L.gme	0.038	C.gla	0.054
L.gme	0.029	A.far	0.037	F.spa	0.054
C.lan	0.028	Q.	0.029	B.pla	0.046
A.miy	0.024	M.gly	0.028	C.mol	0.042
C.tur	0.024	L.obt	0.027	A.fab	0.037
L.obt	0.022	C.con	0.026	C.con	0.023
D.kak	0.019	A.bat	0.022	C.seg	0.023
All	0.738	All	0.826	All	1.000

Note: P.arm (Pinus armandii); L.gme (Larix gmelinii); B.pla (Betula platyphylla); C.cor (Carpinus cordata); C. (Cunninghamia); B.alb (Betula albosinensis); A.far (Abies fargesii); Q. (Quercus); Q.ali (Quercus aliena var. acutiserrata); C.gla (Cyclobalanopsis glauca); F.spa (Fargesia spathacea); C.lan (Cunninghamia lanceolata); A.miy (Acer miyabei); C.tur (Carpinus turczaninowii); L.obt (Lindera obtusiloba); D.kak (Diospyros kaki); M.gly (Metasequoia glyptostroboides); C.con (Cornus controversa); A.bat (Amygdalus persica Batsch); C.mol (Castanea mollissima Blume); A.fab (Abies fabri); C.seg (Castanea seguinii Dode); Other (other tree species). The same abbreviations are used in the tables that follow.

Table 9. Classification results using the optimal features for tree species at different altitudes.

Algorithm	Altitude 1600 m		Altitude 1800 m		Altitude 2000 m
	OA (%)	Kappa	OA (%)	Kappa	OA (%)	Kappa
RF	86.22	0.84	87.50	0.85	81.25	0.77
SVM	86.93	0.84	87.87	0.86	83.68	0.80
KNN	82.69	0.79	84.93	0.82	82.64	0.79
XGBoost	87.63	0.85	88.24	0.86	84.03	0.81

Table 10. Classification confusion matrix of tree species at an altitude of 1600 m based on XGBoost.

Species	P.arm	L.gme	B.pla	C.cor	C.	B.alb	Other	UA (%)
P.arm	32	0	0	1	0	1	0	94.12
L.gme	0	27	0	2	0	0	4	81.82
B.pla	0	0	17	3	0	8	4	53.13
C.cor	0	0	2	43	0	0	3	89.58
C.	0	0	0	0	37	0	0	100.00
B.alb	0	0	0	0	1	33	1	94.29
Other	0	0	0	3	0	2	59	92.19
PA (%)	100.00	100.00	89.47	82.69	97.37	75.00	83.10

Table 11. Classification confusion matrix of tree species at an altitude of 1800 m based on XGBoost.

Species	P.arm	L.gme	B.pla	C.cor	A.far	Q.	Other	UA (%)
P.arm	55	0	0	0	1	0	0	98.21
L.gme	0	35	0	0	0	1	0	97.22
B.pla	0	1	34	1	2	0	2	85.00
C.cor	0	6	0	16	1	0	5	57.14
A.far	0	0	1	0	37	1	0	94.87
Q.	0	0	1	2	0	33	2	86.84
Other	0	0	4	0	0	1	30	85.71
PA (%)	100.00	83.33	85.00	84.21	90.24	91.67	76.92

Table 12. Classification confusion matrix of tree species at an altitude of 2000 m based on XGBoost.

Species	P.arm	L.gme	B.alb	Q.ali	C.gla	F.spa	Other	UA (%)
P.arm	66	0	0	0	0	0	0	100.00
L.gme	0	18	1	2	0	1	0	81.82
B.alb	0	1	55	0	0	1	0	96.49
Q.ali	0	0	1	20	0	0	0	95.24
C.gla	0	1	0	1	9	7	9	33.33
F.spa	0	1	0	0	2	29	5	78.38
Other	0	0	6	4	0	3	45	77.59
PA (%)	100.00	85.71	87.30	74.07	81.82	70.73	76.27


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

