Dominant Tree Species Classification using Remote Sensing Data and Object-Based Image Analysis

Over the last few decades, forests have suffered from over-logging and deforestation. Left uncontrolled, these activities have pushed many tree species towards endangerment. A detailed inventory of tree species is needed to manage and plan forests on a sustainable basis. Many techniques have been used to identify tree species, and over the last three decades remote sensing has been widely used to study their distribution. In this study, object-based image analysis (OBIA) combining high-resolution multispectral satellite imagery (WorldView-2, WV-2) and airborne laser scanning (LiDAR) data was tested for classifying individual tree crowns of tropical tree species in the Forest Research Institute Malaysia (FRIM) forest, Selangor. The LiDAR data were acquired from a fixed-wing aircraft carrying a Gemini Airborne Laser Terrain Mapper (ALTM), with a horizontal resolution of 0.15 m and a vertical resolution of 0.25 m. The WV-2 image has a spatial resolution of 0.5 m. Hyperspectral data, captured with a Bayspec sensor mounted on a UAV flying 220 m above the ground and with a resolution of 0.3 m, were used to extract the spectral reflectance of the tree species. The image was segmented using multi-resolution segmentation in eCognition software. Segmentation accuracy was assessed by measuring the goodness of fit (D value) between the training objects and the output segments. The overall accuracy of the segmentation was 86%. For species classification, accuracy was assessed with a confusion (error) matrix for 7 classes of tree species. The overall classification accuracy was 64%.


Introduction
Forests are among the most important natural resources for all living things. They undeniably benefit individuals and communities in social, cultural, and economic terms. Forests release large amounts of oxygen (O2) through photosynthesis, and their uptake of carbon dioxide (CO2) during photosynthesis has a cooling effect on the global climate. Besides providing habitats for animals and livelihoods for humans, forests and trees play a major role in enhancing rainfall, recharging groundwater, and preventing erosion and flooding [1].
Malaysia, one of the most diverse regions in the world in terms of flora and fauna, has nearly 15,000 native plant species. About 8,200 species (about 250 families) are found in Peninsular Malaysia and 12,000 species are found in Sabah and Sarawak [2]. Overexploitation has put the forest under great threat. Uncontrolled deforestation has reduced tree species diversity, resulting in the loss of valuable economic and medicinal trees [3]. In Peninsular Malaysia, almost ninety-two taxa of dipterocarp species are in the endangered categories [2]. These species could become extinct if no action is taken to protect what remains.
Protecting tree species diversity means protecting endangered tree species. Remote sensing has become a common method in forest management and tree species classification. It improves on traditional field surveys for forest inventory, especially for obtaining species information, because it can be applied over large and inaccessible areas [4]. Maintaining and updating a tree inventory by conventional methods is demanding in terms of time, money, and effort. Measuring trees manually is slow, and in large or inaccessible areas the work consumes even more time and energy. Staff, instruments, and transportation all carry a substantial cost. Manual classification also requires species experts, who may be too few to classify individual trees across a whole forest.
Because multispectral sensors record the spectral reflectance of objects, they can be used to classify tree species [5]. For large forest areas, high-resolution data such as WorldView (WV) satellite imagery is helpful, because it combines high spatial resolution with a multispectral sensor. Besides being time-efficient and user-friendly, WV-2 imagery can provide high-accuracy results for the classification of major forest tree species. However, classifying tree species with WV-2 imagery alone is difficult, mainly because of problems in segmenting the trees before classification. Multispectral imagery does not have enough spatial detail to delineate the crown of each tree [6]. For this reason, advanced remote sensing technology such as LiDAR has been introduced into forest inventory studies, as it can improve both segmentation and classification [7]. Figure 1 shows the location of the study area in the MRSO, GDM 2000 coordinate system.

Multispectral WorldView-2
A multispectral WorldView-2 image was used in this study. The image was acquired on 23 January 2010 and covers the FRIM forest at Kepong, northwest of Kuala Lumpur. Its spatial resolution of approximately 0.5 m is one of its advantages in modern forestry, because this level of detail allows forest parameters to be analyzed at the individual crown level. LiDAR and hyperspectral data were also used to support the study. The LiDAR data were used to generate the canopy height model (CHM), while the hyperspectral data were used to extract the spectral reflectance of the tree species. The LiDAR and hyperspectral data were acquired on 9 April 2013, with spatial resolutions of 0.15 m and 0.3 m respectively.

Field Data Observation
Ground truth observation of the study area was carried out on 3 March 2020. The study area, 3.5 hectares in total, was gridded into 35 rectangular subplots of 500 m2 each. A total of 327 trees were recorded during the field observation, representing 25 species from 11 families in the sampling plot. The inventory parameters measured for each tree were diameter at breast height (DBH), height, crown diameter, species, and position (x, y).

Canopy Height Model
The LiDAR data were filtered to generate a digital terrain model (DTM) and a digital surface model (DSM) using ArcGIS software. These two layers were the main inputs for generating the CHM. A canopy height model can be derived by subtracting the terrain elevation from the DSM [8]. In this study, the CHM was derived by subtracting the DEM from the DSM using the Minus tool in ArcMap. For DSM generation, all classified points and all returns were selected, while for the DEM only ground returns were included.
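The subtraction described above can be sketched in a few lines. This is a minimal illustration, not the ArcMap workflow used in the study: the grid values are invented, the function name `canopy_height_model` is hypothetical, and clamping negative differences to zero is an added assumption that the source does not mention.

```python
# Toy elevation grids in metres; the values are invented for illustration.
dsm = [[62.0, 75.5, 80.2],
       [60.1, 58.9, 71.4]]   # surface elevation (all returns)
dtm = [[55.0, 55.5, 56.0],
       [55.2, 55.4, 55.9]]   # bare-earth elevation (ground returns only)


def canopy_height_model(dsm, dtm):
    """Return CHM = DSM - DTM cell by cell.

    Negative differences are clamped to 0.0 (an assumption made here,
    since a canopy height below ground level is not meaningful).
    """
    same_shape = len(dsm) == len(dtm) and all(
        len(a) == len(b) for a, b in zip(dsm, dtm)
    )
    if not same_shape:
        raise ValueError("DSM and DTM grids must have the same shape")
    return [
        [max(surface - ground, 0.0) for surface, ground in zip(s_row, g_row)]
        for s_row, g_row in zip(dsm, dtm)
    ]


chm = canopy_height_model(dsm, dtm)
```

Both grids must share the same extent and cell size before subtraction; with real rasters that alignment step would come first.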

Image Segmentation
Before segmentation, a 3x3 convolution filter was applied to the NIR and Red bands of the WV-2 image to remove noise and increase the distinction between features. Classification was then performed on the objects created by multi-resolution segmentation, using a scale parameter of 23, a shape criterion of 0.8, and a compactness criterion of 0.5. According to [9], multi-resolution segmentation is often used as the segmentation algorithm for high-resolution remote sensing data, because it generates image objects with greater geographical significance and good adaptability [10]. Specific segmentation settings were created: the image layer weights were adjusted for the best differentiation of forested areas, with the emphasis placed mainly on the NIR band.
Before running the classification, the minimum and maximum NDVI values of trees were determined in order to separate vegetated from non-vegetated segments. Non-vegetated areas were then divided into two further classes, shadow and building. All of these feature classes were assigned with the assign class algorithm. Table 1 shows the threshold conditions used to classify the features in the study.
The segmentation process produced large segments, because a single segmented region often contained a cluster of trees. Since this study needs individual tree segments, the clusters were separated into individual trees using a watershed transformation algorithm. According to [11], the watershed transformation treats the image as a topographic surface and defines watershed lines and catchment basins by a flooding process. The watershed transformation produced irregularly shaped tree segments, whereas tree crowns are normally expected to be roughly circular. All tree segments were therefore reshaped into a more rounded form using the eCognition morphology algorithm.
Morphological image processing includes two fundamental operations, erosion and dilation. Dilation thickens the objects in the image, while erosion thins them [12].
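To make erosion and dilation concrete, the following sketch applies both to a small binary crown mask with a 3x3 square structuring element. It is a simplified stand-in, not the eCognition morphology algorithm: the mask, the function names, and the border handling (pixels outside the image are ignored) are all assumptions made for illustration.

```python
def _filter_3x3(mask, combine):
    """Apply `combine` (min or max) over each pixel's 3x3 neighbourhood.

    Neighbours that fall outside the image are ignored.
    """
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                mask[rr][cc]
                for rr in range(r - 1, r + 2)
                for cc in range(c - 1, c + 2)
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            out[r][c] = combine(window)
    return out


def erode(mask):
    """Thin objects: a pixel stays 1 only if its whole neighbourhood is 1."""
    return _filter_3x3(mask, min)


def dilate(mask):
    """Thicken objects: a pixel becomes 1 if any neighbour is 1."""
    return _filter_3x3(mask, max)


# A square "crown" with a one-pixel spur sticking out on the right.
crown = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

# Erosion followed by dilation (an "opening") removes the thin spur
# while restoring the main body of the object.
opened = dilate(erode(crown))
```

Chaining the two operations in this order is one common way to smooth ragged segment outlines; whether the study used this exact sequence is not stated in the source.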

Segmentation Validation
In this study, segmentation accuracy was assessed using the method of [13], which combines over-segmentation and under-segmentation into a single D value. The reference samples were created by manually digitizing tree crowns. In total, 72 sample crowns were digitized and intersected with the segmentation output in order to calculate the over-segmentation and under-segmentation values. The intersection was performed in ArcGIS software (3D Analyst tools).
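The D value calculation can be sketched as follows. The formula itself is not reproduced in the text, so the version below is an assumption: it is one widely used formulation in which over-segmentation and under-segmentation are each one minus an area ratio, and D is their root mean square. Whether [13] defines D in exactly this way should be checked against that reference. The function name and the areas are hypothetical.

```python
import math


def segmentation_d(ref_area, seg_area, overlap_area):
    """Return (over_segmentation, under_segmentation, D) for one crown.

    ref_area     -- area of the manually digitized reference crown (x_i)
    seg_area     -- area of the matching output segment (y_j)
    overlap_area -- area of their intersection

    Assumed formulation (not taken from the source text):
        over  = 1 - overlap / ref_area
        under = 1 - overlap / seg_area
        D     = sqrt((over**2 + under**2) / 2)
    """
    over = 1.0 - overlap_area / ref_area
    under = 1.0 - overlap_area / seg_area
    d = math.sqrt((over ** 2 + under ** 2) / 2.0)
    return over, under, d


# Invented areas in square metres: the segment lies entirely inside the
# reference crown but covers only 80% of it.
over, under, d = segmentation_d(ref_area=50.0, seg_area=40.0, overlap_area=40.0)
```

Under this formulation every value lies between 0 and 1, with 0 meaning the segment and the reference crown coincide exactly.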

Tree Species Classification
Each species was assigned one or two samples to train the software on the characteristics of that species. Before the training samples were selected, the tree classes identified during the field observation were entered in the class hierarchy in eCognition. About 20 tree species were grouped into 10 family classes. The vegetation class was reclassified into forest and non-forest, the latter including fields and gardens. To achieve the second objective of this study, a supervised Nearest Neighbour (NN) classification was then used for a detailed classification of the forest area. The standard NN feature space was edited using the arithmetic feature (equation) created and tested by [14]. This equation uses five image layers: R, G, B, NIR, and CHM. The approach addresses unequal brightness, since tree height in the study area is not uniform and this produces an uneven brightness distribution. Once the standard NN feature space was set, the classification algorithm was run to classify the individual trees into their species. The classification was performed at two levels, by family group and by tree species group. A new export rule was created to export the resulting vector layer for future use in other software.
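The idea behind a nearest neighbour classifier can be illustrated with a minimal sketch. It is not the eCognition NN implementation and does not reproduce the arithmetic feature of [14]: each crown is described here simply by hypothetical mean values of the five layers (R, G, B, NIR, CHM), and all numbers are invented.

```python
import math

# Hypothetical training samples: (mean R, G, B, NIR reflectance, mean CHM in m).
# The feature values are invented; only the species names come from the study.
training = [
    ((0.05, 0.08, 0.04, 0.45, 32.0), "Shorea leprosula"),
    ((0.06, 0.09, 0.05, 0.38, 41.0), "Dryobalanops aromatica"),
    ((0.07, 0.10, 0.05, 0.50, 24.0), "Scorodocarpus borneensis"),
]


def classify_nearest(features, samples):
    """Return the label of the training sample closest in feature space.

    Distance is plain Euclidean distance over the unscaled features.
    """
    nearest = min(samples, key=lambda sample: math.dist(features, sample[0]))
    return nearest[1]


# An unlabelled crown segment with invented feature values.
label = classify_nearest((0.05, 0.08, 0.04, 0.44, 33.0), training)
```

Because the CHM values are much larger than the reflectances, height dominates the distance in this unscaled form; rescaling each feature to a common range before computing distances would be a sensible refinement.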

Classification Accuracy Assessment
The accuracy of the species classification was assessed with a confusion (error) matrix, a table that is often used to describe the performance of a classification model on a set of test data for which the true values are known. The assessment analyses the classification result against the samples. In this study, it covered five species from the Dipterocarpaceae family (Dryobalanops aromatica, Dipterocarpus baudii, Shorea macroptera, Shorea bracteolata, and Shorea leprosula) and three other species: Scorodocarpus borneensis and Ochanostachys amentacea from the Olacaceae family, and Elateriospermum tapos from the Euphorbiaceae family. These species were chosen because they had more samples than the other species. Some species had only one sample, which had already been used to train the classifier, and reusing a training sample as a testing sample would affect the accuracy result. Apart from that, five species from Dipterocarpaceae, Olacaceae and Euphorbiaceae (the latter represented by Elateriospermum tapos) were used as validation samples to check the accuracy of the tree classification result. In total, 17 testing samples were used.
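Overall accuracy and the kappa coefficient are both derived from the confusion matrix, as the sketch below shows. The 3-class matrix is a made-up toy example and is not the matrix obtained in this study. Rows are taken as reference classes and columns as predicted classes, which is an assumed convention.

```python
def accuracy_and_kappa(matrix):
    """Return (overall_accuracy, kappa) for a square confusion matrix.

    matrix[i][j] = number of samples of reference class i predicted as class j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)

    # Observed agreement: proportion of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n_classes)) / total

    # Chance agreement: sum over classes of (row total * column total) / total^2.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(row[j] for row in matrix) for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Invented counts for three classes (not results from the study).
toy_matrix = [
    [5, 1, 0],
    [1, 4, 1],
    [0, 1, 4],
]

overall_accuracy, kappa = accuracy_and_kappa(toy_matrix)
```

Kappa is lower than overall accuracy whenever some of the agreement could arise by chance, which is why the two figures are usually reported together.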

Extraction of Spectral Signature
The segmentation and tree species classification vector results were used to extract the spectral reflectance of the tree species. The exported vector layers were overlaid on the WV-2 image as a guide for creating the sample polygons of each species. This ensured that each polygon lay within the crown area of the intended species and did not include the border of a neighbouring crown of another species.
ArcGIS software was used to extract the spectral values of the tree species. Using the 'tree sample manager' in the ArcGIS classification tools, the tree samples were drawn as polygons. The values were then exported to Microsoft Excel to plot the spectral reflectance graphs. The spectral reflectance results were separated into two groups, dominant and endangered tree species. Dominant tree species are those with the highest number of trees in the study area. There are about 10 endangered tree species and 8 dominant tree species.

Results and Analysis
The final results of this study are a map of endangered and dominant tree species and their spectral signatures. Only trees that were recognized in the image during the fieldwork were used for the analysis. The tree inventory data were analyzed statistically using SPSS software. Crown segmentation accuracy was assessed by measuring the goodness of fit between the manually digitized crowns and the output segments, while tree species classification accuracy was assessed with the confusion (error) matrix in eCognition.

Tree inventory
The horizontal and vertical RMS were below 0.03 and 0.06 respectively, and the final coordinates of the reference point are 358021.725 N and 404255.828 E. This reference point was used to tie in the positions of the survey plots and trees. Table 2 shows the summary of the statistical analysis for 100 selected trees collected from the field.

Canopy Height Model
The Minus tool was used to subtract the DEM from the DSM to extract the CHM. The CHM value was taken as the tree height resulting from the LiDAR data processing. Table 3 shows the highest and lowest CHM, DEM and DSM values of the study area. The CHM values were validated by regression analysis against the field-measured heights, as shown in the graph in Figure 2. The R square and adjusted R square show that the regression explains about 67% of the variation in LiDAR-derived height, which indicates reasonable agreement between tree height measured in the field and tree height extracted from the CHM.
Figure 3 shows the result of the segmentation using a shape criterion of 0.8, a compactness criterion of 0.5 and a scale parameter of 23. The segmentation process comprised several sub-processes: object classification, watershed transformation, and morphology.
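The regression check of CHM heights against field-measured heights, described in the first paragraph of this subsection, can be sketched as an ordinary least squares fit with its coefficient of determination (R square). The study ran this analysis in SPSS; the function name and the height values below are hypothetical and chosen only to show the calculation.

```python
def linear_fit_r2(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Invented tree heights in metres (not data from the study).
field_height = [18.0, 25.0, 31.0, 40.0]
chm_height = [20.5, 23.0, 33.5, 38.0]

slope, intercept, r_squared = linear_fit_r2(field_height, chm_height)
```

An R square of 0.67 from such a fit would mean that about two thirds of the variation in CHM height is explained by the field measurements.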

Segmentation Accuracy Assessments
After segmentation, the tree segments were validated before being used for classification. The concept of under-segmentation and over-segmentation was used to evaluate the multi-resolution segmentation result. The D value corresponded to a segmentation accuracy of 86% with 16% error; that is, the overall accuracy of the multi-resolution segmentation was 86% for a 1:1 match. The segmentation validation values lie between 0 and 1, which makes the result valid for use in species classification. The closer the value is to 0, the better the match between xi and yj, while the closer it is to 1, the greater the mismatch. Table 4 shows the segmentation validation result.

Output Classification
The segmentation result was used to classify the tree species in the study area. There are almost 26 tree species from 11 families. The result shows that the Dipterocarpaceae family dominates the study area, which is in line with the character of the FRIM forest. Figure 5 shows the overall species distribution map.

Classification Accuracy Assessment
The test was carried out for 8 classes. The overall accuracy was 62%, with a kappa coefficient of 0.57. The lowest accuracy was for Shorea macroptera (30%) and the highest was for Scorodocarpus borneensis (63%).

Conclusion
In conclusion, most of the dominant trees have a DBH of more than 50 cm. The most dominant trees were emergent trees with a large crown area. Image segmentation produced a good result for this study, namely the delineation of individual tree crowns. Most of the tree crowns were segmented well and were accepted when validated by calculating the over-segmentation and under-segmentation areas between the reference polygons and the segmented objects. The overall accuracy of 86% shows that multi-resolution segmentation is valid for segmenting the crowns of tropical trees. Species classification with the supervised Nearest Neighbour classifier produced an accuracy of 62% when tested with the testing samples. The species classification result shows that the FRIM forest is dominated by the Dipterocarpaceae family, which is supported by (Omar, 2016), who also reported that the FRIM forest is dominated by Dipterocarpaceae. This indicates that classification in eCognition is valid for classifying tree species using WV-2 data. Nevertheless, classifying tree species in a tropical forest remains difficult, because the forest contains many species whose individuals are scattered even within a small area.