Article

Mapping Areas Invaded by Pinus sp. from Geographic Object-Based Image Analysis (GEOBIA) Applied on RPAS (Drone) Color Images

by Vinicius Paiva Gonçalves 1,*, Eduardo Augusto Werneck Ribeiro 2 and Nilton Nobuhiro Imai 3

1 Professional Master’s Program in Climate and Environment, Department of Health and Services, Federal Institute of Santa Catarina—IFSC, Av. Mauro Ramos, 950, Florianópolis 88020-300, SC, Brazil
2 Professional Master’s Program in Climate and Environment, São Francisco do Sul Campus, Federal Catarinense Institute—IFC, Duque de Caxias Highway, 6750, Iperoba, São Francisco do Sul 89240-000, SC, Brazil
3 Department of Cartography, São Paulo State University—UNESP, Roberto Simonsen St., 305, Presidente Prudente 19060-900, SP, Brazil
* Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(12), 2805; https://doi.org/10.3390/rs14122805
Submission received: 2 April 2022 / Revised: 12 May 2022 / Accepted: 26 May 2022 / Published: 11 June 2022

Abstract

Invasive alien species reduce biodiversity. In southern Brazil, the genus Pinus is considered invasive, and its dispersal by humans has resulted in the species reaching ecosystems that are more sensitive and less suitable for cultivation, as is the case of the restingas on Santa Catarina Island. Invasion control requires persistent efforts to identify and treat each new invasion case as a priority. In this study, areas invaded by Pinus sp. in restingas were mapped using images taken by a remotely piloted aircraft system (RPAS, or drone) to identify the invasion areas in great detail, enabling management to be planned for the most recently invaded areas, where management is simpler, more effective, and less costly. Geographic object-based image analysis (GEOBIA) was applied to images taken with a conventional RGB camera mounted on an RPAS, resulting in a global accuracy of 89.56%, a mean kappa index of 0.86, and an F-score of 0.90 for Pinus sp. Processing was conducted with open-source software to reduce operational costs.

1. Introduction

Invasion by exotic species represents one of the greatest threats to biodiversity conservation at a global scale [1], and experts indicate that it can compromise the supply of ecosystem services even in protected areas [2]. As a result, it is important to develop methods for controlling exotic species, especially in natural areas with difficult access.
Experts emphasize that invasion cases are frequent in protected areas, despite their recognized importance for in situ biodiversity conservation. Invasions occur even in conservation units (CUs), protected areas under special management regimes in Brazil, created to preserve ecosystems and to provide them with a management structure and resources for immediate corrective action.
These invasions in protected areas represent a serious concern throughout the country and the world [3]. In general, protected areas such as Brazilian CUs cover extensive territories and usually have reduced staff for management and inspection: the 334 federal CUs protect a total area of 1,714,241.92 km², while the Chico Mendes Institute for Biodiversity Conservation (ICMBio) has only 3603 employees [4,5].
Even with this protective infrastructure, it should be noted that the dispersal of invasive species is closely associated with human presence, as demonstrated in a global evaluation by Van Kleunen et al. [6], who, through the GloNAF project [7], found that human pressure has modified the geographic composition and global distribution of invasive plants across the continents by transporting and accumulating species, mainly from the Northern Hemisphere.
This scenario has occurred for pine trees of the Pinaceae family, especially the genus Pinus. These pine trees were brought from the Northern Hemisphere to the Southern Hemisphere due to their great economic importance, but they already exhibit invasive behavior in many countries, causing not only ecological but also economic and social damage [8]. The genus Pinus was introduced in Brazil in the 1950s [9] and on the island of Santa Catarina in 1963 [10].
The planting of Pinus sp. on Santa Catarina Island was not properly managed, triggering uncontrolled pine reproduction [10]. Thus, the conservation areas of the island began to suffer from the invasion of this exotic species. The impacts of urban expansion and climate change [11], once combined, represent strong reasons to implement effective and efficient management and control strategies for exotic species. In this sense, Ziller et al. [2] proposed a scheme to define priorities for the control of exotic species with invasive potential. The authors demonstrated that persistent control is the path to containing the reproduction of invasive species established in protected areas, as already recommended by Moody and Mack [12]. However, this strategy still depends on human resources to conduct the control, in addition to implementation costs that are generally incompatible with the budgets of CUs.
Given this scenario, regarding the control of exotic plant species within CUs, Marzialetti et al. [13] pointed out that remote sensing can contribute to locating, detecting, measuring, or even specifically identifying these species, and that it can be used both in planning and in monitoring efforts, especially in areas that are difficult and risky to access [14].
Freely accessible satellite images are usually of limited use for monitoring this type of species due to their spatial resolution. High-resolution satellite imagery, which is frequently expensive, can be useful for monitoring large critical areas. As a result, aerophotogrammetric surveys that offer sub-metric spatial resolutions may represent even better alternatives for this kind of monitoring, as demonstrated by Müllerová et al. [15] and confirmed by other studies [13,14], especially with the popularization of unmanned aircraft, formally referred to as remotely piloted aircraft systems (RPASs) [16].
This technology has driven the spread of uses and applications of this type of equipment, especially by providing high-spatial-resolution imagery, which captures objects in greater detail. However, visual interpretation of such imagery becomes more time-consuming or even impractical. In this sense, the most recent mapping methods, as reviewed by Osco et al. [17], try to automate the process to achieve high accuracy using an artificial intelligence (AI) technique known as deep learning (DL). According to LeCun et al. [18], DL “allows computational models that are composed of multiple layers to learn representations of data with multiple levels of abstraction […], and discovers intricate structure in large datasets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representations in each layer from the representation in the previous layer”. A common deep neural network (DNN) in the supervised category is the convolutional neural network (CNN). This architecture has been widely used in remote sensing mapping since Krizhevsky et al. [19] used it to win an image classification competition by a large margin. Recent studies reviewed by Osco et al. [17] also reported that DL is used to enhance remote sensing observations in combination with other methods, such as super-resolution, denoising, restoration, pan-sharpening, and image fusion techniques. However, the practical application of DL in remote sensing requires mastery of programming languages such as Python, through libraries such as TensorFlow and Keras.
RPASs have become more accessible and easier to fly [20], mounted with high-quality RGB sensors and available in “ready-for-flight” models [21]. Demonstrating their high applicability in vegetation research, this equipment began to be widely used to determine and plan management actions for crops and forests of various species, as reported by Guimarães et al. [22], Onishi and Ise [23], Wang et al. [24], White et al. [25], and Apostol et al. [26].
Guimarães et al. [22] evaluated the use of RPASs in the management process of forests and agricultural resources and established criteria for decision making in terms of the equipment available on the market. Thus, operational costs, the ability to mount sensors, and the management of monitoring missions are variables that must be taken into account when using RPASs.
Regarding the sensors, White et al. [25] assessed the use of an RPAS with an RGB camera to identify individual maritime pine seedlings after fire events on the Upper Peninsula of Michigan (USA) in 2012. The authors successfully identified individual species using a multispectral camera combined with geographic object-based image analysis (GEOBIA), achieving 90% accuracy for individuals with heights of approximately 0.3 to 1.5 m. In Wang et al. [24], the availability of ultrahigh-resolution RPAS images allowed the classification of an urban forest from spectral information, morphological parameters of the vegetation, texture information, and vegetation indices. Compared to identification based on images generated by multispectral or hyperspectral sensors, species identification based on ultrahigh-resolution RGB images from RPASs segmented by GEOBIA provides more accurate information and is cheaper [26].
Onishi and Ise [23] used RGB images and deep learning mapping in a mixed urban forest in Japan. The authors successfully identified five species (90% accuracy), and a key factor in the success of the ultrahigh-resolution image classification may be the way in which they separated each class. Thus, greater detail of each class in the GEOBIA results in better classification performance in terms of species recognition, even for species with similar colors [27].
The combination of attributes (features) in GEOBIA, such as texture and tree crown shape, contributes to improving classification tasks [28]. This is because, unlike traditional pixel-based methods, GEOBIA works with homogeneous and continuous groups of pixels, treated as objects [28,29]. Therefore, segmentation is one of the main processes in GEOBIA classification [30].
Several studies have evaluated different methods to perform segmentation, either by using different tools or by using different criteria within the same tool, resulting in several segmentation outputs that need to be evaluated and prioritized for classification [28,30]. However, Costa et al. [30] indicated that the analysis of segmentation accuracy is still under development; several methods have been proposed to evaluate it objectively, using mathematical criteria that consider aspects of geometry and positioning, in either supervised or unsupervised fashion. Despite this, procedures to evaluate segmentation accuracy have not yet been standardized; thus, subjective evaluation by visual interpretation may be acceptable depending on the application [30].
In studies that evaluate the invasion of exotic plant species by remote sensing, especially in forest environments, target “edges” are expected to be more complex than those of anthropic objects such as buildings in urban areas, due to the complex shape of each plant or of their groupings, which usually form gradient-like transition areas between different environments. As a result, the definition of boundaries is always imbued with subjectivity, whether in visual delimitation on images or even in field measurements, which makes it acceptable to verify segmentation quality using subjective visual criteria. One example is the recent study by Modica et al. [31], who reported obtaining the best segmentation with the LSMS (large-scale mean-shift) algorithm according to a visual evaluation based on trial-and-error testing of the parameters.
According to the survey conducted by Modica et al. [31], random forests (RFs) and support vector machines (SVMs) are among the most recommended supervised classifiers for GEOBIA, showing very satisfactory performance. As explained by the same authors [31], SVM is a nonparametric algorithm based on kernel functions from statistical learning theory. In essence, SVM learns the boundary between training samples of different classes by projecting them into a multidimensional space and finding the hyperplanes that maximize the separation of the dataset, according to the predefined number of classes. RF, in turn, randomly creates many decision trees, independent of one another. The trees are all trained on the same features but on different bootstrap subsets drawn from the training set, a resampling procedure called “bagging”. The algorithm generates an unbiased, internal estimate of the generalization error using the “out-of-bag” samples, i.e., data from the training set that were not used to train the tree under test.
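To make these two algorithms concrete, the sketch below trains both on synthetic data. It uses scikit-learn as a stand-in, which is an assumption for illustration only: the study itself relied on the LibSVM and OpenCV implementations wrapped by the Orfeo Toolbox.

```python
# Illustrative sketch with scikit-learn (not the study's OTB pipeline).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Synthetic stand-in for per-segment attribute vectors and class labels.
X, y = make_classification(n_samples=500, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

# SVM: finds hyperplanes that maximize the separation between classes.
svm = SVC(kernel="linear", C=1.0).fit(X, y)

# RF: many decision trees trained on bootstrap subsets ("bagging").
# oob_score_ is the internal generalization estimate computed from the
# out-of-bag samples described above.
rf = RandomForestClassifier(n_estimators=100, oob_score=True,
                            random_state=0).fit(X, y)
print(f"RF out-of-bag accuracy: {rf.oob_score_:.3f}")
```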
Despite the advances provided by GEOBIA classification of high-spatial-resolution imagery, the process of identifying a species should consider the different types of environments where the targets are found. However, even with RPAS images, accurately extracting different thematic classes from only the structural characteristics and colors of RGB images is still a challenge to be overcome.
Given this information, the present study aimed to evaluate a methodology for mapping Pinus sp. in natural environments using a classification process by geographic regions, known as GEOBIA [32,33]. This study presents and analyzes alternative solutions according to a set of attributes and algorithms classified by machine learning (random forest (RF) and support vector machine (SVM)) applied to determine the best alternative to classify and identify Pinus sp.
Only conventional RGB sensors were chosen, similar to Albuquerque et al. [34], making it possible to perform the study with any good-quality sensor equipment. No ground control points were used to produce the georeferenced mosaics because the positional accuracy obtained from the global navigation satellite system (GNSS) sensor embedded in the equipment was sufficient for this purpose.
The main text of this study consists of three parts: the introduction, followed by the materials and methods, and the results and discussion. The methodological section covers the areas of interest, equipment and software, the method flowchart, and the accuracy evaluation. The results and discussion segment compares the overall results with similar studies and presents the results obtained in each area of interest, along with confusion matrices, classification color maps, and particularities. The main text is supported by five appendices that complement the textual information with a description of each feature computed, the software parameters and settings, a list of acronyms, and a feature list with the best classifications for each area of interest.

2. Materials and Methods

2.1. Study Areas

Four areas of interest (Figure 1) were selected for the study, all in the Greater Florianópolis region, containing portions of the pioneer formation ecosystem (vegetation with marine influence) according to the Brazilian Institute of Geography and Statistics (IBGE) [35], also known as restinga areas.
Three study areas were located on Santa Catarina Island, in the municipality of Florianópolis-SC, and one on the mainland, in the city of Palhoça-SC. In the northern part of the island is Sapiens Park (Sapiens Parque—SAPIENS), a private area that holds an extensive natural area and preserves a mosaic of wetlands and restingas. It was proposed as a CU categorized as a Private Natural Heritage Reserve (RPPN), although it has not yet been formally recognized. In the east, Rio Vermelho State Park (Parque Estadual do Rio Vermelho—PAERVE), recognized as a CU in 1974, currently has extensive areas dominated by Pinus sp. due to a state forestry program carried out in the 1960s [9]. In the southern part of the island is the Municipal Natural Park of the Dunes of Lagoa da Conceição (Parque Natural Municipal das Dunas da Lagoa da Conceição—PNMDLC), a public area that, for more than 10 years, has been managed by volunteers from a biodiversity conservation project aiming to contain the invasion of Pinus sp. [10].
On the mainland, Baixada do Maciambu is located in an area of Palhoça within Serra do Tabuleiro State Park (Parque Estadual da Serra do Tabuleiro—PEST) and represents one of the most significant conserved restinga areas in Santa Catarina state, although it too is invaded by Pinus sp. Although mostly covered by herbaceous vegetation, this area was chosen because it represents an invasion front, with individuals of Pinus sp. of various ages, from seedlings to adults.
The selection of these areas aimed to expand the representativeness of diverse environments and, consequently, of the challenges in mapping invasion areas. In each area of interest, a 300 × 300 m sampling region was selected, totaling 9 ha of images per flight.
Regarding the local climate, considering the Köppen classification, this region of the state has a humid mesothermal climate (without dry season) Cf, containing subtype Cfa with a hot summer [36]. According to the world map that was updated for the Köppen–Geiger classification [37], the climatic similarity of this area with that of the southeastern region of the USA is evident, as Cfa encompasses the natural distribution area of the genus Pinus.

2.2. Equipment and Software

The processing was performed on a notebook computer with an Intel Core i7-950H processor, 2.60 GHz, and 32 GB of RAM. WebODM 1.8.2 software [38] was used to process the images captured by the RPAS. Tasks and processing in the GIS environment, including the GEOBIA classification, were conducted in QGIS 3.16.8-Hannover [39], mostly with the Orfeo Toolbox (OTB) complement algorithms [40]. The attribute extraction was carried out using the GeoDMA 2.0.3 beta plugin [41] implemented in TerraView 5.6.1 software [42], while attribute selection was performed in Weka software [43], version 3.8.5. The accuracy analysis was performed using spreadsheets in LibreOffice Calc, based on the confusion matrix obtained from the processing logs of the TrainVectorClassifier algorithm of the OTB complement implemented in QGIS. The precision, recall, kappa, and F-score values calculated in the spreadsheets were checked against those provided in the same logs.
The flights used a multirotor Inspire 1 (T600) manufactured by Da-Jiang Innovations (DJI), equipped with a 12.4 MP Model X3 (FC350) camera. The flight plans were executed with the proprietary DroneDeploy platform [44] in its default setting, which, for this equipment, meant 75% frontal and 65% lateral overlap and a speed of 12 m/s, at a flight height of 100 m (SAPIENS) or 90 m (PAERVE, PNMDLC, and PEST), resulting in approximately 150 photos per flight. The records were made on days with good weather, with sun and few clouds, except in the PEST area, where it was cloudy. After the first flight (SAPIENS), the flight height was reduced by 10% for safety reasons. The flights were performed in the SAPIENS area on 06/11/2020, 09:18:21–09:40:10, in the PEST area on 06/04/2021, 10:13:20–10:18:15, in the PNMDLC area on 10/04/2021, 12:10:14–12:15:09, and in the PAERVE area on 11/04/2021, 12:23:38–12:49:59, Brasilia Time (BRT), UTC−3.

2.3. Methodological Flowchart

The methodological flowchart (Figure 2) displays the main steps of the complete cycle, from the preparatory step of field data acquisition to the post-processing steps involving cartographic product assembly, layer stacking, segmentation, GEOBIA classification, and area estimation.
The first stage (A) consisted of delimiting the polygons for the flights with the RPA and preparing the flight plan and aerial survey carried out first in the SAPIENS area (06/11/2020, 09:18:21–09:40:10), then in the PEST area (06/04/2021, 10:13:20–10:18:15), PNMDLC area (10/04/2021, 12:10:14–12:15:09), and PAERVE area (11/04/2021, 12:23:38–12:49:59).
The second step (B) involved composing the orthomosaic (RGBα), calculating the visible atmospherically resistant index (VARI) vegetation index, and generating the normalized digital elevation model (nDEM) to better represent the spectral variability and height of the vegetation.
Raw cartographic products, the orthomosaic (RGBα), the digital surface model (DSM), and the digital terrain model (DTM) were assembled employing the open-source software WebODM 1.8.2 [38]. The settings were adjusted to have “DSM and DTM” options “enabled” and the GSD value set to 5 cm. Images were not resized, and other settings were set to default. The processing time was about 40 min for each area (150 photos).
The derived cartographic products were prepared in QGIS. The raster calculator was used to remove the alpha (α) band from the RGBα orthomosaic, converting it to RGB only, and to calculate the VARI vegetation index using the formula given by Henrich et al. [45]:
VARI = (Green − Red)/(Green + Red − Blue)
According to the method proposed by De Luca et al. [29], to reduce the effect of potential outliers in the subsequent segmentation, this band was discretized and normalized to the usual range of 0 to 255, compatible with the RGB band values. This procedure was performed in the QGIS Processing Toolbox, using the SAGA GIS Raster Normalization and GDAL Convert algorithms to normalize the values and convert them to 8 bits (byte).
The digital elevation model (DEM) was also generated in QGIS as the difference between the DSM and DTM, i.e., by subtracting DTM values from DSM values. This procedure, also performed with SAGA GIS, generated the raster of elevation values. After being normalized, the nDEM was resampled to 5 cm with SAGA’s resampling algorithm to make its pixel size compatible with the other layers. Finally, the nDEM raster was also discretized to 8 bits (byte).
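The raster algebra in these steps is simple enough to express directly; the sketch below is a minimal numpy version, under the assumption that the band arrays have already been read from the rasters (the study itself used the QGIS raster calculator and SAGA GIS for these operations).

```python
# Minimal numpy sketch of the VARI and nDEM derivations described above.
# The band arrays (red, green, blue, dsm, dtm) are assumed to be float
# arrays already read from the orthomosaic, DSM, and DTM.
import numpy as np

def normalize_to_byte(band):
    """Rescale a band to 0-255 and discretize to 8 bits, as done for the
    VARI and nDEM layers before stacking."""
    lo, hi = np.nanmin(band), np.nanmax(band)
    return np.round(255.0 * (band - lo) / (hi - lo)).astype(np.uint8)

def vari(red, green, blue):
    """VARI = (Green - Red) / (Green + Red - Blue)."""
    denom = green + red - blue
    denom = np.where(denom == 0, np.nan, denom)  # guard against division by zero
    return (green - red) / denom

# vari_byte = normalize_to_byte(vari(red, green, blue))
# ndem_byte = normalize_to_byte(dsm - dtm)  # vegetation height above terrain
```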
The third step (C) involved layer stacking, which was performed with the GDAL Merge algorithm, native to QGIS, observing the “orthomosaicRGB + VARI + nDEM” order, resulting in a raster with five bands, the three RGB bands, plus the VARI vegetation index and elevation model nDEM.
The fourth stage (D) applied GEOBIA while combining zonal, spectral, and shape statistical information. The GEOBIA classification was conducted in the Orfeo Toolbox (OTB) integrated with QGIS, starting with the image segmentation process, whose parameters were set as indicated by Liau [46] in Figure 3.
The interpretation used to identify appropriate segmentation parameters followed the definition proposed by Modica et al. [31], based on the review by Costa et al. [30], which assumes that a good segmentation shows relatively high spectral separability between two different land cover (LC) classes and minimal spectral variability within polygons of the same LC class.
The attribute extraction performed with OTB was complemented by GeoDMA, a software package integrated into TerraView [42] that allows the extraction of zonal, spectral, and textural attributes, which are important for GEOBIA classification through machine learning [32]. The full description of the attributes calculated with OTB and GeoDMA is given in Appendix A. All OTB and GeoDMA attributes were used in the process, alone and together, even when duplicated, since some values calculated for the same attribute were not fully identical: there were subtle differences, usually after the sixth or seventh decimal place.
At this stage, a minimum of 2% of the segments were selected as training and validation samples, producing the model files later employed in the classification process and the confusion matrices used in the accuracy evaluation step. Global accuracy, kappa index, F-score, precision, and recall were determined to choose the best model files for classification.
The number of classes (Table 1) varied among the study areas, according to the typologies found in each sampled area. The class definition was specified for each area by photointerpretation. For a better understanding, photos taken on the ground and video captured in flight with a camera at an oblique angle, between 45° and 60°, were employed. These images were not processed; they were only used for the interpretation of the vegetation typologies and other classes.
A widely practiced approach was used to run the machine learning algorithms, as described in the study by De Luca et al. [29]. As explained by Frank et al. [43], to conduct the machine learning (ML) process, the data must first be prepared for the training and validation steps, which serve as inputs for the ML algorithms. To do this, a certain number of samples for each class first had to be correctly labeled for use in training and validation. The labeling of the regions containing the training and validation samples was performed in QGIS with the native “selection by location” algorithm, as shown in Figure 4, by visual interpretation, using the same field photographs and videos taken during panoramic flights as described for the class definition.
From the total samples labeled, most were employed in the training stage, during which the algorithms identified the patterns used to classify the remaining objects. The remaining labeled samples were reserved for the validation stage, in which correct classifications and errors were counted, as recommended by Radoux and Bogaert [33], to enable the accuracy evaluation.
The division of sample percentages between training and validation in machine learning tasks varies among researchers: some opt for 80% training and 20% validation [47], while others prefer 70% training and 30% validation [32]. In this study, 70% of the samples were used for training and 30% for validation to provide a more rigorous evaluation of the results.
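A minimal sketch of such a split is shown below, using scikit-learn as a stand-in (the study performed the split within the QGIS/OTB workflow); stratification keeps the per-class proportions discussed in the sampling paragraphs that follow.

```python
# Stratified 70/30 split of labeled samples (illustrative data only).
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))   # stand-in for segment attributes
labels = rng.integers(1, 5, size=200)  # stand-in for class codes 1-4

X_train, X_val, y_train, y_val = train_test_split(
    features, labels, test_size=0.30, stratify=labels, random_state=0)
print(len(X_train), "training samples,", len(X_val), "validation samples")
```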
Table 2 shows the number of segments selected for training and validation in each area, relative to the total number of segments in each image. There are no established parameters to determine sample size, but De Luca et al. [29] considered it appropriate to select no fewer than 2% of the total number of image segments. Here, 2% was adopted as the minimum reference value to be exceeded in the sampling.
In the sample selection process, the samples were labeled uniformly across the entire image, trying to maintain the proportion per class. The minimum number of samples needed per class was calculated beforehand, considering a total of 2% of the polygons resulting from the segmentation. Concentrating the selection too heavily in one class to the detriment of another was avoided, especially for the target class of the study, Pinus sp. However, it was not possible to distribute samples perfectly homogeneously among the classes, because the area occupied by each class varied, and the total number of polygons per class was unknown before the classification was concluded.
Attributes (features) were selected using tools from Weka, University of Waikato [43]. The vector files with attribute tables that contained the selected samples for training were exported to a spreadsheet to create the datasets used in the attribute selection process. To prevent the model from overfitting, i.e., losing the ability to generalize to new data, no validation polygons were employed in the attribute selection.
To comprehensively perform the attribute selection, three spreadsheets were generated for each area of interest: one containing the attributes calculated with QGIS/OTB, another with those generated only by TerraView/GeoDMA, and a third containing both. An evaluation with all attributes was performed because of the possible differences between the values produced by the two applications and to discard non-numerical values (not a number—NaN), which can severely degrade the classification. The existence of duplicated fields with identical values was also verified.
The attribute selection was performed in the Explorer application, in the attribute selection tab. After opening the spreadsheet, the numeric fields “label” and “class” were removed because they were not used in this process. This procedure did not alter the original spreadsheet file; it only indicated to the software which fields were not to be considered in the attribute selection.
Two tools were used. The first, CfsSubsetEval (CfsSE), performs correlation-based feature selection (CFS); it does not use a data mining algorithm and is considered a “filter” tool, because it employs statistical techniques to score the relevance of the relation between the inputs and the target variable and, accordingly, filters the best attributes. According to Nevalainen et al. [48], this tool searches for subsets that are highly correlated with each class and have low internal correlation, making it suitable for spectral vegetation data, since spectral characteristics are generally best captured by a combination of more than one spectral band. The second tool, WrapperSubsetEval, is a “wrapper” tool. As explained by Tiwari and Singh [49], it uses machine learning algorithms to mine the best data subset and, in theory, can provide better results than filter tools at the cost of greater computational effort. Since it involves machine learning, this attribute selection was performed with WrapperSubsetEval combined with a J48 classifier (C4.5 decision trees). The CfsSE correlation method was implemented in Weka as suggested by Nevalainen et al. [48], using the GeneticSearch, BestFirst, and GreedyStepWise search methods. The wrapper method was implemented according to Frank et al. [43]. All the specific Weka parameters used for attribute selection are described in Appendix B.
It is important to note that any changes made to the order of the rows or columns in the spreadsheet would result in random changes in the attribute selection. For this reason, the spreadsheets containing the datasets were kept intact for the selection of attributes, without adjustments beyond those already described as necessary.
Regarding attribute selection, it should also be explained that 10-fold cross-validation was employed, resulting in lists of attributes sorted by the frequency with which they were selected. This made it possible to test various combinations of the “best attributes” against the use of all attributes and to observe how much attribute selection truly improved the classification.
In the CfsSE method, the selected attributes were those chosen in at least eight of the 10 cross-validation rounds, i.e., a cutoff of 80% frequency, as proposed by Nevalainen et al. [48]. However, instead of testing only the attributes selected across all three search methods (GeneticSearch, BestFirst, and GreedyStepWise), the attributes selected by each method were also tested separately. For the wrapper method, the selected attributes ranged from one to five rounds (10% to 50% frequency), since higher cutoffs did not select attributes for all areas. All fifty tested combinations of attribute selection and machine learning algorithm configurations are shown in Appendix C, corresponding to 25 different feature combinations implemented in the two algorithms (SVM and RF).
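The frequency-based selection across cross-validation rounds can be sketched as follows. Weka's CfsSubsetEval and WrapperSubsetEval have no exact scikit-learn equivalents, so SequentialFeatureSelector with a decision tree is used here only as a rough stand-in for the wrapper + J48 combination; the data are synthetic.

```python
# Rough analog of frequency-based feature selection across 10 CV rounds.
import numpy as np
from collections import Counter
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=12, n_informative=5,
                           random_state=0)  # stand-in for segment attributes

counts = Counter()
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, _ in folds.split(X, y):
    selector = SequentialFeatureSelector(
        DecisionTreeClassifier(random_state=0), n_features_to_select=4)
    selector.fit(X[train_idx], y[train_idx])
    counts.update(np.flatnonzero(selector.get_support()).tolist())

# Keep features selected in at least 8 of the 10 rounds (80% frequency),
# the cutoff adopted from Nevalainen et al. [48].
stable = sorted(f for f, c in counts.items() if c >= 8)
print("features above the 80% cutoff:", stable)
```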
The attribute selection by Weka resulted in logs exported in text format (.txt) containing the attributes and their importance in percentages. The logs were converted into spreadsheets to sort and effectively select the lists of attributes to be employed in the classification.
The training and validation step was performed using the TrainVectorClassifier algorithm from the OTB/QGIS Processing Toolbox. Among the available machine learning classification algorithms, two were used in this application, random forest (RF) and support vector machine (SVM), which, according to the OTB Development Team [40], are based on the OpenCV Machine Learning (2.3.1 and later) and LibSVM libraries.
The process inputs were vectors containing the polygons selected for training (70% of the total) and validation (30% of the total), indicating in “field names for training features” the fields selected for training in the attribute selection and in “field containing the class integer label for supervision” the numeric field “class”.
The configuration parameters were kept at the default QGIS/OTB settings for both algorithms. The main settings for LibSVM were SVM kernel type “linear”, SVM model type “csvc”, cost parameter C “1.0”, cost parameter Nu “0.5”, and random seed “0”, with the other settings disabled or blank. The settings for RF were maximum tree depth “5”, minimum number of samples in each node “10”, termination criterion for the regression tree “0”, clustering of possible values of a categorical variable into K <= cat clusters to find a suboptimal split “10”, size of the randomly selected subset of features at each tree node “0”, maximum number of trees in the forest “100”, sufficient precision “0.01”, user-defined input centroids “0” (not empty, to avoid an error), and random seed “0”, with the other parameters empty or disabled. The parameters employed in QGIS/OTB for classification are described in Appendix D.
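For readers who want to reproduce these settings outside QGIS/OTB, the sketch below maps the listed defaults onto approximate scikit-learn equivalents. This mapping is an assumption for illustration: OTB wraps LibSVM and OpenCV, whose implementations and parameter semantics differ from scikit-learn's, so results will not match exactly.

```python
# Approximate scikit-learn analogs of the default QGIS/OTB settings above.
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# LibSVM defaults: kernel "linear", model type "csvc", cost C = 1.0.
svm = SVC(kernel="linear", C=1.0)

# OpenCV RF defaults: maximum depth 5, minimum 10 samples per node,
# up to 100 trees.
rf = RandomForestClassifier(max_depth=5, min_samples_split=10,
                            n_estimators=100, random_state=0)
```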
Once the training and validation stage was concluded, it was possible to perform the accuracy evaluation on the basis of the confusion matrices generated by QGIS/OTB itself as explained in Section 2.4. The final classification was performed with the four feature sets that reached the best-evaluated accuracy for each one of the four areas of interest. This procedure was performed using the QGIS/OTB VectorClassifier algorithm.
Lastly, the fifth step (E) consisted of fixing the vector geometries using the native QGIS “fix geometries” algorithm and then converting the vector data from polygon to multipart using the “dissolve” algorithm, also native to QGIS and described in the GIS documentation [39]. Once completed, the total area per class could be calculated with the QGIS field calculator, in square meters using the $area formula and in hectares using $area/10,000. Some development details of the proposed method can be found in Gonçalves [50].
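A hedged sketch of this step outside the QGIS GUI is shown below, using geopandas. The file name "classified.gpkg" and the field name "class" are hypothetical, and a projected CRS is assumed so that the area property returns square meters.

```python
# Dissolve classified polygons by class and compute per-class areas.
import geopandas as gpd

gdf = gpd.read_file("classified.gpkg")
gdf["geometry"] = gdf.geometry.buffer(0)  # common quick fix for invalid geometries
dissolved = gdf.dissolve(by="class")      # one multipart geometry per class
dissolved["area_m2"] = dissolved.geometry.area
dissolved["area_ha"] = dissolved["area_m2"] / 10_000
print(dissolved[["area_m2", "area_ha"]])
```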

2.4. Accuracy Evaluation

Accuracy estimates were determined on the basis of counts of spatial entities (objects), as indicated by Radoux and Bogaert [33], from the confusion matrix provided by QGIS/OTB. The formulas used in the calculations are shown in Table 3. The calculated values for producer accuracy (recall), user accuracy (precision), F-score [51], and Cohen’s kappa index [52] were checked against the values provided by QGIS/OTB. Overall accuracy and Cohen’s kappa index allowed the overall classification quality to be evaluated, along with precision and recall [53] and especially the F-score, which allowed each class to be assessed [51]. These measures can be implemented in spreadsheets according to the formulas presented in Table 3.
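The same measures can be computed programmatically; the sketch below derives them from an object-count confusion matrix with rows as reference classes and columns as predicted classes (the 3×3 matrix is illustrative only, not data from the study).

```python
# Overall accuracy, per-class precision/recall/F-score, and Cohen's kappa
# from a confusion matrix (rows = reference, columns = predicted).
import numpy as np

cm = np.array([[50, 3, 2],
               [4, 40, 1],
               [2, 2, 46]], dtype=float)

n = cm.sum()
overall_accuracy = np.trace(cm) / n

precision = np.diag(cm) / cm.sum(axis=0)  # user accuracy, per class
recall = np.diag(cm) / cm.sum(axis=1)     # producer accuracy, per class
f_score = 2 * precision * recall / (precision + recall)

# Cohen's kappa: agreement beyond what is expected by chance.
p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2
kappa = (overall_accuracy - p_e) / (1 - p_e)

print(f"OA={overall_accuracy:.4f}  kappa={kappa:.4f}")
print("per-class F-score:", np.round(f_score, 4))
```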
The estimated accuracy was calculated for all the classifications performed. By default, the OTB TrainVectorClassifier algorithm provides precision, recall, and F-score for each class, and overall classification performance through the kappa index. The classification obtained using only the features calculated by OTB was adopted as the reference for estimating the improvement obtained by augmenting the dataset with the features calculated by GeoDMA, as well as by successive rounds of attribute selection in Weka. Thus, a baseline was established for estimating the improvement in classification relative to OTB alone.
The OTB-only classification in QGIS was adopted as the reference because it relies on a single piece of software and involves fewer processing steps, making it the simplest approach. For the inclusion of additional attributes computed and selected in other software to be considered advantageous, it was necessary to verify an improvement in the classification result.
A widely known reference, suggested by Landis and Koch [55], was employed to interpret the quality of the classification on the basis of the kappa index value, as shown in Table 4.

3. Results and Discussion

Table 5 shows the highest values attained for global accuracy (GA) and the kappa index, representing the classification accuracy considering all classes, and those obtained for the F-score of Class 1, which represents the map quality in denoting the invasion areas of Pinus sp., the target of this study. The predominance of the SVM classifier is highlighted, since it presented the smallest error for the target class in all areas.
According to the reference suggested by Landis and Koch [55] (Table 4), the classifications were excellent in all areas. However, because the resolutions of orthomosaics generated by RPASs are very high, according to Sharma and Müllerová [56], the values considered acceptable in the literature vary considerably, from 70% for global accuracy [57] to 85% for sensitivity (recall) and precision [58]. Even considering this variation, the results achieved in this study can be considered satisfactory.
Indeed, GeoDMA calculates all the metrics generated by OTB’s LargeScaleMeanshift and ZonalStatistics algorithms, so it might seem sufficient to use only the attributes calculated by GeoDMA. If the calculations were fully identical, no differences would be expected between classification results using GeoDMA alone or GeoDMA combined with OTB. Nevertheless, for the attributes common to OTB and GeoDMA, subtle differences were observed, usually after the sixth or seventh decimal place, which is why both sets of attributes were kept, leaving the choice to the algorithm in the attribute selection step. From the attributes generated exclusively by OTB, the number of pixels (nbPixels) and the means (meanB0, meanB1, meanB2, meanB3, and meanB4) calculated by LargeScaleMeanshift were excluded, since the ZonalStatistics calculation was identical for these fields. Considering GeoDMA alone, some attributes had non-numerical values (not a number—NaN), which severely degraded the classification. These attributes, generally kurtosis (KURT_BX) and skewness (SKEW_BX) of bands 3 (VARI) and 4 (nDEM), were therefore removed.
The accuracy evaluation showed that the best results were achieved using the GeoDMA attributes only in the SAPIENS area. In the other areas, the best classification came from the combined OTB and GeoDMA features (Appendix A). This shows that, although subtle, there were differences between the calculated attribute values and that the selection should be left to the algorithm.
Table 6 shows the highest results obtained in the accuracy evaluation in each AOI for the GA, kappa index, and F-score of the target class (Pinus sp.), for both classifiers (SVM and RF). The values are presented in comparison with the reference value obtained from classifications using only the features calculated by OTB, shown in the first column. The acronyms in the column headings indicate the setup used in the processing that generated the features employed in the classification. The colors applied to the kappa index values correspond to the quality assigned according to Table 4. The list of acronyms and their meanings is given in Appendix C, while Appendix E shows the feature list.
In all four areas, the SVM classifier surpassed the RF classifier, with an average performance 6.24% higher, as shown in Table 7.
According to the results, the SVM performance was consistently superior in three areas (PAERVE, PNMDLC, and PEST) and slightly superior in the SAPIENS area. Regarding processing time, which reflects computational effort, SVM on average required approximately 30 times more effort than RF, taking an average of 33 s to complete the training and validation steps, while RF took an average of 1 s.
The accuracy evaluation indicated an improvement due to the additional procedures implemented in the GEOBIA flow of this study, namely, increasing the number of attributes (features) computed by GeoDMA and performing attribute selection in Weka through filter methods based on statistical calculations (CfsSE) and data mining with machine learning (wrapper). If the GEOBIA classification had been conducted in only the most basic way, using only the attributes computed in QGIS/OTB without submitting them to selection, no improvement would have been possible. Table 8 shows the average percentage quantification of the improvements in overall accuracy and in the kappa and F-score indices.
Compared to the results obtained using only the OTB package for GEOBIA classification, a consistent improvement was reached, with an average increase of 4.22% in GA, 5.77% in the kappa index, and 6.81% in the F-score of the target class (Pinus sp.).
The improvement in classification with the inclusion of additional attributes calculated with GeoDMA and then with the attribute selection conducted with Weka was not immediate, showing randomness, as displayed in Figure 5.
These results showed that no feature selection method was superior to another. One could also observe that, in machine learning, including more features did not necessarily result in improvement. Similarly, selection by the simplest method (CfsSE), referenced by Nevalainen et al. [48] as suitable for vegetation, proved successful for some classifications but not for others. The same occurred for the most computationally expensive method (wrapper).
According to Table 6, the F-score of the Pinus sp. class increased for the SAPIENS area when only the OTB attribute set was used in the classification. In the other cases, attribute selection consistently improved classification quality. This shows that selecting only the supposedly best set of attributes may not be effective, as observed by Nevalainen et al. [48], who adopted the method that retains the attributes with at least 80% selection frequency; in their study, that feature set did not improve accuracy, since the best performance (GA 95.5% and kappa 0.92) was achieved when all attributes were used. It is also worth mentioning that, in their research, the data were captured by a hyperspectral camera and processed by deep learning, and the classification included the point cloud rather than the orthomosaic, achieving the best results with random forest and a multilayer perceptron (MLP).
Unlike Nevalainen et al. [48], who obtained their best results using all attributes, this study observed variable outcomes, since increasing the number of attributes was not necessarily reflected in improved classification. Similarly, choosing the attributes indicated as most relevant did not always improve accuracy. The results indicate that testing possible attribute combinations from different attribute selection methods and ranking the classification results by accuracy can improve the final classification. In addition, automating this process will enable the testing of a wide range of possibilities.
Although De Luca et al. [29] used fixed-wing equipment, their approach is the closest to the methodology employed here. Their results of GA 94% and kappa 0.92 were higher than those of this study, but their method employed the near-infrared (NIR) band in a less environmentally diverse context.
Nascente et al. [59], on the other hand, also used a ready-to-fly multirotor quadcopter equipped with an RGB sensor, using GEOBIA to map invasive vegetation cover in vereda environments in the Cerrado biome. Despite the similar methodology and objectives, their technique differs, employing a nearest-neighbor classifier combined with feature space optimization. Their results also achieved GA and kappa values higher than 80% and 0.8, respectively. This demonstrates that more than one method can perform accurate mapping with equipment limited to the RGB spectrum.
The aims of Feduck et al. [60] were similar to those of this study in seeking a detection method capable of mapping the smallest individuals. However, their study used single photographs taken in an automated flight plan, hovering 15 m above ground level. No orthomosaic was generated, and their sampling-based method reached a GA of 86% and kappa of 0.81. Such photographs can be used to estimate samples inside vegetation clearings, but executing orthomosaic flight plans in forested environments may be impractical.
Among studies that proposed mapping invasive alien species, it is also worth mentioning the results of Lehmann et al. [14], who likewise employed open-source platforms to map invasions of Acacia mangium in a mussunanga environment in northeastern Brazil, reaching a GA of 85% and kappa of 0.82, and of Samiappan et al. [60], who mapped Phragmites australis (Poaceae) using RPAS images in marshes in the southern USA, attaining a GA of 85% and kappa of 0.70. Both studies had accuracy results slightly below those of this study, despite using cameras capable of capturing the infrared spectrum.
The next section presents the confusion matrices, maps, and interpretations of the most accurate classification models for each area of interest according to the experimental results. Appendix E lists all the features used to create these classification models.

3.1. SAPIENS

Table 9 shows the confusion matrix for the best classification obtained for the SAPIENS Park area. This classification was achieved using the SVM classifier with the attributes calculated by GeoDMA and selected in Weka using CfsSubsetEval with the GreedyStepWise search method. Compared to the OTB baseline, this improved the kappa index by 4.02% (Table 6).
Figure 6 shows the SAPIENS area image and the best classification obtained. The numerical results of the accuracy evaluation were corroborated by the visual analysis.
Overall, the classifier distinguished the Pinus sp. class within the tree vegetation in a very challenging context. The best-represented class was herbaceous vegetation, which was expected due to its homogeneity and distinct hues. The shade category, despite not representing a field feature itself, was treated as a class due to the difficulty of defining the objects it conceals. Even so, shade was more abundant where there were individuals of Pinus sp., owing to their larger size compared to other vegetation types, so this feature can also serve as an indicator of their occurrence.

3.2. PAERVE

PAERVE had the highest number of classes (Table 1) among the evaluated areas, representing an additional challenge. In addition to Pinus sp., Eucalyptus sp. and Casuarina sp. were also identified as exotic species. The confusion matrix (Table 10) indicated few errors, reflecting the higher accuracy achieved in this area compared to the other classified areas. This result shows that class diversification did not hinder the classification process, which worked even for less representative classes such as Casuarina sp., anthropic areas, and water.
Figure 7 shows the PAERVE area image and the best classification obtained. This area also showed visual coherence, with a good distinction between Pinus sp. and Eucalyptus sp. However, some segments amid the Eucalyptus sp. vegetation were classified as restinga shrubs, which does not correspond to what is actually present in the area. The same occurred with the Casuarina sp. class, which posed a further challenge to the classifier given the few samples in the image, in addition to the low crown density that left the soil and objects below the trees, such as people and vehicles, visible.

3.3. PNMDLC

Overall, the classification result in this area was also satisfactory, with a kappa index of 0.82 (Table 11), despite the complexity of the mosaic of environments represented by the various classes. Again, the best-represented class was herbaceous vegetation, which showed higher differentiation in spectral and textural aspects as well as in height.
Figure 8 shows the classification results attained for the PNMDLC, which were generally consistent with the image according to a visual evaluation.

3.4. PEST

The PEST area also presented a very challenging context due to the diversity of species and environments in the mosaic, as reflected in the number of classes needed for classification. The target class, Pinus sp., also achieved an excellent F-score in this area (Table 12). Most confusion occurred between the arboreal restinga and shrub–arboreal restinga classes, which are naturally difficult to distinguish even in the field.
Figure 9 shows the classification results. Notably, the mosaic diversity of environments formed by mostly herbaceous species was well represented in the classification, demonstrating the possibility of using the method in this context.
This area, mostly covered by herbaceous vegetation, was specifically chosen because it represents a region of invasion fronts around the matrix, close to different types of environments found in the restinga of Baixada do Maciambu. By selecting this region, the challenge was to map the typical behavior of invasion fronts in the park, with individuals of different ages, from seedlings to adults, increasing the chance of testing the detection method in a more diverse condition than in consolidated invasion areas, which tend to be more homogeneous. In addition, invasion fronts tend to be prioritized in biological invasion control projects; hence, it was considered important to represent them in the classification.
Nevertheless, during the visual evaluations, some segments classified as Pinus sp. were actually jerivá palms, Syagrus romanzoffiana (Figure 10). Like Pinus sp., these palms usually stand out from the surrounding vegetation due to their height. Miyoshi et al. [61] used a deep learning method with hyperspectral images, representing the state of the art in remote sensing, to detect and classify individuals of S. romanzoffiana, since the species occurs mainly in early secondary forests and serves as an ecological indicator of regenerating forests.
Since an altimetry raster was one of the attributes used in the classification, this factor likely contributed to the confusion. Unlike the multispectral images used by Miyoshi et al. [61], the RGB spectrum may not have sufficient amplitude to allow an adequate distinction.
However, in the study areas, the occurrence frequency of S. romanzoffiana was low. In a manual count over 9 ha, 32 individuals were found in PNMDLC, eight in PEST, seven in SAPIENS, and none in PAERVE. Thus, manual counts can be employed to adjust classifications if necessary. Overall, the classification results were considered satisfactory given the technology used.

3.5. Area Estimates

Area estimates per class were obtained from the classification, as shown in Table 13. Measuring these areas from the vectors aimed to validate the representativeness of each class in the landscape depicted by the image.
This result shows that the proposed methodological flow can be employed both for a quick and detailed diagnosis of the study area and for studies that involve monitoring on the basis of repeating the method application.

4. Conclusions

From the classification results, it was possible to map the areas invaded by Pinus sp. in the areas of interest. Thus, the method used here can support both the interpretation of invasions and the action plans needed to manage and control Pinus sp. as an invasive species.
In addition to mapping the target species of this study, the method also proved helpful as a general image classifier and can be employed in managing other exotic plant species or even in other environmental applications.
Compared to RF, the SVM classifier obtained better results in all areas studied, considering the measures used in the accuracy evaluation. According to these results, the behavior of SVM for this application appears more consistent. Nevertheless, further studies applying the method with similar equipment in other areas should be encouraged to assess whether this behavior can be confirmed as a trend. Regarding the attributes used, the best-performing sets differed significantly across the four areas (Appendix E).
The main methodological difference between this study and others was the effort to increase the set of attributes and to select them by applying distinct methods, with subsequent ranking according to the accuracy assessment. This approach substantially improved the classification in all four areas evaluated. Thus, although no single set of attributes ideal for classifying multiple areas was found, it was possible to identify the set of attributes that best classified each area and to achieve an excellent kappa index.
Given the results achieved, even more accurate classifications are likely attainable by adjusting the technique used. We therefore suggest investigating other vegetation indices, as substitutes for or complements to VARI, as well as other algorithms for extracting textural attributes and other attribute selection techniques.
We also recommend applying the technique to larger areas, since computational capacity can represent a bottleneck. One alternative is to divide the area into subareas of adequate dimensions, splitting the segmentation flow into stages and eliminating the first, optional segmentation, as proposed by De Luca et al. [29]. Another possibility is to develop the model in a restricted area that sufficiently includes all classes before applying it to the entire segmented area.

Author Contributions

Conceptualization, V.P.G. and E.A.W.R.; methodology, V.P.G. and N.N.I.; validation, V.P.G. and N.N.I.; formal analysis, E.A.W.R. and N.N.I.; investigation, V.P.G.; resources, V.P.G.; data curation, V.P.G.; writing—original draft preparation, V.P.G.; writing—review and editing, E.A.W.R. and N.N.I.; supervision, N.N.I.; project administration, V.P.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding for its development. However, the Federal Institute of Santa Catarina—IFSC reimbursed the translation and formatting of the submitted manuscript and 42% of the article processing charges (APC). The Conselho Nacional de Desenvolvimento Científico e Tecnológico—CNPq supported author N.N.I. with a research grant (process 308747/2021-6).

Data Availability Statement

The data presented in this study are openly available from Zenodo at https://doi.org/10.5281/zenodo.5809795.

Acknowledgments

In memory of Laura Tajes Gomes, whom we kindly thank for her careful grammatical proofreading. Laura will have our eternal gratitude.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Description of the attributes (features) calculated by the OTB and GeoDMA software.
Table A1. Attributes calculated by the LargeScaleMeanshift and ZonalStatistics algorithms of the OTB package implemented in QGIS. Those highlighted in bold were calculated for each raster band used in the process. As an example, only the statistics of band 0 (B0) are shown.
Field | Attribute | Algorithm | Description
label | Not a segmentation attribute | LargeScaleMeanshift | Segment label; not an attribute used for classification.
nbPixels | Pixel number | LargeScaleMeanshift | Number of pixels in each segment. Identical to the pixel count.
meanB0 | Mean | LargeScaleMeanshift | Average gray level of the pixels within the object.
varB0 | Variance | LargeScaleMeanshift | Variance of the pixel gray levels within the object.
count | Pixel count | ZonalStatistics | Pixel count in each segment. Identical to the pixel number.
mean_0 | Mean | ZonalStatistics | Average gray level of the pixels within the object.
stdev_0 | Standard deviation | ZonalStatistics | Standard deviation of the pixel gray levels within the object.
min_0 | Minimum | ZonalStatistics | Minimum gray level of the pixels within the object.
max_0 | Maximum | ZonalStatistics | Maximum gray level of the pixels within the object.
Source: Adapted from OTB Development Team [40].
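For readers reproducing this step outside QGIS, a sketch of computing equivalent per-segment statistics with the rasterstats Python package follows; the file names are assumptions, and this is an alternative route, not the OTB ZonalStatistics call used in the study.

```python
# Per-segment zonal statistics (count, mean, std, min, max) for one band.
from rasterstats import zonal_stats

stats = zonal_stats(
    "segments.shp",   # segmentation polygons (assumed file name)
    "ortho.tif",      # input raster; repeat per band for mean_1, mean_2, ...
    band=1,
    stats=["count", "mean", "std", "min", "max"],
)
print(stats[0])  # e.g. {'count': 412, 'mean': 101.3, ...}
```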
Table A2. Attributes calculated by the feature extraction function of the GeoDMA plugin in the TerraView software of INPE. Those highlighted in bold are calculated for each raster band used in the process. As an example, only the statistics of band 0 (B0) are shown.
Field | Attribute | Algorithm | Description
AMPL_B0 | Amplitude | Feature extraction | Pixel amplitude within the object, i.e., the maximum pixel value minus the minimum pixel value.
COUNT_B0 | Pixel count | Feature extraction | Total number of pixels within the object, including pixels with fictitious values.
KURT_B0 | Kurtosis | Feature extraction | Kurtosis of all valid (actual) pixels within the object.
MAX_VAL_B0 | Maximum value | Feature extraction | Maximum gray level value (actual) within the object.
MEAN_B0 | Mean | Feature extraction | Average value of all N pixels within the object.
MEDIAN_B0 | Median | Feature extraction | Median of all N pixels within the object.
MIN_VAL_B0 | Minimum value | Feature extraction | Minimum gray level value (actual) within the object.
MODE_B0 | Mode | Feature extraction | Gray level value with the highest occurrence (mode) among all N (actual) pixels within the object. When the object is multimodal, the first value is assumed.
N_MOD_B0 | Number of modes | Feature extraction | Number of modes of the object.
SKEW_B0 | Asymmetry (skewness) | Feature extraction | Skewness of all valid (actual) pixels within the object.
STDDEV_B0 | Standard deviation | Feature extraction | Standard deviation of all N (actual) pixels within the object.
SUM_B0 | Sum | Feature extraction | Sum of the values of all valid pixels within the object.
VLDCNT_B0 | Count of valid pixels (without fabricated values) | Feature extraction | Total number of pixels within the object without fabricated values.
VARCOEF_B0 | Coefficient of variation for valid pixels | Feature extraction | Coefficient of variation of the values of all valid (actual) pixels within the object.
VAR_B0 | Variance | Feature extraction | Variance of all N (actual) pixels within the object.
CONTSE_B0 | Contrast | Feature extraction | Measure of the intensity contrast between a pixel and its southeastern neighbor over the object; the contrast is 0 for a constant object. Also known as sum of squares variance.
DISSE_B0 | Dissimilarity | Feature extraction | Measures how different the elements of the gray level co-occurrence matrix (GLCM) are from each other; this value is high when the local region has high contrast.
ENERGSE_B0 | Energy | Feature extraction | Sum of the squared elements of the GLCM; high for uniform (orderly) regions. Also known as uniformity or angular second moment.
ENTRSE_B0 | Entropy | Feature extraction | Measures the disorder in an image. When the image is not uniform, many GLCM elements have small values, resulting in large entropy.
HOMOGSE_B0 | Homogeneity | Feature extraction | Assumes higher values for smaller differences within the GLCM; equals 1 for a diagonal GLCM. Also called inverse difference moment.
BRATIO_B0 | Contribution rate | Feature extraction | Describes the contribution of a given band to the region.
P_AREA | Polygon area | Feature extraction | Object area, in the measurement unit of the current spatial reference system.
P_PERIM | Polygon perimeter | Feature extraction | Object perimeter, in the measurement unit of the current spatial reference system.
P_FRACDIM | Polygon fractal dimension | Feature extraction | Fractal dimension of the polygon.
P_PERARAT | Ratio between the perimeter and the polygon or segment area | Feature extraction | Ratio between the object perimeter and its area.
P_COMPAC | Segment compactness | Feature extraction | Object compactness.
PBOX_AREA | Area of the segment bounding box | Feature extraction | Area of the object bounding box, in the measurement unit of the current spatial reference system.
PBOX_PERIM | Perimeter of the segment bounding box | Feature extraction | Perimeter of the object bounding box, in the measurement unit of the current spatial reference system.
PBOX_LEN | Height of the segment bounding box | Feature extraction | Height of the object bounding box, in the measurement unit of the current spatial reference system.
PBOX_WIDTH | Width of the segment bounding box | Feature extraction | Width of the object bounding box, in the measurement unit of the current spatial reference system.
POL_ANGLE | Object main angle (segment) | Feature extraction | Main angle of the object, obtained by computing the minimum circumscribing ellipse; the angle of the ellipse's major axis is taken as the object's angle.
PELLIP_FIT | Ratio between the object area and the minimum circumscribed ellipse area | Feature extraction | Fits the minimum circumscribed ellipse to the object and returns the ratio between the object area and the ellipse area.
PGYRATIUS | Average distance between each polygon vertex and its centroid | Feature extraction | Equals the average distance between each polygon vertex and its centroid. The more similar to a circle the object is, the more likely the centroid is inside it, and the closer this feature is to 0.
POLRADIUS | Polygon radius | Feature extraction | Polygon radius, i.e., the maximum distance between the polygon centroid and its vertices.
PCIRCLE | Ratio between the segment area and the smallest circumscribed circle | Feature extraction | Relates the object area to the smallest circle circumscribed around the object; in the equation, R is the maximum distance between the centroid and all vertices.
PSAHPEIDX | Shape index | Feature extraction | Ratio between the polygon perimeter and the square root of the polygon area.
PDENSITY | Ratio between the polygon area and its radius | Feature extraction | Ratio between the polygon area and the polygon radius.
PRECTFIT | Ratio between the segment area and its minimum enclosing rectangle | Feature extraction | Fits a minimum enclosing rectangle to the object and computes the ratio between the object area and the rectangle area; the closer to 1, the more rectangle-like the object.
Source: Adapted from INPE [62].
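The GLCM texture measures above (contrast, dissimilarity, energy, homogeneity, entropy) can be reproduced for a grayscale segment with recent scikit-image, as in the sketch below; the pixel array is a stand-in, and the diagonal angle only approximates GeoDMA's southeast co-occurrence direction, whose exact convention may differ.

```python
# Sketch: GLCM texture measures for one grayscale segment.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

pixels = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # stand-in segment

# Distance 1 along one diagonal; pi/4 approximates the SE direction of Table A2.
glcm = graycomatrix(pixels, distances=[1], angles=[np.pi / 4],
                    levels=256, symmetric=True, normed=True)

features = {p: graycoprops(glcm, p)[0, 0]
            for p in ("contrast", "dissimilarity", "energy", "homogeneity")}
p = glcm[:, :, 0, 0]
features["entropy"] = -np.sum(p[p > 0] * np.log2(p[p > 0]))
print(features)
```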

Appendix B

Parameters used in the Weka software for attribute selection:
Method WrapperSubsetEval (wrapper)
  • In the Select Attributes tab, select the “WrapperSubsetEval” method;
  • Select the string field of the dataset to be used as “class”;
  • Click on “WrapperSubsetEval” to open its options; in the classifier field, click Choose and select J48 under the trees tab; change folds from 5 to 10 and the threshold to −1;
  • Click OK to accept the settings;
  • In Search Method, choose BestFirst and open its options; set direction to Bidirectional and searchTermination to 5;
  • Click OK to accept the settings;
  • In Attribute Selection Mode, select the cross-validation option and set it to 10 folds and seed 1.
Method CfsSubsetEval (CfsSE)
  • In the Select Attributes tab, select the “CfsSubsetEval” method;
  • Select the string field of the dataset to be used as “class”;
  • Choose the Search Method as desired; parameters follow Nevalainen et al. [48] with the default options provided by the Weka software:
    ○ BestFirst (direction Forward, by default),
    ○ GeneticSearch,
    ○ GreedyStepWise;
  • In Attribute Selection Mode, select the cross-validation option and set it to 10 folds and seed 1.
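The steps above are GUI actions in Weka. For orientation only, a rough scikit-learn analogue of the wrapper selection is sketched below, with a decision tree standing in for J48 and forward sequential search standing in for BestFirst; it illustrates the idea under these stated substitutions and does not reproduce Weka's exact algorithms.

```python
# Sketch: wrapper-style attribute selection with cross-validation.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X_train = rng.normal(size=(200, 20))    # stand-in segment attribute table
y_train = rng.integers(0, 5, size=200)  # stand-in class labels

selector = SequentialFeatureSelector(
    DecisionTreeClassifier(random_state=1),  # J48 analogue
    n_features_to_select="auto",
    direction="forward",  # BestFirst/Bidirectional has no direct sklearn twin
    cv=10,                # mirrors the 10-fold setting above
)
selector.fit(X_train, y_train)
print(selector.get_support(indices=True))  # indices of retained attributes
```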

Appendix C

Table A3. List of acronyms to combine the attribute selection tested in the training and validation phases with the SVM and RF machine learning algorithms.
# | Acronym | Features Origin | Weka Feature Selection Method | Weka Search Method | Cutting Line (Fold Frequency)
1 | OTB | OTB | none | none | All features were used
2 | OTB_weka_CfsSE_BestFirst_cv10x1_80fd | OTB | CfsSE | BestFirst | 80%
3 | OTB_weka_CfsSE_GeneticSearch_cv10x1_80fd | OTB | CfsSE | GeneticSearch | 80%
4 | OTB_weka_CfsSE_GreedyStepWise_cv10x1_80fd | OTB | CfsSE | GreedyStepWise | 80%
5 | OTB_weka_CfsSE_COMBIN_cv10x1_80fd | OTB | CfsSE | BestFirst, GeneticSearch and GreedyStepWise combined | 80%
6 | GEODMA_weka_CfsSE_BestFirst_cv10x1_80fd | GeoDMA | CfsSE | BestFirst | 80%
7 | GEODMA_weka_CfsSE_GeneticSearch_cv10x1_80fd | GeoDMA | CfsSE | GeneticSearch | 80%
8 | GEODMA_weka_CfsSE_GreedyStepWise_cv10x1_80fd | GeoDMA | CfsSE | GreedyStepWise | 80%
9 | GEODMA_weka_CfsSE_COMBIN_cv10x1_80fd | GeoDMA | CfsSE | BestFirst, GeneticSearch and GreedyStepWise combined | 80%
10 | OTB_and_GEODMA_weka_CfsSE_BestFirst_cv10x1_80fd | OTB and GeoDMA | CfsSE | BestFirst | 80%
11 | OTB_and_GEODMA_weka_CfsSE_GeneticSearch_cv10x1_80fd | OTB and GeoDMA | CfsSE | GeneticSearch | 80%
12 | OTB_and_GEODMA_weka_CfsSE_GreedyStepWise_cv10x1_80fd | OTB and GeoDMA | CfsSE | GreedyStepWise | 80%
13 | OTB_and_GEODMA_weka_CfsSE_COMBIN_cv10x1_80fd | OTB and GeoDMA | CfsSE | BestFirst, GeneticSearch and GreedyStepWise combined | 80%
14 | OTB_weka_bd_cv_20pct | OTB | Wrapper | BestFirst | 20%
15 | OTB_weka_bd_cv_30pct | OTB | Wrapper | BestFirst | 30%
16 | OTB_weka_bd_cv_40pc | OTB | Wrapper | BestFirst | 40%
17 | OTB_weka_bd_cv_50pct | OTB | Wrapper | BestFirst | 50%
18 | GEODMA | GeoDMA | none | none | All features were used
19 | GEODMA_weka_bd_cv_10pct | GeoDMA | Wrapper | BestFirst | 10%
20 | GEODMA_weka_bd_cv_20pct | GeoDMA | Wrapper | BestFirst | 20%
21 | GEODMA_weka_bd_cv_30pct | GeoDMA | Wrapper | BestFirst | 30%
22 | OTB_E_GEODMA_df | OTB and GeoDMA | none | none | All features were used
23 | OTB_E_GEODMA_weka_bd_cv_10pct | OTB and GeoDMA | Wrapper | BestFirst | 10%
24 | OTB_E_GEODMA_weka_bd_cv_20pct | OTB and GeoDMA | Wrapper | BestFirst | 20%
25 | OTB_E_GEODMA_weka_bd_cv_30pct | OTB and GeoDMA | Wrapper | BestFirst | 30%
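The "cutting line (fold frequency)" column means that an attribute is retained only if it was selected in at least that fraction of the cross-validation folds. A sketch of the idea, reusing the stand-in selector and data from the Appendix B sketch:

```python
# Sketch: keep attributes selected in >= 80% of CV folds (the cutting line).
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))
y = rng.integers(0, 5, size=200)

counts = np.zeros(X.shape[1])
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
for train_idx, _ in skf.split(X, y):
    sel = SequentialFeatureSelector(DecisionTreeClassifier(random_state=1),
                                    direction="forward", cv=5)
    sel.fit(X[train_idx], y[train_idx])
    counts += sel.get_support()  # tally how often each attribute is picked

keep = np.where(counts / skf.get_n_splits() >= 0.8)[0]  # 80% cutting line
print(keep)
```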

Appendix D

Parameters set in the QGIS/OTB algorithms:
TrainVectorClassifier
  • Classifier Support Vector Machine (libsvm):
    ○ SVM Kernel Type: linear
    ○ SVM Model Type: csvc
    ○ Cost parameter C: 1.0
    ○ Cost parameter Nu: 0.5
    ○ Parameters optimization [optional]: no
    ○ Probability estimation [optional]: no
    ○ User defined input centroids: 0
    ○ Statistics file [optional]: not used
    ○ Random seed [optional]: 0
  • Classifier Random Forest (rf):
    ○ Maximum depth of the tree [optional]: 5
    ○ Minimum number of samples in each node [optional]: 10
    ○ Termination criteria for regression tree [optional]: 0
    ○ Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split [optional]: 10
    ○ Size of the randomly selected subset of features at each tree node [optional]: 0
    ○ Maximum number of trees in the forest [optional]: 100
    ○ Sufficient accuracy (OOB error) [optional]: 0.0
    ○ User defined input centroids: 0 (not empty, to avoid error)
    ○ Statistics file [optional]: not used
    ○ Random seed [optional]: 0
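The same libsvm setup can be driven from OTB's Python bindings instead of the QGIS dialog, as sketched below; the file names, the class field name, and the feature list are assumptions for illustration, not the study's actual values.

```python
# Sketch: OTB TrainVectorClassifier via the otbApplication Python API.
import otbApplication as otb

app = otb.Registry.CreateApplication("TrainVectorClassifier")
app.SetParameterStringList("io.vd", ["training_segments.shp"])  # assumed file
app.SetParameterStringList("cfield", ["class"])                 # assumed field
app.SetParameterStringList("feat", ["mean_0", "mean_1", "mean_2"])  # placeholders
app.SetParameterString("classifier", "libsvm")
app.SetParameterString("classifier.libsvm.k", "linear")
app.SetParameterString("classifier.libsvm.m", "csvc")
app.SetParameterFloat("classifier.libsvm.c", 1.0)
app.SetParameterString("io.out", "svm_model.txt")
app.ExecuteAndWriteOutput()
```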

Appendix E

Attributes (features) used in the classifications that obtained the highest kappa values in each area:
SAPIENS
SVM (KAPPA = 0.872983): “MODE_B0 CONTSE_B0 DISSE_B1 SKEW_B2 STDDEV_B2 VARCOEF_B2 MAX_VAL_B3 MEAN_B3 MIN_VAL_B3 MODE_B3 ENTRSE_B3 MEAN_B4 MODE_B4 BRATIO_B0 AMPL_B4 MIN_VAL_B4 ENTRSE_B4”
RF (KAPPA = 0.868927): “AMPL_B0 COUNT_B0 KURT_B0 MAX_VAL_B0 MEAN_B0 MEDIAN_B0 MIN_VAL_B0 MODE_B0 N_MOD_B0 SKEW_B0 STDDEV_B0 SUM_B0 VLDCNT_B0 VARCOEF_B0 VAR_B0 CONTSE_B0 DISSE_B0 ENERGSE_B0 ENTRSE_B0 HOMOGSE_B0 AMPL_B1 COUNT_B1 KURT_B1 MAX_VAL_B1 MEAN_B1 MEDIAN_B1 MIN_VAL_B1 MODE_B1 N_MOD_B1 SKEW_B1 STDDEV_B1 SUM_B1 VLDCNT_B1 VARCOEF_B1 VAR_B1 CONTSE_B1 DISSE_B1 ENERGSE_B1 ENTRSE_B1 HOMOGSE_B1 AMPL_B2 COUNT_B2 KURT_B2 MAX_VAL_B2 MEAN_B2 MEDIAN_B2 MIN_VAL_B2 MODE_B2 N_MOD_B2 SKEW_B2 STDDEV_B2 SUM_B2 VLDCNT_B2 VARCOEF_B2 VAR_B2 CONTSE_B2 DISSE_B2 ENERGSE_B2 ENTRSE_B2 HOMOGSE_B2 AMPL_B3 COUNT_B3 KURT_B3 MAX_VAL_B3 MEAN_B3 MEDIAN_B3 MIN_VAL_B3 MODE_B3 N_MOD_B3 SKEW_B3 STDDEV_B3 SUM_B3 VLDCNT_B3 VARCOEF_B3 VAR_B3 CONTSE_B3 DISSE_B3 ENERGSE_B3 ENTRSE_B3 HOMOGSE_B3 AMPL_B4 COUNT_B4 MAX_VAL_B4 MEAN_B4 MEDIAN_B4 MIN_VAL_B4 MODE_B4 N_MOD_B4 STDDEV_B4 SUM_B4 VLDCNT_B4 VARCOEF_B4 VAR_B4 CONTSE_B4 DISSE_B4 ENERGSE_B4 ENTRSE_B4 HOMOGSE_B4 BRATIO_B0 BRATIO_B1 BRATIO_B2 BRATIO_B3 BRATIO_B4 P_AREA P_PERIM P_FRACDIM P_PERARAT P_COMPAC PBOX_AREA PBOX_PERIM PBOX_LEN PBOX_WIDTH POL_ANGLE PELLIP_FIT PGYRATIUS POLRADIUS PCIRCLE PSAHPEIDX PDENSITY PRECTFIT”.
PAERVE
SVM (KAPPA = 0.895904): “BRATIO_B0 BRATIO_B2 MEDIAN_B1 MODE_B1 MODE_B2 N_MOD_B4 mean_2 mean_3 stdev_3 min_3 BRATIO_B1 BRATIO_B3 mean_0 min_1 max_3 mean_4 N_MOD_B0 MIN_VAL_B1 MEDIAN_B2 DISSE_B2 MODE_B3 ENERGSE_B3 MEDIAN_B4 DISSE_B4”
RF (KAPPA = 0.776774): “varB2 max_2 mean_3 min_3 mean_4 max_4 VARCOEF_B0 MODE_B1 VARCOEF_B1 MAX_VAL_B2 MODE_B2 VARCOEF_B2 CONTSE_B2 DISSE_B2 MEDIAN_B3 ENTRSE_B4 BRATIO_B0 BRATIO_B2 KURT_B1 AMPL_B2 SKEW_B2 MODE_B3 min_2”.
PNMDLC
SVM (KAPPA = 0.824535): “varB4 mean_3 min_3 max_3 mean_4 min_4 VARCOEF_B0 SKEW_B2 VARCOEF_B2 MEAN_B3 MEDIAN_B3 MIN_VAL_B3 MODE_B3 VARCOEF_B3 CONTSE_B3 ENTRSE_B4 BRATIO_B0 BRATIO_B3 CONTSE_B1 BRATIO_B4 min_0 MIN_VAL_B2 mean_0 max_0 stdev_4 MEAN_B0 SKEW_B0 BRATIO_B1 varB3 mean_2 MAX_VAL_B0 MEDIAN_B0 MODE_B0 ENERGSE_B3 P_PERARAT”
RF (KAPPA = 0.782718): “mean_3 mean_1 count max_4 min_1 min_3 varB2 mean_0 stdev_0 min_0 max_0 mean_2 min_4”.
PEST
SVM (KAPPA = 0.850954): “count mean_3 mean_0 mean_4 min_0 min_3 mean_1 mean_2 stdev_3 max_3 max_4”.
RF (KAPPA = 0.819808): “AMPL_B0 COUNT_B0 KURT_B0 MAX_VAL_B0 MEAN_B0 MEDIAN_B0 MIN_VAL_B0 MODE_B0 N_MOD_B0 SKEW_B0 STDDEV_B0 SUM_B0 VLDCNT_B0 VARCOEF_B0 VAR_B0 CONTSE_B0 DISSE_B0 ENERGSE_B0 ENTRSE_B0 HOMOGSE_B0 AMPL_B1 COUNT_B1 KURT_B1 MAX_VAL_B1 MEAN_B1 MEDIAN_B1 MIN_VAL_B1 MODE_B1 N_MOD_B1 SKEW_B1 STDDEV_B1 SUM_B1 VLDCNT_B1 VARCOEF_B1 VAR_B1 CONTSE_B1 DISSE_B1 ENERGSE_B1 ENTRSE_B1 HOMOGSE_B1 AMPL_B2 COUNT_B2 KURT_B2 MAX_VAL_B2 MEAN_B2 MEDIAN_B2 MIN_VAL_B2 MODE_B2 N_MOD_B2 SKEW_B2 STDDEV_B2 SUM_B2 VLDCNT_B2 VARCOEF_B2 VAR_B2 CONTSE_B2 DISSE_B2 ENERGSE_B2 ENTRSE_B2 HOMOGSE_B2 AMPL_B3 COUNT_B3 KURT_B3 MAX_VAL_B3 MEAN_B3 MEDIAN_B3 MIN_VAL_B3 MODE_B3 N_MOD_B3 SKEW_B3 STDDEV_B3 SUM_B3 VLDCNT_B3 VARCOEF_B3 VAR_B3 CONTSE_B3 DISSE_B3 ENERGSE_B3 ENTRSE_B3 HOMOGSE_B3 AMPL_B4 COUNT_B4 MAX_VAL_B4 MEAN_B4 MEDIAN_B4 MIN_VAL_B4 MODE_B4 N_MOD_B4 STDDEV_B4 SUM_B4 VLDCNT_B4 VARCOEF_B4 VAR_B4 CONTSE_B4 DISSE_B4 ENERGSE_B4 ENTRSE_B4 HOMOGSE_B4 BRATIO_B0 BRATIO_B1 BRATIO_B2 BRATIO_B3 BRATIO_B4 P_AREA P_PERIM P_FRACDIM P_PERARAT P_COMPAC PBOX_AREA PBOX_PERIM PBOX_LEN PBOX_WIDTH POL_ANGLE PELLIP_FIT PGYRATIUS POLRADIUS PCIRCLE PSAHPEIDX PDENSITY PRECTFIT”.

References

  1. Secretariat of the Convention on Biological Diversity. Global Biodiversity Outlook 5; Montreal. 2020. Available online: https://www.cbd.int/gbo/gbo5/publication/gbo-5-en.pdf (accessed on 20 July 2021).
  2. Ziller, S.R.; de Dechoum, M.S.; Duarte Silveira, R.A.; da Marques Rosa, H.; Mello Oliveira, B.C.; Zenni, R.D.; Motta, M.S.; da Filipe Silva, L. A Priority-Setting Scheme for the Management of Invasive Non-Native Species in Protected Areas. NeoBiota 2020, 62, 591–606. [Google Scholar] [CrossRef]
  3. Foxcroft, L.C.; Pyšek, P.; Richardson, D.M.; Genovesi, P. Plant Invasions in Protected Areas: Patterns, Problems and Challenges. In Plant Invasions in Protected Areas: Patterns, Problems and Challenges; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013; pp. 1–656. [Google Scholar] [CrossRef]
  4. Brasil Dados Gerais, UC. Available online: https://www.icmbio.gov.br/portal/images/stories/servicos/geoprocessamento/DCOL/dados_tabulares/DadosGerais_UC_julho_2019.pdf (accessed on 20 July 2021).
  5. Brasil Onde Estamos. Available online: https://www.icmbio.gov.br/portal/ondeestamos (accessed on 20 July 2021).
  6. Van Kleunen, M.; Dawson, W.; Essl, F.; Pergl, J.; Winter, M.; Weber, E.; Kreft, H.; Weigelt, P.; Kartesz, J.; Nishino, M.; et al. Global Exchange and Accumulation of Non-Native Plants. Nature 2015, 525, 100–103. [Google Scholar] [CrossRef] [PubMed]
  7. Pyšek, P.; Pergl, J.; Essl, F.; Lenzner, B.; Dawson, W.; Kreft, H.; Weigelt, P.; Winter, M.; Kartesz, J.; Nishino, M.; et al. Naturalized Alien Flora of the World: Species Diversity, Taxonomic and Phylogenetic Patterns, Geographic Distribution and Global Hotspots of Plant Invasion. Preslia 2017, 89, 203–274. [Google Scholar] [CrossRef]
  8. Nuñez, M.A.; Chiuffo, M.C.; Torres, A.; Paul, T.; Dimarco, R.D.; Raal, P.; Policelli, N.; Moyano, J.; García, R.A.; van Wilgen, B.W.; et al. Ecology and Management of Invasive Pinaceae around the World: Progress and Challenges. Biol. Invasions 2017, 19, 3099–3120. [Google Scholar] [CrossRef]
  9. Bechara, F.C. Restauração Ecológica de Restingas Contaminadas por Pinus no Parque Florestal do Rio Vermelho, Florianópolis, Sc. Univ. Fed. St. Catarina 2003, 108, 136. [Google Scholar]
  10. de Dechoum, M.S.; Giehl, E.L.H.; Sühs, R.B.; Silveira, T.C.L.; Ziller, S.R. Citizen Engagement in the Management of Non-Native Invasive Pines: Does It Make a Difference? Biol. Invasions 2019, 21, 175–188. [Google Scholar] [CrossRef]
  11. Gallardo, B.; Aldridge, D.C.; González-Moreno, P.; Pergl, J.; Pizarro, M.; Pyšek, P.; Thuiller, W.; Yesson, C.; Vilà, M. Protected Areas Offer Refuge from Invasive Species Spreading under Climate Change. Glob. Change Biol. 2017, 23, 5331–5343. [Google Scholar] [CrossRef]
  12. Moody, M.E.; Mack, R.N. Controlllng the spread of plant invasions: The importance of nascent foci. J. Appl. Ecol. 1988, 25, 1009–1021. [Google Scholar] [CrossRef]
  13. Marzialetti, F.; Frate, L.; Simone, W.D.; Frattaroli, A.R.; Acosta, A.T.R.; Carranza, M.L. Unmanned Aerial Vehicle (UAV)-Based Mapping of Acacia Saligna Invasion in the Mediterranean Coast. Remote Sens. 2021, 13, 3361. [Google Scholar] [CrossRef]
  14. Lehmann, J.R.K.; Prinz, T.; Ziller, S.R.; Thiele, J.; Heringer, G.; Meira-Neto, J.A.A.; Buttschardt, T.K. Open-Source Processing and Analysis of Aerial Imagery Acquired with a Low-Cost Unmanned Aerial System to Support Invasive Plant Management. Front. Environ. Sci. 2017, 5, 44. [Google Scholar] [CrossRef]
  15. Müllerová, J.; Brůna, J.; Dvořák, P.; Bartaloš, T.; Vítková, M. Does the Data Resolution/Origin Matter? Satellite, Airborne and UAV Imagery to Tackle Plant Invasions. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. ISPRS Arch. 2016, 41, 903–908. [Google Scholar] [CrossRef]
  16. Granshaw, S.I. RPV, UAV, UAS, RPAS … or Just Drone? Photogramm. Rec. 2018, 33, 160–170. [Google Scholar] [CrossRef]
  17. Osco, L.P.; Marcato Junior, J.; Marques Ramos, A.P.; de Castro Jorge, L.A.; Fatholahi, S.N.; de Andrade Silva, J.; Matsubara, E.T.; Pistori, H.; Gonçalves, W.N.; Li, J. A Review on Deep Learning in UAV Remote Sensing. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102456. [Google Scholar] [CrossRef]
  18. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  19. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  20. Ayamga, M.; Akaba, S.; Nyaaba, A.A. Multifaceted Applicability of Drones: A Review. Technol. Forecast. Soc. Chang. 2021, 167, 120677. [Google Scholar] [CrossRef]
  21. Giones, F.; Brem, A. From Toys to Tools: The Co-Evolution of Technological and Entrepreneurial Developments in the Drone Industry. Bus. Horiz. 2017, 60, 875–884. [Google Scholar] [CrossRef]
  22. Guimarães, N.; Pádua, L.; Marques, P.; Silva, N.; Peres, E.; Sousa, J.J. Forestry Remote Sensing from Unmanned Aerial Vehicles: A Review Focusing on the Data, Processing and Potentialities. Remote Sens. 2020, 12, 1046. [Google Scholar] [CrossRef]
  23. Onishi, M.; Ise, T. Explainable Identification and Mapping of Trees Using UAV RGB Image and Deep Learning. Sci. Rep. 2021, 11, 903. [Google Scholar] [CrossRef]
  24. Wang, X.; Wang, Y.; Zhou, C.; Yin, L.; Feng, X. Urban Forest Monitoring Based on Multiple Features at the Single Tree Scale by UAV. Urban For. Urban Green. 2021, 58, 126958. [Google Scholar] [CrossRef]
  25. White, R.; Bomber, M.; Hupy, J.; Shortridge, A. UAS-GEOBIA Approach to Sapling Identification in Jack Pine Barrens after Fire. Drones 2018, 2, 40. [Google Scholar] [CrossRef]
  26. Apostol, B.; Petrila, M.; Lorenţ, A.; Ciceu, A.; Gancz, V.; Badea, O. Species Discrimination and Individual Tree Detection for Predicting Main Dendrometric Characteristics in Mixed Temperate Forests by Use of Airborne Laser Scanning and Ultra-High-Resolution Imagery. Sci. Total Environ. 2020, 698, 134074. [Google Scholar] [CrossRef]
  27. Ferreira, M.P.; Zortea, M.; Zanotta, D.C.; Shimabukuro, Y.E.; de Souza Filho, C.R. Mapping Tree Species in Tropical Seasonal Semi-Deciduous Forests with Hyperspectral and Multispectral Data. Remote Sens. Environ. 2016, 179, 66–78. [Google Scholar] [CrossRef]
  28. Marpu, P.R.; Neubert, M.; Herold, H.; Niemeyer, I. Enhanced Evaluation of Image Segmentation Results. J. Spat. Sci. 2010, 55, 55–68. [Google Scholar] [CrossRef]
  29. De Luca, G.; Silva, J.M.N.; Cerasoli, S.; Araújo, J.; Campos, J.; Di Fazio, S.; Modica, G. Object-Based Land Cover Classification of Cork Oak Woodlands Using UAV Imagery and Orfeo Toolbox. Remote Sens. 2019, 11, 1238. [Google Scholar] [CrossRef]
  30. Costa, H.; Foody, G.M.; Boyd, D.S. Supervised Methods of Image Segmentation Accuracy Assessment in Land Cover Mapping. Remote Sens. Environ. 2018, 205, 338–351. [Google Scholar] [CrossRef]
  31. Modica, G.; De Luca, G.; Messina, G.; Praticò, S. Comparison and Assessment of Different Object-Based Classifications Using Machine Learning Algorithms and UAVs Multispectral Imagery: A Case Study in a Citrus Orchard and an Onion Crop. Eur. J. Remote Sens. 2021, 54, 431–460. [Google Scholar] [CrossRef]
  32. Zhang, X.; Chen, G.; Wang, W.; Wang, Q.; Dai, F. Object-Based Land-Cover Supervised Classification for Very-High-Resolution UAV Images Using Stacked Denoising Autoencoders. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 3373–3385. [Google Scholar] [CrossRef]
  33. Radoux, J.; Bogaert, P. Good Practices for Object-Based Accuracy Assessment. Remote Sens. 2017, 9, 646. [Google Scholar] [CrossRef]
  34. Albuquerque, R.W.; Ferreira, M.E.; Olsen, S.I.; Tymus, J.R.C.; Balieiro, C.P.; Mansur, H.; Moura, C.J.R.; Costa, J.V.S.; Branco, M.R.C.; Grohmann, C.H. Forest Restoration Monitoring Protocol with a Low-Cost Remotely Piloted Aircraft: Lessons Learned from a Case Study in the Brazilian Atlantic Forest. Remote Sens. 2021, 13, 2401. [Google Scholar] [CrossRef]
  35. IBGE. Manual Técnico Da Vegetação Brasileira, 2nd ed.; Instituto Brasileiro de Geografia e Estatística: Rio de Janeiro, Brazil, 2012; Volume 39, ISBN 978-85-240-4272-0. [Google Scholar]
  36. Back, Á.J. Informações Climáticas e Hidrológicas Dos Municípios Catarinenses (Com Programa HidroClimaSC); EPAGRI: Florianópolis, Brazil, 2020; ISBN 978-85-85014-92-6. [Google Scholar]
  37. Peel, M.C.; Finlayson, B.L.; McMahon, T.A. Updated World Map of the Köppen-Geiger Climate Classification. Hydrol. Earth Syst. Sci. 2007, 11, 1633–1644. [Google Scholar] [CrossRef]
  38. OpenDroneMap Authors. ODM—A Command Line Toolkit to Generate Maps, Point Clouds, 3D Models and DEMs from Drone, Balloon or Kite Images. Available online: https://github.com/OpenDroneMap/ODM (accessed on 22 July 2021).
  39. QGIS.org QGIS Geographic Information System. Available online: http://www.qgis.org (accessed on 18 September 2021).
  40. OTB Development Team. OTB CookBook Documentation. Available online: https://www.orfeo-toolbox.org/CookBook/ (accessed on 20 June 2021).
  41. Brasil GeoDMA—Geographic Data Mining Analyst. Divisão de Processamento de Imagens—Instituto Nacional de Pesquisas Espaciais—Inpe. Available online: http://wiki.dpi.inpe.br/doku.php?id=geodma (accessed on 22 July 2021).
  42. Brasil TerraLib and TerraView Wiki Page. Divisão de Processamento de Imagens—Instituto Nacional de Pesquisas Espaciais—Inpe. Available online: http://www.dpi.inpe.br/terralib5/wiki/doku.php?id=start (accessed on 22 July 2021).
  43. Frank, E.; Hall, M.A.; Witten, I.H. The WEKA Workbench. Online Appendix for “Data Mining: Practical Machine Learning Tools and Techniques”, 4th ed.; Morgan Kaufmann: Hamilton, New Zealand, 2016. [Google Scholar]
  44. DroneDeploy Team. DroneDeploy. Available online: https://www.dronedeploy.com/ (accessed on 22 July 2021).
  45. Henrich, V.; Krauss, G.; Götze, C.; Sandow, C. Index DataBase: A Database for Remote Sensing Indices. 2021. Available online: https://www.indexdatabase.de/db/ias.php (accessed on 22 July 2021).
  46. Liau, Y.T. Hierarchical Segmentation Framework for Identifying Natural Vegetation: A Case Study of the Tehachapi Mountains, California. Remote Sens. 2014, 6, 7276–7302. [Google Scholar] [CrossRef]
  47. Bhuiyan, M.A.E.; Yang, F.; Biswas, N.K.; Rahat, S.H.; Neelam, T.J. Machine Learning-Based Error Modeling to Improve GPM IMERG Precipitation Product over the Brahmaputra River Basin. Forecasting 2020, 2, 248–266. [Google Scholar] [CrossRef]
  48. Nevalainen, O.; Honkavaara, E.; Tuominen, S.; Viljanen, N.; Hakala, T.; Yu, X.; Hyyppä, J.; Saari, H.; Pölönen, I.; Imai, N.; et al. Individual Tree Detection and Classification with UAV-Based Photogrammetric Point Clouds and Hyperspectral Imaging. Remote Sens. 2017, 9, 185. [Google Scholar] [CrossRef]
  49. Tiwari, R.; Singh, M.P. Correlation-Based Attribute Selection Using Genetic Algorithm. IJCA 2010, 4, 28–34. [Google Scholar] [CrossRef]
  50. Gonçalves, V. Metodologia de análise de imagens baseada em objetos geográficos (GEOBIA) utilizando RPAS (drone) com sensor RGB. Estrabão 2021, 2, 41–85. [Google Scholar] [CrossRef]
  51. Sokolova, M.; Japkowicz, N.; Szpakowicz, S. Beyond Accuracy, F-Score and ROC: A Family of Discriminant Measures for Performance Evaluation. AAAI Workshop Tech. Rep. 2006, 4304, 24–29. [Google Scholar] [CrossRef]
  52. Cohen, J. A Coefficient of Agreement for Nominal Scales. ST-A coefficient of agreement for nominal. Educ. Psychol. Meas. 1960, 20, 37–46. [Google Scholar] [CrossRef]
  53. Story, M.; Congalton, R.G. Remote Sensing Brief Accuracy Assessment: A User’s Perspective. Photogramm. Eng. Remote Sens. 1986, 52, 397–399. [Google Scholar]
  54. He, H.; Ma, Y. Imbalanced Learning: Foundations, Algorithms, and Applications; Wiley-IEEE Press: Piscataway, NJ, USA, 2013; ISBN 978-1-118-07462-6. [Google Scholar]
  55. Landis, J.R.; Koch, G.G. The Measurement of Observer Agreement for Categorical Data. Biometrics 1977, 33, 159. [Google Scholar] [CrossRef]
  56. Sharma, J.B.; Müllerová, J. UAS for Nature Conservation—Monitoring Invasive Species. In Applications of Small Unmanned Aircraft Systems; CRC Press: Boca Raton, FL, USA, 2019; pp. 157–178. ISBN 978-0-429-24411-7. [Google Scholar] [CrossRef]
  57. Pringle, R.M.; Syfert, M.; Webb, J.K.; Shine, R. Quantifying Historical Changes in Habitat Availability for Endangered Species: Use of Pixel- and Object-Based Remote Sensing. J. Appl. Ecol. 2009, 46, 544–553. [Google Scholar] [CrossRef]
  58. Foody, G.M. Status of Land Cover Classification Accuracy Assessment. Remote Sens. Environ. 2002, 80, 185–201. [Google Scholar] [CrossRef]
  59. Nascente, J.C.; Ferreira, M.E.; Nunes, G.M. Integrated Fire Management as a Renewing Agent of Native Vegetation and Inhibitor of Invasive Plants in Vereda Habitats: Diagnosis by Remotely Piloted Aircraft Systems. Remote Sens. 2022, 14, 1040. [Google Scholar] [CrossRef]
  60. Samiappan, S.; Turnage, G.; Hathcock, L.; Casagrande, L.; Stinson, P.; Moorhead, R. Using Unmanned Aerial Vehicles for High-Resolution Remote Sensing to Map Invasive Phragmites Australis in Coastal Wetlands. Int. J. Remote Sens. 2017, 38, 2199–2217. [Google Scholar] [CrossRef]
  61. Miyoshi, G.T.; dos Arruda, M.S.; Osco, L.P.; Marcato Junior, J.; Gonçalves, D.N.; Imai, N.N.; Tommaselli, A.M.G.; Honkavaara, E.; Gonçalves, W.N. A Novel Deep Learning Method to Identify Single Tree Species in UAV-Based Hyperspectral Images. Remote Sens. 2020, 12, 1294. [Google Scholar] [CrossRef]
  62. INPE GeoDMA Features. Available online: http://wiki.dpi.inpe.br/doku.php?id=geodma_2:features (accessed on 24 June 2012).
Figure 1. Map revealing areas of interest. Acronyms: Sapiens Park (SAPIENS), Rio Vermelho State Park (PAERVE), Lagoa da Conceição Dunes Municipal Natural Park (PNMDLC), Serra do Tabuleiro State Park (PEST) and IBGE (Brazilian Institute of Geography and Statistics). Coordinate Reference System (CRS): European Petroleum Survey Group (EPSG) code 4326 (Geographic Coordinates, horizontal datum WGS84).
Figure 2. Methodological flowchart.
Figure 3. Segmentation example in the PNMDLC area in comparison to visual reference proposed by Liau [46]. CRS: EPSG 32722. Horizontal datum WGS84, UTM projection system, 22 South Zone. MC 51°W. GR. Raster: RPAS orthomosaic (GSD 5 cm). Flight date: 10/04/2021, 12:10:14–12:15:09 Brasilia Time (BRT), UTC-3.
Figure 4. Example of selected segments by location in the Sapiens Park area. CRS: EPSG 32722. Horizontal datum WGS84, UTM projection system, 22 South Zone. MC 51°W. GR. Raster: RPAS Orthomosaic (GSD 5 cm). Flight date: 06/11/2020, 09:18:21–09:40:10 Brasilia Time (BRT), UTC-3.
Figure 5. Relative percentage graphs showing improvement obtained for each method in each area of interest.
Figure 6. Comparison of the SAPIENS area orthomosaic and classified images (higher kappa), overlaid with 65% transparency. CRS: EPSG 32722. Horizontal datum WGS84, UTM projection system, 22 South Zone. MC 51°W. GR. Raster: RPAS Orthomosaic (GSD 5 cm). Flight date: 06/11/2020, 09:18:21–09:40:10 Brasilia Time (BRT), UTC-3.
Figure 7. Comparison of the PAERVE area orthomosaic and classified images (higher kappa), overlaid with 65% transparency. CRS: EPSG 32722. Horizontal datum WGS84, UTM projection system, 22 South Zone. MC 51°W. GR. Raster: RPAS Orthomosaic (GSD 5 cm). Flight date: 11/04/2021, 12:23:38–12:49:59 Brasilia Time (BRT), UTC-3.
Figure 8. Comparison of the PNMDLC area orthomosaic and classified images (higher kappa), overlaid with 65% transparency. CRS: EPSG 32722. Horizontal datum WGS84, UTM projection system, 22 South Zone. MC 51°W. GR. Raster: RPAS Orthomosaic (GSD 5 cm). Flight date: 10/04/2021, 12:10:14–12:15:09 Brasilia Time (BRT), UTC-3.
Figure 9. Comparison of the PEST area orthomosaic and classified images (higher kappa), overlaid with 65% transparency. CRS: EPSG 32722. Horizontal datum WGS84, UTM projection system, 22 South Zone. MC 51°W. GR. Raster: RPAS Orthomosaic (GSD 5 cm). Flight date: 06/04/2021, 10:13:20–10:18:15 Brasilia Time (BRT), UTC-3.
Figure 10. PEST classification detail showing the difference between a shrub and herbaceous species in a mosaic context and the jerivá palm Syagrus romanzoffiana mistakenly classified as Pinus sp. CRS: EPSG 32722. Horizontal datum WGS84, UTM projection system, 22 South Zone. MC 51°W. GR. Raster: RPAS Orthomosaic (GSD 5 cm). Flight date: 06/04/2021, 10:13:20–10:18:15 Brasilia Time (BRT), UTC-3.
Table 1. Number of classes and typologies by area of interest (AOI).
AOI | Typologies
SAPIENS | 1—Pinus sp.; 2—Arboreal restinga; 3—Herbaceous vegetation; 4—Marsh; 5—Shade
PAERVE | 1—Pinus sp.; 2—Eucalyptus sp.; 3—Casuarina sp.; 4—Herbaceous restinga; 5—Shrub restinga; 6—Shade; 7—Bare soil; 8—Water; 9—Anthropic
PNMDLC | 1—Pinus sp.; 2—Herbaceous restinga; 3—Shrub–arboreal restinga; 4—Arboreal restinga; 5—Shade; 6—Bare soil; 7—Herbaceous vegetation
PEST | 1—Pinus sp.; 2—Herbaceous restinga; 3—Shrub–arboreal restinga; 4—Arboreal restinga; 5—Shade; 6—Bare soil; 7—Trunks and branches; 8—Marsh
Table 2. Number of segments selected for training and validation.
 | SAPIENS | PAERVE | PNMDLC | PEST
Total segments | 45,445 (100.00%) | 59,589 (100.00%) | 47,531 (100.00%) | 119,190 (100.00%)
Training | 794 (1.75%) | 1078 (1.81%) | 1587 (3.34%) | 2157 (1.81%)
Validation | 339 (0.75%) | 464 (0.78%) | 680 (1.43%) | 926 (0.78%)
Sum (training + validation) | 1133 (2.49%) | 1542 (2.59%) | 2267 (4.77%) | 3083 (2.59%)
Table 3. Formulas used to evaluate accuracy.
Name | Formula | Objective
User accuracy (precision) | $au_i = x_{ii}/x_{i+}$, where $x_{ii}$ = objects correctly classified for class $i$ and $x_{i+}$ = total number of objects classified as class $i$ | Reflects the commission errors, which indicate the probability that an object classified in a given class actually belongs to that class [53].
Producer accuracy (recall) | $ap_i = x_{ii}/x_{+i}$, where $x_{+i}$ = total number of reference objects for class $i$ | Reflects the omission errors, i.e., the probability of an object being excluded (not classified) from the class to which it belongs [53]; also referred to as “sensitivity” [51].
Global accuracy | $a_g = \sum_{i=1}^{r} x_{ii}/n$, where $n$ = total number of objects considered in the analysis | Measurement used in the global evaluation of the classification [53].
Expected proportion | $P_e = \sum_{i=1}^{r} x_{i+}\,x_{+i}/n^2$ | Proportion of units for which agreement is expected by chance [52]; used to compose the Cohen kappa index.
Cohen kappa | $K = (a_g - P_e)/(1 - P_e)$ | Employed for the global evaluation of the classification; differs from the global accuracy by incorporating the elements outside the main diagonal [52].
F-score | $F = 2\,au_i\,ap_i/(au_i + ap_i)$ | Parameterized measure with equal weights for precision and recall, allowing the model performance for each class to be compared; the measure most frequently used in machine learning with data unbalanced between precision and recall [51,54].
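These formulas apply directly to a confusion matrix whose rows are reference classes and columns are classified classes, as in Tables 9–12. A minimal sketch implementing them follows; applied to the SAPIENS matrix of Table 9, it reproduces the published global accuracy and kappa.

```python
# Sketch: Table 3 accuracy measures from an r-by-r confusion matrix.
import numpy as np

def accuracy_measures(m):
    m = np.asarray(m, dtype=float)
    n = m.sum()
    recall = np.diag(m) / m.sum(axis=1)     # producer accuracy (PA)
    precision = np.diag(m) / m.sum(axis=0)  # user accuracy (UA)
    f_score = 2 * precision * recall / (precision + recall)
    ag = np.trace(m) / n                    # global accuracy
    pe = np.sum(m.sum(axis=1) * m.sum(axis=0)) / n**2
    kappa = (ag - pe) / (1 - pe)
    return recall, precision, f_score, ag, kappa

sapiens = [[75, 7, 0, 0, 8],   # Table 9 matrix (reference in rows)
           [8, 71, 0, 2, 0],
           [0, 0, 66, 0, 0],
           [2, 3, 0, 49, 0],
           [4, 0, 0, 0, 44]]
*_, ag, k = accuracy_measures(sapiens)
print(f"GA={ag:.2%}, kappa={k:.6f}")  # GA=89.97%, kappa=0.872983
```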
Table 4. Reference to interpret the classification performance based on the kappa index value.
Kappa | Strength of Agreement | Performance
k < 0 | Poor | Terrible
0 < k ≤ 0.2 | Slight | Poor
0.2 < k ≤ 0.4 | Fair | Reasonable
0.4 < k ≤ 0.6 | Moderate | Good
0.6 < k ≤ 0.8 | Substantial | Very good
0.8 < k ≤ 1.0 | Almost perfect | Excellent
Source: Landis and Koch [55].
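A small helper mapping a kappa value onto this Landis and Koch scale (a sketch; the bin edges follow Table 4):

```python
# Sketch: classify a kappa value per the Landis and Koch scale of Table 4.
def kappa_performance(k):
    if k > 1.0:
        raise ValueError("kappa must be <= 1")
    if k < 0:
        return "Poor (terrible)"
    for upper, label in [(0.2, "Slight (poor)"), (0.4, "Fair (reasonable)"),
                         (0.6, "Moderate (good)"), (0.8, "Substantial (very good)"),
                         (1.0, "Almost perfect (excellent)")]:
        if k <= upper:
            return label

print(kappa_performance(0.872983))  # Almost perfect (excellent)
```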
Table 5. Highest values obtained for the global accuracy, kappa, and F-score indices for Class 1 (Pinus sp.) and their respective compositions of software, attributes, and machine learning algorithms.
Global Accuracy
Area | Highest Value | Classifier | Software, Attributes, and Algorithms
SAPIENS | 89.97% | libsvm | GEODMA (attributes); Weka (feature selection with CfsSE, GreedyStepWise search method, cutting line 80%)
PAERVE | 91.16% | libsvm | OTB and GEODMA (attributes); Weka (feature selection with the Wrapper method, cutting line 20%)
PNMDLC | 88.68% | libsvm | OTB and GEODMA (attributes); Weka (feature selection with CfsSE, all three search methods combined, cutting line 80%)
PEST | 88.44% | libsvm | OTB and GEODMA (attributes); Weka (feature selection with the Wrapper method, cutting line 30%)
Mean | 89.56% | – | –
Kappa
Area | Highest Value | Classifier | Software, Attributes, and Algorithms
SAPIENS | 0.87 | libsvm | GEODMA (attributes); Weka (feature selection with CfsSE, GreedyStepWise search method, cutting line 80%)
PAERVE | 0.90 | libsvm | OTB and GEODMA (attributes); Weka (feature selection with the Wrapper method, cutting line 20%)
PNMDLC | 0.82 | libsvm | OTB and GEODMA (attributes); Weka (feature selection with CfsSE, all three search methods combined, cutting line 80%)
PEST | 0.85 | libsvm | OTB and GEODMA (attributes); Weka (feature selection with the Wrapper method, cutting line 30%)
Mean | 0.86 | – | –
F-Score
Area | Highest Value | Classifier | Software, Attributes, and Algorithms
SAPIENS | 0.86 | libsvm | All OTB features only
PAERVE | 0.96 | libsvm | GEODMA (attributes); Weka (feature selection with CfsSE, GreedyStepWise search method, cutting line 80%)
PNMDLC | 0.83 | libsvm | OTB and GEODMA (attributes); Weka (feature selection with CfsSE, all three search methods combined, cutting line 80%)
PEST | 0.96 | libsvm | OTB and GEODMA (attributes); Weka (feature selection with the Wrapper method, cutting line 30%)
Mean | 0.90 | – | –
Table 6. Highest values for global accuracy, kappa index, and F-score of Class 1 (Pinus sp.) reached by each classifier, compared with those obtained using the OTB software alone. The Max column holds the best value for each index, produced by the processing scheme named alongside.
Area | Index per Classifier | OTB | Max | Best Processing Scheme | Improvement | %
SAPIENS | Global accuracy, RF | 84.66% | 89.68% | GEODMA | 0.050147 | 5.92%
SAPIENS | Global accuracy, libsvm | 87.32% | 89.97% | GEODMA+weka_CfsSE_GreedyStepWise_cv10x1_80% | 0.026549 | 3.04%
SAPIENS | Kappa, RF | 0.805405 | 0.868927 | GEODMA | 0.063522 | 7.89%
SAPIENS | Kappa, libsvm | 0.839244 | 0.872983 | GEODMA+weka_CfsSE_GreedyStepWise_cv10x1_80% | 0.033739 | 4.02%
SAPIENS | F-score Pinus, RF | 0.797814 | 0.851064 | GEODMA | 0.053250 | 6.67%
SAPIENS | F-score Pinus, libsvm | 0.855556 | 0.855556 | OTB | 0.000000 | 0.00%
PAERVE | Global accuracy, RF | 77.37% | 81.25% | OTB_and_GEODMA+weka_CfsSE_GreedyStepWise_cv10x1_80% | 0.038793 | 5.01%
PAERVE | Global accuracy, libsvm | 82.76% | 91.16% | OTB_and_GEODMA+weka_Wrapper_bd_cv_20% | 0.084052 | 10.16%
PAERVE | Kappa, RF | 0.729318 | 0.776774 | OTB_and_GEODMA+weka_CfsSE_GreedyStepWise_cv10x1_80% | 0.047456 | 6.51%
PAERVE | Kappa, libsvm | 0.797338 | 0.895904 | OTB_and_GEODMA+weka_Wrapper_bd_cv_20% | 0.098566 | 12.36%
PAERVE | F-score Pinus, RF | 0.826087 | 0.894942 | GEODMA+weka_CfsSE_BestFirst_cv10x1_80% | 0.068855 | 8.34%
PAERVE | F-score Pinus, libsvm | 0.850467 | 0.957627 | GEODMA+weka_CfsSE_GreedyStepWise_cv10x1_80% | 0.107160 | 12.60%
PNMDLC | Global accuracy, RF | 84.12% | 86.32% | OTB+weka_Wrapper_bd_cv_30% | 0.022059 | 2.62%
PNMDLC | Global accuracy, libsvm | 84.85% | 88.68% | OTB_and_GEODMA+weka_CfsSE_COMBIN_cv10x1_80% | 0.038235 | 4.51%
PNMDLC | Kappa, RF | 0.745134 | 0.782718 | OTB+weka_Wrapper_bd_cv_30% | 0.037584 | 5.04%
PNMDLC | Kappa, libsvm | 0.771837 | 0.824535 | OTB_and_GEODMA+weka_CfsSE_COMBIN_cv10x1_80% | 0.052698 | 6.83%
PNMDLC | F-score Pinus, RF | 0.646617 | 0.706767 | OTB+weka_Wrapper_bd_cv_40% | 0.060150 | 9.30%
PNMDLC | F-score Pinus, libsvm | 0.717647 | 0.828947 | OTB_and_GEODMA+weka_CfsSE_COMBIN_cv10x1_80% | 0.111300 | 15.51%
PEST | Global accuracy, RF | 84.99% | 86.18% | GEODMA | 0.011879 | 1.40%
PEST | Global accuracy, libsvm | 87.47% | 88.44% | OTB+weka_Wrapper_bd_cv_30% | 0.009719 | 1.11%
PEST | Kappa, RF | 0.802609 | 0.819808 | GEODMA | 0.017199 | 2.14%
PEST | Kappa, libsvm | 0.839762 | 0.850954 | OTB+weka_Wrapper_bd_cv_30% | 0.011192 | 1.33%
PEST | F-score Pinus, RF | 0.923636 | 0.930403 | OTB_and_GEODMA | 0.006767 | 0.73%
PEST | F-score Pinus, libsvm | 0.949495 | 0.962457 | OTB+weka_Wrapper_bd_cv_30% | 0.012962 | 1.37%
Table 7. Classifier performance comparison.
Area | Kappa RF | Kappa SVM | Difference | %
SAPIENS | 0.868927 | 0.872983 | 0.004056 | 0.47%
PAERVE | 0.776774 | 0.895904 | 0.119130 | 15.34%
PNMDLC | 0.782718 | 0.824535 | 0.041817 | 5.34%
PEST | 0.819808 | 0.850954 | 0.031146 | 3.80%
Average | | | | 6.24%
Table 8. Mean improvement obtained in the global accuracy, kappa, and F-score indices of Class 1 (Pinus sp.) with the use of the techniques employed to improve classification.
Classifier | Global Accuracy | Kappa | F-Score of Class 1 (Pinus sp.)
RF | 3.74% | 5.40% | 6.26%
libsvm | 4.70% | 6.14% | 7.37%
Mean | 4.22% | 5.77% | 6.81%
Table 9. Confusion classification matrix for the SAPIENS area. The rows represent the reference map, and the columns represent the classified map. The classes are 1—Pinus sp. (PI), 2—arboreal vegetation (AV), 3—herbaceous vegetation (HV), 4—marsh (BA), and 5—shade (SB). The table also includes the measures of producer accuracy or recall (PA), user accuracy or precision (UA), F-score, global accuracy (GA), and Cohen kappa index (K).
Reference | PI | AV | HV | BA | SB | Sum | PA (recall)
1—PI | 75 | 7 | 0 | 0 | 8 | 90 | 0.83
2—AV | 8 | 71 | 0 | 2 | 0 | 81 | 0.88
3—HV | 0 | 0 | 66 | 0 | 0 | 66 | 1.00
4—BA | 2 | 3 | 0 | 49 | 0 | 54 | 0.91
5—SB | 4 | 0 | 0 | 0 | 44 | 48 | 0.92
Sum | 89 | 81 | 66 | 51 | 52 | 339 |
UA (precision) | 0.84 | 0.88 | 1.00 | 0.96 | 0.85 | |
F-score | 0.84 | 0.88 | 1.00 | 0.93 | 0.87 | |
GA = 89.97%; K = 0.872983
Table 10. Confusion classification matrix for the PAERVE area. The rows represent the reference map, and the columns represent the classified map. The classes are 1—Pinus sp. (PI), 2—Eucalyptus sp. (EU), 3—Casuarina sp. (CA), 4—herbaceous restinga (HR), 5—shrub restinga (SR), 6—shade (SB), 7—bare soil (BS), 8—water (A), and 9—anthropic area (AN). The table also includes the measures of producer accuracy or recall (PA), user accuracy or precision (UA), F-score, global accuracy (GA), and Cohen kappa index (K).
Reference | PI | EU | CA | HR | SR | SB | BS | A | AN | Sum | PA (recall)
1—PI | 110 | 5 | 0 | 1 | 1 | 2 | 0 | 0 | 0 | 119 | 0.92
2—EU | 1 | 69 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 74 | 0.93
3—CA | 0 | 0 | 9 | 1 | 0 | 0 | 1 | 0 | 0 | 11 | 0.82
4—HR | 0 | 1 | 1 | 17 | 4 | 0 | 2 | 0 | 0 | 25 | 0.68
5—SR | 0 | 4 | 0 | 0 | 68 | 0 | 0 | 0 | 0 | 72 | 0.94
6—SB | 0 | 1 | 0 | 0 | 0 | 34 | 0 | 0 | 0 | 35 | 0.97
7—BS | 0 | 0 | 1 | 1 | 0 | 0 | 32 | 0 | 0 | 34 | 0.94
8—A | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 59 | 2 | 61 | 0.97
9—AN | 0 | 0 | 2 | 0 | 0 | 0 | 3 | 3 | 25 | 33 | 0.76
Sum | 111 | 80 | 13 | 20 | 77 | 36 | 38 | 62 | 27 | 464 |
UA (precision) | 0.99 | 0.86 | 0.69 | 0.85 | 0.88 | 0.94 | 0.84 | 0.95 | 0.93 | |
F-score | 0.96 | 0.90 | 0.75 | 0.76 | 0.91 | 0.96 | 0.89 | 0.96 | 0.83 | |
GA = 91.16%; K = 0.895904
Table 11. Confusion classification matrix for the PNMDLC area. The rows represent the reference map, and the columns represent the classified map. The classes are 1—Pinus sp. (PI), 2—herbaceous restinga (HR), 3—shrub–arboreal restinga (SRA), 4—arboreal restinga (AR), 5—shade (SB), 6—bare soil (BS), and 7—herbaceous vegetation (HV). The table also includes the measures of producer accuracy or recall (PA), user accuracy or precision (UA), F-score, global accuracy (GA), and Cohen kappa index (K).
Reference | PI | HR | SRA | AR | SB | BS | HV | Sum | PA (recall)
1—PI | 63 | 0 | 0 | 13 | 1 | 0 | 0 | 77 | 0.82
2—HR | 0 | 31 | 0 | 0 | 0 | 1 | 0 | 32 | 0.97
3—SRA | 1 | 0 | 65 | 27 | 0 | 0 | 0 | 93 | 0.70
4—AR | 10 | 1 | 10 | 345 | 5 | 0 | 0 | 371 | 0.93
5—SB | 1 | 0 | 0 | 6 | 38 | 0 | 0 | 45 | 0.84
6—BS | 0 | 0 | 0 | 0 | 0 | 55 | 1 | 56 | 0.98
7—HV | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 6 | 1.00
Sum | 75 | 32 | 75 | 391 | 44 | 56 | 7 | 680 |
UA (precision) | 0.84 | 0.97 | 0.87 | 0.88 | 0.86 | 0.98 | 0.86 | |
F-score | 0.83 | 0.97 | 0.77 | 0.91 | 0.85 | 0.98 | 0.92 | |
GA = 88.68%; K = 0.824535
Table 12. Confusion classification matrix for the PEST area. The rows represent the reference map, and the columns represent the classified map. The classes are 1—Pinus sp. (PI), 2—herbaceous restinga (HR), 3—shrub–arboreal restinga (SRA), 4—arboreal restinga (AR), 5—shade (SB), 6—bare soil (SL), 7—trunks and branches (TB), and 8—marsh (BA). The table also includes the measures of producer accuracy or recall (PA), user accuracy or precision (UA), F-score, global accuracy (GA), and Cohen kappa index (K).
Reference | PI | HR | SRA | AR | SB | SL | TB | BA | Sum | PA (recall)
1—PI | 141 | 0 | 3 | 1 | 0 | 0 | 0 | 0 | 145 | 0.97
2—HR | 0 | 62 | 11 | 0 | 0 | 0 | 0 | 7 | 80 | 0.78
3—SRA | 1 | 4 | 303 | 22 | 1 | 0 | 0 | 0 | 331 | 0.92
4—AR | 6 | 0 | 40 | 36 | 1 | 0 | 1 | 0 | 84 | 0.43
5—SB | 0 | 0 | 1 | 0 | 30 | 0 | 0 | 0 | 31 | 0.97
6—SL | 0 | 0 | 0 | 0 | 0 | 17 | 0 | 0 | 17 | 1.00
7—TB | 0 | 0 | 0 | 0 | 0 | 0 | 41 | 0 | 41 | 1.00
8—BA | 0 | 7 | 1 | 0 | 0 | 0 | 0 | 189 | 197 | 0.96
Sum | 148 | 73 | 359 | 59 | 32 | 17 | 42 | 196 | 926 |
UA (precision) | 0.95 | 0.85 | 0.84 | 0.61 | 0.94 | 1.00 | 0.98 | 0.96 | |
F-score | 0.96 | 0.81 | 0.88 | 0.50 | 0.95 | 1.00 | 0.99 | 0.96 | |
GA = 88.44%; K = 0.850954
Table 13. Area calculations by classified typology.
AOI | Typology | Area (m²) | Area (ha) | %
SAPIENS | 1—Pinus sp. | 40,273.377 | 4.027 | 44.76%
SAPIENS | 2—Arboreal restinga | 22,914.420 | 2.291 | 25.47%
SAPIENS | 3—Herbaceous vegetation | 10,817.774 | 1.082 | 12.02%
SAPIENS | 4—Marsh | 2919.212 | 0.292 | 3.24%
SAPIENS | 5—Shade | 13,042.063 | 1.304 | 14.50%
SAPIENS | Total | 89,966.846 | 8.996 |
PAERVE | 1—Pinus sp. | 42,583.091 | 4.258 | 47.33%
PAERVE | 2—Eucalyptus sp. | 14,589.845 | 1.459 | 16.21%
PAERVE | 3—Casuarina sp. | 1868.936 | 0.187 | 2.08%
PAERVE | 4—Herbaceous restinga | 8295.598 | 0.830 | 9.22%
PAERVE | 5—Shrub restinga | 7256.411 | 0.726 | 8.06%
PAERVE | 6—Shade | 10,289.606 | 1.029 | 11.44%
PAERVE | 7—Bare soil | 4320.588 | 0.432 | 4.80%
PAERVE | 8—Water | 264.970 | 0.026 | 0.29%
PAERVE | 9—Anthropic | 510.185 | 0.051 | 0.57%
PAERVE | Total | 89,979.230 | 8.998 |
PNMDLC | 1—Pinus sp. | 6444.565 | 0.644 | 7.16%
PNMDLC | 2—Herbaceous restinga | 5481.469 | 0.548 | 6.09%
PNMDLC | 3—Shrub–arboreal restinga | 14,529.664 | 1.453 | 16.15%
PNMDLC | 4—Arboreal restinga | 56,572.321 | 5.657 | 62.87%
PNMDLC | 5—Shade | 4781.252 | 0.478 | 5.31%
PNMDLC | 6—Bare soil | 1774.236 | 0.177 | 1.97%
PNMDLC | 7—Herbaceous vegetation | 399.328 | 0.040 | 0.44%
PNMDLC | Total | 89,982.835 | 8.997 |
PEST | 1—Pinus sp. | 4654.108 | 0.465 | 5.17%
PEST | 2—Herbaceous restinga | 18,393.586 | 1.839 | 20.44%
PEST | 3—Shrub–arboreal restinga | 35,376.910 | 3.538 | 39.32%
PEST | 4—Arboreal restinga | 8173.890 | 0.817 | 9.09%
PEST | 5—Shade | 1528.963 | 0.153 | 1.70%
PEST | 6—Bare soil | 221.288 | 0.022 | 0.25%
PEST | 7—Trunks and branches | 5299.039 | 0.530 | 5.89%
PEST | 8—Marsh | 16,323.355 | 1.632 | 18.14%
PEST | Total | 89,971.139 | 8.996 |
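For orientation, per-class areas like those in Table 13 follow directly from a classified map. The sketch below shows the raster variant, where each pixel at the 5 cm GSD used in this study covers 0.05 m × 0.05 m = 0.0025 m²; the file name is an assumption, and areas derived from classified polygons would be summed analogously.

```python
# Sketch: per-class area from a classified raster at 5 cm GSD.
import numpy as np
import rasterio

with rasterio.open("classified.tif") as src:  # assumed file name
    classes = src.read(1)

pixel_area = 0.05 * 0.05  # m^2 per pixel at 5 cm GSD
values, counts = np.unique(classes, return_counts=True)
for v, c in zip(values, counts):
    area_m2 = c * pixel_area
    print(f"class {v}: {area_m2:,.3f} m^2 = {area_m2 / 10_000:.3f} ha")
```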
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
