EVALUATION OF SPECTRAL INDICES FOR DETECTION OF BURNED AREAS IN AN ENVIRONMENTAL PROTECTION AREA USING SPOT-5 IMAGES



INTRODUCTION
In recent years, large-scale forest fires have tended to occur with greater frequency and magnitude due to climate change and human influence, causing catastrophic risks (PINTO et al. 2020). Forest fires significantly affect the ecological, social and economic environment. The Cerrado biome experiences large fires every year, in which thousands of square kilometers are burned, making an important contribution to the total global burned area. Forest fires, together with climate change and drought, are considered among the main disturbances that cause, among other impacts, the destruction of native vegetation in this biome (LIBONATI et al. 2015). Climate change represents a major challenge for forests, particularly for the Cerrado, where the likelihood of extreme events is expected to increase.
Detecting active fires quickly and over large areas is a critical task in natural risk management. Reliable and rapid fire detection, for example, improves the coordination of firefighting activities and rescue missions (CHUVIECO et al. 2019). Fire monitoring from remote sensing data is primarily performed by observing two distinct surface conditions: the presence of a fire front and the area affected by a fire. Remote sensing thus provides an opportunity to assess the impact of forest fires, because a fire changes vegetation and soil moisture in ways that optical remote sensing images can capture (WEI et al. 2021). Optical images from the SPOT-5 satellite, with a spatial resolution of 10 m, together with spectral indices and Machine Learning (ML), have already been used to assess burned area and fire severity (SANKEY et al. 2018).
A variety of methods have been developed for monitoring and mapping burned areas using remote sensing data, including threshold-based methods with spectral indices. Because of their simplicity and efficiency, many burned area mapping methods use spectral indices computed from post-fire imagery, or from pre-fire and post-fire imagery, to identify burned areas and separate them from other land cover categories and states. The most used indices for this purpose are the Normalized Burn Ratio (NBR) and the Normalized Difference Vegetation Index (NDVI) (ATAK and TONYALOĞLU, 2020). In recent years, with the emergence of machine learning, research applying specific algorithms to geospatial data has become increasingly common, mainly in environmental studies that map and monitor the state of the territory and the changes induced by disturbances such as forest fires, thanks to the high accuracy of the predictions. One of the most common applications of machine learning is land use and land cover classification with the Deep Neural Network (DNN), Support Vector Machine (SVM), Random Forest (RF) and k-Nearest Neighbor (KNN) algorithms. These studies indicated that SVM and RF classifiers are more versatile than deep learning for land use and land cover applications, as they are less sensitive to imbalanced data (ZHANG et al. 2021).
Satellite-derived burned area products are often a prerequisite for a variety of research efforts and are of considerable user interest despite known accuracy limitations. Many studies report global and regional indicators of spatial, spectral and temporal accuracy to assess the ability of spectral indices to detect burned areas, but they tend to rely on relatively sparse validation data (VETRITA et al. 2021). Given these limitations, a subset of two SPOT-5 HRG scenes (before and after the fire) located within the Environmental Protection Area (APA) of Rio Preto-BA, for the 2015 fire season, was used to evaluate different spectral indices for detecting burned area using parametric statistics and Machine Learning techniques.

Study area
The burned extent lies entirely within the Rio Preto APA, located in the interior of the state of Bahia at Lat 10°41'57.65" S and Long 45°51'27.41" W. APA Rio Preto comprises the municipalities of Formosa do Rio Preto, Santa Rita de Cássia and Mansidão and covers 1,146,161.96 ha. The creation of APA Rio Preto through State Decree No. 10,019 of June 5, 2006 took into account the natural characteristics of the area, its ecological potential and its concomitant high environmental fragility. The native vegetation of this region belongs to the Cerrado and to the Cerrado-Caatinga ecological transition, with riparian forests and veredas occupying the valley bottoms. Formosa do Rio Preto-BA has great potential for grain production. According to Silva (2021), the region's characteristic climate is Sub-Humid Dry, a type of hot climate. Average annual temperatures range between 25 °C and 28 °C, and maximum temperatures from 30 °C to 33 °C. Air humidity reaches very low levels in the dry winter (38 to 40%) and very high levels in the rainy summer (95 to 97%), which evidences the seasonal alternation of rainy and dry seasons. The Cerrado areas are characterized by a well-defined dry season in which the local vegetation is more prone to fire, with a predominance of open Cerrado formations and frequent fires of anthropogenic and natural origin. Fire behavior depends on meteorological conditions, relief and time of burning, which condition the temperature reached by the fire and the time necessary for the total burning of the available plant material (BOLSON and ARAÚJO, 2022). According to the Queimadas Database platform (BDQueimadas), 4,391 fire outbreaks were registered for the years 2011 to 2015, a relative increase of 6.62% in relation to the first interval studied. The years with the highest incidences were 2011, 2012 and 2015, with respective values of 21.54%, 32.34% and 26.71%.

SPOT-5 HRG
Two SPOT-5 HRG Level 2 scenes were used, dated 10 August 2015 (pre-fire) and 15 September 2015 (post-fire), made available free of charge by the European Space Agency (ESA) archive platform at https://earth.esa.int/eogateway/catalog/spot1-5-esa-archive. The SPOT-5 image has a spatial resolution of 10 m with the following 4 bands: green (0.49-0.61 μm), red (0.61-0.68 μm), NIR (0.78-0.89 μm) and SWIR (1.58-1.75 μm). These images are delivered pre-processed as orthorectified surface reflectance (ToA) after atmospheric correction, including adjacency and terrain effects, along with a mask of clouds and their shadows and a mask of water and snow. Documentation on the SPOT-5 scene pre-processing steps carried out by ESA can be found in DOSOGNE et al. (2018) and SANKEY et al. (2018).

Spectral Indices
Spectral indices compatible with the HRG sensor were used (Table 1). According to Chuvieco et al. (2019), these indices are sensitive to the discrimination of burned areas and widely used around the world. The following spectral indices were compared: BAI, BAIM, EVI2, NBR and NDVI.
Table 1. The spectral indices tested to determine the best for detecting burned areas.

Spectral index | Formula | Reference
Burned Area Index (BAI) | |
Normalized Burn Ratio (NBR) | (NIR - SWIR) / (NIR + SWIR) | Key and Benson (1999)
Normalized Difference Vegetation Index (NDVI) | (NIR - R) / (NIR + R) | Tucker (1979)
Source: Authors (2023)
The BAI and BAIM indices are calculated from the spectral distance of each pixel to a convergence point for charcoal, which separates charcoal from other land covers and is defined from the radiative properties of recently burned areas. BAI uses the red (RED) and near infrared (NIR) bands, while BAIM uses the near infrared (NIR) and shortwave infrared (SWIR) bands. The Normalized Burn Ratio (NBR) is used to identify burned areas and provide a measure of burn severity. It is calculated as a normalized ratio between the NIR and SWIR bands. EVI2 is an index sensitive to variation in canopy structure, including leaf area index, plant physiognomy and canopy architecture. NDVI is sensitive to vegetation greenness and is widely used in mapping studies of burned areas (JIANG et al. 2008).
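As an illustration of how the five indices can be computed per pixel, the following minimal Python sketch may help. Only the NBR and NDVI formulas appear explicitly in Table 1 as reproduced here; the EVI2, BAI and BAIM expressions and their convergence coefficients are the forms commonly given in the literature and are assumed, not confirmed, to be the ones applied in this study. The reflectance values are hypothetical.

```python
# Per-pixel spectral indices from surface reflectance values in [0, 1].
# NBR and NDVI follow Table 1; EVI2, BAI and BAIM use literature forms
# that are ASSUMED here (the source text does not reproduce them).
# No guard is included for zero denominators.

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - R) / (NIR + R)."""
    return (nir - red) / (nir + red)

def nbr(nir, swir):
    """Normalized Burn Ratio: (NIR - SWIR) / (NIR + SWIR)."""
    return (nir - swir) / (nir + swir)

def evi2(nir, red):
    """Two-band EVI (assumed literature form)."""
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)

def bai(nir, red, red_c=0.1, nir_c=0.06):
    """Burned Area Index: inverse squared distance to an assumed
    charcoal convergence point in the red/NIR space."""
    return 1.0 / ((red_c - red) ** 2 + (nir_c - nir) ** 2)

def baim(nir, swir, nir_c=0.05, swir_c=0.2):
    """BAIM: same idea in the NIR/SWIR space (assumed coefficients)."""
    return 1.0 / ((nir_c - nir) ** 2 + (swir_c - swir) ** 2)

# Hypothetical reflectances for a vegetated and a recently burned pixel.
vegetated = {"red": 0.05, "nir": 0.40, "swir": 0.20}
burned = {"red": 0.08, "nir": 0.12, "swir": 0.25}

for name, px in (("vegetated", vegetated), ("burned", burned)):
    print(name,
          round(ndvi(px["nir"], px["red"]), 3),
          round(nbr(px["nir"], px["swir"]), 3),
          round(evi2(px["nir"], px["red"]), 3),
          round(bai(px["nir"], px["red"]), 1),
          round(baim(px["nir"], px["swir"]), 1))
```

With these illustrative values, NDVI, NBR and EVI2 drop after the fire, whereas BAI and BAIM rise, because the burned pixel lies closer to the assumed charcoal convergence point.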

Separability of burned areas: JM distance
The JM distance is a class separability index based on conditional probability theory and can be used to determine the separability of distinct variables for each classification.It is a measure of the average difference between the density functions of the two classes by calculating the separability of a pair of probability distributions, thus determining distinguishable characteristics based on the training samples (ZHANG et al. 2023).
$$JM(c_i, c_j) = \int_x \left(\sqrt{p(x \mid c_i)} - \sqrt{p(x \mid c_j)}\right)^2 dx \quad (1)$$
Separability was measured in relation to the burned area class. The JM distance ranges from 0 to 2, with a large value indicating a high level of separability between two classes.
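The JM distance can be approximated numerically by replacing the two density functions with normalized histograms over a common set of bins. The sketch below is one way to do this under that assumption; the function name, the number of bins and the sample values are hypothetical, and the study itself may have estimated the densities differently.

```python
import math

def jm_distance(samples_a, samples_b, bins=20):
    """Histogram approximation of the JM distance between two classes.

    Each histogram is normalized to sum to 1, so the result lies
    between 0 (identical distributions) and 2 (no overlap).
    """
    lo = min(min(samples_a), min(samples_b))
    hi = max(max(samples_a), max(samples_b))
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def histogram(samples):
        counts = [0] * bins
        for value in samples:
            idx = min(int((value - lo) / width), bins - 1)
            counts[idx] += 1
        return [c / len(samples) for c in counts]

    p = histogram(samples_a)
    q = histogram(samples_b)
    return sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))

# Hypothetical index values for two classes.
unburned = [0.55, 0.60, 0.62, 0.58, 0.65]
burned = [-0.20, -0.15, -0.10, -0.18, -0.12]
print(jm_distance(unburned, burned))
```

Two samples that never share a bin give the maximum value of 2, while two identical samples give 0, which matches the range stated above.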

Classification by Support Vector Machines
SVM is a supervised machine learning algorithm that separates samples into categories with a hyperplane and can be used for classification or regression modeling. In essence, SVM is based on the principle of structural risk minimization and uses the kernel trick to avoid the complexity of working in a high-dimensional space. It is best suited to binary classification tasks, since it then operates with the minimum number of classes around the separating hyperplane, which gives it a strong generalization capacity (TONG et al. 2021). Mapping all samples to higher dimensions is computationally expensive, and this is avoided by using a Radial Basis Function (RBF) kernel. This function calculates decision boundaries in terms of similarity measures in a high-dimensional feature space without explicitly performing the transformation. For an RBF kernel, the C (cost) and gamma parameters must be optimized simultaneously: C adds a penalty for each incorrectly classified data point, while gamma controls the distance of influence of a single training point. The RBF kernel was used in this study due to its stability (ZHOU et al. 2021).
In this paper, C and gamma values were selected by testing combinations of values on the training samples extracted during the pre-classification step. The "caret" package in the RStudio software was used to train the SVM on the sample set, which resulted in optimal parameters of C = 0.1 and gamma = 100, as this combination demonstrated the greatest fitting stability (Figure 2). After defining C and gamma, the burned area was classified by SVM with these hyperparameters, using the HRG bands combined with each spectral index. In the post-classification stage, the importance of each variable (the HRG bands together with each spectral index) was assessed to understand its strength in the final classification of the burned area. Assessing feature importance in ML studies facilitates variable selection and supports meaningful interpretation. Briefly, the SVM post-classification stage provides criteria that rank the predictor variables involved in learning by estimating the importance of each one in predicting the pre-defined class. All importance measures are then scaled from 0 to 100%, using the $variable.importance function in the "caret" package (KUHN et al. 2022).
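To make the role of gamma concrete, the sketch below evaluates the RBF kernel in its usual form, K(x, y) = exp(-gamma * ||x - y||²). This form is assumed to match the kernel used by the SVM implementation in the study; the sample vectors are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF similarity between two feature vectors (assumed standard form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Two hypothetical pixels described by (NIR, SWIR) reflectance.
pixel_a = [0.40, 0.20]
pixel_b = [0.12, 0.25]

# A larger gamma shrinks the radius of influence of each training point:
# the same pair of pixels looks less similar as gamma grows.
for gamma in (1.0, 10.0, 100.0):
    print(gamma, rbf_kernel(pixel_a, pixel_b, gamma))
```

The kernel equals 1 for identical vectors and decays toward 0 with distance, so a large gamma makes the decision boundary follow individual training points more closely.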
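The idea of ranking predictors on a 0 to 100% scale can be sketched with a permutation-style importance measure, which the Figure 5 caption suggests. The code below is a generic illustration, not the "caret" implementation: the helper names, the toy classifier and the data are all hypothetical, and importance is taken here as the mean drop in accuracy when one predictor is shuffled.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the reference."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean accuracy drop per shuffled predictor, rescaled to 0-100."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for j in range(len(rows[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between predictor j and the labels
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            total += baseline - accuracy(predict, shuffled, labels)
        drops.append(max(total / n_repeats, 0.0))
    top = max(drops)
    return [100.0 * d / top if top > 0 else 0.0 for d in drops]

# Toy stand-in for a trained classifier: it looks only at predictor 0.
def toy_predict(row):
    return int(row[0] > 0.5)

rows = [[0.1, 0.9], [0.2, 0.1], [0.3, 0.6], [0.4, 0.2],
        [0.6, 0.8], [0.7, 0.3], [0.8, 0.5], [0.9, 0.4]]
labels = [toy_predict(r) for r in rows]
print(permutation_importance(toy_predict, rows, labels))
```

Because the toy classifier ignores the second predictor, shuffling it never changes the accuracy and its scaled importance is 0, while the first predictor receives 100.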

SVM Regression
SVM can also be used for regression problems. With C and gamma defined, an SVM regression was performed to quantify the strength of the relationship between pre- and post-fire data for each index, using the same training samples used in the classification. The fit was evaluated with the coefficient of determination R² (eq. 2) and the Root Mean Square Error (RMSE) (eq. 3). The models with the lowest R² and highest RMSE values were considered the best predictors of spectral differentiation (RADOČAJ et al. 2021).
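Equations 2 and 3 are not reproduced in this text, so the following sketch uses the standard definitions of R² and RMSE, assumed to be the ones intended. The observed and predicted values are hypothetical.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root Mean Square Error between observed and predicted values."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))

# Hypothetical post-fire index values and regression predictions.
post_fire = [0.10, 0.25, 0.40, 0.55]
predicted = [0.15, 0.20, 0.45, 0.50]
print(r_squared(post_fire, predicted), rmse(post_fire, predicted))
```

In the reading adopted in this study, a low R² together with a high RMSE between pre- and post-fire values indicates that the fire changed the index strongly, which is the desired behavior for a burned area index.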

Classification validation and accuracy analysis
To evaluate the thematic accuracy of the burned area obtained by the SVM classification of the post-fire SPOT-5 image, a contingency matrix was calculated by cross-comparing the SVM classification against a spatial reference. Using the elements of the contingency matrix, three thematic quality metrics were calculated: Producer Accuracy (PA) (eq. 4), User Accuracy (UA) (eq. 5) and Dice Coefficient (DC) (eq. 6). User accuracy is the estimated fraction of mapped pixels, for each class, that are correctly classified. Producer accuracy is the fraction of reference pixels of each class correctly assigned to that class by the classifier. Producer and user accuracy can therefore vary from a situation in which all classifications are incorrect (0% accuracy) to one in which all classifications are correct (100% accuracy). The Dice Coefficient is a statistic used to measure the similarity of two representations of the same class, in this article the burned area class (LIBONATI et al. 2015). The coefficient varies between 0 and 1, where 0 indicates no overlap and 1 indicates perfect overlap.
The burned area polygons produced by BDQueimadas through the Aq30m platform were used as auxiliary spatial reference data. Aq30m provides polygons in shapefile format representing estimates of burn scars, generated from images with 30 m spatial resolution and available at https://queimadas.dgi.inpe.br/queimadas/aq30m/. In this article, the Aq30m polygons supported the selection of training and control samples, combined with photointerpretation of Landsat (30 m), CBERS4/PAN (panchromatic band, 5 m) and Sentinel-2 (10 m) scenes acquired on dates close to the fire.

$$DC = \frac{2TP}{2TP + FP + FN} \quad (6)$$
where TP (True Positive) is the number of pixels that are classified correctly, FP (False Positive) is the number of pixels classified as the target class but belonging to other classes, and FN (False Negative) is the number of pixels that are classified as other classes but belong to the target class.
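The three accuracy metrics can be computed directly from the TP, FP and FN counts of the burned class. Equations 4 and 5 are not reproduced in this text, so the sketch below assumes the usual definitions, PA = TP / (TP + FN) and UA = TP / (TP + FP); the pixel counts are hypothetical.

```python
def producer_accuracy(tp, fn):
    """Share of reference burned pixels that were mapped as burned."""
    return tp / (tp + fn)

def user_accuracy(tp, fp):
    """Share of pixels mapped as burned that are burned in the reference."""
    return tp / (tp + fp)

def dice_coefficient(tp, fp, fn):
    """Dice Coefficient as in eq. 6: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)

# Hypothetical pixel counts for the burned area class.
tp, fp, fn = 90, 10, 30
pa = producer_accuracy(tp, fn)
ua = user_accuracy(tp, fp)
dc = dice_coefficient(tp, fp, fn)
print(pa, ua, dc)
```

Written this way, the Dice Coefficient is the harmonic mean of PA and UA, which is why it stays between 0 and 1 and penalizes a classification that does well on only one of the two.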

Separability of burned areas
Figure 3 shows the JM statistical separability value of each spectral index, indicated by the dashed line over the histograms.

Regression Analysis by SVM
Figure 4 shows the regression graphs based on the SVM model, tested for significance (p < 0.01). The gray area represents the 95% confidence interval.

Importance of variable in SVM classification
The results consistently suggest that the BAI, NBR and BAIM indices were the main predictive indicators for fire detection in the study area, as they carried greater weight in the SVM classification of the burned area than the individual bands (Figure 5).
Figure 5. SVM importance ranking of the permuted predictor estimates of the SPOT-5 bands and the spectral indices.
On the other hand, in the classifications based on the EVI2 and NDVI indices, the NIR band was the most important predictor, slightly surpassing the indices. The SWIR band presented a moderate capacity. The green and red bands received a low importance value in all cases, representing a low discriminatory capacity when competing with the other predictors in separating burned areas in the implemented SVM classification.

SVM classification accuracy analysis
All indices presented high producer accuracy values (PA > 80%), with emphasis on the indices based on the SWIR band (NBR and BAIM), with PA of approximately 99%. These indices overestimated the burned area by less than 1 km². The BAI and NDVI presented promising PA values, with overestimations of 1.4 km² and 4.33 km² of burned area, respectively (Table 2). The lowest PA was observed for EVI2, which on the other hand presented the highest UA of all indices. Next came NDVI, with a notable UA and approximately 8.3 km² of burned area misclassified, against 7.7 km² for EVI2. The BAI, BAIM and NBR indices did not show significant variations in UA; BAIM stood out with an underestimation of the burned area of approximately 10 km², while the figure was 11.5 km² for BAI and 12.7 km² for NBR. The Dice Coefficient estimates were most consistent for the BAI and BAIM indices, with values above 1.5, followed by the NBR index. NDVI and EVI2 followed the behavior of UA and PA, with Dice Coefficient estimates below 1.4.

DISCUSSION
The indices with the greatest ability to separate burned and unburned areas were BAI, BAIM and NBR, with high separability values. This confirms the conclusions of previous studies that these indices perform better in detecting burn scars (BOLSON and ARAÚJO, 2022; DOSOGNE et al. 2018). The analysis of predictor importance for the SVM classification showed that spectral indices are not always the strongest predictors in the classification, although the composition of the bands used in them is reflected in the secondary importance of the other bands. The NIR and SWIR bands were the most important channels in the SVM modeling, except in the models with the BAI, BAIM and NBR indices, in which the indices themselves reached estimated importance above 80%. The prominence of NIR and SWIR in the importance test was also reported by Chuvieco et al. (2002), Vetrita et al. (2021) and Pereira et al. (2016), who demonstrated their high applicability in identifying burned areas.
The spectral characteristics of the SPOT-5 bands favored detection by spectral indices, mainly by enhancing the response directly related to burned areas: the red band is centered on the chlorophyll absorption peak (0.665 µm); the wide NIR range (0.78-0.89 µm) corresponds to the maximum spectral reflectance of vegetation and is mainly related to the structural properties of the canopy and the percentage of soil covered by burned vegetation; and the shortwave infrared (SWIR) band, centered around 1.65 µm, detects the water content of the canopy components and their post-fire structure (DOSOGNE et al. 2018).
In the accuracy analysis, the BAIM index showed the best overall performance in terms of PA, UA and DC, while the BAI and NBR presented similar values as well as the lowest coefficients of determination. The good performance of BAIM may be related to the suitability of the convergence coefficients for the NIR and SWIR bands, which have been applied successfully in mapping burned areas in the Cerrado (SOUSA et al. 2018). The results obtained for the BAI index differ from the analyses carried out by Pereira et al. (2016) and Chuvieco et al. (2002). The authors justify these discrepancies by the low capacity to discriminate fires in the African Savannas and the Cerrado, which can be attributed to the distinct characteristics of the vegetation in these regions compared to the vegetation of countries in the Mediterranean region of Europe, where the pioneering study of Chuvieco et al. (2002) was conducted. The red and NIR bands of SPOT-5, centered on wavelengths suited to detailed analysis of the state of vegetation, may be a factor in the good results found for the BAI index in this article, which are not seen when the index is applied to OLI and ETM+ scenes, as in the study by Chuvieco et al. (2002). In the case of NBR, satisfactory performance was expected, since this is the main index for detecting burned areas on a global and regional scale, in addition to being one of the parameters of automatic burned area detection algorithms using different sensors, such as MCD64A1, FireCCI and Landsat Burned Area (VETRITA et al. 2021). NBR applied to SPOT-5 images was successful in the study by Won et al. (2014), who used the NBR to evaluate the possible regrowth of vegetation and the severity of the fire and, therefore, to estimate the areas damaged by fire. The authors concluded that the use of NBR spectral information was successful in overcoming most of the confusion between burned areas and other types of land cover, such as bodies of water and shadows, in northern South Korea. There are few studies that evaluate EVI2 in burned area applications, although some report superior performance of this index for detecting disturbances in vegetation due to its aerosol attenuation and ability to reveal different vegetation dynamics (JIANG et al. 2008).
The results found for EVI2 in this study differ from previous studies. Because sensors have different spectral response functions, EVI2 may vary slightly from one sensor to another. NDVI also did not show promising results, with separability close to 1 and the lowest Dice Coefficient. These results corroborate studies by other authors who report limitations of NDVI for detecting burned areas.

CONCLUSIONS
• Data from the SPOT-5 satellite have enabled progress in developing detailed and timely post-fire mapping. The application of operational and efficient SPOT-5 image processing procedures will benefit users who have limited knowledge of these images, which until a few months ago were not freely distributed.
• Based on the results obtained, it can be concluded that the BAIM index stood out as the most accurate in detecting burned areas among all evaluated spectral indices. This was evidenced by the statistical analyses and the performance of the SVM model. Its use is therefore recommended in future research related to mapping the severity of forest fires, at least in the Cerrado biome. Spectral indices associated with fire, such as BAIM, BAI and NBR, demonstrated a remarkable ability to distinguish burned from unburned areas, with a statistical separability greater than 1.5 and low correlation with unaffected areas (R² < 0.3). Misclassifications for these best-performing indices were low, with less than 2% of areas misclassified. Furthermore, the Dice Coefficient presented values above 1.45 for these indices. On the other hand, the NDVI and EVI2 indices did not prove to be significant predictors of burned areas in the SVM model. The results therefore suggest that, at least in the context of the Cerrado biome, fire-related indices, particularly BAIM, are more appropriate and accurate for this purpose.
• In general, it is important to highlight that different types of vegetation, spectral resolution and different time intervals between image acquisition dates greatly influence the behavior of spectral indices. Despite this, the proposed method can be applied to different site conditions and may serve as a tool to link spectral indices, machine learning and environmental conditions. Thus, the results of this work express the importance of studies in protected areas for the conservation of forests and of disseminating monitoring and inspection strategies, so that these areas continue to fulfill their role as an instrument of public policy with the objective of reducing and containing forest fires.
In the JM distance equation, x denotes a range of values of each spectral index, and c_i and c_j denote the pre- and post-fire classes. In this study we use the JM distance to measure the statistical separability for each class in relation to the burned area class.
FLORESTA, Curitiba, PR, v. 54, e-89740 -2024 Silva Jr, J. A.; Pacheco, A. P. ISSN eletrônico 1982-4688 DOI:10.5380/rf.v54.89740

Figure 2. SVM training validation accuracy graph as a function of gamma and C.

Figure 3. Pixel distribution and JM Separability Coefficient for each spectral index.
All indices showed high separability (JM > 1.5) except for NDVI and EVI2, which showed lower separability (JM = 1.16 and 1.44, respectively). The BAIM index presented the best performance, with the maximum separability value (JM = 2.0), followed by its simplified version (BAI), which showed near-maximum separability (JM = 1.96). The EVI2 and NBR indices obtained similar values, while NDVI presented the worst separability performance.

Figure 4. Regression by SVM between the training points of the spectral indices in the pre- and post-fire scenes.
The NDVI and EVI2 indices showed strong relationships, with R² above 0.7, which shows that these indices are not sensitive to the disparity between burned and unburned areas. This result is reinforced by their low RMSE values in relation to the other indices, which indicate little change between the pre- and post-fire values. The lowest R² and the highest RMSE were observed for the BAIM index, followed by BAI, which means that these indices register an abrupt difference between the pixel values of burned and unburned areas, that is, they show a high detection power. The estimates presented by NBR were moderate, while the models fitted with NDVI and EVI2 showed low RMSE and R² values above 0.6.
Lacouture et al. (2020) found that even if sparse canopy sites are selected, heterogeneity and fire variation across the landscape can lead to overestimated NDVI values. Because NDVI is indiscriminate, any large trees or shrubs not consumed by low-intensity fire will contribute higher green values (BOKE-OLÉN et al. 2016), especially during hot, humid months when vegetation is actively growing. Low accuracies are caused by confusion between the burned area and dark surfaces, such as soil, shade and water; especially in the dry season, vegetation characteristics can lead to errors and confusion with burned areas (CHUVIECO et al. 2019).

Table 2. Estimates of Producer Accuracy (PA), User Accuracy (UA) and Dice Coefficient (DC) for the classification of burned area for the different SVM classifications obtained by integrating the SPOT bands with the spectral indices.