Evaluating the potential of burn severity mapping and transferability of Copernicus EMS data using Sentinel-2 imagery and machine learning approaches

ABSTRACT The abiotic and biotic conditions in forest ecosystems can be significantly influenced by forest fires. However, policy decisions for restoration are inevitably difficult in the absence of information on the damaged forests, such as location, area, and burn severity. In this study, eight spectral indices calculated from Sentinel-2 MSI imagery and two machine learning algorithms, Random Forest (RF) and Support Vector Machine (SVM), were used for mapping burned areas and burn severity. Two study sites with similar meteorological conditions (dry season) and species (coniferous vegetation) were tested, and the EMSR448 dataset from the Copernicus Emergency Management Service (CEMS) was used as the reference truth. RF performed better than SVM at classifying pixels from classes with similar properties. The Normalized Burn Ratio (NBR) and Green Normalized Difference Vegetation Index (GNDVI) showed high importance in assessing fire severity, suggesting that it may be effective for identifying senescent plants. The results also confirmed that the CEMS dataset is transferable as a reference truth for fire damage classification in other regions. Implementing this method enables fast and accurate mapping of the area and severity of destructive forest fire damage, and the method is also applicable to other disasters.


Introduction
Forest ecosystems are an integral part of many terrestrial ecosystems, providing a wide range of ecological, economic, social, and cultural services (Chen et al. 2015). Furthermore, many people depend on forests for other critical ecosystem services, such as carbon storage, climate regulation, human health, and the genetic resources that support wood products (Wingfield et al. 2015). Nowadays, various environmental changes caused by natural phenomena or anthropogenic activities have increased forest vulnerability to a range of natural disturbances, including forest fires. In particular, as temperature and vapor pressure deficit increase globally due to anthropogenic climate change, fuel loads and fire potential increase accordingly, and more forests are experiencing longer fire seasons (Abatzoglou and Williams 2016; De Luca, Silva, and Modica 2022).
Forest fires can have a significant and immediate impact on both abiotic and biotic conditions in forest ecosystems as well as on population and society. Moreover, fires are a long-term threat, contributing to habitat degradation and soil erosion and affecting the atmosphere and global climate by releasing greenhouse gases (GHGs) (De Luca, Silva, and Modica 2021; Rosa, Pereira, and Tarantola 2011). Forest fires are more frequent during dry climatic periods, on convex relief forms, and on south-facing slopes, and their frequency decreases with increasing humidity (Angelstam 1998). In particular, forest fires have caused severe long-term destruction to wildlife, property, and the environment of forest areas in various places, including Asian countries, over the last decades (Tien Bui et al. 2016). Forest fires affect more than 50 million hectares worldwide every year (Lasaponara and Tucci 2019).
The forest area of South Korea is about 6,298,000 ha, which is 62.7% of the total land area of 10,041,000 ha according to the 2020 forest standard statistics, and the forest-to-land ratio of South Korea is the fourth largest among the Organization for Economic Cooperation and Development (OECD) countries (Choi, 2021). In addition, almost 37% of the forest in South Korea is coniferous, and the forest growing stock is 202.3 m³/ha, which is much higher than the average of OECD countries (131.3 m³/ha) (KFS, 2021). Most of the coniferous trees in Korea are Pinus because of the pine tree priority planting policy for resources, economy, and forest restoration.
Since coniferous forests have a large number of branches and leaves, those under the canopy dry easily, and dry conifer forests are more susceptible to high-severity wildfires (Stevens, Safford, and Latimer 2014). Because forests in South Korea are densely distributed, a forest fire can easily spread outward, resulting in massive damage. Gangwon-do and Gyeongsangbuk-do are the two administrative districts with the largest forest area in South Korea, with 1,371,643 ha and 1,337,741 ha, respectively. In March 2022, a large forest fire lasted for about 10 days in the forests of Gangwon-do and Gyeongsangbuk-do, and it is estimated that an area of about 20,000 ha was affected. It is believed that the fire was caused by anthropogenic activities and the dry season. In addition, more than 50% of the forests in the Gyeongsangbuk-do region are pine. Coniferous characteristics and combustible materials such as resin could be factors that make forest fires occur more frequently and spread more strongly.
The social demand for restoration of damaged ecosystems, including forests, is increasing; however, in the absence of information on the damaged forests, such as location, area, and burn severity, difficulties in policy decisions for restoration inevitably arise (Lee et al. 2020). Accurate and timely detection and quantification of burned areas are necessary to assess the damage, address post-fire management, and implement medium- and long-term territorial and landscape restoration strategies (Chuvieco et al. 2019). However, standards and specialized agencies for post-fire management are absent in most countries.
Remote sensing (RS) methods are suited for early-stage and change detection in forestry and for evaluations when access for ground surveys is difficult or not yet possible, as the area needs to be cleared to provide access and security (Lee, Ryu, and Kim 2022; Mokroš et al. 2017). In addition, the development of various algorithms and techniques is helping to improve the accuracy and robustness of analyses based on RS. In particular, machine learning algorithms are proving useful in various research fields using RS data because of several advantages: (1) reducing computation time; (2) dealing with the nonlinearity of variables; and (3) mitigating the Hughes phenomenon (Ghosh et al. 2014; Millard and Richardson 2015).
For forest fire analysis, RS and machine learning algorithms have been used since the mid-1980s to precisely estimate fire-affected areas and burn severity and to aid in forest fire assessment, restoration, monitoring, and prevention on various scales (Chu and Guo, 2013; Lentile et al. 2006; Lee and Chow 2015). The use of RS and machine learning algorithms to assess the effects of forest fires has allowed great progress in characterizing spatial patterns of fire and understanding drivers of burn severity (Miller and Thode 2007). Datasets at multiple spatial and temporal resolutions from various satellite sensors have been used to assess environmental conditions after fires and to detect changes in post-fire spectral responses, which reflect the vegetation response (Lentile et al. 2006; Navarro et al. 2017). Among these sensors, the European Space Agency (ESA)'s Sentinel-2 mission, which is part of the European Commission's (EC) Copernicus program, has been delivering high-resolution optical imagery across global terrestrial surfaces since 2015. The Sentinel-2 mission carries the Multi-Spectral Instrument (MSI), whose design has been driven by the requirement for a large swath and high geometrical and spectral performance of the measurements (Navarro et al. 2017). The MSI measures the Earth's reflected radiance in 13 spectral bands spanning the visible (VIS), near-infrared (NIR), and short-wave infrared (SWIR), at different spatial resolutions depending on the band. Sentinel-2 data are also characterized by a high temporal frequency (5 days), characteristics that make them attractive for setting up an operational burned area mapping service at the national level (Stavrakoudis et al. 2020). As a representative example, the Copernicus Emergency Management Service (CEMS) is a publicly funded European Union program that is coordinated by the EC and supports the management of natural or man-made disasters, including forest fires. CEMS provides geospatial information based on satellite imagery, data from sources such as the European Centre for Medium-Range Weather Forecasts (ECMWF), in situ (ground) data, and model data, showing the scale, timeline, and perspective of a disaster. It can be used as operational reference data for burned area and severity mapping and for decision-making on restoration, but few studies have investigated this potential.
Several existing studies that focus on forest fire damage or susceptibility detection using RS, machine learning algorithms, and various satellite imagery were examined (Fernández-Manso, Fernández-Manso, and Quintano 2016; Kalantar et al. 2020; De Luca, Silva, and Modica 2021; Schroeder et al. 2016; Stavrakoudis et al. 2020; Mpakairi, Ndaimani, and Kavhu 2020). For example, De Luca, Silva, and Modica (2021) detected burned areas using Sentinel-1 synthetic aperture radar (SAR) data and a machine learning algorithm (k-means clustering). Stavrakoudis et al. (2020) and Mpakairi, Ndaimani, and Kavhu (2020) detected burned areas using Sentinel-2 MSI data and machine learning algorithms (random forest and support vector machine). In addition, there have been attempts to detect burned areas using threshold values of various spectral indices without classification algorithms (Fernández-Manso, Fernández-Manso, and Quintano 2016; Kalantar et al. 2020; Schroeder et al. 2016). This prior research is significant in that it explores several spectral indices for burned area detection and develops and applies various algorithms. However, verification was limited because reference data were absent or inconsistent. Furthermore, verification against a discretionary reference built from a single index, such as the Normalized Burn Ratio (NBR), also limited reliability.
This study presents a novel methodology for mapping burned areas and burn severity using Sentinel-2 MSI data, CEMS data, and machine learning algorithms, aiming at mapping accuracy and transferability. Among machine learning algorithms, Random Forest (RF) and Support Vector Machine (SVM) were used as pixel-based classification approaches, since these two have exhibited robust supervised pixel-based classification performance in studies using multi- and hyperspectral data (Millard and Richardson 2015). The proposed methodology uses a pair of Sentinel-2 images for each study area, one acquired before the forest fire and one after the fire was extinguished. Eight spectral indices that have been used in analyzing forest fires were employed; these indices can offer economical and geographically comprehensive views of areas damaged by forest fire at different severities.

Study areas
This study uses two study areas: (a) Castelo Branco, Portugal (site A), and (b) Gangwon-do and Gyeongsangbuk-do, South Korea (site B) (Figures 1 and 2). Castelo Branco is an area for which CEMS provides forest fire reference data, so it is used for model training and validation; the area and severity of forest fire damage are then analyzed for Gangwon-do and Gyeongsangbuk-do. Site A is located in east-central Portugal (39°50ʹN; 07°50ʹW) and has almost 50,000 ha of forest area. A forest fire started on July 25, 2020 and burned 5,558.15 ha of forest area according to CEMS. In addition, most of the forest at site A was composed of young pine trees 5-8 years old; the combustion completeness of the fuel is high, and mostly only light gray ash remains after a forest fire (Bodí et al. 2014) (Figure 1b).
Site B is located in the southeast of South Korea (37°60ʹN; 129°20ʹE) and has 2,709,384 ha of forest area. A forest fire started on March 4, 2022 and was extinguished on March 13, 2022. Approximately 20,000 ha of forest area was reportedly affected by the 10-day forest fire, and most of the affected areas are covered with pine trees, as at site A (Figure 2d). The pine forest distribution dataset was obtained from the Ministry of Environment (https://egis.me.go.kr/, accessed 9 April 2022). However, because mapping and reference data on burned areas or fire damage severity were absent, damage assessment could not be performed.
The occurrence and severity of forest fires are affected by forest structure, type, and climate. Among these conditions, dry coniferous forests are known to be the most vulnerable to forest fires (Stevens, Safford, and Latimer 2014). Both study areas are at a similar latitude and have comparable coniferous vegetation (mostly pine trees). Under these similar environmental circumstances, forest fires occurred during the dry season in both study areas, so the damage patterns and the spectral characteristics used for damage assessment are expected to be similar as well.

Forest fire reference data
As reference data, the forest fire grading data for Castelo Branco were obtained from the CEMS website (EMSR448) (Figure 4a) (https://emergency.copernicus.eu/mapping/list-of-components/EMSR448, accessed 10 March 2022). CEMS products are created using satellite, in situ (ground), and model data. These data show information about a disaster event on a scale, timeline, and perspective that only geospatial information can provide. For assessing forest fires, CEMS determined the perimeter of the fires, as well as the distribution of the severity levels inside it, by expert judgment using pre- or post-event images and its own manual (Navarro et al. 2017). The burn severity classes are divided into two levels: destroyed and damaged. A further class, possibly damaged, refers to an area where it is difficult to distinguish whether damage actually occurred. The European Macroseismic Scale (EMS) served as the basis for the CEMS categories for rapid damage assessment (Cotrufo et al. 2018) (Table 1). In this study, a nondamaged grade was added to the damage grading, so four grades were finally used as classification categories.

Datasets and preprocessing
The four Sentinel-2A Level-1C (L1C) MSI images used in this study were downloaded from the EarthExplorer website (https://earthexplorer.usgs.gov/, accessed 21 May 2022) (Table 2). Level-1C products are Top-Of-Atmosphere (TOA) products that have undergone radiometric and geometric correction by the Payload Data Ground Segment (PDGS). The corrections include orthorectification and spatial registration on a global reference system with sub-pixel accuracy (Navarro et al. 2017). We further atmospherically corrected the images to derive surface reflectance in QGIS 3.22 using the Dark Object Subtraction (DOS) algorithm. Atmospheric scattering and absorption make the imaging system record a nonzero digital number (DN) for dark objects; the DOS method subtracts this nonzero value, DN_haze, from the whole band, assuming that objects in complete shadow must have zero reflectance (Nazeer, Nichol, and Yung 2014).
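As a rough illustration of the DOS idea described above, the following sketch estimates DN_haze as the minimum DN of a band and subtracts it from every pixel. Taking the band minimum as the dark-object value is a simplifying assumption made here for illustration; it is not the QGIS implementation used in the study, and the function name and sample values are hypothetical.

```python
import numpy as np


def dark_object_subtraction(band_dn):
    """Subtract an estimated haze value (DN_haze) from one image band.

    Assumption: the darkest pixel in the band should have zero reflectance,
    so its DN is treated as the additive haze contribution.
    """
    band = np.asarray(band_dn, dtype=float)
    dn_haze = band.min()
    # Clip at zero so that no corrected value becomes negative.
    corrected = np.clip(band - dn_haze, 0.0, None)
    return corrected, dn_haze


# Hypothetical 2 x 2 band of digital numbers.
example_band = np.array([[120, 135], [110, 300]])
corrected_band, haze = dark_object_subtraction(example_band)
print(haze)
print(corrected_band)
```

Operational DOS variants usually convert the corrected DN to reflectance afterwards; that step is omitted here.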

Spectral indices
The degree of soil, water, and vegetation change caused by a forest fire can be used to determine fire damage severity (Escuin, Navarro, and Fernández 2008). In this study, eight spectral indices related to fire and burnt areas were calculated from the processed Sentinel-2 imagery for each study area (Table 3). Most of them use different bands, such as SWIR, which is sensitive to moisture content and thus makes it easier to distinguish burnt regions, instead of the traditional normalized difference formulation of the Normalized Difference Vegetation Index (NDVI) (Mutanga, Adam, and Cho 2012). These indices showed the ability to distinguish burned areas in previous studies, and we used narrowband NIR (B8a) instead of broadband NIR (B8) based on the results of related studies (Fernández-Manso, Fernández-Manso, and Quintano 2016). For every index, the difference between the pre-fire and post-fire image values, which indicates the disturbance resulting from the forest fire, was used in the model (Stavrakoudis et al. 2020). The difference, denoted dSI for any index SI (e.g. dNBR), is calculated as the pre-fire value minus the post-fire value, i.e. $dSI = SI_{pre} - SI_{post}$.
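The pre-fire minus post-fire differencing can be sketched as follows for NBR. The normalized difference form (NIR − SWIR)/(NIR + SWIR) is the standard NBR definition; the reflectance values and function names below are hypothetical and only illustrate the calculation.

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b), with NaN where the denominator is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.full(denom.shape, np.nan)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def delta_index(index_pre, index_post):
    """dSI = SI_pre - SI_post for any spectral index SI."""
    return np.asarray(index_pre, dtype=float) - np.asarray(index_post, dtype=float)


# Hypothetical surface reflectance for two pixels (first burned, second not).
nir_pre, swir_pre = np.array([0.40, 0.35]), np.array([0.10, 0.15])
nir_post, swir_post = np.array([0.15, 0.30]), np.array([0.30, 0.16])

nbr_pre = normalized_difference(nir_pre, swir_pre)
nbr_post = normalized_difference(nir_post, swir_post)
dnbr = delta_index(nbr_pre, nbr_post)
print(dnbr)
```

The same `delta_index` helper applies unchanged to the other seven indices once each index has been computed for the pre-fire and post-fire images.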

Algorithms and validation
Two competitive machine learning algorithms, RF and SVM, were implemented with the scikit-learn Python library to find out how they perform on burn severity classification with high-dimensional input datasets. Both are recognized as among the most efficient and frequently used algorithms for satellite image classification (Adugna, Xu, and Fan 2022; Kalantar et al. 2020). However, the parameter values each algorithm uses have a significant impact on how well it performs.
The RF algorithm is extensively used in various remote sensing applications for both classification and regression (Jang et al. 2019; Kamińska 2018; Mpakairi, Ndaimani, and Kavhu 2020; Rhee and Im 2017). RF is made up of a predefined number of Classification and Regression Trees (CARTs), each of which uses a sampled subset of the available data. Predictors are chosen with equal probability for each CART, and the predicted output is generated by combining and averaging the individual predictions of the component CARTs. It has been shown that this construction approach, which combines the ideas of bagging and random feature selection, performs better than other machine learning methods (Archer and Kimes 2008). RF also provides the relative importance of input variables through the change in Out-Of-Bag (OOB) error when a variable is perturbed (Georganos et al. 2018). The significant parameters of the RF algorithm are the number of trees (NUM_TREES) and the depth of each tree (MAX_DEPTH) (Rhee and Im 2017). In this study, several NUM_TREES values (10, 50, 100, and 200) and MAX_DEPTH values (3, 5, 10, and no pruning) were tested.

Table 3. Spectral indices used in this study.
Name of spectral index; Formula; Reference
Char Soil Index (CSI); $NIR / SWIR$; Smith et al. (2007)
Global Environmental Monitoring Index (GEMI); $\eta (1 - 0.25\eta) - (Red - 0.125)/(1 - Red)$, with $\eta = [2(NIR^2 - Red^2) + 1.5 \times NIR + 0.5 \times Red]/(NIR + Red + 0.5)$; Pinty and Verstraete (1992)
Green Normalized Difference Vegetation Index (GNDVI); [formula missing]; Fernández-Guisuraga, Suárez-Seoane, and Calvo (2019)
Normalized Burn Ratio (NBR); [formula missing]; Miller and Thode (2007)
Normalized Difference Vegetation Index (NDVI); [formula missing]; Mutanga, Adam, and Cho (2012)
Normalized Difference Vegetation Index red-edge 1 narrow (NDVIre1n); [formula missing]; Fernández-Manso, Fernández-Manso, and Quintano (2016)
Normalized Difference Water Index (NDWI); [formula missing]; [reference missing]
Modified Soil-Adjusted Vegetation Index (MSAVI); $[2NIR + 1 - \sqrt{(2NIR + 1)^2 - 8(NIR - Red)}]/2$; Rogan and Yool (2001)

The SVM algorithm is a collection of theoretically powerful machine learning algorithms (Huang, Davis, and Townshend 2002). SVM's fundamental aim is to construct an optimal hyperplane, also known as a decision boundary, that maximizes the distance between the plane and the nearest samples (support vectors) and efficiently divides the classes (Yang 2011). Although there are many alternative kernel types, including linear, polynomial, radial basis function, and sigmoid, the radial basis function (RBF) kernel is the most useful and most often used in remote sensing image classification (Waske et al. 2010). Two crucial parameters, the penalty value (C) and gamma (γ), must be carefully selected for the RBF kernel to work well (Yang 2011). Because training data that fall on the wrong side of the optimal hyperplane have a major impact on the accuracy and generalizability of the method, C is used to adjust the degree of the penalty applied to misclassified training samples. Gamma, which regulates the width of the kernel, has an impact similar to that of C on an SVM model with the RBF kernel: if it is given a large value, the model will overfit and will not generalize well (Foody and Mathur 2004). In this study, a range of C values (10, 50, 100, and 200) and gamma values (scale (= 1/(n_features * X.var())), $10^{-1}$, $10^{-2}$, and $10^{-3}$) were tested.
A total of 27,500 pixels were used in the two machine learning models. The number of training samples was set based on the ratio of areas for each severity class in the reference data (destroyed, damaged, and nondamaged: 7,500 each; possibly damaged: 5,000). The training dataset was resampled using cross-validation (CV) in order to evaluate the suitability of the statistical model and estimate the variance and bias of its estimates. CV randomly divides the sample data into a dataset for building models and a dataset for testing their accuracy (Kalantar et al. 2020). A fivefold CV scheme was used, and the performance of the two models was compared based on overall accuracy (OA), user's accuracy (precision), producer's accuracy (recall), and F1-score (Olofsson et al. 2014). Using the grid search method, the best combination of hyperparameter values was identified for each model (Appendix A and B). The parameter values showing the best model performance for each algorithm were then selected for prediction.
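A minimal sketch of such a grid search with fivefold CV in scikit-learn is given below. It assumes that NUM_TREES and MAX_DEPTH correspond to the `n_estimators` and `max_depth` arguments of `RandomForestClassifier`, that "no pruning" corresponds to `max_depth=None`, and it uses a small synthetic dataset (8 features, 4 classes) as a stand-in for the real pixel samples; none of these choices is confirmed by the study itself.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the training pixels: 8 index differences, 4 classes.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=6, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Parameter grids taken from the values listed in the text.
rf_grid = {"n_estimators": [10, 50, 100, 200], "max_depth": [3, 5, 10, None]}
# gamma="scale" is scikit-learn's 1 / (n_features * X.var()).
svm_grid = {"C": [10, 50, 100, 200], "gamma": ["scale", 1e-1, 1e-2, 1e-3]}

rf_search = GridSearchCV(
    RandomForestClassifier(random_state=0), rf_grid, cv=5, scoring="accuracy"
)
svm_search = GridSearchCV(SVC(kernel="rbf"), svm_grid, cv=5, scoring="accuracy")

rf_search.fit(X, y)
svm_search.fit(X, y)

print("RF :", rf_search.best_params_, round(rf_search.best_score_, 3))
print("SVM:", svm_search.best_params_, round(svm_search.best_score_, 3))
```

Per-class precision, recall, and F1-score could be obtained from the selected models with `sklearn.metrics.classification_report` on held-out predictions.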

Post processing for predicted map
The salt-and-pepper phenomenon, which is characterized by the misclassification of single pixels or tiny, isolated areas, frequently affects pixel-based classification. As additional post-processing, sieve and clump filters were applied to reduce salt-and-pepper noise and improve the burn severity classification results. Sieve and clump filters provide a means of generalizing classification images (Al-Ahmadi and Al-Hames 2009). The sieve method looks at the 8 neighboring pixels to determine whether a pixel is grouped with pixels of the same class. If the number of grouped pixels in a class is less than a specified threshold, those pixels are removed from the class. The clump method is then used to clump adjacent, similarly classified areas together using morphological operators.
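The sieve step can be sketched as a connected-component filter. The version below is an illustrative assumption rather than the tool used in the study: it groups same-class pixels through their 8 neighbors with a breadth-first search and sets groups smaller than a hypothetical `min_pixels` threshold to an "unclassified" value of 0. The clump step is not shown.

```python
from collections import deque

import numpy as np


def sieve(class_map, min_pixels, unclassified=0):
    """Remove 8-connected same-class groups smaller than min_pixels."""
    grid = np.asarray(class_map)
    out = grid.copy()
    seen = np.zeros(grid.shape, dtype=bool)
    rows, cols = grid.shape
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0, c0]:
                continue
            # Breadth-first search over the 8 neighbors sharing this class.
            seen[r0, c0] = True
            group, queue = [(r0, c0)], deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and not seen[rr, cc]
                                and grid[rr, cc] == grid[r0, c0]):
                            seen[rr, cc] = True
                            group.append((rr, cc))
                            queue.append((rr, cc))
            if len(group) < min_pixels:
                for r, c in group:
                    out[r, c] = unclassified
    return out


# Hypothetical classified map with two isolated pixels (classes 2 and 3).
demo_map = np.array([
    [1, 1, 1, 2],
    [1, 1, 1, 1],
    [1, 3, 1, 1],
    [1, 1, 1, 1],
])
print(sieve(demo_map, min_pixels=2))
```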

Accuracy assessment
Accuracy assessment is one of the most significant procedures in classification analysis. Its objective is to evaluate quantitatively how accurately the pixels were assigned to the correct classes. Overall accuracy and the kappa value were estimated for accuracy assessment. The kappa value was estimated using Equation (1):

$$\kappa = \frac{n \sum_{i} n_{ii} - \sum_{i} C_i G_i}{n^2 - \sum_{i} C_i G_i} \qquad (1)$$

where $i$ is the class number, $n$ is the total number of points, $n_{ii}$ is the number of pixels of actual class $i$ that were classified as class $i$, $C_i$ is the total number of classified pixels belonging to class $i$, and $G_i$ is the total number of actual data pixels belonging to class $i$. It is recommended that a minimum of 50 samples for each class in the error matrix be collected for the accuracy assessment to avoid the risk of a biased sample (Congalton 1991). In this study, 200 sample points for each severity class were randomly selected from the predicted map and compared with the reference data.
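Using the quantities defined above (n, n_ii, C_i, and G_i), overall accuracy and kappa can be computed from an error matrix as in the sketch below. The layout (rows as classified classes, columns as reference classes) and the 2 x 2 example counts are assumptions for illustration only.

```python
import numpy as np


def overall_accuracy_and_kappa(error_matrix):
    """Compute overall accuracy and Cohen's kappa from an error matrix.

    Assumed layout: rows are classified classes, columns are reference classes.
    """
    m = np.asarray(error_matrix, dtype=float)
    n = m.sum()                   # total number of sample points
    agree = np.trace(m)           # sum of n_ii over all classes
    classified = m.sum(axis=1)    # C_i: pixels classified as class i
    actual = m.sum(axis=0)        # G_i: reference pixels of class i
    chance = (classified * actual).sum()
    overall = agree / n
    kappa = (n * agree - chance) / (n ** 2 - chance)
    return overall, kappa


# Hypothetical two-class error matrix with 100 sample points.
oa, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
print(round(oa, 3), round(kappa, 3))
```

Because kappa is symmetric in C_i and G_i, transposing the matrix does not change the result.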
In addition, dNBR was reclassified based on thresholds following Kokaly et al. (2007) so that the dNBR map could be compared with the model results, as follows: nondamaged ≤ 0.16, 0.16 < low ≤ 0.32, 0.32 < moderate ≤ 0.67, and high > 0.67. In this dNBR classification standard, low and moderate severity correspond to the damaged class of the CEMS reference, and high severity corresponds to the destroyed class. The possibly damaged class is a reference within CEMS that refers to regions with an unclear classification, not a burn severity class. Therefore, the comparative analysis of accuracy with dNBR was performed for the nondamaged, damaged, and destroyed classes, excluding the possibly damaged class.
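The thresholding and the mapping onto the CEMS classes can be sketched with `numpy.digitize`. The thresholds and class correspondences come from the text above; the function names and sample dNBR values are hypothetical.

```python
import numpy as np

# Upper bounds of the nondamaged, low, and moderate dNBR classes.
DNBR_BINS = [0.16, 0.32, 0.67]
DNBR_LABELS = np.array(["nondamaged", "low", "moderate", "high"])
# Correspondence between dNBR severity levels and CEMS classes.
TO_CEMS = {
    "nondamaged": "nondamaged",
    "low": "damaged",
    "moderate": "damaged",
    "high": "destroyed",
}


def classify_dnbr(dnbr):
    """Assign each dNBR value to a severity level (upper bounds inclusive)."""
    idx = np.digitize(np.asarray(dnbr, dtype=float), DNBR_BINS, right=True)
    return DNBR_LABELS[idx]


def to_cems_class(dnbr):
    """Map dNBR values to the three CEMS classes used for comparison."""
    labels = classify_dnbr(dnbr)
    mapped = [TO_CEMS[str(label)] for label in labels.ravel()]
    return np.array(mapped).reshape(labels.shape)


sample_dnbr = [0.10, 0.16, 0.20, 0.50, 0.67, 0.834]
print(classify_dnbr(sample_dnbr))
print(to_cems_class(sample_dnbr))
```

With `right=True`, a value exactly equal to a threshold falls into the lower class, which matches the "≤" bounds stated in the text.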

Transferability assessment
In the absence of fire damage reference data or field data, as at site B, a statistical assessment of the classification result is not possible. In this study, we therefore constructed a forest fire severity map for site B, within the extent for which Google Earth high-resolution imagery was available in the post-fire period (April to June), following the rapid mapping guidelines (Cotrufo et al. 2018). Following the guidelines, mapping was performed through visual examination of Google Earth imagery and satellite imagery from before and after the forest fire events (Table 4). As a guideline for establishing the reference map, the description of each burn severity class and the corresponding range in the imagery are expressed by color, and the constructed map was used as reference data for the transferability assessment. For the transferability assessment, visual and quantitative comparisons of the reclassified dNBR map and the best model result were performed against the constructed reference map. For the accuracy analysis, 200 sample points for three severity classes (destroyed, damaged, and nondamaged) were randomly selected from the two maps of site B, and the overall accuracy and kappa value of each map were estimated based on the reference data.

Evaluation of model performance
We applied the same training samples and validation method to classify and assess the accuracy of the classification models. To find the optimal model for each classification algorithm, multiple tests were performed using ranges of values for the two key parameters of each algorithm. A comparative analysis of the two models for multiclass classification showed that RF outperforms SVM. When the two key RF parameters, NUM_TREES and MAX_DEPTH, were set to 200 and "no pruning", respectively, the best outcomes were obtained: the highest overall accuracy (0.942) and substantially greater individual class accuracy (Table 5). For SVM, on the other hand, the highest overall accuracy (0.885) was obtained with the best-performing combination of C and gamma values. The two severity classes (destroyed and damaged) were classified with high producer's accuracy, user's accuracy, and F1-score in both models overall. However, accuracy for the possibly damaged class differed according to the values of the key parameters of each model. In particular, producer's accuracy was low overall for the possibly damaged class. As explained earlier, possibly damaged is a class in which it is uncertain whether damage has actually occurred. Therefore, the possibly damaged class has a high probability of being classified as damaged or nondamaged depending on the values of the spectral indices used in this study, and only areas with specific values are classified as possibly damaged.
In terms of parameter tuning, the producer's accuracy varies greatly depending on the MAX_DEPTH value for the RF model. On the other hand, NUM_TREES values from 10 to 200 did not have a significant effect on overall accuracy or individual class accuracy, as in previous studies (Belgiu and Drăguţ 2016; Du et al. 2015). For the SVM model, increasing the C value had a relatively minor impact on overall accuracy, unlike in a previous study (Foody and Mathur 2004). However, the recall of the possibly damaged class differed by about 20% depending on the C value when the gamma value was fixed at $10^{-1}$. Less overlap between the severity classes in the training data and/or selecting the right value for the gamma parameter might contribute to this, as in previous studies (Adugna, Xu, and Fan 2022; Foody and Mathur 2004). The gamma parameter determines how far a training sample's influence spreads, and if the wrong value of gamma is selected, no value of C will provide the intended outcome (Qian et al. 2014). The accuracies of all models constructed with different parameters are included in Appendix A and B.
Figure 3 shows the relative importance of the 8 input variables in the best RF model; dNBR was the variable contributing most to the model. dNBR values can range from −2 to +2 and have been used as a burn severity reference in a number of studies (Escuin, Navarro, and Fernández 2008; Fernández-Manso, Fernández-Manso, and Quintano 2016; Miller and Thode, 2007; Navarro et al. 2017; Kokaly et al. 2007). The NBR is based on a relationship between soil and plant reflectance, with plants mainly accounting for NIR reflectance and soils for SWIR (Roy et al. 2006). As fire intensity increases, NBR values decline, so burn severity increases as the difference in NBR between before and after a forest fire increases. Among the NDVI-related indices, dNDVI showed the lowest variable importance, and the importance of dNDVIre1n and dGNDVI was higher. Most satellite sensors detect a sharp decrease in visible-to-near-infrared reflectance when vegetation is burnt, along with an increase in short- and mid-infrared surface reflectance (Lentile et al. 2006). Sentinel-2A MSI has three red-edge bands, among which the red-edge 1 band is known to perform best for discriminating burn severity levels in coniferous forest composed of Pinus pinaster (Fernández-Manso, Fernández-Manso, and Quintano 2016). In addition, swapping the red band for the green band (GNDVI instead of NDVI) may reduce the saturation effect in the preserved vegetation. The GNDVI was suggested to be useful for distinguishing stressed and senescent plants and to be at least five times more sensitive to chlorophyll-a content than the NDVI (Gitelson, Kaufman, and Merzlyak 1996). The results of this study also imply that alternative methods for identifying patterns of burned regions can draw on the changes in spectral signatures that follow fire damage.

Class ID: 1 = destroyed; 2 = damaged; 3 = possibly damaged; 4 = nondamaged.
dNDWI was found to be the fourth most important index. The Normalized Difference Water Index (NDWI), which effectively measures the moisture content of the canopy, is evaluated at 1 for water bodies and at −1 for dry surfaces (Gao 1996). dNDWI showed good performance in burn severity analysis for open forest or obligate seeders in previous studies (Bar, Parida, and Pandey 2020; Tran et al. 2018). The results of this study also suggest that the increase in soil water repellency caused by fire damage is directly correlated with high burn severity (Lentile et al. 2006). GEMI is a non-linear index used in remote sensing for biomass or Leaf Area Index (LAI) estimation that minimizes the relative influence of atmospheric effects (Pinty and Verstraete 1992; Heiskanen 2006). In this study, areas with high severity, such as the destroyed and damaged classes, showed a mean dGEMI of around 0.3. When the severity was relatively low, as in the possibly damaged class, the dGEMI value was around 0.2 or lower (Table 6).
The two lowest variable importances were found for dCSI and dMSAVI. The Char Soil Index (CSI) and Modified Soil-Adjusted Vegetation Index (MSAVI) are indices used to detect char scars caused by fire damage (Smith et al. 2007). MSAVI is designed to rectify a limitation of SAVI in how vegetation changes as it deviates from the soil line, and dMSAVI showed good performance in burn severity evaluation in a previous study (Tran et al. 2018). In this study, however, the boxes (first to third quartile) of the severity levels overlapped more than for other indices, and the differences in values were smaller. This discrepancy between results can be caused by variability in vegetation structure, forest fire intensity, and the environmental characteristics of each site. Also, low importance of a variable does not mean that it is unnecessary for damage classification. As vegetation and water conditions change with time after a forest fire, the importance of variables for classifying damaged areas may also change (Serra-Burriel et al. 2021).

Model application and comparison
We produced fire damage classification maps using the best model for each algorithm and compared the maps with the reference data and the dNBR map (Figure 4).
First, the destroyed or damaged class seems to be classified similarly to the reference data as a whole in both model results, but, there were many pixels of possibly damaged class classified as damaged class which are low or moderate damage class in dNBR map.The possibly damaged class is a class that shows the uncertainty of the target area because of the bad image quality rather than a grade of burn severity.That is, it can be judged that the overall spectral characteristics of possibly damaged areas for vegetation, soil, and moisture are similar to damaged areas.
Second, both models showed better results that were more similar to the reference data than dNBR. The damaged class in the reference data was more often classified as the destroyed class in the dNBR map. Based on the reference data, the average dNBR value of the destroyed class is 0.834, with a standard deviation of 0.11, which is higher than the high severity threshold of dNBR (0.67) (Table 6). Underestimating or overestimating the damaged area and severity of wildfires can be a fundamental problem in restoration, and overestimation can even lead to overestimated restoration costs. In addition, the area indicated as the possibly damaged class in the reference data (red box in Figure 4a) did not appear on the dNBR map at all. These results indicate that the dNBR map is capable of roughly detecting areas burned by forest fire, but that a single index is limited in classifying an area that exhibits a unique spectral feature, such as a possibly damaged area. In the accuracy assessment based on the confusion matrix, an overall accuracy of 85.5% and a kappa value of 0.807 were obtained for RF, and an overall accuracy of 78% and a kappa value of 0.707 were obtained for SVM (Figure 5). More pixels of the possibly damaged class were misclassified by SVM than by RF because more samples were classified as the nondamaged class. In this context, Mountrakis, Im, and Ogole (2011) emphasized that SVM is more sensitive to noisy data than other algorithms and that its performance deteriorates when pixels from different classes have similar characteristics. Furthermore, SVM uses repetitive computations involving the multiplication of high-dimensional matrices (spectral values), which need enormous processing time and storage space to determine the optimal hyperplanes that divide the various classes. By comparison, the RF algorithm was faster to train and to perform classification, and was computationally less expensive, as in a previous study (Rodriguez-Galiano et al. 2012). Ensemble classification techniques like RF, in which multiple classifiers are employed, could improve accuracy and speed over the use of individual classifiers.
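As a rough illustration of how overall accuracy and the kappa value follow from a confusion matrix, the sketch below computes both with plain Python. The cell counts are hypothetical: they are not the counts of Figure 5 and were chosen only so that the diagonal sums to 85.5% of the samples, with an assumed equal number of reference samples per class.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohens_kappa(matrix):
    """Cohen's kappa: agreement corrected for chance agreement."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical matrix (rows: reference, columns: predicted), 200 samples per class.
# Class order: destroyed, damaged, possibly damaged, nondamaged.
example = [
    [185, 15, 0, 0],
    [12, 170, 14, 4],
    [0, 30, 150, 20],
    [0, 6, 15, 179],
]

print(f"OA={overall_accuracy(example):.3f} kappa={cohens_kappa(example):.3f}")
```

When every class contributes the same number of reference samples, chance agreement reduces to 1/k for k classes, so kappa becomes (OA − 1/k)/(1 − 1/k). The overall accuracy and kappa pairs reported in this section are numerically consistent with that relation, although the sampling design itself is an assumption of this sketch.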
When the accuracy of dNBR and the best models was compared for three severity classes (destroyed, damaged, and nondamaged), the overall accuracy was higher than when the possibly damaged class was included (Figure 6). An overall accuracy of 91.67% and a kappa value of 0.875 were obtained for RF, and an overall accuracy of 89.67% and a kappa value of 0.845 were obtained for SVM (Figure 6). The overall accuracy of reclassified dNBR was 86.50%, with a kappa value of 0.798, which was lower than that of the models. As described above, in the reclassified dNBR map, high severity regions (>0.67) classified as the destroyed class are more widely distributed than regions classified as the destroyed class by the models. Therefore, the accuracy of the destroyed class itself was higher than that of the models, but the ratio of the damaged class classified as the destroyed class was more than twice as high. Furthermore, the proportion of the nondamaged class classified as the damaged class was rather large, resulting in lower accuracy. These results suggest that severity thresholds such as 0.16 (low severity) and 0.67 (high severity) may differ depending on the environment in which the forest fire occurred.
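The reclassification of dNBR into the three classes compared here can be sketched as a simple threshold rule. The thresholds 0.16 and 0.67 come from the text, and the mapping of high severity to destroyed and of low and moderate severity to damaged follows the figure captions. Treating values below 0.16 as nondamaged and the handling of values exactly at a threshold are assumptions of this sketch.

```python
LOW_SEVERITY_THRESHOLD = 0.16   # from the text (low severity)
HIGH_SEVERITY_THRESHOLD = 0.67  # from the text (high severity)


def reclassify_dnbr(value):
    """Map a dNBR value to one of three classes (boundary handling assumed)."""
    if value > HIGH_SEVERITY_THRESHOLD:
        return "destroyed"      # high severity
    if value >= LOW_SEVERITY_THRESHOLD:
        return "damaged"        # low and moderate severity
    return "nondamaged"         # assumed: below the low severity threshold


# 0.834 is the mean dNBR reported for the destroyed class; the others are hypothetical.
for dnbr_value in (0.834, 0.40, 0.05):
    print(dnbr_value, reclassify_dnbr(dnbr_value))
```

Because the rule depends only on two fixed cut-offs, any site where burned vegetation produces systematically higher or lower dNBR will shift pixels between classes, which is the sensitivity the paragraph above describes.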

Model transferability
Using the best RF model, fire damage classification was performed over site B and compared with the reclassified dNBR map for the target area using reference data (Figure 7). According to the model classification, the total area of classes where destructive damage occurred (destroyed, damaged, and possibly damaged) was estimated to be about 6868.13 ha. This means that about 34% of the 20,000 ha affected by forest fires needs priority and substantial restoration. In addition, very few pixels were classified as the possibly damaged class in the model classification map, which means that almost all pixels of this study area have clear spectral differences. When the model output is compared to the reclassified dNBR based on the reference map, nondamaged areas appear similar, and the areas classified as damaged by the model and the areas of low and moderate severity in the dNBR map also appear to be mainly consistent.
For destroyed areas, however, the model result was similar to the reference map, whereas the severity was comparatively underestimated in the dNBR map. For some specific regions, a reference map was constructed by partitioning destroyed and damaged regions according to the guidelines, and the model result classified these regions in a similar way (Figure 8). However, in the reclassified dNBR map, high severity pixels, which are at the same level as destroyed, appear fewer and disconnected in the same location. Because this pattern appeared in more than three zones rather than one specific zone, it was judged that the overall destroyed areas were underestimated in the dNBR map. Similarly, based on the confusion matrix for each map, areas of the damaged and nondamaged classes were classified with similar accuracy in both maps, but more destroyed areas were classified as damaged (low and moderate severity) in the dNBR map (Figure 9). In the accuracy assessment based on the confusion matrix for site B, an overall accuracy of 90.5% and a kappa value of 0.858 were obtained for RF, and an overall accuracy of 87.67% and a kappa value of 0.815 were obtained for reclassified dNBR. Since the model was trained using samples from site A, the accuracy for site B is slightly lower, but it was also confirmed that model classification applied to another region performed more accurate detection than using only dNBR values.
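The area figures reported for site B can be related to classified pixel counts with simple arithmetic. The sketch below assumes a 10 m pixel (0.01 ha), which is an assumption about the classification grid rather than a value stated in this section. The pixel count is hypothetical, back-calculated so that it reproduces the reported 6868.13 ha.

```python
PIXEL_SIZE_M = 10                              # assumed grid resolution in metres
HA_PER_PIXEL = PIXEL_SIZE_M ** 2 / 10_000      # 1 ha = 10,000 square metres


def class_area_ha(pixel_count):
    """Area in hectares covered by a number of classified pixels."""
    return pixel_count * HA_PER_PIXEL


def share_of_total(area_ha, total_ha):
    """Fraction of the total affected area."""
    return area_ha / total_ha


# Hypothetical count of pixels in the destroyed, damaged, and possibly damaged classes.
destructive_pixels = 686_813
destructive_area = class_area_ha(destructive_pixels)
destructive_share = share_of_total(destructive_area, 20_000)

print(f"{destructive_area:.2f} ha, {destructive_share:.1%} of the affected area")
```

Under these assumptions the destructive-damage classes cover roughly a third of the 20,000 ha, in line with the share of about 34% stated in the text.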

Importance of reference data in burn severity analysis
The fire season is related to periods when vegetation is available to burn because moisture declines and air conditions promote the fire environment, and climate change would lengthen the period in which fuel is dry enough to burn (Jolly et al. 2015). In this context, understanding the contribution of reference data and input data for the detection of areas damaged by forest fires is critical for accurate classification of fire damage severity and for improving the direction and efficiency of restoration. However, in most countries, reference data for disaster areas, including forest fires, have not been institutionally established, and it is not easy to introduce such a system in terms of policy and economy (Kolden, Smith, and Abatzoglou 2015). As a consequence, most previous research used spectral indices to identify fire damage or classify severity, compared them using different criteria, and derived different findings depending on the environment of each site (Bar, Parida, and Pandey 2020; Mpakairi, Ndaimani, and Kavhu 2020; Schepers et al. 2014; Tran et al. 2018). Therefore, in this study, the usefulness of rapid assessment using open source reference data and the effectiveness of modeling through different ML algorithms were demonstrated for similar fire-prone ecosystems.
Furthermore, our own reference map was constructed by visual examination to evaluate transferability, and visual criteria and descriptions for classification of burn severity were presented in this study. When using high-resolution Google Earth imagery, it is possible to clearly identify the differences caused by forest fires (Table 4). However, in RGB or false color imagery constructed using Sentinel-2, differences between land cover classes can be distinguished, but visual examination and classification of damage severity within the same forest are difficult. Burned areas appear dark red in false color imagery because of higher reflectance in the SWIR and red spectrum (Jiao and Bo 2022); however, such imagery alone does not allow severity to be classified or precise extents to be mapped. Therefore, in the absence of high-resolution images such as Google Earth imagery, it is difficult to distinguish the severity and area of fire damage, and a limitation of this study is that a reference map of the entire site B could not be established because high-resolution imagery was only obtained for a specific area. In addition, future construction of a reference map for a forest fire target area should be accompanied by a field survey to resolve ambiguous areas such as the possibly damaged class and to confirm the actual conditions within the damaged areas.

Limitations of a single index and usefulness of modeling
The pivotal results of this study were a method for standard severity classification utilizing remote sensing and ML algorithms, rather than just detecting regions damaged by forest fires, and suitable reference data. Burn severity mapping data can contribute to (i) assessments of the effectiveness of fire management strategies; (ii) emergency response for damaged areas; and (iii) understanding of forest fire behavior in ecosystems (Collins et al. 2018; Price and Bradstock 2012). Modeling that judges comprehensively using various indices, rather than a specific index such as dNBR, can perform more accurate severity classification, but the indices useful for the model may vary. These discrepancies between results can be caused by variability in vegetation structure, forest fire intensity, and the composition of environmental characteristics of each site. Also, low importance of a variable does not mean that it is unnecessary for damage classification. As the vegetation and water conditions change as time elapses after a forest fire occurs, the importance of variables for classifying damaged areas may also change (Serra-Burriel et al. 2021). In this study, pixel-based ML algorithms were used for direct comparison with index-based classification. Pixel-based classification could learn and classify pixels for each burn severity in detail through training that focuses exclusively on the spectral information of the input images (Gao 2008). However, in contrast to the object-based method, its generalization capability for spectral information within a neighborhood is poor, resulting in misclassification phenomena such as salt-and-pepper noise. Although the overall accuracy was better and noise was lower than in the dNBR map, some local noise occurred as a limitation of pixel-based classification. In the future, we plan to develop a severity classification model with image-based methods such as a Convolutional Neural Network (CNN) algorithm to reduce noise.
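To make the salt-and-pepper effect of pixel-based classification concrete, the sketch below applies a 3×3 majority filter to a small class map. This post-processing step is not part of the study and is not the CNN approach the authors plan; it is shown only to illustrate how neighborhood information can suppress isolated misclassified pixels. The class codes and the grid are hypothetical.

```python
from collections import Counter


def majority_filter(grid):
    """Replace each cell by the strict-majority label of its 3x3 neighborhood.

    Cells whose neighborhood has no strict majority keep their original label.
    """
    rows, cols = len(grid), len(grid[0])
    smoothed = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            label, count = Counter(window).most_common(1)[0]
            if count > len(window) // 2:
                smoothed[r][c] = label
    return smoothed


# Hypothetical class map: 2 = destroyed, 0 = nondamaged (one isolated noisy pixel).
noisy_map = [
    [2, 2, 2],
    [2, 0, 2],
    [2, 2, 2],
]

for row in majority_filter(noisy_map):
    print(row)
```

A filter of this kind trades spatial detail for smoothness, so small genuinely different patches can be removed together with the noise.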

Conclusion
In this study, we presented a novel methodology for mapping burned areas and severity using Sentinel-2 data and CEMS data. We compared two classifiers, RF and SVM, for fire damage classification over the forest in Castelo Branco, Portugal, based on reference data and eight spectral indices. RF, which is more suitable for classifying pixels with mixed characteristics (possibly damaged class), showed better performance than SVM. We found that dNBR and dGNDVI outperform other indices in coniferous forest consisting of pine trees in an arid environment. However, all eight indices are frequently inferred using multi-temporal (pre- and post-fire) remotely sensed spectral data, while the severity of fire damage is estimated by visible or measurable field observations. The impact of burn severity on ecosystem functions depends on the environment and vegetation types, so the importance of indices for severity classification in other forest areas would change accordingly.
In addition, a further fire damage classification was performed for transferability assessment by applying the constructed model to an area where a forest fire occurred in a similar environment. In this process, a fire damage severity reference map was established through visual examination, and guidelines for this were presented. Through comparative analysis, it was also confirmed that model classification applied to another region performed more accurate detection than using only dNBR values.
Our research shows that the severity assessment of burned regions could be performed using two satellite images (pre- and post-fire) and reference data of burn severity for different regions. This approach can be rapidly adopted for the evaluation of many other burned areas, allowing for an unprecedented recognition with a dataset of accurate, significant, and easily accessible reference information, which will provide up-to-date and reliable information on forest fires around the world. The management of forest fires requires a better understanding of the frequency of meteorological variables, including climate change, impacting each region and the application of strategies for reducing factors that balance the effect of forest fire intensity and spread. In this regard, remote sensing technologies using satellite observation data are at a significant juncture in terms of sustainable forest management. Regarding future research, together with CEMS data or similar reference data and independent remote sensing observations, the method presented in this paper can be adapted to other disasters, such as drought or flood, for assessing the efficacy of emergency measures.

Figure 1. (a) Location of the site A; (b) Post-fire RGB composite over the site A; (c) Post-fire false color composite (SWIR, NIR, Red) over the site A; (d) Pine forest distribution map over the site A.

Figure 2. (a) Location of the site B; (b) Post-fire RGB composite over the site B; (c) Post-fire false color composite (SWIR, NIR, Red) over the site B; (d) Pine forest distribution map over the site B. The vegetation map to determine pine forest distribution was obtained from the Ministry of Environment of South Korea (https://egis.me.go.kr/, accessed 17 June 2022).

Table 1. Comparison of the EMS-98 and EMSR severity classes for assessing apparently visible damage. Columns: Copernicus EMS classes; EMS-98 classes. Possibly damaged: It refers to cases when the confidence level of the interpretation is slightly lower (e.g. bad image quality).

Figure 3. Relative variable importance of the 8 input variables provided by the RF algorithm.

Figure 4. (a) Reference data for the site A (EMSR448); (b) Reclassified dNBR map for site A; (c) Classification result using the best RF model for the site A; (d) Classification result using the best SVM model for the site A (Destroyed class corresponds to high severity of dNBR map, damaged class corresponds to moderate and low severity of dNBR map).

Figure 5. Confusion matrix of the best model of each algorithm for the site A: (a) confusion matrix of the best RF algorithm; (b) confusion matrix of the best SVM algorithm.

Figure 6. Confusion matrix of the best model of each algorithm and dNBR for three severity classes (destroyed, damaged, and nondamaged) for the site A: (a) confusion matrix of the best RF algorithm; (b) confusion matrix of the best SVM algorithm; (c) confusion matrix of the reclassified dNBR.

Figure 7. Fire damage classification map of site B produced by the best RF model and reclassified dNBR map: (a) RF classification for site B; (b) Reclassified dNBR map for site B; (c) RGB composite, false color composite, model classification map and dNBR map for first example site; (d) RGB composite, false color composite, model classification map and dNBR map for second example site (Destroyed class corresponds to high severity of dNBR map, damaged class corresponds to moderate and low severity of dNBR map).

Figure 9. Confusion matrix of the best model of RF algorithm and dNBR for three severity classes (destroyed, damaged, and nondamaged) for the site B: (a) confusion matrix of the best RF algorithm; (b) confusion matrix of the reclassified dNBR.

Figure 8. Visual comparison of burn severity classification by reference map, model result, and dNBR for transferability assessment (Destroyed class corresponds to high severity of dNBR map, damaged class corresponds to moderate and low severity of dNBR map).

Table 2. Pre-/post-fire images employed for each site.

Table 3. Spectral indices used in this study.

Table 4. Proposed damage scale of burn severity and guidelines for classification based on remote sensing imagery.
gamma values were 200 and scale, respectively, but it was lower than other experimental results of RF.

Table 5. Accuracies of the best model for each algorithm.

Table 6. Statistical values of the spectral indices for each severity level of Castelo Branco, Portugal.