Linking repeat lidar with Landsat products for large scale quantification of fire-induced permafrost thaw settlement in interior Alaska

The permafrost–fire–climate system has been a hotspot in research for decades under a warming climate scenario. Surface vegetation plays a dominant role in protecting permafrost from summer warmth, thus, any alteration of vegetation structure, particularly following severe wildfires, can cause dramatic top–down thaw. A challenge in understanding this is to quantify fire-induced thaw settlement at large scales (>1000 km2). In this study, we explored the potential of using Landsat products for a large-scale estimation of fire-induced thaw settlement across a well-studied area representative of ice-rich lowland permafrost in interior Alaska. Six large fires have affected ∼1250 km2 of the area since 2000. We first identified the linkage of fires, burn severity, and land cover response, and then developed an object-based machine learning ensemble approach to estimate fire-induced thaw settlement by relating airborne repeat lidar data to Landsat products. The model delineated thaw settlement patterns across the six fire scars and explained ∼65% of the variance in lidar-detected elevation change. Our results indicate a combined application of airborne repeat lidar and Landsat products is a valuable tool for large scale quantification of fire-induced thaw settlement.


Introduction
Fire is the largest driver shaping the diverse boreal ecosystems in interior Alaska where ∼40% is underlain by ice-rich permafrost (Jorgenson et al 2022). Effects of wildfires on permafrost such as active layer thickness and thermokarst development are well documented (e.g. Jorgenson et al 2010, Jones et al 2015, Minsley et al 2016, and recently reviewed by Holloway et al (2020) and Li et al (2021). The mechanisms of fire in changing permafrost ecosystems have been recognized: severe wildfire removes the vegetation and surface soil organic matter, and the loss of this insulation increases the ground heat flux and promotes permafrost thaw. In areas of ice-rich permafrost, thaw triggers ground subsidence and thermokarst development, leading to surface water inundation, vegetation shifts, changes in soil carbon balance and carbon emissions, that can provide positive feedback to climate warming (Douglas et al 2014). Recent research of the role of fire in permafrost landscapes has focused on determining which variables control post-fire permafrost behavior, the interactive effects of fire and climate on permafrost degradation and resilience, and modeling and predicting post-fire resilience (Holloway et al 2020).
Research interest in the permafrost-fire-climate system has been growing. Current and future projected increases in mean annual air temperature, the length of the summer growing season, and the severity and extent of wildfire are expected to lead to an increasingly dominant role of wildfire in permafrost ecosystems (Gibson et al 2018). One challenge in understanding this system is to quantify fire-induced thaw settlement which is a gentle and continuous ground subsidence as the ice melts and the active layer deepens downward into near-surface permafrost (Anders et al 2020). Thaw settlement is difficult to measure as there are often no absolute reference frames to compare to the subtle but widespread topographic change in permafrost landscapes (Anders et al 2020). Mapping fire-induced thaw settlement is critical as it is associated with subsequent thermokarst development, snow accumulation, hydrology, vegetation shifts, and commensurate changes in the land-atmospheric exchange of water, energy, and greenhouse gases. Efforts have been made to estimate fire-induced permafrost thaw settlement using spaceborne interferometric synthetic aperture radar (InSAR) techniques (Liu et al 2014, Molan et al 2018, Michaelides et al 2019, Yanagiya and Furuya 2020. Constraints of InSAR for such applications include interferometric phase decorrelation (Antonova et al 2018), and loss of image coherence across large areas within a burn (Jones et al 2015).
Airborne lidar is so far the best technique to generate accurate and fine resolution (<5 m) digital elevation models (DEMs) for a variety of terrestrial applications (Dong and Chen 2018). The application of lidar for cryospheric studies has been increasing but has been limited in permafrost landscapes (Bhardwaj et al 2016). Ideally, repeat lidar measurements can accurately estimate and map thaw settlement with a fine resolution, which is evidenced by Jones et al (2013), Jones et al (2015), Douglas et al (2021) and Rodenhizer et al (2022). However, repeat lidar has been seldom applied to quantify elevation changes in fire-influenced permafrost terrains. This is attributable to the high cost for lidar data acquisition, and a limited spatial coverage of airborne platforms compared to the spaceborne sensors. Chasmer and Hopkinson (2017) applied repeat lidar to characterize permafrost loss. The work from Jones et al (2015) is the only effort analyzing post-fire repeat lidar for detecting subsidence caused by the 2007 Anaktuvuk River tundra fire which burned ∼1000 km 2 in northern Alaska. Multi-temporal lidar is a useful means detecting fire-induced terrain subsidence and thermokarst development since it allows for a direct measure of land surface elevation relative to a geodetic reference frame. Lidar-detected subsidence is highly related to Landsat-derived differenced Normalized Burn Ratio (dNBR), and the trend of post-fire multispectral indices such as Normalized Difference Vegetation Index (NDVI) and Normalized Difference Moisture Index (NDMI) (Jones et al 2015). dNBR is a variable developed from adjacent pre-fire and post-fire optical imagery to estimate burn severity, which plays a critical role in postfire permafrost degradation and resilience (Keeley 2009). Multispectral indices NDVI and NDMI indicate vegetation succession, which is coupled with topographic change, and other interactions in post-fire permafrost environments. A linkage of repeat lidar with optical satellite sensors like Landsat through modern machine learning modeling techniques may provide potential for large-scale quantification of fire-induced thaw settlement across broader, recently burned regions, as well as other historic fires.
The Tanana Flats (∼2500 km 2 ), a representative lowland landscape south of Fairbanks in interior Alaska, consists of a complex mosaic of ice-rich permafrost and permafrost free ecosystems. The region is a hotspot for thermokarst, and ground collapse caused by thawing permafrost (Jorgenson et al 2020(Jorgenson et al , 2022. Much of the land is part of a military training area managed by the U.S. Department of Defense. In the past two decades, six large fires with burn scars larger than 20 km 2 have occurred and influenced a total of 1250 km 2 (∼50%) across this training area. The large scale effects of these fires on land cover change, post-fire resilience, and subsequent thaw settlement remain unknown, although analyses using data collected along several transects have revealed fire-induced permafrost degradation over this area . To further understand the effects of disturbance from fires and climate warming on the Tanana Flats training site, the U.S. Army Corps of Engineers (USACE) Cold Regions Research and Engineering Laboratory coordinated the collection of airborne repeat lidar that partly overlapped with the fire scars. This provides a unique opportunity to quantify fire-induced thaw settlement and assess the capacity of operational Landsat products for characterizing fire-influenced permafrost terrains.
In this study, we systematically analyzed the effects of six large fires that occurred since 2000 on the Tanana Flats lowland on land cover change, vegetation dynamics and terrain subsidence, and for the first time developed a machine learning based ensemble approach to quantifying fire-induced thaw settlement across the entire Tanana Flats by linking limited repeat lidar measurements with Landsat products. We assessed three commonly used machine learning algorithms including artificial neural network (ANN), support vector machine (SVM) and random forest (RF) for fire-induced thaw settlement modeling. Machine learning has been extensively applied for modeling in geosciences as reviewed by Dramsch (2020) and Sun et al (2022). Our logic is that each algorithm has its pros and cons, and an ensemble analysis (EA) of comparative models can produce a more robust estimation than the application of a single model. In addition, a spatial uncertainty map can be created based on analyses of multiple model outputs to complement the traditional statistical error metrics (Zhang et al 2022). The specific objectives were to: (a) characterize the response of vegetation to fires using time-series Landsat products; (b) analyze repeat lidar-derived elevation changes related to vegetation shift and burn severity; (c) assess the performance of machine learning algorithms for upscaling lidardetected permafrost thaw settlement; and (d) identify fire-induced subsidence patterns across the Tanana Flats lowland.

Study area, data, and fire history
The Tanana Flats lowland in interior Alaska is underlain by discontinuous permafrost (figure 1). It is situated in a large, abandoned floodplain and alluvial fan, and receives deposits from the fluvial and glaciofluvial sediments of the Alaska Range to the south (Jorgenson et al 1999). Tanana Flats has a gentle elevation gradient of approximately 1 m km −1 declining from southeast to northwest. The Tanana River, the largest tributary of the Yukon River, forms the boundary around the northern portion of Tanana Flats. Land cover types include evergreen forest (spruce), deciduous forest (birch), shrubland, fen, bog, streams, and open water bodies. Field photos of the major land cover types are provided in Figure SI1. Forests are generally located on ice-rich permafrost plateaus elevated above fens and bogs formed in thermokarst depressions. Approximately 50% of this area is in some stage of permafrost degradation caused by fires and climate warming (Jorgenson et al 1999). The climate in interior Alaska is continental, with annual mean precipitation of 28 cm, a typical annual snowfall of 1.7 m, and a wide variation of seasonal mean air temperature ranging from −40 • C in winter to 25 • C in summer.
Data used in this study include historical fire perimeters from the Alaska Interagency Coordination Center (AICC), Landsat derived dNBR products from the National Aeronautics and Space Administration (NASA)'s Arctic-Boreal Vulnerability Experiment (ABoVE, Goetz et al 2022) (Loboda et al 2018), Landsat derived annual dominant land cover across the ABoVE core domain, 1984-2014 (Wang et al 2019), Landsat 8 Collection 2 Level 2 surface reflectance products from the U.S. Geological Survey (USGS) Earth Explorer, and airborne repeat lidar measurements in 2014 and 2020 for USACE. The USACE contracted with the Quantum Spatial Incorporated (Anchorage, Alaska) for lidar data collection using a Lecia ALS80 laser system to yield an average pulse density of ⩾25 points m −2 over the targeted lidar survey area (∼40 km 2 , figure 1(b)). The vendor processed the lidar point cloud data and generated a 0.25 m hydro-flattened DEM product. The hydroflattening process eliminated artifacts in the digital terrain caused by both increased variability in ranges or dropouts in laser returns due to the low reflectivity of water. However, limited water bodies appeared in our study area, and they were not within the fire scars. The 2014 lidar was collected on May 9-11, and 2020 lidar was acquired on May 17-18, in a period of 95% snowmelt with leaf-off conditions.
Six  figure SI2. Because the latest big fire occurred in 2012, we selected six snow and cloud free Landsat 8 scenes after 2012 and calculated annual time series NDVI for modeling. These were collected on 06/18/2013, 08/08/2014, 07/12/2016, 06/13/2017, 07/05/2019, and 07/01/2021, respectively. Landsat 5 and 7 data products were also available following some fires but were not used in this study to reduce uncertainties of reflectance difference across Landsat sensors (Flood 2014, Roy et al 2016. Landsat 8 products are only available since 2013.

Methodology
To examine the response of vegetation to fires, we selected land cover data for years 2000,2002,2008 and 2014 from NASA's ABoVE annual dominant land cover products and calculated the gain and loss of major land cover classes caused by the six fires. The change between 2000 and 2002 could illustrate the disturbance from the 2001 fire, change between 2002 and 2008 could reveal the effects from the 2004 fire, while change between 2008 and 2014 could document the combined influence from four fires between 2009 and 2012. We selected the dNBR data for each fire from NASA's ABoVE dNBR product and subset it to the fire scar defined by the AICC fire perimeter dataset. We mosaiced the dNBR for the entire Tanana Flats to be used for model development and analysis. We used the dNBR from the first fire that occurred for the reburned areas where a larger dNBR was observed from the first fire than the second fire because the first fire had burned down the vegetation. We calculated time-series NDVI from the selected Landsat 8 surface reflectance datasets, and similarly calculated NDMI to assess its contribution for estimating fire-induced thaw settlement.
To examine the capacity of Landsat products for upscaling thaw settlement detected by repeat lidar data, we developed an object-based machine learning ensemble approach that has proven powerful for thaw depth and snow depth estimation in permafrost ecosystems of interior Alaska using optical imagery and lidar DEMs (Douglas and Zhang 2021, Zhang et al 2021). We first determined the lidardetected subsidence by analyzing the errors from lidar measurements (section 3.1), and then linked the detected subsidence with Landsat-derived reflectance, dNBR, and spectral indices at the object level to identify a quantitative relation between the target variable (subsidence) and predictors (Landsat variables) using machine learning regression algorithms. The models were assessed using traditional error metrics including correlation coefficient (r), mean absolute error (MAE), and root mean square error (RMSE). If the models were comparable, we applied a weighted ensemble approach to combine the outputs from multiple models to make the prediction more robust (section 3.3.2). Major steps for upscaling lidardetected subsidence include determination of thaw settlement using repeat lidar, object-based data processing, machine learning ensemble model development, assessment, and mapping applications. They are described below.

Thaw settlement detection using repeat lidar
Lidar DEMs have errors which should be considered because they will propagate into the DEM of difference (DoD) for elevation change analysis (Wheaton et al 2010). Based on the vendor's 2014 lidar report, the accuracy of the lidar DEM was assessed based on a total of 183 ground control points (GCPs) measured using real time kinematic and post processed kinematic techniques across five land cover classes defined by the USGS: mixed forest, grasslands/herbaceous, shrubland, wetlands, and barren. Lidar data vertical accuracy was reported using the fundamental vertical accuracy designed to meet guidelines in Federal Geographic Data Committee National Standard for Spatial Data Accuracy. A summary of the lidar accuracy is listed in table 1. For the 2020 lidar, the vendor only reported errors for vegetated area (23.2 cm) and non-vegetated area (5.1 cm) but did not provide errors for each land cover class. Since both 2014 and 2020 lidar data were collected by the same vendor using the same laser system, similar settings, and time (May 2014 and 2020), we thus used the 2014 reported errors in this study. Connecting this error table with the 2014 land cover dataset, we developed an error layer and calculated the propagated error in DoD using an equation revised from Wheaton et al (2010) as below: where δU DoD is the propagated error in DoD, and δZ 2020 and δZ 2014 are individual errors in 2020 DEM and 2014 DEM, respectively. Since we assumed 2020 and 2014 had the same errors across the eight land cover classes appeared in the burned areas, these two variables used the same values in table 1. We selected the conservative 95th percentile error for each land cover class defined by the ABoVE 10-class classification system for the burned areas (table 1) to calculate the accumulative error. Any elevation changes in DoD (|2020 DEM minus 2014 DEM|) less than δU DoD were considered noise and dropped in further analysis. The repeat lidar measurements within the unburned areas and the 2001 fire was also dropped

Object-based data processing
The object-based image analysis (OBIA) segments imagery into relatively homogeneous objects. It is more accurate than the pixel-based analysis because an object is more representative than any pixels within this object and using the spectral mean of an object in a model can reduce the local spectral noise/variance caused by surrounding pixels (Dronova 2015). We selected the OBIA in the upscaling and generated image objects from the 2014 Landsat imagery using the multiresolution segmentation algorithm in eCognition Developer 10.0 (Trimble 2020). This algorithm starts with a 1 pixel image segment and merges neighboring segments together until a heterogeneity threshold is reached. A challenge using OBIA is to select an appropriate scale parameter given its effects on modeling and mapping results. We applied the commonly used 'trial-and-error' approach to select an optimal scale parameter. We selected the 2014 imagery to generate objects because lidar data was collected in 2014 and no snow and cloud free imagery was available for 2020. Following segmentation, we extracted the spectral mean for each object, and calculated two spectral indices NDVI and NDMI. We also calculated annual time-series NDVI for each object to examine their contribution in thaw settlement estimation. Landsat 8 images collected during 2013-2021 for time series NDVI calculation were provided in section 2. The mean lidar-detected subsidence was calculated for objects where lidar measurements were available, which was spatially matched to Landsat variables at the object level for model development. In total, we obtained 284 matched objects across different land cover types with varying burn severity, which were used for training and testing the model.

Machine learning ensemble modeling, mapping, and accuracy assessment 3.3.1. Machine learning modeling
Our preliminary analysis showed that the relation of subsidence and Landsat predictors was nonlinear due to the complicated post-fire permafrost response. Machine learning algorithms are valuable in quantifying the nonlinear relation and thus were explored. We selected three popular machine learning algorithms, ANN, SVM and RF, for subsidence modeling based on an exploration of many algorithms integrated in the open-source machine learning software package Waikato Environment for Knowledge Analysis (WEKA) (Hall et al 2009). ANN is a sophisticated machine learning algorithm that has been widely applied in remote sensing classification, as reviewed by Mas and Flores (2008). We selected the commonly used multilayer perceptron method which sets up multiple layers connected by weights to train a model in a nonlinear manner. This algorithm needs to set the number of hidden layers, learning rate and the number of training cycles. SVM is a popular approach (Vapnik 1995) which has also been broadly used in remote sensing, as reviewed by Mountrakis et al (2011). In SVM, parameters including a kernel to be used, precision and penalty need to be specified. RF is a tree-based ensemble approach that constructs numerous small regression trees contributing to the predictions (Breiman 2001), as reviewed by Belgiu and Drãguţ (2016) for remote sensing applications. Two parameters need to be defined in RF: the number of decision trees to create and the number of randomly selected variables considered for splitting each node in a tree. We tuned each model using the experimenter function in WEKA which can determine the best model by setting different parameter through a cycling procedure.

Ensemble estimation for mapping thaw settlement
Each algorithm has its pros and cons, and these algorithms may generate similar accuracy in terms of reference samples but different predictions across a large area. An EA of comparative models can produce a more robust estimation than the application of a single model. In addition, the application of multiple models can generate a spatial uncertainty map to complement the traditional statistical metrics used for error analysis (Zhang et al , 2022. We used a weighted combining scheme developed by Zhang et al (2018) based on r derived from each model to integrate predictions from different models. If model i (i = 1, 2, … M) has a correlation coefficient r i , then the final prediction of subsidence S is calculated as: where S i is the prediction from model i, and M is the total number of models in the EA. In this way, a model with a larger r obtains a higher weight, and the sum of weights is 1.0. To make the prediction more robust, at least two models which voted subsidence (negative elevation change) were combined for estimation.

Model accuracy assessment and uncertainty mapping
We applied the commonly used k-fold cross validation method to calibrate and validate each algorithm.
The k-fold cross validation randomly splits the reference samples into k equal groups, and then iterates the model k times. In each iteration, one group is used to assess the model, and the remaining k-1 groups are used to train the model. In this way it can avoid over-fitting issues if the same dataset is used for both training and testing. We set k to 10, a common value used in literature. Setting of k was detailed in Anguita et al (2012). After the iteration, prediction was generated for each matched sample and could be used to evaluate model performance by calculating statistical metrics including r, MAE, and RMSE. If EA is used for predicting subsidence, the STD of multiple model outputs to the ensemble prediction (here after referring to as STDE) can be calculated to map the uncertainty in prediction caused by the application of different models. STDE is calculated as: where S i , S, and M are same variables as in equation (2).

Burn severity, vegetation change, and lidar-detected subsidence
The mosaiced burn severity quantified by the dNBR across the six fires is displayed in figure 2( Burn severity was heterogeneous within a fire scar and across the six fire scars. The blank stripes in the dNBR map were caused by the lack of data from Landsat 7 due to the failure of Scan Line Corrector after 31 May 2003 in the dNBR dataset (Loboda et al 2018).
In terms of the burn severity levels defined by the USGS and the 2000 land cover map, we analyzed areas with dNBR larger than 100 (totaling 842 km 2 ) for each major land cover within the AICC fire perimeters (table SI2). The greatest damage occurred in evergreen forest which accounted for 80% on the affected area. Fen was the second largest ecosystem associated with burned areas, but we attribute the burn estimates to problems with classifying land cover and burn severity in fens rather than to fires. Fen was reported to have the lowest user's accuracy (61%) among the identification of ten classes in NASA ABoVE annual dominant land cover product (Wang et al 2019).
The change between the land cover maps in 2000 and 2014 further illustrated the fire effects on the evergreen forest which covered 1177 km 2 (46.5% of the area) in 2000 before the six fires but was reduced to 775 km 2 in 2014 as the result of 2001 fire burned ∼120 km 2 , the 2004 fire burned ∼50 km 2 , the 2009-2012 fires burned ∼230 km 2 . In total, ∼400 km 2 (57%) of the Evergreen Forest within the fire perimeter was affected. Deciduous forest covered ∼7.8% of the area in 2000 and was reduced by only 0.8% after fires. Fires resulted in the expansion of Shrubland from a pre-fire areal extent of 548 km 2 (21.6%) to a post-fire areal extent of 720 km 2 (28.4%). This expansion is distinct across the 2001 fire scar (figures 2(b) and (c)) and better evidenced by the increase of ∼144 km 2 between 2000 and 2008, but only ∼28 km 2 between 2008 and 2014. This indicates that it might take more than 6 years for a complete encroachment of shrubs after a fire. The expansion of shrubs in the 2009 fire is also clear. The removal of vegetation by fires was also illustrated by the large increase in sparse vegetation defined as areas with 20%-50% vegetation coverage (Wang et al 2019) from 15 km 2 (0.6%) to a post-fire areal extent of 296 km 2 (11.7%). Fens covered ∼23% of Tanana Flats and were reduced by 2.6% after fires. The decrease in fens, however, is inconsistent with observations that wet, herbaceous fens do not burn and the reported increase in fens from permafrost degradation (Jorgenson et al 2020). The change, again, was more likely due to difficulties in classifying fens with seasonally variable water levels and herbaceous cover. Figure 4 shows the lidar DoD and detected thaw settlement between 2014 and 2020 over the 2010 fire scar. The elevation change was heterogeneous varying from increase to decrease ( figure 4(a)). Here we only focused on thaw settlement analyses after dropping areas with an accumulative error larger than DoD ( figure 4(b)). For the burned areas with dNBR larger than 100, repeat lidar analyses revealed a mean elevation decrease of 18.4 cm in evergreen forest with a    2(c)). We attribute the consistent elevation decreases in the permafrostaffected evergreen forest, shrubland, and sparse vegetation to thaw settlement, whereas attribute changes in fens to water level changes.

Model performance, thaw settlement mapping, and analysis
We conducted correlation analyses and applied the variable importance function in the RF algorithm to identify optimal Landsat predictors for thaw settlement. The results illustrated that an inclusion of post-fire NDVI for 1 or 2 years was adequate for thaw settlement estimation, while including timeseries NDVI was not beneficial. Both dNBR and NDVI were valuable for thaw settlement modeling with dNBR negatively related (r = −0.23) and NDVI positively related (0. 30-0.4). An addition of NDMI did not improve the model performance due to its high correlation with NDVI. Inclusion of spectral bands 2, 3, 5, and 7 was valuable in the model. Performance of the final selected models with optimal predictors for fire-induced thaw settlement is displayed in table 2. Three machine learning models We applied the EA model to estimate thaw settlement across all the six fire scars (∼1250 km 2 ) on Tanana Flats with a dNBR larger than 100, as shown in figure 5. Thaw settlement within the fire scars was not homogeneous with some regions larger than 25 cm and some areas less than 5 cm. Most of the regions had a settlement larger than 10.0 cm comparable to the RMSE of the EA model, suggesting the model was acceptable. The fire-induced thaw settlement pattern was reasonable by a joint observation of the settlement map ( figure 5(a)) and the dNBR map (figure 2(a)). Severe settlement (<−25 cm, in purple) was mainly present in the 2001 fire and southern part of the 2009 fire due to extensive loss of ecosystemprotection provided by evergreen trees. The STD of multiple model outputs (STDE) map illustrated that three models produced consistent estimations for most areas with STDE less than 6.0 cm (in green in figure 5(b)). A small area in the 2012 fire scar had STDE larger than 15 cm (in red), indicating a large disagreement among the model estimations.
We calculated the mean subsidence across the three major burned vegetation communities evergreen forest, shrubland, and fen, and each fire scar. Areas covered by evergreen forest experienced an averaged subsidence of 17.2 cm, and fen margins had an estimated subsidence of 14.1 cm. Estimated subsidence in Shrubland was 14.1 cm. The across fire analysis revealed that the greatest subsidence occurred in the 2001 fire scar with an estimation of 18.0 cm, which was followed by areas in the 2011 fire scar with an estimation of 17.1 cm. The smallest subsidence was present in the 2004 fire scar which was reburned in 2009.

Effects of wildfire on vegetation and thaw settlement
Fire effects on vegetation on Tanana Flats were generally consistent with findings reported in the boreal and arctic biome following a trajectory of changing evergreen trees to broadleaf trees, shrubs, or grasslands (Li et al 2021). Analyses of the entire ABoVE domain of land cover change during 1984-2014 illustrated that most of the Evergreen Forest loss occurred in localized patches related to fire disturbance (Wang et al 2020). Abrupt decreases in most plant function types caused by burns were also revealed by a recent ABoVE dataset 'Modeled Top Cover by Plant Functional Type over Alaska and Yukon, 1985-2020. Our time-series analyses of land cover change and patterns on Tanana Flats across the six fire scars showed a large increase of sparsely vegetated areas in 2002 (post-fire 2001) followed by a large increase of shrubs in 2008 (pre-fire 2009). This particular time span with focus on the 2001 fire scar revealed the path of vegetation shift after fire over this lowland area: a removal of evergreen trees, a transition of sparsely vegetated area, and then a colonization of shrubs. Further observation of the land cover maps of 2000 (prefire) (figure 2(c)) and 2014 (post-fire) (figure 2(b)) showed that the expansion of shrubs to sparsely vegetated areas started from areas occupied by fens which became shrubbier after fire. We found that fens could also expand into burned shrubs. A shrub-rich fen system was favorably developed due to the association of fens and shrubs in the fire regime at the lowland Tanana Flats. However, most severely burned areas were still classified as the sparse vegetation mixed with herbaceous in the 2014 land cover map, suggesting it might take more than two decades for a complete colonization of shrubs over these areas. Similar shifting patterns are underway for burned areas caused by fires between 2009 and 2012, although a distinct expansion of shrubs toward fens and sparsely vegetated areas has not been observed in the 2014 land cover map across these scars. The fire and climate warming induced expansion of shrubs often occurs in tundra areas (Myers-Smith and Hik 2017), but the rapid regeneration of shrubs following fires is also reported in interior Alaska . Our field visits showed an abundant spruce and birch regeneration at the fire scars of 2001 and 2010, but it was very slow. It could take decades for this to be visible on Landsat imagery. Failure of regeneration is possible because a synthesis study from Baltzer et al (2021) showed that interior Alaska had the highest instances of total regeneration failure of any tree species after a fire based on 1538 field sites across boreal North America. Earlier aerial photographybased maps during 1949-1995 revealed a large shift from broadleaf birch to fens and bogs due to climate warming, leading to a widespread permafrost degradation on Tanana Flats . In terms of our land cover maps, deciduous forest (birch) now only accounts for 7%. Fires are accelerating the elimination of lowland birch forests over this area. The recovery of vegetation-permafrost may take decades and is unlikely to return its pre-burn conditions under a warming climate scenario.
Effects of fire on permafrost thaw and surface settlement is complicated by the interaction of numerous biophysical factors, including highly variable ground ice contents (Jorgenson et  Flats, volumetric ice contents in the top 3 m often vary from 40% to 80% (Jorgenson et al 2001) and thaw settlement up to 50 cm was observed within a few years after the 2010 fire (Brown et al 2015). In our analysis, elevation decreases were spatially variable. Given the close association of field and lidar measurements of elevation changes (Brown et al 2015), we are confident in concluding that the large elevation decrease is due to permafrost thaw after fire and subsequent thaw settlement.

Capacity of Landsat products for quantifying fire-induced thaw settlement
This study shows there is a strong relationship between fire-induced thaw settlement and Landsat observations. A combination of Landsat reflectance, NDVI, and burn severity dNBR could explain ∼65% variance of lidar-detected thaw settlement. The fire severity dNBR was the most critical factor in controlling the settlement based on the function of variable importance to measure the contribution of each variable in the RF algorithm, similar to what was observed by Jones et al (2015). Landsat spectral reflectance band 2 (blue) was the second important factor in the estimation, because it is valuable for distinguishing soil from vegetation (USGS 2022). The addition of band 2 was useful because of the large exposure of bare lands, and expansion of deciduous vegetation after fire. NDVI was the third important variable in the estimation. Analyses of time-series NDVI after fire, as well as other spectral indices such as NBR, can indicate the recovery of greenness and vegetation regrowth (Pérez-Cabello et al 2021). Thus, inclusion of NDVI was useful in the estimation. However, an addition of time series NDVI was not beneficial. We observed a consistent increase of NDVI across time (2013-2021) within each fire scar. This trend did not contribute the estimation of thaw settlement. Including one or two NDVIs in the model was adequate to characterize the response of plants across space and time, and its covariation with fire-induced thaw settlement. Inclusion of band 5 (near infrared) and band 7 (shortwave infrared) also improved the estimation in the ANN model but was not valuable in the RF model. To be consistent, we included both bands in the modeling and final estimation.

Object-based machine learning ensemble estimation
Application of machine learning for fire-induced thaw settlement modeling was valuable. Three machine learning algorithms ANN, SVM and RF achieved a comparable performance in the model in terms of the matched dataset, but the estimations for mapping were different. These differences were expected due to differences in the architecture of the algorithms. ANN models data as a structure of neurons and synapses of human brains. SVM looks for an optimal hyperplane to minimize training errors, whereas RF searches for optimal decision trees to generalize a model. Due to the weaknesses and strengths of each algorithm, an EA of multiple model outputs can make the estimation more robust than the application of an individual model alone. This is especially critical for regions without any field or lidar measurements but with a high diversity in prediction from models. Furthermore, an application of ensemble estimation generated an uncertainty map ( figure 5(b)) to complement traditional performance metrics. This map reveals the uncertainty based on model analyses and identify regions with varying confidence in prediction. This type of map can guide field or lidar data collection for further validations, preferably over regions where large differences in model estimations were obtained.
The object-based modeling and mapping was beneficial. Lidar DEMs often have a high spatial resolution (<5 m) while Landsat has a moderate spatial resolution (30 m). Resampling lidar DEMs into 30 m to match Landsat would bring uncertainties from the resampling algorithms though this resampling may reduce some uncertainties from lidar measurements. Matching lidar data and Landsat at the object level was more reasonable by segmenting the Landsat imagery into relatively homogeneous patches and calculating the mean subsidence within each patch detected by repeat lidar to develop the model and estimate subsidence. In addition, application of the mean lidar-detected thaw settlement for a patch could filter out a portion of errors from lidar measurements. We did not examine the pixel-based matching and modeling in this study, but detailed comparison between pixel-based and object-based modeling and mapping for estimating wetland soil properties was provided in Zhang et al (2019).

Limitations and sources of error
Constraints in applied datasets and data processing and analyses were identified as two major sources of limitations and errors in this study. First, there was a time gap between the occurrence of fire and lidar data collection. The overlapped area between repeat lidar and the latest fire was the 2010 fire where the first lidar was collected in 2014 (4 year time gap). Field data analyses between 2011 and 2014 in the forests burned in 2010 showed that the subsidence was up to 50 cm . This suggests that an addition of up to ∼50 cm should be added to the lidar-detected subsidence across the burned forests, or the final projection from our model. Our estimated thaw settlement across the 2010 fire was ∼15.5 cm, indicating a decreasing rate of thaw settlement during 2014-2020 with the recovery of vegetation. The total fire-induced thaw settlement for the burned forests could be up to ∼60-70 cm if lidar data were collected in 2011. Field measurements revealed an average of ground elevation change of ∼27 cm along a transect in the 2010 fire scar (Brown et al 2015), suggesting subsidence was smaller for other plant communities. This was consistent with the lower subsidence estimated in our study.
Second, the repeat lidar measurements in the 2010 fire scar had a small coverage (∼10 km 2 , figure 4) in terms the scars of six fires (∼1250 km 2 ) to be mapped. Evergreen forest, deciduous forest, shrubland and fen were present in the lidar coverage with varying burn severity, however, only limited training samples were obtained, especially for deciduous forest (∼0.06 km 2 ). Repeat lidar or field measurements across all fire scars using a sampling scheme would generate a better estimation but in practice collecting such type of data is challenging or costly, especially in a remote area like Tanana Flats. Limited lidar data coverage was identified as a major error source in the estimation. The in-orbit Ice, Cloud, and land Elevation Satellite 2 (ICESat-2) has potential for large scale elevation change analyses with a revisit of 91 d (Michaelides et al 2021), but it has a coarse spatial resolution, worse precision, as well as other issues compared to airborne repeat lidar. An integrated application of airborne lidar, ICESat-2, InSAR and Landsat data products may improve the thaw settlement estimation given the complementary features of each sensor system. This was identified as the future research need.
The third constraint was from Landsat products. There might be more uncertainties using Landsat 8 for predicting the thaw settlement for the 2001 fire due to the large time gap between this fire and Landsat 8 products. Ideally, for the 2001 estimation, Landsat products within 4 years following the fire should be used. However, direct application of Landsat 5 products for the 2001 fire generated a poor pattern (results were not shown) due to the difference of reflectance and NDVI across sensors (Roy et al 2016, Berner et al 2020. Cross-calibration models have been developed to address this issue, but it involves more data preprocessing. Snow and cloud cover is another issue using Landsat products as well as other spaceborne optical sensors. For interior Alaska, snow free imagery is limited to a time window between May and September, and the need for cloud free conditions further constrain data availability for such applications. The fourth constraint came from the incompleteness of predictors in the machine learning model in relation to the factors affecting thaw settlement. Despite the complicated biophysical interactions that affect permafrost thaw and surface settlement, Landsat products explained 65% of the variance in lidardetected thaw settlement but the models did not capture all drivers controlling thaw subsidence. Time since disturbance is particularly important because thaw is more rapid immediately after disturbance, slows with time, and can continue over many decades (Burn 1998). Inclusion of soil variables, particularly ground ice contents, or others might increase the estimation, but such types of data products with comparable resolution to Landsat are often not available.
Data processing and analysis is the last major source of error. Though the developed object-based machine learning ensemble approach was encouraging for thaw settlement estimation, uncertainties from major processing steps like image segmentation and parameters used in machine learning were inevitable. An evaluation of the selected processing techniques for estimating settlement was beyond this study but was identified as an area of future research need. An independent subsidence dataset such as InSAR or ICESat-2 estimation was also needed to further evaluate the Landsat estimated settlement. The upcoming NASA-Indian Space Research Organization Synthetic Aperture Radar mission to be launched in 2024 would offer another opportunity to validate the settlement estimation at a broad coverage.

Summary and conclusions
We systematically evaluated the effects of six large fires on the ice-rich Tanana Flats lowland in interior Alaska and for the first time estimated and mapped fire-influenced thaw settlement over ∼1250 km 2 . In total the six fires resulted in a loss of ∼400 km 2 evergreen forest during 2000-2014 among ∼590 km 2 fire-influenced forests with varying degree of burn severity. The fires provided favorable conditions for shrub-fen development, resulting in a comparable post-fire coverage of shrubland and evergreen forest and increasing encroachment of shrubland to areas with sparse vegetation. The regrowth of forests was not observed after 13 years of the oldest fire 2001 based on Landsat observations. Despite of uncertainties in the timing of lidar data acquisition following a wildfire, we generated a reasonable fire-induced thaw settlement pattern which was generally consistent with the burn severity pattern. Our study indicated that linking airborne repeat lidar with Landsat products was an encouraging tool for large scale quantification of fire-induced thaw settlement. While there are no constraints to applying this methodology to other fire-influenced permafrost terrains, direct application of our developed model to other boreal or tundra areas is limited because the model was calibrated by site-specific lidar measurements, and ice contents and thaw settlement potential is highly variable among terrain types. Because airborne lidar measurements are increasingly being made across northern permafrost regions (e.g. Jones et al 2015, Rodenhizer et al 2022, our method is a valuable means of projecting elevation change across entire fire scars within uniform permafrost-affected landscapes by using data-driven machine learning techniques. Our approach is complementary to the application of InSAR (e.g. Yanagiya and Furuya 2020) or lidar technique alone (e.g. Jones et al 2015) for thaw subsidence estimations. Its ability to combine active airborne lidar sensors with passive spaceborne optical sensors for a large-scale quantification of fire-induced thaw settlement is unique. It is anticipated that this work can advance the multi-sensor fusion techniques for permafrost thawing studies under the scenario of climate warming and increasing fire activities in cold regions.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.