Estimation of boreal forest biomass from ICESat-2 data using hierarchical hybrid inference

,


Introduction
Satellite lidars have potential to improve the accuracy of global above-ground biomass (AGB) surveys by providing information on the forest height (Duncanson et al., 2019).Research into using spaceborne lidar data in AGB estimation started with the first ICESat mission (e.g.Lefsky et al., 2005;Boudreau et al., 2008;Nelson et al., 2017).ICESat has since been followed by GEDI, a dedicated forest observation mission on the International Space Station (Dubayah et al., 2020), and ICESat-2, launched in 2018 (Markus et al., 2017).
The ICESat-2 carries the ATLAS (Advanced Topographic Laser Altimeter System) instrument which is a profiling photon counting lidar operating at green wavelength (532 nm).ICESat-2 data consist of parallel ground tracks produced by the three pairs of strong and weak beams, which have a power ratio of 4:1 (Neumann et al., 2019).While primarily designed for snow and ice monitoring, ICESat-2 has the advantage of providing a good coverage of the boreal zone, the northern parts of which are not covered by GEDI.
The current spaceborne lidar sensors have a limitation that the measurements consist of either discrete footprints or, in the case of ICESat-2, discrete height profiles.As it is unlikely that the discrete footprints or profiles overlap with the ground sites with AGB field measurements, the construction of regression models that link AGB with the satellite measurements is more complicated than, for example, in optical satellite imagery provided as spatially continuous data.The current practice is to use airborne laser scanning (ALS) to bridge the gap between reference field measurements and the satellite measurements (e.g.Wulder et al., 2012), either by constructing an intermediate proxy model (e.g.Margolis et al., 2015;Holm et al., 2017;Narine et al., 2020;Varvia et al., 2022;Guerra-Hernández et al., 2022) or by simulating satellite lidar measurements from the ALS data (Narine et al., 2019;Duncanson et al., 2022).It is also possible to measure field plots directly at the space lidar footprint or track locations (Lefsky et al., 2005;Nelson et al., 2009;Song et al., 2022), although in practice it is often not feasible due to e.g.poor accessibility of the footprint locations.
In addition to producing estimates of AGB or its areal density (AGBD), it is important to quantify the uncertainty of the estimated values, for example, by variance estimation.However, estimating the variance of the estimated AGB is complicated by the hierarchical modeling approach and the methodology has matured only relatively recently.As the response variables of the spaceborne lidar model are not coming from field measurements, but are predictions from the linking proxy model, they have an associated uncertainty.In the earliest studies, such as Nelson et al. (2009), the uncertainty from this model hierarchy was omitted due to intractability.
The first study to account for multiple modeling steps in space lidar application was Holm et al. (2017), which used the so-called hybrid inference approach (Ståhl et al., 2011) and reformulation of the hierarchical modeling to a more tractable form.In hybrid inference, the model predictions at the satellite lidar footprint or track level are treated in a similar way as observations in traditional design-based inference, such as measured sample plots in a forest inventory.As the observations are model predictions with associated uncertainty, a hybrid estimate combines the variance coming from the sample design with the propagated variances coming from model uncertainties.Hybrid inference has been used in several previous space lidar studies (e.g.Healey et al., 2012;Neigh et al., 2013;Margolis et al., 2015;Nelson et al., 2017;Patterson et al., 2019).
A parallel development to variance estimation in the case of hierarchical modeling was the so-called hierarchical model-based (HMB) approach (Saarela et al., 2016(Saarela et al., , 2020)), which was originally applied to a scenario where a proxy ALS model is used to link field plot data to wall-to-wall satellite imagery.HMB has since been applied also to satellite lidar applications, such as GEDI (Saarela et al., 2018(Saarela et al., , 2022)).
The aim of this study is to evaluate the estimation of the mean AGBD and its variance using a hierarchical modeling procedure with ICESat-2, Sentinel-2, ALS and field data.We propose a hierarchical hybrid inference approach that combines error propagation through the model hierarchy in HMB with the hybrid inference approach (Saarela et al., 2023).The estimation approach is similar to what has been previously done with GEDI data (e.g.Patterson et al., 2019;Dubayah et al., 2022), but has been modified to work with ICESat-2 and includes the uncertainty of the allometric models used to produce field-plot AGBD values.To the authors' knowledge, this is the first study where hybrid inference is used with ICESat-2 data.

Study sites and field measurements
The two adjacent study areas are located near Valtimo (N 63 • 46 ′ E 28 • 13 ′ ) and Nurmes, Finland (N 63 1) Both consist of similar boreal forests, dominated by Scots pine (Pinus sylvestris L.), with a minority of Norway spruce (Picea abies (L.) Karst) and birches (Betula spp.).The Valtimo site is approximately 60 × 50 km in size and the Nurmes site is 50 × 50 km.We used sample plots measured by The Finnish Forest Centre as a part of ALS-based forest management inventories in the summers 2019 and 2020 in Valtimo and Nurmes, respectively.
The field data included circular plots with radius of either 5.64 m, 9.00 m, or 12.62 m depending on the forest maturity.In total, there are 797 field plots in the Valtimo area and 891 plots in the Nurmes area.At each plot, diameter at breast height (DBH) was measured for each tree with DBH ≥ 5 cm.The height of a sample tree of each species was measured on each plot and a calibrated height model (Eerikäinen, 2009) was used to predict the height for the rest of the trees.A summary of the field-plot data is presented in Table 1.

ALS data
The ALS data in the Valtimo area were collected between June 7th and July 9th 2019 using a Leica ALS 80 HP scanner.The flying altitude was 1700 m above ground level, resulting in a nominal pulse density of 5 pts/m 2 and a footprint diameter of 39 cm.In the Valtimo area, the publicly available data were used, which were resampled from 5 pts/m 2 to 0.5 pts/m 2 before distribution.The ALS data in the Nurmes area were collected between June 17th and June 22nd 2020 using a Riegl VQ-1560i scanner at a flying altitude of 2100 m.In the Nurmes area the original point cloud with >5 pts/m 2 was used.
The ALS processing was done identically for both sites.The ALS echoes were height normalized with respect to ground using LAStools (Isenburg, 2020).For each plot, canopy metrics were computed using "first-of-many" plus "only" echoes, and "last-of-many" plus "only" echoes, producing two sets of metrics.The metrics included mean and maximum heights, standard deviation of heights, height percentiles p 5 , p 10 , p 20 , . . ., p 90 , p 95 , p 99 , canopy density percentiles b 5 , b 10 , b 20 , . . ., b 90 , b 95 , canopy cover, and the mean and standard deviation of intensities.

Sentinel-2 data
For the Valtimo site, a cloud-free Sentinel-2 image was available from June 14th 2019.For the Nurmes site, a cloud-free Sentinel-2 composite was constructed from images captured on June 16th, July 16th, and July 18th 2020.Atmospheric correction of the Sentinel-2 images was done using Sen2Cor (Main-Knorn et al., 2017), after which the atmospheric bands (bands 1, 9, and 10) were omitted.The images were then calibrated using histogram matching before compositing.The pixel values were used as predictors in the proxy AGBD models, in addition to several common spectral vegetation indices calculated from the images.

ICESat-2 data
The ATL03 (Neumann et al., 2021) and ATL08 (Neuenschwander et al., 2021) data for the Valtimo site covered the period from October 2018 to December 2019.For the Nurmes site, ATL03 and ATL08 data captured during the year 2020 were used.Version 4 of the data products were used for both sites.
The ICESat-2 data were processed following Varvia et al. (2022).First, the ICESat-2 tracks were split into 90 m × 15 m segments, centered on the locations of the ATL08 product.Each 90 m segment was further divided into six 15 m × 15 m subcells, which were used for the prediction of proxy AGBD on the 90 m track segments.The ATL08 individual photon classifications were then matched with the photon locations from ATL03 product.Photons classified as noise were discarded.
The classified photons were then clipped to the 90 m track segments.Using the photons classified as ground, the above-ground height was computed for each photon.Several height metrics were then calculated, and similar to the ALS metrics detailed above, they included the number of photons (canopy only (n c ) and total (n all )), mean photon height, standard deviation, maximum, height percentiles p 5 , p 10 , p 20 , . . ., p 90 , p 95 , p 99 , canopy density percentiles Poor quality segments were omitted if they did not meet the criteria of at least 100 classified photons and a fraction of high confidence photons (signal_conf_ph in ATL03) being at least 60%.In addition, a polygonal forest mask produced by the Finnish Forest Centre (Finnish Forest Centre, 2021) was used to discard ICESat-2 segments in certain non-forested areas, such as agricultural fields, water, roads, and built-up areas.We used only strong beam data captured outside the snowy season during daytime.While night data would be preferable due to the absence of solar noise, not enough snowless strong beam night data were available from the Nurmes site (only 60 segments).The use of daytime data thus represents a compromise between expected performance and data availability.For the Valtimo area there was a total of 1721 valid 90 m segments and 5760 segments in Nurmes (Figure 2).

Methods
The process of estimating AGBD using ICESat-2 data follows hierarchical modeling with three steps: 1) deriving field-plot AGBD from allometric AGB models and measured trees, 2) model based on ALS and Sentinel-2 data for predicting proxy AGBD on ICESat-2 tracks, and 3) ICESat-2 model for predicting AGBD from ICESat-2 height metrics.The model chain is trained using the data from Valtimo area and the fitted ICESat-2 model is then used to predict AGBD with the Nurmes data.
We used the species-specific biomass models by Repola (2008Repola ( , 2009) ) to predict allometric AGB for each measured tree on the field plots using the calipered DBH and predicted tree height.The field-plot AGBD was then calculated by summing up the tree-level AGBs of each plot and scaled to per-hectare level (Appendix A.1) .
A quadratic model with four variables was then fitted between the field-plot AGBD, and ALS and Sentinel-2 metrics using generalized nonlinear least squares.The four metrics in the model were chosen using a simulated annealing based variable selection routine (Packalen et al., 2012).The ALS and Sentinel-2 model was then used to predict proxy AGBD on the 15 m × 15 m subcells constructed on the ICESat-2 tracks.The subcell AGBDs were averaged to calculate proxy AGBD values for the 90 m track segments used in the ICESat-2 modeling (Appendix A.2).In the final modeling step, a quadratic model with four variables was fitted between the 90 m proxy AGBD and ICESat-2 metrics; the four metrics were again chosen using simulated annealing.This model was then used to predict AGBD on the Nurmes ICESat-2 track segments.(Appendix A.3) The AGBD predictions on the Nurmes ICESat-2 track segments were used to calculate a hierarchical hybrid estimate for the mean and variance of AGBD in the Nurmes area.The predictions on the track segments were modeled as a clustered random sample (Ståhl et al., 2011), where each ICESat-2 track was a cluster, similar to what was previously done by Dubayah et al. (2022) with GEDI data.As the tracks can cross and overlap, the design is considered as sampling with replacement.The estimate for mean AGBD is μI2 = where AGBD (i) I2,sum is the summed up predicted AGBD of the i'th ICESat-2 track, n (i) seg is the number of 90 m segments in the i'th track, and n track is the number of tracks.
The hybrid estimator for AGBD variance consists of two parts.First is the design-based sampling variability under the assumed design where nseg is the average number of segments per track.The second part is the model-based uncertainty of the predicted AGBDs, which is where n tot is the total number of ICESat-2 segments and 1 is a vector of ones.The covariance matrix C I2 is the hierachical model-based covariance of the ICESat-2 predictions.For the derivation of C I2 , see Appendix.Schematic of the error progation in the model-based part is shown in Figure 3. Finally, the estimated variance of the mean AGBD is Var Additional details and derivation of the hierarchical hybrid approach are presented in the companion article by Saarela et al. (2023).

Reference estimate
Hierarchical model-based (HMB) estimate calculated using an independent local data set was used as a reference estimate of mean AGBD at the Nurmes validation area (see Appendix A.4).As in the ICESat-2 workflow, a quadratic model with four variables was fitted between the field-plot AGBD, and ALS and Sentinel-2 metrics in Nurmes.The model was then used to produce a 15 m × 15 m wall-to-wall AGBD raster over the area.The variance estimation procedure followed Saarela et al. (2020) and included the uncertainty from the allometric models and the ALS and Sentinel-2 model fitted using the Nurmes field plots.
In addition to comparing the estimated mean AGBD and its variance, ICESat-2 model performance was evaluated using root mean square deviation (RMSD) (5) and mean difference (MD) (6).The ICESat-2 predictions in the Nurmes area were compared to AGBD predicted on the ICESat-2 segment locations using the local ALS and Sentinel-2 model.

AGBD models
The ALS and Sentinel-2 AGBD models and the ICESat-2 AGBD had a similar quadratic form with four predictors chosen using a simulated annealing.The fitted ALS and Sentinel-2 proxy AGBD model in the Valtimo area was: AGBD proxy =(8.52 + 0.44 avg f + 0.34 avg l − 0.0013 NIR − 0.0012 SWIR1) 2 . (7) The model was fitted using generalized nonlinear least squares (Pinheiro et al., 2021), with an estimated residual variance function of a constant plus power structure Var(e f ) = 0.61 2 6.23 + AGBD 0.69 proxy 2 . (8) Similarly, the ALS and Sentinel-2 model for the Nurmes validation area, which was used to obtain the reference HMB estimate, was  The ICESat-2 AGBD model ( 11) was then applied to predict the AGBD for the ICESat-2 segments in the Nurmes validation area.The accuracy of these predictions was first compared with the predictions obtained with the local ALS and Sentinel-2 model ( 9) at the Nurmes track locations.The ICESat-2 predictions had an RMSD of 21.3 Mg/ha (30.8%) and an MD of -3.3 Mg/ha (-4.8%).Scatter plot of the predictions is shown in Figure 5. Summary of the training proxy AGBD and ICESat-2 model predictions in the Valtimo area, Nurmes area, and of the local ALS and Sentinel-2 model predictions are shown in Table 2. Histograms of the proxy AGBD and ICESat-2 predictions are shown in Figure 6.

Variance estimation
The mean AGBD and its standard error over the forested parts of the Nurmes area was 65.7±1.9Mg/ha when using ICESat-2 data and hierarchical hybrid estimation (Table 3).The reference hierarchical model-based estimate using local ALS and Sentinel-2 data was 63.9 ± 0.6 Mg/ha.The relative standard errors were 2.9% and 1.0%, respectively.The contributions of the model hierarchy levels to the standard error in the hierarchical hybrid estimation are shown in Table 4.The topmost row shows the standard error using the full model hierarchy (1.91 Mg/ha).In the next row, the allometric part is removed, with the assumption that the allometric AGB values predicted are accurate, this decreases standard error slightly to 1.87 Mg/ha, which is 97.9% of the total modeled standard error.Moving down, the uncertainty of the proxy AGBD model is removed; this assumes that the proxy AGBD derived from the ALS and Sentinel-2 model is accurate.This further slightly decreases the standard error to 1.81 Mg/ha (94.9% of the total).Finally, removing the model-based uncertainty altogether, we arrive at the standard error from sample design only, which is 1.61 Mg/ha or 84.3% of the total.By calculating the differences, the sources of uncertainty in order of decreasing magnitude are: sample design (1.61 Mg/ha, 84.3%), ICESat-2 model (0.20 Mg/ha, 10.7 %), proxy AGBD model (0.06 Mg/ha, 2.9%), and allometry (0.04 Mg/ha, 2.1%).
Table 4: Contribution of the uncertainty components.For example, the field "Design, I2 model" includes only the uncertainty from the limited number of ICESat-2 segments in Nurmes and the uncertainty of the ICESat-2 model parameters when the proxy AGBD values are assumed to be accurate.

Modeled uncertainties
Standard

Discussion
The average AGBD estimated using the ICESat-2 model transferred from a nearby area (65.7 ± 1.9 Mg/ha) was close to the value produced using the local ALS and Sentinel model (63.9 ± 0.6 Mg/ha).The reference estimate is within the 95% confidence interval of the ICESat-2 estimate.However, the comparison of ICESat-2 and reference model predictions at the track segment level revealed that the ICESat-2 model had a tendency to underestimate AGBD (MD of -3.3 Mg/ha).If the estimated MD was subtracted from the ICESat-2 model predictions, the average AGBD estimate would rise to 69.0 Mg/ha.In this case, the difference to the reference estimate would be 5.1 Mg/ha.Considering that the ICESat-2 model was transferred from adjacent area and previous year, this is still a promising result.
One reason for the observed systematic error in ICESat-2 AGBD predictions was that the ICESat-2 training data from the Valtimo area have smaller AGBD values (Table 2) than the Nurmes target population, on average by 23 Mg/ha.If the estimation process was scaled up, to e.g.country level, the problem could likely be mitigated by using more training data from a larger number of locations which would better capture the full variation in the population.
The estimated relative standard error (3.5%) of hybrid estimation is small, especially considering the complex model chain and relatively poor accuracy of the ICESat-2 model.This is partly due to using variance as an uncertainty measure: variance corresponds to the variation of the estimate around the expected value of the estimate.As seen in Table 2 and Figure 6, the ICESat-2 model generally predicted biomass values that have less variation, which then also reduced the variation in the estimated mean AGBD.This seems to be an inherent property of long model chains, where each modeling step further reduces the variation (Saarela et al., 2023).This effect can be directly seen when comparing the reference HMB standard error (0.6 Mg/ha) to the model-based part in the hierarchical hybrid estimate (0.3 Mg/ha).An underlying assumption in variance as a uncertainty measure is also that the model is approximately unbiased in the target population.Based on the observed MD of the ICESat-2 predictions, the assumption likely did not hold.
Mean square error (MSE)1 of the estimatorwould likely be a better uncertainty measure by aiming to model the discrepancy between the estimate and the true value and thus accommodate for systematic errors.However, deriving a MSE estimator for a complex model hierarchy appears to be currently intractable, for example, due to the problem of cross-correlation of the spatial autocorrelation effects at different modeling steps.Further complicating the situation, addition of spatial correlation effects can have a relatively small effect on the quantified uncertainty in some cases.For example in Fortin et al. (2022) it slightly reduced the uncertainty compared to using only variance.
In this study, predicted tree heights were used in the allometric models due to unavailability of measured tree heights for all trees.This is a source of uncertainty that was omitted in the study, as it would add a further, complicating modeling step.Propagating the uncertainty from the estimated tree heights would likely require refitting of the mixedeffect models presented in Eerikäinen (2009).
Interpretation of the contribution of the model components to the resulting standard error (Table 4) is complicated by two factors.First is the decrease in variation described earlier.In addition to the modeling steps, the averaging of 15 m proxy AGBDs to the 90 m segments used in the ICESat-2 model further decreases the variation coming from allometry and the ALS and Sentinel-2 model.The second factor is a problem in calculating the contributions.Previously, Saarela et al. (2020) used fraction of the total variance to evaluate component contributions.We opted here to evaluate contribution using standard error by removing modeling steps one at a time and calculating the differences, primarily since standard error is in the same units as the estimate (Mg/ha) and thus could be easier to interpret.Both approaches have the limitation that the contribution of the lower modeling steps in the hierarchy cannot be evaluated directly, as they are affected by the propagation through the model chain.However, keeping these limitations in mind, the contribution of the model components to the total standard error is logical.The largest model-based contributor is the ICESat-2 model (10.7%), which is also the least accurate model when measured by goodness of fit.The order of the contributions of allometry and ALS and Sentinel-2 model are also in line with their respective performance.
The reference HMB estimate had a considerably smaller relative standard error (1.0%) compared to that reported by Saarela et al. (2020) (7.5%).The discrepancy seems to be mostly explained by differences in the data.The current study had a larger number of field plots, and Sentinel-2 data was used in addition to ALS, which resulted in a better performing AGBD model.The smaller positioning error of sample plots in this study (< 1 meter) and the placement of field plots within the forest stands (never on stand borders) may also have contributed to the reference model accuracy.In the allometric modeling, Saarela et al. (2020) used separate models for trees with only measured diameter and for trees with measured diameter and height.We used models with measured diameter and estimated height for all trees.Comparatively small uncertainties in AGBD estimation have also been reported earlier by Esteban et al. (2019) (1.8%), although the study did not include allometric contribution.

Conclusions
In this study, we evaluated estimation of average AGBD using ICESat-2 data and hierarchical modeling.Uncertainty of the estimated AGBD was quantified using hierarchical hybrid inference, which combines the error propagation through the multiple modeling steps with the variance coming from the sparse spatial coverage of the ICESat-2 data.
The ICESat-2 based estimate for the Nurmes validation area was 65.7 ± 1.9 Mg/ha compared to the local reference estimate of 63.9 ± 0.6 Mg/ha.The reference estimate was within the 95% confidence interval of the ICESat-2 based estimate.However, the interpretation was complicated by the observed presence of systematic error in the ICESat-2 AGBD predictions (-3.3 Mg/ha) at the validation area.
While the small estimated standard error should not be interpreted in the way that the proposed ICESat-2 estimate is highly reliable, the results support the use of ICESat-2 data for AGBD estimation.In this study, the ICESat-2 model was transferred from a different year and an adjacent area with relatively good results.Further studies should consider similar estimations for larger areas where also the structure of the forest can change.
For variance estimation, we need to calculate the covariance matrix of AGBD plot .Let us first combine the speciesspecific biomass models into a single model using binary species indicator variables s pi , s sp , and s de for pine, spruce and deciduous trees, respectively: AGB( α, d, h, s) = s pi AGB pi ( αpi , d, h) + s sp AGB sp ( αsp , d, h) + s de AGB de ( αde , d, h), (A.1) where α = αpi , αsp , αde T , in which e.g.αpi is the fitted parameters for the biomass model of pine, d is diameter at breast height, h is tree height, and s = s pi , s sp , s de T .We then use Taylor approximation to calculate the covariance matrix of the tree-level AGB predictions: where C α = diag C α pi , C α sp , C α de is a block-diagonal matrix consisting of the estimated covariance matrices of the species-specific biomass models reported in Ståhl et al. (2014).The matrix J tree is the Jacobian matrix of the combined model ( 7), which is formed from the partial derivatives where d i , h i , and s i are the DBH, height and the species of the i'th tree.
To produce AGBD plot , tree-level AGBs of each plot are aggregated and then divided by the plot area.These can be written as matrix operations and thus the covariance of AGBD plot is where A is a diagonal matrix, where A ii is the area in hectares of the i'th plot, and U is an aggregation matrix, for which 1, j'th tree belongs to i'th plot 0, otherwise (A.5)

Appendix A.2. Proxy biomass model
The field plot biomass values AGBD plot and metrics from ALS and Sentinel-2 were used to fit a quadratic proxy biomass model with four predictor variables: (A.6) where β i are the model coefficients, x (i) the predictors and e f ∼ N 0, Σ f is an additive error term.
The covariance matrix C β of the model parameters β now depends on two sources of uncertainty: 1) the sample used to fit the model ( 7), and 2) uncertainty of AGBD plot used to fit the model (7).Following Saarela et al. (2020), we use the law of total variance and Taylor approximation of the nonlinear model ( 7) to write: where J f is the Jacobian matrix of the model f (β, x) with respect to β, which is formed from the partial derivatives .8)where x (i) is the predictor vector of the i'th field plot.

Figure 2 :
Figure 2: The locations of 90 m ICESat-2 segments in Valtimo (left, green) and Nurmes (right, blue) superimposed on a canopy height map (Finnish Forest Centre).Field plot locations shown with circles.

Figure 3 :
Figure 3: Schematic of the error propagation.

Figure 4 :
Figure 4: Scatter density plots and absolute residuals of the fitted models.Red line in the residual plots shows the standard deviation from the residual variance model used in model fitting.

Figure 5 :
Figure 5: A comparison of AGBDs predicted by the local ALS and Sentinel-2 model vs. the ICESat-2 model fitted in Valtimo training area for the Nurmes validation area.

Figure 6 :
Figure 6: Histograms of the ICESat-2 model predicted AGBD and the AGBD predicted by the local ALS and Sentinel-2 models in the Valtimo and Nurmes areas.

Figure 7 :
Figure 7: 15 m resolution AGBD [Mg/ha] map of the Nurmes area produced using the reference ALS and Sentinel-2 model.ICESat-2 tracks shown in black.

Table 1 :
Summary of the field plot data.Height is the plot average.SD is standard deviation.

Table 2 :
Statistical summaries of the estimated AGBD values at 90 m ICESat-2 track segment level.Units are Mg/ha.

Table 3 :
Estimated average AGBD in the Nurmes area, its standard error and relative standard error for the hybrid hierarchical estimation using ICESat-2 and the reference hierarchical model-based estimate using local ALS and Sentinel-2 model.