Estimating forest aboveground biomass using temporal features extracted from multiple satellite data products and ensemble machine learning algorithm

Abstract
Many studies have attempted to estimate forest aboveground biomass (AGB) from satellite data accurately. Temporal information may be beneficial to AGB estimation but remains underexplored. Thus, this paper aims to investigate whether and how temporal features extracted from multiple satellite-derived data products can improve prediction accuracy. To this end, we develop four methods to exploit the temporal features of moderate resolution imaging spectroradiometer (MODIS) data products: the method that uses all annual features (AAF), the method that selects essential features based on the Spearman correlation coefficient (SCC) criterion, the method that employs the seasonal average and principal component analysis (PCA) components (SAP), and the method that includes phenological characteristic parameters (PCP) as the predictors of forest AGB. Lidar-derived forest AGB in California serves as the reference AGB data, and the XGBoost ensemble algorithm is utilized to model forest AGB with temporal features of MODIS data products. The results demonstrate that the AAF-based features lead to the most accurate AGB prediction, whereas using information extracted by SAP and PCP gives rise to less accurate results. Annual MODIS surface reflectance data combined with forest canopy height can provide the AGB estimates, with an average R-squared (R²) of 0.58 and root-mean-squared error (RMSE) of 147.58 Mg/ha. The results of this study highlight the necessity of utilizing annual time-series data, particularly the annual surface reflectance data, for AGB prediction.


Introduction
Forest aboveground biomass (AGB) plays a vital role in the global carbon cycle and climate change mitigation and has been identified as an essential climate variable by the Global Climate Observing System (GCOS) (Le Toan et al. 2011; Dhillon and Von Wuehlisch 2013; Mitchard 2018). Many recent studies have focused on forest AGB estimation and mapping by combining optical imagery, synthetic aperture radar (SAR), and lidar (Amini and Sumantyo 2009; Song 2012; Ahmed et al. 2013; Sinha et al. 2015; Moradi et al. 2022; Nuthammachot et al. 2022). Some forest AGB maps have thus been generated on regional or global scales (Saatchi et al. 2011; Baccini et al. 2012; Zhang et al. 2019). However, evaluation and comparison results indicate significant uncertainties in these biomass maps. Different studies have reported divergent magnitudes and spatial distributions of forest AGB (Mitchard et al. 2013; Zhang et al. 2019), which hinders further application of these maps and underscores the need to improve AGB estimation.
Some studies have extracted spatial information (e.g. texture metrics) from remote sensing images as predictors of forest AGB to improve prediction accuracy (Sarker et al. 2013; Li et al. 2020). However, the primary data sets have been acquired within a specific period, and the saturation problem at lower AGB values and the poor accuracy of AGB estimates leave much room for improvement (Zhu and Liu 2015). Other studies have suggested that time-series data can provide more information for AGB prediction than single-temporal data and can enhance the accuracy of forest AGB estimation (Le Maire et al. 2011; Kamir et al. 2020; Nguyen et al. 2020). For example, Zhu and Liu (2015) found that the Landsat normalized difference vegetation index (NDVI) in fall (e.g. September 24) had a stronger relationship with forest AGB than the commonly used peak season NDVI (e.g. June 20), with a correlation coefficient of 0.638 compared to 0.457. They also reported that using NDVI based on seasonal time series resulted in an accurate estimation of forest AGB. Zhang et al. (2014b) compared the performance of moderate resolution imaging spectroradiometer (MODIS) time-series data from May to October with monthly composite multiangle imaging spectroradiometer (MISR) data in mapping forest AGB. They showed that the MODIS time-series data produced results that were more accurate than, or comparable to, those from the MISR data. Hayashi et al. (2019) found that using phased-array L-band SAR-2 (PALSAR-2) time-series data could alleviate the signal saturation issue in high-biomass regions, with an estimated AGB of 280 Mg/ha. In addition, some studies have included forest metrics extracted from remotely sensed time-series data as predictor variables (e.g. disturbance and recovery metrics and tree age), significantly improving the results (Liu et al. 2014; Pflugmacher et al. 2014; White et al. 2017; Nguyen et al. 2020).
Although the significance of utilizing temporal information has been noticed, published studies have mainly concentrated on the contribution of a few temporal features extracted from satellite data. Many land surface parameters, such as gross primary productivity (GPP), net primary production (NPP), and leaf area index (LAI), have been demonstrated to be correlated with forest AGB (Keeling and Phillips 2007; Malhi et al. 2011; Zhang et al. 2014a; Machwitz et al. 2015). However, the role of these time-series data in AGB estimation and how to use the temporal information provided by satellite data products to enhance AGB prediction accuracy remain underexplored. This study aims to explore whether and how forest biomass estimation can be advanced by exploiting temporal features from multiple satellite-derived data products. To this end, a variety of satellite-derived data products, such as MODIS LAI, GPP, and surface reflectance (SR), are employed. Four methods are developed to extract temporal features from each satellite-derived data product: the method that uses all annual features (AAF) without any transformation, the method that selects important features from annual data based on the Spearman correlation coefficient (SCC) criterion, the method that utilizes the seasonal average and principal component analysis (PCA) components (SAP), and the method that includes phenological characteristic parameters (PCP) as the predictors of forest AGB.
The extreme gradient boosting (XGBoost) ensemble algorithm is utilized to model forest AGB with extracted temporal features based on the four methods. In addition, since the synthetic use of multiple remotely sensed data can improve AGB estimation (Yu et al. 2010; Sun et al. 2011), we combine temporal features extracted from different data products and examine their effects on the accuracy of forest AGB prediction.

Study area
The study area is located in California, one of the world's five largest Mediterranean climate zones (Polade et al. 2017). The annual average precipitation in California is about 600 mm, with an uneven spatial distribution and evident annual variation (Allen and Luptowitz 2017; Gershunov et al. 2019). The surface temperature increased by about 2 °C from 1950 to 1999, and the most significant increase occurred after about 1990 (Duffy et al. 2007). The population quadrupled from 6.9 million in 1940 to over 30 million in 1990 and is projected to increase to 63.3 million in the next 50 years (Cramer 1998; Cramer and Cheney 2000). The main tree species include coast redwood (Sequoia sempervirens), Jeffrey pine (Pinus jeffreyi), giant sequoia (Sequoiadendron giganteum), Monterey cypress (Cupressus macrocarpa), white fir (Abies concolor), and red fir (Abies magnifica) (Minnich et al. 2000; Mcbride and Laćan 2018).
Due to anthropogenic climate change, the risk of severe drought has increased (Mann and Gleick 2015), which further leads to an increase in forest mortality and high risks to large trees (Fettig et al. 2019; Stovall et al. 2019). Previous studies have reported that the density of large trees has decreased across California, and the decline has been more severe in regions experiencing a significant increase in water deficit (Mcintyre et al. 2015). In addition, population growth combined with increased ethnic diversity and economic development in California has posed challenges to environmental management, urging the government to accelerate its efforts to reduce carbon dioxide emissions (Struglia and Winter 2002; Sanstad et al. 2011).
According to the MODIS land cover product, MCD12Q1, and the International Geosphere-Biosphere Program (IGBP) classification scheme, evergreen needleleaf forests (ENF), woody savannas (WSA), and savannas (SAV) are the primary forest types across California. Figure 1 illustrates the spatial distribution of the forest types. The California boundary in the shapefile format was obtained from the website https://earthworks.stanford.edu/catalog/.

Forest AGB data
Forest AGB data were initially derived from field measurements and airborne lidar data (Figure 1) and used to train the AGB model and validate the accuracy of the estimated AGB. Such lidar-derived data are as accurate as field measurements and match remotely sensed data better in spatial resolution than field plots do (Zolkos et al. 2013). Thus, they were ideal for calibrating remotely sensed data and evaluating the estimated AGB.
Two lidar-derived data sets of forest AGB were collected. Xu and Greenberg (2018) generated forest AGB estimates at a 30 m spatial resolution from field measurements and airborne lidar surveys throughout California between 2005 and 2014. The field data used in their study included California-wide forest inventory and analysis (FIA) data and the data set collected by the United States Department of Agriculture (USDA) Forest Service. These AGB estimates were obtained from the distributed active archive center website at https://doi.org/10.3334/ORNLDAAC/1537.
Dubayah et al. (2017) provided AGB estimates at a 30 m spatial resolution for Sonoma County, California, for the nominal year 2013, by combining field data and airborne lidar data collected by the Leica ALS70 sensor. These estimates can be retrieved from https://doi.org/10.3334/ORNLDAAC/1523.
To be consistent with the satellite-derived data described in the following section, we aggregated both 30 m data sets to a 500 m resolution by averaging (Figure 1). The aggregated AGB data were then randomly sampled such that the samples were more than 1 km apart from each other, generating 3671 reference AGB samples used for AGB modeling and accuracy evaluation (Liu et al. 2020). At the 500 m spatial resolution, the reference samples ranged from 0.19 to 1616.32 Mg/ha, with an average of 285.70 Mg/ha and a standard deviation of 229.71 Mg/ha.
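The aggregation and distance-constrained sampling described above can be sketched as follows. This is a minimal illustration rather than the processing chain used in the study: it assumes an integer block factor (the real 30 m to 500 m ratio is not an integer), marks no-data cells with `None`, and uses a greedy random thinning; the function names `block_mean` and `thin_by_distance` are hypothetical.

```python
import math
import random


def block_mean(grid, factor):
    """Average non-overlapping factor x factor blocks, ignoring None (no-data) cells."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows - factor + 1, factor):
        out_row = []
        for c0 in range(0, cols - factor + 1, factor):
            vals = [grid[r][c]
                    for r in range(r0, r0 + factor)
                    for c in range(c0, c0 + factor)
                    if grid[r][c] is not None]
            out_row.append(sum(vals) / len(vals) if vals else None)
        out.append(out_row)
    return out


def thin_by_distance(points, min_dist, seed=0):
    """Visit points in random order; keep one only if it is farther than min_dist from all kept points."""
    order = list(points)
    random.Random(seed).shuffle(order)
    kept = []
    for x, y in order:
        if all(math.hypot(x - kx, y - ky) > min_dist for kx, ky in kept):
            kept.append((x, y))
    return kept


# Toy fine-resolution AGB grid (Mg/ha) with one no-data cell.
fine = [[100.0, 200.0, 50.0, None],
        [300.0, 400.0, 150.0, 100.0],
        [0.0, 0.0, 10.0, 20.0],
        [0.0, 0.0, 30.0, 40.0]]
coarse = block_mean(fine, 2)

# Cell centres of a 6 x 6 grid with 500 m spacing, thinned to be more than 1 km apart.
centres = [(c * 500.0, r * 500.0) for r in range(6) for c in range(6)]
samples = thin_by_distance(centres, 1000.0)
```

The greedy thinning depends on the random visiting order, so the number of retained samples varies with the seed; only the minimum-distance constraint is guaranteed.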

Satellite-derived data products
Due to their high quality and long time series, MODIS data products have been widely used in vegetation and environmental studies (Pastor-Guzman et al. 2018; Notarnicola 2020). We selected the MODIS data in this study because their eight-day temporal resolution provides dense time series. Previous studies have reported correlations of the SR, LAI, the fraction of absorbed photosynthetically active radiation (FAPAR), GPP, and the canopy height (CH) with forest AGB (Muukkonen et al. 2006; Macdonald et al. 2012; Song et al. 2018; Pandey et al. 2019; Armstrong et al. 2020; Knapp et al. 2020). Hence, this study included the MODIS SR (MOD09A1), the LAI/FAPAR (MCD15A2), and GPP (MOD17A2) to estimate forest AGB. All the MODIS data products were downloaded from NASA's Land Processes Distributed Active Archive Center (https://lpdaac.usgs.gov/) and then reprojected from the sinusoidal projection to a geographic coordinate system using the nearest neighbor method. Table 1 tabulates the details about these MODIS products. Each product had an 8-day temporal resolution, and we extracted all 46 features within the year corresponding to the reference AGB samples collected.
In addition to the MODIS data products, the forest canopy height was incorporated as a predictor due to its good correlation with forest AGB (Wang et al. 2013; Knapp et al. 2020). This work selected the global forest canopy height map produced by Simard et al. (2011) from the Geoscience Laser Altimeter System (GLAS) data. We resampled the 1 km data to a 500 m spatial resolution using the nearest neighbor method to be consistent with the other satellite-derived products.
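As a minimal sketch of nearest neighbor resampling from 1 km to 500 m, each coarse cell can simply be copied into a 2 x 2 block of finer cells. The example assumes the two grids are perfectly aligned and that the resolution ratio is an integer; the function name `upsample_nearest` and the toy canopy heights are hypothetical.

```python
def upsample_nearest(grid, factor):
    """Nearest neighbor upsampling: copy each coarse cell into a factor x factor block."""
    out = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        out.extend(list(fine_row) for _ in range(factor))
    return out


# Toy canopy heights (m) on a 1 km grid, resampled to 500 m.
height_1km = [[12, 30],
              [25, 41]]
height_500m = upsample_nearest(height_1km, 2)
```

Because values are copied rather than interpolated, the resampled layer carries no additional spatial detail beyond the original 1 km product.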

Extracting temporal features from MODIS data products
To explore whether and how temporal information could enable accurate estimates of forest AGB, we developed four methods to extract temporal information from satellite-derived data products for modeling forest AGB. The AAF method included all annual features for the AGB estimation. The second method selected important features from annual time-series data based on the SCC criterion. Only features whose correlation with forest AGB was larger than the mean correlation of all features with forest AGB across the year were included. This method was independent of the AGB modeling algorithms used. The third approach utilized seasonal information for AGB estimation. For each season, the seasonal average and the principal component analysis (PCA) components (SAP) of the time-series data covering that season were employed to estimate forest AGB. The PCA created new uncorrelated variables from the seasonal time-series features that might be highly correlated, which could enhance computational efficiency and minimize information loss (Li et al. 2008).
The last method used phenological characteristic parameters (PCP) as the predictors of forest AGB. The annual MODIS data were first filtered with the Savitzky-Golay (S-G) method. A dynamic threshold method proposed by Jonsson and Eklundh (2002) was then applied to the filtered annual time-series data to extract the phenological parameters, including the start of the season (SOS), the end of the season (EOS), the length of the season (LOS), the peak of the season (POS), the minimum value of the left curve, the minimum value of the right curve, the slope of the SOS, the slope of the EOS, the amplitude of the season, and the cumulative value over the growing season (Zeng et al. 2011; Clement et al. 2013; Michele et al. 2014; Potapov et al. 2021). The SOS refers to the date when vegetation begins to grow, and the EOS refers to the date when vegetation photosynthesis begins to decline rapidly; the LOS refers to the period from the SOS to the EOS in days. To determine these parameters, we first conducted a pixel-by-pixel analysis and extracted the maximum value of the annual time series as the POS. The SOS was defined as the point at which the curve rose above the base level (the minimum value of the left curve) by 20% of the distance between the base level and the maximum value (Delbart et al. 2006; Van Leeuwen 2008; White et al. 2009; Michele et al. 2014). Similarly, the EOS was detected when the value of the curve dropped below the base level (the minimum value of the right curve) plus 20% of the amplitude (Meroni et al. 2014; Michele et al. 2014).
Among the above four methods, the SAP had four results corresponding to spring, summer, autumn, and winter. The result that achieved the highest R-squared (R²) and lowest root-mean-squared error (RMSE) values was taken as the seasonal result for comparison with the three other methods. To investigate the potential of MODIS data products for forest AGB estimation, we applied the four methods to each of the 10 MODIS layers listed in Table 1: the 7 SR layers (bands 1-7), the LAI, the FAPAR, and the GPP. In addition, we examined the accuracy of AGB estimates obtained by combining several satellite-derived data products and considered eight combination scenarios (Table 2). The surface reflectance data for bands 1, 4, and 6 were considered because the initial results indicated that these three SR bands resulted in the accurate estimation of forest AGB.
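The SCC selection rule and the 20% dynamic-threshold definitions of the SOS and EOS can be illustrated with the sketch below. It rests on several assumptions that the text does not state: the correlation is compared in absolute value, the input series has already been smoothed (no S-G filtering is performed here), the base levels and thresholds are evaluated on discrete time steps without interpolation, and one time step equals 8 days. All function names are hypothetical.

```python
def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            out[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return out


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of ranks (inputs must not be constant)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def select_by_scc(features, agb):
    """Keep features whose |rho| with AGB exceeds the mean |rho| over all features (assumed rule)."""
    rho = {name: abs(spearman(vals, agb)) for name, vals in features.items()}
    mean_rho = sum(rho.values()) / len(rho)
    return [name for name, r in rho.items() if r > mean_rho]


def phenology(series, fraction=0.2, step_days=8):
    """SOS, POS, EOS as time-step indices and LOS in days, from a smoothed annual series."""
    pos = max(range(len(series)), key=lambda i: series[i])
    left_base = min(range(pos + 1), key=lambda i: series[i])
    right_min = min(series[pos:])
    sos_level = series[left_base] + fraction * (series[pos] - series[left_base])
    eos_level = right_min + fraction * (series[pos] - right_min)
    # SOS: first step after the left base where the curve reaches the threshold.
    sos = next(i for i in range(left_base, pos + 1) if series[i] >= sos_level)
    # EOS: first step after the peak where the curve drops below the threshold.
    eos = next((i for i in range(pos, len(series)) if series[i] < eos_level),
               len(series) - 1)
    return {"sos": sos, "pos": pos, "eos": eos, "los_days": (eos - sos) * step_days}


# Three toy features (one value per reference sample) and the matching AGB values.
features = {
    "step_01": [1, 2, 3, 4, 5],
    "step_02": [5, 3, 4, 1, 2],
    "step_03": [2, 1, 4, 3, 5],
}
agb = [10.0, 20.0, 30.0, 40.0, 50.0]
selected = select_by_scc(features, agb)

# A toy smoothed annual curve with ten time steps.
curve = [0.2, 0.2, 0.3, 0.5, 0.8, 1.0, 0.9, 0.6, 0.3, 0.2]
params = phenology(curve)
```

In the toy data the three features have absolute correlations of 1.0, 0.8, and 0.8, so only the first exceeds the mean and is retained.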

AGB modeling algorithm
Ensemble machine learning algorithms are insensitive to outliers and achieve high accuracy in many fields (Zhang et al. 2022b). Random forest (RF) is the most extensively used tree-based ensemble algorithm (Breiman 2001). Some studies have suggested that boosting ensemble algorithms, such as gradient boosting machine (GBM) and CatBoost, can outperform RF in predicting forest AGB (Zhang et al. 2020). Studies have recently reported that XGBoost is excellent at predicting PM2.5 concentrations, mapping plant species diversity, and estimating forest biomass (Zamani Joharestani et al. 2019; Li et al. 2020; Luo et al. 2021). Therefore, we selected the XGBoost algorithm for estimating forest aboveground biomass from satellite-derived time-series data in this study.
The XGBoost algorithm, proposed by Chen et al. (2016), is an improved version of the GBM algorithm. It is a scalable tree-boosting system that implements parallel preprocessing at the node level, making it faster than the GBM algorithm. It also includes regularization techniques to reduce overfitting. The primary hyperparameters of XGBoost include the maximum depth of the tree (max_depth), the minimum sample weight in leaf nodes (min_child_weight), the minimum loss reduction required to further split a leaf node (gamma), the learning rate, which shrinks each boosting step to prevent overfitting (learning_rate), the proportion of training samples randomly drawn for each tree (subsample), and the proportion of columns randomly sampled per tree (colsample_bytree). These hyperparameters were optimized by the grid search method (Zhang et al. 2020; Zhang et al. 2022c), and the model was implemented using the Python library xgboost.
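A grid search over the six hyperparameters named above can be sketched as follows. The candidate values are hypothetical, since the text does not report the ranges that were searched, and the scoring function is a stand-in: in practice it would fit a model on the training folds and return a cross-validated error.

```python
from itertools import product

# Hypothetical candidate values; only the hyperparameter names come from the text.
param_grid = {
    "max_depth": [4, 6, 8],
    "min_child_weight": [1, 5],
    "gamma": [0.0, 0.1],
    "learning_rate": [0.05, 0.1],
    "subsample": [0.8, 1.0],
    "colsample_bytree": [0.8, 1.0],
}


def grid_candidates(grid):
    """Yield one dict per combination of hyperparameter values."""
    names = list(grid)
    for combo in product(*(grid[name] for name in names)):
        yield dict(zip(names, combo))


def grid_search(grid, score):
    """Return the candidate with the lowest score (e.g. a cross-validated RMSE)."""
    return min(grid_candidates(grid), key=score)


candidates = list(grid_candidates(param_grid))


def toy_score(params):
    """Placeholder score, used only to show the search mechanics."""
    return abs(params["max_depth"] - 6) + params["gamma"]


best = grid_search(param_grid, toy_score)
```

With the xgboost library installed, each candidate dictionary could be passed as keyword arguments to `xgboost.XGBRegressor`, whose scikit-learn interface accepts these six parameter names.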

Accuracy assessment
We used the five-fold cross-validation (CV) method to evaluate the accuracy of AGB estimates (Zhang et al. 2021). The 5-fold CV was repeated 10 times, generating 50 training and testing data sets. For each of the 50 data sets, the training data were used for AGB modeling, and the testing data were employed to assess the accuracy of AGB predictions. The evaluation metrics included R², RMSE, and bias (Zhang et al. 2020).
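The repeated cross-validation scheme and the three metrics can be sketched as below. The metric definitions are assumptions, because the text does not give formulas: R² is taken as one minus the ratio of residual to total sum of squares, and bias as the mean of estimated minus reference values. The function names are hypothetical.

```python
import random


def repeated_kfold(n_samples, n_splits=5, n_repeats=10, seed=0):
    """Yield (train, test) index lists: n_repeats reshuffled rounds of n_splits-fold CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[k::n_splits] for k in range(n_splits)]
        for k in range(n_splits):
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, folds[k]


def metrics(reference, estimated):
    """R2, RMSE, and bias under the assumed definitions stated in the lead-in."""
    n = len(reference)
    mean_ref = sum(reference) / n
    ss_res = sum((r - e) ** 2 for r, e in zip(reference, estimated))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": (ss_res / n) ** 0.5,
        "bias": sum(e - r for r, e in zip(reference, estimated)) / n,
    }


# 3671 reference samples, 5 folds, 10 repeats: 50 train/test splits.
splits = list(repeated_kfold(3671))

# Toy reference and estimated AGB values (Mg/ha).
result = metrics([100.0, 200.0, 300.0], [110.0, 190.0, 310.0])
```

Averaging the metrics over the 50 test sets gives the kind of mean R², RMSE, and bias values reported in the results.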

Estimated forest AGB based on annual features extracted from different data products
We calculated the evaluation metrics achieved using the AAF method and the XGBoost algorithm under 18 scenarios (Figure 2). The most accurate estimation was provided by combining MODIS SR bands 1, 4, and 6 with the LAI, FAPAR, GPP, and CH, with an R² of 0.60 and an RMSE of 145.37 Mg/ha, followed by annual SR bands 1, 4, and 6 and the CH with an R² of 0.58 and an RMSE of 147.58 Mg/ha. In contrast, the estimated forest AGB had a lower degree of accuracy under the LAI (average R² of 0.44, RMSE of 172.01 Mg/ha, and bias of -0.32 Mg/ha), band 2 (average R² of 0.44, RMSE of 170.68 Mg/ha, and bias of -1.26 Mg/ha), and band 5 (average R² of 0.44, RMSE of 171.92 Mg/ha, and bias of -2.39 Mg/ha) scenarios. Comparing the results obtained based on the 10 individual layers revealed that MODIS band 4 outperformed the other scenarios, with an average R² of 0.55, RMSE of 153.41 Mg/ha, and bias of -0.98 Mg/ha. Bands 1 and 6 also provided results comparable to MODIS band 4, with an R² higher than 0.50. The GPP, with an average R² of 0.49, RMSE of 163.65 Mg/ha, and bias of 0.28 Mg/ha, provided a more accurate AGB prediction than the LAI and FAPAR.
The LFG scenario (the combination of the LAI, FAPAR, and GPP) provided the least accurate estimation of forest AGB among the eight combination scenarios. Including the forest canopy height remarkably enhanced AGB prediction, as shown in Figure 2, which reflected the strong correlation between the CH and biomass. The combination of the forest canopy height with the surface reflectance bands 1, 4, and 6 exceeded the prediction accuracy of the other combination scenarios. It also produced results comparable to those of the models built with all the data sets used in this study, indicating the potential of the SR-CH scenario to predict forest AGB. The same patterns were observed when the SCC, PCP, and SAP methods were employed to extract time-series information (Figures 3 and 4).

Contribution of seasonal time-series data to AGB estimation
According to Figure 3, the AGB predictions based on the summer data were more accurate than those based on the spring, autumn, and winter seasons, except for bands 2 and 5, with R² values ranging from 0.28 to 0.54 and RMSE values varying from 153.10 to 194.99 Mg/ha. The advantage of using the summer data for AGB estimation was more noticeable for bands 1, 4, and 6 than for the other bands. In contrast, the spring data provided more accurate estimates than the three other seasons for bands 2 and 5. Averaging the evaluation metrics obtained under the 10 individual scenarios, we found that the summer data still provided more accurate AGB predictions, with an R² value of 0.36 and an RMSE of 183.28 Mg/ha. The average results of the eight combination scenarios showed that the summer data led to an average R² of 0.48, RMSE of 164.81 Mg/ha, and bias of 1.55 Mg/ha.
The AGB estimates based on the summer data were taken as the SAP results for comparison with the AAF, SCC, and PCP methods because of their higher accuracy relative to the other seasons. Despite the higher prediction accuracy obtained with the summer data, using MODIS data covering only one season could not provide reasonable estimates of forest AGB (Figures 3 and 4).

Impact of different methods and their extracted features on AGB estimation
The AAF provided the most accurate forest AGB estimates among the four methods used to extract temporal features from MODIS data (Figure 4). The SCC-based AGB estimates were close to the results obtained using the annual time-series data. The average R² value of the 10 individual scenarios using the SCC was 0.42, compared to an R² value of 0.49 achieved using the AAF method. Under the eight combination scenarios, the average R² value was 0.53 for the SCC method and 0.55 for the AAF, and the average RMSE was 157.54 Mg/ha for the SCC and 153.39 Mg/ha for the AAF method.
Furthermore, the results demonstrated that the predictions based on the PCP and SAP methods were worse than those determined by the AAF and SCC methods (Figure 4). Under the 10 individual scenarios, the PCP approach led to a larger decrease in R² and a larger increase in RMSE relative to the AAF results than the SAP method did. These results indicated that the selected temporal features could be more effective in estimating forest AGB, and using seasonal or growing season features directly might not be an optimal choice for improving AGB estimation, as reported in previous studies. Under the LAI-CH, FAPAR-CH, GPP-CH, LFG, and LFG-CH scenarios, the PCP method provided more accurate results than the SAP approach, implying the capability of vegetation phenology for the retrieval of biomass (Duparc et al. 2013; Gwenzi et al. 2017).
According to Figure 4, the PCP and SAP methods had overall positive biases, while the AAF and SCC methods had negative biases, indicating that the PCP and SAP methods generally underestimated forest AGB, but the AAF and SCC overestimated it to some degree. However, the scatter density plots of the reference AGB versus the estimated one indicated that forest AGB was overestimated at low values and underestimated at high values, independent of the methods employed to extract the temporal features from the MODIS data (see Figure 5). Among the four methods, the SCC estimated AGB with the smallest bias values, indicating that it could effectively identify essential predictor variables of forest AGB by performing feature selection, could avoid the curse of dimensionality, and could remove unnecessary features from the one-year time-series data (Luo et al. 2021).
Despite the observed differences in the accuracy of the forest AGB estimates using features extracted by the four methods, a combination of the MODIS layers enhanced the accuracy of AGB estimation relative to the results achieved by using single indicators (see Figures 2 and 4).

Spatial distribution of estimated forest AGB in California
On the basis of the combined MODIS SR bands 1, 4, and 6 and the CH data, we mapped forest AGB using the AAF, SCC, PCP, and SAP methods and the XGBoost algorithm. Figure 6 shows the results. The forest AGB maps obtained using the AAF and SCC methods were largely consistent. The PCP and SAP methods led to higher AGB predictions in shrublands than the AAF and SCC methods. The AGB estimated based on the SAP was even higher than 200 Mg/ha in open shrublands, which indicated a severe overestimation of AGB in low-biomass regions. In contrast, the AGB values of evergreen needleleaf forests determined by the SAP and PCP methods were smaller than those predicted using the AAF and SCC methods (Figure 6). According to the AAF method, roughly 14.73% of the forest pixels had an estimated AGB of less than 50 Mg/ha, while the proportions were 9.31% and 7.55% for the SCC and PCP methods, respectively. According to the SAP-based biomass map, all the forest pixels had an AGB value higher than 50 Mg/ha. Almost 44.32% of the forest pixels had AAF-estimated AGB values ranging from 50 to 200 Mg/ha, while 48.27% and 53.78% of the forest pixels fell in this range according to the results obtained by the SCC and PCP methods, respectively. According to the results estimated by the SAP method, only 30.69% of the forest pixels had a forest AGB in this range, and 28.77% had an estimated AGB of 200-250 Mg/ha.
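The class proportions quoted above amount to counting pixels per AGB interval. A minimal sketch, assuming half-open intervals (a value equal to a class edge falls in the higher class) and using made-up pixel values:

```python
from bisect import bisect_right


def class_shares(agb_values, edges):
    """Percentage of values below edges[0], between consecutive edges, and at or above edges[-1]."""
    counts = [0] * (len(edges) + 1)
    for value in agb_values:
        counts[bisect_right(edges, value)] += 1
    return [100.0 * count / len(agb_values) for count in counts]


# Toy estimated AGB values (Mg/ha) for eight forest pixels.
pixels = [12.0, 48.0, 75.0, 120.0, 199.0, 210.0, 240.0, 260.0]
# Classes: below 50, 50 to 200, 200 to 250, and 250 or more Mg/ha.
shares = class_shares(pixels, [50.0, 200.0, 250.0])
```

Applying the same function to each of the four AGB maps would allow the class shares of the AAF, SCC, PCP, and SAP results to be compared side by side.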

Discussion
This study examined the performance of the different temporal features extracted from MODIS data products in estimating forest AGB. The results indicated that the green band, red band, and shortwave infrared were sensitive to forest AGB, agreeing with previous works (Muukkonen and Heiskanen 2005; Mutanga et al. 2012; Nandy et al. 2017; Dube et al. 2018). Band 4 provided the most abundant information for AGB prediction among the seven MODIS SR bands. Although some studies agreed that the green band reflectance was suitable for predicting biomass, more works found that the shortwave infrared reflectance was highly correlated with biomass (Molinier et al. 2016; Nandy et al. 2017). However, the results of this study showed that the green band reflectance was more useful than the shortwave infrared and near-infrared bands. Thus, green band reflectance should be considered for forest AGB estimation in future studies. Further investigation should be conducted to determine whether and under which conditions these findings hold and whether they apply to different spatial resolutions.
Regarding the four analytical methods to extract the features of annual MODIS data, this study revealed that the AAF and SCC methods led to similar results and provided a more accurate estimation of forest AGB than the other methods. Employing seasonal data or growing season data (phenological information) always produced the worst results, implying the limitations of estimating forest AGB using the data of a specific season. Published studies exploring the contribution of seasonal or time-series data have commonly utilized summer or autumn data or only one image acquired in a specific season (Zhu and Liu 2015; Nguyen et al. 2020), mainly ignoring the contribution of information from the other seasons. Our results highlighted that summer features extracted from indicators might play a dominant role in AGB estimation; moreover, a few features from the other seasons were also indispensable for accurately estimating forest AGB. Performing feature selection to extract critical features from annual time-series data rather than simply using seasonal or growing season data might be more effective in taking advantage of temporal features to enhance the accuracy of AGB prediction. In addition, we extracted quantile statistics of the MODIS SR, LAI, FAPAR, and GPP and combined these statistical predictors for AGB estimation, similar to most published studies. The resulting estimates were poorer than those obtained with the methods proposed here, which also confirmed the significance of this study.
Satellite-derived time-series data can suffer from cloud cover, atmospheric variability, and aerosol scattering. For this reason, some studies have utilized smoothing or reconstruction techniques to improve the temporal trajectories of various remotely sensed data and then used the smoothed data instead of the original time-series data to estimate forest AGB. Their results showed that the AGB models developed using the smoothed time-series data substantially outperformed those based on the original data (Urbazaev et al. 2016; Li et al. 2018). This study preprocessed the raw data utilizing an S-G filter (Pan et al. 2015) and used the smoothed data instead of the original data to predict forest AGB based on satellite-derived data products. Nevertheless, we found that the results under the four time-series scenarios were consistent with those determined using raw time-series data without preprocessing, which, to some extent, indicated the robustness of our analytical methods.
Although the AAF method produced the most accurate results, only about 60% of the variation was explained by combining annual time-series data, leaving much room for improvement. This study obtained large RMSE values partly because some training samples had AGB values higher than 500 Mg/ha, which was far beyond the capability of optical imagery for estimation, and because RMSE is quite sensitive to the data set distribution and outliers. The forest AGB estimated herein was higher than some published results (Zhang et al. 2014a; Gonzalez et al. 2015), which might be because we used lidar-derived forest AGB data to calibrate satellite data sets, and some training samples were located in dense forests and had higher AGB values (see Figure 1). Nevertheless, the results of this study confirmed that by exploiting temporal information from multiple satellite-derived data products, the accuracy of AGB prediction could be markedly improved. Future studies should consider more advanced algorithms, such as deep learning algorithms, to estimate forest AGB using time-series data and further enhance forest AGB estimation.
The main objective of this study was to improve the accuracy of AGB estimates by exploiting temporal information. To this end, we conducted experiments with MODIS data products at the 500 m spatial resolution because of their high temporal resolutions. Although ignored in this study, spatial information could improve the AGB predictions (Zhang et al. 2019). Some recent studies combined Landsat and MODIS data to generate time-series data with high spatial and temporal resolutions (Gevaert and García-Haro 2015; Zhang et al. 2022a). Using these generated data could improve the AGB prediction in theory and should be explored in future studies (Yang et al. 2022). In addition, the SAR data, topographical data, climatic factors, land surface temperature, and soil moisture should be included to further improve the prediction accuracy of forest AGB (Zhang et al. 2019; Jiang et al. 2021).

Conclusions
This study thoroughly investigated whether and how the temporal information of multiple satellite-derived data could be beneficial for forest AGB estimation. AAF, SCC, PCP, and SAP methods were developed to extract temporal features from the MODIS SR, LAI, FAPAR, and GPP data. The results of this study revealed that the AAF produced the most accurate estimation of forest AGB among the four analytical methods, followed by the SCC. The summer features of the reflectance data and biophysical variables, including the LAI, FAPAR, and GPP, played a more crucial role in biomass estimation than those of the other seasons. However, the seasonal results were less accurate than those of the SCC and AAF methods, indicating that the satellite data covering a specific season might not be enough to predict the forest AGB accurately. The SCC-based AGB estimation had the smallest bias, possibly due to removing redundant predictors within a year by performing feature selection.
In addition, the inclusion of the forest canopy height alongside the MODIS time-series data improved the accuracy of AGB estimation under all four methods. Combining the annual time-series SR data for bands 1, 4, and 6 with the CH could provide an accurate estimation of forest AGB, with an average R² of 0.58 and an RMSE of 147.58 Mg/ha, close to the results achieved by combining MODIS SR bands 1, 4, and 6 with the LAI, FAPAR, GPP, and CH, with an R² of 0.60 and an RMSE of 145.37 Mg/ha. This highlights the necessity of utilizing annual SR time-series data in future studies. Finally, more advanced algorithms for modeling the forest AGB should be developed to further enhance the accuracy of estimated results.

Figure 1. The forest types and forest AGB data across California; the dot points correspond to forest AGB at a 500 m spatial resolution obtained by aggregating the lidar-derived biomass at a 30 m spatial resolution.

Figure 2. The accuracy of forest AGB estimated using the annual time-series data under the 18 scenarios; (a), (b), and (c) respectively represent the average R², RMSE, and bias values achieved by the AGB models developed under the 18 scenarios listed in Table 2.

Figure 3. The accuracy of forest AGB estimated using the seasonal time-series data under the 18 scenarios; (a), (b), and (c) respectively represent the seasonal R², RMSE, and bias values achieved by the AGB models developed under the 18 scenarios; the error bars show the 95% confidence intervals.

Figure 4. The accuracy of the forest AGB estimated using the temporal features selected by the AAF, SCC, PCP, and SAP methods; (a), (b), and (c) respectively represent R², RMSE, and bias values achieved under the 18 scenarios; the error bars show the 95% confidence intervals.

Figure 5. The density scatter plots of the reference AGB versus the AGB estimated by the AAF, SCC, PCP, and SAP methods under the five-fold cross-validation.

Figure 6. The spatial distribution of the forest AGB in 2013 estimated by the AAF, SCC, PCP, and SAP methods and the XGBoost algorithm based on the SR-CH data.

Table 1. The data sets of the MODIS data products used for AGB estimation.

Table 2. The temporal features used for modeling AGB under 18 scenarios.