Quantifying understory vegetation density using multi-temporal Sentinel-2 and GEDI LiDAR data

ABSTRACT Understory vegetation contributes considerably to biodiversity and total aboveground biomass of forest ecosystems. Whereas field inventories and LiDAR data are generally used to estimate understory vegetation density, methods for large-scale and spatially continuous estimation of understory vegetation density are still lacking. For an evergreen coniferous forest area in southern China, we developed and tested an effective and practical remote sensing-driven approach for mapping understory vegetation, based on phenological differences between over and understory vegetation. Specifically, we used plant area volume density (PAVD) calculations based on GEDI data to train a support vector regression model and subsequently estimated the understory vegetation density from Sentinel-2 derived metrics. We produced maps of PAVD for the growing and non-growing season respectively, both performing well compared against independent GEDI samples (R2 = 0.89 and 0.93, p < 0.01). Understory vegetation density was derived from the differences in PAVD between the growing and non-growing season. The understory vegetation density map was validated against field samples from 86 plots showing an overall R2 of 0.52 (p < 0.01), rRMSE = 21%. Our study developed a tangible approach to map spatially continuous understory vegetation density with the combination of GEDI LiDAR data and Sentinel-2 imagery, showing the potential to improve the estimation of terrestrial carbon storage and better understand forest ecosystem processes across larger areas.


Introduction
Understory vegetation is composed of diverse species and is an important component of forest ecosystems. Understory vegetation provides habitats and forage for wildlife and also contributes to carbon sequestration, soil nutrient cycling, and reduced risk of erosion (Echiverri and Macdonald 2019). Furthermore, considering that forest ecosystems are sensitive to climate change, information on understory vegetation may also be relevant for estimating potential climate change impacts on future forest resources (Tuanmu et al. 2010). Thus, quantifying understory vegetation in forest ecosystems is essential to accurately assess the ecosystem functioning and services.
Generally, understory vegetation is being estimated by approaches based on field measurements and remote sensing techniques applicable for localscale studies. Traditional sampling methods used in the field include ocular estimation, line-intercept sampling, and fixed plot sampling, but are often labor-intensive, time-consuming, and costly. Moreover, the measurements of understory vegetation implemented in such plot-or transect-based field campaigns limit the spatial extent of the survey. Remote sensing has been widely used to quantify forest-related parameters across large areas, such as tree cover (Tang et al. 2019;Xi et al. 2021), forest height (Li et al. 2020;Potapov et al. 2021) and forest aboveground biomass (Shen et al. 2018;Huang et al. 2019;Wang et al. 2020). Among the versatile remote sensing techniques, LiDAR sensors have been used extensively to estimate the three-dimensional vegetation structure of forests, which also allows to study understory vegetation (Crespo-Peremarch et al. 2018). This technique is based on the narrow beams of laser light emitted in rapid succession, which can exploit small open gaps in the forest canopy. The understory structure information is recorded by the returned time of emitted pulses, which interact with features in the understory (tree leaves, branches, and trunks, shrubs, grasses, and forbs) and reflect back to the sensor (Martinuzzi et al. 2009). However, the pulse energy cannot reach the understory under dense forest canopies. In the past decades, LiDAR sensors, data collection platforms (i.e. from ground-based and airborne LiDAR) and associated data processing capacities have rapidly advanced to contribute to the monitoring of understory vegetation. For example, Wing et al. (2012) developed a new airborne LiDAR metric (e.g. understory LiDAR cover density, understory point density, and effective plot coverage) to estimate understory vegetation cover. Crespo-Peremarch et al. (2018) have estimated a range of understory vegetation metrics including maximum and mean height, cover, and volume, using fullwaveform airborne LiDAR. Campbell et al. (2018) quantified understory vegetation density using smallfootprint airborne LiDAR data. Hancock et al. (2017) measured the fine-spatial-resolution 3D vegetation structure with airborne waveform LiDAR. These studies have developed efficient and reliable techniques for estimating understory vegetation structure using LiDAR that have advanced our knowledge on remotely detecting understory vegetation. However, the limited spatial and temporal extent of airborne LiDAR data restricts the applicability to local-scale assessments of understory vegetation, which calls alternative remotely sensed-based methods for estimating understory vegetation density at a large spatial scales.
From the combined use of different satellite sensor systems, it is, however, possible to overcome the limitations of airborne LiDAR systems, to produce largescale and continuous estimates of understory vegetation. Optical remote sensing data are sensitive to upper vegetation canopy properties at a two-dimensional level, with limited capabilities of sensing vertical structures. Nonetheless, the spectral differences captured by optical data when the vegetation canopy layer changes from being a dense canopy toward being a mix of trees and shadows when the tree density gets sparser, enables us to estimate proxies of vegetation vertical structure to a certain extent (Lee et al. 2020). Using both temporal and spectral information from optical data also advances the characterization of vegetation structure, taking into consideration the difference in phenology between different types of trees (Tuanmu et al. 2010). The combined use of LiDAR and optical remote sensing have proven to be suited for large-scale estimations of forest heights (Li et al. 2020) and aboveground biomass (Chen et al. 2021). Optical multi-spectral imagery could also potentially be used to upscale LiDAR-based estimations of understory vegetation, but this has not yet been well studied. Upscaling is generally conducted with machine learning models (Bihamta Toosi et al. 2020;Xu et al. 2018), such as a support vector regression model or random forest (Huang et al. 2019;Li et al. 2020;Shen et al. 2018;Wang et al. 2020).
The NASA Global Ecosystem Dynamics Investigation (GEDI), a spaceborne LiDAR instrument operating on the International Space Station, has collected data since April 2019 (Kokalj and Mast 2021). GEDI is a full waveform LiDAR providing footprints averaging 25 m in diameter. The GEDI mission is specifically designed for measuring vegetation structure (Kokalj and Mast 2021) and provides an unprecedented sampling density and temporal feature of forest structural properties from space, which could be an ideal reference data set used for mapping understory vegetation . However, the GEDI sampling design of data is conducted as footprints of discrete laser beams representing a point sample, that limits its applicability to capture vegetation structure continuously across large spatial extents, if not combined with other sources of information. Sentinel-2 offers the opportunity to separate understory and woody vegetation due to its high revisit frequency, rich spectral information, and relatively high spatial resolution (10 m) (Zhang et al. 2019). Therefore, integrating the GEDI LiDAR and Sentinel-2 may serve as a viable data combination to estimate understory vegetation continuously across large areas.
To verify this assumption, we examined to which extent the combination of GEDI LiDAR and Sentinel-2 can predict understory vegetation density for a selected evergreen coniferous forest at the regional scale. Extensive establishment of evergreen coniferous plantations has resulted in a simplified forest structure and losses in species biodiversity particularly in southern China. However, naturally growing understory vegetation can improve the stability of forest ecosystems and biodiversity (Wing et al. 2012). This causes a need to characterize the spatial distribution of understory vegetation density to accurately assess the full array of ecosystem services. In these coniferous forest areas, understory vegetation is mostly deciduous, which results in notable differences in the phenology between the forest canopy and the understory vegetation. These phenological differences potentially provide a way forward toward mapping understory vegetation. However, considering that understory vegetation cannot be measured directly from space, it is necessary to link the independent measurements of light interaction provided by remote sensing to field-based measures of specific biometrics, such as understory vegetation density.
In this study, we aim to map understory vegetation density in the coniferous forest areas for an area in China using field samples, GEDI LiDAR data and multitemporal Sentinel-2 imagery. Specifically, we aim to: (1) select the most relevant Sentinel-2 based indices to estimate the plant area volume density index (PAVD) for the growing and non-growing season, and (2) develop a reproducible and robust approach for mapping understory density using GEDI LiDAR data and Sentinel-2 imagery.

Study area
The study area is located in Mingguang County, Chuzhou City, Anhui Province, China (Figure 1), covering an area of nearly 1,650 ha with an average elevation of 32 m. The area belongs to the subtropical monsoon climate region with an annual average temperature of 15.4°C and precipitation between 700 and 1,400 mm. As part of the National Key Ecosystem Protection and Restoration Major Projects in China (NERPC), the masson pine forest vegetation in the study area was planted in the 1990s (Cui et al. 2021). Masson pine is an evergreen coniferous tree species having heights of 15 to 45 m. Branchlets usually grow twice per year. Each is colored yellowish brown, occasionally with a pale gray or bluish-green (glaucous) appearance. The tree species is fast-growing at in the beginning and growth gradually slows down in later years. Affected by climatic factors, understory vegetation is mainly composed of annual herbaceous vegetation and deciduous shrubs. The dominant species are Setaria viridis, Artemisia carvifolia, Dendranthema indicum, Duchesnea indica, Amaranthus spinosu, and Erigeron annuu. While masson pine is evergreen, the understory vegetation has no leaves during the nongrowing season (October to April) but only during the growing season.

Field data collection
We conducted two field surveys covering the same plots ( Figure 1) during the non-growing season (25 Feb 2020) and the growing season (15 Jul 2020) ( Table 1). We sampled along a diagonal ( Figure 2a) and we chose the sampling dates close to the Sentinel-2 satellite over-pass dates. In total, 86 plots each 10 × 10 m were randomly selected, being consistent with the pixel size of Sentinel-2 images ( Figure 2b). In each plot, we used a LAI-2200C plant canopy analyzer (Li-Cor Biosciences, Lincoln, NE, USA) to measure the plant area index (PAI) (Rodríguez-Ronderos et al. 2016). The PAI is the sum of leaf area index (LAI) and wood area index (WAI) (Kalacska et al. 2005;Sylvain and Leblanc 2001). We selected five positions of equal distance along the diagonal direction of each plot, and measured the PAI for all vegetative layers above the soil surface and at the height of the tree layer, defined as trees taller than 1.5 m ( Figure 2b). The measured PAI for all-vegetative layers is a mixed signal from grasses, shrubs, and trees, whereas PAI for the tree layer only contains information from the coniferous forest canopy. Since the understory vegetation was not photosynthetically active during the non-growing season, only the tree layer was measured. We used the FV2200 software (Li-COR Biosciences) for field data post-processing. We  also recorded GPS positions for the sampled sites and recorded basic vegetation information such as plant shape, leaf size, and moisture content. We implemented the same measurements for all vegetative layers and the tree layer for all plots during the growing season survey.

GEDI LiDAR data
We downloaded GEDI LiDAR L2B data from the Land Processes Distributed Active Archive Center (LP DAAC) for January and July 2020, matching as closely as possible the dates of the field surveys and Sentinel-2 images ( Table 1). The GEDI instrument records the amount of energy returned by various tree components at different heights above the ground and contains structural information, such as surface topography and metrics of canopy height, cover, and vertical profiles (Duncanson et al. 2020). Plant area volume density (PAVD) is a function of plant area per unit volume as a parameter obtained by the GEDI LiDAR instrument, which is often used to quantify the vertical structure of forests. PAI is defined as the one-sided area of plant material surface per unit ground surface area, which includes different constituents in the plant (e.g. stem, branches, and leaves). The two metrics are generated based on the directional gap probability profile derived from the wave-forms provided in the L1B products of GEDI LiDAR and locations of reflecting surfaces (Dubayah et al. 2020). Specifically, the canopy gap probability was firstly derived by the LiDAR waveform, and then the vegetation structure parameters were calculated using the cumulative laser energy return under the known canopy to ground reflectance ratio. In our study, both the PAI and PAVD were extracted from 1,532 observations from GEDI L2B (Dubayah et al. 2020). GEDI has eight ground tracks, and the two metrics extracted from the four full-power beams relative to "coverage" beams were used . The sensitivity of a GEDI footprint indicates the maximum canopy cover that can be penetrated considering the signal-to-noise ratio of the waveform, and we thus excluded footprints with the sensitivity less than 0.9. After filtering out these invalid observations, 881 pairs of PAI and PAVD were used for further analysis.

Sentinel-2 data
Two tiles of Sentinel-2 Level 2A images were downloaded from the Copernicus Open Access (COA) Hub (scihub.copernicus.eu/), acquired on 28 December 2019 and 8 July 2020. The Sentinel-2 images were atmospherically corrected using the Sen2Cor plug-in provided by the ESA. The nearinfrared (NIR) and short-wave infrared (SWIR) bands at 20 m spatial resolution were resampled to 10 m using the bicubic convolution interpolation method. In addition to these Sentinel-2 spectral bands, we calculated multiple vegetation indices that are widely used for classifying vegetation (Xi et al. 2021), canopy height (Li et al. 2020) and biomass (Chen et al. 2021). As a result, a total of 40 vegetation indices and 20 spectral band with a spatial resolution of 10 m were extracted. The details of the vegetation indices are displayed in Appendix A, Table A1.

Selection of Sentinel-2 features
Advances in satellite technology offer a rich collection of spectral bands and associated vegetation indices that are often highly correlated. It is thus critical to reduce redundant features by assessing the variable collinearity that can improve the interpretability of the model and shorten data processing times (Erinjery, Singh, and Kent 2018;Grabska, Frantz, and Ostapowicz 2020). We conducted two steps to remove the redundant features: (1) We first removed candidate features with stronger collinearity (p < 0.05) by performing collinearity tests between all Sentinel-2 features (Table A2), (2) Recursive feature elimination (RFE) (Guyon and Barnhil S 2002) was then applied to identify the key features from 30 features comprising 10 bands and 20 metrics extracted from Sentinel-2.
The key feature selection in RFE was conducted with a decision tree as the regression model and R-score stopping criteria. The selection of features is stopped when additional variables do not significantly (p < 0.05) improve the R-score.

Algorithms to map understory vegetation density
In this study, we conducted four main steps to quantify understory vegetation density. (1) the pre-processed GEDI data and Sentinel-2 images were divided into two groups of growing and nongrowing season acquisitions in order to make use of the phenological difference to derive PAVD. (2) GEDI LiDAR data were used as dependent variables, and Sentinel-2-based features were used as independent variables for feature importance assessment and regression analysis. (3) We used the growing season PAVD subtracted by the non-growing season PAVD and the associated PAVD of increasing canopy to derive understory vegetation density. (4) The understory vegetation density was validated using field samples. The overall work-flow of PAVD mapping is presented in Figure 3.

Establishment of function between PAVD and PAI in GEDI LiDAR data
PAI is related to canopy foliage and crown structure, and has been widely used for assessing the structure and functions of forest ecosystems (Sylvain and Leblanc 2001;Tang et al. 2014). Accurate PAI values can be obtained from field observations, whereas from remotely sensed observations understory vegetation density is usually represented by PAVD indices, e.g. in studies based on LiDAR data (Campbell et al. 2018;Crespo-Peremarch et al. 2018). Based on the GEDI LiDAR data, we observed a strong correlation between PAI and PAVD for both the growing season and non-growing season (R 2 = 0.99, p < 0.01) ( Figure 4). As a result, the field measured PAI has potential to characterize the understory vegetation density and will be converted to PAVD used as an independent data source to validate the satellitederived understory vegetation density in this study.

Principle of understory vegetation density extraction
Understory vegetation is expected to predominantly contribute to the phenological differences in PAVD between the growing and non-growing season (see Section 2.2), and we thus used the growing season PAVD subtracted by the non-growing season PAVD to estimate understory vegetation density from GEDI LiDAR data. It should be noted that increment in woody foliage from non-growing season to growing season also contributes to the seasonal differences in PAVD. We thus used our field measured PAVD of the tree layers to calculate the increased PAVD by subtracting the averaged growing-season PAVD by the averaged non-growing-season PAVD using all plots. We applied this fixed PAVD increment to calibrate satellite derived PAVD. This approach is considered valid because the dominant tree species in this study area is masson pine characterized by the same stand age. Hence, we can approximately treat the seasonal increment of PAVD as a constant ( Figure 5). Therefore, understory vegetation density (UV) can be calculated using the following equation: Where PAVD GS is the growing season PAVD, and PAVD NGS is the PAVD of non-growing season from GEDI LiDAR data. k is a constant, which refers to the increment of PAVD in the forest canopy.

Mapping understory vegetation density
Given the limited number of training samples, we used a support vector regression (SVR) model to train the selected features from Sentinel-2 using 80% of the GEDI data observations. Multivariate SVR is a regression version of support vector machines that project the training dataset from a lower dimensional space into a higher dimensional space by using a kernel function, which is effective for the regression of complex, high-dimensional data providing multivariate, non-linear, and nonparametric regression. We used the radial basis function (RBF) in our study. Compared to other kernel functions, RBF compresses the training process and increases the calculation efficiency. Based on ten-fold cross-validation, the hyper-parameter sets as shown in Table 2 are tuned to obtain the best fit, including gamma, regularization parameter (C) and the kernel width (σ) for growing and non-growing season. The optimal models were run for the growing and non-growing season to derive PAVD maps covering the entire study area from Sentinel-2 imagery.
In addition, considering that the GEDI lidar beams have a 25 m footprint with a positional accuracy of 15 m , we tested the GEDI LiDAR points with Sentinel pixels with pixel window-sizes of 2 × 2, 3 × 3, and 5 × 5 in the SVR model estimation.
Our test results showed that applying the average values of four neighboring pixels (5 × 5) to match the GEDI LiDAR points produced the highest performance. As a result, we build training datasets and optimized SVR models to estimate vegetation density. Next, the understory vegetation density was calculated using the derived map and equation (1). k is converted from PAI using the linear regressions (see Figure 4).

Accuracy assessment
PAVD maps for the growing and non-growing seasons were validated using the remaining 20% of the GEDI-based PAVD samples. Additionally, the derived   Figure 4. The PAVD values from field survey are used to estimate k, which is a constant referring to the seasonal increment of PAVD required to compute understory vegetation density from PAVD of GEDI data. In this study, the k value was 0.39. map of understory vegetation density was validated by independent field samples from the 86 plots. The coefficient of determination (R 2 ), and the relative root mean square error (rRMSE) were used for the accuracy assessment.

Sentinel-2 feature importance
To find the optimal number of features, crossvalidation was used with RFE to score different feature subsets and select the best scoring collection of features. In the growing season, we identified four spectral bands (B3, B5, B8, and B11) and seven vegetation indices (TSAVI, GNDVI, S2REP, ARVI, MCARI, NDI45, and PSSRA), which strongly explained the variations in PAVD (R score reaching 0.79) and from where additional variables did not increase the R score markedly (Figure 6a). In the non-growing season, a total of 12 variables were identified, namely B3, B5, B6, B8, B11, B12, TSAVI, GNDVI, MTCI, S2REP, PSSRA, and DVI, with an R score of 0.81 (Figure 6b). The detailed results of variable importance are displayed in Appendix A, Figure A1.
For Sentinel-2 spectral bands, red-edge, and NIR were closely related to PAVD for both the growing season and non-growing season. For VIs, the TSAVI, GNDVI, S2REP, ARVI, and NDI45 were identified as the most important metrics explaining the variations in PAVD during the growing season, while TSAVI, GNDVI, MTCI, S2REP, and PSSRA contributed the most to explained variation in PAVD for the non-growing season. The spectral band of B5 and the index of TSAVI were ranked the highest for the growing season and non-growing season, respectively.

Sentinel-2 derived PAVD estimation during the growing and non-growing season
Based on GEDI LiDAR training data and the selected optimal features from Sentinel-2, PAVD was calculated using the SVR model for both the growing and non-growing season. The PAVD maps of the growing and non-growing season (Figure 7(a,c)), produced in a 10 m spatial resolution, reveal a considerable spatial heterogeneity of PAVD in the study area and notable differences in PAVD between growing and nongrowing seasons. PAVD is generally lower in the nongrowing season as compared to PAVD in the growing season, but some scattered areas show high PAVD, such as in the north-eastern and south-eastern part of the study area. Moreover, low PAVD values are observed along road and river courses during the growing season. Based on the independent GEDI samples, the accuracy of the PAVD maps for the growing and non-growing seasons are R 2 = 0.93(±0.01) and rRMSE = 15% (Figure 7b) and R 2 = 0.89(±0.03) and rRMSE = 13% (Figure 7d), respectively. Note: "rbf" denotes radial basis function. Figure 6. Feature contribution score and its standard deviation for (a) the growing season and (b) non-growing season. The vertical dashed red lines indicate the optimal number of features selected for the estimates of PAVD, corresponding to 11 features for the growing season and 12 for the non-growing season.

Understory vegetation density mapping and validation
The understory vegetation density was derived based on equation (1) (Section 3.2). From Figure 8a, the averaged understory vegetation density is 0.31 in the study area. Relatively high understory vegetation density is mainly observed in the northwest and central part of the study area, while a relatively low understory vegetation density is observed in the southern part. A small proportion (0.97%) of understory vegetation density values is marginally smaller than 0, mainly located in the southern and southeast Figure 7. The PAVD 10-meter spatial resolution maps and accuracy for the non-growing season (a-b) and the growing season (c-d). ** denotes a significant relation at the 0.01 level and the shading indicates 95% confidence interval of the linear fitting. of the study area. This can be attributed to the estimated error and is an inevitable consequence of using a regression-based approach, and such negative values can of course be reclassified into zero-values. We also show three different subsets of the predicted understory vegetation density and the corresponding Google Earth images for visual interpretation (Figure 8b-c). It clearly shows less understory vegetation density along built-up areas and roads (gray color), while there was a high understory vegetation density in densely forested areas. Compared with field data, our estimates of understory vegetation density show moderate correlations with R 2 = 0.52 and rRMSE = 21% (Figure 9). The highly significant relationship found (p < 0.01), however, clearly indicates this as a promising way forward to model understory vegetation density over large areas using GEDI LiDAR and Sentinel-2 data.

Mapping understory vegetation density
We developed a new methodology to derive a continuous map of understory vegetation density in the Anhui Province, China using GEDI LiDAR and Sentinel-2. The correlation with field data was highly significant, yet not very high. This might partly be explained by the uncertainties of field observations due to small differences in the viewing angles and geolocation between field measurements and Sentinel-2 pixels (Hancock et al. 2017;Hill and Broughton 2009) as well as a general difference in the spatial scale of the footprint of observations between ground and satellite observations. A clear underestimation of high understory vegetation density is observed and vice versa for low values of understory vegetation density (Figure 8). This reflects the inability of the model to correctly predict the extreme values measured in the field, as these tends to be evened out from the use of satellite data for the above-mentioned reasons. The Airborne LiDAR-based studies of understory vegetation reports a similar range of R 2 values (Campbell et al. 2018;Hill and Broughton 2009), and our results underlines a solid potential of using freely available LiDAR and optical data in combination to estimate continuous understory vegetation density. The approach is expected to be adoptable over large areas given some preconditions are met (see below), which would improve the estimation of total forest carbon storage and dynamics to improve our understanding of terrestrial ecosystem processes (Duncanson et al. 2020).

Prospects of GEDI LiDAR and Sentinel-2 data
Previous studies have shown that a waveform LiDAR system with ~25 m horizontal resolution and ~1 m vertical accuracy can support accurate measurement of vertical canopy structure (± 10%) in presence of moderate slopes of terrain (Potapov et al. 2021). These requirements fit well into the current design of GEDI for estimating canopy cover (Kokalj and Mast 2021), LAI (Estévez et al. 2020) and foliage height diversity (Tang et al. 2019). We thus used the directional gap probability profile which is derived from the waveforms provided in the L1B product and locations of reflecting surfaces to extract vegetation density from each GEDI waveform used to generate the structural metrics of PAI and PAVD. The established relationship between PAVD and the optical spectral bands of Sentinel-2 are primarily dependent on the strong capacity of the multiple optical spectral bands of Sentinel-2 to capture the structural and functional characteristics of vegetation. In particular, the rededge bands are highly related to vegetation leaf properties, such as leaf area index, biomass, and structural carbohydrates (Martin-Gallego et al., 2020), while NIR and SWIR bands are sensitive to the changes of water Figure 9. Validation of the understory vegetation density estimation using independent field-collected samples (n = 86). ** denotes a significant relation at the 0.01 level and the shading indicates 95% confidence interval of the linear fitting.
content, lignin, starch, and nitrogen (Fassnacht et al., 2016). These optical spectral bands have been successfully used to estimate forest biomass (Chen et al. 2021), canopy density (Qi and Dubayah 2016) and forest height (Henrik J and Roland 2016) in recent studies as well. We also rely on the close linear relationship between PAI and PAVD that enables the field measured PAI conversion to PAVD, which provides the independent samples to validate the resultant map of understory vegetation density.

Limitations of understory vegetation density mapping
There are several issues that may limit the applicability of our approach for mapping vegetation understory density. Firstly, in relation to the field survey, the masson pine in our study area was planted at the same time resulting in all trees having almost same height and crown sizes, thereby showing strong homogeneity of the forest canopy. This makes it possible to apply a constant value over the study areas to estimate the increment in the PAVD resulting from seasonal variations in the forest canopy between the growing season and non-growing season. Our approach applies to coniferous forest, however, other types of forests could have different responses resulting from species-or community-level differences in plant physiology, leading to variable seasonal increment in the PAVD for the forest canopy between non-growing season and growing season. In such case, the method developed in our study should be used with caution given the spatial variations of PAVD increment, and more research is needed to further advance the feasibility of the method developed here. Secondly, current GEDI LiDAR data using the laser energy distribution (nearinfrared wavelengths) could have limitations when forest canopies are dense such as in the case of rainforest (Bauer, Knapp, and Fischer 2021). Under such conditions, the complementary use of L-band SAR data may advance the capacity for estimating understory vegetation density (Vittucci et al. 2021). Moreover, other machine learning approaches used for upscaling such as random forest, and extreme gradient boosting, should be considered to test if consistent results are obtained when estimating understory vegetation density. Thirdly, there are some inevitable systematic errors associated with the estimation process. For example, the residual geolocation error of the GEDI data (25 m per footprint) affects the calibration of the Sentinel-based (10 m per pixel) model and validation at foot-print level. The different acquisition time of GEDI LiDAR data, Sentinel-2 images may bring in uncertainty on the estimations of the related variables, however, the maximum 11-day time interval between field observations and remote sensing data acquisition ensured that vegetation structure did not change significantly between the acquisition of the different types of data.
In the non-growing season, the acquisition dates of the three data sources were more distant in time, but as the vegetation structure remains unchanged during this period (evergreen coniferous forest without understory vegetation) this has no impact on the design of the analysis. Additionally, the bare soil background including the shading caused by tree canopy and topography could cause biases in the reflectance values of the Sentinel-2 data.

Conclusion
Based on the differences in vegetation phenology, our study proposes an effective and practical approach to quantify understory vegetation density using GEDI and Sentinel-2 data in a coniferous forest setting in China, characterized by a subtropical monsoon climate. We produce a map of understory vegetation density that performed well when compared against independent field samples (R 2 = 0.52 and rRMSE = 21%, p < 0.01). In particular, our approach has shown a promising potential for mapping understory vegetation density for a temperate conifer ecosystem (e.g. Carpathian, Da Hing-gan-Dzhagdy, and Siberia) with sparse trees, favoring the laser light penetrate forest canopy to detect understory vegetation. Such large-scale understory vegetation mapping provides a way toward quantifying missing information on the understory vegetation stratum for a complete assessment of forest biomass resources. However, the robustness and accuracy of estimations in different forest ecosystem types should be further tested, taking into consideration different characteristics of forest types, such as boreal and temperate forest, broadleaf and needles forest, and forests of various densities. In summary, with the combination of GEDI LiDAR data and Sentinel-2 imagery our study succeeded in mapping spatially continuous understory vegetation density, which holds promises for improving the estimation of terrestrial carbon storage and providing the basis for a better understanding of various forest ecosystem processes across larger areas.