Using Sentinel-2 and canopy height models to derive a landscape-level biomass map covering multiple vegetation types

Vegetation biomass is a globally important climate-relevant terrestrial carbon pool and also drives local hydrological systems via evapotranspiration. Vegetation biomass of individual vegetation types has been successfully estimated from active and passive remote sensing data. However, for many tasks, landscape-level biomass maps across several vegetation types are more suitable than biomass maps of individual vegetation types. For example, the validation of ecohydrological models and carbon budgeting typically requires spatially continuous biomass estimates, independent from vegetation type. Studies that derive biomass estimates across multiple vegetation or land-cover types to merge them into a single landscape-level biomass map are still scarce, and corresponding workflows must be developed. Here, we present a workflow to derive biomass estimates on landscape-level for a large watershed in central Chile. Our workflow has three steps: First, we combine field plotbased biomass estimates with spectral and structural information collected from Sentinel-2, TanDEM-X and airborne LiDAR data to map grassland, shrubland, native forests and pine plantation biomass using random forest regressions with an automatic feature selection. Second, we predict all models to the entire landscape. Third, we derive a land-cover map including the four considered vegetation types. We then use this land-cover map to assign the correct vegetation type-specific biomass estimate to each pixel according to one of the four considered vegetation types. Using a single repeatable workflow, we obtained biomass predictions comparable to earlier studies focusing on only one of the four vegetation types (Spearman correlation between 0.80 and 0.84; normalized-RMSE below 16 % for all vegetation types). For all woody vegetation types, height metrics were amongst the selected predictors, while for grasslands, only Sentinel-2 bands were selected. The land-cover was also mapped with high accuracy (OA = 83.1 %). The final landscape-level biomass map spatially agrees well with the known biomass distribution patterns in the watershed. Progressing from vegetation-type specific maps towards landscape-level biomass maps is an essential step towards integrating remote-sensing based biomass estimates into models for water and carbon management.


Introduction
The regular assessment of vegetation biomass is important for quantifying carbon stocks, which contributes to an improved understanding of carbon cycling (Houghton, 2005) and the sustainable use of biomass as energy-source and for forest inventory tasks (Koch, 2010). Before the background of climate change initiatives such as the Kyoto protocol, many countries have committed themselves to regular carbon stocks reporting, which often bases on biomass inventories (Patenaude et al., 2005). Hence, many countries regularly assess their vegetation (or at least forest) biomass at a national scale and, in some cases, within smaller spatial units. Biomass inventories often base exclusively on field plots, but additional information from remote sensing sources has been increasingly integrated to enhance spatial detail and inventory efficiency. Various remote sensing data types have been used to map vegetation biomass across a wide variety of ecosystems, including grasslands (e.g., Filho et al. 2019, Sibanda et al., 2015Wang et al., 2019;Yang et al., 2018a), shrublands (e.g., Chang et al., 2018;Li et al., 2017;Viana et al., 2012), agricultural fields (e.g., , savannas (Forkuor et al. 2020) and several forest types (e.g., Dong et al., 2003;Drake et al., 2003a, Lu et al. (2013). In forests, structural information acquired by active sensor systems such as light detection and ranging (LiDAR) and Radar, often performs better than passive optical systems (Fassnacht et al., 2014). Furthermore, information on vegetation structure obtained by optical photogrammetric approaches using structure-from-motion algorithms with spaceborne (e.g., Fassnacht et al., 2017;Persson et al., 2013) and airborne data (e.g., Messinger et al., 2016;Ota et al., 2015) has been successfully applied to estimate forest biomass at local to regional scales. These studies often reported similar accuracies as studies based on more cost-intensive LiDAR data. These photogrammetric approaches were also applied successfully for estimating biomass in grasslands and shrublands (Cooper et al., 2017;Cunliffe et al., 2016). However, for grassland and shrubland ecosystems, spectral information from spaceborne sensors alone can deliver reasonable biomass estimates (Yang et al., 2018a;Viana et al., 2012).
Many earlier studies either focused on estimating biomass at local extents using fine grain remote sensing data or at large extents using coarser resolution data. Since the launch of the Copernicus programme of the European Space Agency, fine grain remote sensing data at approximately 10 m pixel size are globally available through the Sentinel-1, Sentinel-2 and TanDEM-X satellites. This creates new possibilities to obtain fine-grain biomass estimates over large areas.
Most studies so far estimated biomass for only one vegetation type (e. g., grasslands or shrublands or forests), while studies assessing vegetation biomass across a complete landscape remain sparse. One example is the study of Mitchard et al. (2009), who modelled aboveground woody biomass across a diverse African landscape using ALOS PALSAR Radar data. Likewise, Ji et al. (2012) assessed biomass of forests, shrubs, and herbaceous areas in the Yukon Flats ecoregion of interior Alaska using Landsat data. Baccini et al. (2004) estimated landscape-scale biomass in California using coarse spatial resolution remote sensing data combined with topographical and climatological data.
The assessment of vegetation biomass across an entire landscape is of particular interest for some scientific and management applications. For example, biomass maps obtained with remote sensing techniques represent an independent source of validation (or comparison) for biomass estimates obtained with process-based ecohydrological model simulations. Usually, these simulations are validated using field plots or flux measurements (Yang and Zhang, 2016). Validations using field plots or fluxes have limitations in terms of their spatial coverage and remotely sensed biomass estimates offer a unique opportunity to override those limitations. By improving the parameterisation of such ecohydrological models, fine-grain, landscape-level biomass maps can help to refine important information related to the water use efficiency of the land cover types represented in the ecohydrological model.
A second important application for landscape-level biomass maps is to provide information about the amount and distribution of carbon currently stored and produced in a defined area. This information can be valuable for carbon modelling and spatial planning processes related to rural development initiatives (e.g., related to sustainable biomass consumption as an alternative to fossil fuel-based systems). The importance of knowing the spatial aspects of biomass consumption and production (which can be provided by repeated remote sensing-based biomass estimates) has been discussed, for example, by .
Despite the apparent relevance of landscape-level biomass maps, remote sensing studies providing biomass estimates across multiple vegetation or land-cover types are still scarce and corresponding workflows have to be developed. Here, we present a workflow to estimate vegetation biomass at the landscape-level in South-Central Chile. We used structural information collected from airborne LiDAR and TanDEM-X sensors in combination with Sentinel-2 optical data to separately estimate vegetation biomass of four vegetation types (grassland, shrubland, native forests, pine plantations). In a subsequent step, we created a land-cover classification of the four vegetation types to integrate all biomass estimates into a single landscape-level biomass map. We aimed to develop a straightforward, reproducible workflow leading to parsimonious models and robust biomass estimates.

Study area
The study area corresponds to the Cauquenes River watershed (1625 km 2 ), a tributary of the Maule river located on the eastern slope of the coastal mountain range in south-central Chile. The area is part of the Chilean Mediterranean biodiversity hotspot, and its native ecosystems are classified as endangered according to the IUCN Red List of Ecosystems (Alaniz et al., 2016).
The climate is listed as warm-summer Mediterranean (Csb) in the Köppen classification. The watershed is classified as water-limited according to Budyko's dryness index (Mcvicar et al., 2012). Native forests are mostly degraded secondary forests consisting of Nothofagus glauca and Nothofagus obliqua and broadleaved sclerophyllous species such as Cryptocarya alba, Quillaja saponaria, Lithraea caustica, Azara petiolaris, and Maytenus boaria. Acacia caven thorny trees mainly dominate the shrublands. We refer to the Acacia ecosystems as shrublands, as in our study area most of the Acacia trees have a growth habitus of shrubs and hardly reach heights above 3 m while the canopy cover is typically low.
The slopes in the area range from 8 % to more than 50 % and reach a maximum elevation of 731 m above sea level. Alfisols and inceptisols soils with textures ranging from sandy loam to clay are commonly present in the area (CIREN (Centro de Información de Recursos Naturales), 1997).

Methods
We merged four types of field inventories and three types of remote sensing data to derive a vegetation biomass map for the study area using random forest models combined with an automated feature selection using the "Variable Selection using Random Forests" (VSURF) algorithm (Genuer et al., 2015). All processing was conducted in R (R Core Team, 2017). The corresponding codes are available on GitHub (Supplementary material 1).

Field data
Biomass reference data were collected during field campaigns in 2018 and 2019. We focused on grasslands, shrublands, pine plantations, and native forests. All field data were collected in circular plots with a radius of 15 m. The plots' size was selected to approximately match the spatial resolution of the applied remote sensing data. Table 1 shows the number of samples and summary statistics for each vegetation type. We attempted to spread the field sampling across the complete study area; however, due to quite severe accessibility restrictions (private, fenced land), some land-use types (particularly grasslands) were sampled at only a few clustered locations. Nevertheless, we ensured that a possibly large gradient of grassland situations -representing the watershed -is covered in the data.
Grasslands -To derive reference biomass estimates for grasslands, we applied a plate-meter instrument along transects located in circular Table 1 Summary of biomass estimates of field plots.
plots with a radius of 15 m. We intersected the circular plots with four transects, one crossing the plot centre in North-South direction, one in East-West direction and two more with a 90 • shift, resulting in 8 rays extending from the plot centre to the perimeter of the plot. Along the rays, we collected eight rising-plate-meter measurements in a 2 m interval and at distances from 1 m to 15 m from the circle. This resulted in a total of 8 * 8 = 64 rising-plate-meter measurements per plot. The rising-plate-meter measures the height of the compressed grass at a given location and hence its canopy resistance. The plate-meter measurements were transformed to biomass values in ton ha − 1 by calibrating the height measurements with a set of destructive samples collected randomly across grassland areas close to our field plots. For the destructive samples, we first conducted the standard plate-meter measurements and noted the height. Then the plate of the plate-meter was used to define the area from which we completely harvested all the aboveground vegetation material. We transported the material in sealed bags to the laboratory and measured the dry weight of the collected plant material after drying the samples for 48 h at a temperature of 50 • C. Finally, we established an allometric equation between the obtained dry-weights and the plate-meter height measurements (Supplementary Material 2). This allometry (Spearman correlation = 0.76; RMSE =1.68 ton ha − 1 ) was then applied to calculate the biomass estimates for the field plots by first deriving a mean biomass estimate for the plate-meter measurements and then scaling these values up to hectare scale using an expansion factor defined as: expansion factor =10,000 m 2 / area of the plate-meter disc in m 2 .
Shrublands -For shrublands, we measured the biomass of shrubs and the surrounding grasslands independently. For shrubs, we followed the instruction of Milla et al. (2013) to estimate the biomass of Acacia Caven shrubs, which was by far the most frequently occurring species in the studied shrublands. We recorded the minimum and maximum crown diameter and height of all living shrubs in the 15 m plot. The minimal crown diameter and the height were measured in the field while the maximum crown diameter was manually measured from unmanned aerial vehicle (UAV; DJI Phantom 4) data. These three metrics were then fed into the allometric equation provided in Milla et al. (2013) to estimate the biomass of the individual shrubs in the plots. The individual estimates were subsequently summed and scaled up to ton ha − 1 values using the same extension factor as described for the forests below. For estimating the biomass of the grass patches between the shrubs, we followed the same approach as for the grassland plots but took measurements only along two transects instead of four. Biomass estimates for shrub and grass patches were then summed to derive the final shrubland biomass reference values.
Plantation and native forests -In forested areas, we recorded the DBH, height, and species of all trees with a DBH greater than 5 cm. For each tree, we used a suitable species-specific allometric equation to derive the aboveground biomass via the tree volume (Drake et al., 2003b;Milla et al., 2013) (Supplementary material 3). For individuals without available biomass or volume functions, we used functions proposed for species of the same genus; if there were no function for species of the same genus, the allometric function of the forest type Roble-Raulí-Coihue was used (Gayoso et al., 2002). The transformation from volume to biomass was made from the specific density of the wood (Mg m − 3 ). Wood specific density data were obtained from the literature (e.g., Gutiérrez and Huth, 2012). Finally, we estimated the biomass values in tons ha -1 by summing the biomass of all trees in a plot and then multiplying the sum with an expansion factor defined as: Expansion factor =10,000 m 2 / pi*15 2

Remote sensing data
To establish the random forest models, we derived predictor variables from two cloud-free Sentinel-2 scenes (downloaded as surface reflectance product), acquired on the 29th of January 2019 (Chilean summer) and on the 23rd of July 2019 (Chilean winter). We included only the spectral bands with 10 and 20 m pixel size (10 bands in total), with wavelengths ranging from 490 to 2190 nm. We further included a digital canopy height model (CHM). The latter was created by subtracting a LiDAR-derived digital terrain model (5 m pixel size) from a TanDEM-X digital surface model (12 m pixel size) provided by DLR (Zink et al., 2014). The TanDEM-X scenes applied to create the DEM product were collected between January 2012 and August 2014. Furthermore, we included a slightly outdated LiDAR-based CHM with 5 m pixel size acquired in 2009 (derived from a discrete point cloud with an average return density of approximately 5 pts m − 2 ).
We derived additional spectral indices and biophysical canopy traits from the Sentinel-2 optical bands. The biophysical canopy traits included estimates for leaf area index (LAI), Chlorophylls, Canopy water content, Fraction of Photosynthetically Active Radiation (FPAR), and Fractional Vegetation Cover (FCV). Trait estimates were obtained using the SNAP toolbox of ESA (http://step.esa.int/main/download/snapdownload/). Meanwhile, the vegetation indices included GNDVI, IRECI, NDI45, NDVI, SAVI, and TNDVI (see Supplementary Material 4). We further added topographic metrics derived from the LiDAR-based digital terrain model, including slope, aspect, the Topographic Water Index (TWI), Channel Network Base Level, Channel Network Distance, Convergence Index, Slope Length factor (LS factor), Plan Curvature and Profile Curvature. These variables relate to the morphometry and geomorphology of the landscape. Thus, they are proxies for soil water retention and may indirectly relate to the amount of biomass an area can accumulate.
The Grey Level Co-Occurrence Matrix (GLCM) textural metrics "mean", "variance" and "dissimilarity" were calculated for all predictor variables with a window size of 3 × 3 pixels to include textural information. We selected a window size of 3 × 3 pixels as we assumed that, given the comparably coarse pixel sizes of the input data (5− 20 m), larger window sizes are unlikely to reveal textural vegetation patterns that are relevant for successfully estimating biomass.

Landcover classification map
We calculated a land-cover classification map containing our four Table 2 Selected predictor variables for each vegetation type. Predictor variables coloured grey were dropped after the GAM analyses. target classes and a set of additional land-cover classes to provide the spatial context needed to obtain a landscape-level biomass map (see Table 3 for a full list of the considered classes). We included optical, topographic and topological information to derive the land-cover map. The optical information consisted of three Sentinel-2 composites of surface reflectance data corresponding to the whole year (median value of all available 2019 scenes), the summer season (median value for all scenes acquired between January 2018 and March 2019), and the winter season (median value for all scenes acquired between June and September 2019). We also included vegetation indices, Tasseled Cap transformations, and textural variables derived from the Grey Level Coocurrence Matrix (GLCM; Haralick et al., 1973). As topographic information, we derived macro-and micro-topographic indices using SAGA GIS (Conrad et al., 2015) with a digital elevation model (DEM; SRTM) obtained from the Google Earth Engine (Gorelick et al., 2017). Finally, we used distances to roads obtained from OpenStreetMap as topological information (OpenStreetMap contributors, 2017) (detailed list of predictors in Supplementary Material 5).
We used random forest (RF) classifications with Recursive Feature Elimination (RFE, Guyon et al., 2002) as variable selection to derive the land cover maps. RF was tuned using the standard 'superClass' function of the Rstoolbox R-package (Leutner et al., 2018).
For training the algorithm, we used 5600 sample points obtained via visual interpretation of high-resolution imageries available from Google Earth Pro. We considered a total of 12 land-cover classes, including the four target classes and 8 additional commonly occurring classes (see Table 3 for number of samples per class).

Remote sensing-based biomass models
Biomass regressions -We used random forest regression models (Breiman, 2001) combined with the VSURF variable selection procedure (Genuer et al., 2015) to relate the biomass reference estimates collected in the field and the available remote sensing data. We additionally used generalized additive models (GAM) (Wood, 2011) for interpreting the relation of the selected variables with the response and for identifying remaining interrelated predictor variables.
The variable selection algorithm VSURF first filters out unimportant predictor variables based on random forest mean variable importance values. Then, an iterative optimization is conducted to select the variables most suitable for predicting the response variable. VSURF suggests two sets of variables, one optimized for interpretation (i.e. some predictors may be redundant but equally important for predicting the response) and another one optimized for prediction (i.e. focusing solely on obtaining a possibly high model fit) (Genuer et al., 2015). Here, we selected the variable subset optimized for prediction accuracy.
Subsequently, we trained random forest models using the VSURF selected predictors, with the number of trees (ntrees) fixed to 500 and the mtry-parameter to number of predictors / 3. With this, we followed the findings of earlier investigations which stated that these standard settings for mtry and ntrees obtain good accuracies in most cases (Oshiro et al., 2012;Probst et al., 2019). We report Spearman correlation, root mean square error (RMSE) and the normalized root mean square error (nRMSE; normalized with the range) between random forest predictions and our biomass reference values. The random forest models were fed with all available samples and the validation was conducted using the internal bootstrap procedure of random forest which from our experience is sufficient to obtain realistic estimates of the model performance.
However, to confirm this, we additionally conducted an iterative validation procedure based on repeated data-splits. In this procedure we randomly selected 80 % of the reference samples as training and the remaining 20 % of the samples as validation data. In this additional validation, spearman correlation coefficients and nRMSE values were calculated using only the 20 % validation data not used during training. This procedure was repeated 100 times to capture the variability in model performances caused by differing combinations of training and validation samples. During the iterative validation procedure, we also always predicted the RF model to the full area to obtain a biomass map for each iteration. This allowed us to calculate the coefficient of variation for each pixel's biomass estimates (standard deviation of the 100 estimates of a pixel divided by the mean of the 100 estimates). The coefficient of variation describes the variability of the model predictions depending on the training data and hence gives an idea of the stability of the model during the 100 iterations (e.g., Lopatin et al., 2016). The final biomass prediction maps for each of the four vegetation types were obtained by training the models with all samples and predicting the model to the full study area using the selected predictor variables. Biomass maps were obtained at a pixel size of 20 m (Sentinel-2 scale) and predictor layers with a higher spatial resolution were resampled to 20 m using the standard settings of the resample()-function of the raster package in R.
To allow for an improved interpretation of the selected predictor variables, we calculated GAM models between the biomass response and the VSURF selected predictors and plotted each predictor's response curves. We applied the GAM algorithm implemented in the R-package "mgcv" (Wood, 2011) using the "REML" (restricted maximum likelihood)" automatic smoothing parameter option. We also activated the internal variable selection to identify unnecessary predictor variables that were not dropped by VSURF. The same workflow was applied to each of the four vegetation types.

Landscape-level biomass map
As the final step of the workflow, the biomass maps of the four vegetation types were merged into a single map. For this, we combined the information of the land-cover classification map (see section 4.3) with the individual biomass maps for grassland, shrubland, native forests and pine plantations (see section 4.4): For each pixel of the landcover classification map, we checked whether it indicated one of the four vegetation type classes. If this was the case, the spatially corresponding biomass estimate of the corresponding vegetation type biomass map was assigned to the pixel (for example, if a pixel was  classified as "shrubland", we filled the pixel with the biomass estimate of the shrubland biomass map at the same location). If the land-cover map indicated a non-relevant land-cover type (e.g., urban area), we filled it with an NA value.

Biomass models
Random forest models for all four vegetation types resulted in high accuracies ( Fig. 2 and 3). Table 2 shows the predictors selected by the VSURF algorithm.
As depicted in Fig. 2 (Fig. 3). The corresponding nRMSE results show a similar behaviour, with moderate stability and nRMSE values ranging between approximately 10 and 27 %. Grasslands show again a slightly higher variability than the other three vegetation types.
Consistently high model accuracies were obtained for all vegetation types when using all samples to build the model with Spearman correlations ranging between 0.8 (native forests) and 0.84 (shrublands) and nRMSE values between 8.9 % (pine plantations) and 15.7 % (shrublands) (Fig. 3). The scatterplots indicate a reasonable fit for all vegetation types with no clear signs of saturation or biases. In forests, pine plantation performed better than native forests, while the two non-forest vegetation types show similar accuracies.
The predictor variables selected by the VSURF algorithm included Sentinel-2 based predictors (original bands, vegetation indices, or texture metrics) for all vegetation types. Canopy height information was selected for the two forest models and the shrublands. For the pine plantations, Sentinel-2 predictors corresponding to the summer acquisition were selected, while for the native forests only Sentinel-2 predictors from the winter acquisition were selected. Only one biophysical indicators (Fractional vegetation Cover) and one topographic metric (Profile curvature) were selected, both for the shrubland biomass model.

Land-cover map
The land cover classification reached an overall accuracy (OA) of 0.83 (±0.15) and a kappa index of 0.81. Table 3 shows the corresponding confusion matrix, while Table 4 depicts detailed accuracy measures per class. The summarized version of the map showing only the four vegetation types considered in the study is presented in Fig. 1.
The land-cover classification generally shows high accuracy, with better class-specific accuracy for pine plantations and native forests than for the shrubland and grassland classes, which depicted some confusion between each other. The spatial distribution of the vegetation types in the watershed is clearly reflected in the biomass map. High biomass predictions in the western (and partly southern) area correspond to pine plantations and native forests. Meanwhile, lower and intermediate biomass predictions in the central and eastern areas correspond to grasslands and shrublands (Fig. 4).

Landscape biomass map
The variability of the biomass estimates of the individual pixels obtained during the iterative model runs is low throughout the largest part of the study area, with most areas showing a variability of below 10 % (Fig. 4, right panel).

Discussion
We demonstrated a workflow to estimate aboveground vegetation biomass at the landscape level across several vegetation types using a combination of spectral and structural information derived from  satellite and airborne remote sensing data. The workflow proved to deliver sound results for all examined vegetation types, reaching high accuracies in terms of Spearman correlation and nRMSE, and depicting no obvious biases or saturation effects.
The less accurate results for the native forests compared to pine plantations agree with those of earlier investigations in Chilean native forests (e.g., Maack et al., 2015;Fassnacht et al., 2014). This may relate to a saturation of the height ~ biomass relationship in these forests.

Fig. 2.
Violin plots indicating the distribution of the performance metrics obtained during the iterative validation procedure (100 runs with splits into 80 % training and 20 % validation samples). Black dots illustrate individual biomass estimates, boxplots inside violin plots indicates the percentiles of the distribution values (5, 25, 50, 75, 95 %), and the width of the violin indicates the distribution of the values. The results refer to random forest models trained with predictor variables summarized in Table 3 and after dropping the unnecessary variables identified with the GAM analysis. Fig. 3. Scatterplots between predicted and observed biomass values for the four considered vegetation types and corresponding model performances of the models assessed using all samples. The results refer to random forest models trained with predictor variables summarized in Table 3 and after dropping the unnecessary variables identified with the GAM analysis.
Once established in the dominant canopy layer, the broadleaved trees of the Nothofagus forests tend to accumulate more biomass in the stems and invest given resources in the broadening of the tree crown and less to further increase their height. Furthermore, the native forests are generally more heterogeneous in terms of structure and species diversity (Braun et al., 2017), which complicates the accurate prediction of their biomass. Moreover, species-specific allometric equations were missing for a few of the native species. This may have contributed to a slightly higher uncertainty in the reference biomass estimates as compared to the reference estimates for Pinus radiata.
The results for the pine plantations showed the lowest nRMSE value and second highest Spearman correlation. This is expected, as the examined plantations consist of only a single coniferous tree species (Pinus radiata) planted in homogeneous and regular stands. In these stands, height is an accurate predictor for the overall size of trees, and corresponding good model performances have been reported in numerous comparable settings using height metrics derived from LiDAR data and other sources (e.g., González-Ferreiro et al., 2012;Tojal et al., 2019). Even higher accuracies are possible to estimate pine forests' biomass if the temporal off-set between the acquisition of the field and the structural remote sensing data is smaller than in this study (e.g., Navarro-Cerrillo et al., 2017). Hence, the results for both forest types could likely be further improved by replacing the applied LiDAR and TanDEM-X CHM with a CHM that better matches the dates of the field campaigns. However, access to high spatial resolution CHMs across larger areas is still limited by acquisition costs as most established options to collect such data over large areas (LiDAR or photogrammetric based point clouds) require airborne campaigns. The latter are costly or even not possible at all if the corresponding sensor systems are not available. This may be a common situation in many economically less developed countries and rural areas. Very high resolution stereo-satellite images can be used as an alternative, globally available data-source to derive high spatial resolution digital elevation models (Fassnacht et al., 2017). However, the extraction of a corresponding digital terrain model which is required to calculate canopy heights is challenging with these data. It may, however, be possible in areas with subtle topography and comparably sparse vegetation cover (see for example Hosseini et al., 2020).
The performances of the models for shrublands were in the same range as reported for earlier studies focusing on Mediterranean and xeric shrublands and applying exclusively Sentinel-2 data (e.g., Aranha et al., 2020), Sentinel-1 and PALSAR data (Chang et al., 2018) and LiDAR data (Li et al., 2017). The same applies to the grassland models that performed similarly or even better as earlier studies which estimated grassland biomass using Sentinel-2 data alone (Filho et al., 2020) or combined with Sentinel-1 and other optical sensors (Naidoo et al., 2019;Wang et al., 2019).
The comparably high variability of the performance metrics during the iterative model runs, particularly for the grassland models (Fig. 2), is likely to be an effect of the comparably low sample size available in this study. Grasslands and shrublands have the overall lowest number of samples in our study. Other studies in a similar setting have also found a positive relationship between sample size and model performance stability (e.g., Fassnacht et al., 2014, Fassnacht et al. 2018) and accuracy (e.g., Sterenczak et al., 2018). Due to the high time effort required to collect reference samples, particularly when focusing on more than one vegetation type, we could not sample more reference data in our study. However, the median values of the performance metrics in the iterative validations agree reasonably well with the random forest validations obtained when using all samples to build the models (Fig. 3). This indicates that the reported model accuracies are realistic.
The suggested workflow efficiently selected a sparse subset from the original predictor space consisting of 216 variables. In the cases of grasslands and pine plantations, the variable sets selected by VSURF consisted of only two and three uncorrelated predictors, which led to good random forest model performances, exceeding the performance of random forest models trained with all predictors (results not shown). This generally confirms the efficiency of the VSURF algorithm in identifying sparse subsets of well-performing and meaningful predictors (e. g., Chavent et al., 2019;Speiser et al., 2019). However, for the native forest and shrubland models, the additional GAM analysis enabled us to identify additional intercorrelated variables and to reduce the final set of predictors to 3 and 4 variables for native forests and shrublands, respectively. The corresponding random forest models showed no loss in performance compared to the models using all VSURF selected variables (results not shown). Hence, this combination of VSURF and a subsequent verification of the selected variables with GAM seems to be an efficient approach to identify a sparse set of predictor variables to build parsimonious, well-performing models.
Comparing the variables selected for each forest type, the most important variables for both forests were canopy height metrics. This confirms the frequently reported importance of height metrics for estimating forest biomass with remote sensing devices (e.g., St-Onge and Vega, 2008;Tonolli et al., 2011). The corresponding response curves indicate that the canopy height contributed in a close to linear and positive way to explain the biomass values (Table 2, Supplementary Material 6). Interestingly, in the model for native forests the comparably outdated LiDAR CHM got selected rather than the more recent TanDEM-X CHM. We assume that in native forests, the higher spatial resolution of the LiDAR CHM was crucial for accurately predicting biomass by delivering information about small-scale variability in height that could not be depicted in the coarser spatial resolution TanDEM-X CHM (© DLR). Due to the comparably slow growing speed of the native forests, the acquisition date of the LiDAR CHM (10 years off-set) may have been less problematic than for the pine plantations in which no predictor related to the LiDAR DEM got selected. This seems plausible, given the fast growth of pine plantations and the corresponding short rotation periods of approximately 22-years (Jélvez et al., 1990).
For native forests, a Sentinel-2 band from winter and a texture variable were selected along with the CHM, while in pine plantations only Sentinel-2 bands of the summer image remained in the final model. One reason for this might be that native forests in this part of Chile consist of a mixture of evergreen and deciduous tree species (Frank and Finckh, 1999). Hence, winter images may have contributed to differentiate better these species types and hence mirror the species composition of the stands, which may support the accurate estimation of the local biomass. Some earlier studies reported that species or forest type information can improve remote sensing-based biomass estimations in forests (e.g., Anderson et al., 2008;Fassnacht et al., 2017). In the native forests, the texture metric based on the GNDVI indicates that information about the canopy's homogeneity may have also contributed to explain the variation in biomass. The negative correlation of the GNDVI variance with biomass (Table 2) suggests that more homogeneous canopies may relate to older native forests with corresponding higher biomass values. However, the corresponding response curves do not show a truly clear trend (Supplementary Material 6).
In case of the plantation forests, summer reflectance may have supported the discrimination between stands with similar canopy heights but differing canopy densities as the selected SWIR band (S2B11-summer, wavelength at 1610 nm) should contain information about canopy leaf structure, and the red edge band (S2-B6-summer, wavelength at 740 nm) represents general vegetation vigor and density (Ollinger, 2011). Both bands show a negative correlation with biomass (Table 2). While for the SWIR band the negative trend is clearly depicted in the response curves, the red edge band shows a more hump-shaped pattern (supplementary material 6). The trend of the SWIR band is likely related to a reduced crown density in older plantations with higher biomass as compared to very dense, younger plantations with lower overall biomass. The hump-shaped pattern in the red edge band may be caused by very young plantations with low biomass values where crown closure has not yet been reached. Hence, both very young and very old plantation may show an increased influence of the soil signal and a corresponding lower reflectance in the NIR. However, a more detailed analysis, would be required to draw final conclusions on this.
Contrary to the results from the other vegetation types, CHM variables were not selected in grasslands. The limited ability of height metrics from LiDAR data to estimate herbaceous biomass has been reported before (e.g., Li et al., 2017). Our results suggest that reasonable biomass estimates for grasslands can be derived using a combination of field reference data and Sentinel-2 data only. The most important identified predictor is the summer SWIR2 band of Sentinel-2 (2190 nm). The SWIR2 band shows a very high and linear negative correlation with the grassland biomass (Table 2, Supplementary Material 6). We believe that this is due to the sensitivity of the SWIR region to non-photosynthetic vegetation (NPV). In our study area, most grassland areas have dried out by late spring. Therefore, most of the standing grass is not photosynthetically active, and typical greenness indices using the red and near infrared region are less suitable to quantify its biomass. NPV typically has a high lignin and cellulose content, which affects the SWIR reflectance (Li and Guo, 2018). Hence, the high sensitivity of the SWIR to the biomass observed here seems plausible. The second selected band is a red edge band (Band 5, 705 nm) from the winter acquisition, which also shows a negative correlation with biomass (Table 2).
However, its corresponding response curve suggests a more complex relationship (Supplementary material 6), which would require a more detailed analysis to draw final conclusions. One assumption is that the greenness information of the red edge band may help to track differences in the species composition of the grasslands and thereby contribute to explain some of the variance that cannot be explained by the NPV signal captured in the SWIR band. Moreover, studies focusing on photosynthetically active grasslands typically report the highest importances for NIR bands and vegetation indices incorporating NIR bands (e.g., Filho et al. 2019, Sibanda et al., 2015. Some of the grassland areas in the study areas may actually have higher photosynthetic activity in winter than in summer due to the increased precipitation in winter. Future studies may also examine the added value of integrating Sentinel-1 data, which were for example found suitable to estimate grassland biomass by Wang et al. (2019).
In the shrubland model, four variables were selected. The most important variable is the Sentinel-2 based fractional vegetation cover (summer acquisition), which shows a strong positive correlation with biomass (Table 2, supplementary material 6). We assume that the fractional vegetation cover is a good indicator of the green shrub cover in contrast to the dried-out grassland background. Percent cover of shrubs was also reported as the best predictor to estimate biomass by Li et al. (2015). Other studies focusing on dry shrublands furthermore identified NDVI -which also relates to fractional vegetation cover -as good indicator to estimate shrubland biomass (e.g., Aranha et al., 2020;Viana et al., 2012). The second most important predictor is the LiDAR-based CHM. Similarly, as in the case of the native forests, the higher spatial resolution of the LiDAR CHM may have contributed to its selection over the more up-to-date TanDEM-X CHM. The CHM predictor also shows a positive relationship with biomass, which seems plausible as higher heights indicate larger shrubs with higher biomass. Earlier studies also suggested that canopy volume is a good indicator of biomass (Ni-Meister et al., 2010;Li et al., 2015;Olsoy et al., 2014;Greaves et al., 2015as discussed in Li et al., 2017. The combination of the two most important predictors of this study, fractional cover and canopy height, can together be interpreted as an approximation of shrub volume and hence confirm these earlier observations. The third selected variable is a texture metric (variance of the red-edge band 6 (740 nm) from the S2 summer acquisition). It shows a negative correlation with biomass, which suggests that a higher spectral variability is related to lower biomass values. This may be related to the homogeneity of shrub cover in the plots where plots with very high and hence homogeneous cover also reach the highest biomass values. Finally, the selected topographic metric "Profile Curvature" mainly depicts larger topographical features at the applied spatial resolution and is hence unlikely to be directly related to the structure of the shrubs themselves. A visual inspection suggests that the metric may have contributed to separate denser, older shrublands which are often found in more pronounced terrain situations from younger and sparser shrublands, located often in flatter areas.
The final landscape-level biomass map shows realistic patterns of biomass distribution with coarse patterns following the land-cover mosaic of the study area, but also showing smaller scale patterns related to the variability of biomass within the individual vegetation types. Fig. 5 depicts examples of apparent differences in biomass ranges between the two forest types and the two non-forest vegetation types (e. g., bottom row). Here, we observe a cut in the landscape between pine plantations in the West and a more open landscape in the East. However, more subtle differences in biomass can also be differentiated. For example, in the open landscape, a slightly higher biomass accumulation is apparent along the river in the Eastern part of the image, which may relate to a higher density of shrubs in the shrublands close to the river. In the middle row of Fig. 5, several larger patches of native forests and pine plantations with varying biomass quantities are displayed. The general impression of the patterns agrees with the BING high-resolution RGB satellite image which shows forests with larger crowns in areas with higher biomass. In the top row, biomass differences between younger second-growth native forests and older pine plantations are well depicted. Some deviations between the visual impression of the RGB BING images and the depicted biomass maps may be explained with time gaps between the BING images and our biomass estimations. For example, the subsets in the middle row of Fig. 5 shows comparably high biomass estimates for plantation areas at the eastern border of the image. In contrast, the RGB image indicates a recently harvested area.
One drawback of our landscape-level biomass map is that we did not include all vegetation types occurring in the watershed. We omitted Eucalyptus plantations and wetlands due to missing field reference data. In the given case, we think the omission of these classes is acceptable as their overall coverage (154 km 2 or 9.5 %) in the watershed is limited in comparison to the considered vegetation types (1184 km 2 or 72.8 %). Agricultural areas were also omitted but would also generally be challenging to integrate in our static map as the seasonal dynamics of crops is very high. Nevertheless, many earlier studies showed that biomass of crops can be estimated from optical remote sensing data such as Sentinel-2, particularly when coupled with crop growth models (e.g., Battude et al., 2016;Novelli et al., 2019). It may hence be possible to integrate such approaches with the workflow presented here.
The presented workflow and the resulting biomass maps at the landscape level can be useful for various applications, including the management of water resources. Water is a limited resource in the semiarid Mediterranean climate of our study area, and certain vegetation types consume more water than others. In the given case, the fastgrowing pine plantations have high evapotranspiration rates and a corresponding high water consumption (Huber et al., 2008). Contrarily, the transpiration and water consumption rates of the native forests, and especially the shrublands, are notably lower (Olivera-Guerra et al., 2014). However, not only the vegetation type but also the age, the density, and hence the corresponding biomass of forests (which correlates with the former two) influence transpiration rates (Ewers et al., 2005;Forester, 2015). Large scale estimates of water use related to transpiration can be obtained with process-based numerical eco-hydrological models. Such models can simulate all water balance components, sediments and biomass. The Soil Water Assessment Tool (SWAT, Arnold et al., 2012) is a widely used example. Fine-grain landscape-level biomass maps could help to constrain such models to deliver more realistic representations of eco-hydrological processes within the various vegetation types of a watershed. This is especially relevant for those vegetation types where parameter values are usually unknown, such as native vegetation communities. The latter are often represented by default settings, which only differentiate a few coarse vegetation types. Yang et al. (2018b) demonstrated that an improved parameterization of forest-related variables in SWAT led to improved estimates for key ecosystem variables, such as evapotranspiration. Accurate outputs of such detailed ecohydrological models can be useful for landscape-planning measures, benefiting for example local agricultural F.E. Fassnacht et al. and silvicultural activities. In the Cauquenes area, understanding the spatio-temporal patterns of water use and biomass production of the different vegetation types using eco-hydrological models can deliver important information to adequately project future biomass productions under global change scenarios. This is, for example, relevant to the current Chilean Nationally Determined Contributions (NDC) defined for the ratification of the Paris Agreement (Government of Chile, 2015). Further, it can help to understand how a plausible transformation of the landscape composition and configuration (e.g., the reduction or transformation of pine plantations to other vegetation types) can modify water yield and benefit agricultural activities and the few remaining natural vegetation areas which are severely threatened (Alaniz et al., 2016).
For such applications, but also for other potential application fields such as carbon inventories, the mono-temporal biomass map presented in this study is useful, but adding a temporal perspective by providing regularly updated biomass maps would further increase the application potential. Multi-temporal maps would allow to trace changes of the overall biomass and its spatial distribution in the region and enable the quantification of biomass growth rates in different vegetation types and age-classes.
Here, only the biomass models for grasslands exclusively used freely and regularly available Sentinel-2 satellite data and could hence easily be retrieved for more than one point in time. Albeit, even for grasslands some additional field data may be required. Contrarily, the biomass models for the three other vegetation types included height metrics derived from LiDAR and TanDEM-X datasets. The latter are still not regularly and freely available for most parts of the world and a transition towards multi-temporal biomass maps may hence be more challenging. However, in some countries regular orthophoto campaigns are conducted by administrative institutions. With the fast development in the sector of photogrammetric algorithms, point clouds derived from such orthophotos could be a valuable data source to provide structural information required to regularly obtain biomass maps at a fine spatial grain and at low additional cost. However, such approaches may currently be restricted to highly developed countries with the corresponding budgets to conduct regular airborne campaigns. This is unfortunate as the largest uncertainties concerning biomass and carbon stocks often exist in areas with less developed economies or in remote areas such as the tropics (Houghton, 2005). New space missions including, for example, ESA's BIOMASS Radar mission (Arcioni et al., 2012) as well as the combination of free optical Sentinel-2 data with structural data collected by recent spaceborne LiDAR missions such as GEDI (e.g., Hancock et al., 2019) and Icesat-2 (Abdalati et al., 2010) may provide opportunities to develop new workflows towards regularly updated landscape-level biomass maps at a global level and independent from airborne campaigns. Albeit, the latter satellite-based products may come at a coarser spatial grain and data continuity is still a challenge.

Conclusions
We developed a workflow to estimate vegetation biomass in a heterogeneous Mediterranean landscape by combining optical and structural remote sensing data with vegetation-type specific biomass estimates collected in the field. Our workflow successfully estimated biomass across four different vegetation types, including grasslands, shrublands, pine plantation, and native forests. We merged the prediction maps of the individual biomass models into a landscape-level biomass map by integrating the biomass maps of individual vegetation types with a remote sensing-based land-cover map.
We found that biomass of grassland ecosystems can be estimated using optical data only, while ecosystems dominated by woody-species benefit from extra information on vegetation structure. This complicates the future multi-temporal estimation of biomass at the landscapelevel as freely available satellite-based structure and height data is still scarce and at coarse resolution in many parts of the world.
Landscape-level biomass maps at fine spatial grain as presented here are relevant for ecosystem and hydrological modeling applications, such as the refinement of water management strategies and carbon assessment and reporting.

Declaration of Competing Interest
The authors report no declarations of interest.