Inferring plant functional diversity from space: the potential of Sentinel-2

it is important to understand how and why FD varies within and between ecosystems, along resources avail- ability gradients and climate gradients, and across vegetation successional stages. Usually, FD is assessed through labor-intensive field measurements, while assessing FD from space may provide a way to monitor global FD changes in a consistent, time and resource efficient way. The potential of operational satellites for inferring FD, however, remains to be demonstrated. Here we studied the relationships between FD and spectral reflectance measurements taken by ESA's Sentinel-2 satellite over 117 field plots located in 6 European countries, with 46 plots having in-situ sampled leaf traits and the other 71 using traits from the TRY database. These field plots represent major European forest types, from boreal forests in Finland to Mediterranean mixed forests in Spain. Based on in-situ data collected in 2013 we computed functional dispersion (FDis), a measure of FD, using foliar and whole-plant traits of known ecological significance. These included five foliar traits: leaf nitrogen concentration (N%), leaf carbon concentration (%C), specific leaf area (SLA), leaf dry matter content (LDMC), leaf area (LA). In addition they included three whole-plant traits: tree height (H), crown cross-sectional area (CCSA), and diameter-at-breast-height (DBH). We applied partial least squares regression using Sentinel-2 surface reflectance measured in 2015 as predictive variables to model in-situ FDis mea- surements. We predicted, in cross-validation, 55% of the variation in the observed FDis. We also showed that the red-edge, near infrared and shortwave infrared regions of Sentinel-2 are more important than the visible region for pre- dicting FDis. An initial 30-m resolution mapping of FDis revealed large local FDis variation within each forest type. The novelty of this study is the effective integration of spaceborne and in-situ measurements at a continental scale, and hence represents a key step towards achieving rapid global biodiversity monitoring schemes.


Introduction
Plant functional diversity (FD hereafter), defined as the range and dispersion of those plant traits within a community, landscape, or even larger spatial scales that are functionally relevant for growth, reproduction, and survival, is an important component of biodiversity (Tilman, 2001;Petchey and Gaston, 2006;Villéger et al., 2008;Laliberté and Legendre, 2010). Evidence is showing that FD strongly determines ecosystem functioning and stability (Tilman et al., 1997;Díaz and Cabido, 2001;Hooper et al., 2005;Paquette and Messier, 2011;Ruiz-Benito et al., 2014), and also regulates various ecosystem services (e.g., fodder production and maintenance of soil fertility) that underpin human well-being (Díaz et al., 2007a).
Given the importance of FD for ecosystem functioning, it is pivotal to understand its variation across space and time with high accuracy. FD can vary with a large number of drivers, including climatic or geographic gradients (de Bello et al., 2006;Lamanna et al., 2014), successional stages (Purschke et al., 2013), disturbance and land use change (Trejo et al., 2016), climate variability (Gherardi and Sala, 2015), and also topography and soil types (Schneider et al., 2017), making it challenging to extrapolate findings from one site to another without having adequate reference data about spatial and temporal FD variation. Being able to characterize spatiotemporal variation in FD is not only crucial for achieving global biodiversity monitoring (Díaz et al., 2007b;Jetz et al., 2016), but also for improving predictions on how future climate change will affect ecosystem functioning and ecosystem services (Scheiter et al., 2013;Fisher et al., 2018). However, ground-based measurement of plant traits is labor intensive, and it is logistically challenging to perform these measurements spatially continuously over a large area or to repeat these measurements through time. Available regional data on FD are rare and, due to the lack of repeated measurements, not suitable for addressing long-term trends of large-scale patterns (Scholes et al., 2008). As such, there is an urgent need for an integrated system that can effectively and consistently monitor FD globally (Turner, 2014;Pettorelli et al., 2016;Jetz et al., 2016;Anderson-Teixeira et al., 2015).
Recently there have been several efforts made to infer FD over local, regional or global scales using either ground-based or airborne hyperspectral remote sensing measurements (e.g., Schweiger et al., 2018;Asner et al., 2014;Schneider et al., 2017;Asner et al., 2017), or other (e.g., climate and soil) data inputs (e.g., Butler et al., 2017). The rationale underlying the use of remote sensing measurements for estimating plant FD is that phylogenetic differences and resources limitations can affect plant traits (leaf and structural) and these in turn can affect spectral reflectance measured by optical remote sensing . Schneider et al., 2017 mapped three forest leaf biochemical traits, including leaf chlorophyll, leaf carotenoids, and equivalent water thickness, in a forest study area using airborne hyperspectral measurements and computed several functional diversity measures based on these traits. Asner et al. (2017) mapped seven forest canopy traits, including leaf nitrogen, phosphorus, calcium, phenols, lignin, water, and leaf mass per area, over Peruvian Andes-to-Amazon region and then used these traits to classify forests into different functional groups. Using field hyperspectral measurements at a grassland site, Schweiger et al. (2018) showed that spectral diversity is highly associated with functional diversity computed using 14 foliar traits (including the contents of leaf carbon, nitrogen, carbon fractions, chlorophylls, xanthophylls and carotenoids) and hence demonstrated the potential of using remote sensing measurements to directly infer FD instead of retrieving each individual trait first. Butler et al. (2017) generated global maps of three key functional traits, specific leaf area, leaf nitrogen content, and leaf phosphors content, and then computed their statistical dispersion within any 50 km × 50 km grid, with a Bayesian modeling framework driven by climate and soil data. Other recent studies that have attempted to map individual functional traits using remote sensing, with either statistical approach or radiative-transfer modeling, also showed success in mapping leaf mass per area (Mohammed Ali et al., 2017;Singh et al., 2015), leaf nitrogen concentration (Asner et al., 2015;Knyazikhin et al., 2013), and leaf chlorophyll content (Inoue et al., 2016;Gitelson and Merzlyak, 1997). These studies have provided important methodological advances in mapping FD across space and their results provide an efficient basis for investigating biodiversity-ecosystem functioning relationships across large environmental gradients.
Despite recent progress, critical data and knowledge gaps in mapping global FD pattern remain. First, while airborne hyperspectral instruments can yield high-resolution maps of spatial FD variation across small spatial and/or temporal scales (e.g., Wang et al., 2018), acquiring airborne data repeatedly over larger extents is often too costly. Second, while modeling studies based on climate and soil data can provide large-scale predictions of FD (dispersion and range in plant traits) patterns, the accuracy and resolution of the modeled maps are limited by large uncertainties in the input meteorology and soil data interpolated from point measurements (Harris et al., 2014;Wu et al., 2017;Batjes, 2016), and limited knowledge on the relations between climate, soil, and FD (Bruelheide et al., 2018). Satellite RS data are spatially continuous but their resolutions are often too coarse to detect fine scale soil and vegetation processes, and currently discoverable in-situ ecological data are not spatially representative for large scale ecological gradients. As such, a dedicated global Earth Observation network with multiple spaceborne platforms (e.g., hyperspectral, LiDAR, radar), in conjunction with high-quality ground reference data from established ecological networks (e.g., NEON, TERN, NutNet, ForestNet etc.) for RS model calibration and validation, would be critical to achieve the objective of tracking spatial and temporal changers in multiple facets of functional biodiversity globally (Turner, 2014;Stavros et al., 2017;Schimel et al., 2019). Thus, current approaches focus either on highresolution mapping of FD over a small spatial extent or on a global mapping of FD at a much lower resolution. Here, we propose that recent developments in multispectral spaceborne remote sensing can contribute to overcoming this trade-off between a high spatial resolution and a high spatial extent and may additionally allow for tracking temporal FD changes. These new satellite measurements, if coupled with in-situ plant FD data collected systematically over large scales, can greatly facilitate the remote sensing of FD.
Spaceborne remote sensing can scan the entire Earth surface repeatedly, providing a unique data stream that can be exploited to monitor global FD variation across space and time (Jetz et al., 2016;Lausch et al., 2016;Rocchini et al., 2016Rocchini et al., , 2018. In recognizing the potential offered by satellite data, the ecology and the remote sensing communities, supported by the Group on Earth Observation Biodiversity Observation Network (GEO-BON), have agreed on a list of Essential Biodiversity Variables (EBVs), including functional diversity, that can be measured from the ground and can be tracked from space (Scholes et al., 2008;Pereira et al., 2013;Turner, 2014;Skidmore and Pettorelli, 2015;Rocchini et al., 2016Rocchini et al., , 2018Kuenzel et al., 2014).
Recent developments in spaceborne remote sensing technologies enable promising opportunities to link satellite measurements to FD measured on the ground. Sentinel-2, a new satellite constellation of the European Copernicus programme, offers improved spatial (up to 10 m), spectral (13 bands in visible, red-edge, near infrared and shortwave infrared), and temporal resolutions (3-5 days revisit) from previous satellites (Drusch et al., 2012). Sentinel-2 provides a good compromise between the spatial resolution, which is needed to link with field plot measurements, and spectral resolution, which is required to infer FD computed with multiple functional traits. For instance, Sentinel-2 has three bands between the red and NIR regions, known as the red-edge, which are sensitive to important leaf biochemical traits such as leaf nitrogen concentration (Papale et al., 2008;Clevers and Gitelson, 2013;Pérez-Priego et al., 2015). Nevertheless, challenges still exist, as Sentinel-2 cannot be compared to hyperspectral instruments that usually carry hundreds of spectral bands, something that may be termed as the spectral scale challenge . Besides, due to the still limited spatial resolution (relative to individual tree crown size), it would be challenging to use Sentinel-2 measurements to detect species diversity as each pixel can consist of multiple individuals from different species, or more precisely being described as the spatial scale problem (Wang and Gamon, 2019) as well as the presence of soil background (Gholizadeh et al., 2018;Wang et al., 2018).
Here we performed a study to statistically link spectral reflectance measurements taken by ESA's Sentinel-2 satellite directly to FD derived from in-situ measurements in plot networks spanning tree species diversity gradients in six representative European forest types. Our overarching aim is to use spaceborne measurements to track changes in FD across space. Specifically, our objectives are: 1) to investigate the statistical relationship between FD and spectral reflectance measurements taken by Sentinel-2 and to quantify relative importance of the different Sentinel-2 bands for predicting FD; 2) to explore the potential of using compiled trait database to complement in-situ trait measurements; and 3) to calibrate and validate a statistical model to quantify FD at high spatial resolution (30 × 30 m).

FunDivEUROPE plot network
We used field data collected from 117 plots of 30 × 30 m that were set up in mature forest plots in six regions across Europe as part of the FunDivEUROPE project (http://www.fundiveurope.eu) ( Fig. 1; Table 1) (Baeten et al., 2013). The regions represent six of common but contrasted forest types in Europe. Forest types covered include boreal (Finland, 9 plots); hemiboreal (Poland, 34 plots), mixed beech (Germany, 26 plots), mountainous mixed beech (Romania, 19 plots), thermophilous deciduous (Italy, 10 plots), and Mediterranean mixed (Spain,19 plots). Detailed information about all plots used in this study, such as coordinates, dominant species, and in-situ trait sampling date is provided in Table S3 in the Supplementary file. Within each forest region, the plots were established to span diversity gradients of the regional pool of dominant tree species (with a maximum of five species mixtures). In addition, the design ensured that (i) within each richness level, multiple different species compositions were represented, (ii) each species was represented at more or less the same relative frequency at each diversity level (Baeten et al., 2013). This resulting tree species gradient is therefore ideally suited to calibrate and assess taxonomic and functional diversity and identity using remote sensing imagery.
Across all plots we selected, species with traits measured from insitu account for > 95% of cumulative abundance of each plot. This minimizes the influence of the presence of other species which their traits were not measured on our results. In each plot, all trees with diameter at breast height (DBH) ≥ 7.5 cm were identified to species level and measured for tree height (H), DBH, and crown cross-sectional area (CCSA). Fig. 2 illustrates the general plot design and a ground-level photograph of one of the Mediterranean mixed forest plots in Spain.

In-situ leaf trait measurement
Leaf biochemical (leaf nitrogen concentration, %N; leaf carbon concentration, %C), morphological (leaf area, LA, mm 2 ), and leaf anatomical traits (specific leaf area, SLA, mm 2 mg − > 1 ; leaf dry matter content, LDMC, mg g − > 1 ) of well-known significance to plant growth and functioning such as carbon fixation (Cornelissen et al. 2003) were measured on the dominant tree species growing in all plots located in three of the six FunDivEUROPE regions, namely in the boreal forest (Finland), in the mountainous beech forest (Romania), and in the Mediterranean mixed forest (Spain) ( Table 2).
Ten trees per species were randomly selected in each plot. From each target tree, a branch from the top part of the crown facing south was cut and leaves/needles were collected for trait measurements. A total of 1763 trees were sampled. To achieve phenological standardization, fieldwork was carried out once leaves were completely developed within the growing season, thus it took place in August 2013 in Finland, July 2013 in Romania, and June 2013 in Spain.
The aim of selecting traits for assessing FD is that the traits selected should be significantly important for the ecosystem process of interest. It is not practical to sample all traits and hence a strategy needs to be implemented to focus on several important traits and then sample them consistently across all species and all plots. In our case, we are particularly interested in ecosystem C and H 2 0 cycling, which can be approximated by primary productivity and evapotranspiration. Among the five foliar traits we selected, three of them (%N, SLA, and LA) have been identified as critical to performance and fitness of vascular plant species including growth, survival and reproduction (Diaz et al., 2016).
It is well known that leaf %N is closely related with leaf maximum carbon fixation rate and community aggregated %N also scales linearly with ecosystem carbon fixation capacity (Ollinger et al., 2008;Musavi et al., 2016). Furthermore, SLA is often positively related to potential relative growth rate across species and negatively with leaf longevity, and exerts influences on ecosystem-level productivity (Diaz et al., 2016). Leaf %C is essential for converting estimates of forest aboveground biomass into forest carbon stocks (Thomas and Martin, 2012). LDMC has been shown to correlate negatively with potential relative growth rate and positively with leaf lifespan. Lastly, LA has been shown to be related to climate variation and site disturbance regimes (Pérez-Harguindeguy et al., 2011). We randomly selected ten (when present) individuals per dominant species in each plot. From each target tree, a branch from the top part of the crown facing south was harvested and leaves/needles were collected for trait measurements. A total of 1137 trees were sampled. To assess the morphological traits (LA, SLA, and LDMC), five rehydrated leaves were weighed and scanned to measure their area using WinFOLIA and WinSEEDLE for broadleaves and needles, respectively (Regent Instruments Inc., Canada). Then, leaf samples were dried in an oven at 60°C for 72 h and their dry mass was weighed. Additional leaves were also dried and ground in order to estimate the N and C leaf content (in %) using a dry combustion method (Vario EL cube, Elementar Analysensysteme GmbH, Hanau, Germany). Leaf collection, storage, processing, and trait measurement followed the protocols   defined in Garnier et al. (2001) and Pérez-Harguindeguy et al. (2011). Detailed description of in-situ trait sampling procedures can be found in Benavides et al. (2019) and in Sec. 1.1 in the Supplementary file. An exploratory analysis of in-situ traits across FunDivEUROPE plots is provided in Sec. 1.5 in Supplementary file. For the other 3 regions (Poland, Germany, and Italy) with no in-situ data for leaf traits, we tested the potential of filling these gaps using the TRY plant trait database (www.try-db.org) (Kattge et al., 2011). We used data from the TRY-database version 3.0 (first released in 2015).
We computed global average of each trait for each species. An examination of the TRY database indicates that these European tree species have a fairly large number of entries submitted to the TRY database. Trait data were quality checked and filtered as described on the TRY website (https://www.try-db.org/TryWeb/Database.php). Missing data in the TRY trait-species matrix were filled based on the method described in Schrodt et al. (2015). After gap-filling, we computed global average of each trait for each species. We then compared species mean values of each trait from TRY with species mean value from in-situ measurements for the three FunDivEUROPE regions where we have in-situ trait sampling.

Sentinel-2 spectral reflectance measurements
We used spectral reflectance measurements taken by the first Sentinel-2 satellite (Sentinel-2A) launched by the European Space Agency. Sentinel-2A is a wide-swath, high-resolution (10-20 m), multispectral (13 bands), and multi-temporal (5-10 days revisit) imaging mission (Drusch et al., 2012). The Sentinel-2A sensor was launched in June 2015 and started collecting measurements in July 2015. A total of 10 spectral bands were included in this study, four bands at 10-m resolution, and six bands at 20-m resolution (Table 3). Detailed band configuration could be found from ESA's official website: https://earth. esa.int/web/sentinel/user-guides/sentinel-2-msi/resolutions/spatial. The 20-m bands were down-scaled to 10-m using a band sharpening algorithm proposed by Brodu (2017) (refer to Sec. 1.2 in Supplementary file for more details about the band-sharpening algorithm). We did not use the three bands at 60-m spatial resolution (B01, B09, B10), which were designed for atmospheric correction and cirrus cloud detection (B01 for aerosol retrieval, B09 for water vapor correction, and B10 for cirrus detection) (Drusch et al., 2012).
We processed Sentinel-2 data for the same phenological period as when in-situ measurements were conducted (August 2015 for Finland, July 2015 for Romania, Spain, Poland, Germany, and Italy respectively). Sentinel-2 was accessed from ESA's SciHub online data repository. Here we summarize the main steps involved in data preprocessing (more detailed documentation of data processing can be found in Sec. 1.2 Detailed description of Sentinel-2 data processing procedures in the Supplementary file): 1) applied the atmospheric correction to convert Level-1C top-of-atmosphere (TOA) reflectance into Level-2A top-of-canopy (TOC) surface reflectance using the atmospheric correction module in ESA's Sen2Cor toolbox; 2) classified images into different scene classifications (e.g., cloud, snow, soil, vegetation etc.) using the Sen2Cor's built-in scene classification algorithm; 3) applied quality control to remove cloud or cirrus contaminated observations based on scene classification results; 4) down-scaled the 20 m bands (B05-07, B8A, and B11-12) to 10 m resolution using the Sen2Res band-sharpening algorithm; 5) extract all Sentinel-2 pixels for each image falling within each plot using the plot boundary shapefile (on average, 9 pixels were extracted for each plot).
There was a two-year gap between the in-situ (2013) and Sentinel-2 (2015) measurements. However, since the plots from FunDivEUROPE network were selected to be neither in the early successional stage (so that the community composition is not very dynamic) nor in the regeneration phase (so that the stand is not dominated by very old dying trees). As a result, the changes in community composition are expected to be very limited during a period of two years apart (2013)(2014)(2015). To further confirm steadiness in community composition between 2013 and 2015, we checked both in-situ and Landsat data, on a plot-by-plot basis and results indicate high stability in community composition and a lack of disturbance or shift in phenology. Detailed results are reported in Sec. 1.4 in Supplementary file.

Computing functional diversity from in-situ measurements
We used a distance-based multivariate FD measure, termed as functional dispersion (FDis), using multiple traits weighted by species relative abundance (Laliberté and Legendre, 2010). FDis was computed using a different combination of plant traits: 1) using only five leaf traits (%N, %C, SLA, LDMC, LA), termed as FDis lea ; 2) using only three whole-plant traits (DBH, tree height, CCSA) termed as FDis str , and 3) using both foliar and whole-plant traits, termed as FDis all . We first computed all three FDis measures for the three FunDivEUROPE regions where both in-situ leaf and whole-plant traits are available, namely Finland, Romania, and Spain. For the other three FunDivEUROPE regions where in-situ foliar trait data is not available, we explored the potential of using data from the TRY database to fill the gaps.
While FDis lea (FDis computed using only leaf traits) only considers leaf biochemical and morphological traits, FDis str (FDis computed using whole-plant traits) is a measure of tree structure heterogeneity within a plant community. This component of biodiversity has been shown to be of functional significance (Chapin III et al., 2011;Dänescu et al., 2016;Lausch et al., 2016). The structural difference in tree crown height at the top of the canopy, or canopy height heterogeneity within a community, can affect surface roughness, and thereby the efficiency of water and energy exchange between the ecosystems and the atmosphere (Chapin III et al., 2011). A higher structural heterogeneity can also create more habitat niches and lead to a greater species diversity for other taxa (Goetz et al., 2007). As both leaf traits and tree structure can affect optical signals, we thereby expect that Sentinel-2 multispectral measurements will be better linked to FDis all than either FDis lea or FDis str alone.

Partial Least Squares Regression (PLSR)
To predict in-situ functional and other diversity measures using Sentinel-2 spectral measurements we used Partial Least Squares Regression (PLSR). PLSR is a regression technique that reduces the number of predictors to a smaller set of uncorrelated variables and then performs a least squares regression on the subset of variables (Wold et al., 2001). The PLSR method has been used widely in remote sensing studies for predicting plant biophysical and biochemical properties from spectral measurements (e.g., Asner et al., 2015;Vaglio Laurin et al., 2014a, 2014b. The variables we used for predicting diversity metrics included band-wise mean and band-wise mean absolute deviation (MAD) of surface reflectance computed using all nine pixels within each 30 m × 30 m plot. We performed PLSR using the PLS package in R (Mevik and Wehrens, 2007).
To evaluate model performance, we conducted a stratified 10-fold cross-validation. we split our data into 10 folds (using the function createFolds in R package caret), with each fold consists of a roughly equal proportion of sample from each study region. We then performed PLSR for predicting three different types of FDis (FDis lea , FDis str , and FDis all ) and evaluated model performance using the stratified 10-fold cross-validation. To minimize statistical overfitting, we determined the optimal number of components used in the final PLSR model by minimizing the prediction residual error sum of squares (PRESS) statistics (Chen et al., 2004). We evaluated model performance against the crossvalidation dataset using two statistical measures: 1) coefficient of determination (R 2 CV ); and 2) normalized root mean squared error (nRMSE CV , RMSE divided by sample mean), wherein both cases the subscript CV indicates that we obtained these measures from validation datasets. For model performance obtained from calibration dataset, we used the subscript "Cal". We used R package plsVarSel (Mehmood et al., 2012) to compute the variable importance of projection (VIP) score associated with each spectral band to quantify the statistical contribution of each individual variable to the fitted PLSR model across all model components.

Investigate statistical relationship between FDis and Sentinel-2 measurements
We linked FDis to 20 predictive variables from Sentinel-2 (bandwise mean and band-wise MAD for 10 spectral bands) using PLSR in the three study regions with in-situ trait measurements (Finland, Romania, Spain) (Fig. 3). Among the three FDis measures (FDis str , FDis lea , FDis all ), we found that the predictive power using Sentinel-2 is the best for FDis all (FDis computed using both leaf and whole-plant traits) (crossvalidation R 2 CV = 0.55, nRMSE CV = 31%, # of PLSR components = 8, N = 46) (Fig. 3). An additional test based on Moran's I confirms that the model performance is not biased by spatial autocorrelation (Sec. 1.5 in Supplementary file). Our PLSR model can also predict FDis lea (FDis computed using leaf traits) in a good confidence (cross-validation R 2 CV = 0.43, nRMSE CV = 36%, number of PLSR components = 4, N = 46), and to a less degree for FDis str (FDis computed using wholeplant traits) (cross-validation R 2 CV = 0.19, nRMSE CV = 71%, number of PLSR components = 4, N = 46) (Fig. 3). As we stated before, FDis all integrated both foliar and whole-plant traits that are both functionally important, so in following analyses we will focus only on FDis all .
We also tested the relative role of band-wise mean (10 predictive variables, hereafter mean spectra) and band-wise MAD (10 predictive variables, hereafter MAD spectra), representing between-site difference (e.g., different forest types) and within-site spatial heterogeneity (e.g., variation in forest structure within a plot) respectively, for predicting FDis all . Fig. 5 shows that mean spectra contributed more to the model performance (R 2 CV = 0.44, nRMSE CV = 35%, N = 46) than the MAD spectra (R 2 CV = 0.16, nRMSE CV = 43%, N = 46) (Fig. 5). Both mean and MAD spectra provided complementary information as the model using both the mean and MAD spectra performed the best (Fig. 3C).

Exploring the potential of using compiled trait database to complement in-situ trait data
To calibrate the PLSR model for more forest types, we tested the potential of filling in-situ foliar trait gaps with trait data from TRY. Comparison between species average values for each trait between insitu data and TRY data showed very good agreement (Fig. 6). Correlation coefficients vary from 0.78 for LDMC to 0.88 for leaf %N, 0.96 for SLA, 0.97 for leaf %C, and 0.99 for LA (Fig. 6). Table S4 and S5 in Supplementary file give the statistical summary of species-mean trait and TRY trait for all targeted species across the FunDivEUROPE plots. As such, we proceeded and attempted to link FDis computed with a fusion of in-situ whole-plant traits, TRY foliar traits, and in-situ species abundance data across all six FunDivEUROPE sites (117 plots).
The PLSR model predicting FDis all over all six FunDivEUROPE regions using Sentinel-2 measurements explained 22% of variance in the cross-validation dataset (R 2 CV = 0.22, nRMSE CV = 44%, # of PLSR components = 6, N = 117) (Fig. 7A). There is an increasing trend from the lowest FDis found in Finnish boreal forests, to thermophilous deciduous forests in Italy, Mediterranean mixed forests in Spain, beech forests in Germany, mountainous beech in Romania, and to the highest FDis found in hemiboreal forests in Poland (Fig. 7B). Overall, the between site (forest type) variations in FDis was captured by the PLSR model results (Fig. 7B).
We then attempted an initial mapping of FDis at high spatial resolution (30 m) over the six regions. We found strong local, withinforest type, variation in FDis (Fig. 8). The beech forests in Germany appear to be much more homogenous (coefficient of variance or CV = 19%) in terms of FDis across space as compared to other forest types, and the thermophilous deciduous forests in Italy (CV = 54%) and boreal forests in Finland (CV = 39%) are the most heterogeneous in terms of the local variations in FDis.

Integrating in-situ and high-resolution satellite data for mapping FD across space
This study demonstrated the potential of integrating high-resolution satellite data and in-situ measurements to explain spatial variation in FD of leaf and whole-plant traits at a regional scale, and hence is a significant step towards tracking in the future global FD variations using Earth observations. To our knowledge, this study is the first attempt to link functional diversity directly to spaceborne measurements, representing a major advancement compared to previous attempts at site-level based on airborne data.
Compared with mapping FD using airborne remote sensing measurements (e.g., Schneider et al., 2017;Asner et al., 2017), our approach using spaceborne measurements can be applied to a greater geographic extent and can be repeated through time, which offers two important possibilities for future studies. First, we can apply our statistical models to map FD along an environmental gradient and then investigate how FD varies with environmental factors. Second, although we acknowledge that the ability of our model to resolve temporal dynamics in FD remains to be validated, once this is done, future studies could possibly track seasonal and inter-annual FD variation, thereby allowing to study how FD responds to weather, climate or land-use  Fig. 4. Statistical quantification of the relative importance of 20 predictors (mean and MAD, the mean absolute deviation, of 10 spectral bands) from Sentinel-2 in predicting FDis all . On x-axis of Panel (a), 'VIP mean ' and VIP MAD refer to the VIP score for mean and mean absolute deviation of spectral reflectance for a given band within a plot respectively. Please refer to Table 3 for band definitions. X. Ma, et al. Remote Sensing of Environment 233 (2019) 111368 change.
Our approach offers two major advantages to recent studies that model global FD patterns using climate and soil data (e.g., Butler et al., 2017). First, the accuracy of predictions based on global modeling studies might be compromised by uncertainties in the input dataset, especially when mapping FD over regions where networks of meteorological stations or soil sampling sites are sparse. By contrast, our approach that effectively integrates ground measurements and satellite observations can provide more accurate estimates of FDis than global models driven by interpolated meteorology and soil properties. In addition, using Sentinel-2 we can generate FD maps at 30 m resolution, which is also a big step forward as compared to the half-degree resolution of current global maps (e.g., Butler et al., 2017).
Being able to predict FD using Sentinel-2 measurements offers opportunities for monitoring FD over a global scale, with high spatial resolution, and repeated over time. The upscaled FD maps from this study can be used to study the links between FD and ecosystem functioning and stability over broad scale, which has a direct implication in addressing the challenge in scaling up and testing the theories drawn from plot-level analysis to landscape or even continental levels (Isbell et al. 2017;Thompson et al. 2018). In addition, the FD maps generated from Sentinel-2 can also be used to provide a rapid biodiversity change assessment and reporting, which is usually difficult to be done using labor-intensive field surveys.

Interpretations of the observed link between FDis and Sentinel-2
We could use multispectral measurements from Sentinel-2 to , the bars indicate the maximum (upper) and minimum (lower) range of the data in each group. The label '*' and "ns" indicates the test of significant difference between predicted and measured FDis for each study region using the statistical Tukey HSD test, with "ns" indicates no significant difference (p-value > 0.05) and "*" indicates significant difference (p-value < 0.05).

Fig. 8.
High-resolution (30 m) map of FDis predicted using Sentinel-2 measurements for six FunDivEUROPE regions. These maps represent an average condition as they were computed using mean reflectance value across the 2015-2017 time period. Non-forest areas, defined as when percent tree cover of a pixel is < 30% based on a high-resolution global forest cover map (Hansen et al. 2013), are excluded. predict, in cross-validation, 55% of the variance in plant functional diversity derived from 5 leaf trait (%C, %N, SLA, LDMC, LA) and 3 whole-plant traits (H, CCSA, DBH). To gain insight into the observed link between FDis and Sentinel-2, we ran additional analyses for predicting individual foliar traits using Sentinel-2 ( Fig. S13 in Supplementary file). Our results showed that multispectral measurements from Sentinel-2 showed a good predictive power for leaf chemical composition including leaf %N (R 2 CV = 0.37) and %C (R 2 CV = 0.47), and leaf morphological trait such as SLA (R 2 CV = 0.63) (Fig. S13). The model for LDMC and LA showed a moderate predictive power (R 2 CV = 0.25 and 0.37 respectively) (Fig. S13 in the Supplementary file). Our findings here, therefore, align well with previous studies that used multispectral measurements from Landsat and Sentinel-2 for estimating leaf biochemical and morphological traits. For instance, Ollinger et al. (2008) found a strong correlation between NIR reflectance and canopy nitrogen concentration (R 2 = 0.79), while Clevers and Gitelson (2013) found that with simulated Sentinel-2 reflectance, the nitrogen content in grass and potato crops is strongly correlated to spectral indices that has a red-edge band (R 2 > 0.80). Over sites in Australia that contain forests, woodland, and shrubland, Lymburner et al. (2006) found that the canopy average SLA can be well related to several spectral indices computed from Landsat TM bands. Similar results were also reported by Mohammed Ali et al. (2017) for predicting SLA using Landsat-8 over German forests. With the use of airborne hyperspectral measurements, a strong correlation was reported between LDMC and SLA and canopy spectra in the NIR and SWIR region (R 2 value of 0.87 for LDMC and 0.85 for SLA). Noticed that the R 2 value reported in these studies are higher than what we reported here since we reported the R 2 in cross-validation only here. If considering calibration data, our results are actually close to what have been reported in previous studies, with R 2 ranging from 0.55 for LDMC to 0.78 for SLA (Fig. S13 in supplementary file).
The predictive power of the plant traits we selected using Sentinel-2 data are expected as optical remote sensing measurements in the visible and infrared regions are aggregated signals reflecting leaf biochemical and leaf morphological traits and canopy structure from top layers of the canopy (Ustin and Gamon, 2010). The leaf traits we selected are not only ecologically important (Pérez-Harguindeguy et al., 2011) as they capture the trade-off between resource acquisition and conservation (Diaz et al., 2016;Westoby et al., 2002;Wright et al., 2004), but also have a major role in the optical signal response. We found leaf %N and SLA dominate the first principal component that accounts for 65% variance of the traits we selected. Our study and previous studies have found that leaf %N and SLA can be predicted by multispectral or hyperspectral measurements (e.g., Ollinger et al., 2008;Clevers and Gitelson, 2013;Schlerf et al., 2010;Townsend et al., 2003;Loozen et al., 2018;Pérez-Priego et al., 2015;Mohammed Ali et al., 2017;Lymburner et al., 2006). From the variable importance analysis, we found that the most important Sentinel-2 spectral bands in predicting functional diversity are located in the red-edge, near-infrared, and SWIR region (Fig. 4), which aligns well with the known importance of these spectral regions based on other studies in upscaling leaf %N and SLA (Ollinger et al., 2008;Clevers and Gitelson, 2013;Mohammed Ali et al., 2017;Lymburner et al., 2006). Indeed, the correlation between leaf %N and SLA and Sentinel-2 spectral reflectance, which have been frequently reported in previous studies (Ollinger et al., 2008;Asner and Martin 2008;Townsend et al., 2003) can be caused by a structure effect (Knyazikhin et al., 2013). This means that the correlation between spectral reflectance and leaf %N and SLA can be resulted from indirect mechanisms manifested by co-variation in leaf and canopy structural properties instead of a direct physical mechanism between these traits and reflectance. Nonetheless, as Townsend et al. (2003) has pointed out, ample evidence showed that canopy architecture and leaf structural and chemical and optical properties tend to covary among plant functional types (Kokaly et al., 2009;Ollinger et al., 2008;Wright et al., 2004) and such tendency may be exploited for the purpose of diagnostic mapping of traits and plant functional diversity.
Sentinel-2 is a multispectral instrument equipped with much less spectral bands than hyperspectral instruments. A lower spectral resolution means that potentially many spectral reflectance features resulted from varying leaf and canopy traits can be obscured Wang et al., 2018). The upcoming spaceborne hyperspectral missions such as German EnMAP and NASA's SBG (Surface Biology and Geology) offer great potential to improve this situation (Guanter et al., 2015;Lee et al., 2015), as it has been demonstrated on the ground that spectral diversity proxies derived from ground-based hyperspectral measurements are highly relevant for inferring plant functional diversity . As a multi-temporal instrument, Sentinel-2, as well as Sentinel-1, can characterize seasonal and inter-annual variations in vegetation dynamics and phenology. Species situated in different ecological niches within a plant community can have different phenological characteristics (so-called temporal niche differentiation), and therefore phenological metrics for each individual pixel derived from Sentinel-2 and other high-resolution satellites can be included as additional variables for inferring biodiversity. Beyond the spectral regions covered by Sentinel-2 towards longer wavelength, there are thermal infrared bands that can inform forest canopy temperature and hence provide additional information for inferring functional diversity, as species with different functional traits (e.g., leaf size and angle) can have different canopy temperatures under the same environmental conditions (Leuzinger and Körner, 2007). In addition, to better bridge the field plot data with satellite measurements and understand the scaling effect, intermediate scale measurements taken by field spectrometers, drones and aircraft can be collected in tandem with biodiversity measurements (Wang & Gamon, 2019). Above discussions suggest that the rich information content in the constellation of the satellite missions, including optical, thermal, microwave, and multi-temporal (phenology-related), should be exploited in a joint and complementary manner to achieve a better inference of plant functional diversity and other facets of plant diversity from space (Fassnacht et al., 2016).
Optical measurements from Sentinel-2 are not the best information source for resolving tree structural variability within a forest plot as compared to more direct radar or LiDAR measurements (Bergen et al., 2009). Nonetheless, we found that Sentinel-2 can still explain~20% variance in FDis computed using three tree structural traits (H, CCSA, DBH) (Fig. 3). This is likely related to the fact that tree structure can also affect Sentinel-2 spectral reflectance (Asner, 1998;Xiao et al., 2014), especially over the NIR region, as NIR radiation can penetrate deeper into forest canopy and hence provide information about vertical profile of foliage within a canopy. This is in turn related to canopy structural and architectural properties such as canopy height and leaf angle (Colwell, 1974;Huete, 2004). As variability in whole-plant traits (tree-structure) is an important component of functional diversity, it is worth to explore the opportunity of integrating optical and radar and LiDAR measurements for upscaling FD (e.g., Goetz et al., 2007;Mura et al., 2015;Zhao et al., 2018). For instance, measurements from spaceborne radar instruments (e.g., Sentinel-1 C-band SAR, PALSAR2 Lband SAR, BIOMASS P-band SAR), or even spaceborne LiDAR (e.g., GEDI), can be complementary to Sentinel-2 as SAR or LiDAR measurements are more sensitive to tree structure, which may improve the estimation of FD measured that integrates both leaf and whole plant traits . Remote sensing measurements integrate multiple plant including leaf traits, canopy architecture, and vegetation structure (Ustin and Gamon, 2010), and hence FD derived from a combined leaf and whole-plant traits can potentially be better related to spectral variability than using any individual trait alone. The data fusion approach that can effectively integrate multi-source remote sensing measurements, therefore, needs to be explored in the future in the context of global FD monitoring.

Current limitations and future opportunities
We acknowledge a few important limitations of our methodology and identify also a few future opportunities to address these limitations. First, there might be a few remaining uncertainties due to the temporal mismatch between in-situ and Sentinel-2 measurements that cannot be completely eliminated from our additional checks of the steadiness in community composition. For any given forest stand, inter-annual variations in trait value for any given species can still have some influences on the computed FDis, though our analysis on TRY and field measured traits at species-average level suggested that at least for those traits we selected in this study, the trait value for any given species remains to be conservative. Trees have likely grown in the 2-year period between in situ trait and community measurements and RS observations, although this growth was likely moderate in the mature forests we selected. In addition, there was a summer heat wave over much of Europe (except Finland) that can cause uncertainties to our results as forest growth can be affected by this heatwave. To overcome such limitations in future studies, we suggest that a more reliable upscaling of plant FD, than the present one, using remote sensing can be achieved having collocated measurements, in space and time, between field trait and satellite measurements.
Second, our results based on leaf traits from the TRY database were promising and hence these compiled traits can fill data gaps, where insitu trait measurements are not available, for calibrating and validating remote sensing measurements. However, uncertainty still remains, as even for the European tree species the number of trait records per species in the TRY database is limited and therefore it is often impossible to resolve the potential between-community variation of trait values for any given species. In addition, while the focal European tree species in this study are well represented in TRY database, this may not necessarily be the case for other global regions, such as hyper-diverse tropical regions (Kattge et al., 2011). Much effort is thus needed to improve the spatial and species representativeness of the TRY database to be able to extend the remote sensing of FD study to global scale. Indeed, for the purpose of remote sensing upscaling, new trait sampling that can match satellite measurements in space and time would be particularly desired. However, new sampling would demand much resources that might not be available to every group or researcher. In this case, repeated sampling for any given species along an environmental gradient and across time would be highly appreciated to understand the degree of uncertainty that might be caused by spatio-temporal mismatch between field and satellite measurements. In any case, ancillary information about the geographic location, environmental background, and exact time of measurement would be essential to make the best use of the trait database for remote sensing.
Third, we suggest that a network with a similar design like FunDivEURUOPE, but potentially larger plots (e.g. 30-90 m size), extended across Europe to cover more species and a wider environmental space is needed to serve the purpose of both calibrating remote sensing data with in-situ measurements and for carrying our functional biodiversity research. Only with a joint effort from ecology and remote sensing communities, can we achieve eventually global monitoring of plant functional diversity not only across space but also across time.

Conclusion
In this study, we showed that measurements from a high-resolution spaceborne multispectral radiometer (Sentinel-2) can explain 55% of the variability in functional diversity across major European forest types. Among the spectral regions of Sentinel-2, we found red-edge and infrared bands to be more important than visible bands in predicting functional traits dispersion. We also tested the possibility of extending our analysis to sites where in-situ traits measurements are not available by using compiled traits from global trait databases and demonstrated the potential of combining in-situ species abundance sampling with the TRY data for quantifying functional dispersion over a wider geographic and climatic extent. Being able to predict functional diversity using Sentinel-2 measurements offers opportunities for monitoring functional diversity potentially over a global scale, with high spatial resolution, and repeated over time at every few days, over a decade or even longer periods of time. Our study also opens the opportunity for studying the links between FD and ecosystem functioning across space and time. We expect that with the data from upcoming next-generation spaceborne hyperspectral or even dedicated biodiversity monitoring missions there is potential to achieve a better spatial and temporal remote sensing of functional diversity. Meanwhile, the multi-temporal capability of the new generation spaceborne hyperspectral missions would allow a better temporal match with field measurement, which is critical to reduce the uncertainty that can come from temporal trait variability. Though challenges such as reduced spatial resolution of the new hyperspectral missions will need to be addressed by establishing field plots with much larger size to bridge the scale gap. From a remote sensing perspective, spatial resolution of future hyperspectral missions needs to be much improved to be able to better link to field plot measurements. Only through a coordinated effort between the ecology and remote sensing communities can we achieve the global plant functional diversity monitoring goal.

Acknowledgement
This work was primarily supported by an iDiv-Flexpool project "Observing effects of biodiversity on ecosystem functioning across time, space, and wavelength: oBEF-Across" and a European Union H2020 project "Detecting changes in essential ecosystem and biodiversity properties -towards a Biosphere Atmosphere Change Index: BACI" (grant agreement number: 640176). The data from FunDivEUROPE network was collected with the financial support from the European Union Seventh Framework Programme (FP7/2007-2013) (grant agreement number: 265171). Remeasurements across all FunDivEUROPE plots were financed by EU H2020 project Soil4Europe (Bioidversa 2017-2019). We thank Prof. Dr. Michael Scherer-Lorenzen and Prof. Dr. Kris Verheyen from FunDivEUROPE network for their generous help with field data during the paper revision. DEPM and MM and MDM also acknowledge the support of the Trustee European Commission project no H2020-MSCA-ITN-2016 -721995. The free use of Sentinel-2 data is enabled by ESA's Copernicus Open Access Hub. The in-situ plant traits data collected over Finnish, Romanian, and Spanish sites were supported by a Marie-Curie Fellowship (DIVERFOR, FP7-PEOPLE-2011-IEF. No. 302445) to R. Benavides. The free access of plant trait data was supported by the TRY initiative (http://www.trydb.org).