Towards mapping the diversity of canopy structure from space with GEDI

Plant biodiversity supports life on Earth and provides a range of important ecosystem services, but is under severe pressure from global change. Structural diversity plays a crucial role for carbon, water and energy cycles and animal habitats. However, it is very difficult to map and monitor over large areas, limiting our ability to assess the status of biodiversity and predict change. NASA’s Global Ecosystem Dynamics Investigation (GEDI) provides a new opportunity to measure 3D plant canopy structure of the world’s temperate, Mediterranean and tropical ecosystems, but its potential to map structural diversity has not yet been tested. Here, we use wall-to-wall airborne laser scanning (ALS) to simulate GEDI data (GEDIsim) over 7380 km² in the southern Sierra Nevada mountains in California and evaluate how well GEDI’s sampling scheme captures patterns of structural diversity. We evaluate functional richness and functional beta diversity in a biodiversity hot spot. GEDIsim performed well for trait retrievals (r² = 0.68) and functional richness mapping (r² = 0.75) compared to ALS retrievals, despite lower correlations in complex terrain with steep slopes. Functional richness patterns were strongly associated with soil organic carbon stocks and density as well as variables related to water availability and could be appropriately mapped by GEDIsim with and without cloud cover. Functional beta diversity was more strongly related to local changes in topography and more challenging to map, especially with decreasing sampling density. The reduced number of GEDIsim shots when simulating cloud cover led to a strong overestimation of beta diversity and a reduction of r² from 0.64 to 0.40 compared to ALS. The ability to map functional richness has been demonstrated with potential application at continental scales that could be transformative for our understanding of large-scale patterns of plant canopy structure, diversity and potential links to animal diversity, movement and habitats.


Introduction
NASA's Global Ecosystem Dynamics Investigation (GEDI) is a spaceborne lidar sensor designed specifically for measuring Earth surface structure, including detailed information about the 3D canopy structure of terrestrial vegetation (Dubayah et al 2020). GEDI was successfully launched and installed on the International Space Station (ISS) in December 2018 and started its official operational data acquisition in March 2019. GEDI provides measurements of the terrestrial Earth surface between 51.6° north and south, following the ISS path, over a minimum planned mission length of two years. Besides the goal of providing a contiguous large-scale biomass map of the world's temperate, Mediterranean and tropical forests at 1 km spatial resolution, GEDI provides a range of products that characterize 3D vegetation canopy structure (Dubayah et al 2020). The full-waveform sampling of GEDI allows the derivation of vertically resolved information related to canopy height, density and layering (Hancock et al 2019, Marselis et al 2019). Using this potential to map plant structural diversity can reveal large-scale biodiversity patterns and inform macroecology through its links to animal habitats, movement and diversity. However, since GEDI is a sampling instrument sending laser pulses that reach 25 m diameter on the ground, spaced at 60 m along track and 600 m across track, it has not yet been tested to what degree GEDI can capture large-scale diversity patterns and how GEDI-observed structural traits relate to established measurements from airborne laser scanning (ALS) acquisitions.
ALS combines light detection and ranging (lidar) with airborne scanning of actively emitted laser beams, resulting in a geo-located measurement of returned laser energy that can be discretized into a 3D point cloud (Wehr & Lohr 1999). It has proven successful in characterizing vegetation canopy structure and structural diversity for a range of traits, such as canopy height (Næsset & Økland 2002), plant area index (Schneider et al 2014), foliage height diversity (MacArthur & MacArthur 1961), the vertical distribution of plant material in the canopy (either through PAI profiles (Marselis et al 2018) or relative height (RH) of lidar energy (Drake et al 2002, Dubayah et al 2010)) and combinations thereof (Schneider et al 2017). The study of structural diversity, in particular of forests, has gained increasing interest due to its importance for the carbon cycle, ecosystem services, plant and animal diversity and habitat characterization (Bohn & Huth 2017, Vierling et al 2008, Davies & Asner 2014, LaRue et al 2019). How forests will react to ongoing global change is one of the main open questions in ecology and among the largest sources of uncertainty when predicting the carbon cycle and impacts of anthropogenic and climate change (Mitchard 2018, Baccini et al 2017, Hansen et al 2019). At the same time, we are heading towards the sixth mass extinction of species on Earth (Ceballos et al 2015, Barnosky et al 2011) and suffering severe losses of biodiversity and key species (Isbell et al 2017, Jones et al 2018, Díaz et al 2019), which calls for global biodiversity monitoring and immediate actions to be taken. The Convention on Biological Diversity (CBD) helps guide those actions with a vision to ensure the valuation, conservation and restoration of biodiversity and its sustainable use through a set of policy-relevant targets (Mace et al 2018).
Many targets are not met or only partially met, though, and the assessment of the state of biodiversity remains a challenge, as Díaz et al (2019) point out in summarizing the global assessment report of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES). Therefore, global mapping of structural diversity, as an important dimension of biodiversity and habitat quality, is of high importance and urgency (Grassi et al 2017).
The diversity of vegetation canopies influences light distribution and utilization within the canopy, with higher structural diversity often leading to a more effective use of energy and thus increased ecosystem productivity (Bohn & Huth 2017, Williams et al 2017). However, the lack of spatially explicit data at the landscape level and across biomes, climatic and edaphic gradients on plant structure, diversity and function limits our understanding of the diversity-productivity relationship (Schimel et al 2015, Sandel et al 2015). In addition to GEDI, operational and upcoming missions to measure ecosystem function (e.g. OCO-2/3, ECOSTRESS, FLEX), structure (e.g. ICESat-2, MOLI, BIOMASS) and physiology (e.g. HISUI, EnMap, SBG) will help to fill these gaps by providing (near) global coverage and enabling the study of related processes globally (Stavros et al 2017). Structural diversity is one key aspect that could not only be linked directly to certain functions but also be incorporated in dynamic vegetation models to improve predictions of carbon fluxes under global change drivers (Antonarakis et al 2014, Rödig et al 2018, Braghiere et al 2019). Increased structural diversity could provide a wider range of species niches and habitats supporting a larger number of species. Therefore, lidar-derived structural diversity can be a proxy for biodiversity of plants and animals (Vierling et al 2008, Simonson et al 2014). A range of ALS studies show good relationships between structural lidar measurements and plant species diversity (Hernández-Stefanoni et al 2014, Zellweger et al 2017), and Marselis et al (2019) demonstrated the potential of GEDI to map tree species diversity in tropical forests.
Lidar measurements can provide unique features to map and characterize habitats for animals, such as birds (Müller et al 2010, Seavy et al 2009), mammals (Zhao et al 2012, Davies & Asner 2014) and insects (Müller & Brandl 2009, Müller et al 2014), and can be used in species distribution models (Randin et al 2020). GEDI is the first spaceborne sensor specifically designed to map 3D terrestrial vegetation structure, opening up a new era for biogeography and macroecology.
In this study, we simulate GEDI data (GEDI sim) to assess its ability to capture diversity patterns from space. Advantages of a simulation study are that there are neither temporal nor geolocation mismatches between the reference airborne and the simulated spaceborne data and that a larger range of GEDI sampling densities can be simulated, including simulations with and without cloud cover. We investigate the following four research questions: (1) How do waveform-based structural traits of GEDI sim compare to discrete return ALS traits? (2) Do trait-to-trait relationships hold between ALS and GEDI sim? (3) How does GEDI sim capture functional richness and beta diversity with and without the simulation of cloud cover? (4) What is the relationship between functional richness and beta diversity and the environment (climate, topography, soil)? We investigate these questions over a heterogeneous mountain landscape, which provides a hot spot for plant biodiversity in the temperate and Mediterranean biomes comprising over 50% of California's plant diversity with more than 3500 native species (CWWR 1996).

Study area
The study area is located between Yosemite and Sequoia National Parks (NP) in the southern Sierra Nevada mountains of California (figure 1). The area spans 7380 km² of Kings Canyon NP and parts of Sequoia, Sierra and Inyo National Forests (NF). The area is characterized by complex mountainous terrain, spanning a total elevation range of 3000 m. The climate is Mediterranean with cool winters and long warm summers, with mean annual temperature ranging from −2.7 to 18.0 °C (Fick & Hijmans 2017). Summers are dry and prone to fires, with mean annual precipitation ranging from 125 to 1024 mm (Fick & Hijmans 2017). Temperature and precipitation are highly variable and characterized by strong gradients from west to east due to the terrain and Pacific storm systems moving in from the west (CWWR 1996).
The distribution of vegetation types is mainly driven by elevation and major valleys, but is variable locally depending on water availability, evaporative demand and disturbance history from fires, storm blowdowns, insect and pathogen infestations and avalanches (CWWR 1996, Fites-Kaufman et al 2007). The study area includes part of the lower elevation chaparral and oak savanna-type vegetation and some giant Sequoia trees in the southwestern part. Dominant species are white and Douglas fir (A. concolor, P. menziesii), hemlocks (T. mertensiana) and lodgepole, sugar, ponderosa and Jeffrey pine (P. contorta, P. lambertiana, P. ponderosa, P. jeffreyi). Vegetation transitions to a zone of alpine vegetation at high elevations. Canyons are characterized by California laurel (U. californica), canyon live oak (Q. chrysolepis) and white alder, quaking aspen and tree willows (A. rhombifolia, P. tremuloides, S. lasiolepis) along streams and swampy meadows (CWWR 1996, Fites-Kaufman et al 2007).

Laser scanning data
Airborne laser scanning (ALS) data were acquired during the summers of 2016 and 2017 as part of NASA's Airborne Snow Observatory (ASO). A full-waveform scanning lidar system (Riegl Q1560) was operated at an altitude of 6550 m asl, with a nominal footprint size of 0.75 m to 1.5 m depending on elevation and a swath width of ≈ 4.3 km. We combined multiple flight strips and optimized their co-registration and geolocation accuracy following Ferraz et al (2018) to create a dense point cloud with an average of 6 pts m⁻². A Delaunay triangulation was calculated from the points classified as ground to create the digital terrain model (DTM) and normalize the point cloud by the DTM. Additional details about the point cloud acquisition, processing and the data set itself will be published in Ferraz et al (2020) (in preparation).
Spaceborne laser scanning data was simulated according to the expected GEDI sensor and acquisition characteristics using the approach of Hancock et al (2019), which was previously tested (Marselis et al 2019, Duncanson et al 2020) and validated against NASA's airborne Land, Vegetation and Ice Sensor (LVIS) and related products (Hancock et al 2019). We simulated two years of operational mission data assuming a 5% data transmission loss and 60% power allocation time, resulting in 395 days of data acquisition. This leads to 594,462 GEDI sim shots in total and 81 shots per square kilometer on average. However, GEDI's laser beam at 1064 nm does not penetrate clouds. Since cloud cover is highly clustered and mainly persistent at very high elevations, we used the cloud climatology data set by Wilson & Jetz (2016) to simulate the expected GEDI sim coverage. We randomly removed a percentage of days per month and 1 km² pixel from GEDI sim based on the average monthly 1 km² cloud frequency calculated from the years 2000 to 2014 (Wilson & Jetz 2016). This does not fully account for spatial clustering of clouds exceeding the 1 km² scale, which does not influence the diversity metrics derived at the same scale but might lead to more randomly distributed data gaps in the final maps. The final GEDI sim coverage over time simulated with cloud cover is shown in supplementary figure S1 (https://stacks.iop.org/ERL/15/115006/mmedia).
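The cloud thinning step can be illustrated with a minimal Python sketch. It is not the simulator used in this study: the function name `thin_by_cloud`, the dictionary-based shot records and the per-day Bernoulli draw (which removes the stated percentage of days only in expectation) are assumptions made for illustration.

```python
import random

def thin_by_cloud(shots, cloud_freq, seed=0):
    """Drop whole acquisition days per (pixel, month) according to cloud frequency.

    shots: list of dicts with keys 'pixel', 'month' and 'day' (hypothetical layout)
    cloud_freq: dict mapping (pixel, month) -> monthly cloud frequency in [0, 1]
    """
    rng = random.Random(seed)
    cloudy = {}  # one clear/cloudy draw per (pixel, month, day), shared by all its shots
    kept = []
    for s in shots:
        key = (s["pixel"], s["month"], s["day"])
        if key not in cloudy:
            # Assumption: each day is cloudy with probability equal to the monthly frequency.
            cloudy[key] = rng.random() < cloud_freq[(s["pixel"], s["month"])]
        if not cloudy[key]:
            kept.append(s)
    return kept

# Hypothetical example: two 1 km pixels, June, three acquisition days, two shots per day.
shots = [
    {"pixel": p, "month": 6, "day": d, "shot": k}
    for p in ("A", "B") for d in (3, 11, 27) for k in (0, 1)
]
cloud_freq = {("A", 6): 0.0, ("B", 6): 1.0}
kept = thin_by_cloud(shots, cloud_freq)
```

Drawing once per pixel, month and day keeps or removes all shots of a day together, which mirrors the removal of days rather than of individual shots.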

Functional trait mapping
We used the approach of Schneider et al (2014) and Schneider et al (2017) to map canopy height, relative heights (RH) as percentiles of the vertical distribution of canopy points at 25, 50, 75 and 95%, plant area index (PAI) for the whole vertical column and per 10 m height layers (PAI0-10, PAI10-20, PAI20-30, PAI30-40) and foliage height diversity (FHD) from ASO ALS data. Additionally, we calculated the canopy ratio (CR) as the percentage of canopy depth to canopy height as follows:

\[ \mathrm{CR} = \frac{\text{canopy depth}}{\text{canopy height}} \times 100\% \qquad (1) \]

The above-mentioned traits are also output by the GEDI simulator and were used to test the potential of GEDI sim. Relative height refers to the vertical distribution of returned lidar energy (mostly following the distribution of plant material), whereas plant area index defines plant area per unit ground area for a given vertical extent. An RH25 value of 6 m means that 25% of energy is located below 6 m from the ground and a PAI0-10 of 2 means that there are 2 m² of leaves and branches per m² ground within 10 m from the ground. In both cases, the ground location has to be known. Here, we used GEDI sim traits derived using Gaussian fitting to detect the ground. It has to be noted that for GEDI sim RH and FHD the full GEDI sim waveform was used, as for the GEDI level 2 products. The traits were derived relative to the ground location but including ground energy due to the difficulty of separating ground and canopy energy with long pulses on slopes, whereas for the ALS RH metrics only canopy points above ground were used. For that reason, we applied a correction factor to the GEDI sim RH values to limit the maximum height at very steep slopes (above about 50°; see supplementary figure S2 for more details). CR was then calculated using corrected GEDI sim RH as defined above (equation (1)).
Including ground energy in the case of GEDI sim also changes the interpretation of GEDI sim CR, since it is not only a function of vegetation height distribution but also fractional cover and the distribution of ground energy. The GEDI sim metrics were calculated at the footprint level following the specifications of GEDI (Gaussian energy distribution within footprint with about 80% of the energy contained in a 22 m diameter). Therefore, we calculated the ALS area-based metrics on a 20 m grid to approximate a similar spatial grain.
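The trait definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the processing chain of Schneider et al (2014, 2017) or of the GEDI simulator: the function names are hypothetical, RH is computed here from point heights rather than waveform energy, FHD is taken as the Shannon entropy of the PAI share per height layer, and canopy depth and canopy height are passed in directly because their derivation from RH metrics is not reproduced here.

```python
import math

def relative_height(heights, pct):
    """Height below which pct percent of the returns lie (linear interpolation)."""
    h = sorted(heights)
    pos = (len(h) - 1) * pct / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(h) - 1)
    return h[lo] + (h[hi] - h[lo]) * (pos - lo)

def foliage_height_diversity(pai_layers):
    """Shannon entropy of the share of PAI found in each height layer (assumed definition)."""
    total = sum(pai_layers)
    if total == 0:
        return 0.0
    return -sum((p / total) * math.log(p / total) for p in pai_layers if p > 0)

def canopy_ratio(canopy_depth, canopy_height):
    """Canopy depth as a percentage of canopy height."""
    return 100.0 * canopy_depth / canopy_height if canopy_height > 0 else 0.0

# Hypothetical canopy return heights (m above ground) and PAI per 10 m layer.
heights = [0.0, 2.0, 4.0, 6.0, 8.0]
rh25 = relative_height(heights, 25)
rh50 = relative_height(heights, 50)
fhd = foliage_height_diversity([1.0, 1.0, 1.0, 1.0])
cr = canopy_ratio(15.0, 20.0)
```

An even distribution of plant material over four layers gives the maximum FHD for that number of layers (ln 4), whereas a canopy confined to a single layer gives zero.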

Functional diversity mapping
We followed the concept of Schneider et al (2017) to map plant functional diversity based on morphological canopy structure traits. In addition to canopy height, density (PAI) and layering (FHD), which have been successfully mapped by Schneider et al (2017) for a range of scales, we also included canopy ratio and the density of plant material from 0 to 10 m above ground. These are important descriptors of the canopy structure related to the compactness and filling of the canopy space as well as the presence of understory and low vegetation. To derive functional diversity, we analyzed the distribution of pixels or GEDI sim shots of an area of interest (e.g. 1 km²) in the functional space defined by the five traits described above.
We adapted the concept of Schneider et al (2017) to work with trait probability densities (TPD) in the functional trait space following Carmona et al (2016b), allowing the derivation of functional richness (FRic) as the percentage of trait space occupied by a minimum density of pixels or GEDI sim shots (figure 2). In contrast to FRic derived as convex hull volume of pixels in trait space, this accounts for concave shapes or gaps in trait distributions. The approach works when dealing with varying sampling sizes per unit area, as is the case with the irregular spatial sampling of GEDI. Defining a density threshold for occupied space makes functional richness more robust and less susceptible to outliers than using the convex hull volume approach (Carmona et al 2016a, Blonder 2016). Moreover, the TPD approach provides a suitable concept to derive functional beta diversity (FBeta), defined as the non-overlapping areas of two density distributions from spatially separate areas (Carmona et al 2016b, figure 2).

Figure 2. Trait probability density (TPD) in one, two and three dimensions (1D, 2D, 3D) based on canopy height, density as plant area index (PAI) and layering as foliage height diversity (FHD). This concept can be extended to any number of dimensions (nD) and allows the derivation of functional richness and functional beta diversity, among many other dimensions of functional diversity. We derived functional richness (FRic) as percentage of occupied trait space and functional beta diversity (FBeta) as percentage of unique, non-overlapping probability density. The example shows FRic for two adjacent 1 km² pixels (Area 1 & 2) and its corresponding FBeta.
For the functional diversity calculations, we linearly scaled the traits from 0 to 1 with a 0.1% cut-off of extreme values. We calculated TPD estimates on a 5D sampling grid of the trait space at 0.1 intervals for 1 km² spatial grid cells (mvksdensity in Matlab 2019a). We used the same grids for the calculation of ALS and GEDI sim trait densities. We then calculated functional richness as the percentage of trait space grid points with a TPD higher than 2 points per 0.1 kernel bandwidth. For functional beta diversity, we applied a moving window with a 3 × 3 neighborhood and calculated pair-wise functional beta diversity between each 1 km² pixel and its eight neighbors. Since the strongest trait turnover takes place within the first few kilometers (supplementary figure S5), we chose the smallest window size of 3 × 3 km². We then calculated the average non-overlapping densities as functional beta diversity without applying a density threshold. Figure 2 illustrates the concept and sampling of the trait space for 1D, 2D and 3D examples that can be extended to nD trait spaces.
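The following Python sketch illustrates the TPD calculation in a reduced 2D trait space so that it runs quickly. It is not the Matlab mvksdensity workflow used here: the Gaussian product kernel, the normalisation of the gridded density to a sum of one, the function names and the threshold value of 0.005 are assumptions chosen for illustration, and the density threshold of this study is expressed differently.

```python
import math
from itertools import product

def scale01(values, cut=0.001):
    """Linearly scale to [0, 1] after clipping the lowest and highest `cut` fraction."""
    s = sorted(values)
    lo = s[int(cut * (len(s) - 1))]
    hi = s[int(round((1 - cut) * (len(s) - 1)))]
    span = (hi - lo) or 1.0
    return [min(1.0, max(0.0, (v - lo) / span)) for v in values]

def tpd(points, step=0.1, bandwidth=0.1):
    """Gaussian-kernel density of scaled trait vectors on a regular grid, summing to 1."""
    dims = len(points[0])
    axis = [i * step for i in range(int(round(1 / step)) + 1)]
    dens = {}
    for node in product(axis, repeat=dims):
        dens[node] = sum(
            math.exp(-0.5 * sum(((n - x) / bandwidth) ** 2 for n, x in zip(node, p)))
            for p in points
        )
    total = sum(dens.values())
    return {node: d / total for node, d in dens.items()}

def functional_richness(density, threshold):
    """Percentage of grid nodes whose density exceeds the threshold."""
    return 100.0 * sum(d > threshold for d in density.values()) / len(density)

def functional_beta(dens_a, dens_b):
    """Percentage of probability density that the two distributions do not share."""
    overlap = sum(min(dens_a[n], dens_b[n]) for n in dens_a)
    return 100.0 * (1.0 - overlap)

# Hypothetical scaled trait pairs for two areas occupying different parts of trait space.
area1 = [(0.2, 0.2), (0.25, 0.3), (0.3, 0.2)]
area2 = [(0.8, 0.8), (0.7, 0.75), (0.8, 0.7)]
d1, d2 = tpd(area1), tpd(area2)
fbeta = functional_beta(d1, d2)

# Traits clustered in one spot fill less of the trait space than traits spread across it.
clustered = tpd([(0.5, 0.5)] * 5)
spread = tpd([(0.1, 0.1), (0.9, 0.9), (0.1, 0.9), (0.9, 0.1), (0.5, 0.5)])
fric_clustered = functional_richness(clustered, threshold=0.005)
fric_spread = functional_richness(spread, threshold=0.005)
```

With five traits the same grid has 11^5 nodes, so a vectorised or compiled density estimator would be preferable to this pure-Python loop.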

Statistical analyses
We compared GEDI sim and ALS traits by direct trait correlation using linear, power-law and logarithmic regression. We assessed trait-to-trait relationships and compared traits in principal component space. We present the analysis for three components in the results based on supplementary figure S3, with the third component still explaining 4.3 and 5.5% of the variance for ALS and GEDI sim traits (supplementary table S1). Since GEDI sim shots do not necessarily fall on ALS pixel centers, we calculated ALS traits at 10 m resolution and used bilinear interpolation to derive an ALS trait average at each GEDI sim shot location.
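A minimal Python sketch of the two steps described above, sampling a gridded ALS trait at a shot location by bilinear interpolation and scoring agreement with r², is given below. The raster layout (values stored at pixel centres, origin at the lower corner of the first pixel) and the function names are assumptions; only the linear case of the regressions is shown.

```python
import math

def bilinear(grid, res, x, y):
    """Interpolate a raster at (x, y).

    grid[row][col] holds the value at pixel centre
    (col * res + res / 2, row * res + res / 2); the grid must be at least 2 x 2.
    """
    fx = x / res - 0.5
    fy = y / res - 0.5
    c0 = min(max(int(math.floor(fx)), 0), len(grid[0]) - 2)
    r0 = min(max(int(math.floor(fy)), 0), len(grid) - 2)
    tx = min(max(fx - c0, 0.0), 1.0)  # clamp so edge locations take the nearest centre values
    ty = min(max(fy - r0, 0.0), 1.0)
    first = grid[r0][c0] * (1 - tx) + grid[r0][c0 + 1] * tx
    second = grid[r0 + 1][c0] * (1 - tx) + grid[r0 + 1][c0 + 1] * tx
    return first * (1 - ty) + second * ty

def r_squared(x, y):
    """Squared Pearson correlation, equal to the r2 of a simple linear regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Hypothetical 10 m ALS canopy height raster and one shot location between four centres.
als_height = [[0.0, 10.0], [20.0, 30.0]]
value_at_shot = bilinear(als_height, res=10.0, x=10.0, y=10.0)
```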
For interpretation of the functional diversity maps, we ran a random forest regression model to derive the most important environmental predictors of functional richness and beta diversity among sets of climate, soil and topography variables (see supplementary tables S2, S3 and S4). In a first step, we selected the top 15 predictors of functional richness for each set based on predictor importance estimates from permutations of out-of-bag predictor observations of 300 regression trees (fitrensemble, oobPermutedPredictorImportance in Matlab 2019a). We then applied a principal component analysis on the normalized predictors to map the major environmental patterns based on the first three components. We used the same 15 climate, soil and topography variables to estimate their importance for predicting beta diversity and repeated the analysis using all 45 variables together.
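The idea behind permutation-based predictor importance can be sketched in Python as follows. This is a simplified, model-agnostic version and not the Matlab routine named above: it scores any fitted `predict` callable on the data it is given instead of using the out-of-bag observations of each tree, and the stand-in model in the example is a plain function rather than a random forest. All names are hypothetical.

```python
import random

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in mean squared error after shuffling each predictor column.

    predict: callable taking one row (list of predictor values) and returning a prediction
    X: list of rows, y: list of observed responses
    """
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    importances = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between predictor j and the response
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increase += mse(permuted) - baseline
        importances.append(increase / n_repeats)
    return importances

# Hypothetical data: the response depends on the first predictor only.
X = [[float(i), float((7 * i) % 20)] for i in range(20)]
y = [2.0 * row[0] for row in X]
importance = permutation_importance(lambda row: 2.0 * row[0], X, y)
```

A predictor that the model does not use leaves the error unchanged when shuffled and therefore receives an importance of zero, which is also why importance can be shared unevenly among co-varying predictors.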

Functional traits comparison
The first step in evaluating GEDI's potential for mapping functional diversity is to compare GEDI sim traits to the ones derived from ALS data. On average, GEDI sim traits explained 68% of ALS trait variability for the 12 traits analyzed. Supplementary figure S4 shows the linear trait correlations with a tendency of higher GEDI sim values in steep areas of low vegetation height and density. The average r² increases to 0.72 in areas with less steep slopes (< 10°). Generally, estimates of the upper canopy are closer to ALS traits than those from the mid- or understory, where GEDI sim RH values tend to be lower due to the inclusion of the ground energy. The most correlated traits are RH75, RH95 and maximum canopy height, as well as PAI in height . Trait relationships generally hold between ALS and GEDI sim estimates, with GEDI sim capturing 59% of variation in ALS trait correlations (figure 3, see supplementary figure S5 for the full matrix). GEDI sim traits include information from the canopy and the underlying terrain and therefore show lower correlations among each other. Differences also arise from the differing trait distribution of PAI in the lowest layer, 0-10 m above ground.

Major trait axes
We applied a principal component analysis to show major trait axes, possible correlations and trait contributions (loadings) to the first three components (figure 4). Small loadings could suggest little information content or redundancy with other traits and thus guide trait selection. Height-related traits mainly contributed to the first component, explaining 78% and 69% of total ALS and GEDI sim variance (see supplementary table S1). The contribution is similar and none of the RH traits emerge as strong independent components, but the spread among the second and third component shows independent contributions to trait variability. FHD behaves similarly and is close to canopy height, but also shows a strong contribution to the first principal component on its own. Some of the strongest and most independent loadings stem from CR, PAI and density of the understory (PAI 0-10 m). These are strong contributors to the second and third components, explaining 13% and 4% of ALS and 18% and 6% of GEDI sim variance. Conversely, PAI layers above 10 m do not have a strong contribution, indicating little added information content due to many zero values and correlations to total PAI. This is the case for both ALS and GEDI sim derived traits.

Figure 5. Functional richness (FRic, top row) and functional beta diversity (FBeta, bottom row) at 1 km resolution in percent of filled and non-overlapping trait probability density space, respectively. Maps on the left are derived from spatially continuous airborne laser scanning (ALS) traits, compared to maps derived from simulated GEDI data (GEDI sim) without (middle) and with cloud cover (right).

GEDI's ability to map functional diversity from space
We mapped functional richness and functional beta diversity from five structural traits, namely mean canopy height, plant area index, PAI at 0-10 m, foliage height diversity and canopy ratio, at 1 km² spatial resolution (figure 5). FRic is generally higher along the main valleys and in the western part of the study area and lower in the central area and at higher elevations. These patterns are related to temperature seasonality, soil organic carbon stock, cation exchange capacity and the generalized DTM, which are the best predictors of functional richness at the landscape scale based on a random forest regression model (figure 6). Soil organic carbon stock and variables controlling water availability through runoff (distance to river), radiation (sky view factor, diurnal anisotropic heating) and soil properties (occurrence of bedrock and distance to bedrock) are most important when combining all environmental variables in one model. These general patterns are well captured by GEDI sim too, both with and without the simulation of clouds (r² of 0.85 without and 0.75 with cloud cover). There is a slight underestimation of FRic in areas with denser, taller and more diverse forest canopies, whereas GEDI sim overestimates FRic in areas with lower vegetation and complex steep terrain (supplementary figures S7, S9). Differences do not change much with cloud cover, but tend to be slightly more negatively biased due to the reduced number of samples (supplementary figures S8, S10).
Functional beta diversity is an indicator of trait turnover and unique niche space that is not shared from one area to another (Carmona et al 2016b). Therefore, FBeta is high along the major valleys and canyons, where there is a strong shift in plant community traits from riparian and shrubby canyon vegetation to more open, mixed coniferous forests (figure 5). Moreover, FBeta is high at higher elevations, where vegetation gets patchy and its occurrence and structure are more strongly determined by geomorphological activity, avalanches, microtopography and soil. This is best described by changes in topography (ridge level, generalized DTM, sky view factor) followed by the occurrence of bedrock and changes in climatic moisture index (figure 6). GEDI sim captures some of these patterns, but with an overestimation when simulating clouds. The trait turnover along major valleys is less visible in GEDI sim derived maps and the error is strongly related to sampling density (supplementary figures S11, S12). The reduced number of GEDI sim shots when simulating cloud cover leads to a strong overestimation of beta diversity and a reduction of r² from 0.64 to 0.40 compared to ALS (supplementary figure S8).

Discussion
GEDI is designed to provide 3D canopy structure information from space by sampling vertical canopy profiles with a 25 m footprint laser. The measured laser waveform includes the energy returned from the ground, which can be strongly elongated in steep terrain. Ground energy might be mixed with energy returned from understory vegetation or small trees. In this case, disentangling the contribution from ground and vegetation canopy is challenging and can lead to inaccurate estimates of canopy structure traits, such as the systematic overestimation of high canopy ratio and low foliage height diversity. In extreme cases, complex steep terrain with boulders can look like vegetation due to multi-modal ground returns (Hancock et al 2012). Considering the challenging terrain, GEDI sim performed well in terms of structural trait retrieval with best results in flat areas and for traits related to the upper canopy. Further research is needed to develop an algorithm to decompose the waveform into ground and canopy energy, but issues might remain in steep areas with dense understory vegetation.
Functional diversity estimates are dependent on the accuracy and selection of traits and the spatial scale used to derive them (Funk et al 2017, Anderson 2018). For functional diversity analyses, we suggest including plant functional traits that are functionally relevant in terms of growth, reproduction or survival (Violle et al 2007, Díaz et al 2015), ecologically relevant in terms of competition, niche space or succession (Kunstler et al 2016, Cadotte 2017) and that build independent trait axes without functional redundancy or over-representation of one trait axis (Petchey & Gaston 2006). We followed this suggestion and the principal component analysis showing five major trait axes for both ALS and GEDI sim traits. Optimizing the trait selection towards best correlation to ALS could have potentially improved GEDI sim results, but would also have changed the meaning and relevance of functional diversity. The impact of changing trait combinations and number of traits on diversity estimates is not assessed here, but can be relevant depending on the science question, application and scale (Roscher et al 2012, Zhu et al 2017, Jarzyna & Jetz 2018). Trait selection might be less critical at large scales where traits and their spatial organization are more likely to be correlated, as was shown by Schneider et al (2017) for diversity of morphological and physiological forest traits along an environmental gradient.
We mapped functional diversity at 1 km resolution, which we think is the smallest reasonable resolution for our application with 72% of pixels having more than 20 and 44% more than 40 GEDI sim shots per pixel (see supplementary figure S13). This number drops to only 12% and 0% respectively at 500 m resolution considering simulated cloud cover. The results also show that a lower resolution might be needed to create a gap-free diversity product of the area. The fusion with additional spatially continuous data sets like Sentinel, Landsat, Planet, or the upcoming NISAR might help to create a higher resolution gridded product. Nevertheless, the GEDI sim data analysis in this study indicates that GEDI will provide a unique spatial grain and extent on canopy structure and diversity, which has previously been unavailable and could help scale existing ground-based trait and diversity maps. For example, the global trait maps of Bruelheide et al (2018) are interpolated at 10 km with many remaining gaps, whereas Butler et al (2017) modeled a continuous global coverage but for 50 km cells. Current global data sets of animal diversity are produced at 2° (≈ 200 km) (Hurlbert & Jetz 2007, Pollock et al 2017).
Functional richness calculated as the filled probability density space of five major canopy structure traits shows broad patterns of diversity cold and hot spots. The availability of water is strongly associated with those patterns in a Mediterranean climate with prolonged summer dryness. Therefore, environmental variables related to soil water availability, evaporative demand and incident radiation, as well as distance to stream and precipitation, are shaping the distribution of plant communities. These might be the main drivers of structural diversity, whereas the strong link of soil organic carbon stock and density with plant functional richness could be a result of the higher plant productivity and biomass in more diverse ecosystems. Many studies have shown positive effects of plant biodiversity on productivity (Liang et al 2016, Huang et al 2018), with a strong link through structural diversity as a regulator for radiation interception and increased resource use efficiency (Bohn & Huth 2017, Williams et al 2017). Those general patterns of functional richness agree with total species richness reported by Wathen et al (2014) for the Kings Canyon NP and are captured well by GEDI sim simulated both with and without clouds.
Abiotic factors are shaping the distribution of plant structure and species, especially in a water and temperature limited system as presented here. However, the importance of individual variables has to be taken with care, since random forest feature importance can strongly vary depending on how many co-varying features were included in the analysis. Furthermore, the SoilGrids dataset was modeled with spatial covariates including long-term MODIS enhanced vegetation index, near and middle infrared reflectance and land surface temperature, among others (Hengl et al 2017). Therefore, correlations to functional diversity might not purely reflect soil properties.
Overestimation of functional richness happens in areas with low vegetation or sparse coverage and complex, steep terrain due to topography effects on the laser waveform discussed above. Underestimation occurs in areas of higher canopy structural complexity and tree height due to the spatial sampling of GEDI. Since functional richness is a measure of total occupied niche space, the spatial sampling and heterogeneity of the landscape will determine how much of the total trait diversity GEDI can capture. When sampling density decreases with increasing cloud cover, it is more likely to miss certain trait combinations and thus underestimate functional richness. This could have implications for mapping tropical forests, which are especially affected by cloud cover but would require a dense sampling to capture the large heterogeneity of canopy structure. At our study site, a simple relationship between terrain slope, topographic diversity and tree cover can explain about a third of the observed over- and underestimation of functional richness (supplementary figures S14 and S15) and could provide a guide to build hypotheses on where to expect potential over- and underestimation of diversity in California (figure 7). One caveat of the current approach is that we are not providing a ground validation, which is challenging and also potentially inaccurate at 20 m grain and 1 km² extent. Therefore, we rely on comparisons to state-of-the-art ALS with wall-to-wall coverage. Future research will be needed to further test this method at additional study sites, covering a larger range of vegetation types and environments and with real GEDI data.
Functional beta diversity is more complex, since it describes trait turnover and shifts in trait distributions from one region to another. At the scale of 1 km and 3x3 km neighborhoods, it is more related to trait turnover along large environmental gradients than to local species beta diversity between communities. The beta diversity patterns shown in figure 5 clearly reflect the shifts in canopy structure between wetter and more densely vegetated valleys and canyons and drier open areas with sparser coniferous forests, best reflected by change in topographic variables. This is in line with Jucker et al (2018), who found that topography is shaping the distribution of forest structure and diversity through its impact on microclimate and local soil variability, which might not be captured by the more broadly modeled global climate and soil data sets. Moreover, it shows the patchiness of vegetation at higher elevation, where canopy structure can change dramatically based on disturbance history. These patterns are captured by GEDIsim simulated without clouds, but disappear with decreasing sampling density. There is no clear threshold, but the errors seem to stabilize at around 40 or more samples per unit area in this study area (supplementary figures S11 and S12) and potentially more in regions of higher local variability. Fusion with other structural data sets, e.g. from ICESat-2, might overcome issues due to low sampling density and may enable improved capabilities (Neuenschwander & Pitts 2019).
The implications of near-global maps of functional richness and beta diversity for assessing the state of biodiversity, as mandated by IPBES and related CBD targets, could be large, because they break ground for new ways of observing ecosystem structure that have not been available before. Ecosystem structure has been identified as a key Essential Biodiversity Variable (EBV) class and the proposed approach could contribute or be added to candidate EBVs as well as CBD targets 5 (Habitat loss, fragmentation and degradation), 7 (Sustainable management) and 15 (carbon sequestration, ecosystem resilience), see O'Connor et al (2015). In combination with models or other sources of biodiversity data, the mapping of functional richness and beta diversity with GEDI could also help address CBD targets 9 (Control of invasive alien species), 11 (protected areas) and 14 (ecosystem services safeguarded) and more broadly help to sustainably manage forests and monitor land degradation and biodiversity loss (Sustainable Development Goal 15, Díaz et al 2019). GEDI enables a whole suite of new ecosystem structure products that can be used for monitoring in the future (Dubayah et al 2020). Finally, for a long time the diversity of plant canopy structure was neglected or oversimplified in global Earth system models, leading to potential errors in the radiation budget and underestimation of plant photosynthesis (Braghiere et al 2019). This could be changed with the availability of a GEDI diversity product and its integration into dynamic vegetation models (Schimel et al 2019).

Conclusion
GEDI provides unique measurements of 3D canopy structure from space and the simulation study presented here shows the potential to successfully characterize functional richness at large spatial scales. Care has to be taken in steep areas with low or sparse vegetation cover so that variation in ground returns is not mistaken for plant functional richness. Implications for the understanding of large-scale patterns of plant canopy structure, structural diversity and potential links to animal diversity, movement and habitats are immense and could be transformative for global ecology (Schimel et al 2019). Our results suggest that functional richness could be estimated from GEDI data with little influence of sampling density, whereas functional beta diversity shows large uncertainty in areas of low coverage. Estimating biodiversity from functional traits could have a range of advantages. GEDI provides consistent measurements of canopy height, layering and density over all of the world's temperate, Mediterranean and tropical forests, except for the discussed issues in mountainous areas and possible data gaps due to cloud cover. This provides a new view on biodiversity, including intra-specific diversity and the vertical component of canopy structure in a range of natural, managed and disturbed forests. This could help monitor biodiversity and policy targets and greatly improve the representation of plant canopies in dynamic vegetation and land surface models, improving our understanding of the carbon cycle and ecological forecasting.
GEDI products have been released month by month on NASA's LP DAAC since January 2020. Dubayah et al (2020) show the excellent fidelity of GEDI on-orbit waveforms compared to airborne lidar, but future research is needed to test GEDI level 2A and 2B products for functional diversity mapping across biomes, once sufficient coverage is available at near-global extent.