Combining GEDI and Sentinel-2 for wall-to-wall mapping of tall and short crops

High resolution crop type maps are an important tool for improving food security, and remote sensing is increasingly used to create such maps in regions that possess ground truth labels for model training. However, these labels are absent in many regions, and models trained in other regions on typical satellite features, such as those from optical sensors, often exhibit low performance when transferred. Here we explore the use of NASA's Global Ecosystem Dynamics Investigation (GEDI) spaceborne lidar instrument, combined with Sentinel-2 optical data, for crop type mapping. Using data from three major cropped regions (in China, France, and the United States) we first demonstrate that GEDI energy profiles are capable of reliably distinguishing maize, a crop typically above 2m in height, from crops like rice and soybean that are shorter. We further show that these GEDI profiles provide much more invariant features across geographies compared to spectral and phenological features detected by passive optical sensors. GEDI is able to distinguish maize from other crops within each region with accuracies higher than 84%, and able to transfer across regions with accuracies higher than 82% compared to 64% for transfer of optical features. Finally, we show that GEDI profiles can be used to generate training labels for models based on optical imagery from Sentinel-2, thereby enabling the creation of 10m wall-to-wall maps of tall versus short crops in label-scarce regions. As maize is the second most widely grown crop in the world and often the only tall crop grown within a landscape, we conclude that GEDI offers great promise for improving global crop type maps.


Introduction
Crop type maps are a crucial step toward estimating crop area, mapping yield, studying local nutritional outcomes, and developing hydrological models (Boryan et al. 2011, Jin et al. 2019). Recent years have seen significant progress in remote sensing-based crop type mapping, particularly in high-income countries, with maps now produced in the US (USDA-NASS 2020), Canada (Agriculture and Agri-Food Canada 2021), much of Europe (Defourny et al. 2019, Belgiu & Csillik 2018), and parts of Asia (You et al. 2021). While often high in accuracy, the models that produce these maps remain local, in the sense that model application is confined to the region where ground labels exist for crop types. Applying these models outside the region of training leads to rapid performance declines (Kluger et al. 2021), because the models largely use optically sensed time series as features. These time series, which reflect crop phenology, change from region to region as growing season timing, climate, management practices, soil properties, and crop varieties change. As a result, crop type maps remain elusive in places where ground labels are scarce, which includes the vast majority of low- and middle-income countries.
To date, solutions proposed for creating crop type maps in label-scarce regions include matching satellite time series to crop type profiles (Foerster et al. 2012, Belgiu et al. 2021), designing machine learning models that need fewer labels to perform well (Jean et al. 2019, Tseng et al. 2021), substituting crowdsourced labels for survey-based labels (Wang et al. 2020), and investing more resources to collect ground data in low-income regions (Lambert et al. 2018, Jin et al. 2019, Rustowicz et al. 2019). Another potential solution is finding remote sensing features that are invariant to geographic shifts; in other words, finding a remote sensing modality under which a particular crop type looks the same everywhere on Earth. So far, such a feature has not been found in multi-spectral imagery at the spectral resolution of MODIS, Landsat, or Sentinel-2, or in radar imagery like that acquired by Sentinel-1, but the ever-growing list of sensors offers new possibilities each year.
The Global Ecosystem Dynamics Investigation (GEDI) is a spaceborne light detection and ranging (lidar) sensor that was launched in late 2018 and installed on the International Space Station. As a lidar waveform instrument, GEDI measures the reflection of a laser beam off of vegetation and the ground surface, with a nominal spatial resolution of 25m. The waveforms are then processed to provide information on surface topography, canopy height, canopy cover, and vertical canopy structure. GEDI was designed with the goal of improving measures of forest canopy structure, and several recent studies have applied GEDI to this end (Schneider et al. 2020, Potapov et al. 2021). Because spaceborne lidar sensors typically provide only a sparse sampling of the Earth's surface (during its planned mission GEDI will measure 4% of the land surface), lidar data are commonly used as a source of training data for models that estimate forest structure from wall-to-wall imaging sensors, such as Landsat (Potapov et al. 2021), TanDEM-X (Qi et al. 2019), or Sentinel-1 (Chen et al. 2021, Bruggisser et al. 2021). Although designed for forest systems, the GEDI measures could also prove useful in cropland systems. In particular, crop height may be a more consistent feature of crops across regions than the spectral and phenological features detected by passive optical sensors. For example, Figure 1 displays the distribution of reported crop heights for thousands of varieties of different key species stored in the U.S. National Plant Germplasm System (https://npgsweb.ars-grin.gov/), which contains seed samples from around the world. Among the four staple crops grown most widely across the world, maize is clearly taller than the others, with even the 25th percentile of maize samples exceeding the 95th percentile of the other three crops (rice, wheat, and soybean). On average, maize is roughly 1m taller than the other crops, with 1m equal to the reported vertical resolution of GEDI.
Thus, it is plausible that GEDI could distinguish maize from other common staples, although it is unlikely that it could distinguish wheat from rice or soybean.
In this paper, we explore the potential of GEDI to distinguish between taller and shorter crops, and thus to provide more generalizable features that can be used to transfer crop type models from one region to another. Although several tall crops are commonly cultivated (Figure 1 illustrates that crops such as sorghum and sunflower exhibit a height distribution similar to that of maize), we focus on maize for two main reasons. First, it is by far the most widely grown tall crop in the world, with many regions relying on maize either directly or indirectly (via animal feed) for a substantial portion of their calories and protein. In sub-Saharan Africa, for instance, distinguishing maize from all other crops is often a key step towards estimating national grain supply (Jin et al. 2019). Second, maize is the predominant tall crop in the regions for which we have extensive field-scale crop maps to test our crop type estimates. Since we are interested in finding geographically invariant features, we test GEDI in three maize-producing regions around the world: the state of Iowa in the US, the province of Jilin in China, and the region of Grand Est in France. The three study areas were chosen for their geographic diversity and availability of accurate, up-to-date field-scale crop type maps (USDA-NASS 2020, Agence de Services et de Paiement 2019a, You et al. 2021). By training maize classifiers within each region and applying them across regions, we show (1) GEDI data can distinguish maize from non-maize crops based on height, (2) GEDI features transfer much better than optical features across regions spanning multiple continents, and (3) GEDI data can generate training labels that then enable wall-to-wall crop type mapping with optical imagery in the absence of other ground labels.

Study areas
To evaluate the potential of GEDI to distinguish between tall and short crops, we considered three regions of the world: Jilin in China, Grand Est in France, and Iowa in the United States (Figure 2). These regions are representative of major agricultural production areas on three separate continents, contain a mix of tall and short crops, and have accurate, up-to-date field-scale crop type maps that are publicly available.
Jilin Province is located in Northeast China and is one of the most important maize-producing provinces in China. It spans the mid-latitudes from 40.8°N-46.3°N and 121.7°E-131.3°E and has a humid continental climate. Other major crops grown in the area are soybeans and rice. Because early frost usually appears in September and early October, fast maturing maize varieties are cultivated. Maize is typically planted in April and harvested in September, with an average maize cycle duration of 150 days.
The Grand Est administrative region in northeastern France cultivates a wide variety of crops including wheat, barley, maize, alfalfa, sugar beets, legumes, and oilseeds. Twenty-two percent of French sugar production, 21% of rapeseed production, 13% of wheat production, and 13% of maize production come from this region (Le Service statistique ministériel de l'agriculture 2019). Maize is generally planted between April and May, and harvested between September and November. The region spans 47.4°N-50.2°N and 3.4°E-8.2°E and has a climate that varies from oceanic in the west to humid continental in the east. While the administrative region Nouvelle-Aquitaine is the largest producer of maize in France (31%), Nouvelle-Aquitaine also produces significant quantities of sunflower, which is also a tall plant. To focus on evaluating GEDI's ability to distinguish maize, we conducted experiments in Grand Est, which is France's second-largest producer of maize, instead. We elaborate on the application of GEDI in regions with more than one tall crop in the Discussion.
Iowa, a state located in the Midwestern region of the United States, is in the heart of the U.S. Corn Belt and is the country's largest producer of maize. Maize and soybean are the two primary crops cultivated, comprising well over 95% of total cropped area (Fig. A1). Maize in Iowa is planted from late April to May and harvested from late September to early November. Located between 40.4°N-43.5°N and 90.1°W-96.6°W, Iowa also experiences a humid continental climate with cold winters and hot summers.

GEDI data and feature extraction
The GEDI instrument is the first spaceborne lidar instrument specifically optimized to measure vegetation structure. By firing a laser at 25-meter spots (termed "footprints") on the Earth's surface and observing the return of the laser pulse, the instrument is able to measure the vertical distribution of vegetation at each spot. GEDI collects data globally (between 51.6°N and 51.6°S latitudes) at the highest resolution and densest sampling of any lidar instrument in orbit to date. The raw GEDI waveforms collected undergo several processing steps to retrieve a variety of metrics, and the derived products are saved as both footprint and gridded datasets.
For our analysis, we used the Level 2A Elevation and Height Metrics Data (L2A), which includes footprint-level elevation and relative height (RH) metrics. RH metrics represent the height (in meters) at which a percentile of the laser's energy is returned relative to the ground. For example, RH50 = 20m means that 50% of the laser's energy was returned by objects up to 20 meters above the ground. The ground position is determined based on the center of the lowest mode of the returned waveform (Hofton et al. 2000). RH metrics are saved at 1% intervals, so each shot contains 101 values representing RH at 0-100%. These footprint data are geolocated with a mean positional error of 10.3 m.
We downloaded the GEDI L2A version 2 (GEDI02_A v002) data from July to September 2019 that intersect our study regions through NASA's Earthdata Search website. GEDI L2A data come with a series of flags and properties to help the user filter for data of quality appropriate for the specific application. For this study, we omitted shots with a quality flag value of zero, which indicates poor quality, as well as shots with a nonzero degrade flag, which indicates poor geolocation. We note that, unlike RH values observed for forests, RH values used here were commonly below zero. This happens because waveforms from agricultural areas often have only one mode, and the GEDI algorithms define RH relative to the center of the lowest mode. We also filtered out shots with RH100 greater than 10m, as the field crops in the study areas do not grow that tall, and dropped a full orbit for September 25 in Jilin, China because of abnormally high RH100 values. Outlier removal dropped only a small percentage of points (1.5% in Iowa, 3.6% in Jilin, and 3.5% in Grand Est). A map of the shots left in each region after cleaning the dataset and filtering for cropland is shown in Fig. 2, and the counts of shots for each crop type are summarized in Fig. A1.
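The filtering rules above can be sketched as a simple predicate over shots. This is a minimal illustration, not the authors' processing code: it assumes each shot has already been read into a dictionary, and the keys `shot_id` and `rh100` are hypothetical conveniences (the L2A product stores RH values as an array per shot). The `quality_flag` and `degrade_flag` names follow the L2A product fields.

```python
MAX_CROP_RH100_M = 10.0  # RH100 threshold stated in the text


def keep_shot(shot):
    """Return True if a shot passes the quality, geolocation, and height filters."""
    good_quality = shot["quality_flag"] != 0      # zero marks a poor-quality shot
    good_geolocation = shot["degrade_flag"] == 0  # nonzero marks poor geolocation
    plausible_height = shot["rh100"] <= MAX_CROP_RH100_M
    return good_quality and good_geolocation and plausible_height


def filter_shots(shots):
    """Keep only the shots that pass every filter."""
    return [shot for shot in shots if keep_shot(shot)]


# Hypothetical shots, one failing each rule in turn.
example_shots = [
    {"shot_id": 1, "quality_flag": 1, "degrade_flag": 0, "rh100": 3.2},
    {"shot_id": 2, "quality_flag": 0, "degrade_flag": 0, "rh100": 2.9},
    {"shot_id": 3, "quality_flag": 1, "degrade_flag": 3, "rh100": 3.0},
    {"shot_id": 4, "quality_flag": 1, "degrade_flag": 0, "rh100": 24.7},
]
kept = filter_shots(example_shots)
```

The removal of the anomalous September 25 orbit in Jilin is a separate, manual step and is not represented here.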
Consecutive RH metrics are highly correlated with each other. We therefore sampled a metric every 10% (RH0, RH10, ..., RH100) to reduce the number of features used from 101 to 11. Random forest classifiers trained on this subset of 11 metrics were only slightly less accurate than classifiers trained on all 101 RH metrics in all three study regions.
This suggests that the information lost from reducing feature dimensionality had little impact on the ability to distinguish crop types.
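As a small illustration of this subsampling, assuming each shot's RH profile is stored as a sequence ordered from RH0 to RH100 (the function name `subsample_rh` and the synthetic profile are hypothetical):

```python
def subsample_rh(rh_profile, step=10):
    """Keep every `step`-th RH percentile, e.g. RH0, RH10, ..., RH100."""
    if len(rh_profile) != 101:
        raise ValueError("expected 101 RH values (RH0 through RH100)")
    return list(rh_profile[::step])


# Synthetic profile for illustration only: values rise from negative
# to positive heights, as is common over cropland.
profile = [0.05 * p - 2.0 for p in range(101)]
features = subsample_rh(profile)
```

Because slicing starts at index 0 and the profile has 101 entries, the result contains exactly 11 values and includes both endpoints.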

Sentinel-2 imagery and feature extraction
Current state-of-the-art crop type maps use imagery from passive optical remote sensing as input features for classification (USDA-NASS 2020, Defourny et al. 2019). To compare GEDI features to optical features for crop type classification, we extracted S2 time series at each GEDI shot location. We also extracted S2 time series for the entirety of the study areas in order to demonstrate how GEDI can be used to create labels for wall-to-wall crop type mapping. All optical imagery was processed using the Google Earth Engine (GEE) platform.
The Sentinel-2A/B (S2) satellites acquire images with a spatial resolution of 10 meters (Blue, Green, Red, and NIR bands) and 20 meters (Red Edge 1, Red Edge 2, Red Edge 3, Red Edge 4, SWIR1, and SWIR2 bands), and together they provide images at a 5-day interval. The spatial resolution of 10 m to 20 m is sufficient to resolve individual fields in the three study areas.
We used S2 surface reflectance data (Level-2A) present in GEE and filtered out clouds using the S2 Cloud Probability dataset provided by SentinelHub in GEE. To capture crop phenology, we used Sentinel-2 imagery from January 1 to December 31, 2019, using the same time window across the three regions. In our study areas, this time window encompasses a single growing season for the majority of crop types.
Features were extracted from S2 time series by fitting harmonic regressions to all cloud-free observations in 2019. For each spectral band or vegetation index f(t), the harmonic regression takes the form
$$f(t) = c + \sum_{k=1}^{n} \left[ a_k \cos(2\pi\omega k t) + b_k \sin(2\pi\omega k t) \right]$$
where a_k are cosine coefficients, b_k are sine coefficients, and c is the intercept term. The independent variable t represents the time an image is taken within a year expressed as a fraction between 0 (January 1) and 1 (December 31). The number of harmonic terms n and the periodicity of the harmonic basis controlled by ω are hyperparameters of the regression. We used a second order harmonic (n = 2) with ω = 1.5, shown in previous work to result in good features for crop type classification. This yields a total of 5 features per band or vegetation index, resulting in 20 harmonic coefficients in total across the four bands and indices used.
We computed harmonic coefficients for three bands and one vegetation index: NIR, SWIR1, SWIR2, and GCVI. GCVI is the green chlorophyll vegetation index (Gitelson et al. 2005), computed as GCVI = NIR/Green − 1. Unlike the commonly used NDVI, GCVI does not saturate at high values of leaf area and has previously been shown to aid in distinguishing crop types.
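To make the feature extraction concrete, the following is a minimal sketch of a second-order harmonic fit by ordinary least squares, together with the GCVI formula. It assumes a basis of an intercept plus cos(2πωkt) and sin(2πωkt) for k = 1..n, fitted with NumPy; the function names and the synthetic time series are hypothetical, and the study itself ran its processing in Google Earth Engine rather than with this code.

```python
import numpy as np


def gcvi(nir, green):
    """Green chlorophyll vegetation index: NIR / Green - 1."""
    return np.asarray(nir, dtype=float) / np.asarray(green, dtype=float) - 1.0


def harmonic_design_matrix(t, n=2, omega=1.5):
    """Columns: intercept, then cos and sin terms for k = 1..n."""
    t = np.asarray(t, dtype=float)
    columns = [np.ones_like(t)]
    for k in range(1, n + 1):
        angle = 2.0 * np.pi * omega * k * t
        columns.append(np.cos(angle))
        columns.append(np.sin(angle))
    return np.column_stack(columns)


def fit_harmonics(t, values, n=2, omega=1.5):
    """Least-squares coefficients ordered as [c, a_1, b_1, ..., a_n, b_n]."""
    design = harmonic_design_matrix(t, n=n, omega=omega)
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(values, dtype=float), rcond=None)
    return coeffs


# Synthetic, noise-free series built from known coefficients (illustration only).
t = np.linspace(0.0, 1.0, 40)  # fraction of the year for each observation
true_coeffs = np.array([2.0, -1.0, 0.5, 0.3, -0.2])
series = harmonic_design_matrix(t) @ true_coeffs
estimated = fit_harmonics(t, series)
```

With n = 2 the fit returns 5 coefficients per band or index; repeating it for NIR, SWIR1, SWIR2, and GCVI gives the 20 features described in the text.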

Crop type labels
We used high-accuracy crop type maps in Jilin (China), Grand Est (France), and Iowa (USA) to filter out non-crop areas, train maize classifiers, and evaluate each classifier's performance. In each region, the corresponding map's value at each GEDI shot footprint centroid location or S2 pixel location was used as the ground truth for crop type. Note that GEDI footprints have a 12.5 m radius, so it is possible for a GEDI shot to span multiple crop type map pixels and have mixed crop type labels (Fig. 3). The data products available in each region for the year 2019 are described below.

Northeast China crop type map
The You et al. (2021) crop type map of Northeast China was produced for the three major crops in the area (maize, soybean, and rice) using S2 time series data and ground samples from field surveys. The overall accuracy of the 2019 crop map for the whole Northeast region is 87%, with F1-scores of 94%, 85%, and 87% for rice, maize, and soybeans, respectively. Maize and soybean have higher recall (producer's accuracy) (86% and 90%) than precision (user's accuracy) (both 84%), indicating that the commission errors of maize and soybean are higher than the omission errors. According to the authors, this mainly resulted from the incorrect identification of other crops as maize and soybean. We imported the 2019 crop type map for the province of Jilin in GEE and used it to sample crop type labels at GEDI shot locations for the three major crops mapped.
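As a quick consistency check on the reported map accuracies, the F1-score is the harmonic mean of precision and recall. The short sketch below applies that standard definition to the precision and recall values quoted above; the helper name `f1_score` is ours.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


maize_f1 = f1_score(0.84, 0.86)    # precision 84%, recall 86%
soybean_f1 = f1_score(0.84, 0.90)  # precision 84%, recall 90%
```

Rounded to two decimals these give 0.85 for maize and 0.87 for soybean, in line with the F1-scores reported for the 2019 map.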

Registre Parcellaire Graphique (RPG)
The Registre Parcellaire Graphique (RPG) is a geographical database of agricultural fields in France maintained by the Service and Payment Agency (ASP). The ASP is the institution that pays aid to French farmers under the Common Agricultural Policy (CAP) of the European Union. As part of their request for CAP aid, farmers send the ASP plot boundaries and certain plot characteristics. Unlike the CDL in the US and the You et al. (2021) map in Northeast China, the RPG in France is a georeferenced vector product derived via survey, rather than a raster product generated by a machine learning algorithm. Each plot is drawn to centimeter resolution and associated with a crop type also submitted by the farmer (Agence de Services et de Paiement 2019b).
An anonymized version of the dataset is released publicly by the ASP each year, and we accessed this dataset at https://www.data.gouv.fr/. The entire 2019 database contains 9.6 million plots; filtering for those that fall within the Grand Est region results in a dataset of 851,090 plots. Although the RPG does not include farmland not receiving CAP aid, in reality 98% of agricultural land in Grand Est is recorded in the RPG.
We imported the RPG dataset in GEE, filtered out non-crop parcels, and rasterized it to sample the crop type labels at GEDI shot locations.

USDA Cropland Data Layer (CDL)
Each year, the US Department of Agriculture (USDA) produces the Cropland Data Layer (CDL) for the lower 48 states of the US. A raster product with pixels at 30m resolution, CDL covers 132 classes spanning field crops, tree crops, developed areas, forest, and water. It is the output of a decision tree algorithm trained on ground labels obtained through surveys and a combination of Landsat, Disaster Monitoring Constellation, ResourceSat-2, and S2 imagery (USDA-NASS 2019). The accuracy of CDL labels varies by class and geographic region but is generally high.
We accessed CDL via GEE and used it to filter out GEDI shots in non-crop areas of Iowa and assign crop type labels to GEDI shots in cropped areas. Of the cropped area, 57% of GEDI shots are maize and 41% are soybean (Fig. A1). In the 2019 Iowa CDL, maize is classified with a precision of 97% and recall of 95%, and soybean is classified with a precision of 96% and recall of 95% (USDA-NASS 2019), indicating that CDL is accurate enough to be used as ground truth to evaluate GEDI features for maize classification.

Assembling training and test sets
For each region, we split the GEDI shot locations into a training set (80%) and a test set (20%) for training and evaluating the models. We discretized each study region into 0.5 × 0.5 degree grid cells; all GEDI shots in each grid cell were placed entirely in either the training set or the test set. Splitting GEDI shots along grid cells maximizes the chance that samples from the same field are kept together in the train or test set, thereby preventing classification metrics from being inflated due to leakage. We ran each classification task 11 times, using a different training and test split each time, and reported the mean and standard deviation of accuracy over all runs.
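A grid-based split of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: it assigns roughly 20% of the occupied grid cells (not exactly 20% of shots) to the test set, and the function names, the seed handling, and the random coordinates are all hypothetical.

```python
import math
import random
from collections import defaultdict


def grid_cell(lon, lat, cell_deg=0.5):
    """Integer (column, row) index of the grid cell containing a point."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def split_by_grid_cell(coords, test_fraction=0.2, cell_deg=0.5, seed=0):
    """Assign whole grid cells to train or test so no cell is shared."""
    cells = defaultdict(list)
    for idx, (lon, lat) in enumerate(coords):
        cells[grid_cell(lon, lat, cell_deg)].append(idx)
    cell_ids = sorted(cells)
    random.Random(seed).shuffle(cell_ids)
    n_test_cells = max(1, round(test_fraction * len(cell_ids)))
    test_cells = set(cell_ids[:n_test_cells])
    train_idx = [i for c in cell_ids if c not in test_cells for i in cells[c]]
    test_idx = [i for c in cell_ids if c in test_cells for i in cells[c]]
    return train_idx, test_idx


# Hypothetical shot coordinates drawn inside the Iowa bounding box from the text.
rng = random.Random(42)
coords = [(rng.uniform(-96.6, -90.1), rng.uniform(40.4, 43.5)) for _ in range(500)]
train_idx, test_idx = split_by_grid_cell(coords)

# Eleven different splits, one per repeated run.
splits = [split_by_grid_cell(coords, seed=s) for s in range(11)]
```

Changing the seed reassigns cells, which is one way to obtain the different splits used for the repeated runs.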

Random forest classifier
We used a random forest classifier to classify crop types in all experiments. Random forests (Ho 1995, Breiman 2001) are an ensemble machine learning method composed of many decision trees. Each decision tree is trained on a bootstrapped version of the training set and a random subset of features to reduce the correlation of predictions across decision trees and improve performance when those predictions are averaged. Random forests are commonly used in crop type classification (Defourny et al. 2019, Jin et al. 2019) and other Earth observation tasks due to their high accuracy and computational efficiency.
We used the RandomForestClassifier implemented in Python's scikit-learn package. We kept the default parameters, with the exception of raising n_estimators from 10 to 100 to reduce prediction variance.
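A minimal sketch of this classifier setup is shown below. The data are synthetic stand-ins for the 11 RH features (a "tall" class with a wider RH spread than a "short" class) and are purely illustrative; only `RandomForestClassifier` and `n_estimators=100` come from the text, while `random_state` and the simple random split are added here for reproducibility of the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_per_class = 200

# Synthetic RH-like profiles (11 features each); not real GEDI data.
short = rng.normal(0.0, 0.15, (n_per_class, 11)) + np.linspace(-2.0, 2.0, 11)
tall = rng.normal(0.0, 0.15, (n_per_class, 11)) + np.linspace(-2.5, 3.0, 11)
X = np.vstack([short, tall])
y = np.concatenate([np.zeros(n_per_class, dtype=int), np.ones(n_per_class, dtype=int)])

# Simple random 80/20 split for the example (the study splits by grid cell).
order = rng.permutation(len(y))
n_train = int(0.8 * len(y))
train, test = order[:n_train], order[n_train:]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[train], y[train])
accuracy = clf.score(X[test], y[test])  # fraction of test samples classified correctly
```

The accuracy on this toy data says nothing about real performance; the example only shows the call pattern.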

Crop type classification
In Table 1 we give an overview of the experiments tested in this paper and detailed below.

Testing GEDI features for crop type classification
Our first experiments test how well GEDI waveforms can distinguish maize from non-maize crops due to maize being a significantly taller crop. As GEDI was designed to monitor forests, it is unknown whether the instrument would be able to resolve height differences between crop types at all. We trained two random forest classifiers for each study region, one using GEDI RH metrics (GEDI Local) as features, and one using the S2 harmonic coefficients (S2 Local). Both models were tasked with distinguishing maize samples from non-maize samples, where the ground truth for crop type was provided by the datasets described in Section 2.4. The S2 Local model is representative of current state-of-the-art crop type classifiers that use optical imagery to predict crop types. It provides a reference for the GEDI Local model as well as models that are transferred across regions.
The number of samples and locations used for training and testing the two models were identical. Although S2 provides wall-to-wall imagery while GEDI only observes a small subset of Earth's surface, we limited S2 samples to GEDI shot locations. By controlling for sample size and location, we directly compare the two sets of features.
The timing of GEDI observations is important, as maize will be most distinguishable from other crops when their height difference is the greatest. To test the sensitivity of classification performance to growing season timing, we compared the performance of GEDI Local models trained on July only shots, August only shots, September only shots, and shots from all three months.

Testing GEDI feature transfer across regions
After classifying maize versus non-maize crops within each region, we tested model transfer across regions. For each GEDI Local and S2 Local classifier trained in the U.S., China, or France, we applied the classifier to the test sets of the other two regions to separate maize from non-maize. We refer to these models as GEDI Transfer and S2 Transfer. The models were not shown any additional data from the new regions. High classification accuracy in a new region would indicate that the model's features generalize across space and few if any labels are needed from the new region to classify maize; conversely, low classification accuracy would mean that the learned relationship between features and crop types holds true only locally, and labeled data is needed in the new region to learn new classification boundaries.
We compared the GEDI Transfer models to their Local counterparts to see how well GEDI RH metrics generalize across geography. To understand whether growing season timing affects model transfer, we repeated this analysis for each GEDI model trained on July only shots, August only shots, September only shots, and shots from all three months.

Wall-to-wall crop type mapping using GEDI as training labels
Even if GEDI features transfer perfectly across geography (i.e., maize is always identifiable in GEDI waveforms no matter where on Earth one looks), GEDI only samples 4% of the land surface and cannot alone generate a wall-to-wall crop type map. Achieving a continuous map in space requires GEDI-based approaches to be combined with wall-to-wall imagery like that provided by S2.
To create wall-to-wall maize maps using GEDI and S2 imagery, we used methods similar to those employed previously to calibrate local maps of forest height with GEDI and Landsat. In a new region, we trained a model (GEDI-S2 Transfer) using predicted crop types from GEDI Transfer as labels and local S2 harmonics as features. By applying the GEDI-S2 Transfer model to all cropland pixels in a new region, we produced a wall-to-wall 10m spatial resolution maize map without the need for local labels. Figure 4 presents a graphical explanation of the GEDI-S2 Transfer approach.
We compared GEDI-S2 Transfer to two other models. The first is the S2 Local model, which provides an upper bound for how well S2 harmonic features can classify maize when trained on in-region ground truth. The gap between GEDI-S2 Transfer and S2 Local reflects the accuracy of GEDI Transfer's predictions relative to ground truth. The second benchmark is the S2 Transfer model, which shows how well a model trained on S2 harmonics in one region fares when applied to other regions. Differences between S2 Transfer and GEDI-S2 Transfer reveal how robust GEDI features are across space compared to optical features.
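The GEDI-S2 Transfer idea can be sketched end to end on synthetic data. Everything below is a toy under explicit assumptions: `make_region` is a hypothetical generator in which RH-like features behave the same in both regions while S2-like features are shifted between regions, and the train/test division is a simple index slice rather than the grid-cell split used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)


def make_region(n, s2_shift):
    """Hypothetical region: shared RH behaviour, region-specific S2 offset."""
    is_maize = rng.integers(0, 2, n)
    rh = (rng.normal(0.0, 0.3, (n, 11)) + np.linspace(-2.0, 2.0, 11)
          + is_maize[:, None] * np.linspace(-0.5, 1.0, 11))
    s2 = rng.normal(0.0, 0.3, (n, 20)) + s2_shift + is_maize[:, None] * 1.0
    return rh, s2, is_maize


rh_src, _, y_src = make_region(600, s2_shift=0.0)       # region with ground labels
rh_tgt, s2_tgt, y_tgt = make_region(600, s2_shift=2.0)  # label-scarce region

# Step 1: train a GEDI model where labels exist.
gedi_model = RandomForestClassifier(n_estimators=100, random_state=0)
gedi_model.fit(rh_src, y_src)

# Step 2: predict crop type at GEDI shots in the new region (GEDI Transfer).
pseudo_labels = gedi_model.predict(rh_tgt)

# Step 3: train an S2 model in the new region on those predictions.
train, test = slice(0, 400), slice(400, None)
s2_model = RandomForestClassifier(n_estimators=100, random_state=0)
s2_model.fit(s2_tgt[train], pseudo_labels[train])

# Step 4: evaluate against held-out true labels, which were never used for training.
accuracy = s2_model.score(s2_tgt[test], y_tgt[test])
```

In practice the final S2 model would then be applied to every cropland pixel to obtain the wall-to-wall map; that step is omitted here.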

GEDI and Sentinel-2 feature comparison
The median harmonics of GCVI from S2 and median RH energy curves from GEDI are shown for the top three crops in each region in Fig. 5. In each region, the S2 maize profile reaches peak greenness in August, and is generally distinguishable from the other crops because of different timing and magnitude of the greenness peaks. For the GEDI energy profiles, roughly 50% of the energy returned comes from negative RH values, which as mentioned previously arises from the GEDI algorithm's definition of ground elevation based on the center of the lowest mode, with waveforms from cropped areas typically having only one mode. Thus, we emphasize that the values of RH should not be interpreted as physically meaningful; for instance RH100 does not correspond to the physical crop height. Nonetheless, the curves exhibit a clear separation between maize, the tallest crop, and the other crops. This difference is especially apparent at the extremes of RH curves, shown as insets in Fig. 5.
Whereas Fig. 5 compares different crops within a region, Fig. 6 displays the median features for maize from different regions on the same plot. This comparison is especially relevant for the question of how well a model is likely to transfer across regions. Ideally, features would be similar across regions in order to use a model trained in one region on another. For the S2 harmonics, clear differences emerge between the regions, with the U.S. generally having a steeper increase in GCVI during June and July and a higher peak in August compared to other regions. The harmonic curves for the other crops also differ considerably, both in timing and magnitude of the peak (Fig. 5). In contrast, the GEDI curves are remarkably consistent across regions (Fig. 6). This difference between S2 and GEDI indicates that the peak height of maize is a better preserved characteristic across regions than the timing of the maize growing season and the total crop biomass, both of which influence the S2 harmonics.
Figure 4. Graphical overview of the GEDI-S2 Transfer approach: an example of transferring a model trained on US crop type labels and applying it to China to create wall-to-wall maize maps. In (a) a random forest model is trained using GEDI RH metrics as features and CDL as labels in Iowa, U.S. The model is tested in a different region within Iowa (GEDI Local) to evaluate model performance. In (b) the Iowa-trained model is used to create predictions in China at GEDI shot locations (GEDI Transfer). These predictions then serve as labels for a new model trained on Sentinel-2 harmonic coefficients. Combining Sentinel-2 and GEDI (GEDI-S2 Transfer) enables a wall-to-wall maize map in China without any ground truth labels from China. In this schematic, the GEDI-S2 Transfer predicted map is shown for the same region used for training, but model evaluation is always done using a test set not used in training.

Local maize classification
The feature comparisons above suggest that GEDI features should be reliable for classifying tall crops (in this case, maize). To quantitatively evaluate their performance for local classification, we used these GEDI features to train a classifier for each region and compared it to a classifier trained on the S2 harmonics.
In all cases, the task is to separate maize from other crops, and we considered various possible timings of the GEDI features. Test accuracies indicate that GEDI metrics can distinguish maize from other crops within each region with more than 79% accuracy in all cases (Fig. 7). The optimal timing of GEDI observations differs by region, with September for China, July for France, and August for the U.S. being the best times for classification, resulting in 88%, 85%, and 91% accuracy, respectively. August is generally a good month for GEDI observations in all regions, with performance in all regions above 83% for this month.
As expected, the locally-trained S2 models do well in each region (93% accuracy in China, 95% in France, and 95% in the U.S.), and indeed perform better than the GEDI features. The gap between S2 local models (trained on the same locations as the July-September GEDI shots) and the best GEDI models is less than 5% for China and the U.S. and less than 10% in France.
To understand why GEDI model errors are larger, we show maps of GEDI misclassifications in Appendix Fig. A2 and confusion matrices for representative median runs in Appendix Fig. A3. From the maps, we see that a substantial share of errors occur at field borders, where the GEDI shot footprint contains multiple crop types or a mix of crop and non-crop classes. We also observe that the most difficult crop types for GEDI Local models to classify are soybeans in China and silage corn and sunflower in France. In China, this can be partly explained by the 84% precision for soybeans in the You et al. (2021) map used for ground truth (Section 2.4.1). In other words, up to half of the "misclassification" of soybeans as maize in China could be correct. Since the "ground truth" maps in China and the U.S. are created using optical satellites, the predictions of S2 Local models, both correct and incorrect, are likely to correlate with the ground truth more than those of GEDI Local models.
Local GEDI accuracies in France are overall lower than in the other two regions. This appears to arise mainly from the fact that two kinds of maize are grown in France, maize for grain and for silage (see Fig. A1 for the crop distribution by region). Whereas grain maize is always grown to maturity, and thus has a more reliable seasonality, silage maize is grown for biomass and thus can be sown and harvested at any point in the season. As a sensitivity test, we recalculated local GEDI accuracy in France after omitting the silage maize from the test set, finding that accuracies improve by 4% or more in all periods (see Fig. A4 in Appendix). The improvement is largest in September, consistent with the notion that early harvest of silage maize is affecting the performance of GEDI features, since a harvested crop is no longer tall. As both the U.S. and China grow predominantly grain maize, the less predictable timing of silage maize is not an issue in those regions.

Transferring classification across regions
As noted in Section 4.1, the consistency of GEDI features across regions suggests that models trained in one region can be reliably applied to new regions. A quantitative test of this proposition is shown in Figure 8, which compares the performance of GEDI models trained using data from the local region (GEDI-Local) to those trained in other regions (GEDI-Transfer). Although the locally trained models are typically the best performers, the transferred models perform nearly as well and are occasionally indistinguishable from the locally trained models. For example, models trained in the U.S. or China both perform as well in France in August as one trained in France.
Transferred GEDI models are only able to make predictions at GEDI shot locations. In order to extrapolate beyond these point locations, we used transferred GEDI predictions as labels to train a new model that takes local S2 harmonics as input (GEDI-S2 Transfer). GEDI-S2 Transfer test accuracies for the month of August are reported in Fig. 9 together with S2 Transfer and S2 Local accuracies for comparison. S2 Transfer shows how well state-of-the-art optical features perform out-of-region, while S2 Local provides an upper bound on how well optical features can separate maize when paired with ample local ground truth.
Figure 9. Test accuracies of models using Sentinel-2 harmonic features for wall-to-wall mapping of maize and non-maize for the month of August, trained either with direct transfer of Sentinel-2 features from other regions (gray bars) or using labels from GEDI predictions (blue bars). Error bars show one standard deviation. Dashed lines show performance of a locally-trained Sentinel-2 model (with training samples from the same GEDI shot locations) for the month of August as a comparison.
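The GEDI-S2 Transfer procedure can be sketched as a three-step pipeline: fit a height-based model where labels exist, use it to pseudo-label GEDI shots in the target region, then fit an optical model on those pseudo-labels. The toy example below is only an illustration under strong assumptions. A single height value stands in for the GEDI RH features, a single number stands in for the S2 harmonic features, a midpoint-threshold rule stands in for whatever classifier is actually used, and all data are synthetic. Every function and variable name is hypothetical.

```python
def make_region(optical_shift, height_shift):
    """Synthetic shots as (height_m, optical_feature, is_maize) tuples."""
    shots = []
    for i in range(10):
        step = i / 10  # deterministic spread within each class
        shots.append((2.0 + height_shift + step, 0.7 + optical_shift + 0.2 * step, True))
        shots.append((0.4 + height_shift + step, 0.2 + optical_shift + 0.2 * step, False))
    return shots


def fit_threshold(values, labels):
    """One-feature classifier: midpoint of the class means, plus its direction."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    return (mean_pos + mean_neg) / 2, mean_pos > mean_neg


def predict(model, values):
    threshold, maize_is_above = model
    return [(v > threshold) == maize_is_above for v in values]


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Heights are nearly identical across regions; the optical feature is shifted.
source = make_region(optical_shift=0.0, height_shift=0.0)
target = make_region(optical_shift=-0.25, height_shift=-0.1)
src_height, src_optical, src_label = zip(*source)
tgt_height, tgt_optical, tgt_label = zip(*target)  # tgt_label is used for evaluation only

# Step 1: train the height model in the labeled source region.
gedi_model = fit_threshold(src_height, src_label)
# Step 2: pseudo-label the target region's shots with the transferred model.
pseudo_labels = predict(gedi_model, tgt_height)
# Step 3: train the optical model on target features and pseudo-labels.
gedi_s2_model = fit_threshold(tgt_optical, pseudo_labels)

# Baseline: transfer the optical model directly from the source region.
s2_transfer_model = fit_threshold(src_optical, src_label)

gedi_s2_acc = accuracy(predict(gedi_s2_model, tgt_optical), tgt_label)
s2_transfer_acc = accuracy(predict(s2_transfer_model, tgt_optical), tgt_label)
print(f"GEDI-S2 Transfer: {gedi_s2_acc:.2f}, S2 Transfer: {s2_transfer_acc:.2f}")
```

Because the optical model in step 3 takes only optical features as input, it can be applied to every pixel in the target region, not just to GEDI shot locations. The synthetic classes here are perfectly separable, so the absolute accuracies carry no meaning; only the ordering of the two approaches is illustrated.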
The results in Fig. 9 show first that harmonic features, while good at distinguishing crop types locally, transfer poorly across study regions. The average accuracy of an S2 Transfer classifier is 64%. In the U.S., the S2 Local model achieves an accuracy of 94%, but S2 Transfer models trained in China and France only manage accuracies of 60% and 62%, respectively. This is consistent with previous work that found optical feature transferability to deteriorate across geography, as well as with the differences in crop phenology observed in Fig. 6. As the phenology and prevalence of crop types shift across regions, the optimal decision boundary for classifying maize versus non-maize with harmonic features also changes. S2 Transfer therefore results in many misclassified samples.
GEDI RH features, on the other hand, transfer much better across regions, and consequently the S2 model they supervise (GEDI-S2 Transfer) also performs much better than direct S2 Transfer. Fig. 9 shows that GEDI-S2 Transfer accuracies exceed 82% for all cross-region pairs. For example, GEDI-S2 Transfer achieves 86% accuracy in the U.S. for models trained in China or France. While the S2 Local model is 8 percentage points more accurate, GEDI-S2 Transfer outperforms S2 Transfer from China and France by 26 and 24 percentage points, respectively. GEDI-S2 Transfer and GEDI Transfer accuracies are about the same: in China they are equal, in the U.S. GEDI-S2 Transfer is 1 percentage point lower, and in France GEDI-S2 Transfer is 1 percentage point lower for the model trained in China and 1 percentage point higher for the model trained in the U.S. Example crop type predictions for the three regions using the best GEDI-S2 Transfer model are shown in Fig. 10. The corresponding ground truth crop type maps are also shown for reference.
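The accuracy gaps quoted for the U.S. test region are absolute differences in accuracy (percentage points), not relative changes. The short calculation below makes the arithmetic explicit using only the August accuracies stated in the text; the variable names are illustrative.

```python
# August accuracies (%) for the U.S. test region, as quoted in the text.
s2_local = 94
gedi_s2_transfer = 86
s2_transfer = {"China": 60, "France": 62}  # keyed by training region

# Gap between the locally trained model and the GEDI-supervised model.
gap_to_local = s2_local - gedi_s2_transfer

# Gain of GEDI-S2 Transfer over direct S2 Transfer, per training region.
gain_over_s2_transfer = {
    region: gedi_s2_transfer - acc for region, acc in s2_transfer.items()
}

print(gap_to_local, gain_over_s2_transfer)
```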

Discussion
The results show that GEDI features can distinguish a tall crop like maize from shorter crops, and that these features are highly transferable across geography. Should spaceborne lidar sensors someday sample the Earth's surface more densely, they would add a useful set of features for mapping crop types that are complementary to optical and radar features. In regions where crop type maps are already available, such as the study areas considered here, lidar could augment field surveys to generate crop type labels, reducing the cost of creating products like CDL. This is especially true in a system like the U.S. Corn Belt, where agriculture is heavily dominated by maize (a tall crop) and soybeans (a short crop).
Most importantly, lidar has the potential to enable mapping of tall crops like maize in areas of the world where crop type maps are not available due to a lack of ground labels. Our experiments transferring GEDI features and using GEDI to train wall-to-wall crop type maps in China, France, and the U.S. show the robustness of lidar features across continents, despite the GEDI instrument being designed to monitor forests rather than cropland. In fact, S2 models trained on GEDI performed only slightly worse than those trained on local ground truth (Fig. 9). Although the planned lifetime for GEDI is only two years, the S2 models trained on GEDI could be applied in years without GEDI observations, likely with adjustments to account for potential shifts in features over time (Kluger et al. 2021).
Despite the promising results, we recognize many potential issues that would emerge when extending this approach to the global scale. First, mapping locations of tall and short crops will not suffice for many applications, which require more detailed crop information. Where maize is the predominant tall crop, as was the case for the three regions studied here, a map of tall crops can be reliably used to identify maize areas. In regions with multiple tall crops, additional features would be needed to separate individual crop types; for instance, optical data has proven useful for distinguishing maize from sorghum (Soler-Pérez-Salazar et al. 2021) and radar data for distinguishing maize from sunflower (Veloso et al. 2017, Belgiu & Csillik 2018). Second, many of the errors we observed in GEDI predictions occurred at the edges of fields. These disparities likely reflect some combination of errors in the labels as well as mixed crop types within the GEDI shot footprint. For applications in smallholder regions, these mixed footprints will be increasingly common. It is possible that such errors would have only minimal effect on model training, since they are likely to be random, or that maps of field boundaries could be used to filter out shots near field edges (Waldner & Diakogiannis 2020).
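The idea of filtering out shots near field edges can be illustrated with a simple geometric rule: keep a shot only if its center lies inside a field and at least one footprint radius from the nearest field boundary. The sketch below is not a method used in this study. It assumes axis-aligned rectangular fields and coordinates in meters in a projected coordinate system, both of which are simplifications, and all names and example values are hypothetical. Real field boundaries would be arbitrary polygons.

```python
# GEDI footprints are roughly 25 m in diameter, hence a 12.5 m radius.
FOOTPRINT_RADIUS_M = 12.5


def distance_to_edge(x, y, field):
    """Distance from a point to the nearest side of a rectangular field.

    `field` is (xmin, ymin, xmax, ymax). The result is negative when the
    point lies outside the field.
    """
    xmin, ymin, xmax, ymax = field
    return min(x - xmin, xmax - x, y - ymin, ymax - y)


def keep_interior_shots(shots, fields, min_distance=FOOTPRINT_RADIUS_M):
    """Keep shots whose whole footprint fits inside a single field."""
    kept = []
    for x, y in shots:
        if any(distance_to_edge(x, y, f) >= min_distance for f in fields):
            kept.append((x, y))
    return kept


# Two hypothetical adjacent fields and six hypothetical shot centers.
fields = [(0, 0, 200, 100), (200, 0, 400, 100)]
shots = [(100, 50), (195, 50), (205, 50), (300, 10), (300, 60), (500, 50)]

interior = keep_interior_shots(shots, fields)
print(interior)
```

A larger `min_distance` could be chosen to allow for geolocation uncertainty in the shot centers, at the cost of discarding more shots in regions with small fields.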
Another issue, particularly in tropical systems, could be frequent cloud cover during the time of year when crops are at peak height, which could limit the availability of clear GEDI shots. Although we observed lower availability of clear shots during August for the current study regions (Fig. 2), it did not appear to compromise the performance of the model relative to other months with more observations, presumably because many thousands of clear observations were obtained even in the cloudier months. Nonetheless, clouds could emerge as an important constraint in other locations.

Conclusions
We conclude that GEDI holds great promise for improving agricultural monitoring, because it captures features that are much more generalizable than those typically used in satellite-based crop type mapping. The demonstrated ability to distinguish crops with height differences of just one meter suggests that other potential applications in agriculture should be feasible, such as monitoring the age and growth of tree crops or identifying intercropped fields that contain mixtures of different crops.
Figure A1. Number of GEDI shots in each region by crop type.
Figure A2. GEDI Local predictions in the three regions. Correctly classified shots are shown in blue and misclassified shots in red. Many errors occur at the borders of the fields, likely because of mixed crop types within GEDI shot footprints. Some other errors can be attributed to errors in the crop type maps used as ground truth.
Figure A3. Confusion matrices of GEDI Local classification for the best time in the three regions: September for China, July for France, and August for the U.S., which resulted in mean accuracies of 88%, 85%, and 91%, respectively. For our analysis, we ran the classification task multiple times, each time with different train and test sets, and computed mean accuracies. Here we show confusion matrices only for the median run, i.e. the run with median accuracy. We ran a binary classification (maize vs. other crop). Here we show ground truth labels detailed by crop type to get deeper insights into misclassifications.
Figure A4. Comparison of GEDI Local accuracies in France when (left) excluding versus (right) including maize for silage. Accuracies improve when considering only grain maize by more than 4% in all periods, with September improving 10%, most likely due to early harvest of silage maize.