Lidar-based estimates of aboveground biomass in the continental US and Mexico using ground, airborne, and satellite observations

a Biospheric Sciences Lab, Code 618, NASA-GSFC, Greenbelt, MD 20771, USA b Centre d'Étude de la Forêt, Université Laval, Quebec City, Quebec, G1V0A6, Canada, and Earth Science Division, NASA Headquarters, 300 E Street SW, Washington, D.C. 20546-0001, USA c Science Systems and Applications, Inc., 10210 Greenbelt Road, Lanham, MD 20706, USA d Department of Geographical Sciences, Lefrak Hall, University of Maryland, College Park, MD 20742, USA e US Forest Service, Pacific Northwest Research Station, 107 Anderson Hall, University of Washington, P.O. Box 352100, Seattle, WA 98195-2100, USA f El Colegio de la Frontera Sur, Av. Ranco Polígono 2-A, Ciudad Industrial, 24500 Lerma, Campeche, Mexico g Colegio de Postgraduados en Ciencias Agrícolas, Km 36.5, Texcoco 5, México-Texcoco 56230, Montecillo, Estado de México, Mexico h Infrared Baron, Inc., 2372 North 1st St., Suite B, Hermiston, OR 97838, USA i Dept. of Forest Resources and Environmental Conservation, Cheatham Hall, 310 West Campus Drive, Virginia Tech, Blacksburg, VA 24061, USA


Introduction
Two NASA space lidars are planned for launch over the next few years, (1) a near-polar photon-counting instrument, ICESat-2/ ATLAS (Gwenzi and Lefsky, 2014; http://icesat.gsfc.nasa.gov/icesat2/ instrument.php), and (2) a waveform instrument in an equatorial orbit on board the International Space Station, GEDI (Global Ecosystem Dynamics Investigation, Dubayah et al., 2014aDubayah et al., , 2014b; http://science. nasa.gov/missions/gedi). Both will be multi-beam profilers that will intensively sample the Earth's terrestrial surface. Unlike the single-beam ICESat/GLAS (Ice, Cloud, and land Elevation Satellite/Geoscience Laser Altimeter System) which collected useful vegetation data from 2003 to 2007, these new space ranging instruments will collect sequential measurements along multiple near-nadir tracks beneath their orbital paths. The challenge to researchers will be to transform these linear, sequential measurements into robust, repeatable, statistically defensible, area-based estimates of biomass and carbon. To do this, researchers will have (1) to develop models to predict ground-based biomass using the space lidar measurements, and (2) to process these groundspace biomass estimates within a sampling framework that makes sense with respect to the observational strategies used for the collection of the space data.
To this end, much work has been done with GLAS data to provide such a conceptual framework, and we continue this effort with the current study. The purpose of this investigation is to further refine techniques which might be employed by researchers to develop large-area estimates of biomass. In particular the objectives of this study are: 1. Determine which of 3 predictive models -linear, linear-no-intercept (forced through origin), and log-linear -best serves to predict total biomass at the state and national level. 2. Quantify how well the best model does at the state and national level with respect of assessing biomass density, total biomass, and the variability of these estimates. 3. Compare satellite-lidar-based estimates of biomass density and total biomass to a comprehensive ground survey in the continental United States (CONUS). 4. Extend those findings to inventory Mexico at the state and national levels.
One intention of this study is to empirically characterize how well we might expect to estimate regional biomass using a space laser system. We do not argue that GLAS-, ICESat-2-, or GEDI-like systems should replace existing regional or national forest inventories. Comprehensive sets of ground observations are the gold-standard, and these groundbased arrays of field plots typically deliver regional and site-specific information concerning not only biomass, volume, basal area, canopy height and density, but also information concerning forest health, downed-woody debris, tree and shrub species, understory cover, herbaceous and moss/lichen cover, and soils information, observations that are difficult if not impossible to reliably infer from space remote sensing platforms. For instance, over 130 separate metrics are recorded on each USFS-FIA (US Forest Service -Forest Inventory & Analysis) level-3 plot. Thus far, GLAS can be used to reliably measure or estimate, arguably, six of them -canopy height, canopy density, aboveground biomass, aboveground carbon, volume, and basal area. We note that GLAS has been used to estimate LAI (Tang et al., 2014a) and vertical vegetation profile (Tang et al., 2014b), but these measurements are not typically acquired on national forest inventory (NFI) ground plots hence are not included in this list. However, space lidars can play a critical role in that they provide a repeated measurement and estimation capability on areas where access is limited, difficult, expensive, or dangerous, e.g., Siberia, northern Canada, northern and western Alaska, Africa, portions of the Amazon, SE Asia, and the Middle East and thus will be critical for developing a robust, global forest-monitoring capability. These space lidars can also provide useful ground surface elevation measurements, that, when linked with canopy surface elevation from spaceborne image stereogrammetry, can provide spaceborne estimates of canopy height in sparse forests (Montesano et al., 2014a). Lidars can augment but not replace ground observations, and lidar studies must depend on representative, ground-based measurements to develop predictive models used to estimate the response variable of interest, e.g., biomass, carbon.

Background
Numerous regional biomass studies that integrate ground observations and space lidar measurements have been conducted in an effort to develop techniques that can be used to better estimate forest biomass and carbon across large areas. These studies have all depended on the ICESat-GLAS satellite to provide the space-based ranging measurements needed to develop and refine large-area forest inventory approaches. These data, available for download via the National Snow and Ice Data Center (http://nsidc.org/data/icesat), have been used to inventory Quebec (Boudreau et al., 2008;Nelson et al., 2009a), central Siberia (Nelson et al., 2009b), the North American boreal forests (Margolis et al., 2015), and the circumpolar boreal forests (Neigh et al., 2013). The current study relies on the three-phase sampling techniques, a hybrid estimation framework , and two-phase estimators proposed by Ståhl et al. (2011) and employed by Neigh et al. (2013) and Margolis et al. (2015). The term "hybrid estimation framework" or "hybrid" refers to a situation where the 1st phase, in our case the GLAS orbital acquisitions, is a probability sample, i.e., a systematic-like sample assumed to have been randomly acquired, and the 2nd phase rests on modelbased principles, in our case the models which predict ALS estimates of biomass as a function of GLAS measurements.
To develop regression models to predict biomass using satellite lidar data, the analyst must tie the ranging measurements acquired by the space lidar to ground observations of biomass. Currently, there are three ways that this can be done.
1. An analyst can hope that, by luck, one or more of the space lidar pulses image(s) an existing ground plot. With GLAS, across large areas in the temperate latitudes and given the duty cycle of the GLAS system, this probability turns out to be exceedingly small. For instance, in CONUS, we estimate that only~50 of the 119,414 nonzero biomass plots, (i.e., FIA plots where at least one subplot has measureable forest biomass) scattered across the US would have been measured by the GLAS acquisitions considered in this study. This situation should markedly improve when the next generation of profiling space lidars are launched in 2017/2018, for these future missions will sample the Earth's surface much more intensively.
2. An analyst can establish ground plots directly on GLAS pulses previously collected. GLAS pulse center locations are known to within 3.1 m, 1 σ (Magruder et al., 2010), enabling researchers to use terrestrial GPS receivers to navigate to pulse locations with, admittedly, some unknown but constrained spatial error. This technique was used by joint Russian/US field teams in Siberia (Nelson et al., 2009b;Montesano et al., 2014b) to directly relate ground estimates of height and biomass to GLAS measurements. The benefit of this approach is that the Ståhl et al. (2011) estimators can be appropriately applied since their estimators account for both the GLAS sample variability and ground-satellite model variability. The downside is that the work is labor-intensive and, in remote areas where it is most logical to use satellite observations to extend a sparse network of ground plots, very expensive and logistically challenging. 3. An airborne lidar can be used as an intermediary to tie ground observations to satellite measurements. This approach was used in the Quebec, North American boreal, and circumpolar boreal studies noted above. The approach involves 3 phases (ground, airborne lidar, space lidar) and two sets of predictive models. During a mission, the airborne lidar is flown over existing ground plots and also over/along a subset of the existing GLAS orbital acquisitions.
Assuming that the area of interest (AOI) is stratified into cover types, then one set of models, one for each cover type, is developed that relates ground estimates of aboveground dry biomass/ha to airborne lidar measurements. These ground-air models are used, in turn, to estimate biomass on each of the GLAS shots overflown with the same airborne lidar. A second set of stratified equations is developed that relate airborne estimates of biomass (Y) to GLAS measurements (X). Once these airborne lidar -GLAS equations are developed, then GLAS can be used as a regional sampling tool.
The current study employs option #3. Currently, there are downsides to using this relatively efficient method of inventorying forests within a hybrid estimation framework using space lidar measurements. The most important limitation is that a hybrid variance estimator does not yet exist that takes into account both the 1st phase sampling variability and the model variability inherent in a three-phase design, i.e., a design that incorporates not one set of models but two. This limitation has affected those employing option #3 for the last decade, and the net effect of this limitation is that we employ a hybrid two-phase variance estimator  that accounts for the model variance of only one of the two models used to generate GLAS estimates of biomass. Necessarily then, we underestimate the model variance component of our estimates by an unknown amount. The good news is fourfold: (1) We know that we have this problem and acknowledge it in our studies (Neigh et al., 2013;Margolis et al., 2015). (2) Empirically, we know the order of magnitude of this model variance underestimation problem and have ways to mitigate it, though the mitigation strategy is far from ideal. (3) A hybrid, three-phase variance estimator is currently being developed and validated by colleagues at the Swedish University of the Agricultural Sciences Sören Holm, personal communication). (4) Until that estimator is ready for use, we have in previous studies and will in this study employ a variance estimator that may mitigate but not correct the effects of the model variance underestimation problem. The reader should view all standard errors related to GLAS estimates reported here-in as approximations. Ståhl et al. (2011) developed hybrid, two-phase variance estimators that take into account the uncertainties associated with both sampling error and model error. These two error sources are additive. The Ståhl et al. estimators are applicable in situations where an airborne lidar is used to sample extensive areas that may have only a limited number of ground observations on or near the AOI. Given the model-based nature of the estimators, the ground plots do not need to be probabilistically allocated. Practically speaking, this means that (1) an existing NFI is not a precondition to estimating regional biomass with an airborne lidar, and (2) ground plots do not have to be overflown as an integral part of the lidar sample survey, i.e., they can be located off the sample flight line grid. These two characteristics free the researcher to purposefully select existing ground plots anywhere, with best effort made to collect a representative sample of forest conditions within the AOI. The Ståhl estimators can also be appropriately applied in situations where satellite lidar measurements are tied directly to ground biomass estimates, i.e., options #1 and 2 above. Similar to the approach taken by Nelson et al. (2009b) in Siberia, this allows researchers to gather ground observations of biomass on existing GLAS pulses in clusters throughout the AOI. It is important that these ground-satellite observations are representative of the entire AOI; in particular it is most important that the full range of the GLAS measurements are characterized in order to avoid, as much as possible, extrapolation. Gregoire et al. (2016) encourage scientists who employ lidar as a regional sampling tool to make clear the survey design, the estimators employed, assumptions which underlie the sampling framework, and the limitations of inference. To this end, we describe the foundations on which our US and Mexico surveys rest. In this study, both surveys rest on a hybrid 3-phase sampling framework (1) where the 1st phase, i.e., the GLAS acquisitions, are acquired probabilistically and the 2nd (ALS) and 3rd (ground) phases are acquired within model-based inferential frameworks, (2) where the ground-ALS observations and the ALS-satellite lidar observations are not spatially coincident, (3) where a GLAS ascending or descending orbit is considered a cluster, (4) where GLAS clusters are assumed to randomly sample CONUS, Mexico, and the states within, (5) where individual GLAS pulses within these clusters (orbits) are post-stratified, and (6) where we assume that ALS estimates of biomass density on each GLAS pulse overflown are correct and calculated without error. This last assumption, patently untrue, leads directly to our underestimation concerns regarding biomass variance estimates.
This underestimation problem with the model variance component of the Ståhl et al. (2011) variance estimator, expected to be on the order of 2-4x given the ground-ALS and ALS-GLAS regression sample sizes in this study , their case C.2 versus C.3) can be partially addressed by knowingly using an estimator of the sample variance component known to be inflationary. As noted above, the sample variance component of Ståhl's variance estimator rests on the assumption that the GLAS orbits randomly sample the AOI. Though GLAS orbits are not a true systematic sample, they have the appearance of a systematically-acquired transect sample. It is well documented that sampling variability may be overestimated when systematically-acquired samples are assumed to be randomly collected (Osborne, 1942;Nyyssönen et al., 1967Nyyssönen et al., , 1971. For instance, Ene et al. (2012), using a simulated forest population in Hedmark County, Norway, demonstrates that the Ståhl et al. hybrid two-phase variance estimator performed well, i.e., was unbiased, when the assumptions upon which the estimator rest are met. However, when the assumption of randomness is not met, for instance when a systematic sample of flight lines is assumed to be a random sample, then significant variance inflation can occur. In Hedmark, treating a systematic sample of flight lines as a random sample lead to a variance inflation of 4.0x due in large part to an overestimation of the sampling variance in an ordered population. Hedmark, in fact, might be regarded as a worst-case since (1) there is an obvious N-S biomass gradient, and (2) the flight lines were flown E-W. This sample variance inflation is especially significant in situations where the population of interest, in our case the biomass in the state and national forests of CONUS and Mexico, exhibits spatial trends approximately perpendicular to the GLAS nearpolar orbital inclination. Both the US and Mexico exhibit such E-W spatial trends in many states and at the national level. While large-scale biomass trends associated with geography and topography may inflate GLAS sample variances, a 2nd orbital characteristic may reduce this inflationary tendency. ICESat/GLAS was a repeat-track satellite; it flew the same near-polar orbits within 1 km of the reference ground tracks, 1 σ, every 91 days. The repeat-track nature of the coverage occasionally lead to closely spaced, parallel orbits 100 s of meters apart across-track, and this will tend to decrease sampling variability. Further, in order to conserve laser resources, GLAS was typically tasked to acquire data 2 or 3 times a year in~35 day blocks. Given the 91 day repeat cycle, this tasking leads to occasional random, large gaps in coverage up tõ 130 km in CONUS and~88 km in Mexico. These gaps, along with cloud cover/obscuration, can significantly affect biomass estimates and sampling variability in small states by limiting the number of GLAS orbits acquired over a particular area. In short, we expect sampling variance to be inflated, but the amount of inflation is unknown, and will vary by state.
The GLAS orbital characteristics detailed above lead to concerns about the accuracy of the biomass means and biomass totals. The near-polar GLAS orbits and the random nature of the extensive E-W gaps place states such as Vermont (4 orbits), New Hampshire (7 orbits), Rhode Island (3), New Jersey (7), and Delaware (0), at risk of estimation based on small, possibly nonrepresentative samples. Likewise, larger states that might be measured by many GLAS orbits but which support little forest (e.g., the states of the US Central Plains) where those forests are spatially clustered, may exhibit estimation instability. Certainly the GLAS orbital sampling pattern, a pattern which excludes from measurement large chunks of the Earth's terrestrial surface, is not ideal.
Near term, future space lidar multibeam profilers (GEDI, ICESat-2), with increased orbital sampling density and randomization, will mitigate this sampling problem.

Sampling framework and estimators
The specific estimators employed in the current study have been reported in a number of publications including Ståhl et al. (2011), Eqs. (1)-(4), and Gregoire et al. (2016), Section 4.7, Eqs. (10)-(16). The sampling approach and regression estimation rest on a few important assumptions, some which are outlined in Ståhl et al. (2011), in standard regression textbooks, or in publications noted elsewhere in this report: (1) The linear regression model form is correctly specified. The models are unbiased.
(2) All regressions are homoskedastic, i.e., regression errors are iid (independent and identically distributed). (3) The airborne lidar observations acquired to develop the groundair regressions are independent of the airborne lidar observations acquired along GLAS transects. This condition does not affect our ability to purposively sample ground plots or GLAS transects with the airborne lidar (Royall, 1970;Royall and Herson, 1973) as long as the aforementioned spatial independence is respected. (4) Although not a true systematic sample, GLAS ascending and descending orbits are systematic-like, and we assume that these orbits, when treated statistically as a random sample, will have similar inflationary tendencies with respect to variance estimation. The degree of this inflationary tendency in CONUS and Mexico is unknown.

US and Mexico ground plot data
In CONUS, US Forest Service -Forest Inventory and Analysis (FIA) ground plots served as our ground reference data set both with respect to (1) developing the predictive models that relate aboveground dry biomass on ground plots to coincident ALS measurements, and (2) developing ground reference estimates of biomass for five land cover types in each of the 48 states in the continental US. Of the 895 FIA plots overflown by our ALS sensor, 842 were used to develop the generic and stratum-specific equations used to inventory CONUS forests. The remaining 55 FIA plots were excluded from the model development process based on inconsistent ground biomass-ALS height relationships which suggested that the plots had been either mislocated or cut/ cleared between the time of the FIA measurement and the laser overflights.
With respect to state and national ground reference estimates, USFS software (Evalidator, www.fia.fed.us/tools-data/tutorials_training/ docs/EVALIDator_user_guide_1.5.1.05.pdf) was used to estimate forest biomass in each of the five strata defined by the Landsat land cover map based on the predominant species classification of each FIA plot as determined on the ground at time of measurement. The FIA species-specific categories (e.g., longleaf-shortleaf pine, oak-hickory, maple-beech-birch, pinyon pine-juniper) were cross-walked into one of the five forest classes -wetlands, hardwood, conifer, mixedwood, and burn -in order to compare satellite lidar estimates to groundbased estimates. The Evalidator software was used to query ground measurements on 119,414 FIA plots scattered across the entire US to develop the ground-based stratum, state, and national estimates. FIA plot remeasurement dates varied from 1998 to 2013 (mean year ± 1 standard deviation: 2004.7 ± 2.5 years), with measurement dates varying by state.
In Mexico, the Mexican NFI, Inventario Nacional Forestal y Suelos (INFyS, www.cnf.gob.mx:8090/snif/portal/infys), provided the necessary ground reference observations of biomass. Mexico's NFI includes 26,000 ground plots established between 2004 and 2007 scattered across the forested areas of the country. Our ALS system measured 294 INFyS ground plots in northern and central dry forests, in the Sierra Madre pine/oak forests, in the Veracruz and Yucatan moist forests, and in the Yucatan dry forests.

US and Mexico land cover maps used for stratification and reporting
Landsat and MODIS data were used to categorize both ground plot data and GLAS data overflown by the ALS into five forest cover strata in CONUS. A 30 m Landsat land cover map of CONUS, resampled to 90 m, the NLCD (National Land Cover Database) -2006 (www.mrlc. gov/nlcd2006.php), was used to identify each FIA ground plot and each CONUS GLAS pulse as being in one of four land cover categories: wetlands (NLCD class 90,95), hardwood (NLCD class 41), conifer (NLCD 42), and mixedwood (NLCD 43). Wickham et al. (2013) state that the NLCD-2006 map used in this study to stratify the US has a classification accuracy of 78%. A fifth land cover class -burn -was identified by joining FIA plot locations and GLAS pulse locations with 2000-2006 MODIS burn data (Giglio et al., 2009). This MODIS multitemporal burn product was used to report the most recent burn date of a given 500 m pixel identified as burned within the seven year window.
Similarly in Mexico, a 90 m land cover map produced by INEGI (Instituto Nacional de Estadística y Geographía, INEGI, 2010, http:// www.inegi.org.mx), Version 4, based on photointerpretation of SPOT data, with 59 land cover classes, was recoded into the 4 NLCD-like forest cover classes of interest -wetlands, hardwood, conifer, and mixedwood. The actual recoding, i.e., crosswalk, used in this study is provided in the Appendix, Table S.1. This land cover map of Mexico was joined with both the Mexican NFI ground data and the GLAS acquisitions. As in the US, the MODIS multitemporal burn data was used to identify areas that burned between 2000 and 2006, thereby adding a fifth forest land cover class.
Of the 842 US ground plots, 66 were located in NLCD nonforest, 103 in shrubs, 39 -wetlands, 181 -hardwood, 404 -conifer, 38mixedwood, and 11 -burn. Given the paucity of burned plots overflown by the ALS sensor, the nonforest and shrub plots were used along with the 11 burn plots to derive the FIA-ALS biomass equation for burn (n = 180). In Mexico, of these 294 NFI plots overflown by the ALS sensor, 33 were identified as nonforest based on our simplified INEGI crosswalk, 0 in shrubs, 0 in wetlands, 128 in hardwood, 120 in conifer, 0 in mixedwood, and 13 in burn. In Mexico, as in the US, land cover strata not identified as one of the five forest cover types are assumed to contain zero forest biomass regardless of the GLAS measurements obtained.

Airborne LiDAR scanning data
NASA's GLiHT (Goddard's Lidar, Hyperspectral, & Thermal imager, Cook et al., 2013, http://gliht.gsfc.nasa.gov) ALS system was used to acquire laser ranging measurements over the US and Mexico in 2011 (Table 1). The heart of the ranging system is a Riegl VQ-480, a 300 kHz, 1550 nm Class 1 (eye-safe) scanning lidar The nominal flight altitude was 335 m AGL and the usable scan angle was defined to be ± 15°for these overflights, yielding an effective ground swath width of 180 m, though the laser does measure out to ±30°. The flight speed varied with topography but was typically 110-120 knots (205-223 km/h, 57-62 m/s). At a 335 m AGL flight altitude, given a laser beam divergence of 0.3 mrad, the footprint size is 10 cm. The GLiHT lidar is a multistop system, recording up to 6 significant returns per pulse, though the great majority of the pulses acquired in the US and Mexico generated 1 or 2, rarely 3 returns. On average, the pulse density on all 3 missions was 4-5 pulses/m 2 .
The US and Mexico flights were scheduled to try to measure canopies leaf-on. In CONUS, flights were conducted mid-summer -July and August. In Mexico, scheduling was more challenging given that leafon/off conditions vary with latitude, elevation, and the timing of the regional rainy/dry seasons. All of Mexico was flown in April and early May 2013 in an attempt to avoid clouds associated with the rainy season in southern Mexico (May-September) while also avoiding the leaf-drop associated with the dry season (November-March). Maass et al. (1995, Fig. 5) provides some guidance for the state of Jalisco in western Mexico (latitude~20°N), showing maximum leaf drop in November and December. In Mexico, we were not successful as regards acquiring consistently leaf-on ranging measurements. In southern Mexico, especially in the pine/oak and dry forests of Oaxaca (~17°N) and in the dry forests of the northwestern Yucatan peninsula (~20.5°N), extensive stands of hardwoods were leaf-off, and these leaf-off stands were frequently just a few kilometers from similar hardwood stands which, from the aircraft, looked to be in full canopy. In hindsight, it may have been better to fly Mexico perhaps 20-30 days later, trading increased cloud cover for leaf flush. Given GLiHT's relatively low design flight altitude, data collection missions can be conducted under overcast skies. However, given the 1550 nm wavelength of the ALS transmitted pulse and concerns about pulse attenuation/absorption due to water, care was taken not to fly wet targets.
The GLiHT aircraft transited 895 USFS-FIA plots in the United States and 294 INFyS plots in Mexico. GLiHT also transited 6797 km of GLAS orbits in the continental US and 3160 km of GLAS orbits in Mexico. The FIA and INFyS ground plots and GLAS transects overflown by GLiHT in the US and Mexico are shown in Fig. 1, as well as the GLAS orbits used to sample both countries.

GLAS data
GLAS served as our large-scale sampling tool; the Landsat land cover and MODIS fire products in the US and the SPOTphotointerpreted and MODIS fire products in Mexico served only to stratify the two countries for reporting purposes and, arguably, to improve the predictive biomass models. In the US, 3 leaf-on GLAS acquisitions incorporating 230 orbits are used to estimate biomass in 47 of the 48 states. In the US, none of the 230 GLAS orbits transected Delaware nor the District of Columbia, hence biomass estimates, like the sample size, are zero for these two locales. In Mexico, 5 GLAS acquisitions were accessed to inventory the 32 states and the entire country. Three-hundred-eighty-four orbits were acquired over Mexico and each state was measured during at least one of those acquisitions (Table 2).

Ancillary data
Ninety meter SRTM (Shuttle Radar Topography Mission) data acquired in February 2000 were used to approximate local slope on a 3 × 3 window centered on GLiHT-GLAS observations. Only GLAS pulses acquired on slopes ≤20°were used to (1) formulate regressions and (2) to inventory US and Mexico forests. This slope criterion was established to mitigate the effects of GLAS waveform pulse spreading on steep slopes (Lefsky et al., 2005) which leads to inflated canopy height values.

Regression analysis
The LIN, LNI, and log-linear models were developed using SAS Proc Reg (SAS Institute, Inc., 1989). The Reg procedure's All Possible Subsets Regression (APSR) feature was used to select significant variables for each model. APSR considers all possible variable combinations and reports the best single variable models, the best two-variable models, three-variable models, etc., according to a user-specified criterion, in our case, R 2 . These best models are then tested individually and down-selected based on variable significance and VIF (variance inflation factor) metrics. VIF values approaching or exceeding 10 indicate multicollinearity issues and identify those independent variables that should be trimmed in order to avoid overfitting. The log-linear model estimates of ln(biomass) were backtransformed to real-world units, i.e., biomass density, in t/ha, using a ratio estimator for bias correction (Snowdon, 1991).

Temporal concerns
As reported above, this study integrates ground and GLAS data sets acquired circa 2005 (ca. 2005) and ALS data acquired ca. 2012. The ALS is used to tie GLAS to ground. The ground-ALS models relate ca. 2005 ground biomass density estimates (Y) to ca. 2012 ALS measurements (X). The ca. 2012 ALS measurement are then used to estimate ca. 2005 biomass conditions on the ca. 2005 GLAS pulses overflown for ALS-GLAS model development. No growth or mortality adjustments were made to the ground plot data to harmonize dates. The important factor here is that the ground data collects were, at least approximately, temporally coincident with the GLAS acquisitions.

Continental United States
In CONUS, three sets of explicitly or implicitly linear models, (1) linear (LIN), (2) linear-no-intercept (LNI), and (3) log-linear, were used to develop stratum-within-state, state, stratum, and national-level estimates of aboveground total dry biomass for the contiguous US.
A summary of CONUS LIN and LNI models are given in Table 3. An expanded version of Table 3 that includes the specific regression coefficients and the variance-covariance matrices can be found in the Appendix, Table S.2.
The third set of models, log-linear, i.e., ln(b ground ) = f(ht, den) ALS and ln ðb ALS Þ ¼ f ðht; denÞ GLAS were likewise formulated, but these equations produced state and national estimates that grossly overestimated the ground-based FIA ground reference values by factors of 2-3. Though log-linear ground-ALS models have been used successfully in the predominantly coniferous Scandinavian countries to estimate volume (Naesset, 1997) and biomass , airborne-GLAS log-linear models have consistently failed to produce reasonable biomass estimates in Quebec (Boudreau et al., 2008;Nelson et al., 2009a) and now, in CONUS. As illustrated in Fig. 2, the failure is broad-spectrum. Fig. 2 illustrates comparable GLAS biomass estimates from the three models made on 241,718 US GLAS pulses located in the 5 forest strata in CONUS. For whatever reason, e.g., inappropriate model form, sparse, nonrepresentative ALS data collects over ground plots and GLAS pulses, noisy GLAS data -the log-linear models turn out to be much less forgiving than the linear model forms, especially at the high end of the AGB spectrum. As such, the log-linear equations and resultant estimates derived based on those equations are not reported.
The LIN and LNI model results were compared to USFS-FIA groundbased estimates of biomass density at the state and national levels. Table 4 Table 4, the linear-no-intercept models were better at reproducing FIA results which, in this study, serve as our ground reference data. The linear models produce a national total biomass estimate 7% larger than the FIA estimate whereas the linear-no-intercept models reproduced the national FIA estimate within b1%. State-level LNI estimates also exhibited less scatter about the 1:1 line that denotes equivalence between FIA and GLAS biomass estimates. The average absolute value of the departure from the 1:1 line at the state level was 19%; the comparable linear model departure was over 30%. Based on the results presented in Table 4, the decision was made to inventory Mexico using LNI models only.
Both the GLAS LIN and LNI estimates of national biomass density were greater than the FIA average biomass density, however the NLCD-2006 map used to stratify the US GLAS pulses into forest and nonforest reported less forest area relative to the FIA. The FIA reports that the US is 34.5% forested; the NLCD-2006 Landsat map reports that the US is 31.4% forested if we judge that all pixels identified as wetland, hardwood, conifer, mixedwood, and burn are forest. Fig. 3 illustrates state-to-state differences between FIA and NLCD forest area estimates.
Tabular estimates of total aboveground dry biomass based on LNI equations, by state, are provided in Table 5. The ground-based USFS-FIA estimates are provided for comparison. Ground-GLAS state comparisons of biomass density and total aboveground dry biomass are illustrated in Fig. 4.

Mexico
Based on the US results, Mexico was inventoried using linear-no-intercept models. Unlike the US, we did not inventory Mexico using separate equations for wetland, hardwood, conifer, mixedwood, and burn forest. Two significant characteristics of the Mexican data set suggested that a different analysis approach in Mexico was needed. The first characteristic was that the distribution of forest cover types flown by the ALS varied significantly from north to south, affecting our ability to generate cover-type-specific ground-ALS equations throughout the country. Sixty-four of the 79 Mexican NFI plots overflown in the 12 northern states were pine; only 6 were hardwood, and none were mixedwood. Northern Mexico (excluding Veracruz and southern Tamaulipas), primarily steppe/desert, tends to support dry pine or pine-oak forests. In this northern zone, one general NFI-ALS equation was generated. In central and southwestern Mexico, states on or west of the Sierra Madre support mountain pine-oak forests and to the west, the dry Pacific forests. In this southwest zone, two NFI-ALS biomass equations were generated, one for hardwood, one for conifer. The third zone, the Atlantic Crescent states which include Veracruz, Tabasco, Campeche, Yucatan, and Quintana Roo, are a mixture of moist and dry hardwood forests with little conifer. Of the 112 NFI plots transited in this third zone, 86 were hardwood, 2 were conifer. One general NFI-ALS equation was generated for this Atlantic Crescent zone, primarily characterizing the moist and dry tropical hardwood cover types.
The second characteristic that suggested that Mexico should be analyzed as three independent zones -Northern, Southwest, Atlantic Crescent -concerned the Atlantic Crescent data specifically. The problem is illustrated in Fig. 5, a scatterplot of GLAS h90 (X) versusb ALS (Y) for the entire country, GLAS acquisitions L2a, L3a, L3c, L3d, and L3f. The bi-lobed nature of the data cloud is obvious and troublesome from the standpoint of fitting linear models; at a minimum, at least two distinct subpopulations are seen in this plot.
Additional work indicated that the near-vertical lobe closest to the Y-axis was tied to the L2a, L3a, and L3d GLAS acquisitions, i.e., the autumn measurement periods, in the Atlantic Crescent. We hypothesize that Fig. 5 illustrates the effects of dry season deciduous leaf drop on the upper GLAS height deciles, e.g., h70, h80, h90. Based on this hypothesis, a decision was made to inventory the five Atlantic Crescent states using early summer GLAS acquisitions only, i.e., L3c and L3 f. Table 3 CONUS linear models and linear-no-intercept models used to predict AGB for each of the 5 forest strata within US states. Note: While the linear versus linear-no-intercept ground-ALS models may be directly compared within cover type, the ALS-GLAS models may not. Within cover type, the linear and linear-no-intercept Ground-ALS models use the same dependent (b grnd ) observations; the ALS-GLAS models do not. VIF = variance inflation factor. cc Canopy closure, the fraction of first returns intercepted by a tree, i.e., first return height N 1.3 m AGL (0 b cc b 1.0). h10 a , h30 a , h50 a , h70 a , h90 a Height deciles, all pulse returns, i.e., height below which 10% of the returns occur, 30% occur, etc.; ground hits included (m). d0 a , d1 a , d2 a , d4 a , d5 a , d7 a Density deciles, all pulse returns, i.e., the proportion of returns that occur in equally spaced height bins between ground and h95 a , ∑ 9 x¼0 dxa = 1.0.

ALS-GLAS variables:
b ALS ALS estimate of biomass density (t/ha) based on ground-ALS equation.
centroid Height of the GLAS waveform where half of the waveform energy is above and half below, signal start to signal end (m), relative to signal end. medh Median waveform height above ground peak (m). h14 The height from ground, i.e., the last Gaussian peak, to signal start (m). h10, h25, h30, h75, h90 Height deciles or quartiles; e.g., h75 is that waveform height below which 75% of the waveform energy resides (m), relative to signal end. h75 srtm , h90 srtm Height deciles corrected for SRTM ground slope (m). trail Vertical distance, ground peak to the end of the Gaussian ground signal (m). lead Vertical distance, start of the Gaussian ground signal to the ground peak (m). fslope Front slope angle, that acute angle formed by a vertical line denoting ambient waveform noise and a 2nd line joining signal-start and the first (highest) vegetation peak (degrees The other two zones were inventoried using all five GLAS acquisitions, none of which exhibited this distinct bi-lobe pattern. The Mexico linear-no-intercept models used to estimate stratum-within-state and national biomass are given in Table 6. An expanded version of Table 6 that includes the specific regression coefficients and the variance-covariance matrices can be found in the Appendix, Table S.3.  Table 7 reports the individual state estimates of biomass density, with error bounds, derived using the LNI equations reported in Table 6. Table S.5 in the Appendix reports stratum-within-state biomass densities for Mexico. As expected, forest biomass density increases as one moves from northwest to southeast in Mexico, with the Baja states reporting the lowest densities (and forest areas) and the Atlantic Crescent states, Veracruz and Quintana Roo, reporting the highest biomass densities.
At the national level, a number of Mexico forest biomass estimates can be compared to this study's national estimate (Table 8). The FAO estimates for 2000, 2005(FAO, 2010a and the various biomass estimates reported by Cartus et al. (2014, p. 5573) were derived by multiplying their aboveground carbon estimates by 2 to calculate dry biomass, an inversion of the procedure employed by FAO to convert biomass estimates to carbon. A few observations can be made regarding the numbers reported in Table 8. First, as witnessed by the FAO estimates, Mexican forests are under pressure much more so than forest area and forest growing stocks in the US. In fact, between 2000 and 2010, Mexican forest area losses were offset by forest area gains in the US (FAO, 2010a, p. 19). As noted in Table 8, our national estimate for Mexico, developed based on GLAS observations acquired between 2003 and 2006, are bracketed by the FAO ground-based and the PALSAR results presented by Cartus et al. (2014).

Continental United States
In a sampling study such as this where the NLCD is used to poststratify the FIA plots overflown by the ALS and all of the GLAS pulses, there are two moving parts: (1) an estimate of biomass density (t/ ha) derived from all of the GLAS pulses that intercept a given forest cover type as characterized by the NLCD, and (2) the optical satellite/GIS-based areal estimate of that forest cover type. Fig. 4A illustrates the significant differences between FIA and GLAS estimates of biomass density. Fig. 3 illustrates the significant differences between FIA and NLCD-2006 forest area estimates. Given the fact that the FIA and GLAS national estimates are so close -LNI model results within 1%, LIN model results within 8% -how do such disparate state components produce such national agreement? Texas, the largest outlier in Fig. 3, serves as an informative example. As shown in Fig. 3, the area covered by Texas forests according to the FIA is 249,127 km 2 .   Fig. 4B illustrates an average FIA-NLCD/GLAS biomass difference of 18.8% at the state level. These large biomass differences are primarily due to the factors listed directly above, i.e., the interplay between forest area (and the associated differences in the definition of forest) and forest biomass density. Perhaps as important is the fact that the estimator used to calculate the mean biomass density of each stratum within state assumes that the area is randomly sampled. As discussed previously, GLAS does not sample the Earth's surface randomly; large areas between orbits literally have no chance of being sampled. This characteristic, coupled with the small orbital, i.e., cluster, sample sizes associated with small states or states with few forest resources, can result in nonrepresentative samples which poorly estimate mean biomass density or total biomass. Though an imperfect sampling tool, GLAS does permit analysts to hone large-area sampling techniques, and future space lidars will mitigate these GLAS weaknesses by greatly increasing orbital sampling density.
Noted in Fig. 4B are the outliers on the far right-hand side of the graph, 3 points representing California, Oregon, and Washington. In these three states, FIA estimates of forest area are 4-24% larger than the corresponding NLCD areas and FIA biomass densities are 25-35% larger than GLAS estimates. These states support the largest trees in the conterminous US, with the tallest trees reaching heights of almost 100 m. The GLiHT data used in this study were postprocessed with a maximum upper height limit of 50 m; any pulse with a first return height-above-ground of N 50 m was discarded. This was done nationally to mitigate the effects of spurious, abovecanopy returns, but in these three western states this may have lead to legitimate ALS measurements being ignored. This truncation was expected to have, at most, only a minor effect, an effect worth suffering to improve overall data quality nationally given that trees exceeding 50 m on FIA plots can be considered rare events. More likely, we expect that the GLAS underestimation problem on the west coast is due to the fact that certain areas of the US such as the west coast should be considered separately with respect to the development of predictive ALS -GLAS equations. Subsequent studies should consider (1) increasing the maximum allowable height threshold and (2) developing an independent set of stratified equations in, at a minimum, these three states. Blackard et al. (2008), mapping biomass via imputation using satellite optical (MODIS, Landsat-NLCD), topographic, and climatic data sets, ran into a similar problem with these three western US states despite developing predictive equations by ecoregions. They report large mapped biomass underestimation errors relative to their FIA estimates (their Fig. 4). They underestimate total biomass in California, Oregon, and Washington by approximately 43, 35, and 41%, respectively (interpolated from their Fig. 4). Our corresponding GLAS-based estimates under-report FIA in these same states by 35, 41, and 26%, respectively. We hypothesize that the consistent underestimation in these three states in both studies is due to an inadequate characterization, measurement, and modeling of the exceptionally large trees found on the west side of the Coast Range and in the California Sierra Nevada Mountains. ALS overflights should target ground plots and satellite lidar profiles in 'big tree' areas, e.g., redwood, Sequoia, and fir/hemlock associations, so that these tall, high-biomass cover types are adequately modeled. In the current study's model-based inferential framework of the 2nd and 3rd sampling phases, such purposeful, targeted sampling is not only allowed, it would be encouraged to facilitate model development.
Stratum-level biomass density estimates for each state in CONUS are provided in the Appendix, Table S.4. In considering these stratum-within-state estimates, note that standard errors increase, reflecting the fact that the reliability and most likely the accuracy of these estimates become more suspect since national-level predictive equations are used to characterize local conditions (e.g., Huang et al., 2015). Also note that there is a general, albeit noisy, tendency for the percent-model error to decrease as a percentage of the total standard error as one drills down (spatially) into the data. This phenomenon is due to the fact that, as larger and larger areas area considered, the sampling error component decreases since more GLAS orbits hit the target. Model error tends to be relatively stable regardless of the spatial extent of the AOI. (Nelson et al., 2012, their Fig. 3). In the context of the logistic modeling of Landsat spectral data to predict forest proportion on circular AOIs 5, 10, and 15 km in radius, the stability of model standard error relative to sampling error is obvious in Tables 1-3 or Tables 4-6 in McRoberts (2006).
In this study, the continental US was specifically targeted as our primary AOI because of the existence of a country-wide, well-documented, well-measured national forest inventory, the USFS-FIA. The FIA provided ground reference, allowing us to compare our model-based, threephase estimates of biomass density and our NLCD estimates of forest area to those of an independent, probabilistically allocated, groundbased inventory program. The analysis techniques that produced the most accurate estimates in CONUS were subsequently extended to the Mexican neotropical and subtropical forests.
The total variance of an estimate of biomass, whether it is mean density or total, additively combines (a) a sampling variance component that is a function of the number of orbits intercepting the entire AOI (not just the stratum) and (b) a model variance component that adds the uncertainty associated with the model coefficients. This model variance, which essentially characterizes how the model coefficients change with repeated sampling, assumes that the specified model form, in our case, LNI, is correct. In this study, the model error component of overall standard error of a biomass estimate played a relatively minor roll, adding, on average, only 0.07 t/ha to state-level standard errors of biomass density, accounting for, on average, only 4.5% of the standard error. Previous aircraft lidar studies  had noted much larger model error contributions; sometimes over 90% of the total variance of the biomass estimate was attributable to model error. There are two reasons why the contribution in this study is relatively small, though still significant: (1) The current study employs LNI models; the Y-intercept is locked at (0,0) and any uncertainty associated with the estimation of that Y-intercept is discounted. Fig. 4. A comparison of 4A) aboveground biomass density, t/ha, and 4B) total aboveground biomass, million tons, for each of the 48 states in CONUS. These results were developed using the linear-no-intercept models in Table 3. 2σ error bars (~95% confidence interval) are shown on each GLAS estimate. In fact, in CONUS (this study), the LIN model error contributions to statelevel biomass density standard errors was 0.18 t/ha, accounting for, on average, 13.7% of the states'standard errors. At the national level, about 5.9% of CONUS standard error can be attributed to LNI model uncertainty; in Mexico that number is 22.6%.

Mexico
With respect to the estimates provided in Table 8, which estimate is correct? Certainly an argument could be made that the Cartus C estimates may be better since (1) theirs depends on wall-to-wall L-band radar observations, (2) L-band radar is sensitive to the biomass densities typically present in most Mexican forests (0-100 t/ha, Imhoff, 1995;Wagner et al., 2003), (3) the NFI-ALS sample (this study) is relatively sparse with respect to adequately representing the full range of canopy height and biomass conditions in Mexico, and (4) the GLAS sample is just that, a sample, i.e., not wall-to-wall. A counter-argument could be made that the ground-airborne lidar-GLAS sampling procedures used in CONUS and Quebec produced regional-and national-level estimates well within 10% of intensive ground-based samples.
Though we have no state-level ground information for Mexico, we expect scatter around the 1:1 line to be similar to that seen in the US ( Fig. 4A and B). Fig. 12 in Cartus et al. (2014) compares their state estimates of C-density based on wall-to-wall PALSAR measurements of the entire country (no forest/nonforest mask employed) to national forest inventory ground measurements. The radar C estimates track the 1:1 (ground:radar) line very closely (R 2 = 0.98) and, multiplying their C estimates by 2, range from 0 to approximately 66 t/ha dry biomass, with 30 out of the 32 Mexican states reporting biomass densities at or below 35 t/ha. The current study's state-level estimates range from 2 to 109 t/ha (Baja California Sur and Veracruz, resp.) in areas defined as forest in our INEGI version 4 land cover crosswalk. Twenty-nine of our 32 state biomass density estimates were at or below 70 t/ha. Our perstate biomass density estimates are approximately double those of Cartus. This is understandable from the standpoint that our biomass numbers describe only the biomass in forest strata for a given state, whereas the biomass density estimates illustrated in Fig. 12 (Cartus et al.) report biomass averages for the entire state, including nonforested areas.
In terms of national biomass estimates (Table 8), the Cartus estimate associated with the state-level numbers in their Fig. 12 overestimates our national estimate, 3.645 ± 0.06 Gt, by 21% ((4.42/3.645)*100). At the other end of the spectrum, when Cartus applied a forest/nonforest mask to exclude biomass contributions from nonforested areas, then their total forest biomass in Mexico drops to 3.06 Gt, a 16% underestimate ((1-3.06/3.654)*100) relative to this study's GLAS total. Table  8 highlights two facts: (1) The areal extent of Mexican forests have been changing rapidly, with the forests experiencing extensive deforestation and degradation (FAO, 2010a, p. 19). (2) The definition of forest and the areal extent of those forests as described using satellite optical data are huge drivers with respect to calculating a country's total biomass. As demonstrated in Table 8, Mexico's total biomass can change by 44% ((4.42 Gt/3.06 Gt) * 100) just by employing a forest/nonforest mask to control biomass contribution from areas defined as nonforest. Similarly, in a study in Alaska (Margolis et al., 2015), GLAS estimates of biomass and carbon for the entire state were generated with and without the contribution from the NLCD "shrub" land cover class (NLCD classes 51, 52). Inclusion of the shrub classes in Alaska lead to an apparent 43% increase in state estimates of total forest biomass, i.e., 3.02 Gt with shrub, 2.11 Gt without. In the final version of the North American boreal report, Margolis et al. (2015) reported Alaskan biomass without shrubs because the team had not included shrub allometry in the generation of the predictive ground-airborne lidar nor the airborne lidar-GLAS equations. A major take-home point is this: In a given country or state, a consistent definition of forest and a consistent delineation of that forest is, for bookkeeping purposes in a UN/REDD + environment, all-important (Sexton et al., 2016).

Conclusion
A hybrid three-phase sampling framework is used to generate ca. 2005 biomass density estimates and total biomass estimates, with uncertainties, for CONUS and for Mexico at the state and national levels. The standard errors reported incorporate sampling variability and ALS-GLAS model variability, but due to our inability to incorporate ground-ALS model variability, all standard errors must be considered approximations and most are likely underestimated. Regardless, these results empirically demonstrate that studies that rely on models that employ airborne or satellite remote sensing assets to predict variables of interest -e.g., biomass, carbon, LAI, Lorey's height, canopy gap fraction, percent lignin, carbon flux -should report variances or standard errors that account for both sampling variability (if applicable) and model variability. Not to do so implies that the models used to generate the estimates are perfectly specified, perfectly parameterized, and without error, an entirely unrealistic assumption. To be clear, this comment applies not only to airborne and space lidar studies, but also to all studies that employ any remote sensing assets -e.g., radar, optical sensors, thermal sensors -to estimate, via modeling, a particular variable of interest.
What is the best way to keep track of forest biomass and carbon, by nation, globally? "Best" in this instance can be defined by accuracy, precision, statistical robustness, consistency over time, and repeatability. This study points up the difficulties involved when analysts try to develop analysis approaches to address this question because the question itself implies that there is, for a given nation, one correct answer. Viewed in the extreme, there is. In a study like this one, where the current state of a variable is being estimated, the correct answer, known without error, is the weight of the wood in a given country when we cut down all of the forests, dry them out, and weigh them on a particular date. As nonsensical as this solution is, it is fair to ask if there is a consistent sampling or census mechanism, preferably one dependent on global remote sensing assets, that can be employed to provide an accurate, repeatable national mean or total estimate that is unbiased or minimally biased, with a statistically defensible estimate of uncertainty. We argue that space lidars and moderate resolution space optical systems (15 m -30 m, e.g., SPOT, Landsat, IRS), when used in conjunction with sampling frameworks similar to the one reported in this study, might provide such a mechanism. But, regardless of the asset employed, the key to the utility of those remote-sensing-based estimates is consistency in the definitions, measurement methods, and statistical techniques over time. A concerted effort would have to be made to calibrate airborne and space sensors so that the measurements acquired are comparable across decadal time scales, similar to the calibration efforts made with Landsat. Otherwise, national and continental remotesensing-based estimates of biomass and carbon such as those reported in this current study will continue to be one-offs, dependent on the definition of forest and on the particular measurement and observational characteristics of the sensors employed.