High-resolution mapping of aboveground biomass for forest carbon monitoring system in the Tri-State region of Maryland, Pennsylvania and Delaware, USA

Accurate estimation of forest aboveground biomass at high-resolution continues to remain a challenge and long-term goal for carbon monitoring and accounting systems. Here, we present an exhaustive evaluation and validation of a robust, replicable and scalable framework that maps forest aboveground biomass over large areas at fine-resolution by linking airborne lidar and field data with machine learning algorithms. We developed this framework over multiple phases of bottom-up monitoring efforts within NASA’s Carbon Monitoring Program. Lidar data were collected by different local and federal agencies and provided a wall-to-wall coverage of three states in the USA (Maryland, Pennsylvania and Delaware with a total area of 157 865 km2). We generated a set of standardized forestry metrics from lidar-derived imagery (i.e. canopy height model, CHM) to minimize inconsistency of data quality. We then estimated plot-scale biomass from field data that had the closet acquisition time to lidar data, and linked to lidar metrics using Random Forest models at four USDA Forest Service ecological regions. Additionally, we examined pixel-scale errors using independent field plot measurements across these ecoregions. Collectively, we estimate a total of ∼680 Tg C in aboveground biomass over the Tri-State region (13 DE, 103 MD, 564 PA) circa 2011. A comparison with existing products at pixel-, county-, and state-scale highlighted the contribution of trees over ‘non-forested’ areas, including urban trees and small patches of trees, an important biomass component largely omitted by previous studies due to insufficient spatial resolution. Our results indicated that integrating field data and low point density (∼1 pt m−2) airborne lidar can generate large-scale aboveground biomass products at an accuracy close to mainstream lidar forestry applications (R2 = 0.46–0.54, RMSE = 51.4–54.7 Mg ha−1; and R2 = 0.33–0.61, RMSE = 65.3–100.9 Mg ha−1; independent validation). Local, high-resolution lidar-derived biomass maps such as products from this study, provide a valuable bottom-up reference to improve the analysis and interpretation of large-scale mapping efforts and future development of a national carbon monitoring system.


Introduction
Forest carbon monitoring requires accurate estimates of aboveground biomass at high-resolution over large domains. Existing biomass maps are often inadequate for guiding such efforts because they have limited precision either by the spatial resolution (Blackard et al 2008b, Saatchi et al 2011, Wilson et al 2013a or by the geographical coverage (Huang et al 2013, Cook et al 2014. Although summary statistics from national to continental map products converge at continental scales (Avitabile et al 2016, Neeti andKennedy 2016), discrepancies have been observed at local to regional scales when compared to high-resolution biomass maps (Huang et al 2015. Thus there is a critical need to develop a highresolution map of forest biomass within the context of carbon monitoring over terrestrial ecosystems to assess ecosystem response to climate change (Houghton 2005, Houghton et al 2012, Gu et al 2016, Hurtt et al 2010, 2019, and to inform stakeholders at sub-regional to national scales to support policy-relevant activities such as Reduce Emissions Deforestation and Degradation (REDD+) and Paris Agreement (UNFCCC 2015).
Light Detection and Ranging (lidar) is the state-ofthe-art remote sensing technology for forest carbon accounting because of its accuracy in measuring the height and three-dimensional canopy structures (Goetz and Dubayah 2011, Huang et al 2013, Lu et al 2014, Chen and McRoberts 2016. Improving biomass estimates with lidar and other remotely sensed data such as from multi-spectral images and Interferometry Synthetic Aperture Radar (InSAR) has been a subject of intense research with substantial advances in the last decades (Swatantran et al 2011, Shendryk et al 2014, McRoberts et al 2018a, Zhao et al 2018. Increasing, lidar data are becoming available across large geographical scales, especially in the USA, Canada, and European countries. At statewide or nationwide scales, data from airborne lidar have been combined with field measurements such as those from Forest Inventory and Analysis database to assist stratified sampling and biomass estimation (Gregoire et al 2011), and produce wall-to-wall maps of forest structures and biomass (Gregoire et al 2011, Nord-Larsen and Schumacher 2012, Huang et al 2015, Chen and McRoberts 2016, Nilsson et al 2017. For example, lidar data were used to map forest variables such as tree height and biomass for Denmark Yet challenges remain in the operational use of existing airborne lidar for statewide or nationwide seamless mapping (Lu et al 2014). These include but are not limited to differences in collection time, instruments type and parameters set up for lidar acquisitions and data format. Leaf-off lidar data, originally designed for producing seamless DEM, are increasingly available at regional to national scales (USGS 2018). However, these datasets are normally acquired through multiple flight campaigns in adjacent counties/states during winter time and distributed in varied data formats.
Other fundamental challenges include the definition of forest/non-forest areas, and uncertainties in tree allometry and field measurements (Huang et al 2015). Different definitions of forest and non-forest could lead to large discrepancies when comparing local-scale mapping results to national-scale products (Huang et al 2015. Choice of national or local tree allometry could resulted in as much as 20% difference in field estimates of biomass and similar level of difference in county level estimates built upon lidar data . Last but not least, the uncertainties arose from the empirical relationships between field-measured biomass and lidar variables (McRoberts et al 2018a).
In this paper, we developed a framework that quantifies forest aboveground biomass over large geographical areas at high-resolution using lidar and field measurements. We combined forest attributes such as tree canopy cover and canopy structures predicted from leaf-off lidar data with field measurements to estimate biomass in both natural, planted and urban forests. In the process, we evaluate our maps rigorously with both with-held and independently collected plot data and discuss factors affecting our model predictions and the spatial patterns of resulting maps. This bottom-up approach at regional-scale complements the top-down continental-scale mapping to form a comprehensive carbon monitoring system (CMS).

Data and method
We combined existing field plots, wall-to-wall lidar, and optical data. The field data were used with allometry to estimate plot level biomass. The remote sensing data were used with a subset of these field estimates of biomass to drive machine learning models. In general, our framework has four general steps: (1) field data preparation, (2) remote sensing data processing, (3) model development and mapping, and (4) map validation and error analysis (figure 1).

Study area and field data
The study area, Tri-State region of Pennsylvania (PA; 119 280 km 2 ), Maryland (MD; 32 133 km 2 ), and Delaware (DE; 6452 km 2 ) in the USA (hereafter Tri-State region) covers a land area of approximately 157 865 km 2 . The Tri-State region contains four major ecological regions that are generally uniform with respect to species-composition and environmental gradients (figure 2). These regions are the Outer Coastal Plain Mixed, Eastern Broadleaf, Central Appalachian Broadleaf-Coniferous, and Northeastern Mixed Forest, adapted from Bailey's ecoregions for the Conterminous United States (Bailey 1995). Forests, including woody wetlands, comprise 56% of the total area based on National Land Cover Dataset (NLCD) 2011 (Homer et al 2015). The topography varies substantially with elevations ranging from about sea level near the Delaware River to 1027 m in Western Maryland.
The field data collected over the domain is built from two sections: USDA forest inventory analysis (FIA) plot data for model development, and independent field measurements collected by the University of Maryland (UMD) for product validation. For this study, 2243 forest plots that most closely matched the lidar year of acquisition over the domain were extracted from the FIA database (EVALIDator Version 1.6.0.03a). These plots, consist of measurements for individual tree height, diameter and species, were collected from year 2004 to 2013 in MD andDE, andfrom year 2005 to 2009 in PA, 92% of them had a less than 2 years difference (supplement S2 and table S2 is available online at stacks.iop.org/ERL/14/095002/ mmedia).
Additionally, a total of 1163 variable-radius plots (VRP) were collected by UMD plots) between 2010 and 2013 in MD (Dubayah 2012, Huang et al 2015, in the summers of 2014 and 2015 for the northeastern forest in PA and DE, respectively. VRP locations were determined by a modified random stratified sampling scheme broken down by ecoregions (Bailey 1995), NLCD 2011 land cover classes (Home et al 2015), and lidar canopy height (figure 2). Individual tree species, dbh and maximum height were measured in the field.
Plot-level aboveground live biomass was estimated using allometric models. For the sake of comparison with different map products, two types of allometric relationships were prepared: (1) regional models from Lastly, both the FIA and UMD plot data were screened for outliers. The screening omitted plots located in no lidar data areas (i.e. missing tiles in Maryland) and excluded mismatched plots between biomass and lidar heights with three filtering rules: had zero biomass but lidar heights (>10 m); had large biomass (>500 Mg ha −1 ) but low lidar heights (<10 m); and small biomass (<50 Mg ha −1 ) but high lidar heights (>30 m). These filters reduced the number of FIA plots from 2243 to 2215, and the number of UMD plots from 1163 to 1046. Summary statistics of the FIA and UMD plots are given in table 1.

Data processing for lidar and auxiliary data
Discrete return lidar data were collected through different county and state government agencies (summarized in supplement S2 and table S1) over the Tri-State region ( figure 2(b)). These airborne lidar collections were originally dedicated to surveying digital elevation model (DEM). Therefore leaf-off season (December-January) were picked for flight. Lidar first returns were interpolated into digital surface model (DSM); last returns were utilized to derive a bare earth DEM using quick terrain modeler and other GISbased software (O'Neil-Dunne et al 2013, 2014. Then DSM and DEM were differenced to obtain a normalized difference surface model (nDSM).
High-resolution tree canopy cover map at 1 m resolution for circa 2011 over the domain was created using an object-based approach that integrated the lidar data and high-resolution multi-spectral images from the National Agricultural Imagery Program or NAIP (O'Neil-Dunne et al 2013, 2014, USDA 2018. A canopy height model (CHM) was then extracted by applying the canopy cover mask to the nDSM, thereby creating a height model that excluded non-canopy height.
The definition of tree canopy used for high-resolution [1 m] product was a minimum height of 2.5 m and a minimum area of 4 m 2 . This definition is wider than the FIA definition of forest, which only refers to forested areas that are not developed for nonforest land uses, at least an acre in size and 120′ in width, and contain a live plus missing canopy cover of at least 10% (USFS 2018a).
A set of percentage canopy cover and mean canopy height maps were generated from aggregating the 1 m tree canopy cover and tree height maps. The percentage tree canopy cover was aggregated to 30 m. From the CHM, the max tree height within each 10 m cell was first calculated and then averaged to 30 m resolution (hereafter referred to as average-maximum height).
A set of forestry metrics were prepared, including percentile heights, density metrics, mean heights, canopy covers, and topographical indices (table 2). Specifically, metrics were calculated at plot-scale and pixelscale for modeling and mapping purpose. At plot-scale, we located the extent of a given plot using the plot centroid and calculated metrics within a given size of plot  figure  S2. At pixel-scale, we calculated metrics at 30 m resolution for each pixel within the domain. Countywide maps were created first and merged into a seamless statewide map aligned to a common extent.
Four national and a global scale biomass maps were prepared for map comparisons, summarized in table 3. The national biomass and carbon dataset (NBCD) 2000 was the first 30 m national product developed using medium resolution satellite optical imagery from Landsat enhanced thematic mapper plus (ETM+) and InSAR data from shuttle radar topography mission (SRTM) (Kellndorfer et al 2012). Two versions of the NBCD biomass maps were extracted for consistency with other national maps, including the FIA database version using CRM for allometry (NBCDcrm) and the nationally consistent allometric models (NCE) version using JKS allometry models (NBCDjks). The Saatchi map is a 90 m national-scale biomass map generated using optical, radar and lidar data and forest inventory data . Both the Blackard and Wilson were 250 m national biomass datasets generated by USDA forest service two generated by USFS (Blackard et al 2008a, 2008b, Wilson et al 2013b. The GlobBiomass map was a 100 m global biomass product converted from growing stock volume (Santoro 2018). Detailed descriptions for these maps are given in supplement S6.
For forest mask, the National Land Cover Dataset or NLCD 2011 (30 m) (Homer et al 2015) were utilized to differentiate between forested and non-forested areas for map comparisons. Specifically, woody forestlands (deciduous, conifers and mixed) and woody wetlands were included as forested area. From the NLCD forest mask, we applied a 20% threshold to aggregate the mask from 30 to 250 m. For consistency and statistical measurement purposes, all biomass and land cover maps were projected into Albers Equal-area Conic projection. All pixels were aligned to NLCD 2011 30 m cell.

Statistical models calibration
Random Forest models were developed using 80% of FIA plots to predict field-estimated biomass as a function of forestry and topographical metrics. Quantile Regression Forests (QRF), which predicts a particular quantile (10% and 90%), was employed to provide model error bounds (Meinshausen 2006). The R packages 'randomForest' and 'quantregForest' were used for modeling analyses (Liaw andWiener 2002, R Development Core Team 2016). We employed several statistical measurements such as coefficient of determination (R 2 ), root-mean-squared error (RMSE), RMSE CV (%), and mean estimated bias error (MBE) to validate results at plot-scale.
where x i is the value from field measurements, y i is the predicted value from RF model; i is the sample index; and x and y are the means of field measurements and predicted values respectively; and n is the sample size. Models developed for each ecoregion were applied to make countywide predictions for a total of 94 counties. Specifically, Outer Coastal model was applied to 16 counties, Eastern Broadleaf model was applied to 31 counties, Central Appalachian model was applied to 21 counties, and Northeastern Mixed model was applied to 26 counties. Details about random forest models were given in supplement S2.
Lastly, county-scale mean biomass density (Mg ha −1 ) and biomass stock (Tg) over DE, MD, and PA were summarized using average and sum methods. The Maryland canopy height map had gaps comprising around 1.5% of the state because of data collection restrictions and quality issues. To overcome this, these data gaps were filled with values from the NBCD product. Specifically, the height values were from weighted basal area heights (NBCD_BAW) and biomass

Plot-scale validation and error analysis
At the plot-scale, we compared estimates of biomass density from this study (RFjks) to with-held FIA plots and UMD plots. Explicitly, we conducted plot-scale validation using the 20% with-held FIA plots. Then, we used all UMD plots as independent plot-scale validations. In addition, we test the scale effect by comparing the RF predicts from 30 m and 90 m metrics to estimates from FIA plots (supplement S2). Maps of uncertainty were generated for every 30 m pixel. RF predicts the mean response for any given location. Prediction intervals were developed using QRF. QRF is similar to RF but it retains all responses and allows for constructing prediction intervals. We generated the 10th and 90th prediction percentiles and analyzed the spatial distribution of errors across the state. The uncertainty index is calculated as: where y i is the predicted value from RF model; Perct10 i and Perct90 i are the predicted values from 10th and 90th prediction percentiles, respectively.

National map validation
Estimates of biomass from this study (RFjks) were compared to four national and one global products (NBCDjks, Saatchi, Blackard, Wilson, and GlobBiomass) at the pixel, county and state scales following the framework we setup in previous studies in Maryland and California (Huang et al 2015. At the pixel scale, we compared biomass maps at 250 m spatial resolution, the coarsest resolution among all maps in this study. We choose to use this resolution because previous analysis found that the map comparisons at coarser resolution are more reliable . The aggregation process reduces the difference caused by geolocation and misalignment between different maps. Statistical measurements such as R 2 , root-mean-squared difference (RMSD, calculated similar to RMSE) and mean estimated bias difference (MBD, calculated similar to MBE), were selected to evaluate results at pixel-scale.
At the county to the state scale, we compared estimates from five map products to FIA estimates. Especially, we compared the map-based county and statescale estimates of aboveground biomass from this study, four national products (NBCDjks, Saatchi, Wilson, and Blackard), a global product (GlobBiomass) and samplebased estimates from FIA plots using two different allometric equations (FIAjks and FIAcrm). Additionally, at the state-to Tri-State region scales, we compared estimates of mean carbon density and stock from mapbased results to sample-based estimates from FIA. A coefficient of 0.5 was used to convert from biomass to carbon (Chave et al 2005). We used similar statistical indicators (i.e. R 2 , RMSE, and MBE) as described in section 2.3 to evaluate the map-based results.

Statewide forest biomass products
Visually, the spatial patterns of canopy cover, mean canopy height and biomass matched land cover and topography ( figure 3). In general, counties in western Maryland and northern Pennsylvania had the highest mean canopy heights, largest forest cover, and total and mean biomass. The Blue Ridge region had large biomass while the coastal plains had smaller biomass in general. Biomass distribution was patchy and heterogeneous as expected from the patchy nature of the landscape. The map of uncertainty index derived from QRF models shows similar spatial patterns matched canopy cover and height ( figure S2).
High-resolution maps reveal more details ( figure 4). The canopy cover map at 1 m resolution detailed delineating out important urban forests that  traditionally defined as 'non-forest', such as open, low and medium high developed land cover types. The canopy height map revealed not only the extent of forest but also vertical information, which is critical for estimating carbon in forests. Larger biomass values were observed in urban areas that have tall trees and residential neighborhoods that have tree plantation programs ( figure S6(a)). Biomass values were larger near water bodies, as canopy heights were higher in protected wetlands ( figure S6(b)).

Plot-scale validation and error analysis
Plot-scale validation of biomass density showed different levels of agreement when using FIA plots (figure 4) and UMD plots ( figure S4)  their bootstrap validation for five mapping zones (zone 60 to zone 64) that covers the study area. Unexpectedly, we noticed saturation effects in all ecoregion models. Validation using 20% with-held FIA plots indicated that four ecoregion models performed differently (b). Independent validation using UMD plots (figure S3) showed larger errors (R 2 =0.33-0.61, RMSE=85.3-100.9 Mg ha −1 , RMSE CV=43.2%-85.0%) and different scales of under estimations (MBE=−8.9∼−45.5 Mg ha −1 ), but followed similar patterns as the results using FIA plots. Similar patterns in biomass distribution were found when comparing the RF products from 30 and 90 m metrics to estimates from FIA plots (figure S5). Two of the four ecoregion models (Eastern Broadleaf and Northeastern Mixed) have smaller estimated biases in biomass predictions (MBE=0.6-1.6 Mg ha −1 ) using metrics generated at 30 m or 90 m scales. While the other two ecoregion biomass models had either underestimation (Outer Coaster, MBE=−17.4 Mg ha −1 ) or overestimation (Central Appalachian, MBE= 11.6 Mg ha −1 ) when using metrics at 30 m scales.

Comparison to existing products
The random forest derived biomass map (RFjks) was compared to four national products (NBCD, Saatchi, Blackard, Wilson) and one global map product (GlobBiomass) over the Tri-State region. Although all map products followed general patterns in topography and landscape, fine-scale details varied especially over urban fringes (figure 5). Over forested area, less disagreements were observed among the three finer scaled map products (NBCDjks 30 m, Saatchi 90 m and GlobBiomass 100 m) and RFjks, compared to larger disagreements found among the two coarser map products (Wilson and Blackard,250 m) and RFjks, as illustrated in the scatter plots and associated errors (figure 6). Biomass estimates had stronger correlations (R 2 >0.65) among the three InSARoptically fused national maps (NBCDjks, Saatchi and GlobBiomass) and the RFjks map over all regions. The two coarser resolution maps (Wilson and Blackard) had the poorest agreement with the RFjks.
The relationships at county-and state-scale varied between biomass estimates among five map products and those produced by the USFS FIA for the Tri-State region (table 4). All map-based estimates agreed well with the FIA estimates at county-scale (R 2 >0.92, RMSD ranged between 3.0 and 9.0 Tg; figure S6). Details in county-scale of biomass density and stock were illustrated in figures S7 and S8. At the state-scale, the estimated aboveground biomass among different maps and USFS FIA showed large differences over forest and non-forest area. In Pennsylvania and Tri-State region (PA dominated), the four map-based estimates (RFjks, NBCDjks, Saatchi and GlobBiomass) that used either lidar or InSAR derived variables as model predictors, achieved closer values to each other as well as to the FIAjks over forest regions, but were more diverse over non-forest regions. In Maryland and Delaware, the estimates from this study (RFjks) matched best to FIA estimates over forest regions, followed by the estimates from the other three finer scaled maps (NBCDjks, GlobBiomass, and Saatchi). The estimated aboveground carbon density and stock among different maps and USFS FIA estimates showed large differences for each state and the Tri-State region. The map-based estimates in aboveground biomass ranged between 527.5 and 680.2 Tg across the Tri-State region. Whereas the USFS FIA estimates ranged between 638.4 (CRM) and 695 (Jenkins) Tg. Our estimates in aboveground carbon (563.8, 103.4 and 13 Tg) matched best to USFS for Pennsylvania, Maryland and Delaware (562.4,116.7 and 15.9 Tg). The carbon estimates from the two coarser maps (Wilson & Blackard) were consistently ∼20% less than FIAcrm estimates in forested areas.

Discussion
Combining lidar derived metrics and auxiliary data, and FIA plots, the RF models were calibrated, applied and validated for estimating biomass over Tri-State region. Validation of biomass estimates at FIA plotlevel using with-hold data showed similar levels of expiation and associated errors as RF training data. Although independent validation using finer scale UMD plots indicated consistent negative biases, the biomass map from this study reveals more details at finer scales. Additionally, map comparison between existing products and this study indicated that biomass estimates from InSAR-optically fused products (e.g. NBCDjks, Saatchi, and GlobBiomass) generally agree better with our map (RFjks) over forested.
Our estimates of biomass have uncertainties that arose from several factors. One uncertainty in our biomass product is caused by using leaf-off lidar data that have relatively low point density. In this study, we utilized many archived lidar data (e.g. Pennsylvania and several counties in Maryland) that have less than 1 pt m −2 as well as newly collected higher-quality lidar data (i.e. Delaware and a few counties in Maryland). Studies had indicated lidar tends to underestimate actual heights with decreasing point density (more likely to miss the canopy top), and suggested to further normalize by integrating CHM to coarser resolutions (e.g. 5 m) (Zhao et al 2018). We also used maximum canopy height at 10 m resolutions and integrated into 30 and 90 m mean max canopy height metrics (Max10Avg), which was expected to dampen much of the low sampling effect. This mean height metric as expected was always among the best predictors (figure S3). However, lidar measured canopy height and canopy cover were only moderately correlated with biomass and with the best metrics explaining around 30%-40% variability of biomass in general ( figure S2). These findings suggested that the ability of lidar data in predicting biomass are limited in certain regions, particularly in eastern temperate broadleaf and mixed forests and urban forests with widely varying management and growth patterns.
A second major uncertainty arose from the allometric models and carbon fraction used in biomass and carbon estimation. A study over the mixed forests of Sonoma County, California indicated that the choice of local or national allometric models (CRM versus Jenkins) resulting in a 20% difference in biomass estimates at plot to county scales . Further, the complexities associated with quantifying biomass accurately with lidar in northeastern temperate broadleaf and mixed forests are increasing acknowledged (Anderson et al 2008, Huang et al 2015, Weiskittel et al 2015. Two recent studies indicated insufficient sampling for the parameterization of allometric models leads Table 4. Estimates of aboveground carbon density (Mg C ha −1 ) and carbon stock (Tg C) in forest from this study (RFjks), four national and one global map products, and FIA estimates (FIAjks and FIAcrm) over (a) All, (b) Forested, and (c) Non-Forest area summarized by states and the Tri-State region. RFjks, NBCDjks and FIAjks represent estimates using Jenkins allometry. RFcrm, NBCDcrm and FIAcrm represent estimates using CRM allometry. to bias in these temperate forests , Weiskittel et al 2015. Our research concurred with this and also showed that at sub-hectare plot scales, there was such a wide variability in basal area for a given tree height that it is very difficult to predict biomass with more than 60% variability (figure S1) with three-dimensional structure and height from airborne lidar with low point density (∼1 pt m −2 ). However, this is not to say that the results cannot be enhanced with greater datasets. Such improvement would require a better knowledge about the allometric relationship between biomass and forest structure from field survey, potentially available from recent advances in terrestrial laser scanning (Liang et al 2016). In addition, a consistent fraction (0.5) was used when converting biomass to carbon, which may overestimate carbon in angiosperms in the Tri-State region (Martin et al 2018). Alternatively, we would expect to improve results by testing refined stratification over the study area in our future study. Third, uncertainty occurred because of the mismatch and time lag between lidar metrics and field measurements. For instance, the lidar metrics were calculated at 30 m equivalent pixels (i.e. a square size) for mapping purposes, but the field measurements were taken in different ways. The FIA plot data were sampled using a four-subplot methodology, not all the trees within the whole plot were measured. Our independently collected UMD plots were measured using a mixture of variable and fixed radius methods (Huang et al 2015). Although preliminary comparisons between variable and fixed radius plots did not show a substantial difference in model performance (Dubayah 2012), we found errors were introduced because of the complexity and heterogeneity in the landscape as well as partial alignment of plots and lidar data. Additionally, due to lack of coincident field and lidar data, we used FIA plot that have time lags with lidar data. Only 20% of the FIA plot were measured in the same acquisition year as lidar data, while 72% and 8% of the FIA plots have 1-2 and 3-5 year time differences, respectively (table S2). The Outer Coaster ecoregion model had the largest time difference, where 36.2% of the FIA plot had a 3-5 year difference, showed an underestimation (MBE=−17.4 Mg ha −1 ) in the independent validation using UMD plot. However, the mismatch and associated errors were reduced when estimates were scaled-up and comparing results at county-scale. These findings are consistent with another recent study (McRoberts et al 2018b).
Lastly, the varied definition of forest/non-forest among different forest cover, biomass and carbon products have made the comparison between different products difficult. Traditionally, the FIA program did not consider plots located in urban regions that had some forest cover, resulting in a possible underestimation in carbon from urban forests (Johnson et al 2014. For example, there are approximately 1000 FIA plots distributed across the state of Maryland, more than half fell in areas considered non-forest or had conditions that precluded them from being measured. This was especially an issue in areas with highly fragmented land use, such as in the two-county pilot regions where 70% of the plots were non-forest. A previous study  applied non-forest biomass gap-filling methods to non-forest plots that had tree cover, revealing that 15% of the total biomass in Maryland was accounted for in non-forest conditions. USFS has started a new URBAN FIA Program that will help fill this non-forest biomass gap in urban areas (USFS 2018b).

Conclusion
We presented an evaluation and validation of a framework to quantify aboveground biomass over large areas at high-resolution (30 m). Random Forest models for four ecological regions were calibrated by linking lidar and auxiliary metrics to estimates of biomass from FIA plots. UMD plots were used for independent validation, and spatial errors were examined at the pixel scale using QRF. Across the Tri-State region, we estimated ∼680 Tg C (13 DE, 103 MD, 564 PA) in aboveground biomass circa 2011. Our results indicated that allometrically-derived biomass estimates and lidar-derived biomass estimates are moderately correlated with FIA plot measurements (R 2 =0.46-0.54, RMSE=51.4-54.7 Mg ha −1 ) and UMD plot data (R 2 =0.33-0.61, RMSE=65.3-100.9 Mg ha −1 ).
Additionally, we conducted detailed map comparisons of forest biomass/carbon to national products at pixel-, county-, and state-scales. Although estimates of mean biomass density from different national products converged at county scales (figures S8 and S9), the biomass map from this study reveals more details at finer scales (figure 4). Pixel-scale comparisons showed great differences in estimates of biomass over forested and non-forested areas. County-scale and state-scale map estimates were close to FIA sample based estimates over forested areas, but diverse over non-forested areas. Particularly, total biomass from lidar or InSAR-optically fused products (e.g. RFjks, NBCD, Saatchi, and GlobBiomass) agree better with the FIA sample-based estimates than those values from other maps (Wilson and Blackard). This is consistent with our previous study over mixed conifer forest in Sonoma, California , where an underestimation in total and mean biomass were found when missing direct height information from active sensors (e.g. lidar or InSAR).
Local, high-resolution lidar-derived biomass maps provided a valuable bottom-up reference to improve the analysis and interpretation of large-scale mapping efforts, and future development of a national-scale carbon monitoring system. Such high-resolution products can be used in support of policy decision can also be used to support subnational policy. For example, the Maryland State's 2013 Forest Protection Act used the canopy cover product to set the minimum limit on forest cover for the state legislation. The empirical and modeling biomass/carbon products are actively being incorporated into planning for the Greenhouse Gas Reduction Act in Maryland. Additionally, operational lidar was proposed to fly over more states as part of USGS's 3DEP efforts (USGS 2018). New spaceborne lidar will offer more consistent acquisition such as from Global Ecosystem Dynamics Investigation (GEDI) lidar  and Ice, Cloud, and land Elevation Satellite-2 (ICESat-2). Extending the framework presented in this research to national scales using newer and consistent lidar data in conjunction with cloud computing will facilitate a truly comprehensive and high-resolution carbon monitoring system.

Acknowledgments
This work was funded by the NASA Carbon Monitoring System (NASA-CMS) projects NNX12AN07G, NNX14AP12G and 80NSSC17K0710. We would express special thanks to helps from local and federal agencies, including Erin Silva from Department of Natural Resources in Maryland, Bud Gudmundson from the County GIS office and Roger Barlow from the USGS for providing raw lidar data, DEM and auxiliary datasets, as well as FIA MOU program and UMD CMS team for providing field data. Particularly, we are thankful to the large field crews who helped collect data in particular Rachel Moore and Dr Amanda Whitehurst. We are appreciative to Dr Xiaoguang Cheng, who developed the prototype C++ scripts for calculation of lidar metrics. We are also thankful to Dr Sassan Saatchi for providing us the CMS continental scale maps, Dr Andy Lister for the critical discussions, Dr Laura Duncanson for her valuable comments that made the manuscript stronger.
The canopy cover, height, and aboveground biomass products from this study were archived to the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) (Dubayah et al 2018).