The role of predictive model data in designing mangrove forest carbon programs

Estimating baseline carbon stocks is a key step in designing forest carbon programs. While field inventories are resource-demanding, advances in predictive modeling are now providing globally coterminous datasets of carbon stocks at high spatial resolutions that may meet this data need. However, it remains unknown how well baseline carbon stock estimates derived from model data compare against conventional estimation approaches such as field inventories. Furthermore, it is unclear whether site-level management actions can be designed using predictive model data in place of field measurements. We examined these issues for the case of mangroves, which are among the most carbon dense ecosystems globally and are popular candidates for forest carbon programs. We compared baseline carbon stock estimates derived from predictive model outputs against estimates produced using the Intergovernmental Panel on Climate Change’s (IPCC) three-tier methodological guidelines. We found that the predictive model estimates out-performed the IPCC’s Tier 1 estimation approaches but were significantly different from estimates based on field inventories. Our findings help inform the use of predictive model data for designing mangrove forest policy and management actions.


Introduction
Forest carbon offset programs are controversial, partly due to the high levels of uncertainty associated with estimating carbon fluxes from land use change (Grassi et al 2008, Griscom et al 2009, Vanderklift et al 2019). The validity of these programs for mitigating climate change depends in part upon these estimates and it is therefore important for them to be accurate (Grassi et al 2017). One key step in accurately estimating the climate benefits from these programs is the estimation of baseline carbon stocks, or the reference levels upon which potential project interventions are evaluated (Bento et al 2016, Gren and Zeleke 2016). Despite their importance, obtaining accurate estimates of baseline carbon stocks can be a barrier for program design due to the costs of implementing statistically valid field inventories. There has consequently been longstanding interest in improving both the accuracy and precision of baseline carbon stock estimates at lower costs (Willcock et al 2012, Langner et al 2014). The Intergovernmental Panel on Climate Change (IPCC) is the foremost authority on inventorying ecosystem carbon stocks. The IPCC provides a three-tier system for categorizing the accuracy and uncertainty of baseline carbon stock estimates (IPCC 2003). Under the IPCC's guidelines, the Tier 1 and Tier 2 approaches use global and regional default parameters, respectively. The Tier 3 approach uses 'higher-order methods,' which may include models or field data from national forest inventories to meet country-specific conditions. Inventorying baseline carbon stocks under the Tier 3 approach provides the highest data quality but is the most complex and resource-demanding (Kovacs et al 2011).
To better understand global variation in forest carbon and potentially provide baseline carbon stock estimates under a Tier 3 approach, global maps of carbon stocks are increasingly being produced using predictive modeling. Modern classification techniques (e.g. machine learning algorithms), access to remotely sensed data, and larger compilations of empirical data have enabled these models to accurately predict trends in environmental variables from global to sub-regional scales (Saatchi et al 2011, Baccini et al 2012, Ge et al 2014). The benefits of these models include wall-to-wall mappings of environmental variables, which can account for broad-scale variation in forest carbon stocks or land use change (Herold et al 2019). The shortcomings, on the other hand, include relatively coarse spatial resolutions as well as the risk of introducing biases when correlating remotely sensed metrics to field data. Despite their promise, it remains unclear i) whether the estimates are sufficiently accurate for designing forest carbon programs at local scales, and ii) how these global predictive models fit best within the IPCC's three tiers of approaches for estimating carbon stocks (Hill et al 2013, Langner et al 2014).
Mangroves are one ecosystem for which accurate estimates of baseline carbon stocks from predictive models would be highly valuable (Macreadie et al 2019). Mangroves provide many environmental and social benefits, including the stocking of large amounts of organic carbon (Gedan et al 2011, Donato et al 2011). As a consequence, mangrove-holding nations are interested in 'blue carbon' projects, or the financing of mangrove conservation and restoration through forest carbon programs (Ullman et al 2013, Macreadie et al 2017, Hamilton and Friess 2018). However, quantifying baseline carbon stocks in mangroves is particularly resource-demanding due to limited accessibility and the importance of the soil organic carbon pool. Numerous predictive models of mangrove carbon stocks have consequently emerged in recent years, and may potentially meet the demand for accurate estimates of baseline carbon stocks (Hutchison et al 2014, Jardine and Siikamäki 2014, Sanderman et al 2018, Simard et al 2019). Although a number of studies have compared predictive models of forest carbon stocks against empirical data for pan-tropical forests, no study has done this for mangroves despite their explicit inclusion in the 2013 Supplement to the IPCC Guidelines for National Greenhouse Gas Inventories (IPCC 2014).
The lack of such a study is a key gap in the literature as pan-tropical forest carbon maps are often inaccurate for mangroves due to unique ecological conditions. For example, tidal dynamics greatly influence remotely sensed imagery often used to produce these pan-tropical maps, potentially inducing high levels of uncertainty (Lagomasino et al 2019). Operationalizing predictive models of mangrove carbon stocks for forest carbon program design thus requires assessing the accuracy of these datasets as well as guidance on their use.
The goal of this study was to (i) compare estimates of baseline carbon stocks in mangroves derived from predictive model data against stock estimates derived through the IPCC's methods, and (ii) assess the accuracy of the predictive model data estimates against statistically valid field inventories. To do so, we compared estimates of baseline carbon stocks built off predictive model data against the IPCC's approaches for mangroves located along four coastlines of the globe. We compared the four estimates to gain insight into potential biases, shortcomings, and benefits of each of the approaches. While the results are directly relevant for the blue carbon community, the study also provides guidance on the role of predictive models in environmental decision-making.

Study sites
We estimated ecosystem carbon stocks for mangroves along four coastlines of the world: (a) the northwest coast of the United Arab Emirates, (b) the Brazilian coast south of the Amazon river, and both (c) the western and (d) eastern coasts of peninsular Thailand (figure 1). The sites were selected to capture a range of mangrove climatic and geomorphological variation (table 1), including arid mangroves (UAE), sites heavily influenced by fluvial transport of sediment (Brazil and eastern Thailand), and tidally-dominated estuaries (western Thailand). Furthermore, only sites that used standardized methods and had field inventory data not included in the predictive model parameterization were used. Each of the sites was sampled with the primary objective of estimating site-level carbon stocks, and each of the sampling regimes used protocols that were designed specifically to meet the IPCC's Tier 3 approach. Additional details of the sites and our selection criteria for inclusion are provided in the supplementary material.

Estimation approaches
We compared baseline carbon stock estimates at each site using four different approaches. It is worth noting that there are errors and biases inherent to estimates of baseline carbon stocks derived from both field inventories and predictive models, and objective comparisons of the approaches are limited by the absence of 'true' values of extant carbon stocks (Hill et al 2019). However, it is valid to assume that each of the approaches provides an independent estimate of the 'true' values of site-level baseline carbon stocks, and thus their comparison is informative. We followed each of the IPCC's Tier 1, Tier 2, and Tier 3 approaches for estimating baseline carbon stocks, which are defined in terms of increasing methodological rigor. The Tier 1 and 2 approaches use global default parameters and country-level data on baseline carbon stocks, respectively. The Tier 3 approach uses empirical data that account for site-specific conditions and are collected through statistically valid field inventories. In addition to the Tier 1, Tier 2 and Tier 3 approaches, we also performed site-level pseudo-inventories by extracting carbon stock data from the modeled datasets at each of our field plots. We then compared the plot-level and site-level estimates of baseline carbon stocks using each of the estimation approaches.

Field inventories
Field inventory data were collected using variations of the Kauffman and Donato protocols for sampling forest structure and carbon stocks in mangrove forests (Kauffman and Donato 2012). The protocols were designed to fit the IPCC's Tier 3 approach for estimating baseline carbon stocks. We sampled the sites in Thailand and obtained plot-level field inventory data for the UAE and Brazilian sites from published datasets that used the same protocols (Schile et al 2017, Kauffman et al 2018b). All field inventories were designed with the stated purpose of estimating site-level ecosystem carbon stocks. The boundaries of the sites under consideration were delineated using geographic information systems software. Transects consisting of five to six circular plots at 25 m intervals were randomly located and placed perpendicular to the shoreline within each mangrove forest, allowing for unbiased estimation of site-level ecosystem carbon stocks. Within each plot, all trees were identified to species and their stem diameters at breast height were recorded. Additionally, soil cores up to 2 m depth were collected from the center of each plot with a Russian peat auger. Biomass carbon was estimated by converting diameter at breast height measurements to volume estimates using species-specific allometric equations when available. In the absence of species-specific equations, a general allometric equation for mangroves with species-specific wood densities was used (Komiyama et al 2008). Soil carbon was estimated by coring each plot, collecting 5 cm soil samples at five depth intervals (0-15, 15-30, 30-50, 50-100, and 100-200 cm), and processing the samples for percent organic carbon, bulk density, and soil organic carbon density.
Minor variations in the laboratory analyses of soil carbon existed across the studies, but all methods used widely accepted techniques for deriving bulk density (drying until constant mass) and percent organic carbon (dry combustion with an elemental analyzer) (Robertson 1999). Soil organic carbon density was calculated as the product of percent organic carbon and bulk density. Despite the coring to a maximum of 2 m depth, we only examined soil organic carbon stocks in the top meter of soil to match the predictive model data. The field inventory methods are described in full detail in the supplementary information, as well as in the other publications associated with the published datasets (Bukoski et
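The soil carbon calculation described above can be sketched as follows. This is a minimal illustration rather than the procedure used in the study: it assumes percent organic carbon is expressed on a dry-mass basis, bulk density is in g cm−3, and each subsample represents its whole depth interval. The function name and the example values are hypothetical.

```python
# Depth intervals (cm) of the field protocol described in the text.
INTERVALS_CM = [(0, 15), (15, 30), (30, 50), (50, 100), (100, 200)]


def soc_stock_mg_per_ha(percent_oc, bulk_density, max_depth_cm=100):
    """Sum soil organic carbon stock (Mg C ha^-1) down to max_depth_cm.

    percent_oc   -- percent organic carbon per interval (hypothetical inputs)
    bulk_density -- bulk density per interval, in g cm^-3
    """
    total = 0.0
    for (top, bottom), oc, bd in zip(INTERVALS_CM, percent_oc, bulk_density):
        # Thickness of the part of this interval that lies above the limit.
        thickness = max(0, min(bottom, max_depth_cm) - top)
        carbon_density = (oc / 100.0) * bd  # g C cm^-3
        # 1 g C per cm^2 of ground area equals 100 Mg C ha^-1.
        total += carbon_density * thickness * 100.0
    return total


# Illustrative values only: 5% organic carbon and 0.5 g cm^-3 at all depths.
top_metre = soc_stock_mg_per_ha([5, 5, 5, 5, 5], [0.5] * 5)
```

Setting `max_depth_cm=100` mirrors the restriction to the top meter of soil; passing 200 would include the deepest interval.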

Pseudo-sampling using predictive model data
We performed a pseudo-inventory of each site using the locations of the field inventory plots by substituting predictive model data for field data. We used two raster maps at 30 × 30 m spatial resolution to extract modeled estimates of aboveground biomass and soil organic carbon to 1 m depth using the Simard et al and Sanderman et al datasets, respectively (Sanderman et al 2018, Simard et al 2019). The Simard et al mangrove biomass data were produced by extracting mean canopy height from synthetic aperture radar data and converting the measurements to biomass estimates using allometric equations. The Sanderman et al dataset of soil organic carbon was produced using the random forest algorithm to predict soil organic carbon in mangroves as a function of globally coterminous covariates. We provide additional details of the predictive models in the supplementary information. We used the plot-specific coordinates to extract the modeled estimates of aboveground biomass and soil organic carbon from each sampling plot. We excluded plots whose geographic coordinates either could not be confirmed or did not align with the extents of the modeled data. Aboveground biomass was converted to aboveground biomass carbon using the IPCC's conversion factor of 45.1% dry-weight biomass to biomass carbon. Accurate estimates of belowground biomass are lacking due to the difficulties of field sampling root biomass, and predictive models of belowground biomass in mangroves consequently do not exist (Adame et al 2017). While we excluded belowground biomass from our statistical tests, we calculated rough estimates using a simple root-to-shoot factor for mangroves of 27.8% and a belowground dry-weight biomass to biomass carbon ratio of 39% for a more complete picture of ecosystem level carbon stocks (Donato et al 2011, Kauffman and Donato 2012).
Others have recommended the adjustment of belowground biomass based on salinity and stem density; however, these variables are absent for our plots and we did not apply this correction (Adame et al 2017). For those plots that were less than 1 m in soil depth, we adjusted the predictive model estimates of soil organic carbon to the actual soil depth of the plot given that the modeled soil organic carbon data are estimated at 1 m depth.
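The pseudo-inventory steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the workflow used in the study: the raster is represented as a plain nested list with a north-up grid, the function names are hypothetical, the root-to-shoot factor is read as belowground biomass divided by aboveground biomass, and the depth adjustment is assumed to be a linear scaling of the 1 m estimate (the text does not specify the exact adjustment).

```python
AGB_TO_CARBON = 0.451   # aboveground biomass-to-carbon fraction from the text
ROOT_TO_SHOOT = 0.278   # belowground biomass as a fraction of aboveground
BGB_TO_CARBON = 0.39    # belowground biomass-to-carbon fraction from the text
MODEL_DEPTH_CM = 100    # modeled soil carbon refers to the top 1 m


def sample_grid(grid, x_min, y_max, pixel_size, x, y):
    """Return the grid value at (x, y), or None if the point is off the grid.

    grid is a list of rows with row 0 at the northern edge: a hypothetical
    stand-in for a 30 m raster (a real workflow would use a GIS library).
    """
    col = int((x - x_min) // pixel_size)
    row = int((y_max - y) // pixel_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None  # plot does not align with the extent of the modeled data


def biomass_carbon(agb):
    """Convert aboveground biomass (Mg ha^-1) to (AGC, BGC) in Mg C ha^-1."""
    agc = agb * AGB_TO_CARBON
    bgc = agb * ROOT_TO_SHOOT * BGB_TO_CARBON
    return agc, bgc


def depth_adjusted_soc(soc_1m, plot_depth_cm):
    """Scale modeled 1 m soil carbon to a shallower plot (assumed linear)."""
    return soc_1m * min(plot_depth_cm, MODEL_DEPTH_CM) / MODEL_DEPTH_CM


# Illustrative 2 x 2 biomass grid with 30 m pixels; values are made up.
agb_grid = [[120.0, 80.0], [60.0, 40.0]]
agb = sample_grid(agb_grid, x_min=0, y_max=60, pixel_size=30, x=45, y=15)
agc, bgc = biomass_carbon(agb)
```

Returning `None` for off-grid coordinates makes the exclusion of misaligned plots explicit, so that such plots can be dropped before any site-level means are computed.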

Calculation of Tier 1 and Tier 2 estimates
We calculated Tier 1 and Tier 2 estimates of ecosystem carbon stocks using global and regional default factors, respectively. For the Tier 1 estimates, we used default parameters for mangroves specific to different climatic zones from the IPCC Guidelines (IPCC 2014). While the IPCC Guidelines were recently updated, the specific guidance for wetlands was not refined (Lovelock et al 2019). Losses from the soil organic carbon pool under shifting forest management practices are assumed to be non-existent under the Tier 1 approach, and we therefore omitted the soil organic carbon pool from our Tier 1 estimates. The IPCC's Tier 2 methods are analogous to Tier 1 methods but use country- or region-specific estimates of ecosystem carbon stocks to reduce uncertainty. For the Tier 2 estimates, we used ecosystem carbon stock estimates from published studies out of the same region. Specifically, we used a regional inventory from Southeast Asia, an inventory from mangroves in Northeastern Brazil, and two studies quantifying biomass and soil organic carbon stocks for mangroves from the Red Sea (Donato et al 2011, Abohassan et al 2012, Almahasheer et al 2017, Kauffman et al 2018a). Additional details are provided in the supplementary information file.

Statistical analyses
We calculated mean baseline carbon stocks for all sites using each of the four approaches. For those approaches that allowed estimation of uncertainty, we also report the standard error of the mean. Normality in the field inventory and model-derived data was assessed using Shapiro-Wilk tests and quantile-quantile plots. We tested for significant differences in baseline carbon stocks between the field inventory and model-derived estimates. To account for spatial autocorrelation within transects, biomass carbon and soil organic carbon from all plots within the same transect were averaged for both the field inventory and model-derived data prior to the statistical tests. The statistical tests were performed with one-way analysis of variance for normally distributed data and non-parametric Kruskal-Wallis analysis of variance for non-normally distributed data.
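The transect averaging and the two-group Kruskal-Wallis comparison can be sketched with the standard library alone. This is a minimal illustration, assuming plot data arrive as (transect, value) pairs and that exactly two groups are compared (so the test has one degree of freedom); the function names and numbers are hypothetical, and in practice an established statistics package would normally be used.

```python
import math
from collections import defaultdict


def transect_means(plots):
    """Average plot values within each transect.

    plots -- iterable of (transect_id, value) pairs (hypothetical layout)
    """
    groups = defaultdict(list)
    for transect_id, value in plots:
        groups[transect_id].append(value)
    return [sum(v) / len(v) for v in groups.values()]


def kruskal_wallis_two_groups(a, b):
    """Tie-corrected Kruskal-Wallis H and p-value for two samples (df = 1)."""
    pooled = sorted(a + b)
    n = len(pooled)
    rank_of = {}   # average rank for each distinct value
    tie_term = 0   # sum of t^3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = sum(sum(rank_of[v] for v in s) ** 2 / len(s) for s in (a, b))
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    h /= 1 - tie_term / (n ** 3 - n)  # undefined if every value is identical
    h = max(h, 0.0)
    # Chi-square survival function with one degree of freedom.
    p = math.erfc(math.sqrt(h / 2))
    return h, p


# Illustrative values only (Mg C ha^-1), not data from the study.
field = transect_means([("t1", 90), ("t1", 110), ("t2", 150), ("t3", 200)])
model = transect_means([("t1", 60), ("t1", 70), ("t2", 80), ("t3", 95)])
h_stat, p_value = kruskal_wallis_two_groups(field, model)
```

Averaging within transects before testing means each transect contributes one observation, which is how the text handles spatial autocorrelation among neighboring plots.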

Results
The estimates of baseline carbon stocks varied by both site and estimation approach. Figure 2 shows the ecosystem carbon stocks for the individual sites using each of the four estimation approaches. The Tier 1 estimates do not incorporate soil organic carbon and therefore differed substantially from the other estimation approaches at an ecosystem level. Given that the sites only fell within two of the IPCC's climatic classes for mangroves, only two Tier 1 parameters were used (33.8 Mg C ha−1 for the UAE site, and 86.6 Mg C ha−1 for all others). The Tier 2 estimates (regional defaults) both over- and under-estimated baseline ecosystem carbon stocks relative to the Tier 3 field data (table 2). Visual comparison of baseline carbon stock estimates using the field inventory vs. predictive model data revealed significant biases, particularly for aboveground biomass carbon.
Pooling the data across all sites, we did not find a significant difference in aboveground biomass carbon for the field inventory vs. predictive model data (Kruskal-Wallis test, χ² = 2.19, p = 0.1). However, for the soil organic carbon data, we found a significant difference between the field inventory and predictive model data when pooling across all sites (Kruskal-Wallis test, χ² = 14.9, p < 0.001). The results were variable for individual sites. Only one of the five sites showed a significant difference in aboveground biomass carbon stock estimates whereas four of the five sites had significant differences in soil organic carbon estimates (table 3).

Discussion
Our results reveal substantial differences in baseline carbon stock estimates that arise from the estimation approaches. The results suggest that estimating site-level baseline carbon stocks in mangroves using default factors is inaccurate and does not account for important regional and local variation. If we assume the field inventory data are the most accurate for estimating true carbon stocks (as is widely done), it is clear that the predictive model data better approximate these estimates compared to the IPCC Tier 1 defaults and may outperform the Tier 2 approach in certain cases. These results parallel similar findings for predictive models of biomass in tropical forests more generally and suggest that the widespread availability of predictive models of biomass may obviate the need for the IPCC's default factors at global scales (Langner et al 2014).

Results of the four approaches for estimating baseline carbon stocks in mangroves
For the sites in which the Tier 2 estimates closely approximated the site geomorphology (i.e. neighboring sites rather than regional inventories; Brazil and the United Arab Emirates), the Tier 2 estimates based on field data better approximated site level values than estimates from predictive model data. However, the estimates derived from predictive model data better approximated the field inventory estimates than the Tier 2 estimates for the sites in Thailand. These results suggest that while the predictive models are capable of accounting for regional scale variation in ecosystem carbon stocks, this ability begins to break down at local scales. For mangroves, these differences at sub-regional scales are likely a result of differing mangrove typologies, which may depend upon the particular hydrological, sedimentary, or climatic conditions at a given site. While previous studies have provided country-level estimates of mangrove carbon stocks, a potentially promising and more ecologically-informed update would be to produce country-specific default mangrove carbon stocks by mangrove typology (e.g. lagoon vs. deltaic vs. estuarine sites) (Hamilton and Friess 2018, Rovai et al 2018).
Despite the promise of predictive models for improving default estimates of carbon stocks, our statistical comparisons of field inventory vs predictive model carbon stock estimates at the site level reveal significant differences. The findings emphasize that even with the relatively fine spatial resolution of the predictive models (30 m), caution should be taken in their use at site-level scales. These differences are particularly pronounced at the pixel level, confirming the warnings of model producers against use of products at local scales (panel b of figure 2). While we acknowledge that direct comparisons of the plot-level field inventory and predictive model estimates of carbon stocks are not valid due to their differing spatial footprints, we visualize the data to further emphasize this point. Visual inspection of plot-level carbon stock estimates against a one-to-one line (i.e. perfect alignment of stock estimates from field inventory and predictive model data) indicates that the variation in field inventory aboveground biomass at the plot level was not captured by the predictive models (figure 2). Estimates of aboveground biomass from the predictive model data ranged from < 1 to 114.4 Mg C ha−1 across all sites whereas the estimates from the field inventories varied from < 1 to 490.3 Mg C ha−1. Although it is not possible to say for certain, the use of different allometric equations (regional-level equations based on height for the predictive model vs. species-specific equations based on diameter at breast height for the field inventories) likely contributed to the differences in plot-level estimates of biomass. Other sources of uncertainty may have included geolocation errors, error propagation and differences in timing of measurements.

Figure 2 | [...] shows Tier 3-model vs. Tier 3-field estimates of plot-level carbon stocks for the aboveground biomass and soil organic carbon pools. The SOC estimates in panel (a) are constrained to 1 m for the T2, T3m and T3f estimates. The Arabian Gulf plots are from the United Arab Emirates, the Coast of Para plots are from Brazil, and the Krabi River Estuary, Pak Panang Mangrove and Palian River Estuary are from Thailand.

Table 3 | Results of statistical tests for differences in site-level carbon pool estimates using predictive model vs. field inventory data.
The tests are performed for aboveground biomass carbon (AGC) and soil organic carbon (SOC) constrained to a maximum of 1 m depth. All values are in Mg C ha−1. All statistical tests are performed with the non-parametric Kruskal-Wallis analysis of variance given non-normality in the data. Note: NS = not significant, * = significant at α = 0.1, ** = significant at α = 0.05, and *** = significant at α = 0.01; degrees of freedom = 1 for all tests.
Columns: Field-based (Mg C ha−1), Model-based (Mg C ha−1), χ², P-value.

Recommendations for the design of blue carbon projects
In considering our results, we recommend the use of predictive model outputs for estimating site-level baseline carbon stocks over global defaults (Tier 1) and regional inventories (Tier 2). The predictive model data can provide large improvements in accuracy and are freely available for those with capacity in geographic information systems (GIS). Free and open source GIS software is sophisticated, well-developed, and provides a readily accessible means to analyze the publicly available maps of mangrove carbon examined here. We further discuss the utility of GIS for designing blue carbon projects in the supplementary information. However, our results also indicate that Tier 2 estimates may out-perform predictive model estimates when using field data from neighboring sites with similar geomorphological and climatic conditions (e.g. see panel (a) of figure 2 for the Arabian Gulf and Coast of Para). It is important for blue carbon projects to justify their use of one data type over the other, and it may be most appropriate to provide both. Additionally, we advise caution in using predictive model data for decision-making at the within-site level despite their high spatial resolution. Methodological differences in producing the datasets may bias estimates of carbon stocks, and the datasets may ultimately be ill-suited for interventions that are not uniform across space. A hybrid approach that uses the predictive model outputs for stratifying sampling regimes may hold promise in reducing uncertainty at lower costs. The aboveground biomass model is based on a remotely sensed measure of canopy height, which is an appropriate variable by which to stratify sampling regimes of mangrove biomass. Should programs have capacity in GIS analyses on hand, significant cost reductions can be achieved by using predictive model data to inform stratified inventories (Tang et al 2018). Ultimately, a combination of model-derived data and field inventory data may provide the best balance of cost-efficiency and accuracy in estimating baseline carbon stocks.
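One way the hybrid approach could work is sketched below: pixels are grouped into canopy-height strata and field plots are allocated to each stratum in proportion to its area. This is an assumption-laden illustration, not a method from the study; the stratum breaks, the proportional allocation rule, the function name and the example heights are all hypothetical.

```python
def allocate_plots(canopy_heights, breaks, total_plots):
    """Allocate plots to canopy-height strata in proportion to stratum area.

    canopy_heights -- modeled canopy height (m) per pixel (hypothetical values)
    breaks         -- ascending upper bounds for all but the tallest stratum
    total_plots    -- number of field plots the program can afford
    """
    counts = [0] * (len(breaks) + 1)
    for height in canopy_heights:
        # The number of breaks a height exceeds gives its stratum index.
        stratum = sum(height > b for b in breaks)
        counts[stratum] += 1
    n_pixels = len(canopy_heights)
    # Proportional allocation; rounding can shift the total by a plot or two.
    return [round(total_plots * c / n_pixels) for c in counts]


# Illustrative pixel heights only: three strata split at 5 m and 10 m.
heights = [2, 3, 4, 8, 9, 15, 16, 17, 18, 20]
plots_per_stratum = allocate_plots(heights, breaks=[5, 10], total_plots=20)
```

Proportional allocation is only one option; a program could instead weight strata by their expected variability, which would require prior information on within-stratum variance.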
It is important to note that the epistemic stance of this paper emerges primarily from the field of predictive modeling. While accurate estimates of carbon stocks are of clear importance for advancing valid forest carbon programs in mangroves, there are other critical barriers that emerge from disciplines such as the field of environmental justice (Schroeder and McDermott 2014). For example, equitable benefit sharing, assent of local communities, and de-/centralization of governance can be equal, if not larger, barriers to forest carbon programs (Lovell 2015, Friess et al 2016). Our aim here is not to argue for more complicated measurements of forest carbon in mangroves but rather to situate the accuracy of publicly available datasets that may meet this need. While we only note the importance of these additional barriers to carbon forestry programs here, we provide additional discussion of them in the supplementary information.

Considerations for future field-based vs. model-based approaches
The uncertainty associated with not knowing the 'true' value of ecosystem carbon stocks will persist within forest carbon programs and is likely best addressed by a combination of field inventory and model-based data. Given the absence of 'true' values of mangrove carbon stocks at our sites, we cannot state whether the predictive model data or the field inventory data provide more accurate or more valuable estimates of baseline carbon stocks in mangroves. Field inventories provide nuanced measurements of environmental variables but are resource-demanding to collect and require the extrapolation of measurements from plot to stand or site-level scales. Conversely, predictive models provide direct estimates of forest metrics across broad regions but are limited in their ability to account for fine scale variation. While both have their strengths and limitations, they are capable of providing complementary information.
Numerous satellite missions with the primary objectives of estimating and monitoring ecosystem biomass will be launched from 2020-2030 (Herold et al 2019). These missions will be critical for measuring changes in forest biomass over broad scales, but will also need corresponding field inventory data to validate the measurements and calibrate the predictive models based upon them (Schepaschenko 2019, Chave et al 2019). Although limited in number, networks of large permanent plots exist for other tropical forest types that will facilitate the use of space-based estimates of forest biomass. However, to the best of our knowledge, permanent field plots of mangrove forest structure and biomass are largely absent. While the Kauffman and Donato protocols and the associated widespread collection of mangrove forest structure data have greatly benefited the mangrove community, the next phase of mangrove forest biomass estimation and monitoring should align with space-based missions capable of estimating ecosystem structure.

Conclusion
We tested the utility of predictive models to estimate baseline carbon stocks in mangroves, which are among the most carbon dense ecosystems globally. Our results show that predictive models are capable of providing more accurate estimates of ecosystem carbon stocks at local levels than the IPCC's Tier 1 default parameters. However, we also found that estimates of mangrove carbon stocks derived from predictive model data were significantly different from analogs based on comprehensive field inventories (IPCC Tier 3 approach). We recommend the use of predictive models in designing national or regional forest policy and strategies but also recommend caution in using them at local scales.