Ignoring variation in wood density drives substantial bias in biomass estimates across spatial scales

Rapid development of remote sensing and Light Detection and Ranging (LiDAR) technology has refined estimates of tree architecture and extrapolation of biomass across large spatial scales. Yet, current biomass maps show significant discrepancies and mismatch to independent ground data. A potential obstacle to accurate biomass estimation is the loss of information on wood density, which can vary at local and regional scales, in the extrapolation process. Here we investigate if variation in wood specific gravity (WSG) substantially impacts the distribution of above-ground biomass (AGB) across a range of scales from local plots to large regions. We collected wood cores and measured tree volume in 341 forest sites across large altitudinal and climatic gradients in Colombia. At all spatial scales, variation in WSG was substantial compared to variation in volume. Imputing study-wide average values of WSG induced regional biases in AGB estimates of almost 30%, consequently undervaluing the difference between forest areas of low and high average wood density. Further, neither stem size nor climate usefully predicted WSG when accounting for spatial dependencies among our sampling plots. These results suggest that remote sensing- and LiDAR-based projections to biomass estimates can be considerably improved by explicitly accounting for spatial variation in WSG, necessitating further research on the spatial distribution of WSG and potential environmental predictors to advance efficient and accurate large-scale mapping of biomass.


Introduction
The world's forests contain the largest pool of carbon in the biosphere (Bar-On et al 2018) and play an essential role in the global carbon budget (Pan et al 2011, Friedlingstein et al 2019. Carbon-focused conservation programs have been established to mitigate increasing levels of atmospheric carbon dioxide (Houghton et al 2015). For instance, multilateral REDD+ initiatives have raised more than 5 billion US dollars to incentivise reduction of tropical deforestation and forest degradation (Watson et al 2021). However, large uncertainties remain in the forest carbon stock estimates that underpin such Quantification of forest biomass traditionally is based directly on inventory data, but rapid development of remote sensing and LiDAR technologies provides new opportunities for accurate and automated measurements of forest structure, from individual tree architecture (Raumonen et al 2013, Kellner et al 2019 to regional and global stand-level estimates (Saatchi et al 2011, Baccini et al 2012, Asner and Mascaro 2014, Ferraz et al 2016. Such remote sensing techniques can provide tree volume estimates across large spatial scales, but lack the ability to detect variation in xylem densities, commonly represented as wood specific gravity (WSG): the oven-dry mass per unit water-saturated volume of wood, relative to the density of water (Williamson and Wiemann 2010). Species-average values range from 0.1 to 1.4 (Zanne et al 2009) and have an approximately multiplicative effect on individual tree biomass (Chave et al 2014), independent of volume (Phillips et al 2019). Any correlation between WSG and height or diameter of individual trees is weak or absent (Wittmann et al 2006, Martínez-Cabrera et al 2011, Fan et al 2012, Hietz et al 2017, Ubuy et al 2018, Phillips et al 2019. This absence of correlation extends to the scale of plots, where total basal area and LiDAR derived topof-canopy height carry insufficient information to predict average WSG (Asner et al 2012, Jucker et al 2018a, Muñoz Mazón et al 2020.
Two pantropical maps of forest biomass derived from satellite LiDAR data and wall-to-wall remote sensing products (Saatchi et al 2011, Baccini et al 2012 have been widely applied as reference data in ecological studies (Sullivan et al 2020, Walker et al 2020, Wigneron et al 2020. However, the maps contain spatial discrepancies with one another (Mitchard et al 2013) and interregional biases when compared to independent ground plots (Mitchard et al 2014, Avitabile et al 2016. Information on spatial variation in WSG could be lost in the intermediate LiDAR step between sparse plot data and continuous remotely sensed data as neither pantropical analysis accounts for any intracontinental differences in the relationship between above-ground biomass (AGB) and LiDAR-detected forest structure (Saatchi et al 2011, Baccini et al 2012. However, such a role of WSG in promoting the observed inconsistencies is debated (Mitchard et al 2014, Saatchi et al 2014.
Ground-plot studies have demonstrated differences in community-weighted WSG between distant tropical regions (ter Steege et al 2006, Asner andMascaro 2014), and local differences up to 20% between adjacent forests induced by shifting substrate properties ( But the degree to which such spatial variation in WSG control biomass, and how it emerges and develops across intermediate spatial scales, is unclear. Previous assessments of biomass estimation error driven by omittance of spatial variation in WSG show contrasting results (Asner et al 2012, Mitchard et al 2014, Saatchi et al 2014, Phillips et al 2019, but the studies diverge on the spatial scaling of analyses and weighting of upscaled WSG, and fail to account for potential spatially contingent bias introduced by intra-specific variation (Patiño et al 2009, Bredin et al 2020.
The spatial scale at which extrapolated biomass estimates are most reliable thus depends on the magnitude and scale at which average WSG varies and our ability to capture and retain such variation through extrapolation processes. In this study, we investigate the spatial pattern of variation in WSG across Colombia, quantify the relative errors induced on AGB estimates at different spatial scales when neglecting spatial variation in WSG, and explore the potential of environmental correlates to predict WSG across space. We address the current knowledge gap by explicitly assessing variance manifesting at different spatial scales and assuring relevance for biomass estimation and integration of intra-specific effects by applying volume-weighting of average WSG values and relying on locally collected WSG measurements.
The large-scale distribution of plots was selected haphazardly based on research permits and the access to primary and mature secondary forest at each location. Local scale plot placement was chosen without reference to the plot-scale forest structure or terrain except where the terrain constrained options for access on foot. Plot coordinates were determined by handheld GPS. For each plot, we extracted climatic variables from the WorldClim ver. 2 database (Fick and Hijmans 2017) and elevation and topographic slope from the AW3D30 ver. 2.2 dataset (Tadono et al 2014).
The sample plots spanned an elevational gradient from 133 to 3415 m above sea level. The total annual precipitation ranged from 910 to 3128 mm and the coefficient of variation of monthly precipitation ranged from 19.2% to 73.9%. Mean annual temperature ranged from 8.2 • C to 27.2 • C and showed strong correlation with elevation (Pearson's r = −0.99), and the standard deviation of average monthly temperatures ranged from 0.2 • C to 0.76 • C. Within areas, plots spanned an elevational range of 507 m on average (range 38-1243 m), and an average annual precipitation range of annual precipitation of 298 mm (range 18-933 mm; supplementary methods, table S1-1 available online at stacks.iop.org/ERL/17/054002/mmedia).

Field measurements
Within each plot, we measured the diameter at breast height (DBH) for all trees greater than 5 cm DBH. We estimated individual tree volume using the tree volume term from the Alvarez et al (2012) type II.4 biomass equation, which was developed from a dataset on felled trees across a range of forest types in Colombia.
Wood cores were extracted (two threads, 5.15 mm; Haglöf, Sweden) for a subset (mean = 44%) of trees in each plot. We collected taxonomic information for a limited number of stems (n = 28.4%) belonging to locally dominant species (Quercus humboldtii, n = 776; plot-specific morphospecies, n = 192; cluster-specific morphospecies, n = 19), and cored only a subset of each (Quercus humboldtii, n = 43; morphospecies, n = 2-5 per plot or cluster) (see supplementary methods, figure S1-1). We attempted to extract cores from all the largest trees in each plot and the largest individuals of each morphospecies, while the remainder of stems were cored at random (see supplementary methods, figure S1-2). The wood density (ρ wood ) of the core was measured as the dry weight to wet volume ratio (g cm −3 ), after removing the bark and pith from the sample. WSG is given as the unitless, standardized density of wood relative to the density of water (ρ water = 1 g cm −3 ). Cores were rehydrated prior to volume measurement and dried at 105 • C for at least 60 h before weighting (Díaz et al 2016). All field data is available online at https://data.mendeley.com/datasets/zzzzcnt2bd/1 (Saebø 2022).

Statistical analysis 2.3.1. Stem-level variation in WSG
We fit a linear mixed-effects regression model on the stem-level WSG data to investigate stem-level variation in WSG across our study, test for any size-WSG relationships that could support the use of LiDAR for biomass extrapolation, and derive uncertaintybounded estimates of WSG for uncored trees for further analysis. The full model for each WSG measurement i is where α is the overall intercept, β v is a fixed effect of the tree's volume, β plot[i] is a plot-specific random effect centred on a cluster-specific random intercept β cluster [j] , and where cluster j is centred on an areaspecific random effect β area[k] for area k. To avoid potential sampling bias due to plots with dominant tree species, the limited taxonomic information was added as an additional random effect β s for identified trees. The Boolean variable ID excludes this random effect for unidentified trees. As the intra-specific variances are captured in the β s random effect for identified trees only, we allowed separate variances for identified and unidentified trees.

Contribution of WSG to variation in above-ground biomass
We multiplied each plot's total tree volume by the plot-level volume-weighted WSG estimate (calculated as the uncertainty-bound volume-weighted average WSG of all stems in the plot) to derive final plot-level values of AGB in t ha −1 . We then used two strategies to assess how strongly variation in WSG contributes to the variation in AGB at different spatial scales. First, we calculated the between-plot coefficient of variation of AGB, WSG and volume within each region and across the study extent to assess their relative contribution to the variation in estimated AGB. We further calculated the between-cluster and between-area coefficients of variation across the study extent to indicate the relative change in variation in WSG and volume among sample units of different spatial extents. Second, we calculated the deviation between the volume-weighted WSG of individual spatial units and the volume-weighted average of the spatially superior unit (leaving out the unit of interest). For example, the observed deviation for the Amazon region was calculated as its volume-weighted WSG subtracted from the volume-weighted WSG of all other regions, while for an area within the Amazon it was calculated as its volume-weighted WSG subtracted from the volume-weighted WSG of all other areas within the Amazon. Proportional deviation was represented as the deviation relative to the observed WSG, and absolute deviations as the magnitude of the resulting differences in AGB estimates.
We judged whether these observed deviations significantly exceeded expectations from sampling variation alone by the 95% intervals of the differences between the observed WSG at each unit and a null distribution generated under the assumption of no spatial effects occurring at the spatial scale in question. We generated these null distributions for the volume-weighted WSG value of each spatial unit by repeatedly (n = 4000) resampling (with replacement) the volume-weighted WSG values of spatially subordinate units from the pool within the spatially superior unit (excluding the unit being tested), with each new sample consisting of the same number of subordinate units as the original unit being tested. For example, the null distribution for the Amazon region was generated by repeatedly drawing two areas (the number of areas in the Amazon) from all other regions with replacement, while the null distribution for an area within the Amazon was generated by repeatedly drawing eight clusters from all other areas in the Amazon with replacement.
We repeated the above procedure for clusters, areas and regions. Spatial units were ranked from subordinate to superior according to their spatial scale (i.e. from plots to clusters, areas, regions and the overall study area).

Predicting WSG from environmental variables
We regressed plot-level volume-weighted WSG on elevation, total annual precipitation, the coefficient of variation of average monthly precipitation, the standard deviation of monthly average temperature, and the topographic slope.
The Moran's I statistic indicated strong spatial autocorrelation in the model residuals. Therefore, we cross-validated the model while accounting for the spatial structure of the data. K-fold cross-validation divides the data into a number (k) of 'folds' . The model is refit k times on k-1 folds, iteratively calculating predictions and model performance statistics for the excluded fold. In leave-one-out cross validation, we iteratively excluded one of the 341 individual plots, while in larger fold cross-validation, we divided the data into folds corresponding to the size of clusters (108-fold), areas (23-fold) and regions (7-fold). To track the effect of spatial dependence on validation estimates, we performed a 'random' cross-validation for each fold size that assumed spatial independence among residuals, and a 'spatial' cross-validation that retained the spatial structure of the data.
Additionally, we fit a second model that included the same environmental covariates as well as random effects of either cluster or area. We compared these models to a model that included random effects while omitting the environmental predictors.

Model fitting and error propagation
We fit the models in JAGS (Plummer 2003) via R package jagsUI (Kellner 2019). We used weakly informative gamma priors (shape = 1, rate = 0.001) on all precision hyperparameters and normal priors (mean = 0-0.5, precision = 1-2) on all random effect parameters (see supplementary methods S1 and table S1-2 for the reasoning justification of our priors). We ran all models on four chains for a functionoptimized number of adaptation iterations (Kellner 2019). The individual tree model ran for 10 000 burn-in iterations and 50 000 sampling iterations, saving every 50th iteration for a total of 4000 posterior samples. Plot-level models ran for 1000 warmup and 10 000 (models without spatial effects) or 50 000 (models with spatial effects) sampling iterations. Every 10th or 50th sample were saved for a total of 4000 combined posterior samples across the iteratively fit models. Model fit and convergence was assessed using graphical predictive checks and the Gelman-Rubin R-hat estimator (Gelman and Rubin 1991).
Model performance was assessed from posterior estimates of the root mean squared error (RMSE), the squared Pearson correlation between model fitted and observed WSG (R 2 ) and deviance information criterion (DIC). Predictive performance of cross-validated models was assessed from posterior estimates of the expected log predictive density (elpd), the root of average squared prediction errors (RMSPE), and the squared Pearson correlation between model predicted and observed WSG (pR 2 ).
Uncertainty in stem-level WSG values was propagated through consecutive analysis by repeating all procedures across an array of volume-weighted plot-average WSG combining the field measurements for cored trees with 100 posterior draws from the final stem-level model for non-cored trees.

Spatial variation in stem-level WSG
The distribution of WSG, DBH, and tree volume estimates showed variation across our study sites (figure 2, supplementary methods, table S1-1). The relationship between individual tree volume and WSG was not detectably different from zero (figure 1(D), supplementary results, table S2-1). RMSE and Bayesian R 2 estimates indicated large residual variation and weak predictive capabilities at the individual tree level (supplementary results, table S2-1 and figure S2-1). However, the correlation between estimated and raw plot-level volumeweighted WSG was strong (Pearson's r = 0.95), indicating that our results do not hinge heavily on imputed WSG values for uncored stems (supplementary results, figure S2-1). The model estimated that around 60% of the spatial variance in stem-level WSG occurred between the 21 areas, 14% occurred between clusters within areas, and 26% occurred between plots within clusters (figure 3, supplementary results, table S2-1). The estimated random intercepts of spatial units of different regions overlapped but showed clear overall differences, with the Amazon portraying consistently high and the central Andean regions consistently low WSG at all spatial levels (figure 3).

Effect on above-ground biomass estimates of ignoring variation in WSG
The regional-scale AGB ranged from 307.4 t ha −1 in the lowland Amazon to 117.3 t ha −1 in the highest-altitude areas in the southern Central Andes (∼3000 m. msl., table 1). The coefficient of variation was mainly driven by large variations in tree volume, but the additional variability due to WSG was substantial. The between-area coefficient of variation was 32% of that of tree volume across the study extent (table 1). The variation in volume increased more substantially than the variation in WSG at smaller scales (table 1), indicating that the control of WSG on spatial patterns of biomass increase with increasing scales. Average volume weighted WSG ranged from 0.40 to 0.58 at the scale of regions, resulting in substantial deviations between individual regions and the full dataset (figure 4). Replacing region-specific WSG values with the overall average WSG value led to relative errors of up to 29% and absolute AGB errors of up to 41.4 t ha −1 (supplementary results, table S2-3, Central Andes north and Amazon, respectively). Area-level volume weighted WSG ranged from 0.37 to 0.63 s across the dataset and showed inter-regional spans of up to 0.09, resulting in inter-regional errors of up to 17% and 50.4 t ha −1 (supplementary results, table S2-4, Amazon area II and I). Clusterlevel volume-weighted WSG ranged from 0.34 to 0.74 across the dataset and showed inter-area spans of up to 0.22, resulting in inter-area errors of up to 30% and 69.6 t ha −1 (supplementary results, table S2-5, East Andes north VI cluster XI and West Andes I cluster I).
The null model approach demonstrated the existence of spatially driven variation among regions, beyond patterns of random variation and subscale covariance. The average WSG of three regions fell outside the expected value under the scenario of no spatial effect above the area level (figure 4, supplementary results, table S2-3). Spatial patterns at smaller scales were generally more consistent with null variation, but some spatial effects at the level of areas and clusters were indicated within the Amazon, northern Central Andes and West Andes. A total of five areas (supplementary results, table S2-4) and nine clusters (supplementary results, table S2-5) deviated significantly from their null distributions.

Predicting plot average WSG from environmental variables
The non-spatial and cluster random effect models for plot-level volume-weighted WSG estimated significant decreases in volume-weighted WSG along the elevational gradient of 0.035 and 0.039 per 1000 m altitude, respectively. No effects remained significant (i.e. 95% credible intervals do not overlap zero) when including random intercepts for each area    4. Regional deviations from the overall average WSG expressed relative to the regional WSG (left) and as absolute regional AGB estimates (middle), and regional WSG estimates (right). Coloured dots indicate observed values, with 95% confidence intervals derived from the uncertainty in stem-level WSG values. Grey dots indicate average values under the null assumption of no regional effect on WSG, generated by repeatedly resampling each region from the observed values of all other regions. 95% confidence interval combine stem-level uncertainty and variance among resampled regional units. Asterix ( * ) indicate significant difference (p < 0.05) between observed and resampled WSG. Cross-validation of the non-spatial model indicated that environmental covariates added little predictive power beyond through autocorrelative effects. RMSPE and pR 2 estimates decreased only slightly with increasing sizes of random folds, but deteriorated strongly when cross-validating to folds accounting for spatial dependencies in the data (table 2). The model with no environmental predictors provided better predictive performance than the environmental model in terms of elpd when cross-validating across areas and regions (table 2).

Discussion
WSG differ largely among species and individuals and alters the capacity of trees to accumulate biomass, but how this variation scales up to extents relevant to forest management and interpretation of carbon maps is poorly known. We observed that small-scale variance in stem-level WSG across diverse forests in Colombia accumulates across increasing spatial scales and substantially influences spatial variation in AGB. Consequently, ignoring the variation in WSG in our dataset across different spatial scales generated considerable regional biases in estimated AGB and an undervaluation of differences between spatial units. Comparison to null distributions assuming the absence of spatial patterning established that interregional differences were involved in generating the overall patterns of variance in WSG.

The importance of WSG in above-ground biomass estimation
LiDAR and remote sensing technologies provide us the ability to collect immense amounts of data on tree size, facilitating large-scale extrapolations of AGB estimates. However, weak and divergent relationships between WSG and tree size measurements Muñoz Mazón et al 2020) indicate that the divergence between the pioneer strategy of rapid volumetric growth and the old-growth strategy of slower growth but higher survival rates (De Souza et al 2016, Hietz et al 2017 do not translate into any general relationship between WSG and adult tree stature. The absence of any relationship between WSG and volume in our stem-level WSG model are consistent with these previous observations, collectively establishing that spatial information on WSG is effectively ignored at scales where AGB is extrapolated across volumetric data (e.g. LiDAR data), and any predictive power contingent on this information is lost.
The implications of this conclusion depend on the degree to which WSG controls spatial patterns of AGB, but previous literature has left this question unresolved. Evaluations of the importance of WSG in spatial AGB estimation have been limited to either the scale of individual forest plots or large regions (ter Steege et al 2006, Asner and Mascaro 2014, Mitchard et al 2014, Phillips et al 2019, leaving it unclear how variance in WSG arise and develops across different spatial scales. Robust conclusions on the scales we evaluated here have also been impeded by confounding between estimated spatial patterns and sampling variation. This issue is evident in the contradictory results on intra-Amazon regional variation (Mitchard et al 2014, Saatchi et al 2014, demonstrating that a scarcity of field data leaves average values of WSG sensitive to collection biases, and questioning the certainty of observed differences between large-scale tropical regions (ter Steege et al 2006, Asner and Mascaro 2014, Phillips et al 2019. Further, while phylogenetic conservatism (Chave et al 2006) has justified the use of taxonomically coarse database values of WSG in all the above studies, considerable spatial variation have been shown to occur within species (Patiño et al 2009, Bredin et al 2020.
By partitioning the variance to a range of spatial scales across a large and varied study area, propagating uncertainty in locally collected WSG estimates through the full analysis, and explicitly disentangling spatial effects from confounding sampling variation, our results provide robust clarifications on the effect of WSG on spatial AGB distribution. Notably, the importance of WSG as a control of AGB increased at larger scales, as indicated by the increasing variance in WSG relative to that of tree volume and a clear rejection of the null hypothesis of no difference between regions. The relatively small sample sizes within individual units at smaller spatial scales left the null analysis approach with weak power to confirm spatial effects between areas and clusters, but both the stemlevel and plot-level mixed models suggest that variation in WSG manifest on all spatial scales and has substantial effects on AGB distribution.
The magnitude of the observed spatial deviations in WSG and the resampled null distribution for each spatial unit depends strongly on the overall distribution of WSG. For example, increasing the number of samples from the significantly deviating north Andes region would strengthen their dominance over the overall sample distribution, altering the regional deviations and eventually pulling all null distributions outside their regional averages. The specific distribution of bias in our results are thus of lesser interest, as our overall sample distribution does not have any concrete analogue in any biomass map. How such bias distributes in individual products depends on the distribution of the extrapolated ground data. However, the prevalence of spatial variation in WSG and resulting observed deviations demonstrate that the efficiency gains for AGB estimation presented by developing technologies such as LiDAR are constrained by our ability to capture and retain such variation through the extrapolation process, and suggests that WSG should be a target for further improvements to our mapping capabilities.

Predicting WSG from environmental covariates
Detaching the extrapolation of WSG from the LiDAR framework may enable more accurate spatial estimates if appropriate correlates of WSG are identified. Stand-average WSG have previously been linked to latitude (Wiemann and Williamson 2002 The presence of universal large-scale predictors of WSG is thus unclear. In this study, none of the climatic predictors we assessed showed appreciable ability to predictive ability, with any modest explanatory power achieved via spatial autocorrelation (Dormann 2007) rather than any relationship between WSG and the predictors (see also Ploton et al 2020). Substantial effort is required to attain an applicable approach to discerning spatial patterns, by disentangling the seemingly complex and context dependent relationships between WSG and potential environmental predictors, to gain a comprehensive overview of representative forest type values, or establishing direct relationships with remotely sensed reflectance imagery (Jucker et al 2018b).

Conclusion
We suggest that WSG requires considerable research effort in the pursuit of accurate estimates of global distributions of forest biomass. The spatial patterns of AGB within the bounds of our study were considerably regulated by WSG across all scales, demonstrating that the predictive performance and accuracy of biomass maps can be enhanced by explicitly accounting for the spatial distribution of WSG. Largescale systematic ground sampling of WSG is needed to reveal spatial patterns on scales relevant to forest conservation and support further investigations on potential spatial predictors.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: http://dx.doi.org/10.17632/zzzzcnt2bd.1. errez, L Rodriguez, A Andoke, S Terakami and local guides for invaluable collaboration in the field and lab. We also thank numerous national and private reserves, landowners and the Indigenous Andoke community of Caño Aduche for access permissions and generously sharing local knowledge. Funding was provided to D P E from the Natural Environment Research Council (Grant No. NE/R017441/1) and to T H and D P E by the Research Council of Norway (Grant No. 262378). This is article no. 24 of the Biodiversity, Agriculture and Conservation in Colombia/Biodiversidad, Agricultura, y Conservacion en Colombia (BACC) project.

Conflict of interest
The authors have no conflict of interest to declare.

Author contribution
D P E, TH, J B S and J S S conceived the study and developed the methodology. J S S, E P S, P W and C G B collected the data with logistical support from C A M U and J B S. J S S analysed the data with input from J B S. J S S wrote the manuscript with input from all authors.