Tropical forest carbon balance: effects of field- and satellite-based mortality regimes on the dynamics and the spatial structure of Central Amazon forest biomass

Debate continues over the adequacy of existing field plots to sufficiently capture Amazon forest dynamics to estimate regional forest carbon balance. Tree mortality dynamics are particularly uncertain due to the difficulty of observing large, infrequent disturbances. A recent paper (Chambers et al 2013 Proc. Natl Acad. Sci. 110 3949–54) reported that Central Amazon plots missed 9–17% of tree mortality, and here we address ‘why’ by elucidating two distinct mortality components: (1) variation in annual landscape-scale average mortality and (2) the frequency distribution of the size of clustered mortality events. Using a stochastic-empirical tree growth model we show that a power law distribution of event size (based on merged plot and satellite data) is required to generate spatial clustering of mortality that is consistent with forest gap observations. We conclude that existing plots do not sufficiently capture losses because their placement, size, and longevity assume spatially random mortality, while mortality is actually distributed among differently sized events (clusters of dead trees) that determine the spatial structure of forest canopies.


Introduction
The world's forests have been identified as the primary terrestrial carbon sink, with tropical forests assimilating about half of the total forest carbon uptake (Pan et al 2011). However, these and similar estimates  of forest carbon balance have high levels of uncertainty, partially Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. due to undercharacterized mortality regimes (Körner 2003, Fisher et al 2008. Tropical forest mortality regimes are commonly described by an average annual mortality rate, and less frequently include the distribution of disturbance size (Chambers et al 2004, Lloyd et al 2009, Chambers et al 2013. The mortality rate represents a bulk loss of trees or biomass while the disturbance size distribution represents a spatial clustering of this loss. The annual mortality rate has received more attention (Chambers et al 2004 than its spatial pattern due to infrequency of large disturbances (Chambers et al 2013), relatively short duration of measurement , limited spatial extent of measurement (Nelson et al 1994, Espírito-Santo et al 2010, and relatively small plot size (Fisher et al 2008). While advances in remote sensing have increased scientific understanding of mortality regimes (Kellner et al 2011, Chambers et al 2013, questions remain regarding statistical distributions of disturbance size and their effects on spatial patterns of forest dynamics. Answering such questions is critical for understanding how the relationship between annual mortality and disturbance size distribution influences estimates of forest carbon balance.
Until recently, understanding of tropical forest mortality regimes has been based primarily on plot data, with a few exceptions that utilize large-area satellite and aircraft data, leaving a large knowledge gap for disturbances between ∼0.07 and 35 ha in size (Nelson et al 1994, Espírito-Santo et al 2010, Kellner et al 2011, Morton et al 2011, Negrón-Juárez et al 2011, Chambers et al 2013. Fortunately, new remote sensing approaches fill this gap and allow more comprehensive analyses of spatial and temporal effects of disturbance on forest dynamics. Two key features have emerged from combining plot and remotely sensed data in the Central Amazon: (1) plotbased average annual mortality rates underestimate landscapescale rates, and (2) plot-based frequency distributions of disturbance size generally miss succession-inducing gaps due to small size and limited census interval (Chambers et al 2013). Furthermore, several studies indicate that this frequency distribution follows a power law (Fisher et al 2008, Chambers et al 2009b, Lloyd et al 2009, Negrón-Juárez et al 2010, Kellner et al 2011, Chambers et al 2013, but the shape of this distribution has not been rigorously tested. While advances have been made in mortality regime research, debate continues regarding the adequacy of existing field plot networks in the Amazon basin for estimating carbon balance (Fisher et al 2008, Chambers et al 2009b, Lloyd et al 2009, Chambers et al 2013. This debate bears directly on recent estimates of a tropical forest carbon sink based on field plot data , Pan et al 2011 and is highly contingent upon estimates of both annual mortality rate and disturbance size distribution because biomass accumulation is estimated as the difference between gains (from tree growth and ingrowth) and mortality losses. Furthermore, due to the relatively short period of time since plot establishment (e.g. ∼11 years on average in the Amazon basin) and the low density of current field plots across vast domains (e.g. the Amazon basin) , forest growth simulations are critical for determining whether field measurements have adequate spatial and temporal coverage to capture long-term, regional trends that are characterized by slow gains and fast losses of biomass (Körner 2003).
Recent studies using individual-based forest growth models incorporating both annual mortality and disturbance size distributions have shown some limitations of field plots. Fisher et al (2008) simulated equilibrium landscape biomass to demonstrate that increasing the proportion of large-area disturbances can bias plot-based estimates toward biomass gains. A remotely sensed estimate of Central Amazon average annual mortality rate increased plot-only estimates by 9-17% because the largest recorded plot disturbances affected only 8 trees, while mortality events detected by satellite ranged from 8 to more than 7000 trees (Chambers et al 2013). While model estimates of forest biomass dynamics are highly sensitive to annual mortality rate, the increased mortality rate estimated by Chambers et al (2013) is well within the standard deviation of the plot-based annual mortality rate distribution (Chambers et al 2004). However, the impacts of different disturbance size distributions on spatial patterns of simulated biomass and stand age are relatively unknown. Nonetheless, such spatial patterns determine what information can be obtained from a sample of field plots.
In contrast to detailed research on mortality regimes, other studies have argued that plot-based estimates of annual mortality rate are sufficient for estimating Amazon forest carbon balance , Lloyd et al 2009. One such study, however, (Lloyd et al 2009) simply challenged a technical matter in Fisher et al (2008) that had no affect on carbon balance results because their methods were internally consistent (see Chambers et al 2013). Another study  conflated the annual mortality rate distribution with the frequency distribution of disturbance size. More specifically, Gloor et al (2009) accounted for only annual mortality rate, which requires the assumption that mortality is randomly distributed in space, and not the spatial clustering of mortality into individual, contiguous events. Furthermore, they did not effectively incorporate extra-plot mortality, and thus they found that their plot observations accurately detect biomass gains at the plot level. The dispute, however, is not whether the estimates show gains, but whether the measurements sufficiently capture landscape-level losses.
Clearly, the relationship between annual mortality rate and disturbance size distribution needs to be examined to develop appropriate landscape sampling designs for robust estimation of forest dynamics and carbon balance. Here we rigorously determine that the frequency distribution of disturbance size in the Central Amazon follows a power law, and use it to evaluate impacts of mortality regimes on spatial patterns of simulated biomass and stand age. We evaluate mortality regimes estimated from plot-only and merged plot and Landsat data. The higher, merged estimate of mortality significantly decreases the simulated live biomass, and including Landsat-detected disturbance events larger than eight trees shifts the landscape structure from a random pattern to a significantly clustered mosaic of stand age and biomass. We argue that the spatial clustering associated with the additional mortality due to large events further explains why existing plots do not adequately capture disturbance losses, leading to potential overestimates of Central Amazon forest carbon uptake.

Study location and data
This study covers the Central Amazon region surrounding Manaus, Amazonas State, Brazil (3.11 • S, 60.03 • W). Plot data  have been collected in old-growth forest at permanent sites in a 50 km × 40 km area centered (2.5 • S, 60 • W) about 60 km north of Manaus (Chambers et al 2004). Standing dead trees accounted for 13.2% of total average annual plot-level tree mortality (1.02% of stems) while single and clustered wind-thrown trees comprised 50.6% and 36.2%, respectively (table 1 and figure 1). All estimates are based on trees having a diameter at breast height ≥ 10 cm. The largest mortality cluster in the observed plots contained eight trees. Nine Landsat 5 Thematic Mapper scenes (Path 231, Row 062; ∼3.4 × 10 4 km 2 each; 30 m × 30 m resolution) were processed as five paired repeat observations of forest disturbance to obtain counts of wind-thrown tree clusters by size (1985-1986, 1987-1988, 1996-1997, 1997-1998, 2004-2005). These scenes were converted to reflectance values and masked to isolate 21 800 km 2 of old-growth forest area for analysis (aggregated across the five image pairs). Spectral mixture analysis was employed to calculate the fraction of shade-normalized non-photosynthetic vegetation (NPV) per pixel. Annual NPV change ( NPV) images were calculated and disturbed pixels were identified by NPV ≥ 0.16 and then clustered into individual wind-throw events (e.g. Negrón-Juárez et al 2011). Aggregating the counts of wind-throw events across the five NPV images generated a frequency distribution for mortality event sizes ranging from 8 to 7355 trees (Chambers et al 2013).
The plot and Landsat frequency distributions of wind mortality event size (as number of trees felled) were merged by extrapolating the plot data to the aggregate Landsat old-growth forest area and averaging the counts for the 8-tree event size ( figure 1(a)). An evaluation of different merging methods on power law fits found that this method approximated the average of all methods and that the spread in power law exponents had little effect on landscape dynamics and biomass. The merged frequency distribution spanned integer mortality event sizes, or classes, ranging from 1 to 7355 trees, and increased the estimate of total average annual tree mortality to 1.20% (Chambers et al 2013).

Mortality event distribution modeling
The three data sets (plot, Landsat, merged) were pre-processed to generate three data permutations for facilitating three different fit methods. (1) The original (unprocessed) data permutation was used by all methods. (2) The normalizedlog-binned permutation summed counts in equal-sized log 10 (class) bins and then divided these sums by the number of event class integers within each bin. (3) The normalized-binned permutation summed counts in equal-sized class bins and then divided these sums by the number of class integers within each bin. The binned data permutations reduce noise and generally eliminate zero-count classes (White et al 2008, Milojević 2010. We fit power law (y = c · x −α ) and exponential (y = c · exp(−λ · x)) functions to the plot, Landsat, and merged data sets, compared the fits using a likelihood ratio test, and tested the significance of a subset of fits using a bootstrap method. We used (1) a linear ordinary least squares (OLS) method on log 10 -log 10 transformed values (White et al 2008, Milojević 2010 and (2) a discrete maximum likelihood estimator (MLE) method (Clauset et al 2009) to fit power law functions to the original and normalized-log-binned data permutations. We applied the MLE method using the observed minimum size class to ensure that all data were included in the fits. The exponential function was fit to the original and normalized-binned data permutations using (3) a linear OLS method on linear-log e transformed values. We present analyses of the six most relevant fits for each method (tables 2 and 3).
For each data set (plot, Landsat, merged) we determined the best fit by comparing all possible pairs of fits using a likelihood ratio test (Clauset et al 2009). Based on the results of these comparisons we selected eight fits for which to calculate p-values signifying the probability that a random sample from the fitted distribution would have a worse fit to the original data than the fitted distribution. Each data fit was compared with 100 sample fits ( p-value precision = 0.05) using the Kolmogorov-Smirnov D-statistic (KSD), which is the maximum distance between the cumulative distribution functions (CDF) of the data and the fitted function. If p ≤ 0.1 we rejected the null hypothesis that the data follow the respectively fitted distribution function (Clauset et al 2009).
The random samples from the eight fits required different levels of processing to prepare the samples for fitting, depending on the data set, permutation, and fit method. The plot samples were used directly, the Landsat samples were binned to classes representing integer pixel estimates, and the merged samples were binned across the Landsat range with the 8-tree size class assigned the average of the binned and unbinned values. The eight selected fits generated the following sample fits: the MLE power law method was applied to original plot, Landsat, and merged samples, the OLS power law method was applied to normalized-log-binned plot, Landsat, and merged samples and the OLS exponential method was applied to original plot samples and normalized-binned plot samples.

Forest simulations
We used the tropical ecosystem and community simulator (TRECOS) (appendix; Chambers et al 2004 andChambers et al 2013) to perform four 2000-year simulations of a Central Amazon forest landscape to evaluate the effects of two different mortality event distributions (table 1) and two different average annual mortality rates on landscape structure and biomass. The two average annual mortality rates were estimated from plot (1.02% of stems) and merged (1.20% of stems) data, respectively, and were modeled as normal distributions in log 10 space using the merged estimate of inter-annual variability as the standard deviation (mean = 0.0086 (plot) and 0.0792 (merged) log 10 (% stems y −1 ), SD = 0.073). The two event distributions were (1) the best fitted power law to the merged data and (2) the best fitted power law to the plot data. For input to TRECOS, we aggregated the merged event distribution to 10 bins and assigned each size class (1-8 trees) of the plot event distribution to its own bin.
We compared time series of mean landscape biomass and final spatial patterns among the simulations. For the time series we computed the average equilibrium biomass across simulation years 500-2000. We calculated a global clustering statistic (Moran's I) and empirical and fitted model semivariograms for each final-year map of biomass and the time since last succession-inducing disturbance (>8 trees, t d ) (R-project packages gstats and spdep; www.r-project. org/). Moran's I calculation covered the entire landscape and used a spatial weighting function (1/(2 · distance)) to emphasize clusters on the order of one hectare. Semivariogram calculations extended to half the distance across the landscape (500 m) to minimize bias due to decreasing sample sizes with increasing distance. Final-year biomass exhibited a log-normal distribution in space and t d exhibited a fat-tail distribution, so we performed Monte Carlo significance tests for Moran's I (100 simulations each). We also compared histograms and mean/median statistics of these maps.

Results
The OLS power law fits to normalized-log-binned data are similar to the MLE fits, except for the larger magnitude MLE exponents for Landsat data sets (tables 2 and 3). The OLS power law fit to the original plot data is also similar to the observed value MLE fit. The OLS exponential fits to the plot data permutations are somewhat plausible, while the exponential fits to other data sets are not (table 3). All regressions are significant to the 95% confidence level. It is apparent that the OLS fits are valid only on appropriately binned data or 'full' plot data sets with values for all integer classes. Henceforth, references to OLS fits refer to those performed on binned data unless otherwise noted. a The ratio tests rank the MLE fits on original data above all others. b Null hypothesis of power law distribution rejected ( p < 0.1). P-value tests were not performed for any of the NLB fits. c NLB = Normalized-log-binned. The plot and merged results are for a start bin center of 3 and are nearly identical to those for a start bin center of 4. The Landsat data have a start bin center of 8. Table 3. Ordinary least squares (OLS) power law (y = c · x −α ) and exponential (y = c · exp(−λ · x)) probability distribution function fits a . a OLS was performed on log 10 -log 10 transformed data for power law fits, and on linear-log e transformed data for exponential fits. b Null hypothesis of power law or exponential distribution rejected ( p < 0.1). For the exponential distribution, P-value tests were performed only for plot data fits (binned and non-binned). For the power law distribution, P-value tests were performed only for the binned data. c The data are normalized-log-binned for the power law fits and normalized-binned for the exponential fits. The plot and merged results are for a start bin (s.b.) center of 3 unless otherwise noted, and in general are nearly identical to those for a start bin center of 4. The Landsat data have a start bin center of 8. The binned data are transformed as noted.
The likelihood ratio tests rank the MLE power law fits to original data above all other fits, with plot and merged data having the best fits (table 2, figure 1). The highest ranked OLS power law fits to the plot and merged datasets are effectively identical to the MLE fits. All OLS exponential fits are ranked below all power law fits. The p-values indicate that only plot and merged data are likely to follow their respectively fitted power law distributions (tables 2 and 3). Thus, we reject the hypotheses that plot data follow an exponential distribution and that the given Landsat data follow a power law or an exponential distribution.
The mortality event distribution (table 1) and the average annual mortality rate each affect the simulated landscape differently. Increasing average annual mortality decreases landscape-level biomass (figure 2(a)) but has little effect on t d and spatial pattern. Using the merged mortality event distribution, which includes large-area events, causes local spatial autocorrelation of biomass and t d (figures 2(b)-(c) and 3) but has little effect on mean (figure 2(a)) and median values of these variables. The semivariogram results (figure 3) are supported by significantly positive, although small, Moran's I values (figures 2(b) and (c)). In contrast, using the plotbased mortality event distribution generates relatively flat semivariograms and insignificant Moran's I values, indicating that spatially random outputs are generated when only plot data are used as input.

Discussion
The results overwhelmingly show that Central Amazon mortality events can be robustly modeled by a power law distribution function, and that an exponential function is not adequate even for plot data. The merged data set is the most comprehensive mortality data available for this region and the power law exponent of its best fit (−2.73) is consistent with previous estimates of gap area frequency in tropical forests (Fisher et al 2008, Kellner and Asner 2009, Lloyd et al 2009. While research consistently shows that mortality events and gap areas follow a power law distribution, the mechanisms behind this distribution are still unclear. However, insightful comparisons of gap frequency distributions are difficult to make because of different methods and gap definitions. Most tropical forest mortality distributions are based on gap area frequency (Fisher et al 2008, Kellner and Asner 2009, Lloyd et al 2009, but gap area is an indirect measure of mortality due to spatial and temporal variation in tree density (Chapman et al 1997), and pixel footprints pose challenges for remotely measuring gap area because entire pixels are rarely devoid of trees (Nelson et al 1994). We use instead the dead tree count to directly characterize mortality within one year of gap formation, and our Landsat method generates a good relationship between NPV and number of dead trees within each pixel, enabling repeat measurements of sub-pixel mortality. It is also very well suited for merging with plot measurements to create a contiguous mortality event size distribution ranging from 1 to over 7000 trees, with the potential to capture events larger than 2000 ha (on the order of 400 000 trees) with additional data (Espírito-Santo et al 2010).
High-resolution LiDAR (Light Detection And Ranging) captures forest structure well (Kellner and Asner 2009), but its ability to directly measure large disturbance-induced gaps is limited by high data volume to small areas and low temporal frequency (Kellner et al 2011). Nonetheless, one-time measurements of canopy gaps having vegetation height <1 m are linked to gap formation unless edaphic or other non-disturbance factors limit these areas to short vegetation. LiDAR-based analysis of gaps with vegetation height <1 m in four 1 km 2 tropical landscapes estimated power law gap frequency distributions with exponents ranging from −2.00 to −2.33 (Kellner and Asner 2009). The maximum gap size was on the order of 10 000 m 2 (1 ha) with most gaps ≤100 m 2 (0.01 ha), which corresponds well with the maximum mortality event size of our plot data (8 trees where average tree density is 605 trees ha −1 ). The best fitted power law exponent for our plot data (−2.86) is more negative than the (Kellner and Asner 2009) exponents, but these two results are remarkably similar considering that ours is based on recently fallen trees and the other is based on contiguous vegetation height.
Other estimates of power law exponents for gap frequency distributions are more or less comparable to ours even though methods differ. Some exponents for gaps smaller than 1000 m 2 (0.1 ha) have been estimated by OLS for logarithmically binned data rather than by MLE on the original data to provide a binned input distribution for an empirical gap dynamics model (Fisher et al 2008, Negrón-Juárez et al 2010. These OLS estimates ranged from −1.1 to −1.6 and are mathematically equivalent to the power law exponent plus one (White et al 2008, Chambers et al 2013. When transformed to the power law exponent (−2.1 to −2.6) these estimates are comparable to MLE fits of the same source data (−1.9 to −2.7; Lloyd et al 2009), and also to our best fits (table 2) for plot and merged data. However, Lloyd et al (2009) estimated an exponent of −3.1 for remotely sensed gaps ranging from 32 to over 1 700 ha (320 000 to 17 000 000 m 2 ; Nelson et al 1994), which generally are much larger than our largest mortality event of about 35 ha (350 000 m 2 ). Even with discrepancies among methods and gap definitions, the consistency in exponents among these studies strongly supports our result that tropical forest mortality event size distributions follow a power law, and that these distributions extend to large mortality events that have not been adequately captured by existing field plots. Thus, the spatial and temporal domains sampled by existing field plots are generally too small to accurately estimate the mortality biomass flux.
This study demonstrates that both MLE and OLS methods for fitting power law functions can be reliable if the data are appropriate to the method used. The MLE algorithm finds the best power law fit to a complete data set i.e. there are observations of each discrete event size, as shown by the ratio and p-values tests (tables 2 and 3). The Landsat data, however, are inherently binned because each observed event size is based on an average mortality rate associated with an integral number of pixels. For example, an isolated mortality pixel might contain 4-19 dead trees (Negrón-Juárez et al 2011), but it is counted as an 8-tree cluster. This binning renders the Landsat data inappropriate for the MLE method, and as a result the MLE fit to the original Landsat data is not statistically robust based the p-value metrics, although this lack of significance could be an artifact of the noisy tail.
The OLS normalized-log-binned method, on the other hand, can estimate good power law fits for the plot and merged data sets (table 3 and figure 1) and at first glance appears to be more reliable than the MLE method for the Landsat data. The binning process helps account for non-sampled event sizes, smoothens out the noisy tail (figure 1), and redistributes pre-binned counts such as occur in the Landsat data. The OLS power law fits to the plot data are good, but MLE is superior for such a complete sample. The close match of the MLE and OLS power law estimates for the merged data (1% difference) indicates that these are both valid, and likely the best, estimates of mortality event distribution for this region. The OLS Landsat power law exponents are more similar to the valid merged data exponents than the MLE Landsat exponents, suggesting that in this case OLS outperforms MLE. But a binned sample from the MLE Landsat fitted distribution, which has a much steeper slope than the OLS fit (tables 2 and 3), more closely matches the observed data, albeit without the noisy tail (figure 1). And even though all p-value tests reject the power law hypotheses for Landsat data, the ratio test ranks the MLE Landsat fit higher than the OLS fit (tables 2 and 3). Discarding the noisy counts for event sizes greater than 458 trees (∼2.88 ha or 28 800 m 2 ) might generate a more reliable fit to the Landsat data, but it would severely restrict the range of event sizes represented by the fitted function. Without a larger data set it is difficult to determine how well the normalized-log-binned data represent the noisy tail and thus whether the Landsat data is adequately represented by the merged data fit or should be represented separately with a steeper slope. However, both MLE and OLS fits to the merged data do incorporate all observations consistently (figure 1).
These robust mortality event distributions enable us to simulate forest landscapes and to show that a more complete event distribution generates spatial patterns in biomass and stand age, in contrast to simulations using a plot-estimated distribution (figures 2 and 3). This spatial difference is augmented by the expected result that a higher average annual mortality rate, determined from the merged data set, reduces above ground biomass in relation to that obtained from the plot-based rate ( figure 2(a)). Using the higher mortality rate in simulations slightly underestimates landscape biomass averages in relation to wider area plot observations (∼319 Mg ha −1 ; Chambers et al 2004), in contrast to overestimation when using the lower average mortality rate (figure 2). The higher average mortality rate also reduces the range of spatial autocorrelation due to the higher frequency of small events over large events in the power law distribution (figure 3). It is therefore apparent that annual rates and event distributions are both key characteristics for understanding tropical forest dynamics.
When using the merged data estimates for average morality rate and event distribution, the simulated spatial patterns are consistent with observations of large gaps, but the forest stand cell size used in our analyses is too large to adequately capture all spatial structure of the forest. The estimated range of spatial autocorrelation for both biomass and t d is ∼28 m (figures 3(a)-(b)), which corresponds to a circular area of 616 m 2 . As mentioned above, the largest gap with vegetation height less than 1 m detected by LiDAR in four tropical landscapes was 10 000 m 2 (1 ha), with most gaps ≤ 100 m 2 (0.01 ha) (Kellner and Asner 2009). Thus, the TRECOS simulations using merged data adequately capture the observed, largearea spatial coherence in tropical forest. Additionally, Kellner et al (2011) estimated ranges of spatial autocorrelation for canopy height (8.3-21.1 m) and changes in canopy height (2.5-3.7 m) for forests on five different substrate ages in Hawaii. Unfortunately these canopy range estimates, and the majority of gaps, are smaller than what TRECOS can resolve because the stand cells are 20 m × 20 m. Nonetheless, the merged disturbance size distribution is required to simulate observed spatial patterns, demonstrating that mortality is not randomly distributed across the landscape

Conclusion
Our results show that existing plot data alone are not sufficient for characterizing tropical forest mortality processes and their impacts on landscape-level biomass density and also on spatial patterns of succession-inducing disturbance interval and biomass density. A power law distribution of mortality event size based on merged plot and satellite data is required in addition to the corresponding annual mortality rate to shift simulated forest structure from a random pattern to a significantly clustered mosaic of stand age and biomass. Thus, we conclude that the spatial clustering associated with mortality due to events larger than eight trees further explains why existing plots do not adequately capture large disturbance losses that influence succession, canopy structure, growth rates, and species composition (Chambers et al 2009a) Due to this underestimation of mortality, current plot-based analyses have a tendency to overestimate forest carbon uptake. It is apparent that a more comprehensive sampling scheme that includes large-area data (e.g., large plots and remote sensing) and robustly characterizes disturbance size distribution is required to understand tropical forest dynamics and its impact on carbon balance.