Simulated impact of sample plot size and co-registration error on the accuracy and uncertainty of LiDAR-derived estimates of forest stand biomass

doi:10.1016/j.rse.2010.10.008

Remote Sensing of Environment

Volume 115, Issue 2, 15 February 2011, Pages 636-649

https://doi.org/10.1016/j.rse.2010.10.008

Abstract

Regression has been widely applied in Light Detection and Ranging (LiDAR) remote sensing to spatially extend predictions of total aboveground biomass (TAGB) and other biophysical properties over large forested areas. Sample (field) plot size has long been considered a key sampling design parameter and focal point for optimization in forest surveys, because of its impact on sampling effort and the estimation accuracy of forest inventory attributes. In this study, we demonstrate how plot size and co-registration error interact to influence the estimation of LiDAR canopy height and density metrics, regression model coefficients, and the prediction accuracy of least-squares estimators of TAGB. We made use of simulated forest canopies and synthetic LiDAR point clouds, so that we could maintain strict control over the spatial scale and complexity of forest scenes, as well as the magnitude and type of planimetric error inherent in ground-reference and LiDAR datasets. Our results showed that predictions of TAGB improved markedly as plot size increased from 314 m² (10 m radius) to 1964 m² (25 m radius). Co-registration error (imperfect spatial overlap) between ground-reference and LiDAR samples negatively impacted the estimation of LiDAR metrics, regression model fit, and the prediction accuracy of TAGB. We found that larger plots maintained a higher degree of spatial overlap between ground-reference and LiDAR datasets for any given GPS error, and were therefore more resilient to the ill effects of co-registration error compared to small plots. The impact of co-registration error was more pronounced in tall, spatially heterogeneous stands than in short, homogeneous stands. We identify and briefly discuss three possible ways that LiDAR data could be used to optimize plot size, sample selection, and the deployment of GPS resources in forest biomass surveys.

Research highlights

► Field plot size is a key design parameter in lidar forest surveys.
► Large plots produce more reliable ground and lidar measurements.
► Large plots are more resilient to the ill effects of co-registration error.
► Optimal plot size depends on tree size and stand heterogeneity.

Introduction

Light Detection and Ranging (LiDAR) systems are able to capture the complex three-dimensional (3-D) structure of forest canopies and underlying ground surface topography at very high spatial resolutions, and have proven to be highly effective for the estimation of forest biomass and other biophysical properties across a broad range of forest ecosystems (Lefsky et al., 2002b, Reutebuch et al., 2003). When using LiDAR to undertake forest surveys, data are typically collected either as a complete (‘wall-to-wall’) spatial coverage or as a series of discrete, non-overlapping flight lines (also known as swaths or strips). Field estimates of the target variable (e.g., total aboveground biomass, TAGB) are acquired before or after LiDAR data collection using a network of ground-reference (field) plots established in areas with planned or known LiDAR coverage. Finally, a regression model (or suite of models) is developed (1) to describe the statistical relationship between the field-measured variable of interest (response) and one or more LiDAR metrics (predictors), and (2) to spatially extend model predictions of the target variable across all areas of the land base where LiDAR data were captured (García et al., 2010, Hudak et al., 2006, Li et al., 2008, Næsset and Gobakken, 2008).
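The two-step regression procedure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the two predictor names (mean_height, canopy_cover), the linear model form, the coefficient values, and the synthetic data are hypothetical and are not taken from this study.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical plot-level LiDAR metrics for a network of field plots.
n_plots = 60
mean_height = rng.uniform(5.0, 40.0, n_plots)   # assumed height metric (m)
canopy_cover = rng.uniform(0.2, 1.0, n_plots)   # assumed density metric (fraction)

# Synthetic "field-measured" TAGB generated from an assumed linear model plus noise.
true_beta = np.array([10.0, 8.0, 50.0])         # intercept, height, cover (illustrative)
X = np.column_stack([np.ones(n_plots), mean_height, canopy_cover])
tagb = X @ true_beta + rng.normal(0.0, 5.0, n_plots)

# Step 1: fit the least-squares model relating TAGB (response)
# to the LiDAR metrics (predictors).
beta_hat, *_ = np.linalg.lstsq(X, tagb, rcond=None)

# Summarize model fit on the calibration plots.
residuals = tagb - X @ beta_hat
rmse = float(np.sqrt(np.mean(residuals ** 2)))
r2 = float(1.0 - np.sum(residuals ** 2) / np.sum((tagb - tagb.mean()) ** 2))

# Step 2: apply the fitted model to LiDAR metrics from locations
# without field plots (two hypothetical grid cells).
new_cells = np.array([[1.0, 12.0, 0.5],
                      [1.0, 30.0, 0.9]])
predicted = new_cells @ beta_hat

print("coefficients:", np.round(beta_hat, 2))
print("RMSE:", round(rmse, 2), "R2:", round(r2, 3))
print("predicted TAGB:", np.round(predicted, 1))
```

In practice the predictors would be metrics extracted from the LiDAR returns falling within each plot, which is where plot size and co-registration error enter the model.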

Successful application of this model-based estimation procedure is dependent on several underlying assumptions. First, selected LiDAR metrics must be uniquely and well correlated with the biophysical property of interest (Lefsky et al., 2005, Li et al., 2008). Second, the spatial scale (plot size and shape) and location of the LiDAR samples must exactly match the scale and location of ground-reference samples (Patterson & Williams, 2003). Third, plot size must be sufficiently large to minimize potential edge effects (i.e., unplanned exclusion or inclusion of ground-based or remotely sensed forest measurements along the plot boundary), as well as to capture an adequate amount of on-ground (in situ) structural variability (Curtis and Marshall, 2005, Zenner, 2005). Last, the network of ground-reference plots must cover the full range of variability in both response and predictor variables to minimize the occurrence of extrapolation errors when prediction models are applied across the entire LiDAR dataset (Montgomery et al., 2006).

Field plots are typically located using global positioning systems (GPS), with positional accuracies determined by factors related to the instrument, user, satellites, atmosphere, and environment (Johnson & Barton, 2004). Forest canopies in particular block or scatter the GPS signal (the latter known as multipathing), which makes it difficult to locate field plots with planimetric (horizontal) accuracies better than 1–5 m (Bolstad et al., 2005, Næsset and Jonmeister, 2002, Wing and Karsky, 2006). LiDAR (x, y, z)-return coordinates (point clouds) are also subject to positional errors, but the magnitude of these errors is generally considered to be minor (< 0.5 m in the horizontal domain) relative to field plots and other types of remotely sensed data (Gatziolis & Andersen, 2008). Planimetric errors associated with field plots and/or LiDAR data result in an imperfect spatial overlap (i.e., co-registration error) between these two datasets, and this can potentially weaken the prediction accuracy of regression estimators (Gobakken & Næsset, 2009).

Plot size is an important design parameter in forest surveys, because it has the potential to either dampen or inflate the impact of edge effects and co-registration error. The edge-effect (noise) associated with LiDAR metrics is largely unavoidable, and related to the fact that trees located just outside the plot boundary (not included in the ground sample) may still have some portion of their crowns falling within the plot. As well, trees tallied within the plot may have part of their crowns lying outside the plot boundaries (not measured by LiDAR). Sample plots that have a large perimeter-to-area ratio will, in theory, produce LiDAR metrics that are less precise and less accurate, due to the inclusion of substantially more edge-induced measurement error. Because the perimeter-to-area ratio of a square or circular plot declines nonlinearly with increasing plot size, we expect that larger plots may substantially reduce the negative impact of the edge-effect on the magnitude and stability of LiDAR metrics. Furthermore, larger plots maintain a higher degree of spatial overlap in the presence of GPS positional errors (Flewelling, 2009), exhibit less between-plot variance (Zeide, 1980), and are therefore less affected by the co-registration error that inevitably occurs between ground and LiDAR samples.
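Both geometric arguments in this paragraph can be checked with a short calculation. The sketch below assumes circular plots of equal radius and a single horizontal offset between the ground and LiDAR plot centres; it uses the standard circle-circle intersection (lens) formula, which is not necessarily the formulation used by Flewelling (2009). The function names and the 5 m offset are chosen for illustration.

```python
import math

def perimeter_to_area(radius_m):
    """Perimeter-to-area ratio of a circular plot: 2*pi*r / (pi*r^2) = 2/r."""
    return 2.0 / radius_m

def overlap_fraction(radius_m, offset_m):
    """Fraction of a circular plot shared with an identical plot whose
    centre is displaced horizontally by offset_m (lens-area formula)."""
    r, d = radius_m, abs(offset_m)
    if d >= 2.0 * r:
        return 0.0  # plots no longer intersect
    lens_area = (2.0 * r * r * math.acos(d / (2.0 * r))
                 - 0.5 * d * math.sqrt(4.0 * r * r - d * d))
    return lens_area / (math.pi * r * r)

# Compare the smallest and largest plot radii reported in the abstract
# for an assumed 5 m co-registration offset.
for radius in (10.0, 25.0):
    area = math.pi * radius ** 2
    print(f"r = {radius:4.0f} m  area = {area:7.1f} m^2  "
          f"P/A = {perimeter_to_area(radius):.3f} 1/m  "
          f"overlap at 5 m offset = {overlap_fraction(radius, 5.0):.3f}")
```

Under these assumptions a 5 m offset leaves roughly 69% overlap for a 10 m radius plot and roughly 87% for a 25 m radius plot, while the perimeter-to-area ratio falls from 0.2 to 0.08 per metre.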

Despite the emerging popularity of airborne LiDAR in forest research and operational forest management, few published studies have examined the impact of sample plot size and co-registration error on the prediction accuracy of least-squares regression estimators. Gobakken and Næsset (2009), for example, found that larger field plots (plot size ranged from 200 to 400 m²) established with minimal GPS positional error (< 2–3 m) generally produced more accurate least-squares predictions of stand height, basal area, and volume. They also observed that LiDAR metrics were less sensitive to variations in plot size and co-registration error in short, dense, spatially homogeneous stands than tall, sparse, heterogeneous stands. Mauro et al. (2009) similarly showed that GPS positional errors had a greater impact on LiDAR metrics when extracted from small plots (6 m radius) compared to larger sized plots (10 m radius). Both studies caution that specific findings could be strongly scale-dependent and therefore relevant only for the forests studied.

In this paper, we investigate the impacts of plot size and co-registration error on the precision and accuracy of LiDAR height and density metrics, regression model coefficients, and the prediction accuracy of least-squares estimators. We focus on TAGB as the target variable of interest, because of its importance as a key indicator of ecosystem productivity, carbon storage, and biodiversity (Houghton et al., 2009). Also, TAGB is jointly determined by tree mass and size-frequency (Li et al., 2008, Næsset and Gobakken, 2008, Ni-Meister et al., 2010), and so allows a more comprehensive analysis of the way in which plot size and co-registration error interact and propagate in regression-based prediction models. We analyzed simulated 3-D forest canopies and synthetic LiDAR point clouds rather than real data, so that we could (1) control the exact amount of spatial overlap between ground-reference and LiDAR samples, and (2) investigate the impact of plot size and co-registration error across a much broader range of TAGB estimates and forest canopy structures. We introduce several possible options to improve regression estimators for LiDAR forest biomass surveys, based on notions of model parsimony, sampling efficiency, and the existence of an ‘optimal’ plot size. Here, we strictly define ‘optimal’ as the minimum plot size required to produce a regression estimator with ‘good’ or ‘acceptable’ statistical properties, rather than in the classical sense of a cost–benefit analysis (i.e., cost of field data collection versus prediction accuracy).

Biomass of simulated forest canopies

We made use of a relatively large (n = 179), well-documented dataset of ‘virtual’ (computer-generated) coniferous (Douglas-fir) forest canopies previously developed and described by Frazer et al. (2005). The maximum geometric extent of each canopy volume was 50 m (width) × 50 m (length) × 60 m (height), with a voxel (volume element) resolution of 10 cm (0.001 m³) (Fig. 1). Tree crowns and boles were modelled as parabolas and cones, respectively (Van Pelt & North, 1996). Stand tables used to construct

PCA and cluster analyses of LiDAR metrics

PCA indicated that 99.8% of the total variance in the original 45 LiDAR metrics could be explained by the first 10 PCs. However, stopping rules based on randomization (i.e., Rnd-Lambda and Avg-Rnd) and broken-stick indices suggested that only the first three PCs were significant (Peres-Neto et al., 2005). These three PCs explained 90.2% of the total variance, with PC1, PC2, and PC3 accounting for 58.1, 21.8, and 10.3% of the total variance, respectively. PC1 was positively correlated with Lh_q
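As an illustration of how a broken-stick stopping rule selects the number of principal components, the sketch below runs a correlation-matrix PCA on synthetic data. It is a hedged example only: the data are generated from two hypothetical latent factors rather than from the 45 LiDAR metrics analysed here, the function names are invented for this example, and the randomization-based rules (Rnd-Lambda, Avg-Rnd) are not implemented.

```python
import numpy as np

def broken_stick(p):
    """Expected proportion of variance for each of p components under the
    broken-stick model: b_k = (1/p) * sum_{i=k..p} 1/i."""
    return np.array([np.sum(1.0 / np.arange(k, p + 1)) / p
                     for k in range(1, p + 1)])

def explained_variance(data):
    """Proportion of variance per principal component, from the
    eigenvalues of the correlation matrix (largest first)."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]
    return eigvals / eigvals.sum()

def n_retained(explained):
    """Count leading components whose observed proportion of variance
    exceeds the broken-stick expectation."""
    keep = 0
    for observed, expected in zip(explained, broken_stick(len(explained))):
        if observed <= expected:
            break
        keep += 1
    return keep

# Synthetic example: 179 samples of 8 correlated metrics driven by
# two hypothetical latent factors plus noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(179, 2))
loadings = np.zeros((2, 8))
loadings[0, :5] = 1.0   # first five metrics follow factor 1
loadings[1, 5:] = 1.0   # last three metrics follow factor 2
metrics = factors @ loadings + 0.3 * rng.normal(size=(179, 8))

explained = explained_variance(metrics)
print("explained variance:", np.round(explained, 3))
print("broken-stick:      ", np.round(broken_stick(8), 3))
print("components retained:", n_retained(explained))
```

A component is retained here only while its share of the total variance exceeds what a randomly broken stick would assign to it, which is why far fewer components can be judged significant than are needed to reach a high cumulative variance.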

Importance of plot size as a key sampling design parameter

Our findings demonstrate that plot size can be a critically important sampling design parameter in LiDAR forest surveys. We found that regression model fit and prediction accuracy improved markedly as plot size increased from 314 to 1964 m²; however, marginal improvements in model fit and prediction accuracy were diminishing and asymptotic as plot size approached 2500 m². Gobakken and Næsset (2009) similarly reported improvements in R² and RMSE for stand-level basal area and volume predictions as

Acknowledgements

We thank the Pacific Forestry Centre (PFC) and the Canadian Wood Fibre Centre (CWFC) of the Canadian Forest Service (through the Lodgepole Pine Partnership Project) for the funding support that made elements of this research possible. We gratefully acknowledge Dr. Brice Mora, CFS, PFC, for assistance in preparation of the R source code, and two anonymous reviewers and the RSE Editor-in-Chief for their thoughtful editorial comments and suggestions.

References (54)

G.W. Frazer et al.
Simulation and quantification of the fine-scale spatial pattern and heterogeneity of forest canopy structure: A lacunarity-based method designed for analysis of continuous canopy heights
Forest Ecology and Management
(2005)

M. García et al.
Estimating biomass carbon stocks for a Mediterranean forest in central Spain using LiDAR height and intensity data
Remote Sensing of Environment
(2010)

M.A. Lefsky et al.
Patterns of covariance between forest stand and canopy structure in the Pacific Northwest
Remote Sensing of Environment
(2005)

E. Næsset
Predicting forest stand characteristics with airborne laser using a practical two-stage procedure and field data
Remote Sensing of Environment
(2002)

E. Næsset et al.
Comparing regression methods in estimation of biophysical properties of forest stands from two different inventories using laser scanner data
Remote Sensing of Environment
(2005)

E. Næsset et al.
Estimation of above- and below-ground biomass across regions of the boreal forest zone using airborne laser
Remote Sensing of Environment
(2008)

M. Nilsson
Estimation of tree heights and stand volume using an airborne lidar system
Remote Sensing of Environment
(1996)

P.R. Peres-Neto et al.
How many principal components? Stopping rules for determining the number of non-trivial axes revisited
Computational Statistics & Data Analysis
(2005)

E.K. Zenner
Investigating scale-dependent stand heterogeneity with structure-area-curves
Forest Ecology and Management
(2005)

P. Bolstad et al.
A comparison of autonomous, WAAS, real-time, and post-processed global positioning system (GPS) accuracies in northern forests
Northern Journal of Applied Forestry
(2005)

R.B. Caliński et al.
A dendrite method for cluster analysis
Communications in Statistics
(1974)

J.R. Carroll et al.
Measurement error in nonlinear models
(1995)

R.O. Curtis et al.
Permanent-plot procedures for silvicultural and yield research

J.W. Flewelling
Plot size, shape, and co-registration error determine expected overlap

Frazer, G. W. (2007). Fine-scale, multidimensional spatial patterns of forest canopy structure derived from remotely...

F. Freese
Relation of plot size to variability: An approximation
Journal of Forestry
(1961)

D. Gatziolis et al.
A guide to LiDAR data acquisition and processing for the forests of the Pacific Northwest

T. Gobakken et al.
Assessing effects of positioning errors and sample plot size on biophysical stand properties derived from airborne laser scanner data
Canadian Journal of Forest Research
(2009)

T.G. Gregoire et al.
Sampling strategies for natural resources and the environment
(2008)

N.B. Guttman
On the sensitivity of sample L moments to sample size
Journal of Climate
(1994)

J.A. Hartigan
Statistical theory of clustering
Journal of Classification
(1985)

T.J. Hawbaker et al.
Improved estimates of forest vegetation structure and biomass with a LiDAR-optimized sampling design
Journal of Geophysical Research
(2009)

C. Hopkinson et al.
Testing models of fractional cover across multiple forest ecozones
Remote Sensing of Environment
(2008)

J.R.M. Hosking
L-moments: Analysis and estimation of distributions using linear combinations of order statistics
Journal of the Royal Statistical Society, Series B (Methodological)
(1990)

J.R.M. Hosking
Moments or L-moments? An example comparing two measures of distributional shape
The American Statistician
(1992)

R.A. Houghton et al.
Importance of biomass in the global carbon cycle
Journal of Geophysical Research
(2009)

A.T. Hudak et al.
Regression modeling and mapping of coniferous forest basal area and tree density from discrete-return lidar and multispectral satellite data
Canadian Journal of Remote Sensing
(2006)

¹: Tel.: + 1 250 363 0712.

²: Tel.: + 1 250 363 6090.

³: Tel.: + 1 250 721 7329.
