Evaluating the capacity of single photon lidar for terrain characterization under a range of forest conditions

Accurate digital elevation models are key data products used to inform forest management. Light detection and ranging (lidar) technologies have emerged as a useful tool for acquiring detailed terrain information, although the accuracy of this data is known to vary with topographic complexity and the density and characteristics of overlying vegetation. Single Photon Lidar (SPL) provides a high-density point cloud that can be acquired from a much higher altitude than discrete return, small-footprint lidar (hereafter, linear-mode lidar or LML), providing efficiencies and potential cost savings for operational mapping programs. Herein, we assess the absolute and relative accuracies of leaf-on and leaf-off SPL data acquired at different altitudes for characterizing terrain under varying vegetation types and densities and compare to results for LML data. Our assessment was forest-focused and primarily point based, using 299 Real-Time Kinematic survey checkpoints to quantify elevation errors (Δh); however, we also investigated and reported accuracy for linear transects, and conducted a wall-to-wall comparison of the SPL-derived 1-m digital elevation models (DEMs) against an LML-derived DEM. Point cloud characteristics for the leaf-on 2018 SPL data were markedly different, with 88% of returns as first returns, compared to 17% for the LML, and 59% and 46% for the leaf-off SPL data acquired at 3800 m and 2000 m, respectively. Of the datasets considered herein, the SPL data acquired under leaf-on conditions in 2018 had the lowest accuracy and precision for characterizing terrain underneath vegetation cover, with an RMSE of 10.97 cm and a 95th quantile of 24.03 cm; however these values are within commonly accepted error limits for elevation products. The leaf-off SPL data were most accurate overall; however, the differences between the leaf-off SPL data acquired at 3800 m versus 2000 m were often minor (< 1 cm on average), with similar patterns in Δh between the two datasets (r = 0.8). In terms of the relative performance of the lidar datasets examined, results from the analyses of linear transects were similar to those of the checkpoints, but highlighted the variability in elevation accuracy within similar cover types. Wall-to-wall comparisons of the SPL-derived DEMs to the 2012 LML DEM also corroborated the results of the checkpoint assessment, with the 2018 SPL leaf-on DEM having the largest differences (mean difference = 7.44 cm; RMSD = 18.07 cm). Differences between DEMs did not trend consistently with increasing canopy cover or with the percentage of returns that were within ±15 cm of the ground surface. We found that it was not only the density of the vegetation, but also the composition and configuration of both the overstory and understory vegetation that influenced the accuracy with which the lidar characterized the terrain surface. Overall, our results indicated that leaf-on SPL is capable of capturing terrain information under a wide variety of forest and vegetation conditions, albeit at a lower accuracy than what is possible with leaf-on LML or leaf-off SPL, but at a level of accuracy that is within acceptable limits for most forest applications. * Corresponding author. E-mail address: joanne.white@canada.ca (J.C. White).


Introduction
Topography influences a range of biological, geomorphological, and hydrological process on the landscape, and the three-dimensional characterization of the Earth's surface is a fundamental information need for the science and management of many natural resources, including forests (Moore et al. 1991). Accurate, high spatial resolution digital elevation data are critical for forest management and operational planning (White et al. 2012), particularly for road layout (Niemi et al. 2017) and harvest planning (Holopainen et al. 2014), as well as for improving soil and ecosite mapping (Furze et al., 2017, van Rensen et al. 2015. The importance of digital elevation information is evidenced in national mapping programs (Bolstad and Stowe, 1994, Gesch et al. 2002, Natural Resources Canada, 2019 and global data products (Farr et al. 2007). A digital elevation model (DEM) is defined as an "ordered array of numbers that represent the spatial distribution of elevations above some arbitrary datum in a landscape" (Moore et al. 1991). Traditionally, elevation data were derived from photogrammetric data sources; however, light detection and ranging (lidar) technology emerged as an accurate alternative data source (Liu 2008) and for more than two decades, airborne laser scanning (ALS) data have provided an unprecedented solution to generating a digital elevation model (DEM) under forest canopy (Kraus and Pfeifer, 1998). ALS data is now used operationally in forest inventory and management and the benefits of lidar for these applications are well documented (Naesset, 2015, White et al. 2016. As a result, there is interest in increasing the efficiency and spatial extent of ALS acquisitions, while also reducing acquisition costs. Single photon lidar (SPL) represents a potential significant technological advance towards the rapid and cost-effective characterization of forest structure and terrain underneath canopy and over large areas (Swatantran et al. 2016). Quantifying the accuracy with which SPL technology can characterize terrain elevations is critical information for end users of the derived DEM products (Wechsler 2007), who will use these data to sustainably manage the forest resource and plan their forest operations (White et al. 2016). Stoker et al. (2016) evaluated both single photon (HRQLS) and Geiger-mode lidar (IntelliEarth™) for use in the United States' 3D Elevation Program (3DEP). Sub-areas of forest, urban, and mixed land use within a 1300 km 2 area in northern Connecticut were identified for evaluation. Although no details were provided on the nature of the forest vegetation present, based on the location, it is assumed these were primarily hardwood forests (Wharton et al. 2004). The authors note that the forested sub-area was not evaluated for the HRQLS due to the presence of fog at the time of data acquisition, and the nature of the vegetation cover in the other areas with checkpoints was not reported. Non-vegetated vertical accuracy (NVA) and vegetated vertical accuracy (VVA) were calculated according to ASPRS positional accuracy standards using both the point cloud directly and the derived DEM. For quality levels 1 and 2 (QL1/QL2), NVA must be less than 19.6 cm and VVA less than 29.4 cm (ASPRS (American Society for Photogrammetry and Remote Sensing), 2014). Both the HRQLS and Geiger-mode lidar failed to meet the VVA requirements and the authors attributed this result to poor foliage penetration in dense vegetation conditions. Li et al. (2016) also examined the performance of HRQLS SPL data for a 13 ha forest area in Maryland, and compared the derived DEM to an LML-derived DEM. The SPL data were acquired in leaf-off conditions (November 2014) from an altitude of 2286 m, whereas the LML data were acquired in leaf-on conditions (June 2003, ALS instrument not specified). The forest area was comprised of pines, oaks and heath shrubs, with undulating terrain. DEMs were generated from both datasets with a 1 m spatial resolution. Elevation differences between DEMs ranged from − 50 cm to 50 cm, with the SPL data overestimating elevation relative to the LML. The authors attributed this difference to challenges in correctly filtering ground returns in the dense SPL data, with returns from small shrubs and trees erroneously identified as ground returns. Swatantran et al. (2016) evaluated the efficacy of leaf-on SPL data in mapping forest structure and terrain for a 1700 km 2 area in Garrett County, Maryland. SPL data were acquired using the HRQLS instrument at an altitude of 2286 m. The study area was 67% forested and was dominated by second growth forests consisting of oaks, northern hardwoods, and hemlock-pine stands. Leaf-off LML data acquired in 2005 with an average density of 1 point/m 2 (and consisting only of first returns) were used as reference, along with 176 National Geodetic Survey (NGS) benchmarks and 71 variable radius field plots measured in 2014. Compared to the NGS benchmarks, the SPL had an RMSE of 3.78 m and a bias of − 2.12 m, whereas the LML had an RMSE of 3.17 m and a bias of − 0.5 m. Relative to the LML data, the bias of the SPL data was − 1.61 m. Based on analyses of subsequent data from a different area, the authors concluded that much of the error associated with the SPL instrument was the result of instrument calibration bias. Swatantran et al. (2016) concluded that the higher solar noise observed in the daytime SPL acquisition would not be a limiting factor to the use of the data given further refinements to noise filtering, and that the observed bias of the SPL data could be reduced with better bias control and range calibration of the SPL data. Mandlburger et al. (2019) compared data from the Leica SPL100 and Riegl VQ-1560i (full waveform). Both datasets were acquired in leaf-on conditions to a specification of target last return pulse density of 20 pulses/m 2 , for a 160 km 2 area within the municipality of the City of Vienna. SPL100 data were acquired from an altitude of 4000 m with a pulse repetition rate of 50 kHz and a scan rate of 5 MHz with 60% sidelap. Of note, by specifying a target pulse density for the surveys, Mandlburger et al. (2019) were uniquely able to quantify some of the economies afforded by the SPL100 instrument. Whereas 5 flight lines with 20% overlap for the SPL100 could have achieved the target pulse density, 10 flight lines with 60% overlap were used to minimize scan shadows in the urban area. By comparison, the LML data were acquired using 18 flight lines with 50% overlap flown at an altitude of 750 m AGL to meet the target pulse density. Thus, the SPL swath width (2400 m) was nearly three times larger than that of the LML swath width (840 m), and the SPL data were acquired at an altitude that was five times greater than the LML. While a portion of the area analyzed was forested, Mandlburger et al. (2019) provided no details on the nature or density of the forest types present. The authors visually compared the penetration capabilities of the two instruments and found that the LML outperformed the SPL in terms of overall pulse density and ground coverage, with the SPL mainly providing single returns concentrated in the upper vegetation canopy. The average number of returns per laser pulse was 1.84 for the LML compared to 1.06 for the SPL. Mandlburger et al. (2019) speculated as to whether aggressive post-processing of the SPL data to remove noise may have also removed useable returns from within the vegetation canopy. Brown et al. (2020) compared the performance of the Leica SPL100 and the Optech Titan in a primarily urban environment of the University of Houston campus. The authors characterized the horizontal and vertical accuracies of the ALS data using 33 survey-grade GNSS checkpoints, and assessed ranging precision for flat surfaces and under tree canopies, the latter of which consisted of small groups of trees that were underlain by manicured lawns. The Leica SPL100 was acquired at an altitude of 3700 m AGL with a density of 25 pulses/m 2 , whereas the Optech Titan was acquired at 500 m AGL with a density of 12 pulses/m 2 . Both datasets were acquired within weeks of each other in leaf-on conditions and only the 532 nm wavelength from the Titan data were analyzed for comparability to the SPL100. In terms of horizontal positioning, the authors found no statistically significant differences between either of the ALS datasets and the reference checkpoints, but did find significant differences in vertical elevations indicative of systematic bias in the point clouds of both instruments. Relative to the checkpoint data, the SPL100 had an average vertical bias of 11.67 cm (standard deviation = 7.63 cm), whereas the Titan had an average vertical bias of − 13.27 cm (standard deviation = 7.88 cm). However, the authors concluded that in terms of the magnitude of the error, there was no statistically significant difference in the vertical accuracies for the LML and SPL data. In their analysis of planar surfaces, the authors found that the SPL100 had lower ranging precision (standard deviation = 3.2 cm versus 1.2 cm for the Titan), and was more negatively impacted by surface properties, specifically increases in radiometric brightness and incidence angle. These findings are not unexpected given that the SPL100 data were acquired from an altitude that was more than seven times greater than the Titan data. DEMs generated from both data sets had a mean difference of 1.4 cm, noting that the study area was relatively flat and primarily urban. In terms of the impact of canopy cover on range precision, the SPL100's range precision for flat terrain under canopy was 3.6 cm, compared to 3.0 cm in open terrain. For the Titan, range precision was 2.7 cm under canopy and 1.5 cm in open terrain. For both sensors, the difference between range precision values under canopy and in the open was found to be significantly different. Similar to Li et al. (2016) and Mandlburger et al. (2019), the authors highlight noise filtering as the likely cause of the limited range resolution in the SPL100 data.
All of the aforementioned studies provide useful insights into the potential capacity of SPL data to accurately characterize the terrain surface underneath forest canopies. What is currently lacking however is a systematic investigation from a forestry perspective that examines the nature of SPL performance for characterizing the terrain surface underneath a range of forest types and vegetation densities, as well as acquisition conditions. The objectives of this study were therefore threefold: (i) quantify the vertical accuracy and precision of terrain elevations for SPL data under a range of forest conditions (leaf-on and leaf-off) and acquisition altitudes (3800 m versus 2000 m; (ii) characterize differences in penetration capabilities for SPL data under a range of different forest densities; and (iii) evaluate how differences in point clouds and vertical accuracy translate into the derived DEMs. Our study provides the first systematic investigation of SPL data for terrain mapping under forest canopies in leaf-on and leaf-off conditions.

Study area
The Petawawa Research Forest (PRF) was established in 1918 and is the oldest, continuously operated research forest in Canada (Place 2002). Located approximately 200 km northwest of Ottawa, the PRF extends over 10,000 ha of forested land (Fig. 1). The forests of both the PRF are dominated by mature Great Lakes-St. Lawrence mixedwood stands (70% by area) with remaining stands being primarily hardwood (22%) and conifer (8%) dominated. Approximately 85% of the forest land within the PRF is considered productive and includes several longterm conifer plantation studies that are unique to Canadian forestry. With its combination of experimental plots, plantations, and nonresearch areas, the PRF represents a heterogeneous mixture of forest conditions (White et al. 2019). The topography of the PRF has been impacted by glaciation and post-glacial outwashing. The area contains extensive sand plains, imposing hills with shallow sandy soils, bedrock outcrops, and areas of gently rolling hills with moderately deep loamy sand containing numerous boulders. Elevation across the study area ranges from approximately 100 m to 310 m above sea level.

Reference ground data
Real-time kinematic (RTK) survey data were acquired as control data using the NAD83 CSRS horizontal datum Version 6 (2010.0) UTM Zone 18 projection and the CGVD2013 vertical datum (Hubert 2019). The control network consisted of three published control locations maintained by the Ontario Ministry of Natural Resources and Forestry Control Survey Information Exchange (COSINE) database, as well as eight additional control points, established as inter-visible pairs distributed within the project study area (Fig. 2). Due to the long baselines between COSINE monuments, the project control points were occupied for three hours with the roving unit. The derived coordinate files of all static positions were submitted for post-processing using Natural Resources Canada's Precise Point Positioning (PPP) system. 1 The accuracy of the RTK survey was within 2 cm for hard surfaces, with respect to the nearest control point, and 5-10 cm for natural ground surfaces.
Coordinates and ground elevation values were acquired for a total of 300 checkpoints within the PRF using a Sokkia FX 105 total station. These checkpoints represented a range of vegetated and non-vegetated conditions (Table 1). A total of 79 checkpoints were acquired in nonvegetated conditions (i.e. asphalt, gravel) and 221 checkpoints in vegetated conditions, that were characterized by the dominant tree species and or vegetation present (i.e. black spruce, coniferous plantation, intolerant hardwood, jack pine, low vegetation, mixedwood, red and white pine, and tolerant hardwood). Checkpoints were acquired along 50 or 100 m transects within each cover type group, with checkpoints located at 10 m intervals along the transects (Fig. 2).

Lidar data
Linear-mode lidar data were collected over the Petawawa Research Forest on August 17-20, 2012 using a Riegl 680i sensor on a Cessna 172 aircraft flown at an average altitude of 750 m ( Table 2). These data were acquired as full waveform, but delivered by the data provider as discrete return data in 1-km tiled classified LAS files (v1.1) in NAD83 CSRS horizontal datum, UTM Zone 18 projection, and CGVD28 vertical datum. Horizontal and vertical RMSE were reported as 14.1 cm and 5.1 cm, respectively. PDAL (version 1.9.1) was used to transform the 2012 LML LAS files to the same vertical datum (CGVD2013, Version 62,010.0) and LAS file format (v1.4) as the SPL data. The LAS tiles were manually reviewed to verify the quality of the transformation and the ground point classification.
Leaf-on SPL data were collected over the Petawawa Research Forest on July 2, 2018. The Leica SPL100 sensor was flown aboard a Piper-PA-31-350 at an average altitude of 3760 m (Table 2). SPL data were also acquired for the Petawawa Research Forest on May 31, 2019 with a Leica SPL100 instrument flown aboard a Cessna F406 Caravan II aircraft at two different altitudes: approximately 3800 m (similar to that of the 2018 leaf-on acquisition) and approximately 2000 m. As a result of a late spring, conditions for acquisition in 2019 were primarily leaf-off. All three SPL datasets were acquired according to the Ontario Specification for Lidar Acquisition (Ontario Ministry of Natural Resources and Forestry, 2016), with a 10-cm vertical accuracy class. Vertical RMSE for the 2018, 2019H, and 2019L were reported as 2.7 cm, 5.5 cm, and 5.4 cm respectively, and horizontal RMSE was reported as <15 cm. Data processing and noise filtering for the three SPL datasets were completed by the data provider, using a consistent processing workflow and noise filtering algorithms, following the approach detailed by Gluckman (2016). SPL data were then classified using proprietary software tools and delivered as 1 km tiled LAS files (v1.4) in NAD83 CSRS horizontal datum Version 6 (2010.0), UTM Zone 18 projection, and CGVD2013 vertical datum. LAS tiles were manually reviewed to verify the quality of the ground point classification and to ensure spatial alignment of the acquisitions.
LML systems are able to record multiple returns (commonly 5 returns per pulse) of energy for each laser pulse; however, the amount of energy (i.e. the number of photons) required to trigger a return at the sensor is proprietary to each instrument. In LML systems, hundreds or thousands of photons may be required to trigger a return in order to reduce the impact of noise (i.e. photons from sources other than the laser; Brown et al. 2020). In contrast, a single photon can trigger a return at the sensor for an SPL system. The SPL100 system splits the emitted laser pulse into a 10 × 10 array of 100 beamlets and the returned energy from each beamlet is recorded onto a 10 × 10 array of highly sensitive detectors. These detectors have a very short recovery time (1.6 ns), which allows them to record multiple returns from each emitted laser pulse. These sensitive detectors, combined with the high repetition rate of the transmitter, represent the operational advantage of SPL technology for lidar acquisition. As a function of their design, SPL generate higher pulse densities relative to LML. However, while conventional lidars commonly operate at near-infrared wavelengths (e.g. 1064 nm), currently commercially available SPL instruments operate at green wavelengths (532 nm) and are thus sensitive to solar noise, highlighting a need for efficient noise filtering algorithms for SPL data (Stoker et al. 2016;Li et al. 2016;Brown et al. 2020). LML data must be acquired at lower altitudes (e.g. 750 m agl versus 3800 m for the SPL) and slower flying speeds (<100 knots compared to 180 knots for the SPL) relative to SPL (Wästlund et al. 2018, Mandlburger et al., 2019, both of which can increase flying time and data acquisition costs. It should be noted however that innovation in LML systems continues to increase the upper limit of acquisition altitude for these instruments as well. The SPL instrument used in the assessments of Li et al. (2016), Swatantran et al. (2016), and Stoker et al. (2016) was the first generation HRQLS instrument. The design of the second generation HRQLS sensor (marketed as the SPL100) was modified to enable the acquisition of high point densities at altitudes greater than 3100 m (Degnan, 2016).

Digital elevation models
Digital elevation models (DEM) with a spatial resolution of 1 m were generated for each of the four lidar acquisitions using returns classified as "ground" (class = 2) in the LAS files and following the methods of Axelsson (1999Axelsson ( , 2000. DEM hillshades were visually inspected to verify quality and spatial alignment, and to ensure that there were no artifacts, anomalies, or gaps in the derived DEMs. A mask of the area common to all four lidar acquisitions was generated to restrict further analyses (representing approximately 4000 ha; Fig. 1).

Forest inventory data
Forest information was derived from a forest inventory representing 2018 forest conditions. This inventory partitioned the PRF into distinct stands (polygons) representing relatively homogenous forest conditions in terms of species composition, forest structure, and management histories. The inventory was generated using a combination of expert knowledge, air photo interpretation, ground plot data, and management history. Several attributes from the forest inventory were used to interrogate differences between LML and SPL derived DEMs including land cover type, vertical complexity class, and forest planning unit (Ontario Ministry of Natural Resources and Forestry 2017). In the forest inventory data, broad land cover types are assigned to the stands to distinguish productive forest from other vegetation types. Land cover classes considered herein were brush and alder, forest, grass, open wetland, and treed wetland. The vertical complexity class characterizes the number of distinct tree layers within the stand, ranging from single layer stands to complex stands, wherein ages and heights are not considered to originate from a single disturbance event. Integral to the definition of vertical complexity is an indication of which layer is relevant for management (in the case of more than one layer): overstory or understory. Forest planning units are defined by a combination of forest type and silvicultural management system and provide a useful indicator of vegetation arrangement within the stand. In addition to these categories, the forest inventory contained stand-level estimates of vertical canopy cover derived from the 2018 SPL data. These stand-level estimates were made at a 25 m gridcell resolution using a 50 cm spikefree canopy height model (CHM) that was generated from the 2018 SPL data. Canopy cover was determined as the percentage of CHM cells within each 25 m grid cell that had a height > 2 m, and values for all 25 m grid cells within a forest stand were subsequently averaged to generate a stand-level estimate of cover. These estimates of cover were used to inform the wall-to-wall DEM analysis.

Analysis approach
To fully explore the nature of the differences between the datasets considered, we undertook a point-based assessment with the RTK   checkpoints, as well as an assessment based on linear RTK transects, and a wall-to-wall assessment for the derived DEMs.

Assessment based on checkpoints
The survey checkpoint data and corresponding lidar point clouds were evaluated using the GeoCue LP360 software package (version 2018.2.59.0). The LML and SPL LAS files were imported into an LP360 project and using the Control Points Report Dialog, a local TIN surface was generated for each checkpoint and each lidar acquisition and the Z values corresponding to the checkpoint location were extracted from the triangulated surface. Differences between each of the RTK-measured checkpoint elevations and the corresponding lidar elevations (Δh) were then calculated and summarized using both parametric (Fisher and Tate 2006) and non-parametric metrics (Höhle and Höhle 2009). Parametric measures included the root mean square error (RMSE), mean error (ME), and the error standard deviation (S). The following equations were used to calculate RMSE, ME, and S (Fisher and Tate 2006): where Z lidar is the elevation measure from the lidar, and Z ref is the elevation measure from the RTK survey for a sample of n points. RMSE provides an indication of the dispersion of errors, whereas ME provides an indication of bias. When ME is small, RMSE and S may be similar. The Friedman ANOVA by ranks test (Hollander and Wolfe 1973) was used to test for significant differences in Δh between the four lidar acquisitions. If a significant difference was found, a Wilcoxon matched pairs test (Siegel and Castellan 1988) was subsequently used to determine which lidar datasets had significantly different RMSE values. Following the approach of Höhle and Höhle (2009), differences between the reference data and the four ALS datasets were examined to check for outliers. RMSE values calculated using Eq. 1 above were used to determine a threshold for screening of outliers. Any checkpoints with Δh values that exceeded three times the calculated RMSE value for that dataset were removed from further analysis. Each of the four ALS datasets were examined separately, using the RMSE value for that dataset, resulting in the identification of three outliers: one for the 2012 LML data (a low vegetation checkpoint with Δh = 46.4 cm), and two for the 2018 SPL data (both black spruce checkpoints with Δh = 39.4 cm and 33.0 cm). Site photos and DEM hillshades of all three outliers were examined; however, only the low vegetation checkpoint was removed as it was clearly a temporary feature resulting from forest operations in 2012 that was not present at the time of either the 2018 or 2019 SPL acquisitions. A total of 299 checkpoints were therefore retained for further analysis.
In addition to ME, RMSE, and S, non-parametric measures of elevation accuracy were also calculated, including the median (50% quantile), the Normalized Median Absolute Deviation (NMAD), the 68.3% quantile, and the 95% quantile. The median and NMAD were calculated using Δh, whereas the 68.3% (Q68) and 95% (Q95) quantile were calculated using the absolute value of Δh, as per the approach of Höhle and Höhle (2009). The median is less sensitive to outliers and is a robust indicator of any systematic shift in height values relative to the reference data. Q68 and Q95 provide an indication of the magnitude of the differences between the reference data and the ALS data, regardless of the sign of the difference (i.e. an overestimate or underestimate) and Q95 is often used to assess data according to accuracy benchmarks or standards. The NMAD is proportional to the median of the absolute differences between errors and the median error and is considered an estimate of the standard deviation of the Δh values that is less sensitive to outliers (Höhle and Höhle 2009): where Δh j denotes the individual errors j = 1, …, n and m Δh is the median of the errors. Parametric and non-parametric results were summarized overall, and by the broad cover groupings of vegetated and non-vegetated cover types. Results were also summarized by detailed cover types (Table 1).
To account for different densities of vegetation associated with each checkpoint, we calculated a Lidar Penetration Index (LPI) and assigned an LPI class to each checkpoint. The LPI is a ratio of the total number of returns found within ±15 cm of the ground surface, relative to the total number of returns, expressed as a percentage. The LPI was calculated independently for each lidar acquisition, and to enable processing, the lidar point clouds were clipped to a circular area (analogous to the area of a forest inventory field plot with a radius of 14.1 m and an area of 625 m 2 ) surrounding each checkpoint centroid. Non-vegetated checkpoints would be expected to have larger values for the LPI compared to vegetated checkpoints, whereas different lidar acquisitions would have different penetration indices for the same cover types if the lidar data were acquired in leaf-off versus leaf-on conditions. Based on the distributions of LPI values for the checkpoints, four classes were defined to enable summary of the RTK checkpoint accuracy: LPI <10%, 10-20%, 20-30%, and ≥ 30%.

Assessment based on linear transects
As the RTK checkpoints were acquired as linear transects, RMSE (Eq. 1) was also calculated and reported by transect. In total, 36 transects were acquired in different dominant cover types, with an average transect length of 50 m.

Assessment based on derived digital elevation models
To assess the impact of leaf-on versus leaf-off conditions, as well as acquisition altitude on the SPL data, the wall-to-wall DEM generated from each of the SPL acquisitions was compared to the DEM derived from the 2012 LML data. Metrics used for comparison included the mean difference (MD; Eq. 5), Root Mean Squared Difference (RMSD; Eq. 6) and the 95th percentile of the absolute difference between the SPL and 2012 DEMs.
We used the LPI in our assessment of DEMs, classifying LPI values into 10% classes. We likewise classified a slope surface generated from the 2012 LML DEM into 5 • classes for our assessment. As noted earlier, topography in the study area is undulating and the elevation range is approximately 200 m. Slopes in the study area ranged from 0 • to 85 • , with a mean of 6.6 • and a standard deviation of 5.3 • . Approximately 80% of the analysis area ( Fig. 1) had slopes <10 • , and < 1% of the analysis area had slopes that were > 25 • .

Assessment based on checkpoints
Elevation differences between the surveyed RTK checkpoints and the lidar datasets were not normally distributed (Shapiro-Wilk W p < 0.05), moreover, the magnitude of Δh were larger for the 2018 SPL data (Fig. 3). Furthermore and as indicated in Fig. 4, patterns in the elevation residuals (∆h) were not consistent among the four datasets, with differences evident in the magnitude and direction (positive or negative Δh). The most similar patterns in Δh were found between the 2019 leafoff acquisitions (R 2 = 0.633), whereas the least similarity was found between the 2018 leaf-on and 2019 leaf-off SPL acquired at 3800 m (2019H, R 2 = 0.195). Elevation differences for the 2012 LML were most similar to those of the 2019 leaf-off SPL acquired at 2000 m (2019L, R 2 = 0.303).
The overall RMSE was largest for 2012 LML data (10.47 cm) and lowest for 2019L SPL data (8.56 cm; Table 3). The mean error indicated that all datasets had a negative bias and underestimated elevations relative to the reference data. Of note, the 2018 SPL data had the lowest mean error (− 0.88 cm; Table 3), and the largest error standard deviation (S = 10.12 cm); however the 2018 SPL also had the most symmetrical Δh distribution (Fig. 3). For non-vegetated cover, RMSE was largest for 2012 LML (11.82 cm), which also had the largest mean error (− 10.91 cm) and error standard deviation (4.58 cm). It should be noted that given the amount of time elapsed between the 2012 data acquisition and the RTK survey, some results presented for the 2012 LML should be interpreted with caution, in particular those for non-vegetated surfaces. Both asphalt and gravel road surfaces within the study area are maintained for seasonal conditions, and for gravel roads in particular, this can result in minor changes in surface elevation over time. Although the 2018 SPL had the lowest non-vegetated RMSE (7.36 cm), it also had the largest vegetated RMSE (10.97 cm) and was the only lidar dataset with a positive bias (0.96 cm) for vegetation ( Table 3). Results of the Friedman test indicated a significant difference in Δh overall, and for nonvegetated and vegetated checkpoints (p < 0.05). Average elevation errors for 2018 SPL were significantly different from the 2012 LML and 2019H and 2019L SPL Δh, whereas the 2012 LML Δh were only significantly different from the 2018 SPL.
Overall, the non-parametric results (Table 4) indicate similar results to the parametric measures: the 2012 LML had the largest median ∆h (− 7.8 cm), while the 2018 SPL had the largest NMAD (11.42 cm) and largest Q95 (21.39 cm). Also similar to the parametric results reported in Table 3, the 2012 LML had the largest NMAD (4.3 cm) and Q95 (17.36 cm) for the non-vegetated cover types (Table 4). Of note, the 2019L SPL had the largest median difference for the vegetated classes (− 6.5 cm), although the 2018 SPL had the largest NMAD (11.93 cm) and Q95 (24.03 cm).
By vegetated cover type, the 2012 LML had the largest RMSEs for tolerant and intolerant hardwoods, whereas the 2018 SPL had the largest RMSE for black spruce and low vegetation, and the 2019H and 2019L SPL had the largest RMSE for mixedwood (Table 5). The Friedman test indicated significant differences in Δh for all cover types except red and white pine. Pairwise significant differences in Δh were common between the 2018 SPL and other lidar datasets, particularly the 2019 leaf-off SPL. In contrast, differences between 2019H and 2019L were rarely significant for different cover types (Table 5). Results for nonparametric measures were similar (Supplement Table S1): black spruce had the largest Q95 (29.40 cm; 2018 SPL) and mixedwood had the largest NMAD of all the vegetated classes (12.97 cm; 2018 SPL).
Overall, the median LPI for the non-vegetated checkpoints was 24.9%, 34.2%, 37.1% and 39.7% for 2012 LML, 2018 SPL, 2019H SPL, and 2019L SPL, respectively. This compares to the median value for vegetated checkpoints, which were 7.9%, 13.7%, 18.2%, and 17.9% for LML, 2018 SPL, 2019H SPL, and 2019L SPL, respectively. Representation of the vegetated RTK checkpoints relative to the LPI classes varied by lidar acquisition (Fig. 5). The 2018 SPL had the largest number of checkpoints with <10% of lidar returns coming from within ±15 cm of the ground, followed by the 2012 LML, which also had the fewest number of checkpoints that had ≥30% of returns coming from within ±15 cm of the ground surface. As expected, the leaf-off SPL acquisitions resulted in more near-ground returns and larger LPI values; however, there were minimal differences in the distribution of checkpoints among the LPI classes for both of the leaf-off 2019 SPL acquisitions, despite differences in acquisition altitude (Fig. 3). There were no consistent trends in Q95 or RMSE with increasing LPI (Table 6). For the 2012 LML and 2018 SPL data, RMSE and Q95 were larger when LPI was <10% and smaller when LPI was ≥30%. For the checkpoints, an increasing LPI did not translate into greater accuracy in estimating terrain elevation.

Assessment based on linear transects
Assessments of the 36 linear transects indicated variability in terrain elevation accuracy within cover types (Supplement Table S2). We found no clear trend in error associated with the range of elevation covered by the transect or the transect length. Some of the transects have relatively consistent level of errors across all four lidar datasets considered herein (e.g. C5, D2, D5), whereas other transects revealed marked differences in performance between lidar datasets (e.g. E1, F2, G2). Overall for the vegetated transects, the 2018 SPL had the lowest accuracy and the greatest range in RMSE (mean = 10.06 cm, standard deviation = 5.10 cm) and Q95 (mean = 16.31 cm, standard deviation = 7.28 cm). Fig. 6 shows the point cloud profile for transect E1, which was located in a black spruce stand (also shown in Fig. 2, Inset A). Transect E1 had the largest RMSE for the 2018 SPL data (Supplement Table S2). The lack of ground returns underneath the dense understory in the SPL 2018 compared to the other lidar datasets is clearly evident in Fig 6.

Assessment based on derived digital elevation models
Overall, the DEM derived from the 2018 SPL data had the largest differences relative to the 2012 LML DEM, overestimating elevation by an average of 7.44 cm, an RMSD of 18.07 cm, and a Q95 of 36.03 cm (Table 7). Of note, overall results for the 2019H and 2019L acquisitions were very similar, despite differences of ~1800 m in acquisition altitude. RMSD also varied by broad cover type from the 2018 forest inventory data. Again, the 2018 DEM had the largest RMSD and Q95 for the majority of classes, with the exception of the open and treed wetland cover types, for which the 2019L DEM had the largest RMSD and Q95 values (Table 8). Whereas the 2018 SPL data overestimated elevation for all cover types, both of the 2019 SPL underestimated elevation relative to the 2012 DEM. Of note, the RMSD and Q95 were lowest for the forest cover type. The 2018 SPL in particular had large differences for grass (RMSD = 29.11 cm), whereas the all three SPL datasets had large Table 3 Parametric measures of checkpoint elevation residuals (Δh) between RTK survey data and LML and SPL data for overall, non-vegetated, and vegetated cover types as per Fisher and Tate (2006). Letters represent the results of the Wilcoxon matched pairs test and indicate which lidar datasets had significant differences in ME Δh (p < 0.05).   differences with the 2012 LML DEM in the open and treed wetland, likely as a result of seasonal differences in vegetation and soil moisture for these cover types. RMSD and Q95 increased with increasing slope, with the 2018 SPL data having larger differences in elevation relative to the 2012 LML data by slope class, while the results for the 2019H and 2019L SPL data were similar (Table 8). In contrast, differences between the 2012 LML and SPL DEMs did not increase with increasing canopy cover (Fig. 7). For the 2018 SPL DEM, when canopy cover increased from 0 to 10% to 11-20%, RMSD and Q95 increased 3 cm and 8 cm, respectively. Beyond 20% canopy cover, MD, RMSD, and Q95 were relatively stable. The 2018 SPL data had larger RMSD and Q95 values than either of the 2019 leaf-off SPL acquisitions (Fig. 7).
We also examined differences in DEMs by LPI. As indicated in Fig. 8, at <10% canopy cover, average LPI were similar for leaf-on and leaf-off SPL data. However, as canopy cover increases, LPI decreases, and at >60% canopy cover, LPI decreases more for the 2018 leaf-on SPL data, while the LPI for the leaf-off acquisitions converge. At 91-100% canopy cover, the mean LPI for the 2018 SPL data is 6%, compared to ~20% for the leaf-off SPL acquisitions (Fig. 8). In terms of differences between DEMs, results for LPI were similar to results for canopy cover with no consistent trend in RMSD and Q95 with increasing LPI. Whereas the 2018 SPL DEM consistently overestimates elevation relative to the 2012 LML DEM, the leaf-off SPL data tend to underestimate elevation relative to 2012 until LPI > 70% (Supplement Table S3).
Vertical complexity classes from the forest inventory data were used to gain some insight into the impact of vegetation configuration on DEM differences. We found that the largest differences relative to the 2012 LML DEM for all data sets were not associated with the most complex vertical structure class, but rather with single layer stands that had residual veterans in the overstory: all SPL DEMs overestimated elevation relative to the 2012 LML DEM for this stand type and had the largest RMSD and Q95 values (Table 9). For the 2018 SPL, the complex stand type was the next most challenging vertical structure, overestimating elevations by 8.92 cm and having an RMSD of 18.15 cm. Generally, the leaf-on SPL overestimated terrain elevations relative to the 2012 LML DEM, whereas the leaf-off SPL data underestimated elevations and the difference between leaf-off SPL data acquired at different altitudes was minimal.
Forest planning units, defined by a combination of forest type and silvicultural management system, are another means by which to categorize structural difference between forest stands and assess the resulting impact of vegetation configuration on the penetration of the laser pulses through the canopy and associated impacts on terrain  accuracy. By this stratification, cedar stands managed via clearcutting had the largest RMSD and Q95 values, followed by tolerant hardwood uniform shelterwood. In contrast, RMSD for jack pine clearcuts was <9 cm for all DEMs, with relatively consistent differences across all three SPL acquisitions (Table 10). The configuration and composition of the overstory and understory vegetation influences the degree to which lidar pulses can penetrate the    canopy and reach the ground surface (e.g. as per transect E1 in Fig. 6). Fewer ground returns can increase the error in DEM interpolation methods and result in artifacts in the derived DEM (Liu 2008); however, as demonstrated by Hodgson and Bresnahan (2004), increased interpolation may actually result in improved elevation accuracy under some cover types as a function of small neighbourhood smoothing under vegetation types where the number of ground returns may be more variable (e.g. deciduous). As indicated in the results presented herein, Fig. 8. Average LPI by canopy cover class.  the 2018 SPL leaf-on data had fewer ground returns, lower values for the LPI, and larger RMSE for the checkpoint analyses. Fig. 9 illustrates how these factors translate into the derived DEMs. The inset maps in Fig. 9 show an area in a single layer stand with a residual veteran overstory ("SV"; Table 9) in a white pine seed tree stand (Table 10). Larger interpolated facets are clearly evident in the 2018 SPL DEM (Fig. 9). Fig. 10 shows the variation in the number of pulses within ±15 cm of the ground surface (i.e. used to calculate LPI) for the inset area in Fig. 9, with the notable sparseness of the 2018 SPL leaf-on data.

Assessment based on checkpoints and transects
Point-based surveys are often undertaken to quantify the absolute accuracy of lidar acquisitions relative to some pre-determined standard (ASPRS (American Society for Photogrammetry and Remote Sensing), 2014). Such point-based assessments are however rarely undertaken in forest environments across a range of different forest conditions and as a result, these assessments fail to provide insight as to how different densities and configurations of vegetation may be impacting the degree to which lidar data can accurately characterize the terrain surface. Herein we provided a comprehensive assessment of the impact of vegetation on the accuracy of four different lidar acquisitions, in order to determine how the characteristics of these data influence the accuracy of the derived terrain information, with particular emphasis on the influence of leaf-on and leaf-off conditions on the SPL data as well as the impact of acquisition altitude.
Acquisition parameters influenced the characteristics of the lidar data examined herein ( Table 2). The 2018 SPL data were acquired in leaf-on conditions from an altitude of 3800 m and had an aggregate nominal pulse density (ANPD) of 32.4 pulses/m 2 . This ANPD is slightly greater than the ANPD of the 2019H SPL data (28.6 pulses/m 2 ), which were acquired at the same altitude as the 2018 SPL, but in predominantly leaf-off conditions. The 2019L SPL was also acquired in predominantly leaf-off conditions, but from an altitude that was almost half that of 2019H (2000 m). The 2019L ANPD was 51.4 pulses/m 2 . The 2012 LML data had a markedly lower ANPD than the SPL at 5.8 pulses/ m 2 . Another key difference in the datasets was that 88.3% of returns for the SPL 2018 data were first returns, compared to 58.4% and 46.4% for the SPL 2019H and 2019L respectively, and only 17.1% for the 2012 LML data.
For the vegetated classes, which are the primary target of interest for this study, the results of the checkpoint assessment indicated that the leaf-on 2018 SPL was the least accurate and least precise data source for characterizing surface elevations under vegetation among the lidar datasets examined in this study. For vegetated classes, the 2018 SPL had the largest RMSE (10.97 cm) and the largest error standard deviation (11.31 cm; Table 3), as well as the largest NMAD (11.93 cm) and Q95 (24.03 cm; Table 4). However, the accuracy with which the 2018 leaf-on SPL data captures the terrain surface is within acceptable limits of published standards for a 10-cm vertical accuracy (i.e. Canada's CQL1, USGS' QL2). Leaf-off SPL acquisitions resulted in improvements in elevation accuracy, with a 17% reduction in RMSE for the 2019 leaf-off SPL data acquired at the same altitude as the 2018 leaf-on SPL (3800 m). However, a reduction in flying altitude had less impact on elevation accuracy, resulting in a reduction in RMSE of only 8% for the 2019 leafoff SPL acquired at 2000 m agl, relative to the higher altitude SPL leafoff acquisition (Table 3). Stoker et al. (2016) assessed the Q95 for non-vegetated and vegetated checkpoints for HRQLS data acquired from an altitude of 2293 m agl in leaf-on conditions (ANPD = 23 pulses/m 2 ) and compared results to those of leaf-off LML data. Using 31 checkpoints, the authors reported a non-vegetated Q95 for the HRQLS data of 17.2 cm (LML = 12.3 cm), and a vegetated Q95 of 17.4 cm (using 17 checkpoints; LML VVA = 19.8 cm). In contrast to the results of Stoker et al. (2016), we found that the 2012 LML leaf-on data assessed herein performed more poorly than the 2018 leaf-on SPL data in non-vegetated conditions (79 checkpoints, Q95 = 17.36 cm (LML) versus 14.18 cm (2018 SPL)). However, as noted earlier, road surface maintenance can alter terrain surface elevations, particularly for gravel roads. The LML was more accurate than the 2018 SPL in vegetated conditions (220 checkpoints, Q95 = 18.11 cm for LML versus 24.03 cm for 2018 SPL; Table 4).
Overall, the RMSE we report for the RTK checkpoint assessment of the three SPL datasets considered herein ranged from 8.56 cm to 10.14 cm (Table 3), which is markedly lower than the 3.78 m RMSE reported by Swatantran et al. (2016) for the HRQLS SPL data (the predecessor of the SPL100 instrument used to acquire the data examined herein). Our assessment approach is most similar to that of Brown et al. (2020), who also used checkpoint data (N = 33) to assess SPL100 data, and reported that the SPL data overestimated elevation by an average of 11.67 cm (standard deviation = 7.63 cm). Average errors or RMSE alone however can be misleading, underscoring the importance of considering a number of different parametric and non-parametric measures to assess elevation accuracy (Höhle and Höhle 2009). For example, similar to the results of Brown et al. (2020), our checkpoint assessment also indicated that the 2018 SPL100 data acquired in leaf-on conditions overestimated elevation in vegetated areas, but only by an average of 0.96 cm, whereas the 2019H and 2019L leaf-off SPL100 underestimated elevation by 5.17 cm and 5.66 cm respectively, and the 2012 LML underestimated elevation by an average of 4.92 cm (Table 3). Metrics such as the RMSE, elevation standard deviation, Q95, and NMAD enable a more comprehensive assessment of data performance and by all these measures, the 2018 leaf-on SPL data were the least accurate and least precise data source for characterizing terrain in vegetated areas (Tables 3 and 4).
However, it must also be noted that the 2018 SPL Q95 value for the vegetated classes are within the accepted level of error for vegetated classes (ASPRS (American Society for Photogrammetry and Remote Sensing), 2014). Indeed all the SPL datasets assessed herein were within the 10-cm vertical accuracy class, with Q95 < 30 cm for vegetated classes (Table 5). Brown et al. (2020) also assessed the capacity of SPL100 and Optech Titan to penetrate the tree canopy and record multiple returns from within the canopy. The authors sampled 30 tree canopies (the size and composition of which were not reported). Only 22% of emitted SPL100 pulses had multiple returns compared to 87% of the Titan's emitted pulses. Overwhelmingly, the SPL100 pulses had 1 or 2 returns only, with less than 1% of pulses having 3 or more returns, whereas 50% of the Titan's emitted pulses had 3 or 4 returns. For the SPL100, the majority of second returns were from the ground. We found similar characteristics for the leaf-on SPL 2018 data assessed herein. For example, the ratio of first to second returns was 17.8 for the SPL 2018, compared to 4.1 and 2.92 for the SPL 2019H and 2019L, and 1.6 for the 2012 LML (Table 2). This concentration of returns from the upper canopy in the 2018 SPL leaf-on data were likewise indicated by the LPI for the 2018 SPL data: 73% of the vegetated checkpoints used in this analysis had <10% of their returns from within ±15 cm of the ground surface, whereas the 2012 LML data had 58% of checkpoints in this LPI class, and the leaf-off SPL data had less than 7% of checkpoints in this class. Note that the overall ground pulse density for the 2012 LML data (which was also leafon) is less than half that of the 2018 SPL data (Table 2). Brown et al. (2020) posited that the higher ground pulse densities of the SPL100 under vegetation were a function of more pulses being emitted initially, and not the result of superior penetration by the SPL pulses vertically through the vegetation canopy. Yu et al. (2020) similarly found that SPL100 data had a higher relative number of ground returns compared to Optech Titan, but that Titan data had a higher proportion of ground returns in multi-layer stands. Mandlburger et al. (2019) found that the SPL100 had greater dispersion than the LML used for comparison, with SPL terrain elevations having a standard deviation that was three times that of the LML. While we did not observe the same magnitude of difference in the standard deviations of the LML (S = 8.01 cm) and the 2018 leaf-on SPL (S = 11.31 cm), we did find that the 2018 SPL had the largest error standard deviation of all the lidar datasets (Table 3), as well as the largest NMAD (Table 4). Note that for the vegetated classes, the NMAD of the leaf-on 2018 SPL data (11.93 cm) was 2.5 times larger than the NMAD of the 2019 leaf-off SPL data acquired from 2000 m (4.67 cm). Several authors highlight potential issues with noise filtering algorithms (Li et al. 2016, Mandlburger et al., 2019, and Brown et al. 2020). All of the SPL data used in our study was filtered by the data provider using the same processing workflow and algorithms, and we found no evidence of systematic issues in misclassification of ground returns or anomalous canopy heights. The characteristics of the SPL data summarized in Table 2, particularly the concentration of first returns in the 2018 leafon data may in part be attributable to aggressive noise filtering by the data provider; however differences in the LPI between the 2018 leaf-on and 2019 leaf-off data acquired from the same altitude (3800 m agl) suggest that there are penetration challenges in leaf-on SPL data ( Fig. 8 and Supplement Table S3).
Analyses by vegetation type provided useful insights on the variability in elevation accuracy. Unique to the 2018 SPL data were challenges with black spruce (RMSE = 14.69 cm; Q95 = 29.40 cm) and low vegetation (RMSE = 13.55 cm; Q95 = 24.74 cm), which had the largest RMSE and Q95 values. Mixedwood was a common challenging class for all the SPL datasets, having among the largest RMSE (Table 5) and Q95 (Supplement Table S1) values. Mixedwood stands are complex assemblages of vegetation and crown forms, often with dense understory. Simpson et al. (2017) likewise found that low stature undergrowth (e.g. ferns) caused the greatest errors in elevation (RMSE >1 m) in leaf-on LML-derived DEMs. However in contrast to the results of Simpson et al. (2017), we did not find a clear relationship between understory density and elevation residuals, as indicated by the results we report for the LPI analyses for the both the checkpoints (Table 6) and the wall-towall DEM comparisons (Supplement Table S3; Fig. 8). This result suggests that in some forest types, it is not solely the density of the understory vegetation, but also the complexity and configuration of the vegetation throughout the vertical profile of the canopy that influences elevation accuracy.
Analysis of the survey transects revealed variability in accuracy within cover types, as well as challenges for the 2018 leaf-on SPL data for obtaining ground returns (Fig. 7). Despite differences in acquisition altitude, results for 2019H and 2019L were very similar. Although the results for vegetated classes overall are significantly different for 2019H and 2019L (Table 3), the only vegetated classes for which these two datasets had significantly different Δh values were for black spruce and jack pine. Thus, despite the 2019L SPL data having a ground pulse density that is ~30% greater than the 2019H data, the gain in accuracy for elevation capture is much smaller (decrease in RMSE of 0.74 cm) than the gain achieved for the leaf-off 2019H compared to the 2018 leafon SPL data (decrease in RMSE of 1.89 cm). Moreover, the swath width of 2019L SPL data was half that of the 2019H SPL (Table 2), which would likely reduce any operational gains in acquisition efficiency for large areas that would be afforded by the SPL data, with comparatively small gains in elevation accuracy.

Assessment based on derived digital elevation models
To better understand how differences in the lidar data examined herein translated into the DEM, we compared the 1 m DEMs derived from the leaf-on and leaf-off SPL data to the DEM generated using the 2012 LML data. It should be noted that the method used to generate the DEM can influence outcomes (Bater and Coops 2009); however, benchmarking has revealed that all methods of interpolation can achieve similar levels of accuracy with proper calibration (Stereńczak et al. 2017). Herein, DEMs were generated for each lidar dataset using the same approach and the same parameter settings. Li et al. (2016) compared 1 m DEMs derived from LML and SPL data, with their SPL data acquired using the HRQLS in leaf-off conditions from an altitude of 2286 m agl. The HRQLS data had a pulse density of 10 pulses/m 2 , with 35.23% of returns classified as ground returns, a ground pulse density of 3.6 pulses/m 2 , and a ground pulse spacing of 0.52 m. The authors noted that these statistics were biased by a small non-forest portion of their study area, and for the area that was actually forested, the ground pulse density was more in the order of 1-2 pulses/m 2 with "the vast majority" of the returns coming from the forest canopy. For the LML data (sensor not specified) used by Li et al. (2016), approximately 31.51% of the LML returns were from the ground surface, resulting in a ground pulse density of 0.26 pulses/m 2 with a pulse spacing of 1.96 m. The authors reported that the leaf-off SPL data overestimated elevation by an average of 21.4 cm (standard deviation = 35.5 cm) relative to the leaf-on LML data. Li et al. (2016) concluded that although the SPL data had a pulse density that was 10 times that of the LML, this increased density was concentrated in the canopy and not on the ground surface, indicating a need to revisit filtering algorithms and to further explore and quantify the penetration capability of SPL in leaf-on conditions.
In our study and similar to Li et al. (2016), we used the DEM generated from the 2012 LML as a reference and assessed the relative differences between the DEMs derived from the various SPL acquisitions, allowing us to determine the impact of leaf-on versus leaf-off conditions and acquisition altitude on the characterization of the terrain surface under varying vegetation conditions. However, our analysis of the DEMs does not allow us to say which of the SPL acquisition is more accurate, as the 2012 LML data itself cannot be assumed as truth (as we learned from the checkpoint analysis reported in Tables 3  and 4). In our relative comparison of the SPL DEMs, we found similar results from the two DEMs generated from the leaf-off SPL data acquired at different altitudes. In both cases and in contrast to Li et al. (2016), the MD in elevation relative to the leaf-on 2012 LML DEM, was less than 1 cm, the RMSD was ~13.3 cm and Q95 was ~25.5 cm (Table 7). We found that the leaf-on SPL data acquired in 2018 had the largest MD, overestimating elevation relative to the 2012 LML by 7.44 cm, with an RMSD of 18.07 cm and Q95 of 36.03 cm (Table 7). Similar to the results of the checkpoint and transect analyses, differences between the leaf-on and leaf-off SPL acquisitions were larger than the differences between the two leaf-off SPL datasets acquired at different altitudes.
We found that differences between the LML and SPL DEMs were generally lowest for forest vegetation types and largest for open wetland. Whereas the 2018 SPL DEM consistently overestimated terrain elevation relative to the LML DEM, the 2019H and 2019L SPL consistently underestimated elevation in vegetated cover types (Table 8). The 2018 SPL DEM had the greatest differences with the LML DEM for grass (RMSD = 29.11 cm, Q95 = 66.30 cm), overestimating terrain elevations by an average of ~20 cm. As a seasonal cover type, grasses were likely markedly different at the time of the leaf-off SPL data acquisition. The leaf-off SPL DEMs had the largest difference for the open wetland, underestimating elevation relative to the LML DEM by 5.73 cm (2019H) and 7.15 cm (2019L). Again, there are important seasonal differences in these cover types: both open and treed wetland can have saturated soils and/or a thick layer of sphagnum moss that can fluctuate in elevation as water levels change. Freeze and thaw cycles can likewise change surface elevations over time and through the seasons.
From a forest management perspective, several stand conditions were particularly challenging and these included stand types associated with standing surface water and saturated soils or stands that have dense understory vegetation above the ground surface. Examples include black spruce (Table 5, Supplement Table S1,and Fig. 6), open and treed wetland (Table 8), and cedar clearcuts (Table 10). The RTK survey used herein was acquired in 2019, a year in which the growing season started very late relative to seasonal norms and conditions were very wet, which made it challenging to determine true ground level in black spruce stands in particular. Increasing canopy cover did not result in a concomitant increase in RMSD with the LML DEM (Fig. 7). Wästlund et al. (2018) postulated that SPL100 data did not saturate as quickly with increasing canopy cover compared to Optech Titan data and was better able to characterize denser vegetation types. Our results would indicate that it is not just the density, but rather the composition, configuration, and density of both the overstory and understory vegetation that influences the performance of the lidar for characterizing the terrain surface, as exemplified by the RMSD values for the different vertical complexity classes (Table 9) and different forest planning units (Table 10). More complex management scenarios (e.g. tolerant hardwood uniform shelterwood) present very different configurations and densities of vegetation compared to more simplistic scenarios (e.g. jack pine clearcut). Moreover, the degree to which vegetation configuration and density impacts the accuracy of the terrain characterization also depends on the nature of the terrain itself, as we also found increasing differences between LML and SPL DEMs with increasing slope (Table 8): RMSD and Q95 doubled for all SPL datasets when considering slopes <5 • compared to slopes >25 • . The terrain in our analysis area is undulating with a 200 m range in elevation and 80% of the analysis area having slopes that are <10 • . The performance of SPL data in more complicated topographic contexts and on steeper slopes merits further investigation, as the impacts of topographic conditions on LML data have been well documented in the literature (Hodgson et al. 2005;Su and Bork 2006;Bater and Coops 2009;Gatziolis et al. 2009;Stereńczak et al. 2017).

Conclusion
Accurate capture and characterization of the terrain surface is an important information need for natural resource management, particularly for forests. Lidar has become the benchmark standard for generating accurate DEMs in forested areas, and the emergence of new lidar technologies, such as SPL, which enable rapid and efficient large area data collections, may lead to lower acquisition costs. However, these new technologies operate and capture information differently than conventional (LML) sensors. Determining the degree to which these differences impact the accuracy of terrain capture was the primary objective of this study. Our results demonstrated that leaf-on SPL data were less accurate than leaf-on LML data in capturing elevation data under a range of vegetation conditions. Our results also demonstrated that leaf-off SPL data were more accurate than the leaf-on LML data we used herein, and also more accurate than leaf-on SPL data acquired at the same altitude. However, we found that the gains in accuracy afforded by acquiring leaf-off SPL data at a lower altitude were minor by comparison. Our results demonstrate that the accuracy with which terrain elevations can be captured in vegetated conditions is not influenced solely by the density of the overlying vegetation but by the configuration and composition of the vegetation throughout the vertical profile of the canopy. For the terrain conditions of our study area, SPL technologies, whether they are acquired in leaf-on or leaf-off conditions, were capable of characterizing the terrain surface with the required level of accuracy to support forest management applications.