Retrieval of Forest Structural Parameters from Terrestrial Laser Scanning: A Romanian Case Study

Research Highlights: The present study case investigates the differences occurring when tree’s biophysical parameters are extracted through single and multiple scans. Scan sessions covered mountainous and hill regions of the Carpathian forests. Background and Objectives: We focused on analyzing stems, as a function of diameter at breast height (DBH) and the total height (H), at sample plot level for natural forests, with the purpose of assessing the potential for transitioning available methodology to field work in Romania. Materials and Methods: We performed single and multiple scans using a FARO Focus 3D X130 phase shift terrestrial laser scanner at 122 kpts and 0.3:0.15 mm noise compression ratio, resulting in an average point density of 6pts at 10m. The point cloud we obtained underpinned the DBH and heights analysis. In order to reach values similar to those measured in the field, we used both the original and the segmented point clouds, postprocessed in subsamples of different radii. Results: Pearson’s correlation coefficient above 0.8 for diameters showed high correlation with the field measurements. Diameter averages displayed differences within tolerances (0.02 m) for 10 out of 12 plots. Height analysis led to poorer results. For both acquisition methods, the values of the correlation coefficient peaked at 0.6. The initial hypothesis that trees positioned at a distance equivalent to their height can be measured more precise, was not valid; no increase in correlation strength was visible for either heights or diameters as the distance from scanner varied (r = 0.52). Conclusions: With regard to tree biophysical parameters extraction, the acquisition method has no major influence upon visible trees. We emphasize the term “visible”, as an increase in the number of acquisitions led to an increased number of detected trees (16% in old stands and 29% in young stands).


Introduction
Forest inventories are the main way of deriving information on forest stand characteristics for a range of fields including biometrics, forest monitoring, physiology and for appraising forest biodiversity or the success of forest management practices. Over the last decade, technological advances in terrestrial laser scanning (TLS) have allowed the development of highly portable, accurate and increasingly cost-effective sensors. Similarly, we can observe an increase in data processing solutions capable of segmenting and reconstructing accurate individual tree models [1][2][3]. This facilitated the use of such close range remote sensing technologies for describing forest stands structural parameters [4][5][6] and biophysical parameters [7][8][9].
TLS techniques facilitate virtual revisiting of the same forest stand once it has been scanned. This allows analyzing and testing hypotheses beyond those which initially led to the acquisition of the data. In addition, research and knowledge on TLS processing in forest ecosystems can be applied to predating sets of data [10]. Multiple scans processing [11,12], and several TLS scanning distances [7], have previously been used to assess the quality of tree parameter extraction, sometimes even using a species-specific approach [13]. Most of the studies on the extraction of forest biophysical parameters from TLS data focused either on plantations or on small areas of natural growth forest, and analyzed various size plots (from sample plot to forest stand level) [14,15].
The aim of our study is to assess the potential of transitioning the theoretical methodologies (e.g., point cloud classification, segmentation, voxelization, 3D reconstruction) towards a forest inventory oriented tool, specific for the Romanian forest monitoring activities. What was previously used for single tree scanning (as a means of analyzing the forest ecosystem in a controlled environment) needs to be transferred to full-scale fieldwork by overcoming the construction limitation of the TLS (narrow field of view) in the most productive manner. Besides the limitations of the technology itself, there is also a reluctance towards assimilating such close-range ground-based remote sensing approaches in Romanian forestry, that should be taken into account. In order to reach our aim, and asses the level to which the expert knowledge in terrestrial laser scanning can be passed to personnel from other areas, we had to restrict the acquisition, processing and interpretation of the data to preexistent algorithms and software. This is a major factor in the acceptance of the research as a tool in areas that might benefit from it.
Multiple studies have focused on the detection of biophysical parameters at both stand and individual scale, with major findings. Heinzel and Huber [16] reached close to 100% identification of stems position by focusing on close range segmentation; similar results are reiterated in several other papers [17] [14]. With similarly good results, studies on biophysical parameters estimation covered both diameter [18][19] [20] and heights evaluation [21] [22]. There are still some factors that constrain the applicability of the technology, leading to the impossibility of gaining such results for all interest aspects (e.g., diameter, height, volume, crown area). But even so, research covering the aspect of combining point clouds from multiple positions [23] or that of pre-designing the acquisition network [24] as compensating measures. As such, for most of the intended goals, we followed and extended the previous work addressing both tree and stand level [25,26], forest type influence [27] and even terrestrial laser best practice guides [28,29].
We investigated and quantified the differences that occur when reconstructing trees at sample plot level from single scan versus multiple scans in deciduous and coniferous dominated forests. We also analyzed the influence of the distance variance between the scanner and the furthest targeted stem in both single and multiple scans. As a result, our case study serves as a good practice guide with focus on Romanian forest ecosystems, by addressing the subject at country specific reach level. By evaluating the degree to which the above-mentioned results and approaches can be reached through pre-existing software[30-37] and mid-range field equipment [35], we can facilitate the acceptance of terrestrial laser scanning as a tool in Romanian forestry. The current study case needs to answer the following questions: can available tree reconstruction methods be applied at a larger scale in natural forest stands, by non TLS-centric users? What are the limitations of single and multiple scanning approaches when increasing the study area-can this limit in any way the extent to which the technology can be used in the field? Is the extra effort implied by multiple scans reconstruction translated in an increase in quality or quantity of the results?

Study Area
We have conducted field inventory activities in twelve circular plots with a radius of 15 m each, grouped in two sets. The plots were set in the mountainous and Subcarpathian hills' forests in central and southern Romania ( Table 1). The plots were selected by taking into consideration the species and age class (O-old, Y-young), namely sessile oak (So), Quercus petraea (SoO 1, 2, 3; SoY 1, 2, 3) in the hill region and norway spruce (S), Picea abies (SO 1, 2, 3; SY 1, 2, 3) in the highlands (Figure 1). We selected the species for their representativeness within the Romanian forest ecosystem, allowing for the assessment of the studied methods in both deciduous and coniferous plots of various structures.   5 Measurements along the isoline from distances equal to the height of the tree; 6 Ratio between canopy area without crowns overlap and plot area; 7 Sessile oak young (Quercus petraea); 8 Sessile oak old (Quercus petraea); 9 Spruce young (Picea abies); 10 Spruce old (Picea abies).

Reference Measurements
We focused reference field measurements on recording diameter at breast height (DBH), height (H) and position of each tree (XYZ). The DBH values were computed as an average between two caliper measurements taken at right-angles and for the heights a digital hypsometer was used. In order to determine the position of each tree, FieldMap [38] (Integrated GIS field software coupled with electronic mapping and dendrometrics sensors) measurements were carried out. These measurements will underpin the validation of TLS processing results.

Single terrestrial laser scan
We performed a single terrestrial laser scan from the center of each plot with a FARO Focus 3D X130 phase shift terrestrial laser scanner at the same time with the above-mentioned field measurements. Allowing 8 µs per scan point for each of the 44.4 million samples of the scan, increased the signal strength in recording reflectance values. We considered the scan origin as the central position, as this was also used as a reference in the field measurements.

Multiple Terrestrial Laser Scans
The multiple TLS approach aimed at increasing the number of points whilst simultaneously standardizing their density through a five-scan composition. This was done to estimate whether the difference in acquisition method translates to an increase in the number of identified trees and furthermore in the sample plot reconstruction accuracy. Also, by scanning the same plot from several directions, shadow cones were mostly eliminated.
Forest inventory procedures regarding height measurement require a viewing distance equivalent to the height of each tree projected along the local terrain contour. Based on this and the general aim of running the analysis at the plot level, the position of each scan was set at the 15 meters mark. This reduced the risk of missing any of the peripheric stems. We ran four scans from radial positions, alongside a central scan. To compare the results from different plots, we established a repeatable methodology -to position the scans on the directions of the compass points ( Figure 2). Seven spheres (0.07 m radius) were used as coregistration markers, as their shape suffers no warp with the change of the scan perspective [39]. The positioning at variable heights raised them above the understory vegetation. Their increased visibility translated to increased coregistration confidence. Each sphere was placed so it could be recorded from the central scan and from at least two more stations.
Previous to starting any of the scans, we set all stations to ensure maximum visibility. This required superficial pruning, especially in young spruce sample plots, increasing the time allocated to one scan from ~8 min to almost 30 min.

Terrestrial Laser Scans Pre-Processing
Prior to the reconstruction stage, the raw scans were coregistered. We connected individual scans using a semi-supervised target-based approach, using the spheres as references, followed by a cloud to cloud control stage [35]. This extra level of control implied refining of the rough positioning by semi-supervised sphere identification, based on onboard inclinometer and compass data. The cloud to cloud optimization was done on a cropped, evenly sub-sampled point cloud (0.08 m density threshold). Coregistration precision values were within tolerances (0.02 m/pair of scans-strong coregistration; 0.02-0.04 meters-medium-strength coregistration; >0.04 meters-weak coregistration) [35], averaging between 0.004 and 0.018 m. Any error exceeding the accepted tolerance for biophysical parameters (0.02 m) required supervised coregistration ( Table 2). After exporting the coregistered data as a coherent point cloud, the ground defining points were separated. This simplified subsequent processes by concentrating only on the vegetation. The process was an iterative one, with a three-dimensional (3D) window of 0.2 m height and a footprint of 0.8 square meters. A threshold of 128 points per grid cell (for the selected grid size) was chosen [40]. Noise was removed from the ground cloud by analyzing the normal orientation of each point, based on six nearest neighbors. The filtered cloud was then interpolated to generate a digital terrain model (DTM) of 0.1 m resolution.
The last step before reconstruction was cloud segmentation or classification of the point cloud into clusters that define individual trees. To facilitate the segmentation stage, several cleanup procedures have been undertaken. A simple way to bypass the understory vegetation, and the lowlying branches was to create a section of defined height (0.3 m) at a constant elevation above the DTM (2 m). This spatial sub-sample aimed to simplify the stem identification stage. Noise was removed from the resulting slice using the before mentioned method by changing the reference surface from a theoretically horizontal ground to a surface closer to vertical for the trunks. A fast spatial clustering based on the work of Hackenberg et al. [7] was applied to segment trunk seeds. We defined a seed as any cluster of points containing at least 0.1% of the total point cloud, but not less than 400 points. We separated the output into two classes through Euclidean clustering, based on the distance between points (0.10 m threshold). Out of the two resulting classes, one was characterized by a lower number of clusters with a higher percentage of representation in the original point cloud, whilst the other had a significantly higher number of clusters, with a reduced percentage of the total. Starting with these seeds, the clusters were expanded in the original vegetation point cloud in a vertical direction. Slices were generated at 0.10 m intervals and the same Euclidean clustering was applied. Instead of expanding the clusters based solely on the distance between points, the distance to the previous seed was also taken into consideration. A cylinder was fitted to each newly created cluster. If the projected center of the fitted cylinder fell inside the original seed footprint and its specific radius was a solution of equation 1 (n = 1.2-as an average of the minimum/maximum values provided by Bitterlich, 1978), then the cylinder was considered as being part of the same tree. Clusters with less than 10 points or with a radius exceeding the tolerances were dropped and merged in a residual point cloud to be reprocessed in the following repetitions. The iteration continued until it reached the end, by either hitting the lowest Z value or by not generating any clusters that met the conditions above.
h-height rh-radius at height h n-geometry type coefficient x-surface of revolution generator Equation 1 resulted from the initial assumption that a tree trunk can be reduced to a 3D primitive (the simplest geometric object that can represent the stems). But the structure of a stem, with a concave base and a slight convex middle accentuated towards the top, better resembles a neiloid. By not considering this taper, cylindrical representations would generate unrealistically large volumes and tolerances; a paraboloid would be the next best geometry to represent the volume of a stem, with cones midway between the tapered shapes. Neiloids would yield the least excess-volume of all the considered shapes ( Figure 3). A theoretical n value would vary between 0 for a cylinder and 3 for a neiloid. However, these values do not lead to realistic results. The literature recommends an n value between 0.9 and 1.5 to best define the real shape of a stem [41]. Two filters have been applied, addressing spatial extent and density. In the first case, clusters with a height of below 3 m or above 60 m have been removed. For point density, clusters with less than 15,000 points, were removed. Both processes were performed using CloudCompare (scalar fields -compute stat. param)[30] and Quantum GIS tools (vector geometry -bounding boxes) [37].
Down-sampling of the voxel grid reduced the size of the resulting clusters to facilitate the extraction of biophysical parameters. To limit the information loss, for each 3D cell or voxel (0.01m) the centroid of the contained points was extracted and used, rather than the centroid of the voxel itself. The mean distance to the nearest points was computed and based on the condition in equation 2, some points were excluded. The voxel grid was implemented to reduce the processing power needed for the subsequent stages by using lidR and lasvoxelize packages for R [36,42]. This step implied reducing the number of points through the intersection of the point cloud with a 3D grid of the same extent.
d � -mean distance σ-standard deviation x-threshold for angle between neighboring normals Previous research showed that d � ± σ (x = 1) is enough to identify outliers [43].

Terrestrial Laser Scans Processing
With the original point cloud cleaned and segmented, diameters and heights could be extracted. Besides these two parameters, the position of each stem had to be recorded in the same reference system as the field data to identify the correspondences between them.
Each stem position was computed as the average point location (X, Y) within a 0.1 m tall cylinder. The altimetric position was computed as a mean valueof the points lying inside a best-fitting cylinder above the ground, with a height of 0.1 m.
The DBH was determined by extracting a section parallel to the DTM, between 1.25 and 1.35 m and identifying the true stem (no branches) through a randomized Hough transform (weight based analysis for detecting shapes in 3D point clouds) [44]. Afterward, a circle was fitted using the least square regression. The height, or vertical projection of the stem, was computed in two ways, as TLS based height estimation usually leads to erroneous results. This way, two sets of height values were obtained, allowing for further evaluation of their accuracy against field measurement. The first approach was based on computing the maximum difference on the Z-axis between its DTM elevation and the top of the vegetation cluster. In the second approach, the height was computed as the highest Z value enclosed inside a cylinder of diameter equal to the stem's DBH, from the unfiltered cloud. In the case of trees with an obvious lean (either recorded during field observations or identified during any of the supervised processing stages), the length was used instead of the height. The length was computed as the maximum distance between any two points in the tree cluster.

Individual Tree Segmentation
The precision of TLS reconstruction was evaluated based on the accuracy of the diameters, heights, and number of extracted individuals for the two scanning approaches. To assess whether or not multiple scans led to more accurate results, individual trees have been identified and mapped in both the TLS point clouds (single and coregistered multiple) and in-field measurements. One to one join was used to allow comparing them as identical data sets and assess only the quality of the extracted biophysical parameters (DBH and height). Consequently, both the field and the multiple scan data sets were subsampled to only display those individuals identified through single scanning since it provided the lowest number of stems (Table 3). The increase in the number of identified individuals is substantial when switching from a single central scan to a coregistered point cloud. Also, a relationship between the sample plot's structure (age and species) and segmentation quality was observed, as the number of detected trees increased by 16% for multiple scans in old sample plots and by 29% in young sample plots. By species, multiple scans detected on average 25% more trees when compared to single scans for sessile oak and only 18% more for norway spruce sample plots.

DBH Estimation
When analyzing the quality of extraction (diameter and height), the positive influence that the increased point cloud density had on stem extraction was not replicated in the biophysical parameters' evaluation.
A comparison between the trends of the diameter values in relation to a theoretical abline (constrained by a constant relation between the two data sets) was carried out to better differentiate the two scanning approaches. Single scans displayed an overall underestimation of large diameters, regardless of the species or age of the plots (deviation within 0.3-3.6 cm). Moreover, in old sample plots, the underestimation of large DBH values was coupled with an overestimation of smaller DBH values. This effect is reduced when processing multiple scans. Regarding the overestimation of young trees' DBH and underestimation of old trees' DBH, the phenomenon can be explained by the natural evolution of trees. Young trees tend to maintain a more cylindrical stem shape, whereas old trees, because of different stress and environmental factors, develop in a more irregulate manner. This process is called ovalization (Figure 4). In the case of multiple scans, inconsistencies occurred only in spruce sample plots where old plots did not just maintain an overestimation for small DBH, but also extended the effect slightly towards the larger values.
In contrast with the visible DBH underestimation in the single scan results, multiple scans overestimated DBH values in spruce sample plots. It is worth mentioning here, that spruce plots are characterized by an uneven DBH distribution. Only when overlooking the spruce sample plots, can multiple scans approach be considered to have a regulating effect on diameter values, eliminating over-or under-estimation tendencies (Table 4). Despite the slight under and over-estimation, the diameter values strongly correlated to the field measurements ( Figure 5), with r values reaching above 0.9. This relation was reflected in the strength of the determination coefficient, r 2 which averaged a value above 0.9. Despite the largely similar results, important differences were observed at some sites between single and multiple scans. For example, for SO 1 the r 2 coefficient (with respect to the field measurements) changed from 0.33 to 0.98 from single to multiple scans.

Height Estimation
Significantly weaker correlations ( Figure 6) were reached in the case of heights estimation (r < 0.76) with determination coefficients of up to 0.59. Switching from single to multiple scans slightly improved the results from r 2 of 0.31 for single scans to r 2 of 0.41 for multiple scans. Another analysis was conducted to examine whether the distance to the scanner impacts the results. The hypothesis was that trees at a distance equivalent to their height can be measured more precisely [45]. In the same acceptance, the closer one gets to the scanner, the better the diameter can be represented and measured. To evaluate this supposition, the plot radius of 15 m was divided into two segments upon which spatial filters have been applied. The filtering consisted in the creation of a buffer around the centre of the plot with a radius of 7 m and its intersection with the position of the trees. The results showed no increase in correlation strength for either heights or diameters, as the distance from scanner increased (r = 0.52 for the entire plot vs. r = 0.62 closer than 7 m and r = 0.59 further than 7 m). Regarding diameters, there is no difference reflected in the correlation coefficients (r for the entire plot equals that of values subsampled closer than 7 m and that of values sampled further than 7 m, 0.94).
By comparing the results of the two methods used for the estimation of height, the quality of the segmentation and tree reconstruction could be assessed. The old sample plots were characterized by a bigger residual cloud, whilst reconstruction based on young sample plots point clouds resulted in sparse residual clouds (represented in Figure 7 as the area between the Digital Surface Models (DSM) based on the original and the segmented point clouds). As seen from Table 5, no matter the age, the heights are underestimated (mean underestimation for old sample plots reaches 37% (± 25%) and for young sample plots 14% (± 19%) of the mean real height of the plot) in the sessile oak study area. For spruce areas the mean underestimation is even greater, reaching values of 35% (± 31%) for young trees and 44% (± 27%) for old trees. Both height estimation techniques (segmenting and maximum height per crown projection) lead to unsatisfactory results, suggesting that single wavelength TLS in leaf-on conditions is not suitable for extracting height values at the plot level, because of difficulties in penetrating dense canopies (Figure 8).

Discussion
The current study case may not have provided significant improvements in the extraction accuracy for the biophysical parameter, but can be used as a guideline. Comparison and contrastbased optimization using a mid-class terrestrial laser scanner in natural growth forests, covered the gap between controlled research and field implementation. Results are therefore similar to some degree to that of previous research, although our approach focused on sample plot level in natural forests.
When analyzing diameter extraction, we found multiple scans approach to lead to an increase in precision. This does not invalidate single scan results, as the differences are within measuring tolerances. Despite this, a significant improvement arising from multiple scans approach is the number of identified trees, which in terms translates into a more accurate stand level estimation of all the biophysical parameters. The use of multiple scans in DBH evaluation would be beneficial for transitioning from circular fitting to an ellipse. Having a more accurate representation of the stem (as the one resulting from the multiple scans) could yield better values for the semi-major and semiminor axis of the ellipse. Since none of the height extraction methods could provide realistic estimations, investing time in multiple scans approach is not recommended. The shadowing and occlusion effects could not be fully compensated by multiple scanning. A potential solution could be the use of a dual-wavelength full-waveform terrestrial laser scanner [46]. The stand structure (density, species, age) and local topography had an influence upon the height's estimation, but different conditions would have not differentiated between the used methods.
Since the method for estimating DBH is based on the least square error when best fitting a circle, straying away from the circular shape increases the error of radius estimation. A solution might be found in increasing the number of iterations involved in extracting DBH values, but better still would be moving to a two dimensional (2D) ellipse fitting [47]. Even so, the DBH values that we obtain were similar to those highlighted in recent papers (average RMSE of 3 cm) [25,48,49] The low values observed for the correlation coefficients between heights extracted from FieldMap [38] and TLS did not improve when the number of scans increased. An increase in the number of scans proves to not increase the strength of penetration of the canopy. As a result, tree tops are still not reachable.
In an intra-species comparison, when analyzing the influence of the sample plots' age, the residual point cloud displays a rise in both density and coverage with the increase in age. This translates into a decrease of the vertical reach as age increases. Age variation generates a more complex structure, and with higher diameters and densities (above a threshold of 0.5 m/0.7 consistency-see Table 1), we expect lower relative height values.
The increase in the proportion of identified stems between single and multiple scans (10%) observed by Xi et al.- [15] was confirmed but with higher values (16 to 29%). Literature provide overall detection percentages to exceed 97% [16,17], with a slight decrease for smaller diameters. Although depended on the structure and number of scans [24], the~10% decrease is similar to our results. Height estimation remained an issue no matter the species or sample plot structure, similar to Saarinen et al. [11]. The slight precision increase in close range scans, also mentioned in previously published papers [45], was considered of no significant value. This conclusion is supported by the Wilcoxon signed rank test which results in p-values smaller than of 1×e −5 for both close (7m radius) and far (15m radius) range. Computing the cosine similarity of both close and far range height trends to the normal, once again confirmed their similitude (average value of 1.5×e −3 ± 2×e −3 ). Different conditions led to improved values in Height evaluation in similar research (average RMSE of 7 cm [50,51]).
If the aim is to identify the number of individuals, multiple scans technique is the proper way to acquire data. As very few investigations require solely the number of individuals and their positions, this might not represent an advantage of multiple scans over single scans.

Conclusions
TLS proved to be a suitable solution in some aspects of forest monitoring and inventory activities. The wide range of applications confirms TLS's utility in forestry. No matter the density nor the species variability, forest sample plots can be inventoried using a TLS based method. Point density, increased through multiple scans approach, does not equivalate to an increase in accuracy since single scans (43.7 million points) are already over-sampling the plot. However, because of the change in the distribution of data, the shadowing effect is reduced, leading to an increase in the number of detected individuals.
Multiple scans approach better defines the shape of the stems, with visible effects in the quality of DBH values. Single scan leads to an underestimation of diameters in all the plots, especially for large values.
An improvement in the extraction of height values could be the application of the multiple scan approach in a leaf-off state. As an alternative, if available, the extension of the raw point cloud with airborne scan data, through TLS-ALS (Aerial Laser Scanning) fusion would be beneficial.
The results show the necessity of repeating all the measurements in leaf-off condition and if possible, using a dual-wavelength terrestrial scanner. These two improvements upon our tested methodology are aspects of further research; either one might prove to solve the highlighted shortcomings.