Estimating Crown Biomass in a Multilayered Fir Forest Using Airborne LiDAR Data

: The estimation of individual biomass components within tree crowns, such as dead branches (DB), needles (NB), and branch biomass (BB), has received limited attention in the scientiﬁc literature despite their signiﬁcant contribution to forest biomass. This study aimed to assess the potential of multispectral LiDAR data for estimating these biomass components in a multi-layered Abies borissi-regis forest. Destructive (i.e., 13) and non-destructive (i.e., 156) ﬁeld measurements were collected from Abies borisii-regis trees to develop allometric equations for each crown biomass component and enrich the reference data with the non-destructively sampled trees. A set of machine learning regression algorithms, including random forest (RF), support vector regression (SVR) and Gaussian process (GP), were tested for individual-tree-level DB, NB and BB estimation using LiDAR-derived height and intensity metrics for different spectral channels (i.e., green, NIR and merged) as predictors. The results demonstrated that the RF algorithm achieved the best overall predictive performance for DB (RMSE% = 17.45% and R 2 = 0.89), NB (RMSE% = 17.31% and R 2 = 0.93) and BB (RMSE% = 24.09% and R 2 = 0.85) using the green LiDAR channel. This study showed that the tested algorithms, particularly when utilizing the green channel, accurately estimated the crown biomass components of conifer trees, speciﬁcally ﬁr. Overall, LiDAR data can provide accurate estimates of crown biomass in coniferous forests, and further exploration of this method’s applicability in diverse forest structures and biomes is warranted.


Introduction
Forests are considered among the most important carbon sinks since they are responsible for storing approximately 80% of all terrestrial carbon. Precise measurement and quantification of forest biomass and carbon are essential to mitigate climate change and optimize forest management strategies at global and regional scales [1]. Forest biomass includes both the above-(AGB) and below-ground biomass. The AGB can be defined as the sum of all living biomass components standing above the soil level (e.g., stem, leaves, needles, branches, bark), expressed as the mass of an individual tree or unit area [2][3][4]. The crown and more specifically, foliage is directly related to the primary production, photosynthesis, nutrient cycling, light transmission and understory vegetation, affecting tree growth and natural regeneration [5]. In addition, living and dead branches play significant roles in forest biomass dynamics as they constitute the majority of the logging residues [6] and have the highest carbon content among the tree components [7].
Conventional methods for AGB estimation require destructive sampling, which provides accurate and precise biomass estimation. However, it is time-consuming, laborious and costly [8]. Therefore, allometric equations are most commonly used for AGB estimation. In fact, these equations constitute regression models that can predict either the total biomass data were used as training data, providing accurate crown biomass estimates, while the difference in estimation accuracy of TLS and the destructively obtained reference data was not statistically significant. Cao et al [62] examined the capability of LiDAR to provide total and component-specific biomass estimates, in a mixed subtropical forest, obtaining a satisfactory estimation accuracy for the branch and foliage biomasses of coniferous trees, with R 2 values of 0.81 and 0.80, respectively. Crown biomass estimation was also examined using full-waveform LiDAR data, introducing a new LiDAR-derived metric that describes the horizontal distribution of the point cloud [63].
It is worth noting that all the aforementioned studies used monospectral LiDAR and no studies to our knowledge have explored the potential of multispectral LiDAR data in crown biomass estimation. In fact, multispectral LiDAR systems have the ability to provide information on the vertical distribution of physiological processes in different spectral channels [64]. In recent years, the multispectral LiDAR point clouds have been widely used for land cover classification [65], tree species composition [66], surface [37] and forest canopy fuel load estimation [67]. Additionally, the multispectral LiDAR data provided more accurate AGB estimations [68] and individual tree detection compared with the monospectral [69].
Although previous studies on crown biomass estimation based on LiDAR data focused on plantations and semi-natural regenerated forests, only one of them was conducted in a multilayered forest with a complex structure [62]. Additionally, there have been no studies conducted on estimating the crown biomass of Abies species, especially the Abies borissiregis. To address the aforementioned literature gaps, this study aimed to estimate the biomass of crown components (i.e., dead branches, needles and branches) in a multilayered Abies borissi regis forest using multispectral LiDAR-derived predictive models. To estimate the reference dead branch (DB), needle (NB) and branch (BB) biomass estimates, a set of new allometric equations were built based on destructive sampling. Subsequently, three regression algorithms were examined using LiDAR height and intensity-derived metrics in order to investigate the potential for reliable tree-level DB, NB and BB estimation.

Study Area
Located on the southeast side of Mount Pindos in Thessaly, Greece, the Pertouli University Forest has been managed by the Aristotle University of Thessaloniki for research and educational purposes since 1934 ( Figure 1). Its elevation varies from 1100 to 2050 m above sea level, with steep terrain and an average slope of 28.1 degrees.The climate is characterized as a transitional Mediterranean mid-European climate, featuring warm, dry summers and cold, rainy winters. An uneven distribution of the annual precipitation is observed, with high variability among the seasons. As for the geological substratum, flysch and limestone are the main layers, which along with the mountainous terrain, are vulnerable to weathering and erosion.
Abies borisii-regis, which is a natural hybrid between Abies alba and Abies Cephalonica, is the dominant species in the forested area [70]. The natural stands of this specific species are composed of all-aged stands, including natural regeneration and young trees in the understory, and mature trees in the overstory. Furthermore, it was observed that trees of the same age can exhibit different heights and DBH, which is attributed to the species' ability to survive under low-light conditions in forest understories for an extended period before being released from overstory light suppression [71]. The remaining forest area is covered by Scots and Black Pines plantations (Pinus silvestris and Pinus nigra) and individual trees of beech, oaks, willows, maple, laburnum and linden.

27
The introduction should briefly place the study in a broad context and highlight why 28 it is important. It should define the purpose of the work and its significance. The current 29 state of the research field should be carefully reviewed and key publications cited. Please 30 highlight controversial and diverging hypotheses when necessary. Finally, briefly men-31 tion the main aim of the work and highlight the principal conclusions. As far as possible, 32 please keep the introduction comprehensible to scientists outside your particular field of 33 research. References should be numbered in order of appearance and indicated by a nu-34 meral or numerals in square brackets-e.g., [1] or [2,3], or [4][5][6]. See the end of the docu-35 ment for further details on references. 36

37
The Materials and Methods should be described with sufficient details to allow oth-38 ers to replicate and build on the published results. Please note that the publication of your 39 manuscript implicates that you must make all materials, data, computer code, and proto-40 cols associated with the publication available to readers. Please disclose at the submission 41 stage any restrictions on the availability of materials or information. New methods and 42 protocols should be described in detail while well-established methods can be briefly de- 43 scribed and appropriately cited. 44 Research manuscripts reporting large datasets that are deposited in a publicly avail- 45 able database should specify where the data have been deposited and provide the relevant 46

ALS Data
In October 2018, airborne multispectral LiDAR data were acquired over the Pertouli University Forest using a RIEGL VQ-1560i-DW laser scanner sensor mounted on a singleengine airplane. This LiDAR sensor provides a high laser pulse repetition rate that can reach up to 1 MHz per spectral channel and on-the-fly multiple-time-around processing. The data were obtained at an average flight altitude of 2243 m above ground level, with a maximum scan angle of 32 • . During the flight, two spectral channels were simultaneously recorded, namely, the green and near-infrared (NIR) (532 nm and 1064 nm, respectively). Before the data delivery, the provider preprocessed the LiDAR data, transforming the full-waveform dataset into discrete returns. The delivered data consisted of a georeferenced point cloud with almost identical point cloud densities for both channels (i.e., 44.65 points/m 2 for the near-infrared and 44.85 points/m 2 for the green). Additionally, the simultaneous acquisition of aerial photographs with the LiDAR data provided high-resolution images of the entire forest.

Field Survey
Field measurements were performed in July 2022, including destructive and nondestructive sampling ( Figure 2). Destructive sampling was carried out in a specific compartment of a forest, where 13 trees of different characteristics were sampled. Although the field measurements took place 4 years after the LiDAR flight, the differences in tree height, DBH and crown structure were considered to be rather small [72]. Moreover, considering that the Forest Management Plan indicates an annual volume increase of 5.82 m 3 ha −1 in three separate growing seasons, the difference in biomass was insignificant. The selection of the specific sample trees was based on the Forest Management Plan and involved identifying trees with different crown structures, heights and social status, where we aimed to collect representative samples from all forest conditions. A set of tree variables, namely, DBH, tree position, tree height (H), canopy base height and crown radii, were recorded after the felling of each tree. DBH was recorded on the standing trees using a Haglof Mantax Blue caliper (Haglof, Torsång, Sweden), and their locations were determined using a Garmin eTrex 30x touch (Garmin, Lenexa, KS, USA). H, canopy base height and crown radii were measured using a laser distance meter (Leica Disto D2, Leica Camera AG, Wetzlar, Germany). In particular, the length of the above-ground portion of the tree, including the distance between the tree top, the first living branch and the bottom of the trunk, was measured using a laser distance meter. Crown radii were recorded by measuring the distance between the tree stem and the crown edge. The branches (including the needles) were then separated from the tree trunk and divided into 3 sections (dead branches, living branches and tree top). The total fresh weight of each section was weighed on the field using a handheld scale. A sample from each section was collected and transferred to the laboratory for further analysis. For smaller trees with a DBH smaller than 15 cm, the whole crown was taken to the laboratory for needles-branch separation and dry weighting [73]. Needles and branches of the samples were separated to determine the needles/branch ratio and placed in separate bags for drying. All the samples were dried at 85 °C for 72 hours and weighed. The ratio of the measured dry weight to fresh weight of the crown components was applied to estimate the total crown biomass ( Table  1). The selection of the specific sample trees was based on the Forest Management Plan and involved identifying trees with different crown structures, heights and social status, where we aimed to collect representative samples from all forest conditions. A set of tree variables, namely, DBH, tree position, tree height (H), canopy base height and crown radii, were recorded after the felling of each tree. DBH was recorded on the standing trees using a Haglof Mantax Blue caliper (Haglof, Torsång, Sweden), and their locations were determined using a Garmin eTrex 30x touch (Garmin, Lenexa, KS, USA). H, canopy base height and crown radii were measured using a laser distance meter (Leica Disto D2, Leica Camera AG, Wetzlar, Germany). In particular, the length of the above-ground portion of the tree, including the distance between the tree top, the first living branch and the bottom of the trunk, was measured using a laser distance meter. Crown radii were recorded by measuring the distance between the tree stem and the crown edge. The branches (including the needles) were then separated from the tree trunk and divided into 3 sections (dead branches, living branches and tree top). The total fresh weight of each section was weighed on the field using a handheld scale. A sample from each section was collected and transferred to the laboratory for further analysis. For smaller trees with a DBH smaller than 15 cm, the whole crown was taken to the laboratory for needles-branch separation and dry weighting [73]. Needles and branches of the samples were separated to determine the needles/branch ratio and placed in separate bags for drying. All the samples were dried at 85 • C for 72 h and weighed. The ratio of the measured dry weight to fresh weight of the crown components was applied to estimate the total crown biomass ( Table 1). Additionally, non-destructive samples were collected from a set of 200 trees located in representative areas of the forest. This approach was preferred to minimize the disturbance to the forest, due to the invasive nature of destructive sampling. In each sampling region, only trees that were detectable using the LiDAR sensor were measured. The DBH, H, canopy base height, crown radii and position of each tree were recorded. More specifically, the DBH was measured using a Haglof Mantax Blue caliper, H and canopy base height were measured using a Haglof EC II-D electronic clinometer, and the crown radii were measured using a laser distance meter (Leica Disto D2). Tree locations were determined using a Garmin eTrex 30x touch with an average horizontal position accuracy of 6 m. However, only 156 of these trees were correctly identified in the LiDAR point cloud.

Methods
The methodology for the LiDAR-based DB, NB and BB estimation, which is described in detail in the following sections (Sections 3.1-3.3), is illustrated in Figure 3.  Additionally, non-destructive samples were collected from a set of 200 trees located in representative areas of the forest. This approach was preferred to minimize the disturbance to the forest, due to the invasive nature of destructive sampling. In each sampling region, only trees that were detectable using the LiDAR sensor were measured. The DBH, H, canopy base height, crown radii and position of each tree were recorded. More specifically, the DBH was measured using a Haglof Mantax Blue caliper, H and canopy base height were measured using a Haglof EC II-D electronic clinometer, and the crown radii were measured using a laser distance meter (Leica Disto D2). Tree locations were determined using a Garmin eTrex 30x touch with an average horizontal position accuracy of 6 m. However, only 156 of these trees were correctly identified in the LiDAR point cloud.

Methods
The methodology for the LiDAR-based DB, NB and BB estimation, which is described in detail in the following sections (Sections 3.1-3.3), is illustrated in Figure 3.

LiDAR processing
In order to identify each tree in the LiDAR point cloud and calculate its associate metrics, different processing steps took place using primarily R software, more specifically, the lidR package [74,75].
First, the recorded GPS points from the tree positions were buffered to a 6 m radius, representing the GPS error, and then used to identify the exact field-measured trees in the LiDAR point cloud. The point cloud, divided into green and NIR channels, was clipped

LiDAR Processing
In order to identify each tree in the LiDAR point cloud and calculate its associate metrics, different processing steps took place using primarily R software, more specifically, the lidR package [74,75].
First, the recorded GPS points from the tree positions were buffered to a 6 m radius, representing the GPS error, and then used to identify the exact field-measured trees in the LiDAR point cloud. The point cloud, divided into green and NIR channels, was clipped to the buffered areas and employed for further processing. Since the study area was characterized by mountainous terrain and irregular forest canopy, a large intensity variability was expected. The single and first-of-many returns were significantly stronger than the intermediate returns and, in most cases, located higher in the canopy. Consequently, the disparity in their intensity values was influenced by the different vegetation materials (i.e., foliage and wood) that obstructed the pulse [76]. Hence, the single and first-ofmany returns of both green and NIR channels were solely used for the range intensity normalization procedure.
Regarding the calibration parameter f in the range normalization procedure, values between 2 and 2.5 are considered more suitable for forested areas [77]. In this study, the reference range parameter was set to the average flight altitude and f as 2.0, according to the radar theory [78]. After the intensity normalization, the normalized point clouds were joined with the intermediate and last returns of each channel. The combination of both the intensity-normalized and not normalized echoes (i.e., the intermediate and last returns) was used to reconstruct the initial point cloud and extract height-and intensity-related metrics for the crown components estimation.
The normalized point cloud was filtered for noise removal, using the Statistical Outliers removal algorithm [79], and classified into ground and non-ground points using the Cloth Simulation Algorithm [80]. The ground-classified points were interpolated to construct the digital terrain model (DTM), using the triangulated irregular network (TIN) algorithm, which was employed for the point cloud normalization. The canopy height model (CHM) was subsequently produced, employing the Pit-free algorithm [81]. The Pit-free algorithm was selected due to its ability to provide accurate tree delineations, especially in highdensity LiDAR data. Both rasters (i.e., DTM and CHM) had the same resolution of 0.42 m, which was defined based on the nominal pulse spacing [82]. Subsequently, to reduce the over-segmentation phenomenon due to the complex forest canopy, a morphological erosion with a cross-shaped structuring element was applied to the CHM.
Several methods have been proposed for automatic crown delineation [47,69,83], however, large omission and commission errors are often observed [61]. In this study, watershed segmentation was employed for the crown delineation procedure, without providing treetops or any kind of local maxima feature. Although the watershed algorithm outperformed the other tested algorithms [47,83], providing accurate segmentation results, manual corrections were applied where necessary.
According to [84], slopes can significantly impact the normalized canopy height of individual trees, affecting the crown of each tree. Therefore, considering the mountainous terrain of the study area, the elimination of the terrain effects for each tree was essential for accurate crown biomass estimation. Consequently, a crown normalization process inspired by the P-trees algorithm [45] was employed.
Taking into account the positional error of the GPS position for determining each tree's position and the horizontal accuracy of the point cloud, the errors were expected to be 6 m and 0.30 m respectively [51]. To match the field-measured trees with the segmented trees in the point cloud, a candidate-searching approach was employed, similar to the one of [85]. Within each buffer zone, the highest point in each segmented tree was assigned as the tree top. Segmented trees with a height difference greater than 1 meter from the field-measured height were directly eliminated and the remaining trees were used as candidate trees. Additionally, field observations and very-high-resolution aerial photographs were used to identify the actual tree measured on the field among the candidate trees according to the method described in [86]. As a result, 156 out of 200 non-destructively sampled trees were successfully identified in the LiDAR point cloud. Finally, the point cloud for each channel (i.e., green, NIR and merged) was extracted in order to calculate the LiDAR-derived height and intensity metrics separately (Table 2). Table 2. Summary of the LiDAR-derived metrics calculated for the crown biomass estimation. All described metrics (height-and intensity-related) were extracted for each point cloud separately (merged, NIR and green). Height bincentiles refer to the cumulative percentage of points in a percentage of the total height, while percentiles represent the distribution of the vertical structure between the highest and lowest height. The intensity-related metrics were obtained using only the first-of-many and single returns.

Allometric Equations
Several allometric models have been developed over recent decades for the estimation of the biomass of different tree components [73]. Crown biomass is usually calculated using the DBH and H as inputs in empirical allometric equations, providing predictions for validating biomass estimates from remote sensing models [87]. However, no allometric equations are available for crown or crown components biomass in Abies borissi-regis. In this study, new "intrinsically linear" allometric equations were created, using 13 destructively sampled trees, for dead branches, needles and branches biomass (Equation (1)): where B is the biomass of each component; a and b are the intercept and scaling coefficient, respectively; and ε is the residual error. The constructed equations were based on DBH and H, as the other field measurements were found to be insignificant (i.e., crown base height, and crown radii). The originally selected sample size was decreased from 25 to 13 according to the guidelines of the local forest service, due to the potential damage from the fallen trees and the sampling procedures to the understory. In fact, previous studies that focussed on allometric equations for biomass components estimation reported similar sample sizes to ours [10,73,[88][89][90]. Subsequently, a sample size of 13 was considered suitable for our study as well. As a result, the measured DBH and H of the non-destructively sampled trees were employed in the allometric equations to calculate the DB, NB and BB for each tree. The results were used as reference biomasses for each crown component. For these equations, no conversion factor was required, as they were used to directly estimate the biomass in kg/tree. The performance of the allometric equations was evaluated based on the relative square error (RSE), R 2 and adjusted R 2 (adjR 2 ).

Crown Biomass Estimation Using Regression Algorithms
To assess the predictive potential of the LiDAR-derived metrics for accurate DB, NB and BB estimation, three different regression algorithms were tested, namely, random forest (RF), support vector regression (SVR) and Gaussian process (GP). Prior to the analysis, the predictors were "center" scaled and divided into training (75%) and testing (25%) sets. In addition, a nested ten-fold cross-validation with grid search was applied in each training set to obtain the training accuracy and the optimal hyperparameter values for each model. This cross-validation method eliminates the bias that may be introduced by the simple cross-validation approach, providing a more reliable criterion for choosing the best model [91]. The regression models were validated to an independent testing set (i.e., 25% of the total number of samples), which was randomly split from the training data to obtain the predictive performance of each algorithm. Assessment of all models' performance for each crown biomass component estimation was carried out based on the mean absolute error (MAE), mean square error (MSE), bias, relative mean square error (RMSE) and R 2 . In addition, the relative importance of individual predictors in each model was calculated using a model-agnostic procedure, providing a list of the ten most important variables for each model [92].

Random Forest
The random forest algorithm is a flexible machine learning technique that utilizes multiple randomized decision trees to carry out tasks such as classification and regression [93]. According to this approach, each decision tree is trained on a different bootstrap sample of the data, and the predictor variables at each node are randomly selected [94]. Concerning the model's tuning, two hyperparameters were tuned, namely, the number of trees to grow and the number of predictors for each tree. A variable number of trees to be grown was tested, ranging from 50 to 500 with a step size of 50, and variable numbers of predictors for each tree were defined as the number of predictors divided by three [95].

Support Vector Regression
This method, traditionally known as support vector regression (SVR), was initially developed by Vapnik [96] as an extension to the traditional SVM algorithm [97]. The goal of this method is to find a function f (x) among the pairs of the training data without considering the pairs that have a larger deviation from the support vector than an ε deviation [98]. The ε-SVR algorithm was applied in our study, where the input data were transformed into a high-dimensional feature space using a nonlinear function, solve the final model by minimizing the training error and the complexity of the model [99]. In our work, the rbf kernel was employed and two hyperparameters were optimized, namely, the cost of violation of restrictions and sigma parameter.

Gaussian Process
The Gaussian process is a probabilistic machine learning framework that is widely used for regression and classification tasks [100]. In regression problems, a Gaussian process regression model can make predictions by incorporating prior knowledge (kernels) [101] find the probability distribution that best describes the data [102]. According to this method, it is assumed that the input data follow a multivariate Gaussian distribution, while the noise is considered independent of the data [103]. In this study, the radial basis function (rbf) kernel was employed, and the sigma inverse width parameter was optimized using a grid search.

Results
In this study, DB, NB and BB were estimated using three different algorithms. In particular, new allometric equations were developed and evaluated for their accuracy to enrich the reference dataset with non-destructive samples (Section 4.1). Subsequently, three regression algorithms (i.e., RF, SVR and GP) were evaluated for their potential to reliably estimate DB, NB and BB. The results are presented in detail in the following sections (Sections 4.2-4.4).

Allometric Equations
Regarding the allometric equations, 13 trees were destructively sampled in the field to develop a different equation for each DB, NB and BB estimation. All the developed models were based on the use of DBH and H measurements. Table 3 presents the results of the allometric DB, NB and BB estimation models. Among the three models, the NB model showed the greatest accuracy, with R 2 and adjR 2 being 0.93 and 0.91, respectively. The DB model resulted in lower R 2 and adjR 2 values and the largest RSE. The lower accuracy of the DB allometric equation compared with the others can be attributed to the natural pruning of the fir, which is inconsistent and significantly varies between trees of different ages, social statuses, competitions and lighting conditions.

Dead Braches Biomass
The performance of the models for DB estimation using the multispectral LiDAR data (i.e., green, NIR, and merged) is presented in Table 4. Figure 4 depicts the model's fit, showing the difference between the regression and the hypothetical 1:1 lines. Overall, the models demonstrated satisfactory predictive performance, with R 2 values greater than 0.7 and RMSE% values lower than 26%. More specifically, the RF algorithm provided the best results compared with all of the other algorithms in all three different point clouds (i.e., green, NIR and merged), with the green achieving an R 2 of 0.89, MAE of 1.92 kg, rbias of −0.04 and RMSE% of 17.45. The RF algorithm produced a similar estimation performance in the other two point clouds (i.e., NIR and merged) in terms of R 2 , MAE, MSE and RMSE; however, the NIR produced the lowest rbias. The application of the SVR algorithm achieved the second-best performance among all of the point clouds, with the green providing the best predictions in terms of R 2 (0.82), RMSE% (20.98), rbias (−0.19) and MSE (7.53 kg). The SVR in the NIR and merged point clouds resulted in similar predictive performance, as evidenced by the R 2 (0.77 and 0.75, respectively) and RMSE% (24.18 and 24.51, respectively); however, the merged achieved a lower rbias compared with the NIR (−0.25% and 0.38%, respectively). The GP model provided the lowest performance, with an R 2 of 0.71 and RMSE% of 26.62 in the merged point cloud. A slight improvement in terms of R 2 , RMSE and MSE is observed in the NIR and green, although the MAE and rbias were almost identical in all cases. More specifically, the RF algorithm provided the best results compared with all of the other algorithms in all three different point clouds (i.e., green, NIR and merged), with the green achieving an R 2 of 0.89, MAE of 1.92 kg, rbias of −0.04 and RMSE% of 17.45. The RF algorithm produced a similar estimation performance in the other two point clouds (i.e., NIR and merged) in terms of R 2 , MAE, MSE and RMSE; however, the NIR produced the lowest rbias. The application of the SVR algorithm achieved the second-best performance among all of the point clouds, with the green providing the best predictions in terms of R 2 (0.82), RMSE% (20.98), rbias (−0.19) and MSE (7.53 kg). The SVR in the NIR and merged point clouds resulted in similar predictive performance, as evidenced by the R 2 (0.77 and 0.75, respectively) and RMSE% (24.18 and 24.51, respectively); however, the merged achieved a lower rbias compared with the NIR (−0.25% and 0.38%, respectively). The GP model provided the lowest performance, with an R 2 of 0.71 and RMSE% of 26.62 in the merged point cloud. A slight improvement in terms of R 2 , RMSE and MSE is observed in the NIR and green, although the MAE and rbias were almost identical in all cases.
Regarding the relative predictors' importance for the DB estimation ( Figure 5), the best-performing algorithm indicated the max as the most important variable, followed by the p99, p95, p90 and p80. The same five height percentiles were selected as the most Regarding the relative predictors' importance for the DB estimation ( Figure 5), the best-performing algorithm indicated the max as the most important variable, followed by the p99, p95, p90 and p80. The same five height percentiles were selected as the most important variables by the RF algorithm in all of the point clouds. The b80, b70 and avg were also identified among the ten most important variables. It can be observed that the height metrics related to the upper sections of the crown (e.g., max, p99 and p95) were considered more important compared with those from the lower canopy. However, two metrics that corresponded to the lower parts of the crown, namely, p10 and p20, were also selected among the most important variables in all of the point clouds.
Overall, it is evident that all three point clouds resulted in similar DB estimation accuracies using the RF algorithm, which was identified as the most reliable in terms of R 2 and RMSE. In addition, the significance of the height-derived LiDAR metrics was considered higher compared with those of the intensity metrics, as no intensity metrics were identified among the most valuable for the DB estimation. important variables by the RF algorithm in all of the point clouds. The b80, b70 and avg were also identified among the ten most important variables. It can be observed that the height metrics related to the upper sections of the crown (e.g., max, p99 and p95) were considered more important compared with those from the lower canopy. However, two metrics that corresponded to the lower parts of the crown, namely, p10 and p20, were also selected among the most important variables in all of the point clouds. Overall, it is evident that all three point clouds resulted in similar DB estimation accuracies using the RF algorithm, which was identified as the most reliable in terms of R 2 and RMSE. In addition, the significance of the height-derived LiDAR metrics was considered higher compared with those of the intensity metrics, as no intensity metrics were identified among the most valuable for the DB estimation.

Needles Biomass
The performance of the three different regression algorithms for NB estimation using the multispectral LiDAR data (i.e., green, NIR, and merged) is presented in this section (Table 5). Additionally, the models' fit for each channel is illustrated in Figure 6, showing a comparison between the regression and the hypothetical 1:1 line.

Needles Biomass
The performance of the three different regression algorithms for NB estimation using the multispectral LiDAR data (i.e., green, NIR, and merged) is presented in this section (Table 5). Additionally, the models' fit for each channel is illustrated in Figure 6, showing a comparison between the regression and the hypothetical 1:1 line. The NB models displayed an increase in R 2 and significantly higher RMSE% and MSE metrics in comparison with the DB models. More specifically, using RF on the merged point cloud resulted in the highest R 2 (0.93) among all the algorithms, with a low MSE (131.69 kg), rbias (−0.05%) and RMSE% (16.80%). Although the R 2 values of the RFs using the three point clouds were identical, the lowest MAE (9.24 kg) was observed in the green. Contrary to the RF algorithm, the SVR provided better NB predictions in the green, with an R 2 of 0. 84 metrics in comparison with the DB models. More specifically, using RF on the merged point cloud resulted in the highest R 2 (0.93) among all the algorithms, with a low MSE (131.69 kg), rbias (−0.05%) and RMSE% (16.80%). Although the R 2 values of the RFs using the three point clouds were identical, the lowest MAE (9.24 kg) was observed in the green. Contrary to the RF algorithm, the SVR provided better NB predictions in the green, with an R 2 of 0. 84  The relative predictor importance of the best-performing model for NB is presented in Figure 7. Similar to the RF model for DB estimation, the max was indicated as the most important variable, followed by the p99, p95, p90 and p80. More specifically, in the green, The relative predictor importance of the best-performing model for NB is presented in Figure 7. Similar to the RF model for DB estimation, the max was indicated as the most important variable, followed by the p99, p95, p90 and p80. More specifically, in the green, the RF model identified the aforementioned variables, along with p70, b70, avg and b90. These variables were considered significant in all of the models; however, their importance values were significantly lower than the first five (i.e., max, p99, p95, p90 and p80). Overall, it can be observed that the merged point cloud could provide reliable NB estimates using primarily height-derived LiDAR metrics, resulting in significantly lower MSE and RMSE values.
the RF model identified the aforementioned variables, along with p70, b70, avg and b90. These variables were considered significant in all of the models; however, their importance values were significantly lower than the first five (i.e., max, p99, p95, p90 and p80). Overall, it can be observed that the merged point cloud could provide reliable NB estimates using primarily height-derived LiDAR metrics, resulting in significantly lower MSE and RMSE values.

Branch Biomass
The performance of the developed models for BB estimation using the multispectral LiDAR data (i.e., green, NIR and merged) is presented in Table 6. Additionally, the models' fits are illustrated in Figure 8.

Branch Biomass
The performance of the developed models for BB estimation using the multispectral LiDAR data (i.e., green, NIR and merged) is presented in Table 6. Additionally, the models' fits are illustrated in Figure 8.  In particular, among the different algorithms and channels, the RF model using the green provided better performance for BB estimation (i.e., R 2 = 0.85, RMSE% = 24.09% and MAE = 22.57 kg). The RF models using the other two point clouds (i.e., NIR and merged) marginally outperformed the green one, with an almost identical R 2 of 0.84, MSE of 831.86 kg and 812.13 kg, and RMSE% values of 25.30% and 25.00% for the NIR and merged, respectively. Excluding the RF models, the SVR using the green was the second-bestperforming model, with an R 2 of 0.79 and RMSE% of 28.21% compared with the other channels, with R 2 values of 0.75 and 0.71 and RMSE% values of 31.46% and 32.61% for the NIR and merged, respectively. Contrary to the aforementioned models, the GP model provided better estimations using the NIR channel in terms of R 2 (0.75), RMSE% (31.35%) and MAE (28.95 kg), resulting in marginally lower results compared with the SVR model using NIR. Finally, the GP model using the merged achieved the lowest R 2 (0.67) and the highest RMSE% (34.42%) compared with the rest of the employed algorithms.
The relative predictor importance of the best-performing algorithms for BB estimation in each channel is illustrated in Figure 9. Similar to the DB and NB, the RF model indicated primarily height-derived LiDAR metrics as the most important variables for reliable biomass estimation. More specifically, the max, followed by the p99, p95, p90 and p80, were identified among the most important variables. More specifically, the RF model using the green identified the aforementioned variables, along with p70, avg, b80, In particular, among the different algorithms and channels, the RF model using the green provided better performance for BB estimation (i.e., R 2 = 0.85, RMSE% = 24.09% and MAE = 22.57 kg). The RF models using the other two point clouds (i.e., NIR and merged) marginally outperformed the green one, with an almost identical R 2 of 0.84, MSE of 831.86 kg and 812.13 kg, and RMSE% values of 25.30% and 25.00% for the NIR and merged, respectively. Excluding the RF models, the SVR using the green was the secondbest-performing model, with an R 2 of 0.79 and RMSE% of 28.21% compared with the other channels, with R 2 values of 0.75 and 0.71 and RMSE% values of 31.46% and 32.61% for the NIR and merged, respectively. Contrary to the aforementioned models, the GP model provided better estimations using the NIR channel in terms of R 2 (0.75), RMSE% (31.35%) and MAE (28.95 kg), resulting in marginally lower results compared with the SVR model using NIR. Finally, the GP model using the merged achieved the lowest R 2 (0.67) and the highest RMSE% (34.42%) compared with the rest of the employed algorithms.
The relative predictor importance of the best-performing algorithms for BB estimation in each channel is illustrated in Figure 9. Similar to the DB and NB, the RF model indicated primarily height-derived LiDAR metrics as the most important variables for reliable biomass estimation. More specifically, the max, followed by the p99, p95, p90 and p80, were identified among the most important variables. More specifically, the RF model using the green identified the aforementioned variables, along with p70, avg, b80, p60, p10 and p30. Using the NIR, the RF identified p10 as the fifth most significant variable, which corresponded to the lower parts of the crown. Regarding the merged, the selected variables were almost identical to the green; however, the max presented a significantly higher importance value compared with the other variables. p60, p10 and p30. Using the NIR, the RF identified p10 as the fifth most significant variable, which corresponded to the lower parts of the crown. Regarding the merged, the selected variables were almost identical to the green; however, the max presented a significantly higher importance value compared with the other variables.
(a) (b) (c) Figure 9. Demonstration of the relative importance (a-c) of the predictors for the best-performing algorithm for BB estimation in each point cloud based on the random forest algorithm.
Overall, it is evident that the RF could provide accurate BB estimates using any of the three channels, resulting in remarkably higher R 2 and lower RMSE values compared with the SVR and GP models.

Discussion
In the present study, we examined the performance of three machine learning approaches using multispectral LiDAR data for single-tree crown components biomass estimation in a multilayered fir forest. More specifically, new allometric equations were constructed based on destructive sampling for DB, NB and BB estimation, as no equations were available for the specific species. Thus, the DBH and H were measured in the field and used as input data in the developed allometric equations to generate reference DB, NB and BB estimates. These reference data were subsequently employed for the tree level DB, NB and BB estimation using three different regression algorithms (i.e., RF, SVR and GP). LiDAR-derived height and intensity distribution metrics at the tree-level from both spectral channels (i.e., green and NIR) were employed as potential independent variables in the predictive models. Overall, the results indicate that the biomass of all crown components could be reliably estimated using either of the three regression algorithms, however, the RF models provided the best predictions in all cases.
In the case of the DB estimation, the RF model using the green point cloud resulted in the best estimation performance in terms of R 2 , MAE, MSE and RMSE (i.e., R 2 = 0.89, MAE = 1.92 kg, MSE = 5.30 kg and RMSE = 2.30 kg) ( Table 4). It is noteworthy that only height-derived metrics were selected as the most important variables for DB estimation, which indicates the significance of the height-related metrics for crown biomass estimation. Hence, intensity information, which represents the canopy's surface Overall, it is evident that the RF could provide accurate BB estimates using any of the three channels, resulting in remarkably higher R 2 and lower RMSE values compared with the SVR and GP models.

Discussion
In the present study, we examined the performance of three machine learning approaches using multispectral LiDAR data for single-tree crown components biomass estimation in a multilayered fir forest. More specifically, new allometric equations were constructed based on destructive sampling for DB, NB and BB estimation, as no equations were available for the specific species. Thus, the DBH and H were measured in the field and used as input data in the developed allometric equations to generate reference DB, NB and BB estimates. These reference data were subsequently employed for the tree level DB, NB and BB estimation using three different regression algorithms (i.e., RF, SVR and GP). LiDAR-derived height and intensity distribution metrics at the tree-level from both spectral channels (i.e., green and NIR) were employed as potential independent variables in the predictive models. Overall, the results indicate that the biomass of all crown components could be reliably estimated using either of the three regression algorithms, however, the RF models provided the best predictions in all cases.
In the case of the DB estimation, the RF model using the green point cloud resulted in the best estimation performance in terms of R 2 , MAE, MSE and RMSE (i.e., R 2 = 0.89, MAE = 1.92 kg, MSE = 5.30 kg and RMSE = 2.30 kg) ( Table 4). It is noteworthy that only height-derived metrics were selected as the most important variables for DB estimation, which indicates the significance of the height-related metrics for crown biomass estimation. Hence, intensity information, which represents the canopy's surface reflectivity, is consid-ered uncorrelated with the DB. Taking into consideration the fact that the dead branches in fir species are concentrated in the lower parts of the canopy, height metrics representing the lower crown should be selected among the most variables for DB estimation. However, height metrics related to the upper-crown parts (i.e., max, p99, p95, p80 and p70) were the most important variables in the best-performing models, while the p20 and p10 were considered less important. This can be attributed to the fact that the tree height, which is closely correlated with the maturity of the tree, has a significant impact on DB [104]. A comparison between the different algorithms revealed that the monospectral data derived from the green channel could provide an accurate dead branches biomass estimation, which was credited to the higher sensitivity of this channel in the woody parts of the tree [105]. Although the multispectral LiDAR data are considered valuable for accurate species classification [66], forest biomass [106] and fuels estimation [37], it is evident that the exclusive use of monospectral data can provide reliable DB estimations.
With respect to NB estimation, the RF presented the highest values of the statistical evaluation measures (i.e., MAE, RMSE and R 2 ). As opposed to the DB, the RF produced almost identical estimation performances using the green, NIR and merged point clouds ( Table 5). As for the variable importance, the selected variables consisted primarily of height-derived metrics, similar to the DB models ( Figure 7). More specifically, max, p99, p95, p80, p70 and avg were selected as the most important variables, showcasing the importance of height-related metrics in reliable NB estimation. This can be considered a rather logical association since the vast majority of living branches are found in the upper canopy.
Regarding the BB estimation, the RF models resulted in similar R 2 , MAE, RMSE and rbias in all of the channels. In particular, the RF in the green channel provided the highest R 2 , with low MAE, MSE and RMSE% values (R 2 = 0.85, MAE = 22.57 kg, MSE = 754.17 kg and RMSE% = 24.09%) ( Table 6). The GP provided the worst R 2 among all crown components (0.67), with high MAE and RMSE values (MSE = 1491.21 kg, RMSE% = 34.42%). Similarly to the previous crown biomass components, height-derived metrics (i.e., max, p99, p95, p90, p80 and p10) were primarily identified as the most important variables for BB estimation.
All models used in this study demonstrated satisfactory prediction accuracy for DB, NB and BB, with the RF presenting the highest values in terms of R 2 , RMSE and MAE. The RF is a non-parametric regression method that provides robustness and flexibility in modeling individual tree attributes with high accuracy [35]. Moreover, the RF algorithm has a significant advantage over the other machine learning algorithms, as it can effectively handle collinearity and nonlinear regression problems [52]. Overall, the results demonstrate that DB, NB and BB could be reliably estimated in a multilayered forest using the RF algorithm.
Although AGB estimation using LiDAR data is becoming increasingly popular, only a few studies have focused on crown biomass estimation. Furthermore, the majority of the studies focused on plantations and semi-natural forests, which are significantly different from natural forest biomes. However, no studies have yet explored the potential of LiDAR in DB estimation. Consequently, we compared our NB and BB results to similar reports that exploit both single-tree and plot-based approaches, in either forest type. In 2013, [107] employed LiDAR data for biomass components estimation in an evergreen-dominated forest with mountainous terrain in western China. The results showed a lower R 2 (0.757) in terms of BB compared with ours; however, the RMSE% was significantly lower (12.423%). Regarding the NB, the reported R 2 (0.36) was significantly low compared with the one reported in this study. Hauglin et al. [6] estimated the single-tree branch biomass of Norway spruce using LiDAR data, incorporating height-and intensity-derived LiDAR metrics. Although the results are marginally inferior compared to ours in terms of R 2 (0.83), we achieved a significantly lower RMSE% (45%). In addition, [62] used linear models with logarithmic transformations to estimate NB and BB in a coniferous forest, reporting a lower R 2 and higher RMSE% compared with ours in both cases. The variation in the results can be attributed to the different algorithms and LiDAR sensors used, as well as the significant difference between the species of each study area.
Even though the aforementioned studies employed both intensity-and height-derived LiDAR metrics, none of the above utilized multispectral LiDAR for the crown components biomass predictions. In this study, we evaluated the potential of different LiDAR channels (i.e., green, NIR and merged) to estimate DB, NB and BB. According to our findings, the monospectral data can provide accurate crown components estimations, and in most cases, the predictions were superior compared with the multispectral data. More specifically, the green channel provided the best BB, NB and DB predictions using either of the three regression algorithms. The merged point cloud produced the lowest evaluation measures of all crown components, which can be attributed to the large complexity of the point cloud.
Considering that the monospectral point cloud density was already high (~44.65 points/m 2 ) for AGB estimation at the individual tree level, the additional channel could not provide any supplementary information to improve the predictions of crown biomass. Furthermore, the intensity information of both channels was not found to be significant for the estimation of the crown biomass components. However, multispectral LiDAR and intensities can contribute to the classification of different tree species, which is essential for biomass estimation, especially in mixed forests [108].
Although this research reached its objectives, there were some inherent constraints. First of all, the destructive sample used in the study may be deemed insufficient for a forest with such a complex structure, leading to potential uncertainty in the predictions [37]. Nevertheless, the majority of studies employ similar sample sizes for destructive sampling [73], enriching the reference data via allometric equations. Regarding the dead branches, our approach did not take into consideration the lighting conditions or suppression caused by neighbor trees, which constitute major factors for the natural debranching. In addition, the study focused on dominant and co-dominant trees, excluding all the understory trees, which have a significant role in carbon allocation. As a result, the methodology can be transferred in similar biomes for DB, NB and BB, providing estimations only for the dominant and co-dominant trees.
In summary, our research shows that LiDAR data are capable of providing accurate crown components biomass predictions in a multilayered fir forest. This study emphasized the importance of using monospectral LiDAR-derived height metrics for crown biomass estimation, which is crucial for forest inventory management and forest biomass dynamics, as they comprise the majority of the logging residues. It should be noted that this research was the first contribution for dead branches biomass estimation using primarily airborne LiDAR data at the single-tree level, and one of the few studies employing LiDAR data and destructive sampling for crown components biomass estimation.

Conclusions
This study aimed to estimate the crown components biomass using multispectral LiDAR data in a multilayered coniferous forest. Specifically, single-tree DB, NB and BB were predicted using machine learning regression algorithms. The results indicated that LiDAR data can provide reliable crown components biomass estimates in a coniferous forest that was characterized by a complex structure and rough topography. More specifically: • DB and NB could be adequately estimated using all the examined algorithms (i.e., RF, SVR and GP), contrary to the case of BB, where only the RF and SVR provided accurate predictions. • NB was predicted with the highest accuracy in all tested algorithms, with the RF resulting in the most precise estimations.

•
The RF algorithm provided the most accurate DB, NB and BB predictions in terms of R 2 , MAE and RMSE% compared with the rest of the examined algorithms.

•
The height-derived LiDAR metrics contributed to the accurate crown components biomass estimation, where they were the most important variables in the best-performing models, compared with the intensity-derived, none of which were selected.

•
The metrics derived from the green point cloud provided the most accurate biomass predictions in all regression models and crown components.

•
The use of monospectral point clouds (i.e., green and NIR) resulted in improved predictive accuracy compared with the multispectral, which can be attributed to the increased complexity of the merged point cloud.
Overall, it is evident that airborne LiDAR data could successfully contribute to quantifying carbon sequestration and biomass in complex structured forests, maintaining the ecosystem's carbon balance. This study aids in improving crown biomass estimation in naturally regenerated ecosystems with a complex structure, where branches are among the most important surface fuel components. Given the limitations of the study described in Section 5, it would be interesting to examine the potential of LiDAR in crown components biomass estimation when employing an increased amount of destructive samples in order to validate the results. Further research could also focus on the use of spaceborne LiDAR systems to provide a cost-efficient approach for crown biomass estimation at large scales.