The Local Median Filtering Method for Correcting the Laser Return Intensity Information from Discrete Airborne Laser Scanning Data

Laser return intensity (LRI) information obtained from airborne laser scanning (ALS) data has been used to classify land cover types and to reveal canopy physiological features. However, the sensor-related and environmental parameters may introduce noise. In this study, we developed a local median filtering (LMF) method to point-by-point correct the LRI information. For each point, we deduced the reference variation range for its LRI. Then, we replaced the outliers of LRI with their local median values. To evaluate the LMF method, we assessed the discrepancy of LRI information from the same and diverse land cover types. Moreover, we used the corrected LRI to distinguish points from grass, road, and bare land, which were classified as ground type in ALS data. The results show that using the LMF method could increase the similarity of pointwise LRI from the same land cover type and the discrepancy of those from different kinds of targets. Using the LMF-corrected LRI could improve the overall classification accuracy of three land cover types by about 3% (all over 81%, κ ≥ 0.73, p < 0.05), compared to those using the original and range-normalized LRI. The sensor-related metrics brought more noise to the original LRI information than the environmental factors. Using the LMF method could effectively correct LRI information from historical ALS datasets.


Introduction
The discrete point cloud data collected by airborne laser scanning (ALS) systems can depict the 3-D morphological features of the targets. The return signals of laser pulses can be pointwise interpreted as the laser return intensity (LRI) [1], the value of which is in theory related to the backscatter reflectance of scan targets to the laser pulse [2,3]. The LRI information from ALS data has been used to classify land cover types [4], identify tree species in large-scale forests [5,6], estimate the leaf area index [7,8], and monitor tree growth [9]. Some studies have used the LRI information of points from forest canopies to reveal the 3-D distribution features of some canopy physiological indices, such as the leaf nitrogen content [1], leaf water potential [10], canopy photosynthetic rate [11], and canopy functional traits [12].
However, some sensor-related and environmental factors, such as the surface slope of targets, scan angles, and atmospheric attenuation, may introduce noise to the LRI information [13][14][15]. For example, the automatic gain control (AGC) tool embedded in some ALS systems can adjust the emitting signal power of the laser pulse [13,16]. Ideally, it is preferable to synchronously calibrate LRI information during the data collection [17], which some sensor manufacturers have done [15]. However, the major obstacle to LRI calibration for historical ALS datasets is that some real-time sensor settings are unrecorded and non-backtracking. Furthermore, different sensor manufacturers use diverse and

ALS Data Collection
A compact laser-based system (ALS70-HA and ALS50-HA, Leica Geosystems AG, Heerbrugg, Switzerland) with the laser pulse wavelength of 1064 nm was used to collect discrete point cloud data on sunny days from 28 September to 28 October 2013. The scan angle was set lower than 30° during the data acquisition. All ALS data were collected at the flying altitude of 900 m. The overlapping rate between neighbor flight lines was set higher than 60%. Up to fourth-return signals were interpreted and recorded as the discrete ALS data in the LAS V1.2 format. By using 873 checkpoints to evaluate the vertical accuracy across the entire study area, the RMSE of vertical accuracy was 0.03 m (0.05 m at the 95% confidence level). Each point was automatically distinguished as a point from building, tree crown, or ground and denoted by different codes in the ALS datasets. However, the points from grass, bare land, and road were denoted as the ground type. The original pointwise LRI values were normalized into the range of [0, 255]. The pointwise 3-band true color information (in RGB format) and timestamp (the GPS time tag at which the data receiver acquired the laser return signals) were recorded as well. More detailed information about the data collection process and original ALS datasets used in this research can be found and open-source downloaded at http://sonomavegmap.org/.

ALS Data Collection
A compact laser-based system (ALS70-HA and ALS50-HA, Leica Geosystems AG, Heerbrugg, Switzerland) with the laser pulse wavelength of 1064 nm was used to collect discrete point cloud data on sunny days from 28 September to 28 October 2013. The scan angle was set lower than 30 • during the data acquisition. All ALS data were collected at the flying altitude of 900 m. The overlapping rate between neighbor flight lines was set higher than 60%. Up to fourth-return signals were interpreted and recorded as the discrete ALS data in the LAS V1.2 format. By using 873 checkpoints to evaluate the vertical accuracy across the entire study area, the RMSE of vertical accuracy was 0.03 m (0.05 m at the 95% confidence level). Each point was automatically distinguished as a point from building, tree crown, or ground and denoted by different codes in the ALS datasets. However, the points from grass, bare land, and road were denoted as the ground type. The original pointwise LRI values were normalized into the range of [0, 255]. The pointwise 3-band true color information (in RGB format) and timestamp (the GPS time tag at which the data receiver acquired the laser return signals) were recorded as well. More detailed information about the data collection process and original ALS datasets used in this research can be found and open-source downloaded at http://sonomavegmap.org/. The Radar Equation is the physical basis for interpreting the laser return signals into pointwise LRI information from a discrete ALS dataset [30]. The parameters of the simplified Radar Equation (Equation (1) include the real-time sensor-related metrics, environmental parameters, and backscatter reflectance of targets to the laser pulse [19,31]: where P t is the real-time transmitted power of laser pulse (in W), D r is the aperture diameter of laser return signal receiver (in m), η sys is the sensor-specific transmission factor (dimensionless), ρ is the backscatter reflectance of target to the laser pulse (dimensionless), R is the transmission range of laser pulse from targets to return signal receiver (in m), β is the scan angle (the angle at which the laser pulse emitting from the laser scanner, degrees), and η atm is the atmospheric transmission factor (dimensionless). Theoretically, the pointwise LRI value from one ground object with a near-Lambertian surface is related to its R and ρ if the sensor-related metrics were invariable during the ALS data collection. However, η atm and P t are changeable when using a laser scanner having an embedded adjustment tool for sensor-related parameters, such as the AGC tool, while these two parameters are not pointwise recorded in a discrete ALS data. The altitude of each point (H a ) and its β value are recorded in the LAS file. We calculated η atm by Equation (2): where b = the atmospheric attenuation coefficient (in dB/km). Here, the b was set as 0.22 [14].

LRI Correction Methods
We designed a local median filter (LMF) method to point-by-point correct the LRI information from discrete ALS data. Presuming that the first-return points of trees are all from the canopy surface [23,31], we extracted those data relying on their classification and return-type codes stored in the LAS file (V1.2 version). As the return signal reflector within a forest canopy is always a mixture of woody and foliage components, we maintained the pointwise original LRI information of other returns from the inner canopy in the correction results. For ground objects except for tree crowns, we corrected their all-return LRI information by the following methods: A. Calculating the Local Slope of Targets We used the sphere searching method with a searching radius of 1.5 m to extract the local point set K = {p i (x i , y i , z i ), i = 1, 2, . . . , n} for each point. Then, we pointwise calculated the local slope for each point, except for those points from tree crowns. In a local point set, the mean coordinatefor C for the n points in K was calculated as: and the local slope of target covered by n points in K could be calculated as: Here, (x c , y c , z c ) is the Cartesian coordinate of the center point within a searching unit. When calculating the local slope of rooftops, we based on the pointwise classification codes stored in the LAS V1.2 file to remove the nearby points of ground in each searching unit. The height difference between rooftops and the ground surface might cause an overlarge slope near the boundaries of buildings. Similarly, we removed the points of buildings when calculating the local slope of road, bare land, and grass.

B. Calculating the Laser Incidence Angle
The surface slope of targets and scan angle (β) affect the incidence angle of laser return signal. As shown in Figure 2, the flight altitude of the laser scanner is H f m (H f = 900 for tested ALS dataset), NP is the normal vector of point P on the surface with the altitude of H a m. The local slope (∠NPL) is equal to α • . For points with β equal to 0 • , their locations are on the projection of flight lines of the laser scanner during the data collection (shown as the red dashed line in Figure 2b). However, for some clipped datasets that do not contain the projection of flight lines, we could not calculate the range between the emitting point of the laser pulse (O) and its sub-stellar point D and the angle ∠OLN. Owing to the fact that the distance OP can be calculated by β and LP, we deduced the variation range of ∠OPN, that is the incidence angle of laser return signal (β r ) [31]: Remote Sens. 2020, 12, x FOR PEER REVIEW 6 of 20 For non-forest points, we used β r to replace β [21] in Equation (1). The product of unrecorded sensor-related measurements in ALS datasets (UM) is calculated as: where Pr means the pointwise original LRI. Assuming that ρ of local points in one searching unit were similar, we built the KD-tree index for the original ALS data to search the nearest four points for each point. Then, we calculated the reference range of UM for each point as [UMmin, UMmax]: where UMq1 and UMq3 are the lower (i.e., 25%) quartile and upper (i.e., 75%) quartile of the nearest four UM values, respectively. The UMQR means the interquartile range of four UM values [32]. We kept the original UM for the kernel point if it was within the reference range to avoid excessive correction. Otherwise, we used the local median UM value to replace it. The pointwise scan angle was extracted by the Laspy 1.5.1 software package (https://github.com/grantbrown/laspy) in Python 3.5 program environment. Here, we redefined R as the vertical range between Hf and Ha to avoid setting a subjective reference plane for LRI correction. Finally, we recalculated LRI (LRIC) point-by-point as: where UMC means the post-corrected UM. Thus, the correction results simulated the pointwise LRI information interpreted from the laser return signal with a 0° incidence angle and post-corrected UM. In ∆NPL and ∆OPL: and OL = H f − H a tan β.
Remote Sens. 2020, 12, 1681 6 of 20 Based on the law of cosines, Substituting (5) to (8) into (9): the incidence angle could be calculated as: When using historical ALS data, users may only clip a part of the point cloud from the whole database, such as the point cloud data we used in this study. For each point, the clipped point cloud may not contain the projection of receiving the point of laser return signal (collected at the same timestamp and with 0 • incidence angle, such as point D in Figure 2), especially for points at the edge of the clipped range. Another case is that multiple points were chosen as the projection of the receiving point as they collected at the same timestamp and with 0 • incidence angle. Once the projection point cannot be extracted, or more than one point is selected, the azimuth of the laser return signal cannot be calculated. As shown in Figure 2, if ∠OLN > 90 • , the sensor cannot receive return signals from targets. Thus, and we calculated cos β r as:

C. The LRI Correction for Non-Forest Points
For non-forest points, we used β r to replace β [21] in Equation (1). The product of unrecorded sensor-related measurements in ALS datasets (UM) is calculated as: where P r means the pointwise original LRI. Assuming that ρ of local points in one searching unit were similar, we built the KD-tree index for the original ALS data to search the nearest four points for each point. Then, we calculated the reference range of UM for each point as [UM min , UM max ]: where UM q1 and UM q3 are the lower (i.e., 25%) quartile and upper (i.e., 75%) quartile of the nearest four UM values, respectively. The UM QR means the interquartile range of four UM values [32]. We kept the original UM for the kernel point if it was within the reference range to avoid excessive correction. Otherwise, we used the local median UM value to replace it. The pointwise scan angle was extracted by the Laspy 1.5.1 software package (https://github.com/grantbrown/laspy) in Python 3.5 program environment.
Here, we redefined R as the vertical range between H f and H a to avoid setting a subjective reference plane for LRI correction. Finally, we recalculated LRI (LRI C ) point-by-point as: Remote Sens. 2020, 12, 1681 7 of 20 where UM C means the post-corrected UM. Thus, the correction results simulated the pointwise LRI information interpreted from the laser return signal with a 0 • incidence angle and post-corrected UM. However, the original LRI would be kept for points with LRI C over the predefined range of LRI values (from 0 to 255 for the tested ALS data). D. The LRI Correction for Points from Tree Crowns The incidence angle of laser return signals from tree crowns could not be deduced. For each first-return point from tree crowns, we used the method described in the previous section to search its nearest four first-return points. Then, we judged whether the LRI value of the center point in one searching unit was in its reference range. If its LRI value was within the range of LRI q1 − 1.5 · I QR , LRI q3 + 1.5 · I QR (LRI q1 and LRI q3 mean the lower and upper quartile of local four LRI values, and I QR means the interquartile range of four LRI values), we maintained its original LRI. Otherwise, we used the local median LRI to replace its original LRI value. We programmed the LMF and RN correction methods in Python 3.5 using the Scikit-Learn 0.21.3 package [33].

Random Sampling and Evaluation
To evaluate the LRI correction methods, we selected point sets from five land cover types. The points in each sample point set (with an area of 3 m × 3 m) were from the same land cover type. For points from one point set (from road, bare land, grass, tree crown, or building), the similarity among their post-corrected LRI values should be higher. Then, we calculated the mean (µ), standard deviation (σ), and coefficient of variation (CV) of pointwise LRI for each point set: Furthermore, we used the coefficient of joint variation (CJV) [21] to evaluate the discrepancy of LRI information between two types of point sets (W and V). As some CJV values might overlarge if the average LRI values of pairs of point sets are close to each other, we optimized its calculating method as: Here, a higher CJV shows less between-class confusion of pointwise LRI information from W and V point sets.
The emission time tag of each laser pulse was not stored during the ALS data collection. Considering the propagation speed of the laser pulse (about 3 × 10 8 m/s), we ignored the time difference between the emitting and receiving time of a laser pulse. Thus, we assumed that the points collected at the same timestamp were with the same UM. For each point set, we calculated the mean CV (T CV ) and mean CJV (T CJV ) of LRI values collected at different timestamps, except for those collected at only one timestamp.
where N means the number of timestamps when collecting each point set, and B means the pointwise LRI values collected at the same timestamp. With the decreasing T CV and T CJV , the similarity among LRI values obtained at N timestamps would increase.

Land Cover Classification
To evaluate the contribution of LRI correction methods, we manually chose points from road, bare land, and grass as the training and test samples, which were all distinguished as ground type in the original datasets. There was no intersection between training and test points. We used the Laspy 1.5.1 software package to extract the points of ground type by the pointwise classification code stored in the LAS file. For training points, we combined their original, LMF-corrected, or RN-corrected LRI with the pointwise 3-D true color information to form the 4-D feature matrices as the input to the random forest (RF) classification model [34]. The RF classifier was programmed using the Scikit-Learn 0.21.3 [33]. We removed points from tree crowns and buildings, as they had been distinguished in the original datasets. Finally, we calculated the overall classification accuracy (OA), producer's accuracy (PA), and user's accuracy (UA) of points from each land cover type and the Kappa coefficient (κ) for the classification results. McNemar's test [35] was used to evaluate the statistical significance between the classification results obtained by using different types of LRI information (original, LMF-corrected, or RN-corrected).
where V type means the number of test points that were correctly classified (RO: road, L: bare land, G: grass) in ALS data of one plot. T type is the number of tested points from type V in the same dataset. F type is the number of test points misclassified as type V in the same dataset. The κ statistic is computed as: where P 0 means the relative observed agreement among N samples in error matrices with r rows. P c is the hypothetical probability of chance agreement. X i+ and X +i are the row probabilities and the column probabilities, respectively.

The LRI Correction Results of Sample Point Sets
The location of each point set is shown in Figure 3 as well as the number of timestamps and point density (the average number of points per square meter) information. The standard deviation (LRI-STD) and CV (LRI-CV) of original, LMF-corrected, and RN-corrected LRI information from each point set (total 250 point sets) selected in Plots A, B and C are shown in Figure 4 and Table 2. Each land cover type contains 50 sample point sets.
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 20 As shown in Figure 4, over 94% of point sets have lower LRI-CV and LRI-STD after the LMF correction than those of the original LRI information. Although over 72% of the RN-corrected point sets from five land cover types had lower LRI-CV, the LRI-STD of more than 70% of point sets witnessed increasing, which means the similarity of pointwise LRI values became lower after the RN correction. For a point set, using the LMF correction method could more robustly increase the similarity of LRI values (with decreasing LRI-CV and LRI-STD) compared to using the RN method.
The average LRI-STD dropped relatively severely for point sets from tree crowns but slightly for those from buildings and bare land in the LMF-corrected results. In Table 2, the mean LRI-STD decrease by 3.01 for point sets from grass, by 2.24 for point sets from bare land, by 3.22 for point sets from road, by 2.14 for point sets from buildings, and by 5.3 for point sets from tree crowns after being corrected by the LMF method. The LRI-CV of each point set also decrease after using the LMF correction method. The mean LRI-CV values decrease by 0.01 for point sets from grass, by 0.03 for point sets from bare land, by 0.04 for point sets from buildings, by 0.05 for point sets from road, and by 0.07 for point sets from tree crowns. However, the LRI-CV of each point set was nearly invariable after the RN correction compared with those of the original LRI information. As shown in Figure 4, over 94% of point sets have lower LRI-CV and LRI-STD after the LMF correction than those of the original LRI information. Although over 72% of the RN-corrected point sets from five land cover types had lower LRI-CV, the LRI-STD of more than 70% of point sets witnessed increasing, which means the similarity of pointwise LRI values became lower after the RN correction. For a point set, using the LMF correction method could more robustly increase the similarity of LRI values (with decreasing LRI-CV and LRI-STD) compared to using the RN method.
The average LRI-STD dropped relatively severely for point sets from tree crowns but slightly for those from buildings and bare land in the LMF-corrected results. In Table 2, the mean LRI-STD decrease by 3.01 for point sets from grass, by 2.24 for point sets from bare land, by 3.22 for point sets from road, by 2.14 for point sets from buildings, and by 5.3 for point sets from tree crowns after being corrected by the LMF method. The LRI-CV of each point set also decrease after using the LMF correction method. The mean LRI-CV values decrease by 0.01 for point sets from grass, by 0.03 for point sets from bare land, by 0.04 for point sets from buildings, by 0.05 for point sets from road, and by 0.07 for point sets from tree crowns. However, the LRI-CV of each point set was nearly invariable after the RN correction compared with those of the original LRI information.    The confusion of LRI information between each pair of point sets from diverse land types was decreased after the LMF correction. As shown in Table 3, for each pair of point sets selected from two land cover types except tree crowns, the mean LRI-CJV increase over 0.05 after the LMF correction compared to those from the original LRI information. However, the average LRI-CJV rising slightly (< 0.03) for pairs of point sets, including one from the tree crown. In the RN-corrected results, the mean LRI-CJV values increase slightly or keep invariable except for the pairs of point sets from grass and bare land. In Figure 5, we illustrated the result that the LRI-CJV values were always higher in the LMF-corrected results than those of the RN-corrected LRI information. Thus, the discrepancy in LRI information between each pair of point sets from diverse land cover types was relatively more significant in the LMF correction results, compared with those in the original and the RN-corrected LRI information.
Remote Sens. 2020, 12, x FOR PEER REVIEW 11 of 20 the LMF-corrected results than those of the RN-corrected LRI information. Thus, the discrepancy in LRI information between each pair of point sets from diverse land cover types was relatively more significant in the LMF correction results, compared with those in the original and the RN-corrected LRI information.  As shown in Figure 6 and Table 4, over 82% of the point sets have lower T CV values after the LMF correction compared to those of the original and RN-corrected LRI. In the LMF-corrected results, the mean T CV decrease by nearly 0.03 for point sets from road and building, by 0.07 for point sets from tree crowns, and by 0.01 for point sets from bare land. However, the mean T CV of each point set from grass is nearly invariable after the LMF correction. In addition, the most significant declines of T CV are 0.16 for point sets from road, 0.12 for point sets from tree crown, 0.08 for point sets from buildings, 0.04 for point sets from bare land, and 0.025 for point sets from grass after the LMF correction than using the original LRI information (shown in Figure 6a-e). It should be noted that the T CV of each point set was almost unchanged after the RN correction. We found that the average T CJV only decreased for point sets from bare land in the RN-correction results, but all types of point sets in the LMF-corrected results. As shown in Figure 6, the T CJV s become lower for over 78% of the tested point sets when using the LMF method compared to those of the original and RN-corrected LRI. The average T CJV s of LMF-corrected results drop dramatically for point sets from bare land (by from 1.19 to 0.62). Still, they vary slightly for those from tree crowns due to the high similarity (the average T CJV = 0.07) of the original LRI information collected at different timestamps. For over 78% of the point sets collected within multiple timestamps, the similarity of pointwise LRI values was higher after the LMF correction than those corrected by the RN method.

The Classification Accuracy of Ground Points
In Plots A, B, and C, we manually selected 30,000 test points for each of the three land cover types, including bare land, road, and grass, that were all denoted as ground type in the original LAS file. Then, the 90,000 test points were used to assess the classification accuracy of ground points by using the original and post-corrected (LMF-and RN-corrected) LRI information and the pointwise true color information. All results are shown in Figure 7 and Table 5. Using the corrected LRI could effectively increase the OA by about 3% (κ ≥ 0.73, p < 0.05) compared with the OA of using the original LRI information. Using the LMF-corrected LRI was better than the RN-corrected LRI for classifying ground points. The PA of road points (RO-PA) were higher in the classification results of Plots A and B by using the LMF-corrected LRI, while this index was lower than 84% when using different types of LRI information to distinguish the data of Plot C. The PA values of grass points (G-PA) were nearly invariant for the classification results of Plot C when using three types of LRI information. Still, this index increased by over 3% for the classification results of Plots A and B after using the LMF-corrected LRI information. The G-PA was slightly higher when using the RN-corrected LRI information to classify the ALS datasets of Plots C than using the LMF-corrected LRI information. The PA of points from bare land (L-PA) increased by less than 2% when using the post-corrected LRI. For the classification results of Plot C, the L-PA of LMF-corrected data increased by nearly 2% compared to those using the original LRI and by over 5% compared to those using the RN-corrected LRI. Thus, using the LMF-corrected LRI achieved higher accuracy when classifying the above three land cover types.

The Classification Accuracy of Ground Points
In Plots A, B, and C, we manually selected 30, 000 test points for each of the three land cover types, including bare land, road, and grass, that were all denoted as ground type in the original LAS file. Then, the 90,000 test points were used to assess the classification accuracy of ground points by using the original and post-corrected (LMF-and RN-corrected) LRI information and the pointwise true color information. All results are shown in Figure 7 and Table 5. Using the corrected LRI could effectively increase the OA by about 3% (κ ≥ 0.73, p < 0.05) compared with the OA of using the original Figure 6. T CV and T CJV of each point set collected within various timestamps. Here, the point sets collected at only one timestamp during the analysis are shown with T CV and T CJV equal to zero. Table 5. Classification accuracy of grass, bare land, and road points. The OA means the overall classification accuracy. The G-PA, L-PA, and RO-PA mean the producer's accuracy of tested points from grass, bare land, and road, respectively. The G-UA, L-UA, and RO-UA mean the user's accuracy of tested points from grass, bare land, and road, respectively. By the McNemar's test, the difference between each pair of classification results was significant with p < 0.05.  LRI information. The G-PA was slightly higher when using the RN-corrected LRI information to classify the ALS datasets of Plots C than using the LMF-corrected LRI information. The PA of points from bare land (L-PA) increased by less than 2% when using the post-corrected LRI. For the classification results of Plot C, the L-PA of LMF-corrected data increased by nearly 2% compared to those using the original LRI and by over 5% compared to those using the RN-corrected LRI. Thus, using the LMF-corrected LRI achieved higher accuracy when classifying the above three land cover types.

Sensitivity Analysis: R vs. UM
According to Equation (1), increasing the transmission range (R) of the laser return signal would reduce the LRI value. The scanning angle, surface slope, and altitude of ground objects affect the variables θ r and R of the laser return signal [20,36,37]. For point cloud data from a near-Lambertian surface, the variable scanning angle might be the main reason for a reduction in the similarity of pointwise LRI information from the same land cover type. However, the effects of variable UM to the pointwise LRI information are rarely explored in most range-based correction methods.
We calculated the mean STD and CV of the original pointwise LRI information collected at the same timestamp but with different R values (R STD and R CV ) or with similar R values but changeable UM (UM STD and UM CV ) in each point set from four land cover types. The points with a difference in R-values less than 0.1 m were assumed as points collected with similar R. We used the average values of these four metrics to do the sensitivity analysis, as each point set might contain multiple UM STD , UM CV , R STD , and R CV values. The results are shown in Figure 8 and Table 6, in which the R STD s and R CV s of point sets collected at only one timestamp were calculated. In addition, we removed the point sets from tree crowns where the local points rarely had similar R during the sensitivity analysis.
same timestamp but different R in each point set.

Types
Max  We found that the UM introduced more noise than R to the original LRI information from the ALS datasets of Plots A to E. The UMSTD was larger than RSTD for each point set from four land cover types except for about 30% of the point sets from buildings. In addition, the UMCV was slightly larger  Table 6. Largest UM STD , UM CV , R STD , and R CV of LRI information from point sets except for those from tree crowns. The UM STD and UM CV mean the average LRI-STD and LRI-CV for points collected with the similar transmission range of laser return signal (R) at different timestamps in each point set. The R STD and R CV mean the average LRI-STD and LRI-CV for pointwise LRI information collected at the same timestamp but different R in each point set.

Types
Max We found that the UM introduced more noise than R to the original LRI information from the ALS datasets of Plots A to E. The UM STD was larger than R STD for each point set from four land cover types except for about 30% of the point sets from buildings. In addition, the UM CV was slightly larger than R CV for each point set except for 26% of the point sets from buildings. The UM CV of LRI information from point sets of grass were lower than from the other three land cover types (i.e., bare land, building, and road). When using the laser pulse in the 1064-nm wavelength, the backscatter reflectance (ρ) of grass is theoretically higher than those of targets without vegetation coverage [38,39]. The AGC tool would adjust the sensor-related metrics when the data receiver could not capture laser return signals from targets with low ρ. It is reasonable that the average variable rate of pointwise LRI values with similar R collected at multiple timestamps from grass was lower than those from the other three land cover types (with lower UM CV ). We inferred that the variable UM might introduce more noise to the pointwise LRI information from targets with low ρ.

Comparison with the RN Method
The RN method is commonly used to correct the LRI information of discrete ALS data [3,40]. However, we found that the RN method contributed less than the LMF method in correcting the LRI information of tested ALS data. The RN and LMF methods both calculate R to correct the energy attenuation of laser return signals. When using the RN method, we set R as the difference in height between H f and H a . Thus, the index R used for the RN and LMF correction is equal. With the assumption that the variable R causes the outliers of LRI, the RN-corrected LRI values collected with similar R from the same targets should be the same. However, this was not the case for selected point sets.
Whether to assume that the pointwise UM values are invariable during the LRI correction is the main difference between the RN and LMF methods. Using the LMF correction method can detect the outliers of pointwise variable UM in original ALS data and replace them to suppress the noise caused by the variable sensor-related metrics. The original pointwise UM would be maintained if it was within its local reference range to avoid an excessive LRI correction. As shown in Figure 4, the LRI-STD and LRI-CV of RN-corrected results were always higher than those corrected by the LMF correction method. For ALS data collected with the fixed UM, using the RN and LMF methods can achieve the same results.
We also calculated the T CV and T CJV for point sets collected at multiple timestamps. Since the points in the same point set are of the same type and are distributed in the range of 3 m×3 m, their pointwise LRI values should be similar. For the point sets that were collected at more than one timestamp, the variable UM reduced the similarity of pointwise LRI values. As shown in Table 4 and Figure 6, the T CV and T CJV values were always higher for the original LRI information. It is sensible that the similarity of LMF-corrected LRI collected at different timestamps was increased (with lower T CV and T CJV ) for over 78% of the tested point sets, compared with those of the original and RN-corrected LRI information. When using a laser scanner that can automatically adjust the emitting power of laser pulse (e.g., the ALS50-HA), it is necessary to suppress the noise introduced by the variable transmission range of return signals and sensor-related parameters. The results of this study also verify that using the LMF method can robustly correct the LRI information interpreted by different laser scanners.

The Sources of Outliers in the LMF-Corrected LRI Information
After the LMF correction, the LRI values of some points might surpass their predefined range from 0 to 255. Through visual detection, these points were always at the edges of rooftops, or grass area, or on the surfaces of vehicles parked on the road, but they were seldom from road and bare land. Yan and Shaker [21] also found some overlarge LRI values from grass and rooftops after the LRI correction. Furthermore, the LRI values of some points from tree crowns, which were misclassified as ground type in original ALS datasets, might be overlarge after the LMF correction.
We inferred that the sources of outliers were as follows: (1) we hypothesized that the flying altitude of the laser scanner was invariant during the data collection. However, the real-time flying altitude was not recorded in discrete ALS datasets. For return signals collected with lower R, the corresponding interpreted LRI values would be higher according to Equation (1) [41]. (2) The pointwise LRI values from targets with high ρ are larger than those from targets with lower ρ in theory. For example, the LRI values of points from tree crowns are higher than those of points from asphalt road when using the near-infrared laser pulse. Thus, we might not precisely detect the reference range of UM for some points on the boundaries of targets. (3) Setting the exponent of the denominator in Equation (1) as two (squared) is only suitable for the reflectors with near-Lambertian surfaces [3]. However, existing research has verified that this setting is not ideal for correcting LRI values from targets with a linear or point distribution, such as suspended electrical wires or upright stumps [19].
(4) The LRI information of scan targets located in the overlapping region of LiDAR data strips may be affected by line-stripping noise [21]. (5) The hotspot effect of received laser return signals. The LRI values dramatically increase naturally toward the zero-phase angle between the laser scanner and viewing directions. Without the synergy field survey for the ρ of ground objects to the laser pulse, we cannot quantitatively assess the hotspot effects of LRI information for historical ALS data. Moreover, the hotspot effect on LRI information seems more severe for near-infrared laser pulses according to the laboratory tests [17]. Thus, for points with overlarge post-corrected LRI values (>255), we maintained their original LRI in the results.

Limitations and Potential Applications
The LMF method is not suitable for correcting the LRI information interpreted from more than first-return signal from tree crowns, in which the return signals might be reflected from understory shrubs and soil background with diverse ρ. Thus, we did not calculate the UM for points from tree crowns, but replaced the outliers of related LRI information by their local median value. However, it should be noted that the relationship between the pointwise post-corrected LRI value and corresponding ρ was still unknown for historical ALS data [42].
When calculating the reference range of UM for non-forest points or LRI value for first-return points from tree crowns, we assumed that points in one searching unit have similar ρ. However, the reflectance among adjacent points may be different if the local point density was low, as they may from multi-class objects. We inferred that the variable local point density had an impact on the LRI correction results. To evaluate the effect of point density on the LRI correction results, we must consider the geometrical features of objects and the complexity of land cover types. It is challenging to evaluate the lowest available point density for ALS data collected by different laser scanners and from diverse regions. Assessing the effects of point density on the LMF correction results should be carried out by synchronously surveying the backscatter reflectance of ground objects to the emitting laser pulse during the data collection. At present, we cannot carry out this study by using a historical ALS dataset without a field survey of the backscatter reflectance of ground objects. We will attempt to do this in our future research.
The LRI-STD and LRI-CV of point sets from the same land cover type experienced a reduction after using the LMF correction method, especially for the pointwise LRI information from tree crowns, grass, and road. Using the LMF correction method could achieve higher accuracy when classifying land cover types from the ALS data. Existing research studies have attempted to use the multi-temporal and multispectral LRI information from ALS data collected in forests to reveal the spatiotemporal variation of canopy physiological and functional traits at the single crown level [43,44], for example, the chlorophyll content, the photosynthetic rate, and the equivalent water thickness of leaves [39,45]. Some studies have verified that using the corrected multispectral LRI information is superior to the multispectral images in depicting the 3-D distribution of physiological features at the singletree level in large-scale forests [12,46]. Theoretically, the LMF method also has the potential to correct the LRI information from multispectral ALS datasets, which needs to be verified in future ecological research.

Conclusions
To correct the pointwise LRI information from the discrete ALS dataset, we designed the LMF method, a semi-theoretical correction method without the requirement of synergy field surveys for the backscatter reflectance of targets to emitting laser pulses. Based on the simplified Radar Equation, we calculated, point-by-point, the local reference range of UM for each non-forest point, and then replaced the outliers with the local median UM to recalculate their LRI values. For the pointwise LRI from tree crowns that were beyond their reference range, we replaced these with the local median LRI values. The average LRI-CV and LRI-STD decreased by more than 0.01 and 2.14, respectively, for five types of point sets after using the LMF method. The discrepancy among the pointwise LMF-corrected LRI values from diverse land cover types became more significant (with higher LRI-CJV) compared to those corrected by the RN method and in original LRI information. Besides, for over 78% of the sample point sets, the similarity of pointwise LMF-corrected LRI values collected at different timestamps was increased (with lower T CV and T CJV ). To classify the ground points into three land cover types (grass, road, and bare land) from ALS data, combining the pointwise LMF-corrected LRI with the 3-D true color information could increase the OA about 3% (κ ≥ 0.73, p < 0.05), compared with the results of using the original and RN-corrected LRI information. The variable pointwise UM introduced more noise than R to the original LRI information from the tested ALS data. The LMF method shows its ability to correct the outliers of LRI for the tested historical discrete ALS data collected by different laser scanners.