Estimating FPAR of maize canopy using airborne discrete-return LiDAR data

Abstract: The fraction of absorbed photosynthetically active radiation (FPAR) is a key parameter for ecosystem modeling, crop growth monitoring and yield prediction. Ground-based FPAR measurements are time consuming and labor intensive. Remote sensing provides an alternative method to obtain repeated, rapid and inexpensive estimates of FPAR over large areas. LiDAR is an active remote sensing technology and can be used to extract accurate canopy structure parameters. A method for estimating FPAR of maize from airborne discrete-return LiDAR data was developed and tested in this study. The raw LiDAR point clouds, acquired over a maize field in the Heihe River Basin, northwest China, were filtered to separate ground returns from vegetation returns. The fractional cover (fCover) of the maize canopy was computed as the ratio of canopy return counts or intensity sums to the total returns or intensities. FPAR estimation models were established based on linear regression analysis between the LiDAR-derived fCover and the field-measured FPAR (R² = 0.90, RMSE = 0.032, p < 0.001). The reliability of the constructed regression model was assessed using the leave-one-out cross-validation procedure, and the results show that the regression model does not overfit the data and generalizes well. Finally, 15 independent field-measured FPARs were used to evaluate the accuracy of the LiDAR-predicted FPARs, and the results show that the LiDAR-predicted FPAR has a high accuracy (R² = 0.89, RMSE = 0.034). In summary, this study suggests that airborne discrete-return LiDAR data can be used to accurately estimate FPAR of maize.


Introduction
The fraction of photosynthetically active radiation (PAR) absorbed by vegetation in the 0.4-0.7 μm spectrum, also known as FPAR, is one of the important terrestrial variables controlling the mass and energy exchanges between vegetation and the atmosphere [1][2][3], and is therefore one of the key parameters required in crop production models and Earth system models for simulating land vegetation-atmosphere interactions [4][5][6][7]. FPAR can be measured directly and accurately using traditional ground-based methods, but such measurements are difficult and time consuming [1,4,8]. In addition, it is not feasible to apply ground-based methods to measure FPAR across large landscapes [9]. Even at regional scales, these methods are difficult to use for studying spatial patterns of FPAR [10]. Remote sensing provides an alternative and unique method to obtain repeated, rapid and inexpensive estimates of FPAR over large areas [6,11]. Passive remote sensing data have been widely used to derive FPAR using radiative transfer models or empirical relationships between FPAR and vegetation indices [5,12]. Previous studies showed a linear or near-linear relationship between vegetation index (VI) and FPAR [5,13], where commonly used vegetation indices include the normalized difference vegetation index (NDVI), the simple ratio (SR), and the enhanced vegetation index (EVI). For example, FPAR can be derived from SR by linear scaling:

$$\mathrm{FPAR} = \frac{(SR - SR_{i,\min})(\mathrm{FPAR}_{\max} - \mathrm{FPAR}_{\min})}{SR_{i,\max} - SR_{i,\min}} + \mathrm{FPAR}_{\min} \qquad (1)$$

where SR = (1 + NDVI)/(1 - NDVI), FPAR_max = 0.950, FPAR_min = 0.001; FPAR_max and FPAR_min are independent of vegetation type; SR_i,max is the SR value corresponding to 98% of the NDVI population for type i vegetation; and SR_i,min is the SR value corresponding to 5% of the NDVI population for type i vegetation. FPAR can also be estimated from empirical relationships between field-measured FPAR and leaf area index (LAI) [15]. The Beer-Lambert law, which approximates vegetation as a turbid medium, has been widely used to describe the exponential attenuation of monochromatic radiation in a canopy [16,17].
Casanova et al. [18] found that the ratio between transmitted and incoming photosynthetically active radiation (PAR; 0.4-0.7 μm) under a canopy decreases approximately exponentially with LAI, and thus FPAR can be expressed as a function of LAI as follows:

$$\mathrm{FPAR} = 1 - e^{-c \cdot \mathrm{LAI}} \qquad (2)$$

where FPAR is the fraction of PAR intercepted by the canopy, c is the extinction coefficient and LAI is the leaf area index. Although vegetation indices can be used to estimate FPAR, the accuracy of the estimated FPAR is limited by saturation of the vegetation indices, because canopy spectral reflectance becomes less sensitive to LAI variation when LAI is greater than 3.0 [19, 20]. The FPAR estimation based on Eq. (2) is also affected by the background spectral information if LAI is low [21,22]. Furthermore, since optical remotely sensed data do not take into account the three-dimensional structural characteristics of the canopy cover, the information contained in two-dimensional data such as multi-spectral imagery is not sufficient for estimating FPAR [8]. Although interferometric synthetic aperture radar (InSAR) can provide some three-dimensional vegetation structural information, its coarse spatial resolution means that the current InSAR technique is incapable of measuring stand-scale canopy vertical structures [8, 23, 24].
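The exponential relationship between FPAR and LAI discussed above can be illustrated with a minimal Python sketch. It assumes the common Beer-Lambert form FPAR = 1 - exp(-c · LAI); the function name and the extinction coefficient of 0.5 are hypothetical values chosen for illustration and are not taken from this study.

```python
import math


def fpar_from_lai(lai: float, c: float) -> float:
    """Beer-Lambert form: FPAR = 1 - exp(-c * LAI)."""
    if lai < 0 or c < 0:
        raise ValueError("LAI and c must be non-negative")
    return 1.0 - math.exp(-c * lai)


# Illustrative extinction coefficient only (an assumption, not a fitted value).
C_EXAMPLE = 0.5

for lai in (0.5, 1.0, 3.0, 6.0):
    print(f"LAI = {lai:.1f}  FPAR = {fpar_from_lai(lai, C_EXAMPLE):.3f}")
```

Under this assumed form, doubling LAI from 3.0 to 6.0 adds much less FPAR than the first three units of LAI do, which is one way to see why LAI-sensitive indices lose discriminating power in dense canopies.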
LiDAR (Light Detection and Ranging) is an active remote sensing technology and has been widely used in extracting three-dimensional information of land surfaces [25,26], such as fine spatial resolution and high vertical accuracy 3D structural characteristics of plant cover [27,28], which are very useful for studying forest canopy structure and leaf physiology, and estimating biophysical parameters [29, 30], e.g., tree height, crown width, fractional cover, LAI, total aboveground biomass, and timber volume [e.g., 31-37]. LiDAR systems have the ability to penetrate into vegetation canopy and thus can be used to characterize light transmittance through the canopy [38,39]. This, in turn, provides information about the available light for photosynthesis and its variation over the entire canopy depth [38].
Several studies have shown that airborne LiDAR data can be used to accurately estimate FPAR based on empirical relationships between FPAR and LiDAR-derived metrics [e.g., 4,8,38,39]. Recent studies that used LiDAR data to estimate FPAR have concentrated on forest vegetation [e.g., 8, 39], and the application of LiDAR data to estimate FPAR of crops such as maize and sorghum has barely been investigated, even though FPAR is an important parameter in many crop growth and yield models [e.g., 1,40]. Therefore, this study aimed to estimate FPAR of maize using airborne discrete-return LiDAR data. Three main objectives were pursued: 1) computing the vegetation fractional cover (fCover) using the ratio of canopy return counts or intensity sums to the total returns or intensities; 2) establishing a series of regression relationships between LiDAR-derived fCover and field-measured FPAR; and 3) evaluating the reliability of the constructed regression model by performing a leave-one-out cross-validation (LOOCV) analysis and validating the accuracy of the LiDAR-predicted FPARs.

Study area
The study area is in the Heihe River Basin, northwest China (38°56′N-38°59′N, 100°25′E-100°28′E) (Fig. 1). The mean annual air temperature, precipitation and potential evapotranspiration are about 7.3 °C, 129 mm, and 2047 mm in this arid region, respectively. The terrain in the study area is relatively flat with a mean elevation of 1403 m above sea level. A total of 40 FPAR values were measured in six sampling areas (see Fig. 1).

Field measurements
The field experiment was conducted in July 2012. The PAR components needed to derive FPAR were measured as photosynthetic photon flux density (PPFD, µmol m⁻² s⁻¹) using an LI-191SA Linear Quantum PAR Sensor (LI-COR, Inc.) [41]. This sensor spatially averages radiation over its 1 m length, which minimizes the error and allows one person to make many measurements in a short period of time [41]. To obtain FPAR, above-canopy downwelling (PAR_ad) and upwelling (PAR_au) PAR and below-canopy downwelling (PAR_bd) and upwelling (PAR_bu) PAR were measured for each sample plot. The linear quantum sensor was held approximately 0.5 m above the canopy surface for the above-canopy measurements and 0.15 m above the ground for the below-canopy measurements. This placement ensured that the sensor was above the short grass vegetation in each sampling plot. To reduce the error, the sensor was kept level during all measurements.
FPAR was then calculated from the four measurements as follows:

$$\mathrm{FPAR} = \frac{(PAR_{ad} - PAR_{au}) - (PAR_{bd} - PAR_{bu})}{PAR_{ad}} \qquad (3)$$

where PAR_ad and PAR_au are the incident and reflected PAR above the canopy, respectively, and PAR_bd and PAR_bu are the incident and reflected PAR below the canopy, respectively [42]. The geographic coordinates of each plot center point were measured using a centimeter-level GPS with a horizontal accuracy of 0.012 m and a vertical accuracy of 0.01 m. A total of 40 plots with an area of 5.0 m × 5.0 m each were measured at representative sites across the study area. The maize crop during the field survey was between the pre-flowering and flowering stages. The average height of the maize was 1.85 m, and the average row spacing and intra-row spacing were about 0.6 m and 0.25 m, respectively.
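As a minimal sketch of how the four PAR readings of a plot combine into one FPAR value, the function below assumes the common four-flux definition, in which absorbed PAR is the net flux above the canopy minus the net flux below it, divided by the incident PAR. The function name and the sample readings are hypothetical and are not measurements from this study.

```python
def fpar_from_par(par_ad: float, par_au: float,
                  par_bd: float, par_bu: float) -> float:
    """FPAR from above/below-canopy downwelling and upwelling PAR.

    All inputs share one unit (e.g. PPFD in umol m^-2 s^-1), so the
    result is a unitless fraction.
    """
    if par_ad <= 0:
        raise ValueError("incident above-canopy PAR must be positive")
    net_above = par_ad - par_au   # PAR entering the canopy from above
    net_below = par_bd - par_bu   # PAR leaving the canopy towards the soil
    return (net_above - net_below) / par_ad


# Hypothetical readings for one plot (illustration only).
print(round(fpar_from_par(1800.0, 90.0, 250.0, 20.0), 3))
```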

Collection of airborne LiDAR data
Small-footprint airborne LiDAR data were acquired in July 2012 using a Leica ALS70 system. For the study area, the scan angle was ±18° with a 60% flight line side lap. The average flying height was 1300 m above the ground with a velocity of 60 m/s and an average point density of 7.4 points/m². The raw LiDAR data were converted to a LAS binary format file including x, y, z coordinates and return intensity values.

LiDAR data processing
The flowchart of LiDAR data processing and FPAR estimation is shown in Fig. 2. Raw airborne LiDAR data may contain outliers, i.e., isolated points that lie far above or below the other points in their vicinity, either up in the air or below the ground. These outliers were removed from the raw LiDAR data before the data were processed with the TerraScan software (TerraSolid, Ltd., Finland). The LiDAR point clouds were then classified into canopy and ground returns using the progressive triangulated irregular network (TIN) densification method proposed by Axelsson [43], as implemented in TerraScan. Because the automatic classification may mislabel some points, and any misclassification could result in significant errors in the extracted DEM and vegetation structural parameters, it is very important to manually check the results and reclassify points where necessary.
The fractional cover of canopy (fCover) is defined as the fraction of ground covered by vegetation [32]. The LiDAR-derived canopy fractional cover can be calculated as the ratio of the number of canopy laser returns or intensity sums to the total returns or intensities based on Eq. (4):

$$fCover = \frac{N_{canopy}}{N_{total}} \qquad (4)$$

where fCover is the fractional cover of canopy, N_canopy is the number of canopy laser returns or the sum of their intensities, and N_total is the total number of returns or the total intensity. LiDAR intensity is the amount of energy reflected back to the LiDAR sensor, which is a function of several variables such as laser power, angle of incidence, target reflectivity and LiDAR sensor-to-target range [47,48]. It is therefore necessary to calibrate intensity values to achieve a better comparison among different strips, flights or regions [49]. To improve the accuracy of the LiDAR-derived canopy fractional cover, LiDAR intensity data were normalized using Eq. (5) [50]:

$$I_{normalized} = I \cdot \frac{R^{2}}{R_{s}^{2}} \cdot \frac{1}{\cos a} \qquad (5)$$

where I_normalized is the normalized intensity, I is the raw intensity, R is the sensor-to-target range, R_s is the reference range or average flying height (in this study R_s = 1000 m) and a is the angle of incidence. Since the study area has relatively flat terrain, the scan angle and the angle of incidence were approximately equal. Therefore, we used the scan angle as the incidence angle a in Eq. (5), and the effect of terrain on the LiDAR intensity was neglected in this study [49,51]. The reflectances of the canopy and the ground differ at the infrared wavelength (1064 nm). To reduce the effect of this reflectance difference on the accuracy of the estimated canopy fractional cover, the LiDAR intensity values of canopy and ground need to be adjusted according to Eq. (6) [52][53][54]:

$$fCover = \frac{\sum I_{canopy}}{\sum I_{canopy} + k \sum I_{ground}} \qquad (6)$$

where k is the adjusting factor of reflectance, and I_canopy and I_ground are the intensities of canopy and ground returns, respectively. Previous studies generally used 2.0 as the k value in calculating fCover [e.g., 52-54]. To determine the optimal k value and improve the accuracy of FPAR estimation, we examined a sequence of k values from 1.0 to 10.0 with an increment of 0.5 and selected the value that yielded the highest R² value in a linear regression against the field-measured FPAR [46].
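The return-based and intensity-based fCover calculations described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the processing chain used in the study: the `LidarReturn` record and all function names are hypothetical, the normalization is assumed to take the common range-squared and cosine form I · (R/R_s)² / cos(a), and the k-weighted ratio is assumed to be ΣI_canopy / (ΣI_canopy + k · ΣI_ground).

```python
import math
from dataclasses import dataclass


@dataclass
class LidarReturn:
    """Hypothetical record for one classified return."""
    intensity: float        # raw intensity I
    range_m: float          # sensor-to-target range R
    scan_angle_deg: float   # stands in for the incidence angle on flat terrain
    is_canopy: bool         # True for canopy returns, False for ground returns


def normalize_intensity(ret: LidarReturn, ref_range_m: float) -> float:
    """Assumed normalization: I * (R / R_s)^2 / cos(a)."""
    cos_a = math.cos(math.radians(ret.scan_angle_deg))
    return ret.intensity * (ret.range_m / ref_range_m) ** 2 / cos_a


def fcover_returns(returns: list) -> float:
    """Return-count ratio: canopy returns divided by all returns."""
    if not returns:
        raise ValueError("no returns in plot")
    return sum(r.is_canopy for r in returns) / len(returns)


def fcover_intensity(returns: list, ref_range_m: float, k: float = 1.0) -> float:
    """Intensity ratio with ground intensity weighted by the factor k."""
    canopy = sum(normalize_intensity(r, ref_range_m)
                 for r in returns if r.is_canopy)
    ground = sum(normalize_intensity(r, ref_range_m)
                 for r in returns if not r.is_canopy)
    denominator = canopy + k * ground
    if denominator == 0:
        raise ValueError("no intensity in plot")
    return canopy / denominator


# Candidate k values from 1.0 to 10.0 in steps of 0.5, as examined in the text.
K_CANDIDATES = [1.0 + 0.5 * i for i in range(19)]
```

With k = 1 the intensity-based ratio reduces to the plain intensity sum ratio, and larger k values give ground returns more weight, which lowers the resulting fCover.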

FPAR estimation from LiDAR data
The fCover is dependent on canopy structure up to a certain radius, and a trial-and-error method can be used to determine the optimal radius [55]. Previous studies provided empirical evidence on how to identify a reasonable radius [32,37]. To obtain the optimal plot size for estimating FPAR from the LiDAR data, the center coordinates of each plot were used to extract return counts and intensities from the classified LiDAR point clouds for a range of plot radii from 1.0 m to 7.0 m with an increment of 0.5 m. For each plot and each radius, the number of canopy returns or the sum of canopy intensities (N_canopy) and the total number of returns or the total intensity (N_total) were calculated. The LiDAR return-based and intensity-based fCover of each plot were then computed according to Eq. (4) using N_canopy and N_total. A linear regression analysis of the LiDAR-derived fCover against the field-measured FPAR was carried out to build an empirical FPAR estimation model as given in Eq. (7):

$$y = ax + b \qquad (7)$$

where y is FPAR, x is fCover based on returns or intensities, and a and b are the slope and the y intercept of the regression line, respectively. The coefficient of determination (R²), the adjusted R² (adj. R²) and the root mean square error (RMSE) were calculated across the range of plot sizes (i.e., 1.0 m-7.0 m in radius). The statistics for each radius were compared to determine the optimal FPAR estimation model and the optimal plot size.
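The radius sweep and regression described above can be sketched as follows. This is a hedged illustration, not the authors' code: the function names and the dictionary layout (radius mapped to one fCover value per plot) are assumptions, the fit is an ordinary least-squares line, RMSE is assumed to be the square root of the mean squared residual, and adj. R² uses the usual one-predictor formula.

```python
import math


def points_within(points, center, radius):
    """Keep the (x, y, ...) tuples whose horizontal distance to center <= radius."""
    cx, cy = center
    return [p for p in points if math.hypot(p[0] - cx, p[1] - cy) <= radius]


def linear_fit(x, y):
    """Least-squares fit of y = a*x + b; returns (a, b, r2, adj_r2, rmse)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must both vary")
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b = my - a * mx
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    rmse = math.sqrt(ss_res / n)
    return a, b, r2, adj_r2, rmse


def best_radius(fcover_by_radius, fpar):
    """Return (radius, fit) for the radius whose fit has the highest R^2."""
    fits = {r: linear_fit(fc, fpar) for r, fc in fcover_by_radius.items()}
    radius = max(fits, key=lambda r: fits[r][2])
    return radius, fits[radius]


# Plot radii from 1.0 m to 7.0 m in 0.5 m steps, as in the text.
RADII_M = [1.0 + 0.5 * i for i in range(13)]
```

In use, `fcover_by_radius` would hold, for every radius in `RADII_M`, the fCover of each calibration plot in the same order as the field-measured `fpar` list.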

Results
Since fCover varies with plot size [37,55], fCover was estimated for the 40 field plots using plot radii from 1.0 m to 7.0 m with an increment of 0.5 m to investigate the effect of plot size on model performance. A linear regression analysis (i.e., Eq. (7)) between 25 field-measured FPARs and the LiDAR-derived fCover values was carried out. The statistical characteristics of the estimated FPARs for the different plot radii are shown in Table 1. According to Table 1, at a plot radius of 4.0 m, both the LiDAR return-based and intensity-based fCover estimates have the smallest RMSE and the highest correlation with the field-measured FPAR. To account for the reflectance difference between the canopy and the ground, fCover was calculated using Eq. (6) to improve the accuracy of FPAR estimation. The accuracy of the intensity-based FPAR estimation was highest (R² = 0.90) when the k value in Eq. (6) was 3.0, whereas R² was 0.85 for the LiDAR return-based fCover estimation. Scatterplots of the LiDAR-derived fCover versus the field-measured FPAR and the corresponding regression models are presented in Fig. 3. To determine the optimal model of FPAR estimation using the airborne LiDAR data, we compared the LiDAR intensity-based and return-based models. The results indicate that the intensity-based prediction model had the highest accuracy (R² = 0.90, adj. R² = 0.89, RMSE = 0.032) and yielded slightly better estimates than the LiDAR return-based model (R² = 0.85, adj. R² = 0.84, RMSE = 0.038). This result is in good agreement with Hopkinson and Chasmer, who showed that laser intensity can potentially provide an accurate estimate of FPAR [44], because intensity-based methods implicitly quantify the surface area that the pulse interacts with in the form of the reflectance amplitude (i.e., intensity) [44].

Accuracy assessment
To assess the model's reliability, the predicted residual sum of squares (PRESS statistic) was calculated using the leave-one-out cross-validation (LOOCV) analysis [37,56], which is an effective method to evaluate the generalization capability of regression models and is particularly useful when only a small number of field measurements are available [56]. The root mean square error from the cross-validation analysis (RMSE_cv) was computed as the square root of the ratio between the PRESS statistic and the number of observations [34]. For the intensity-based regression model, the RMSE_cv was 0.033 and the RMSE was 0.032. The close agreement between the RMSE_cv and the RMSE suggests that the regression model does not overfit the data and has a good generalization capability. In addition, we assessed the accuracy of the LiDAR-predicted FPARs using the other 15 field-measured FPARs, which were not used in the regression analysis (25 of the total 40 field-measured FPARs were used to construct the FPAR estimation model). The results show a good relationship (R² = 0.89, RMSE = 0.034, RMSE_cv = 0.035) between the field-measured and the LiDAR-predicted FPARs (Fig. 4).
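The PRESS-based cross-validation described above can be sketched in a few lines of Python. This is a minimal illustration assuming an ordinary least-squares line refitted once per held-out plot; the function names are hypothetical, and RMSE is again assumed to be the square root of the mean squared residual.

```python
import math


def fit_line(x, y):
    """Least-squares slope a and intercept b of y = a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary")
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return a, my - a * mx


def rmse_fit(x, y):
    """RMSE of the line fitted to all observations."""
    a, b = fit_line(x, y)
    return math.sqrt(sum((yi - (a * xi + b)) ** 2
                         for xi, yi in zip(x, y)) / len(x))


def press_loocv(x, y):
    """PRESS: sum of squared errors when each point is predicted by a
    model fitted to all the other points."""
    x, y = list(x), list(y)
    press = 0.0
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a * x[i] + b)) ** 2
    return press


def rmse_cv(x, y):
    """Cross-validated RMSE: sqrt(PRESS / number of observations)."""
    return math.sqrt(press_loocv(x, y) / len(x))
```

For a least-squares line, each leave-one-out error is at least as large in magnitude as the corresponding ordinary residual, so `rmse_cv` is never smaller than `rmse_fit`; a small gap between the two is the pattern the text interprets as a lack of overfitting.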

Discussion
In this study, we explored the potential of airborne discrete-return LiDAR data for estimating FPAR of maize. The results show that airborne LiDAR data can be used to estimate FPAR accurately. However, LiDAR data acquisitions over large areas are extremely expensive, and they become more expensive at higher sampling densities [57]. LiDAR and optical remotely sensed data can be considered complementary, as LiDAR provides vertical structural information of the canopy and optical data capture the spectral information of the ground cover [57]. Fusion of airborne LiDAR and multispectral or hyperspectral imagery has promising prospects for estimating FPAR at regional scales [58][59][60]. More effort is needed to study how to combine airborne LiDAR data and optical remotely sensed imagery to retrieve vegetation biophysical parameters (e.g., biomass and structure) [61,62]. The FPAR estimation approach developed in this study can be applied to other sites. However, since the estimation model is empirical, the fitted model cannot be used directly for different study areas and vegetation types; LiDAR data must be analyzed and modeled for each case to obtain the optimal FPAR estimation model for the study area concerned.
Thomas et al. [38] conducted spatial simulations of FPAR using a LiDAR-hyperspectral approach and found the strongest correlation between FPAR and mean LiDAR height. Here we also performed a linear regression analysis between mean LiDAR height and the field-measured FPAR and found that the maximum R² between FPAR and mean LiDAR height was 0.61 (RMSE = 0.062), which is lower than that of the LiDAR intensity-based estimation model developed in this study. Moreover, Thomas et al. [38] showed that the optimal cell size for estimating FPAR was 20 m in their study, whereas our results indicate that a plot radius of 4.0 m was the appropriate scale for estimating FPAR of maize. The difference between our results and those of Thomas et al. could be due to different vegetation types, study areas, and average LiDAR point densities.
The FPAR retrieved from airborne LiDAR data in this study is only an instantaneous value. Daily PAR can be calculated by integrating instantaneous PAR from sunrise to sunset [63]. To date, vegetation LiDAR research has been largely empirical, focusing on strong relationships between LiDAR-derived metrics and vegetation height, LAI, FPAR, biomass, and volume [38]. Further research is needed on developing innovative physically based methods, e.g., radiative transfer models [59,64], to estimate biophysical and biochemical parameters from airborne LiDAR.
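The integration of instantaneous PAR into a daily total mentioned above can be sketched numerically. The example below assumes PPFD samples in µmol m⁻² s⁻¹ taken at known times between sunrise and sunset and uses the trapezoid rule; the function name, the sampling scheme and the sample values are hypothetical and only illustrate the arithmetic.

```python
def daily_par_mol(times_h, ppfd_umol):
    """Integrate PPFD samples over the day with the trapezoid rule.

    times_h: sample times in hours, strictly increasing.
    ppfd_umol: PPFD at those times, in umol m^-2 s^-1.
    Returns the daily total in mol m^-2.
    """
    if len(times_h) != len(ppfd_umol) or len(times_h) < 2:
        raise ValueError("need at least two paired samples")
    total_umol = 0.0
    for i in range(1, len(times_h)):
        dt_s = (times_h[i] - times_h[i - 1]) * 3600.0  # hours to seconds
        if dt_s <= 0:
            raise ValueError("times must be strictly increasing")
        total_umol += 0.5 * (ppfd_umol[i] + ppfd_umol[i - 1]) * dt_s
    return total_umol / 1e6  # umol to mol


# Hypothetical triangular day: zero at 06:00 and 18:00, peak at noon.
print(daily_par_mol([6.0, 12.0, 18.0], [0.0, 2000.0, 0.0]))
```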

Conclusions
FPAR is an important input parameter for terrestrial ecosystem models. Although many studies have shown that LiDAR is a viable technique for estimating FPAR, the majority of these studies are forestry-oriented. The aim of this study was to investigate the feasibility of applying discrete-return small-footprint LiDAR data to estimate FPAR of maize. A method to retrieve FPAR from LiDAR data was proposed and applied to a maize field in northwest China. The canopy and ground returns were first separated from the raw LiDAR point clouds. The canopy fractional cover (fCover) was estimated as the ratio of the number of canopy laser returns (or intensity sums) to the total returns (or intensities). FPAR estimation models of maize were established based on linear regression analysis between the field-measured FPAR and the LiDAR-derived fCover. The results show strong correlations between the LiDAR-derived fCover and the field-measured FPAR (R² = 0.90, RMSE = 0.032, p < 0.001). A plot radius of 4.0 m was found to be the optimal scale for estimating FPAR. The accuracy of the estimated FPAR can be improved by normalizing the LiDAR intensity values based on a standard range and the incidence angle. The LiDAR intensity-based FPAR estimation model also had a higher accuracy than the LiDAR return-based FPAR model. The reliability of the LiDAR intensity-based FPAR model was assessed using the LOOCV procedure, and the results show that the regression model does not overfit the data. Finally, 15 independent field-measured FPARs (that were not used in the regression analysis) were used to evaluate the accuracy of the LiDAR-predicted FPARs, yielding a coefficient of determination (R²) of 0.89 and an RMSE of 0.034.