Synergistic use of multi-and hyperspectral remote sensing data and airborne LiDAR to retrieve forest floor reflectance

Forest floor vegetation can account for a notable fraction of forest productivity and species diversity, and the composition of forest floor vegetation is an important indicator of site type. The signal from the forest floor influences the interpretation of optical remote sensing (RS) data. Retrieval of forest floor reflectance properties has commonly been investigated with multiangular RS data, which often have a coarse spatial resolution. We developed a method that utilizes a forest reflectance model based on photon recollision probability to retrieve forest floor reflectance from near-nadir data. The method was tested in boreal, hemiboreal, and temperate forests in Europe, with hemispherical photos and airborne LiDAR as alternative data sources to provide forest canopy structural information. These two data sources showed comparable performance, thus demonstrating the value of using airborne LiDAR as the structural reflectance model input to derive wall-to-wall maps of forest floor reflectance. We derived such maps from multispectral Sentinel-2 MSI and hyperspectral PRISMA satellite images for a boreal forest site. The validation against in situ measurements showed fairly good performance of the re-trievals in sparse forests (that had effective plant area index less than 2). In dense forests, the retrievals were less accurate, due to the small contribution of forest floor to the RS signal. We also demonstrated the use of the method in monitoring the recovery of forest floor vegetation after a thinning disturbance. The reflectance model that we used is computationally efficient, making it well applicable also to data from new and forthcoming hyperspectral satellite missions.


Introduction
Forest floor, defined here as all material below a tree canopy, is composed of varying mixtures of living plants, litter, lichen, rock, soil, and snow (Pisek et al., 2010).Forest floor vegetation can account for a notable fraction of forest productivity (Ikawa et al., 2015) and species diversity (Hart and Chen, 2006).Composition of forest floor vegetation is also an important indicator of site type (Cajander, 1926).Monitoring of forests through remote sensing has traditionally focused on the tree canopy, which has the largest economical value and often dominates the remote sensing signal.Depending on the density of the tree canopy, also the forest floor can have a large contribution to the forest reflectance observed through passive optical remote sensing (Eriksson et al., 2006;Rautiainen and Heiskanen, 2013;Suzuki et al., 2011).Variation of the forest floor reflectance can have a notable influence on the interpretation of the remote sensing signals from forests (Eriksson et al., 2006;Pisek and Chen, 2009;Spanner et al., 1990), and information on the forest floor reflectance characteristics can help to interpret these signals.
Forest floor reflectance can potentially also be utilized in assessing the vegetation composition of the forest floor or site fertility, as these variables are related to the forest floor reflectance characteristics (Forsström et al., 2021;Rautiainen et al., 2011).Furthermore, forest floor reflectance and its spatial variability can be used as input for forest reflectance and land surface models.
Previous studies have evaluated the potential of spectral remote sensing data, acquired from either airborne (Markiet and Mõttus, 2020;Pisek et al., 2010Pisek et al., , 2015a) ) or satellite (Canisius and Chen, 2007;Jiao et al., 2014;Kuusinen et al., 2015;Pisek and Chen, 2009;Pisek et al., 2015bPisek et al., , 2016Pisek et al., , 2021;;Yang et al., 2014) platforms, in estimating forest floor reflectance.An approach that utilizes multiangular remote sensing data together with a geometric-optical model has been evaluated in several papers (Canisius and Chen, 2007;Jiao et al., 2014;Pisek et al., 2010Pisek et al., , 2015aPisek et al., , 2015bPisek et al., , 2016Pisek et al., , 2021)).With the help of a geometric-optical forest reflectance model, the surface area fractions of different components of the forest (shaded and sunlit forest floor and canopy) can be estimated.Consequently, the reflectance of the sunlit forest floor and canopy components can be retrieved if remotely sensed forest reflectance data at two different view angles is available.The requirement of multiangular observations has restricted the majority of studies to low spatial resolution (MODIS or MISR) satellite data (Canisius and Chen, 2007;Jiao et al., 2014;Pisek and Chen, 2009;Pisek et al., 2015bPisek et al., , 2016Pisek et al., , 2021)).Use of multiangular airborne (Pisek et al., 2010(Pisek et al., , 2015a) ) or unoccupied aerial vehicle data is also possible, but such data are not commonly available.As discussed by Gemmell (2000), a reason for using multiangular data is that there are several variables contributing to forest reflectance, and thus a single nadir observation is not enough for their retrieval.Previous approaches to retrieve forest floor reflectance from nadir data have relied on the assumption that the forest floor reflectance is invariant within some predefined area (Yang et al., 2014) or within a certain site type (Kuusinen et al., 2015).
The optical remote sensing signal of a forest is composed of signals from the tree canopy and the forest floor.If the former is known (e.g., from a physical reflectance model), the forest floor reflectance can be estimated from a single remote sensing observation (pixel).This would make it possible to utilize data with nadir-view geometry that are more readily available and have finer spatial resolution compared to multiangular satellite datasets.To our knowledge, the only such study was presented by Markiet and Mõttus (2020) who estimated forest floor reflectance from airborne hyperspectral data.A challenge with the physically-based approaches is that they require some additional data on canopy structure to be able to parameterize the model.Airborne Light Detection And Ranging (LiDAR) data allows the estimation of canopy structural parameters such as canopy cover, plant area index, and canopy gap fractions (Kamoske et al., 2019;Korhonen et al., 2011;Webster et al., 2020).Airborne LiDAR data can estimate canopy structure parameters in a wall-to-wall map, which can be used to parameterize forest reflectance models so that the only remaining unknown variable is the forest floor reflectance.
The purpose of our study was to develop and test a new approach for estimating forest floor reflectance spectra from remote sensing data at stand level.We used a physically-based forest reflectance model to interpret near-nadir multi-and hyperspectral airborne and satellite data.More specifically, we 1) evaluated the accuracy of the forest floor reflectance retrieval when using hemispherical photographs or airborne LiDAR data as input in the forest reflectance model, and 2) produced and validated a local map of forest floor reflectance for a boreal study area and demonstrated its use in monitoring the recovery of forest floor vegetation after a thinning disturbance.This work is an advance towards operational mapping of forest floor reflectance at stand level from openly available medium-resolution satellite data as well as from future hyperspectral satellite data sets.
We retrieved forest floor reflectance spectra from multi-and hyperspectral remote sensing data using the PARAS model.PARAS is a forest reflectance model that applies photon recollision probability (p) for modeling forest canopy scattering (Rautiainen and Stenberg, 2005).As input, the model takes canopy element (i.e., foliage and woody element) spectra and canopy structural parameters (canopy interception and p).We first tested our method using airborne hyperspectral data and a set of test plots (40 × 40 m, number of plots (n) = 61), distributed across all four study sites in order to maximize variation in forest canopy structure.We compared two alternative approaches to obtain the canopy structural parameters required for the PARAS model: 1) hemispherical photographs and 2) airborne LiDAR.The purpose of this comparison was to evaluate the performance of airborne LiDAR in estimating the canopy structural parameters for the PARAS model.Use of LiDAR data would allow mapping forest floor reflectance for large areas.Finally, we created a wall-to-wall map of forest floor reflectance for the Hyytiälä site (area of ca. 4 × 4 km), using the combination of airborne LiDAR with either Sentinel-2 MSI multispectral or PRISMA hyperspectral data.The map was validated using a set of independent validation plots (60 × 60 m, n = 5), and its information content was demonstrated with practical examples.Our workflow is presented in Fig. 1.The following Sections 2.2-2.4 present our data, and the retrieval method for forest floor reflectance is explained in detail in Section 2.5.

Airborne hyperspectral and LiDAR data
Airborne data were collected in all study sites close to peak-growing season in 2019 (Table 1), using the Flying Laboratory of Imaging Systems FLIS (CzechGlobe -Global Change Research Institute CAS, 2022; Hanuš et al., 2016) that allowed synchronous acquisition of hyperspectral and LiDAR data.The data were acquired at approximately 1 km altitude above ground level.The hyperspectral data were acquired with CASI-1500 and SASI-600 pushbroom sensors (Itres Ltd., Canada), which measure at wavelength ranges of 382-1052 nm and 958-2443 nm, respectively, and have sampling interval and spectral resolution (fullwidth-at-half-maximum, FWHM) of 15 nm.The pixel size on the ground was 0.5 m (CASI) or 1.25 m (SASI).The data were geo-orthorectified to a canopy surface model computed from the airborne LiDAR data and were atmospherically corrected with ATCOR-4 software v7.2.0 (Richter and Schläpfer, 2018), to yield hemispherical-directional reflectance factors (HDRF).Further, the data were manually checked to exclude visible clouds or cloud shadows from the analysis.The nearest-to-nadir cloudfree flight line was selected for each of our test plots, and the mean HDRF spectrum of pixels within the plot (an area of 40 × 40 m) was used in further analyses.See Hovi et al. (2022) for details on the airborne hyperspectral data acquisition and processing.
The sensor operates at a wavelength of 1064 nm and the emitted pulse has beam divergence of 0.25 mrad (measured at 1/e 2 ).Echoes were extracted from the waveform data and georeferenced into ETRS89 coordinate system with UTM projection and ellipsoid heights.Riegl's software suite (RiProcess v1.8.4,RiAnalyze v6.2.2,RiWorld v5.1.3,and GeoSysManager v2.0.8) was used in the processing.The processing included a strip adjustment to ensure a good geometric match between the flight lines.We interpolated a raster digital elevation model (DEM) for each study site at 1 m resolution by Delaunay triangulation of ground echoes that had been detected with LASTools (algorithm "lasground -nature -fine").

Sentinel-2 MSI multispectral satellite data
We downloaded a total of 13 multispectral Sentinel-2 MSI images for the Hyytiälä site for years 2019-2021 (Table 2) from the Copernicus Open Access Hub (Copernicus, 2022).The selection criteria were that the image should be cloud-free and acquired during summer months (from mid-June to the end of August).We used atmospherically corrected surface reflectance (HDRF) data, which we generated from Level 1C images with the Sen2Cor processor (version 02.09.00), using the A. Hovi et al. default parameters and the altitude set to 170 m above sea level.The images were acquired at 12:50-13:01 local time (09:50-10:01 UTC) and, depending on the acquisition date, the sun zenith angle varied between 39 and 49 • (Table 2).The data had 20 m pixel size and were in the UTM zone 35 projection, which was the same as used for the airborne data in Hyytiälä.

PRISMA satellite data
To demonstrate forest floor reflectance retrieval from hyperspectral satellite data, we used a PRISMA (PRecursore IperSpettrale della Missione Applicativa) image produced by the Italian Space Agency.The image was acquired over the Hyytiälä site on June 26th 2020, at 13:02 local time (10:02 UTC), with sun zenith angle of 39 • and view zenith angle up to 10 • .The visible-near-infrared (VNIR) and shortwaveinfrared (SWIR) sensors of PRISMA measure at wavelength ranges of 402-973 nm and 943-2497 nm, respectively.The sampling interval is 6-12 nm and spectral resolution (FWHM) 9-15 nm, depending on wavelength.We used a Level 2D image (product name PRS_L2D_STD_20200626100214_20200626100218_0001, processing time 1st February 2022, processor version "02.05",L1 processor version "L2D_3.9-2").Level 2D means an atmospherically corrected and orthorectified image.The projection was UTM zone 35 and the pixel size on the ground was 30 m.The image had georeferencing errors up to 200 m, which represent a typical geolocation accuracy of the PRISMA products currently.We used aerial orthoimages of the National Land Survey of Finland, downloaded from their file service (NLS -National Land Survey of Finland, 2022) on 2nd February 2022, to identify natural ground control points (GCP) such as small lakes and road crossings within a subset of the image that covered our study area.We then determined shifts in X and Y directions for the PRISMA image.The rootmean-square-errors (RMSE) of the GCPs after applying the shifts were 10 m and 9 m in X and Y directions, respectively.

In situ and laboratory measurements 2.3.1. Test plots
Rectangular (40 × 40 m) field plots were established and measured in Hyytiälä (2019, n = 20, and2021, n = 11), Järvselja (2020, n = 13), Bílý Kříž (2019, n = 7), and Lanžhot (2019, n = 10) (Table 3).All measurements were conducted in June to September when the trees had full leaves.The measurements included forest inventory for determining the tree species proportions, hemispherical photography for determining canopy interception and plant area index, and measurements of forest floor reflectance spectra.The plot centers were geolocated to submeter accuracy, based on terrestrial LiDAR data that were geometrically matched with the airborne LiDAR point cloud, using treetops as tie points.Tree species proportions, expressed as fractions of total basal area in the plot, were calculated from the in situ forest inventory Fig. 1.Workflow of the study.The work comprised of a test phase in which the retrieval of forest floor reflectance was tested in a set of test plots (n = 61), and a mapping/validation phase in which a map of forest floor reflectance was created and validated against in situ measurements in a set of validation plots (n = 5).The white boxes with red dashed outlines depict the data used or created in the test phase, and those with blue solid outlines represent the data used or created in the mapping/validation phase.The white boxes with thin black outlines represent the data common to all parts of the study.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 1
Parameters of the airborne hyperspectral and LiDAR data acquisitions in the study sites.Hyperspectral and LiDAR data were acquired during the same flight, and thus the general parameters (time, sun zenith angle) apply to both datasets.measurements.
We report here the main steps of hemispherical photography and forest floor spectral measurements.For details, see Hovi et al. (2022), who reported the measurements performed before 2021 (the measurements for year 2021 followed a similar protocol).Hemispherical photographs were acquired in a 4 × 4 grid with 10 m spacing (16 measurement spots).We used a Nikon D5000 camera with a Sigma 4.5 mm f/2.8 DC HSM Circular Fisheye lens.The photographs were taken at height of 1 m in plots with mean tree diameter smaller than 10 cm, and at 1.5 m elsewhere.The photography was conducted in diffuse illumination conditions, either daytime with overcast sky, or early in the morning or late in the evening.We used the blue band for separating canopy and sky pixels with the thresholding algorithm by Nobis and Hunziker (2005).The photograph was then divided into five concentric zenith rings that had median (and range) of zenith angles as 11 and 67 • (60-73 • ).The zenith rings correspond closely to those in the commonly used LAI-2000/2200 instruments.We calculated effective plant area index (L eff ) and diffuse interception (i D ) for each study plot according to the LAI-2200 manual (LI-COR, 2012) as. and (1) where i(θ i ) is the mean canopy interception in the plot at zenith angle θ i , and Forest floor hemispherical-conical reflectance factor (HCRF) spectra were measured on a 11-m-long east-west oriented transect in the center of each plot, using an ASD FieldSpec4 spectrometer (ser.nr.18,456, field-of-view 25 • ).Total of 15 measurements were acquired on the transect at 0.8-m intervals.A white reference measurement (a calibrated Spectralon panel with 99% nominal reflectance) was acquired in the beginning and end of the transect, and at every 3rd measurement spot.To avoid shadows cast by the trees, the measurements were conducted in diffuse illumination conditions.Raw digital number data recorded by the spectrometer were converted to HCRF by normalizing the target spectra to the white reference measurements.The white reference measurements were linearly interpolated in time for accurate results.The transect measurements were averaged, and the plot-wise average and standard deviation of the spectra were used in further analyses.
Three plots in Hyytiälä did not include forest floor spectral measurements and were thus used only for modeling canopy structure from airborne LiDAR.

Validation plots
Five large rectangular plots (60 × 60 m) were measured in the Hyytiälä site between June 30th and August 5th in summer 2021.These plots were used for validating the forest floor reflectance maps produced from Sentinel-2 MSI and PRISMA data.The plots had different tree densities as well as different tree species and forest floor composition.Forest floor spectra were measured using the same instrumentation as for the test plots.Measurements were conducted in a 6 × 6 grid with m spacing (36 measurement spots) in diffuse illumination conditions.For each measurement spot, one forest floor reflectance spectrum was measured.The white reference panel was measured before and after the forest floor measurement, and the target spectrum was normalized to the average of the two white reference measurements.The center of each plot was geolocated by matching terrestrial and airborne LiDAR, similarly as for the test plots.Forest inventory and hemispherical photography were also conducted to document the tree canopy structure in the plots.No disturbances in the tree canopy occurred in the validation plots between 2019 and 2021, i.e., in the time between the airborne LiDAR acquisition and the in situ measurements.The plot-wise average and standard deviation of the spectra were used in further analyses.

Foliage and woody element spectra
Tree foliage and woody element spectra were used as input in the PARAS model.Foliage was sampled in all study sites concurrently with the field campaigns of 2019 and 2020 (Section 2.3.1).We sampled topof-canopy and bottom-of-canopy foliage, and for conifers also current and previous year needles.We measured directional-hemispherical reflectance and transmittance spectra of the foliage, using ASD Field-Spec3 or FieldSpec4 spectrometers coupled with ASD RTS-3ZC integrating spheres (measurements described in detail in Hovi et al., 2022).Reflectance and transmittance were summed to obtain albedo spectra that were used in our analyses.
For woody elements (stems, branches, twigs) we used stem bark hemispherical-directional reflectance factor spectra measured at 1.3 m height with a Specim IQ hyperspectral camera in Finland (Helsinki area) and Estonia (the Järvselja site) (measurements described in detail in Juola et al., 2022).Since the data covered only wavelengths up to nm, we used data from other published sources for wavelengths above 1000 nm.Pine, spruce, and birch stem bark spectra were obtained from Lang et al. (2002).Aspen stem bark spectrum from Spencer and Rock  16 (0-97)  30 (0-91)  1 (0-3)  -Alder  0 (0-2)  17 (0-82)  -1 (0-5)  Aspen/poplar  0 (0-1)  8 (0-50)  0 (0-3)  5 (0-47 (1999) was used for all other broadleaved species.Continuity of the spectra was ensured by multiplying the spectra from these additional sources with a scaling factor so that the reflectance at 1000 nm matched with the measurements of Juola et al. (2022) at the same wavelength.The same scaling procedure was used by Hovi et al. (2022).Due to lack of available branch and twig spectra, stem bark reflectance spectra were used as proxies of the albedo spectra of all woody elements in the PARAS model.Finally, representative foliage and woody element spectra for each tree species and for each study site were obtained as weighted averages of the measured samples.Top-of-canopy and bottom-of-canopy foliage were given equal weights.For conifers, current-year and older needle age classes were weighted according to their estimated shares of total foliage based on needle annual turnover rates.One-year old needles were assumed to represent all older needle age classes.Woody element spectra were calculated simply as species-specific averages.Section 2.4 in the Supplementary material of Hovi et al. (2022) describes in detail the calculation of representative foliage and woody element spectra.

Tree species maps (the MS-NFI product)
To map forest floor reflectance for the entire Hyytiälä site (4 × 4 km), information on tree species composition was required.As maps of tree species were not readily available from our own data sources, we used the Multi-Source National Forest Inventory (MS-NFI) raster maps for the year 2019 (©Natural Resources Institute Finland (NRIF) 2021).These open datasets were downloaded from the file service of the NRIF (Luke, 2022).The raster maps were used to calculate tree species fractions (as fractions from total stem volume) of the entire 4 × 4 km area.The original pixel size of the data was 16 m and we resampled them into the same 20 m and 30 m pixel sizes as the Sentinel-2 MSI and PRISMA data.

Retrieval of forest floor reflectance
Our goal was to retrieve forest floor reflectance spectra from airborne or satellite data.To do this, we predicted HDRF of the forest canopy with the PARAS model, and then removed its contribution from the signal measured by the remote sensor.The PARAS model assumes that the forest floor is Lambertian, and therefore its reflectance factor (measured in any view direction) is equal to reflectance.Thus, we refer to the retrieved quantity as 'forest floor reflectance' throughout the text.We used the version of PARAS model published in Hovi et al. (2022) that includes multiple scattering between forest floor and the canopy.The model simulates the forest HDRF in view direction Ω, R(↓ sky ,Ω), as the sum of contributions from the canopy and the forest floor as where R BS (↓ sky ,Ω) is the HDRF of the forest canopy in a black-soil case (i.e., when forest floor is optically black), T BS (↓ sky ,↓) is the bihemispherical transmittance of the forest canopy for radiation that enters the canopy from the sky, R G is the bi-hemispherical reflectance of the forest floor (i.e., the 'forest floor reflectance'), R S (↑,↓) is the downward bi-hemispherical reflectance of the canopy for radiation that enters the canopy from below, and T S (↑,Ω) is the hemispherical-directional transmittance factor of the canopy in direction Ω for radiation that enters the canopy from below.The second term on the right-hand side of Eq. 3 represents the forest floor contribution to the forest HDRF and comprises two components: T BS (↓ sky ,↓)/(1-R G R S (↑,↓)), which quantifies the downward radiation flux on the forest floor (averaged over shaded and sunlit areas), and R G T S (↑,Ω), which quantifies the fraction of the downward radiation flux that is scattered by the forest floor and then transmitted by the canopy towards the remote sensor.
If the HDRF of the forest is known, forest floor reflectance (R G ) can be solved from Eq. 3 as Eq. 4 shows that R G is obtained by subtracting the HDRF of the canopy, R BS (↓ sky ,Ω), from the HDRF of the forest, R(↓ sky ,Ω), and dividing the result with a term that describes the visibility of the forest floor towards the airborne sensor.We obtained R(↓ sky ,Ω) from the remote sensing measurements (airborne, Sentinel-2, or PRISMA).The other terms in Eq. 4 were modeled as where i 0 is the canopy interception of incoming radiation, i D is the canopy interception of diffuse radiation, i Ω is the canopy interception of radiation in the direction of view, and ω C (⋅, ⋅) are illumination and viewing geometry dependent canopy scattering coefficients.If the canopy is optically black (a situation that is fairly close to reality in red and blue wavelength regions), only the direct transmission through canopy gaps contributes to the visibility of the forest floor, and the term in the denominator of Eq. 4 becomes (1-i 0 )(1-i Ω ).
The interception terms were obtained from hemispherical photographs or estimated from airborne LiDAR data.The interception of incoming radiation was estimated as i 0 = D × i D +(1-D) × i S , where i S is the canopy interception in the direction of the Sun, and D is the spectrally dependent diffuse fraction from total incoming radiation.We calculated D for airborne hyperspectral, Sentinel-2, and PRISMA data, using the default discrete-ordinates method in the libradtran radiative transfer library version 2.0.4 (Mayer and Kylling, 2005;Emde et al., 2016).The canopy scattering coefficients were modeled based on photon recollision probability (p) as where ω E is canopy element albedo, Q is the wavelength-dependent canopy reflectance to total scattering ratio, and Q Ω is the ratio of canopy HDRF to canopy bi-hemispherical reflectance.In other words, the fraction (1-p)ω E /(1-pω E ) is the canopy albedo (quantifying the scattering to all directions), multiplying it by Q gives the canopy reflectance, and multiplying the result by Q Ω yields the canopy HDRF.Following Stenberg (2007), p was estimated as where L is plant area index, L eff is the effective plant area index, and β C is canopy clumping coefficient.We assumed β C to be 1 here, meaning that the canopy elements are randomly distributed.However, we accounted for shoot clumping by modifying the canopy element albedo using the shoot level p as described below.
Canopy element albedo was estimated as A. Hovi et al.where f sp are tree species fractions, f W,sp are species-specific woody element fractions from the effective (light-intercepting) plant area, and ω W,sp and ω S,sp are the woody element and shoot albedo, respectively, for each tree species.The values of f W,sp were taken from Hovi et al. (2022) (0.32 for Scots pine, 0.30 for Norway spruce and silver fir, and 0.12 for all broadleaved species).The shoot albedo was obtained from leaf (or needle) albedo (ω L ) and photon recollision probability of a shoot (p S,sp ) as.
ω S,sp = , where (14) STAR S,sp is the spherically averaged silhouette to total area ratio of a shoot.We set the shoot clumping index 4 × STAR S,sp to 0.6 for conifers and 1 for broadleaved, indicating no shoot-level clumping for broadleaved species.
Finally, we estimated Q and Q Ω as.
Eq. 16 is from Mõttus and Stenberg (2008), and the asymmetry parameter q was estimated from plant area index as q = 1 -exp(−0.1684× L) (Stenberg et al., 2013).Eq. 17 was empirically derived from the results of Hovi et al. (2022).The constant 0.71 is a correction factor based on the mean ratio of 'true' Q Ω to its theoretical estimate i Ω /i D .Hovi et al. (2022) obtained the 'true' Q Ω for each of their study plots separately, by fitting the PARAS model to the remotely sensed spectral HDRFs of the plot.

Estimation of canopy angular interception and photon recollision probability with airborne LiDAR
We tested two different methods for estimating the structural input parameters of the PARAS model (angular canopy interception, p) from the airborne LiDAR data.First, we tested the commonly used area-based approach (ABA), in which forest variables are predicted with statistical models from metrics describing the height distribution of LiDAR echoes in the given estimation area (e.g., a forest plot or a raster grid cell) (Naesset, 2002; for use of ABA in prediction of angular canopy closure, see Korhonen et al., 2011).Second, we tested synthetic hemispherical photographs (SHPs) (Varhola et al., 2012;Hancock et al., 2014;Webster et al., 2020) calculated from the airborne LiDAR data.In the SHP method, each LiDAR echo is assumed to be an opaque sphere, and the spheres are projected onto a synthetic binary hemispherical image, where a black pixel (i.e., the sphere) represents canopy and a white pixel represents the sky.Our response variables were the plot-wise average canopy interception values at each of the five zenith rings of the hemispherical photographs.From the interception values, we then calculated L eff and i D by integrating over the zenith rings, similarly as done for the real hemispherical photographs (Eqs.1-2).Finally, we calculated p from effective plant area index (L eff ) and canopy diffuse interception (i D ) (Eq. 12).Importantly, in preliminary tests we found that estimating L eff and i D directly from the LiDAR metrics using the ABA method resulted in physically non-meaningful combinations of L eff and i D , and thus erroneous p values in some of our test plots in sparse forests.
We used a grid cell size of 40 × 40 m in the ABA to match with the size of our test plots.As the predictor variable, we used the all-echo cover index, i.e., the ratio of LiDAR echoes obtained from the canopy to all LiDAR echoes in the grid cell.In the SHP method, we simulated 16 SHPs per each test plot, at the same locations where the real hemispherical photographs had been taken.To account for the variation of LiDAR pulse density (Table 1), we used a variable sphere radius that depended on local pulse density in the vicinity of the plot.Supplementary material explains the technical details of the implementation and parameters of the ABA and SHP as well as comparison of the estimated angular canopy interception and p against real hemispherical photographs.Results regarding the use of the estimated canopy interception and p in the retrieval of forest floor reflectance are reported in Sections 3.2 and 3.3.

Wall-to-wall maps of forest floor reflectance
We generated wall-to-wall maps of forest floor reflectance for the Hyytiälä study site, using the Sentinel-2 MSI and PRISMA images, airborne LiDAR data, and the tree species fractions from the Finnish MS-NFI product.First, canopy interception values at the five zenith rings were predicted for the Hyytiälä site (4 × 4 km area) in regular grids with 20 m and 30 m spacing, which corresponded to the pixel sizes of the Sentinel-2 MSI and PRISMA satellite images.We used the ABA method outlined in Section 2.5.3.The area from which the airborne LiDAR data was extracted for each grid cell was 40 × 40 m so that it matched exactly with the size of our test plots.The grid cells were thus partly overlapping.L eff , i D , and p were calculated from the canopy angular interception values using Eqs. 1, 2, and 12, respectively.The foliage and woody element spectra for each grid cell were obtained from the speciesspecific spectra by weighting with species fractions obtained from the Finnish MS-NFI product.Finally, forest floor reflectance values at the Sentinel-2 and PRISMA bands were calculated for each grid cell using the PARAS model (Eq.4).For comparison with the in situ measurements in the validation plots, we extracted for each plot the average spectrum of nine 20 × 20 m pixels (Sentinel-2 MSI), or four 30 × 30 m pixels (PRISMA).

Evaluation of the retrieval algorithm for forest floor reflectance
First, we present the results on retrieval of forest floor reflectance in the test plots, using the most detailed input, i.e., airborne hyperspectral data and hemispherical photographs.The agreement of retrieved and in situ forest floor reflectance was reasonably good in sparse forests (small L eff ) but the retrieval became less accurate when the density of the tree canopy increased (Fig. 2).Simulations with the PARAS model showed that, in dense canopies, the signal from the forest floor is very low compared to the total reflectance signal from the forest (Fig. 3).Plots with L eff less than 2 had wavelength-dependent RMSE of 0.020-0.117,which is an order of magnitude smaller than the RMSE of all plots (0.320-1.148) (Fig. 4).It was concluded that the retrieval is not reliable in dense forests, and plots with L eff larger than 2 (38 plots out of 58) were excluded from the further analyses.
A. Hovi et al.

Airborne LiDAR vs. hemispherical photos in the retrieval of forest floor reflectance
Next, we compare the accuracy of the retrievals when using either hemispherical photos or airborne LiDAR as the structural input.These results were obtained from the airborne hyperspectral data and were evaluated in our test plots.The results were fairly similar with both hemispherical photographs and airborne LiDAR (Fig. 5).When using the hemispherical photographs, the RMSE values in plots with L eff less than 2 were 0.037, 0.021, 0.109, 0.080 for the green, red, NIR, and SWIR spectral bands, respectively, and 0.099 for normalized difference vegetation index (NDVI).When using airborne LiDAR data (the ABA method) instead, the RMSE values went up to 0.043, 0.028, 0118, and 0.086 for the spectral bands, and 0.150 for NDVI.Thus, a slight increase of RMSE was seen in all bands, but there were no large differences in the information content of the retrieved forest floor spectra.This could be expected, because LiDAR estimated canopy angular interception and consequently also photon recollision probability relatively well (Supplementary material Fig. S1).We also note that the results with the SHP method were similar to those with the ABA method.When using the SHP method, the RMSE of the retrieved forest floor reflectance in plots with L eff less than 2 were 0.049, 0.033, 0.106, and 0.089 for green, red, NIR, and SWIR spectral bands, respectively, and 0.129 for NDVI.
We discuss in more detail the few cases where the accuracy of the retrievals decreased when using airborne LiDAR data.These cases help to understand how errors in modeling the interception or reflectance properties of the tree canopy influence the retrieval of forest floor reflectance.For two pine plots (marked as #1 and #2 in Fig. 5), airborne LiDAR overestimated the canopy interception (i.e., the transmission term T BS (↓ sky ,↓)T S (↑,Ω) in Eq. 4 became too small), and thus the floor reflectance became overestimated, especially in the green and red spectral bands.Similar overestimation of forest floor reflectance was observed in another pine plot (#3 in Fig. 5) when using either hemispherical photographs or airborne LiDAR.This highlights the overall sensitivity of the retrievals to the errors in estimating canopy interception.Two other outliers in Fig. 5 also deserve attention.In a spruce plot (#4) the PARAS model overestimated the reflectance factor of the tree canopy in the NIR band considerably (the term R BS (↓ sky ,Ω) in Eq. became too large), and consequently, NIR reflectance of the forest floor became underestimated.Similar behavior was observed for a birch plot (#5), but only in the SWIR band.Despite the few outliers, the overall performance of the retrieval algorithm in sparse forests was good with both hemispherical photographs and airborne LiDAR data.

Wall-to-wall maps of forest floor reflectance
We mapped forest floor reflectance spectra for the Hyytiälä site, using satellite images (Sentinel-2 MSI, PRISMA) with airborne LiDAR as the structural input.The ABA method was used for processing the LiDAR data here, because it was computationally much more efficient than the SHP method, and because the analysis in Section 3.2 indicated that the two methods produced similar results.
As an example, we show maps of forest floor reflectance characteristics obtained from a Sentinel-2 MSI image on June 30th in (Fig. 6).Pixels with L eff larger than 2 were masked out in the map (comprising 49% of the forest area).Forest stands that were thinned in 2018 or 2019, i.e., just before the acquisition of the remote sensing data, are highlighted in Fig. 6.In Finland, thinning is typically performed 1-2 times in a rotation period of a forest stand, and approximately 30% of canopy cover is removed at a time.Thinned stands are clearly distinguished from the other vegetation based on their low forest floor NDVI.Thinning residues (branches, foliage litter, etc.) have been left on the ground and the harvest machinery have disturbed the understory and changed its spectral properties.Another clearly visible feature in the map is young forests (especially abundant in the southwestern part of the area), which typically have grass and other green vegetation cover, and thus high NDVI.A pine bog in the middle of the area, and pine forests on medium-fertile soil with no recent thinning, show moderate NDVI values.
As a quantitative validation, we compared the mapped forest floor reflectance obtained from the Sentinel-2 MSI and PRISMA images against in situ measurements in the five validation plots.Wavelengthdependent RMSE varied between 0.008 and 0.101 for the Sentinel-2 MSI data (Table 4).For the PRISMA data, RMSE values were 0.017-0.053(Table 4).Visualization of the retrieved forest floor spectra are shown for the PRISMA data and for the Sentinel-2 data of year 2021, i.e., the same year as the in situ measurements of forest floor reflectance were conducted (Fig. 7).Based on mean RMSE values, year 2021 showed the best match between Sentinel-2 and in situ measurements (Table 4).
Finally, we demonstrate how the forest floor reflectance can be used for monitoring the recovery of forest floor vegetation from a thinning disturbance.We show spectra of two thinned stands for years 2019-2021: a pine stand (Fig. 8a-b) and a birch stand (Fig. 8c-d) on medium fertile soils.The stands had been thinned between summers 2018 and 2019.The tree canopy was intact between 2019 and 2021, and the growth of the trees is relatively slow (height growth typically 30 cm per year).Thus, using airborne LiDAR data from 2019 for the years 2020 and 2021 would result in small errors that are not critical for our   analysis.The presence of logging residues and the disturbed understory vegetation are seen as relatively low NIR, and high visible and SWIR reflectance in the 2019 data (Fig. 8a,c).Year 2020 shows intermediate values, and in summer 2021 the forest floor spectra resemble a green vegetation the most, i.e., year 2021 has the strongest visible and SWIR absorption among the studied years.The HDRFs of the forest, including contributions from both forest floor and the canopy, showed a similar pattern, but their overall level was lower (Fig. 8b,d).

Overview of the method and its applications
Comprehensive monitoring of forests requires information not only on the tree canopy but also on the forest floor.The spectral properties of ground vegetation can be used, for example, in the prediction of fractional cover of different vegetation types (Schaepman-Strub et al., 2009;Forsström et al., 2021), which are connected to site type, productivity, and species diversity.Furthermore, disentangling the remote sensing signal of forest floor from that of the canopy can help to understand ecological succession, be it after a disturbance, forest management operation (as demonstrated in our study), or other change in the state of forest cover.We are currently seeing a rapid increase in the availability of hyperspectral data sets provided by satellite missions such as PRISMA and EnMAP (Rast and Painter, 2019).By the end of 2020's, with the advent of forthcoming ESA's CHIME (Rast et al., 2021) and NASA's SBG missions (Cawse-Nicholson et al., 2021), medium spatial-resolution (~30 m) hyperspectral satellites are expected to provide frequent (sub-monthly) global coverage.These new data sets can enable standlevel monitoring of forest floor reflectance properties at high spectral resolution.
Most medium-to-high spatial resolution satellite data, including the new hyperspectral missions, observe the Earth close to nadir, and thus it is difficult to retrieve forest canopy structural information from the data directly.We examined the use of a physically-based forest reflectance model with airborne LiDAR as an additional data source in the retrieval of forest floor reflectance spectra.We showed that, in terms of RMSE, LiDAR-based retrievals were almost as accurate as those based on hemispherical photos.From a practical perspective (i.e., interpretation of the retrieved forest floor reflectance spectra) the two data sources provided equal information content.Further, the synthetic hemispherical photographs, despite their suspected superiority and much higher computational demands, did not outperform the simple area-based approach in our plot (or satellite pixel) level analyses.Our results can serve as a basis for developing operational algorithms for retrieval of forest floor reflectance at medium spatial resolution, i.e., at the level of individual forest stands.In addition to airborne LiDAR, the method could potentially also be used together with spaceborne LiDAR data from GEDI and ICESat-2 missions, as they deliver similar information on canopy height distribution as the area-based approach used here.

Evaluation of the results
The accuracy of the retrieval method was the highest for stands with L eff less than 1.5-2.With increasing L eff the signal from the forest floor decreases (Fig. 3).Consequently, the retrieval of forest floor reflectance becomes more sensitive to any errors (random or systematic) in the modeled canopy reflectance properties or interception.For example, Varvia et al. (2018) showed that, in a Bayesian inversion of a forest reflectance model, the uncertainty intervals of forest floor reflectance were higher in dense than in sparse canopies.The problem of tree canopy occluding the forest floor has been also commonly reported in other studies dealing with retrieval of forest floor reflectance properties from remote sensing data (Pisek and Chen, 2009;Pisek et al., 2016Pisek et al., , 2021)).Thus, this issue is not unique to our study or modeling approach.We see the greatest potential for mapping forest floor reflectance, where tree canopies are typically sparse, e.g., in the boreal region.It can be also argued that in dense forests the mapping of forest floor properties is less important because the contribution of forest floor vegetation to the overall forest productivity is expected to diminish with increasing tree canopy plant area index or canopy closure.
Sentinel-2 and PRISMA data had comparable retrieval accuracies in our validation plots (Table 4).However, PRISMA underestimated forest floor reflectance for one spruce plot in the visible wavelength region (Fig. 7e).This is likely due to low HDRF values in the PRISMA data, and the overall sensitivity of the retrieval algorithm to inaccuracies of input data in dense forests.On the other hand, the Sentinel-2 image acquired on the same day with the PRISMA data had large HDRFs, which resulted in overestimation of forest floor reflectance and thus large RMSE values (Table 4).The differences could be due to atmospheric correction: the mean aerosol optical thickness value in the PRISMA data (0.157) was notably higher than that derived by the Sen2Cor software for the Sentinel-2 image acquired on the same day (0.074).Overestimation of forest HDRF and thus forest floor reflectance in the visible wavelength region was observed also for Sentinel-2 images acquired on July 25th, 2019, and July 14th, 2021 (Table 4).Temporal changes of forest floor

Table 4
Root-mean-square-error (RMSE) of the retrieved forest floor reflectance against in situ measurements in the five validation plots.The wavelengths correspond to the central wavelengths of the spectral bands of the Sentinel-2 MSI (and PRISMA) instruments.For PRISMA, we present results for the bands that were closest to the Sentinel-2 MSI bands.could have also affected the evaluation of PRISMA data as well as the Sentinel-2 data acquired before 2021, because the in situ measurements of forest floor reflectance were made in 2021.Indeed, the RMSEs of the retrievals were the smallest for year 2021 (Table 4).Potential effects of the senescence of forest floor vegetation were seen in the Sentinel-2 images acquired in August in all studied years, with low HDRFs and large RMSEs in the NIR region (Table 4).
Quantitative comparison of our results to other studies is difficult, because of different data sources and retrieval algorithms, and the lack of accuracy measures (e.g., RMSE) reported in previous studies.However, we attempt a quantitative comparison here, to help evaluate the performance of our method, and also to highlight differences and synergies between methods.A method that utilizes multiangular remote sensing data (Canisius and Chen, 2007) has been evaluated against field measurements in several publications (Pisek et al., 2010(Pisek et al., , 2015a(Pisek et al., , 2015b(Pisek et al., , 2016(Pisek et al., , 2021)).Pisek et al. (2010) used airborne hyperspectral data over boreal forests in Canada and had 30 × 30 m plots for validation, i.e., the spatial resolution was well comparable to ours.From the reported in situ measurements and airborne retrievals of forest floor reflectance (Table II in Pisek et al., 2010) we calculated RMSE of the retrievals.The RMSE was 0.010-0.014for the red band, and 0.042-0.061for the NIR band, depending on the angular measurement configuration of the airborne data.These values are close to our validation results for year 2021, i.e., the year that was coincident with the in situ measurements (Table 4).Overall, synthesis of the above referred validation studies indicates fairly good performance of the method based on multiangular satellite or airborne data.Further, a recent Europe-wide validation effort concluded that the retrieval of forest floor signal from multiangular data was not reliable if the foliage cover ("percentage of ground covered by the vertical projection of foliage and branches") was larger than 85% (Pisek et al., 2021).Based on our hemispherical photos, foliage cover of 85% corresponds to L eff of approximately 4. With our algorithm, the retrieval of forest floor reflectance deteriorated already at L eff of 1.5-2.This may indicate that multiangular data are capable of predicting forest floor reflectance in denser stands than nadir measurements alone.However, the availability of multiangular data at medium to high spatial resolution (i.e., at the scale of typical forest stands) and thus its value in practical forestry is limited, which highlights the complementary nature of our approach.
Markiet and Mõttus (2020) used near-nadir airborne hyperspectral data, and their retrieval method resembled that of ours.They modeled the forest reflectance factor as a linear combination of the reflectance factors of sunlit canopy, shaded canopy, sunlit forest floor, and shaded forest floor.The unknown parameters in the estimation were obtained with a physically-based modeling approach based on p-theory.A tendency of the algorithm to overestimate forest floor reflectance was reported, but general reflectance differences between site fertility types (e. g., increasing NIR reflectance along with increasing site fertility) could be correctly predicted.No quantitative accuracy measures were reported.The retrievals of canopy gap fractions and p in Markiet and Mõttus (2020) were based on airborne hyperspectral data.We demonstrated the use of airborne LiDAR data in this context and used a new forest reflectance model that was recently validated empirically (Hovi et al., 2022).The above comparisons with earlier studies highlight the importance of performing quantitative comparisons of different retrieval algorithms within the same remote sensing and in situ data sets in the future, to better understand the relative performance of different methods and data sources.

Limitations and practical considerations
The forest reflectance model that we used allows for a simple analytic calculation of the forest floor reflectance if the tree canopy structural parameters and tree species are available from airborne LiDAR or other sources.However, the model relies on the assumption of a Lambertian forest floor.Based on field measurements of bi-directional reflectance distribution functions (BRDF) of forest floor vegetation, modeling the forest floor as non-Lambertian would be more realistic (Forsström et al., 2019;Peltoniemi et al., 2005), but it would result in additional parameters and increased computational complexity, which would prohibit the analytic inversion performed here and increase the computation time, or prevent parameter identifiability in numerical inversion altogether.As an example of the computational complexity, the model of Manninen and Stenberg (2021), which was based on ptheory, required computationally heavy numerical integrals for modeling the multiple scattering between the canopy and the forest floor.
Directional scattering properties of the tree canopy are considered in the model that we used, through the parameter Q Ω .but the calculation of Q Ω is based on the assumption that photons are evenly distributed on all elements of the canopy (Hovi et al., 2022).Thus, another potential way of improving the accuracy of the retrievals would be to consider a more detailed model for the reflectance factor of the tree canopy (R BS (↓ sky ,Ω) in Eq. 3), e.g., a geometric-optical or a ray tracing model, or a model based on photon recollision probability that models the first and higher orders of scattering separately (see Manninen and Stenberg, 2021 as an example of the latter).Again, the number of parameters would increase, and not all parameters may be obtainable from the LiDAR data.Further research on the potential of the above-mentioned improvements would help in finding an optimal solution in terms of accuracy, model invertibility, and computational time.
Finally, we discuss practical aspects to consider if applying the method at large scale.Airborne LiDAR data are currently openly available in many countries (Kakoulaki et al., 2021).The openly available data sets often have a lower pulse density (e.g., fewer than five pulses per m 2 ) than what we used (7-20 pulses per m 2 ).The area-based approach that we employed has been successfully applied in the estimation of canopy structural variables from low pulse density data (under one pulse per m 2 ) (Korhonen et al., 2011), which indicates that the open airborne LiDAR data sets could be used also in our algorithm.However, the interpretation of LiDAR requires in situ forest plots as empirical training data.While the empirical models for interpreting LiDAR data may be to some extent transferable across different sensors and study areas, it is likely that optimal results are obtained by using at least some local in situ calibration data (Kotivuori et al., 2018).In addition, the PARAS model requires tree species fractions for parameterizing the foliage and woody element optical properties and shoot clumping coefficients.For this purpose, we obtained tree species fractions from existing maps.In the absence of forest inventory databases, estimation of tree species from the remote sensing data sets is an option but might require additional in situ training data.

Conclusions
We developed and tested a practical method for mapping forest floor reflectance from near-nadir medium spatial resolution satellite data (e.g., Sentinel-2, Landsat, and new hyperspectral missions such as EnMAP and PRISMA) with airborne LiDAR as an additional data source.This paves the way for operational wall-to-wall mapping of forest floor reflectance properties at stand level, i.e., at the spatial scale that is relevant to practical forestry.We recommend further research in quantitative evaluation of the method against other approaches, comparison of different forest reflectance models, as well as more comprehensive testing in different study areas and forests.Nevertheless, the method presented in this study is promising as it can take full benefit of not only openly available multispectral or future hyperspectral satellite data sets, but also structural information from LiDAR.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 2 .
Fig. 2. Examples of forest floor reflectance spectra retrieved from airborne hyperspectral data, using hemispherical photographs as an additional data source.The results are compared to in situ measurements in four test plots: a) a pine plot in Hyytiälä, b) a pine-birch plot in Järvselja, c) a birch plot in Hyytiälä, and d) a spruce plot in Bílý Kříž.Effective plant area index (L eff ) and tree height (H) are given in each sub-figure.Airborne hemispherical-directional reflectance factors (HDRFs) of the entire forest (including contributions from both tree canopy and the forest floor) are also shown for comparison.The gray area around the black line represents standard deviation of the in situ measured forest floor spectra.

Fig. 3 .Fig. 4 .
Fig. 3. Forest hemispherical-directional reflectance factors (HDRFs) in green, red, near-infrared (NIR), and shortwave-infrared (SWIR) bands simulated with the PARAS model, either using the in situ measured forest floor reflectance spectra as input, or assuming a black soil.The black vertical lines represent the contribution of the forest floor to forest HDRF.The forest HDRFs from airborne hyperspectral measurements are shown for comparison.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 5 .
Fig.5.The relationship between forest floor reflectance retrieved from airborne hyperspectral data and forest floor reflectance measured in situ in the test plots in four spectral bands: green (553 nm, a-b), red (667 nm, c-d), near-infrared (867 nm, e-f), and shortwave-infrared (1618 nm, g-h).The bottom row (i-j) shows retrieved and measured NDVI.The left column shows results obtained using hemispherical photos as input data, and the right column shows the results with airborne LiDAR (the ABA method).Data are shown separately for plots with effective plant area index (L eff ) less than 2, and for those with L eff less than 1.5.The numbered circles represent outliers that are discussed in Section 3.2.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 6 .
Fig. 6.Top row: Map of forest floor NDVI for the Hyytiälä site calculated from a Sentinel-2 MSI image acquired on June 30th 2019.Forests with effective plant area index greater than 2 were masked out.Middle row: Effective plant area index (L eff ) of forests mapped from airborne LiDAR.Bottom row: Red-green-blue visualization of the original Sentinel-2 MSI image.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 7 .
Fig. 7. Comparison of in situ measured forest floor reflectance spectra against retrievals from Sentinel-2 MSI and PRISMA images in the validation plots.Hemispherical-directional reflectance factor (HDRF) spectra of the entire forest are shown for comparison.The Sentinel-2 MSI spectra represent the range of observations from June 26th to August 13th in 2021.The PRISMA spectra are from the image acquired on June 26th, 2020.The gray area around the black line represents standard deviation of the in situ measurements.Photographs of the plots are shown on the right.Dominant tree species, effective plant area index (L eff ), and tree height (H) are listed in each sub-figure.

Fig. 8 .
Fig. 8. Time series of retrieved forest floor reflectance spectra and the corresponding forest hemispherical-directional reflectance factors (HDRF) from Sentinel-2 MSI for a coniferous (a-b) and a broadleaved forest (c-d).The coniferous forest is a pine-dominated stand of approximately 20-m-tall trees, and the broadleaved forest is a birch stand with tree height of 18 m (the same stand in which the large forest plot #3 (Fig. 7c) was located in).The spectra shown in the figure represent mean spectra for each year (2019-2021), based on Sentinel-2 MSI images acquired in summertime (mid-June to late August).

Table 3
Mean (and range)of forest characteristics of the test plots in the study sites.