Physically based illumination correction for sub-centimeter spatial resolution hyperspectral data

.


Introduction
The improved affordability and miniaturization of imaging spectrometers has made vegetation monitoring with Unoccupied Aerial Vehicles (UAV) widely accessible over the last decade, offering new prospects for both environmental and economical purposes.The high spatial resolution of this data has enabled, for instance, the detection of pests (Honkavaara et al., 2020;Hellwig et al., 2021;Huo et al., 2023), invasive species (Papp et al., 2021;Gholizadeh et al., 2022), and diseases (Morel et al., 2018;Hornero et al., 2021) in vegetation on an unprecedented level of detail.Most vegetation monitoring methods rely on measuring the spectral hemispherical-directional reflectance factor (reflectance from hereafter) of vegetation elements, such as tree crowns or leaves.The reflectance is defined as the radiance scattered by the observed elements to the radiance scattered by a Lambertian reference surface under the same illumination and observation geometry (Schaepman Strub et al., 2006).However, obtaining the same conditions for the vegetation elements and the reference surface is practically impossible given the complexity of the vegetation structure.Hence, instead of the actual leaf reflectance, an apparent reflectance called the top-of-canopy (TOC) reflectance is commonly used, wherein the radiance scattered by the observed elements is divided by the radiance scattered by an Lambertian reference surface at the TOC.While this approach is appropriate to data from spaceborne instruments, where the spatial resolution is coarse enough that each pixel can be treated as a volume of infinitesimal scattering elements, it can cause severe problems when applied to UAV-based images where the spatial resolution is very high.For instance, when the spatial resolution allows observing an individual leaf within a canopy, the radiance scattered from the neighboring vegetation elements toward the leaf is not accounted for with the TOC reflectance, nor is the orientation of the leaf.This can be a source of great uncertainty for UAV-based methods relying accurate information of the leaf reflectances for monitoring leaf biophysical and biochemical properties (Takala and Mõttus, 2016).
For such high spatial resolution data, the differences in the amount of irradiance arriving on the TOC and the leaf surfaces, i.e., the differences between the illumination conditions, are often corrected via spectral normalization methods (Mishra et al., 2020).These illumination correction methods are based on transforming the data by, e.g., centering the spectrum of each pixel around its mean and normalizing with respect to its standard deviation (Asaari et al., 2018).While these methods reduce the level of variation between the pixels, they assume the illumination conditions are not dependent on the wavelength (Roger et al., 2020) which is not the case in a vegetation canopy.Moreover, the transformed spectra cannot be analyzed using physically based methods, requiring statistical analysis instead.
Efforts for illumination correction using radiative transfer theory have been made with the COSINE model (Jay et al., 2016) which, combined with the PROSPECT leaf radiative transfer model (Jacquemoud and Baret, 1990), can retrieve leaf pigments with great accuracy in a laboratory setup.However, COSINE assumes the leaf is illuminated only by direct solar irradiance and hence does not account for the complex illumination conditions within a vegetation canopy.To this end, the theory of spectral invariants, also called the p-theory, has been posed as a possible solution to account for these illumination conditions (Hernandez-Clemente et al., 2016).The theory links the TOC reflectance to the leaf optical properties via a set of spectrally invariant parameters (Huang et al., 2007;Knyazikhin et al., 2011;Knyazikhin et al., 0027) and is scalable between different hierarchical levels of canopy structures, which enables connecting, e.g., the scattering by a needle to the scattering by a shoot (Smolander and Stenberg, 2003;Lewis and Disney, 2007;Huang et al., 2007).This has enabled a wide range of computationally efficient applications, including estimating canopy structure (Lin et al., 2023) and forest floor spectra (Markiet and Mõttus, 2020), modeling canopy optical properties (Hovi et al., 2022) and leaf optical properties (Lewis and Disney, 2007;Wu et al., 2021), as well as describing solar induced chlorophyll fluorescence (Zeng et al., 2020).
Recent progress with the p-theory has corroborated a direct link between the spectral invariants and the direct solar irradiance and the ambient irradiances incident on the surfaces of sunlit leaves within a canopy (Ihalainen and Mõttus, 2022).Thus, it is theoretically possible to correct for these illumination conditions by estimating the spectral invariant parameters and hence retrieve the actual leaf reflectance.However, this is complicated by the fact that the retrieval of these parameters is an ill-posed inverse problem, as demonstrated by Lewis and Disney, 2007.In practice, this means that the TOC reflectance spectra can be modeled using several different leaf reflectance spectra and spectral invariant parameters with equal accuracy.Obtaining a stable solution to an ill-posed inverse problem generally requires a regularization strategy, whereby unrealistic solutions to the problem are penalized using prior information of the problem (Combal et al., 2003;Mueller and Siltanen, 2012).
The objective of this work was to develop and assess a method based on the p-theory for correcting the illumination conditions of sunlit leaves and retrieving their actual reflectance spectra from subcentimeter spatial resolution hyperspectral images in the visible and near-infrared (VNIR) wavelengths.To achieve this, we evaluated the performance of the p-theory as a forward model for describing the TOC reflectance and the illumination conditions.Specifically, we investigated the spectral variability of p and ρ, and studied the accuracy the forward model when these parameters were set as constants.Based on the evaluation, we developed a regularization strategy for solving the inverse problem of illumination correction and reflectance retrieval.The study is based on simulated and measured data.

Theory
Let us consider a remote spectral measurement of a sunlit vegetation canopy when a pixel is considerably smaller than a leaf.The signal contributed by a single green leaf to the TOC reflectance measured by a pixel of an imaging spectrometer above the canopy is given by where ϕ ⊙ is the incident direct solar irradiance on the leaf, ϕ A is the ambient diffuse irradiance from every direction around the leaf (4π), Φ TOC is the irradiance on the TOC, S leaf is the leaf scattering factor, Ω i is the direction of the Sun in the TOC coordinate system, and Ω ′ i and Ω ′ s are the directions of solar irradiance and the radiance scattered by the leaf in the coordinate system of the leaf.Eq. (1) only includes the contribution of by vegetation elements and ignores the direct contribution of the ground area visible to the sensor.The leaf scattering factor is a weighted sum of leaf reflectance and transmittance factors where the weights w R and w T are the ratios of incident irradiances on the upper and lower hemispheres of the leaf to the total incident irradiance, respectively, such that w R +w T = 1 (Ihalainen and Mõttus, 2022).
Next, let us define two unitless variables: where I s is the measured radiance that is scattered by the leaf.To simplify the notation, we will stop denoting the directional terms Ω i ,Ω ′ i , and Ω ′ s in further analysis.In theory, both ρ and p can have values anywhere between 0 and ∞.The value of ρ depends on the leaf orientation, the solar zenith angle with respect to the TOC, and the ratio of direct solar irradiance to the total irradiance arriving on the TOC from the above (Ihalainen and Mõttus, 2022).Given the relatively small variation of the direct-to-total irradiance ratio in the VNIR range (Mõttus et al., 2015), we approximate ρ as a wavelength-independent parameter (i.e., spectrally invariant), which depends on the leaf orientation and illumination geometry.On the other hand, p depends on the leaf optical properties and the amount of multiple scattering within the canopy.Thus, it can have large spectral variation across the VNIR range.Nevertheless, we assume it also to be spectrally invariant as a rough approximation, such that Eq. (1) becomes (5) However, approximating p as a constant is not valid for a shaded leaf receiving no direct solar irradiance.For such a leaf, ρ = 0, meaning that Eq. ( 5) reduces to p = 1/S leaf which is constant only if S leaf is a constant.Let us consider now an additional non-green component, C, in the measured TOC reflectance.In general, it can be due to non-green material inside, below or next to the canopy.For the special case of a canopy consisting predominantly of green material and a small pixel completely filled by a leaf surface, the non-green component can be contributed by specular reflectance from the leaf wax surface, assumed to be independent of wavelength in the VNIR region (Bousquet et al., 2005;Knyazikhin et al., 0027).For the most general case we can write Given the approximate spectral invariance of the specular component in the visible and near-infrared regions (Bousquet et al., 2005;Knyazikhin et al., 0027), we assume C to be constant in a canopy consisting of green leaves.Substituting R TOC,G from Eq. ( 6) into (1) yields where ρ ′ = ρ − pC.Eq. ( 7) can be viewed as a linear model with independent variables R TOC and S leaf , and with parameters p, ρ ′ , and C O. Ihalainen et al. meaning that these parameters can be computed using linear least squares, provided that S leaf is known.
Next, let us use a model for the leaf scattering factor with the input parameter vector β → (including, e.g., leaf pigments, water content, and dry matter content), so that we can write S leaf (λ) = S leaf ( β → ; λ).We obtain an equation for computing the TOC reflectance as where T is a vector of the model parameters.Finding the full set of parameters ϑ → from the measured R TOC using Eq. ( 8) is an inverse problem, achievable by minimizing an objective function between measured and modeled TOC reflectance, Given the spectrally invariant behavior of p, ρ ′ and C, the inverse problem can be solved in a small subset of the total measured wavelengths, λ ∈ λ.However, since the inverse problem is ill-posed (Combal et al., 2003;Lewis and Disney, 2007) a regularization strategy is needed to ensure a unique solution to the problem.Hence, we regularize the inverse problem as where argmin stands for the argument of the minimum, r is a regularization term penalizing V for unrealistic values and α ∈ R + is a regularization parameter determining the magnitude of the penalty.The choice of r and α are described in Section 3.5.After inverting p,ρ ′ , and C in the region λ, the leaf scattering factor is obtained for all observed wavelengths from Eq. ( 8) as

Hyperspectral top-of-canopy measurements
We measured the TOC reflectance of a Diervilla lonicera Mill.shrubbery located in the Otaniemi campus, Finland (60 • 11′ N, 24 • 49′ E).The target was chosen based on a visual estimation of satisfying model requirements: we required it to be dense, homogeneous, completely closed, consisting predominantly of leaves (i.e., no flowers or fruits) of sufficient size, being in full sunlight and easily accessible.
The hyperspectral images were acquired with a portable Specim IQ imaging spectrometer (serial number 190-1100152, Specim, Oulu, Finland).The imaging spectrometer, weighing 1.3 kg, was mounted at the end of a stable vertical steel extension attached to a heavy-duty tripod.The position of the sensor was roughly 60 cm away from the top of the tripod and at a height of 195 cm from the ground surface.The Specim IQ is equipped with a motorized pushbroom line scanner and a chargeable battery for power supply.The sturdy setup over the canopy ensured that distortions during image acquisition were minimized.The imaging spectrometer was operated with a notebook computer.The spectral range of the Specim IQ is from 400 to 1000 nm, and the spectral resolution (full-width-at-half-maximum) is 7 nm.The total number of bands captured across the wavelength range is 204.The field of view of the imaging spectrometer is 31 • × 31 • and the image size is 512 × 512 pixels.The measurement height from the TOC was roughly 1 m, providing a spatial resolution of approx.1.1 mm.A calibrated Spectralon white reference panel (serial number 9AA10-1120-6585) with a nominal reflectance of 99% was placed within the field of view of the image.TOC reflectances were calculated for the raw image pixels by removing the dark current signal and normalizing to the average of the white reference pixels.Integration times were selected manually for the measurements to maximize the dynamic range and to avoid saturating any pixels.The view zenith angle was 0 • (i.e., from nadir), while the solar zenith angle was 47 • .Multiple consecutive images were acquired during nearly cloudless sky conditions, from which one hyperspectral image was selected for analysis.Behmann et al., 2018 reported that instability in reflectance can be observed for the Specim IQ from 400 to 415 nm for some materials, and from 925 to 1000 nm when measuring outdoors in direct sunlight.Hence, the final hyperspectral image used for analysis was clipped to contain only wavelengths from 450 to 900 nm (see Fig. 1).

Spectral reference measurements of leaves
The spectral reflectance of two sunlit leaves of the Diervilla lonicera Mill.shrubbery visible in the hyperspectral images were measured outdoors directly after the hyperspectral data were collected.We refer to these leaves as leaf A and leaf B hereafter.AVASPEC-ULS 2048x64TEC-EVO spectrometer (Avantes B.V., the Netherlands) with a spectral range from 390 to 935 nm and a spectral resolution of 0.8 nm was used for the measurements.A bare fiber optic cable with 30 • field of view used for probing the samples at 5 cm distance and at 45 • view zenith angle.To reduce background irradiances a black sheet of cardboard with 2.5% reflectance was placed underneath the samples.An uncalibrated Spectralon plaque with 99% reflectance (serial number 99AA03-0416-5279) was used as a reference.The probed physical area of the leaves corresponded approximately to a 10 × 10 pixel area in the hyperspectral image.

Simulated data
We used a Monte Carlo ray tracing (MCRT) software, written by the authors (see Ihalainen and Mõttus, 2022 for details), to simulate hyperspectral images of vegetation with a spatial resolution of 0.01 m between the wavelengths from 450 to 900 nm at 5 nm intervals.The MCRT software produces a TOC reflectance image, R TOC with the corresponding single and multiple scattering reflectance images, R single and R multi , as well a mask for sunlit leaves.The number of photons used in the simulations was 10 8 .
We simulated a 1 m ×1 m area with 138 randomly positioned and oriented flat, disc-shaped leaves with identical optical properties where O. Ihalainen et al. the radius of each leaf was 0.18 m, and some of the leaves were overlapping.We used a bi-Lambertian scattering model for the leaves and the leaf reflectance and transmittance spectra were generated using the ANGERS leaf optical properties database (Feret et al., 2008).The database contains the directional-hemispherical reflectance and transmittance (DHR, DHT) factors of 276 leaves along with their measured biochemical properties, i.e., the concentrations of carotenoids (C ar ) and chlorophyll a and b (C ab ), leaf mass per area (LMA), and equivalent water thickness (EWT), as well as the inverted value of the leaf structure parameter (N) used in the PROSPECT models.
We created eight sets of simulated data, where the leaf optical properties varied from between the scenes while the scene geometry remained the same.We chose data from four leaves in the ANGERS database based on how well the PROSPECT-D model (Féret et al., 2017) could fit the measured ANGERS spectra when using the corresponding biochemical properties in the PROSPECT input.We then selected the leaves corresponding to the 10th, 30th, 60th, and 90th percentile residual sum of squares (RSS) values.The ANGERS database indices of these leaves were 202 (Acer pseudoplatanus L.), 74 (Acer pseudoplatanus L.), 245 (Viburnum rhytidophyllum Hemsl.), and 126 (Acer pseudoplatanus L.), respectively.The RSS values were calculated between 670 and 790 nm and are given in Table 1, along with the leaf trait values.Both the measured and modeled spectra of these samples were used in the simulations.In the following text, we denote the scenes created using the measured and modeled spectra respectively by with A (for ANGERS) and P (for PROSPECT), respectively, followed by the ANGERS database index of the leaf used in the simulation input, e.g., A202 and P202.
For each scene, we used the following observation geometry: the solar zenith angle was 30 • and the view zenith angle was 0 • .We used the same direct-to-global irradiance ratio, d for all of the scenes, generated using the 6S atmospheric radiative transfer code (Vermote et al., 1997) with input parameters specified in Table 1 by Mõttus et al., 2015.The background of each scene was black, i.e., all photons interacting with it were absorbed.Overall, 56% of the pixels in the scenes were sunlit.To test whether the inversion algorithm can account for a constant component in the measured signal, we added a 0.1 term to the TOC reflectance images when inverting the scattering factor, simulating slightly inaccurate calibration or atmospheric correction.

Forward model analysis
Using the a priori information of R leaf , T leaf , R multi , and R single given by the MCRT software, we computed of p(λ) and ρ(λ) as (Ihalainen and Mõttus, 2022) and modeled the TOC reflectance (Eq.( 8)) for the sunlit pixels of the simulated data with three formulations of the scattering factor: S leaf = R leaf , S leaf = T leaf , and 2 ω.These formulations are related to the distribution of leaf normal orientations within the canopy, i.e., the probability of viewing a specific side of the leaf surface (Myneni and Ross, 1991).The first two formulations assume that the observed TOC reflectance is due to leaf reflectance and transmittance, respectively, whereas the third formulation assumes an equal contribution from both factors.For a bi-Lambertian leaf, this third formulation corresponds to leaf albedo, denoted by ω.
The spectral variability of p(λ) and ρ(λ) was evaluated by computing their standard deviations with respect to λ and averaging over the sunlit pixels.We denote these averaged standard deviations by σ p and σ ρ .In the following text, we write p(λ) and ρ(λ) when referring to the values computed from Eqs. ( 11) and ( 12), and we use p and ρ when they are set as constants.
The forward model performance was studied by first computing the constant parameters p,ρ ′ , and C from Eq. ( 7) with ordinary least squares for the subset wavelength range, 670 nm < λ < 720 nm.This range was selected because the output of PROSPECT at these wavelengths is mostly determined by the leaf chlorophyll content, C ab (Knyazikhin et al.,0027).Then, these parameters were inserted into Eq.( 8) to compute R mod TOC in the full spectral domain of the data, i.e., 450 nm < λ < 900 nm.Finally, we calculated the residual sum of squares between R TOC and R mod TOC averaged over sunlit pixels, denoted by RSS.

Inversion algorithm
Based on the results of the forward model analysis (Section 4.1) we used S leaf = R PROSPECT leaf in the forward model (Eq.( 8)) when solving the inverse problem.PROSPECT-D (Féret et al., 2017) was used for modeling the leaf reflectance as the most suitable model in the PROS-PECT family (S.Jacquemoud, pers. comm.).The inverse problem was solved in the spectral range 670 nm⩽ λ⩽720 nm so that C ab could be used as the only unknown leaf trait parameter for S leaf in the inversion.For the others leaf traits, the averaged values of the ANGERS database were used: N = 1.52,LMA = 0.0052 g/cm 2 , EWT = 0.0116 g/cm 2 , and C ar = 8.66 μg/cm 2 .
The residual sum of squares was used as the objective function in the inversion: where n is the number of the observed wavelengths.The objective function was studied with the simulated data by optimizing Eq. ( 13) for β = C ab given a set of p and ρ with 0⩽p, ρ⩽1.5, and calculating the corresponding value of the objective function.The optimization was performed using Powell's conjugate direction method, implemented in the SciPy library of the Python programming language.Additionally, we computed the values of p and ρ from Eq. ( 7) for the actual leaf reflectance using linear least squares.
The inverse problem (Eq.( 9)) was also solved with Powell's conjugate direction method.The initial guess value for C ab was 30 μg/cm 2 .In the inversion algorithm, the spectral invariant parameters p, ρ ′ , and C were computed from Eq. ( 7) using linear least squares with S leaf (C i ab ) for each iteration step, i, of Powell's method.For the pixels where the inversion algorithm failed to converge, we set θ → = 0 → .To alleviate the ill-posedness of the inverse problem, we regularized it by setting the regularization term in Eq. ( 9) to where a > 0, thus penalizing the objective function for large values of C.

Table 1
The database index, residual sum of squares between PROSPECT-D and measurement, structure parameter, chlorophyll a + b content, equivalent water content, mass per area, and carotenoid content of the ANGERS leaves used in generating the simulated data.We set the regularization parameter to α = 1.0.The code for to solving the inverse problem is available on GitHub athttps://github.com/mottus/spectralinvariant/.

Forward model results
For each of the three formulations of the leaf scattering factor, S leaf , the values of the averaged standard deviations of p(λ) were similar (< 0.27) for nearly every scene (Table 2).The values of the averaged standard deviations of ρ(λ) were consistently the lowest across every scene when S leaf = R leaf and the highest when S leaf = T leaf .When S leaf = R leaf , the RSS was below 0.38 for each scene, whereas these values varied more widely for the other two formulations: at highest they were 112.41 for S leaf = T leaf , and 4.41 for S leaf = 1 2 ω leaf .
Inspecting the spectra of p(λ) for individual pixels showed a minimum close to 670 nm and a plateau after 750 nm for each scattering factor formulation (Fig. 2a).There was a corresponding peak at 670 nm in the spectra of ρ(λ) when S leaf = T leaf and S leaf = 1 2 ω leaf and a similar plateau after 750 nm (Fig. 2b).However, there was no peak for ρ(λ)

Inversion results
The objective function, as a function of p, ρ and the optimized C ab , had a trough-shaped form (Fig. 3).Along the bottom of the trough, visualizing an expected relationship between p and ρ, the objective function had a unique minimum.The position of the minimum corresponded to the p and ρ values obtained using the actual leaf reflectance.
The leaf scattering factor was inverted using the regularized inversion algorithm with S leaf = R PROSPECT leaf .When applied to the simulated data, the mean rRMSD of the sunlit pixels was below 6.9% for the scenes where the leaf optical properties were generated using PROSPECT (Fig. 4).For the scenes where the optical properties of real leaves from the ANGERS database were used, the mean rRMSD varied between 8.7% and 33.9%.On the average, the mean rRMSD of the scenes was 12.0%.The inversion results for the Diervilla lonicera Mill.shrubbery hyperspectral image were in good agreement with the in situ reference spectra of leaf A and leaf B, although the inverted spectra was underestimated at λ > 700nm for both leaves (Fig. 5).The coefficient of determination and the rRMSD of leaf A were 0.999 and 5.7%, and for leaf B they were 0.999 and 10.0%, respectively.

Discussion
The results from the simulated and measured data presented above demonstrate that the spectral invariant theory can accurately describe the illumination conditions and retrieve the actual leaf reflectance of sunlit leaves from sub-centimeter spatial resolution hyperspectral images in the VNIR region.Finding a robust solution to the inverse problem is made possible by extending the spectral invariant theory with the parameter C which accounts for the influence of non-green materials within the pixel and uncertainties from, e.g., reflectance calibration and atmospheric correction.

The p-theory as a forward model
Overall, formulating the scattering factor as S leaf = R leaf described the TOC reflectance more accurately than S leaf = T leaf and S leaf = 1 2 ω leaf (Table 2).This result is in line with the notion that the radiance scattered by an average sunlit leaf within a canopy toward a sensor is mostly due to reflectance (Ihalainen and Mõttus, 2022).It also shows that the ptheory can accurately approximate the TOC reflectance in the VNIR range when the spectral invariants are retrieved from a relatively narrow subset wavelength range.
Investigating the illumination conditions via p(λ) and ρ(λ) showed that the spectral variability of p(λ) was similar for all formulations of the scattering factor while ρ(λ) varies the least when S leaf = R leaf .Inspecting the spectra of p(λ) for individual pixels revealed that p(λ) is not completely constant but varies in the visible with wavelengths (Fig. 2a).This variation corresponds with the level of multiple scattering within the canopy, which is low in the visible wavelengths, where vegetation absorbs strongly, but increases sharply and plateaus in the NIR range where vegetation absorption decreases.On the other hand, the spectra of ρ(λ) was nearly constant in the entire VNIR range for S leaf = R leaf .This was to be expected, given that ρ(λ) describes the ratio of leaf-level direct solar irradiance to the TOC irradiance.Since the TOC irradiance is mostly due to solar irradiance this ratio becomes nearly constant and varies slightly due to the diffuse sky irradiance (Mõttus et al., 2015).Formulating the scattering factor as S leaf = T leaf and S leaf = 1 2 ω leaf produced a peak in the spectra of ρ(λ) at approx.670 nm due to the spectral differences between S leaf and the single-scattered TOC reflectance R single .Since R single is mostly due to reflectance, dividing it by, e.g., transmittance will result in large variations at wavelengths where reflectance and transmittance are the most different.For instance, the strong absorption by chlorophylls at 670 nm results in result in significant differences between reflectance and transmittance while at larger wavelengths they behave similarly.
More work on the theory is needed to accurately model completely shaded pixels.While these pixels are often ignored given their high amount of noise, they can provide valuable information on, e.g., vegetation light use efficiency (Takala and Mõttus, 2016).For completely shaded pixels, the TOC reflectance is given by R TOC = ϕ A ΦTOC S leaf , where ΦTOC is the ratio of diffuse irradiance from the sky and within the canopy on the leaf to the irradiance arriving at the TOC.This ratio could be estimated by comparing sunlit and shaded pixels of the same material.However, this would require identifying said pixels from the image which is cumbersome if done manually and prone to uncertainties

Table 2
The standard deviations of p(λ) and ρ(λ), and the residual sums of squares between measured and modeled TOC reflectance averaged over the sunlit pixels of the simulated data scenes for the three formulations of the leaf scattering factor: S leaf = R leaf , S leaf = T leaf , and S leaf = 1 2 ω leaf .otherwise.A simpler approach would be to find an appropriate mathematical formulation for modeling the ratio as a function of λ.

Illumination correction and leaf reflectance retrieval
A key finding for enabling the inversion is the unique minimum of the objective function (Fig. 3).Lewis and Disney, 2007 noted that nearly arbitrary values of pwithin basic physical limitscan be used for finding a value of ρ that allows an accurate model of the canopy reflectance.This is indicated by the trough-shaped form of the objective function.However, our results show that there exists a unique pair of p and ρ that produce the best fit to the data.Moreover, the position of the minimum corresponded well with the values p and ρ computed for the actual leaf reflectance, provided that PROSPECT was able to accurately model the reflectance.
In practice, the observed spectra of a vegetation pixel rarely arises from purely green material.Rather, the spectra may be influenced by, e. g., the leaf cuticle or veins.For coarser resolution data, twigs, branches, flora, and soil may also be visible in the pixel.Additionally, the noise level of the data and errors in the signal processing chain such as inaccurate atmospheric correction can cause artifacts in the observed spectra.These effects altogether affect the existence and position of the unique minimum.As a countermeasure, we introduced the parameter C in the p-theory for describing them.Since many of the effects do not    exhibit strong dependency on the wavelength, we assumed C to be spectrally invariant from 670 to 720 nm and developed a regularization strategy that adds a penalty for large values of C and for unrealistic values of p and ρ ′ .As a first approximation, we assumed C also to be constant outside the wavelength region used for model inversion.This is not mathematically necessary.Given that we have a leaf reflectance model which can be applied across the spectrum, C(λ) can retrieved from the difference of the modeled and measured TOC reflectance factors.
The regularized inversion algorithm consistently found the correct model parameters for the simulated data despite the addition of the 0.1 constant term to the data (Fig. 4).As expected, the algorithm performed the best when PROSPECT was able to accurately model the leaf reflectance.This is evident in the results where the mean rRMSD between the actual and inverted leaf reflectance was always lower for the scenes where the leaf spectra were created using PROSPECT than for the scenes where the spectra of real leaves were used.The outliers in the inversion results were caused by the pixels at the boundaries between the leaves and the black forest floor where the spectra were contaminated with high levels of noise relative to the reflectance signal.
To test the performance of our method with measured data, we applied the inversion algorithm to the hyperspectral image of the Diervilla lonicera Mill.shrubbery (Fig. 5a).Overall, the method significantly reduced the effect of specular reflection, leaf orientation, and illumination conditions for the sunlit leaves (Fig. 5b).In some cases, the inversion algorithm also produced a solution to completely shaded pixels despite the theory not applying under such conditions.These pixels can be masked in theory by, e.g., using a threshold value for ρ during the inversion, below which the pixel is ignored.Comparing the in situ measured reflectances of leaves A and B to their inverted counterparts from the hyperspectral image showed good agreement between the reference spectra and inverted spectra (Fig. 5c and d).While the TOC reflectances were similar to the reference reflectance in the visible range, the TOC reflectances had much larger values in the NIR.For both leaves, the inverted spectra were underestimated after approx.720 nm.For leaf B, the reflectance was also underestimated in the so-called green peak, at around 550 nm.Similar behavior was also found in the results of the simulated images (data not shown).This is due to the wavelength range (670 to 720 nm) used in the inversion.As noted earlier, p(λ) varies strongly in this range, becoming constant only after 720 nm.Thus, computing a constant p within this region can introduce error to the inverted spectra, especially for the wavelengths where diffuse ambient irradiance has a significant contribution to the TOC reflectance.The error could be decreased by extending the wavelengths used in the inversion from 720 nm to, e.g., 790 nm, where p(λ) becomes constant.However, this would require including the PROSPECT leaf structure parameter, N, as an additional unknown in the inversion algorithm, since it has a major contribution to the leaf reflectance in the NIR range (Xiao et al., 2013).More work is needed to investigate extending the wavelength range used in the inversion while still reliably estimating both p and N.
While the inversion results from the measured data are promising, the fact that the spectra of only two leaves of one species were used as validation needs to be addressed.In fact, the good results may have been specific to the reference leaves or their species, especially considering that PROSPECT did not always perform well for the simulated data with real leaf spectra.On the other hand, this is more of an issue with PROSPECT rather than the inversion algorithm itself.Using a small number of reference leaves was predominantly due to the fact that the target canopy was homogeneous, i.e., the reflectance should not vary strongly from leaf to leaf within the target canopy.The illumination corrected RGB image (Fig. 5b) shows this to be the case.Another point to address is the difference between the observation geometries for the reference measurements and hyperspectral images.A major concern in this regard is the specular reflection from the leaf cuticle: the cross-plane geometry used in the reference measurements ensures minimal contribution from specular reflection, whereas this is not the case in the hyperspectral image.Nevertheless, since the specular component is accounted for by the inversion algorithm, we believe the reference spectra to be sufficient for validation.A final point to address is that the pixels of the leaves in the hyperspectral image used for validation do not exactly correspond to the areas probed with the spectrometer.Given that the leaf surfaces were relatively homogeneous, we do not consider this to be a cause of excessive uncertainty in the validation.
It should be noted that using PROSPECT to model the hemisphericaldirectional reflectance is suboptimal as it actually models the DHR.However, since PROSPECT is able to produce spectra which sufficiently resemble the reflectance, albeit with possibly incorrect leaf trait parameters, its use can be justified.Moreover, the leaf traits can be estimated from the illumination corrected reflectance spectrum produced by the inversion algorithm via other methods such as spectral indices.To estimate the leaf chlorophyll content and other traits directly with the method presented in this work, another leaf radiative transfer model is needed.One solution could be sought by mathematically linking the reflectance with the DHR, similar to the link between the bidirectional reflectance factor and the DHR (Bousquet et al., 2005;Jay et al., 2016).This would enable the use of PROSPECT, although the approach might only be valid under special circumstances.An important point to consider with leaf radiative transfer models is the computational cost.Despite the relatively good efficiency of PROSPECT for spectrometer data, applying it to hyperspectral images with hundreds of thousands of pixels can be computationally expensive.In this regard, leaf radiative transfer models based on the p-theory (Lewis and Disney, 2007;Wu et al., 2021) offer a promising solution.
In addition to accounting for completely shaded pixels, retrieving leaf traits, and extending the wavelength range, future work on the inversion algorithm should also be focused on up-scaling the method from sub-centimeter spatial resolution to coarser resolutions.However, a major challenge for the up-scaling is that it would require unmixing of the various spectral signatures from within a given pixel, further complicating the inverse problem.Nevertheless, recent efforts with the p-theory already account for the spectra of forest floor (Markiet and Mõttus, 2020) and woody elements (Hovi et al., 2022) in the TOC reflectance.A promising addition from our work is the C parameter which can theoretically model any additional spectral signature in the pixel.

Conclusion
In this work, a physically based method for the illumination correction of TOC reflectance for sub-centimeter spatial resolution hyperspectral data in the VNIR wavelengths was investigated using simulated and measured data.The method, based on the p-theory, relates the observed TOC reflectance of sunlit leaves to the actual leaf reflectance and the incident ambient diffuse irradiance and direct solar irradiance via two parameters, p and ρ.A third parameter, C, was introduced into the p-theory to account for the effects of non-green materials in a given pixel.Using the simulated data, we confirmed that assuming p and ρ to be independent of wavelength in the VNIR region accurately reproduces the TOC reflectance when the actual leaf reflectance is known.
To enable the illumination correction of the TOC reflectance via the inversion of these parameters, a regularization strategy was developed.The results of the regularized inversion demonstrated that the illumination conditions can be accurately corrected for using the p-theory which allowed the retrieval of the actual leaf reflectance.This can have significant implications for vegetation monitoring applications relying on very high spatial resolution hyperspectral data as it offers the potential for an efficient physically based approach for leaf trait retrieval under complex illumination conditions.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1 .
Fig. 1.The SPECIM IQ hyperspectral camera mounted on a tripod above the Diervilla lonicera Mill.shrubbery and the Spectralon white reference placed at the top of the canopy.

Fig. 2 .
Fig.2.An example of p(λ) and ρ(λ) of a single pixel calculated using the three formulations for S leaf (λ) from the simulated data.

Fig. 3 .
Fig. 3.The behavior of the objective function, V = RSS, for a sunlit pixel from the scene D74 when C ab and C were estimated for fixed values of p and ρ.The white dot marks the p and ρ values for the actual leaf reflectance.

Fig. 4 .
Fig.4.The distributions of the relative RMSD values between the inverted and the actual leaf reflectances of the simulated data scenes.The spectra of the leaves in the scenes starting with the letter P were generated using PROSPECT.For the scenes starting with A, the measured spectra from the ANGERS database was used.

Fig. 5 .
Fig. 5.The RGB TOC reflectance image of the Diervilla lonicera Mill.shrubbery with the pixels corresponding to the in situ measured area of leaves A and B shown respectively in cyan and fuchsia (a), an inverted reflectance RGB image (b), the situ measured reflectances and the corresponding averaged TOC and inverted reflectances of leaf A (c) and leaf B (d).