Radiative transfer model inversion using high-resolution hyperspectral airborne imagery – Retrieving maize LAI to access biomass and grain yield

Mapping crop within-field yield variability provide an essential piece of information for precision agriculture applications. Leaf Area Index (LAI) is an important parameter that describes maize growth, vegetation structure, light absorption and subsequently maize biomass and grain yield (GY). The main goal for this study was to estimate maize biomass and GY through LAI retrieved from hyperspectral aerial images using a PROSAIL model inversion and compare its performance with biomass and GY estimations through simple vegetation index approaches. This study was conducted in two separate maize fields of 12 and 20 ha located in north-west Mexico. Both fields were cultivated with the same hybrid. One field was irrigated by a linear pivot and the other by a furrow irrigation system. Ground LAI data were collected at different crop growth stages followed by maize biomass and GY at the harvesting time. Through a weekly/biweekly airborne flight campaign, a total of 19 mosaics were acquired between both fields with a micro-hyperspectral Vis-NIR imaging sensor ranging from 400 to 850 nanometres (nm) at different crop growth stages. The PROSAIL model was calibrated and validated for retrieving maize LAI by simulating maize canopy spectral reflectance based on crop-specific parameters. The model was used to retrieve LAI from both fields and to subsequently estimate maize biomass and GY. Additionally, different vegetation indices were calculated from the aerial images to also estimate maize yield and compare the indices with PROSAIL based estimations. The PROSAIL validation to retrieve LAI from hyperspectral imagery showed a R2 value of 0.5 against ground LAI with RMSE of 0.8 m2/m2. Maize biomass and GY estimation based on NDRE showed the highest accuracies, followed by retrieved LAI, GNDVI and NDVI with R2 value of 0.81, 0.73, 0.73 and 0.65 for biomass, and 0.83, 0.69, 0.73 and 0.62 for GY estimation, respectively. Furthermore, the late vegetative growth stage at V16 was found to be the best stage for maize yield prediction for all studied indices.


Introduction
Maize (Zea mays) is one of the major food crops over the world, cultivated on more than 182 million ha and producing over 1400 million ton of grain in 2018 with an average grain yield of 7.7 ton/ha (FAO, 2020). The demand for food is increasing worldwide and climate change aggravates the volatility of yield (Cogato et al., 2019). Crop yield varies between seasons, countries, fields, varieties and even within the same field due to diverse management practices and environmental conditions. Monitoring this within-field variability in season and from previous seasons provides a piece of essential information for farmers, land rentals and insurance companies for decision making. Furthermore, within-field variability of yield from previous season is one of the fundamental inputs for site-specific crop management recommendations, such as fertilizer and seed rates (Kayad et al., 2021) through the delineation of management zones.
A common practice for site-specific crop management is to collect information with remote and proximal sensing to investigate withinfield variability of different factors that drive crop yield and quality. Acquired images through this approach require a proper analysis with specific algorithms, in order to generate the appropriate information layers for subsequent decision making and calculation of optimal input rates. Use of unmanned aerial vehicles is increasing rapidly in agriculture. They provide images with a high spatial resolution covering up to hundreds of hectares in a single flight (Caballero et al., 2020;Lan et al., 2010). These images can also be used as a tool for proving concepts at experimental plot scale for later being upscale through satellite imagery, which can cover regional areas.
Leaf Area Index (LAI) is an important biophysical variable for monitoring maize growth and estimating yield (Su et al., 2019a). LAI is a dimensionless (m 2 /m 2 ) variable that describes the canopy structure and is related to the vegetation photosynthetic activity and plant health. It can be considered as a potential proxy for crop biomass, harvest index and grain yield (Baret et al., 1989;Haboudane et al., 2004;Jonckheere et al., 2004;Weiss et al., 2004). Crop LAI measurements are based on different techniques that can be split between direct and indirect methods (Strachan et al., 2015). Direct methods require plant leaves collection to be measured by leaf scanner instrument. Indirect methods, e.g. close range detection techniques, such as ceptometers, digital hemispherical photography (fish-eye), smartphone applications and remotely sensed data (Confalonieri et al., 2013;Jonckheere et al., 2004;Weiss et al., 2004) which measure the reflectance from incident-diffuse and/or the direct illumination, are also widely recognized for such measurements (Facchi et al., 2010;Francone et al., 2014). Destructive methods are more accurate, however more labouring and time consuming as compared to indirect methods. Furthermore, direct methods and close range techniques are both limited in terms of spatial sampling, while remotely sensed techniques allow for an exhaustive characterization of the fields of large spatial extent.
Estimating LAI with remote sensing (RS) data provides a nondestructive, rapid and cost-effective method over large areas, allowing also frequent measurements. Two main approaches for LAI estimation from RS data are commonly used. First, the empirical approach is based on fitting relationships between remotely sensed canopy reflectance through spectral vegetation indices (VIs) and in situ LAI measurements (Haboudane et al., 2004;Qiao et al., 2020;Towers et al., 2019). The second approach exploits the knowledge of the physics of the interaction between electromagnetic radiation and vegetation surfaces, developed using radiative transfer theory (Campos-Taberner et al., 2016;Danner et al., 2019;Delloye et al., 2018;Duan et al., 2014;Houborg et al., 2009;Jacquemoud et al., 2000;Mananze et al., 2018;Punalekar et al., 2018). Radiative transfer models (RTM) allow for the simulation of reflectance from a set of biophysical variables. Inverting these models from reflectance measurements then allow for estimating these biophysical variables (Darvishzadeh et al., 2008;Mananze et al., 2018). The empirical approach through VIs is simple and fast, while it cannot be easily generalized as it depends on the season, location, crop density, plant species, growth stage and specific spectral sensor resolution (Rivera et al., 2014;Su et al., 2019a). Furthermore, it does not allow for exploiting all the relevant spectral information compared to RTM, as only two or three bands are mostly used for VIs calculation (Baret and Buis, 2008;Berger et al., 2018). On the other hand, the RTM inversion is more robust as the physical laws governing the radiative transfer model within the canopy are not site or date specific but depend on the observation and illumination geometry, besides the biochemical and structural properties of the vegetation elements that are described by the input biophysical variables to be estimated (Darvishzadeh et al., 2008;Mananze et al., 2018). However, RTM may require a fairly complex process to well represent the canopy structure related to a given crop while the accuracy of the model estimation depends on the predefined model parameters and natural uncertainties of the model, which is commonly referred to by the ill-posed inverse problem (Combal et al., 2003;Houborg et al., 2015).
VIs play an important role in monitoring variation in vegetation and are defined as the arithmetic combination of two or more spectral bands, which allows for enhancing vegetation while minimizing background effects (Matsushita et al., 2007). Leaf reflectance in the visible spectral range is mostly affected by the leaf chlorophyll content. Chlorophyll shows a higher reflectance in the green (G) spectrum region compared to red (R) and blue (B) regions (Barnes et al., 2015). At canopy level in the near-infrared (NIR) region, the reflectance is much greater than in the visible due to light scattering caused by interactions between leaf internal structure and the incident radiation (Knipling, 1970). The narrow portion between R and NIR regions is called Red-Edge (RE), where vegetation reflectance increases dramatically compared to a smaller increase in the case of soils and other terrestrial objects (Corti et al., 2018;Scotford and Miller, 2005). Vegetation indices exploit such differences in leaf and canopy reflectance properties in the visible and near-infrared spectrum according to the variable of interest.
The use of remotely sensed data and its derived VIs (Kayad et al., 2019;Schwalbert et al., 2018) or assimilation in crop growth models for monitoring crop growth  and yield estimation is already well established (Bala and Islam, 2009;Bouman, 1995;Gao et al., 2018;Mkhabela et al., 2011;Monteith, 1972;Rodrigues et al., 2018;Schulthess et al., 2013). Many VIs provide a good estimate of the accumulated absorbed photosynthetically active radiation (APAR) which in turn controls biomass accumulation (Akitsu et al., 2017;Monteith, 1972). For instance, the normalised difference vegetation index (NDVI) (Hasegawa, 1976) is widely used for crop growth and yield estimations, however it is limited by some saturation for medium to high LAI values (Hatfield et al., 2008;Peralta et al., 2016;Rembold et al., 2013). Other VIs show better sensitivity for high LAI values such as the green normalized difference vegetation index (GNDVI) (Gitelson et al., 1996) or the normalized difference red-edge (NDRE) (Hatfield et al., 2008). Several studies investigated the possibility of estimating maize biomass and grain yield (GY) through VIs (Kayad et al., 2019;Madugundu et al., 2017;Schulthess et al., 2013;Schwalbert et al., 2018;Venancio et al., 2020) due to their high correlation with several biophysical variables and simplicity (Ji and Peters, 2007;Lambert et al., 2018). Kayad et al. (2019) investigated the possibility to monitor maize GY using Sentinel-2 satellite images through different VIs and machine learning techniques. Their results showed that GNDVI was the best index to describe within-field GY variability at the R4-R6 growing stage with R 2 values of up to 0.48 for GNDVI and up to 0.6 from random forest correlation based model. Shanahan et al. (2001) suggested GNDVI at the mid-grain filling stage for maize GY prediction in plot experiment. Furthermore, Schwalbert et al. (2018) reported in a recent study that NDRE, GNDVI and NDVI showed high performance in maize yield prediction from Sentinel-2 images collected ± 20 days from the flowering stage. In general, monitoring crop yield variability through empirical models derived from VIs is already well investigated while it is only valid under specific conditions, requires field measurements concurrent to remote observations and tends to have spatial and temporal limitations (Berger et al., 2018;Hatfield et al., 2008).
In this study, we explored the inversion of the PROSAIL RTM (Jacquemoud et al., 2009), which is a combination between the leaf optical properties model (PROSPECT) with the scattering by arbitrary inclined leaves model (4-SAIL). PROSPECT is a key model to simulate leaf reflectance over the whole optical domain and comes with different versions, such as PROSPECT 4, 5 and D (Baret et al., 1992;Jacquemoud and Baret, 1990). The 4-SAIL model was proposed by Verhoef (Verhoef, 1985(Verhoef, , 1984 to simulate bidirectional reflectance of a canopy (Jacquemoud et al., 2009). Previous studies on PROSAIL investigated its ability to retrieve biophysical and biochemical variables such as LAI and chlorophyll from maize, wheat, rice, sugar beet, potato and grassland (Atzberger et al., 2015;Baret et al., 2007;Darvishzadeh et al., 2012Darvishzadeh et al., , 2008Duan et al., 2014;Herrmann et al., 2011;Punalekar et al., 2018;Richter et al., 2011Richter et al., , 2009Sehgal et al., 2016). In some cases, LAI and chlorophyll retrievals were followed by yield or nitrogen content estimations using empirical equations or crop-growth models Huang et al., 2019b;Jay et al., 2017;Punalekar et al., 2018;Zhang et al., 2016). Comprehensive systematic reviews were done by Jacquemoud et al. (2009) andBerger et al. (2018) reporting PROSAIL theory, applications and evaluation for future capabilities.
Although several studies focused on maize LAI, biomass and GY estimation through different remote sensing approaches, there is a need to evaluate the use of the PROSAIL approach on actual maize farmer fields dealing with natural field spatial variability for commercial applications. Furthermore, comparing PROSAIL with simple empirical VI approach is needed to understand their relative strengths and weaknesses. This study has the following specific objectives: (1) perform PROSAIL model inversion for maize LAI retrieval at different crop growth stages; (2) assess maize biomass and GY relationships with retrieved LAI from PROSAIL; (3) compare the relationships between selected VIs and retrieved LAI with maize ground biomass and GY.

Study area and ground data collection
This study was conducted in two separate fields (23 km distance) located in the Yaqui Valley near Ciudad Obregón (Sonora), in northwestern Mexico (Fig. 1). Those fields were cultivated with the same maize variety (Caribú) during the same growing season 2014-2015. The first field (27 • 17'N, 109 • 57'W) named F1 which has an area of 20 ha. It was sown on October 4th, 2014 with a seed rate of 10 seeds / m 2 . The second field (27 • 26'N, 110 • 07'W) named F2 has an area of 12 ha and was sown on September 20th, 2014 with a seed rate of 9 seeds / m 2 . Row distance was 0.8 m. F1 was irrigated by a furrow irrigation system (flood), while for F2, a linear pivot irrigation system was used. Residues from the previous crop of F1 were burned (maize), while they were incorporated into the soil using a conventional tillage for F2 (potato). Climate in the study area is semi-arid with an average rainfall of 280 mm per year and soil types are clayey and alluvial soils at 3:2 ratio (Meisner et al., 1992;Rodrigues et al., 2018).
Leaf Area Index (LAI) were measured on the ground using the LAI ACCUPAR LP-80 developed by METER Group, Inc. USA. Measurements were made at five different maize growth stages between 40 and 157 days after sowing (DAS). The LAI data were collected at 12 fixed ground points which resulted in a total of 60 ground LAI observations for F1. Each observation consisted of three readings made perpendicular to the two central rows along the sampling area, which is described in the next paragraph. For the subsequent analyses, the averages of the three readings were used. The LAI measurement considered the solar zenith angle at each field location, observation date and time calculated from the National Oceanic and Atmospheric Administration (NOAA) website (https://www.esrl.noaa.gov/gmd/grad/solcalc/azel.html).
Biomass and GY were sampled at the harvesting time at 26 and 25 sampling points from F1 and F2 fields, respectively, including the 12 LAI ground points above mentioned from F1 field (Fig. 1). Biomass and GY were sampled from the 2 rows adjacent to the centre of each sampling point. Sampling distance for each row was 5 linear metres, resulting in a sampling area of 8 m 2 . The final observations were calculated in dry tons ha -1 . Biomass refers to the whole above ground plant material harvested per unit area while grain yield refers to weighted harvested grain per unit area.

Hyperspectral airborne flight campaign and image processing
Aerial hyperspectral images were collected from the two fields at 12 different dates -varying from V7 to R6 growth stages (GS) (Hanway, 1963). A total of 9 images from field F1 and 10 images from field F2 were collected. From those, five images for F1 had simultaneous measurements of ground LAI. Table 1 show dates of observations with the corresponding DAS and GS from each field.
The aerial hyperspectral imagery flight campaign was carried out with a push-broom micro-hyperspectral imaging sensor model, Micro-Hyperspec VNIR (Headwall Photonics, Bolton, MA, USA). This sensor measures the reflectance within the spectral region of 400-850 nm, split into 250 channels. The sensor was mounted on a manned airplane flying at 350 m above ground yielding images of 0.7 m ground sampling  distance. We used crosses, made of 3 m long sheets of polyethylene coated with aluminium foil, as ground control points and placed them around the study sites and measured their coordinates with a real-time kinematic (RTK) GNSS model, Trimble R4 GNSS system (Trimble, CA, USA). They were subsequently used for geo-referencing of the hyperspectral images. The root mean square error of georeferencing was less than the pixel size (0.7 m). During the processing, the geolocation of each image was done using ENVI 5.6 version (ENVI, Research Systems Inc., Boulder, CO, USA). Based on the methodology described by Rodrigues et al. (2018), the hyperspectral sensor had been radiometrically calibrated in the laboratory using an integrating sphere, CSTM-USS-2000 C Uniform Source System (LabSphere, NH, USA) at six integration times and four levels of illumination. Hyperspectral imagery was atmospherically corrected using the total incoming irradiance at 1 nm intervals simulated with the SMARTS model hosted by the National Renewable Energy Laboratory, US Department of Energy (Gueymard, 2005(Gueymard, , 1995. The aerosol optical depth was measured at 550 nm with a Micro-Tops II sun photometer model (Solar LIGHT Co., Philadelphia, PA, USA) at the time of the flights. The SMARTS model computes clear sky spectral irradiance, including hemispherical diffuse, direct beam, circumsolar and total irradiance on a tilted or horizontal plane for specified atmospheric conditions. The algorithms were developed to match the output from the MODTRAN complex band models to within 2%, using aerosol optical depth as an input. The spectral resolution was 1 nm for the 400-1750 nm and 0.5 nm for the 280-400 nm ranges of the electromagnetic spectrum. This radiative transfer model had been previously used for the atmospheric correction of narrow-band multispectral imagery in several studies (Berni et al., 2009;Calderón et al., 2015Calderón et al., , 2013Zarco-Tejada et al., 2016, 2012. Spectral binning was performed on each mosaic into 7.5 nm FWHM (Full Width at Half Maximum) to decrease noise effects, resulting in 61 wavelengths. From those, the 751, 759, 766, 773, 810 and 818 nm wavelengths were removed due to oxygen absorption by the sensor and noise effects. Finally, 55 wavelengths were used for subsequent analyses. All mosaics were used for data extraction based on the ground measurements locations (26 points from F1 and 25 from F2) with a buffer of 3 m diameter from the centre of the geocoordinates. This buffer distance allowed for averaging of about 7 pixels from each point. Finally, all spectral data were associated with the corresponding maize biomass, GY and matched ground LAI observations for further analyses. Fig. 2 shows the spectral data acquired across five hyperspectral images, which corresponds to the 60 ground LAI measurements.

PROSAIL inversion for LAI retrieval
To build the PROSAIL model, a combination between the PROSPECT-4 and 4-SAIL models was used. In the PROSPECT-4 model, four leaf parameters are required to build the model: leaf structure (N), chlorophyll a+b content (C ab ), water thickness (C w ) and dry matter content (C m ). Whereas the 4-SAIL model requires the following parameters: leaf reflectance and transmittance (PROSPECT output), LAI, average leaf inclination angle (ALIA) of an ellipsoidal leaf angle distribution function (Campbell, 1990;Duan et al., 2014), hot spot parameter (Hot), soil brightness (αsoil), sun zenith angle (θs), observer zenith angle (θv) and relative azimuth angle (φSV) (Berger et al., 2018).
Maize biophysical parameters to feed PROSAIL model were obtained from the literature (Berger et al., 2018;España et al., 1999;Richter et al., 2009;Verrelst et al., 2016). Rather wide, but also narrow ranges were found for some of the parameters. For instance, N ranged between 1.2 and 2, C ab ranged between 20 and 80 µg/cm 2 , LAI ranged between 0.5 and 7 m 2 / m 2 and ALIA ranged between 30to 90 . Whilst, other crop related parameters such as C w and C m were in a narrower range. The remaining PROSAIL model parameters are not related to the crop biophysical variables such as: αsoil, θs, θv and φSV. Soil brightness (αsoil) values ranged between 0 and 1. This parameter depends on soil type and moisture content. Sun zenith angle (θs) depends on the location of the study area and the day of the year the study is presented. θs values were calculated from the NOAA as described in Section 2.1 for the whole growing season, with values ranging between 42to 52 . Additionally, information about viewing geometries such as θv and φSV depends on the sensor and flight information which in our case have the ranges between − 24to 24and 0to 180 , respectively. Global sensitivity analysis (GSA) and preliminary analysis were done to examine different combinations of PROSAIL model parameters and the influence of each parameter simulated in the final look up table (LUT). Based on this preliminary analysis and GSA, we decided to change only the most sensitive parameters which were the ones having the wide range of magnitude and sensitivity, while other parameters were kept constant. This decision seems to be reasonable considering the required computational power and the fact that this study focused on one crop within the same region.
The ARTMO software package (Rivera et al., 2014) was used to simulate a maize spectral reflectance training dataset and to perform the GSA at different leaf and canopy parameters through the PROSAIL model. Building the PROSAIL based model consisted of three steps: First, setting up the sensor characteristics, which include the number of wavelengths (55 bands), its spectral range (400-850 nm) and FWHM (7.5 nm). Second, adding maize biophysical parameters, field and flight conditions to the PROSPECT-4 and 4-SAIL models as shown in Table 2 and described previously. Finally, running the PROSAIL simulations for specified maize parameters to simulate its spectral reflectance. The parameter θs was fixed at 49which is the median value at the study area through the growing season. Additionally, C m and C W were fixed at 0.005 g/cm 2 and 0.02 cm, respectively, which are in line with literature range values (Berger et al., 2018). The GSA analysis results showed that no influence of these parameters in visible and NIR regions, while θv and φSV were sampled within the predefined range to cover the different Fig. 2. Spectral data corresponding to the 60 ground LAI measurements at different DAS from field F1. observation scenarios. The PROSAIL spectral dataset was generated for five free variables N, C ab , LAI, ALIA and hot spot sampled within the predefined ranges using a uniform distribution function except for LAI where a latin hypercube distribution function was used. As a result, a total of 60750 simulations were obtained for the dataset. Finally, the cost function was used to find the best match between the simulated data set and the available ground LAI data with the corresponding spectral data . The least square error (LSE) cost function was used to select the solution of the inverse problem and the best 10% of the solutions corresponding to the smallest LSE values were averaged to calculate the modelled estimates of free parameters. Such approach were reported by several studies to reduce the influence of the ill-posed problem caused by measurement errors and model inadequacies (Duan et al., 2014). The 60 ground LAI data (12 ground sampling points from F1 time five images) were used to validate the retrieved LAI from PROSAIL. Fig. 3 illustrates the maize LAI retrieval flowchart based on PROSAIL and LUT inversion approach. The validated model was used to retrieve LAI from all remaining available ground sampling points where GY and biomass were measured, across all hyperspectral time series from both field locations. LAI data retrieved from PROSAIL model inversion will be referred as 'retrieved LAI' hereafter.

Calculated VIs
In order to compare the estimation of biomass and GY by the retrieved LAI from PROSAIL inversion with a simpler approach as VIs, three VIs were calculated; NDVI, GNDVI and NDRE (Eqs. 1-3). These VIs were chosen based on their simplistic forms and previous results shown in the literature (Kayad et al., 2019;Peralta et al., 2016;Schwalbert et al., 2018). VIs were calculated from all acquired aerial hyperspectral images from both study fields and extracted using the 51 ground point locations from both study fields (Fig. 1).
Where: NDVI, is the normalized difference vegetation index, GNDVI is the green normalized difference vegetation index, NDRE is the normalized difference red-edge, NIR is the reflectance at the near infrared band (832 nm), R is the reflectance at the red band (663 nm), G is the reflectance at the green band (560 nm) and RE is the reflectance at the red-edge band (722 nm).

Data analysis
To build empirical models between the retrieved-LAI and studied VIs versus maize biomass and GY, different regression models were tested, including linear, exponential, power and logarithmic. As per this step results and the distribution of data points, regression models were fitted between maize biomass and GY versus retrieved LAI and vegetation indices. This step considered only the 12 sampling points corresponding to the ground LAI measurements from F1 (Fig. 1). A time series analysis through coefficient of determination (R 2 ) values was used to assess the best crop growth stage for maize biomass and GY estimation. The subsequent analyses were done using only the best crop growth stage selected.
The empirical equations fitted in the previous analysis were retrieved by considering only the ground 12 observation points from F1, and applied to the remaining dataset. The remaining dataset, totalling 39 observations across both fields (14 from F1 and 25 from F2), was used for the cross-validation of the empirical equations. Furthermore, comparisons of the performance achieved by the different proxiesretrieved LAI and VIswere done by assessing statistical metrics of each crossvalidation. R 2 values, root mean square error (RMSE) and mean absolute error (MAE) where calculated for this purpose.

Field ground data
Ground LAI data were collected at different crop stages starting from 40 to 123 DAS. Boxplots for ground LAI data from F1 are shown in Fig. 4.   Fig. 3. The development flowchart for Maize LAI retrieval model based PROSAIL.
Ground LAI data showed an average of 1.3 at 40 DAS and subsequently increased to above 3 at 74 DAS. The average LAI values across crop stages from F1 was 2.7 and reached a maximum of 4.9 at 123 DAS. Fig. 5 shows the boxplots from maize biomass and GY. F2 produced an average of 20.5 and 9.8 ton/ha while F1 produced 11.7 and 5.4 ton/ ha for maize biomass and GY, respectively. F2 yield was higher than F1 with a narrow range of variability to both, maize biomass (14.8-26.7 ton/ha) and GY (8-12 ton/ha). Whilst F1 showed a higher range variability on both, maize biomass (4.4-20 ton/ha) and GY (1.4-9.2 ton/ha). The narrow range of variability observed for F2 indicates that it was more homogeneous as compared to F1. Across both fields, the range of GY was between 1.4 and 12.0 ton/ha and for biomass, it was between 4.4 and 26.7 ton/ha. This range is quite wide, as compared to other studies (Kayad et al., 2021(Kayad et al., , 2019 and global reports (FAO, 2020).

Retrieved LAI from PROSAIL inversion-based model
The PROSAIL model yielded 60750 synthetic spectral signatures using the model input parameters as per Table 2. These simulations were used through the LUT inversion approach with the LSE cost function to retrieve maize LAI from both study fields. Fig. 6, shows the 60750 synthetic spectral signatures, the measured reflectance data overlapped with the range of synthetic data and the bare soil reflectance from ARTMO library. The 60 ground LAI data points across different growth stages were used to validate the retrieved LAI from PROSAIL. Validation results showed an R 2 value of 0.5 with a mean absolute error (MAE) of 0.7 and RMSE of 0.8 (Fig. 7). Specific maize growth stages R 2 values were 0.63, 0.64, 0.59, 0.4 and 0.37 for 40, 53, 74, 107 and 123 DAS, respectively. This model was used to retrieve LAI from all acquired hyperspectral images where ground sampling measurements from both study fields took place, which were considered for further analysis.    Fig. 7. Cross-validation between ground LAI vs PROSAIL retrieved LAI at different DAS.

VIs and retrieved LAI vs maize yield and biomass
Calculated vegetation indices and retrieved LAI from the 12 selected ground sampling points were correlated with maize biomass and GY to describe their relationship through maize growth stages from F1. The time-series of R 2 values between NDVI, GNDVI, NDRE and retrieved LAI versus maize biomass and GY are shown in Fig. 8a and Fig. 8b, respectively. In general, all VIs and retrieved LAI have the same trend over different growth stages. R-square started with high values (>0.65) between 40 and 98 DAS for both, biomass and GY, then declined gradually at advanced crop growth stages. At 40 DAS, retrieved LAI showed the lowest correlation values compared to the other VIs for both biomass and GY, while all indices showed very low correlations starting from 144 DAS. At 123 DAS, GNDVI and NDRE showed a higher accuracy than retrieved LAI and NDVI.
As both, VIs and LAI, showed the same temporal trend with maize biomass and GY according to the R 2 time series, it can be concluded that the same late vegetative growing stage V16 (74 DAS at F1 and 67 DAS at F2) is most suitable for maize biomass and yield assessment. Equations 4-11 describe the empirical relationship between maize biomass and GY versus retrieved LAI, NDVI, GNDVI and NDRE at the V16 growth stage, as described in Section 2.5. These equations were applied on the remaining 39 observations points as part of the cross-validation process, assessing maize biomass and GY from both fields and evaluating model accuracies. Where biomass and GY in ton/ha and all p-values were ˂0.001. Fig. 9 shows the XY graphs resulting from the cross-validation process, where maize biomass and GY were estimated from the VIs and retrieved LAI empirical models at V16 growth stage generated in the previous step. Estimated maize GY and biomass through NDRE showed the highest R 2 values (0.83 and 0.81, respectively) in comparison with all tested indices, with RMSE of 1.11 and 2.36 ton/ha for maize GY and biomass, respectively. Retrieved LAI and GNDVI showed the same biomass estimation accuracy of 0.73 R 2 value and 3.78 and 3.72 ton/ha RMSE, respectively. While GNDVI showed a slightly better performance for GY estimation. Moreover, NDVI produced the lowest estimation accuracies with R 2 values of 0.62 and 0.65 and RMSE of 2.09 and 4.99 ton/ ha for GY and biomass, respectively. Considering R 2 values from each field, showed a quite low values from F2 due to the narrow range of variability. In F1, retrieved LAI showed the highest R 2 values followed by NDRE and mostly equal values from GNDVI and NDVI.

Discussion
Ground LAI increased until V16 and then remained stable (Fig. 4). Feng et al. (2013) reported similar maize LAI behaviour across growth stages under drought experiments. Meanwhile, ground yield measurements from both fields showed a clear difference between them regarding spatial variability and average measurements. F2 produced 75% and 83% higher biomass and GY than F1, respectively. Furthermore, the range in variability for GY in F1 was between 1.4 and 9.2 ton/ha and 4.4-20 ton/ha for biomass. In contrary, F2 was more homogenous. Its production levels were also higher, as they ranged from 15 to 26 and 8-12 ton/ha for biomass and GY, respectively (Fig. 5). Kayad et al. (2019) reported an average maize GY of > 13.5 ton/ha in a highly productive field for continues three seasons while the FAO (2020) reported a global average of 7.7 ton/ha. These yield differences are most likely due to the different irrigation systems: F2 had more frequent post-planting irrigations and they were more evenly distributed thanks to the pivot irrigation system, whereas F1 was irrigated by a furrow irrigation system.
In this study, a total of 60750 LUT simulations based on PROSAIL input parameters were used to estimate maize LAI. This number was based on combination of different input values of N, Cab, LAI, ALIA, Hot, αsoil, θv, φSV and constant values for other model parameters according to previous studies, field and measurement conditions ( Table 2). The use of identical parameters for both fields is reasonable, considering that the two fields were planted with the same hybrid and experienced the same weather conditions. Image acquisition also took place under almost identical conditions. The number of LUT simulations is in agreement with many previous studies in this field of research (Darvishzadeh et al., 2012;Duan et al., 2014;Richter et al., 2009). Duan et al. (2014) investigated different LUT sizes between 50, 100 and 250 thousand simulations to estimate LAI for sunflower, potato and maize fields and reported no significant difference between the retrieved LAI accuracies using different LUT sizes. Similar results were reported by Darvishzadeh et al. (2012) for rice chlorophyll estimations and by Richter et al. (2009) for sugar beet and maize LAI estimations. Furthermore, preliminary tests were carried out to investigate different LUT sizes by decreasing the number of steps within each parameter, reaching up to 120 thousand simulations. It was found that 60750 simulations provided reasonable results.
Retrieved LAI from PROSAIL inversion showed reasonable accuracy against ground LAI measurements where R 2 value was 0.5 and RMSE was 0.8. This result is in agreement with previous studies from different experiments and platforms. Kimm et al. (2020) used different fused satellite datasets acquired by MODIS, Landsat and CubeSat to estimate maize LAI in the US corn belt through a PROSAIL RTM inversion. Their results showed R 2 values of 0.69 and 0.76 while RMSE was > 1 m 2 /m 2 between ground measured LAI and estimated LAI from STAIR fusion (MODIS-Landsat fusion) and CubeSat data, respectively. Su et al. (2019b) retrieved maize canopy LAI and chlorophyll at county scale by fusing Sentinel-2 and MODIS images through PROSAIL at four different maize growth stages. Their results showed R 2 values of about 0.6 between ground measured and estimated LAI and chlorophyll. Su et al. (2019a) retrieved maize LAI using PROSAIL with leaf angle distribution functions generated from terrestrial laser scanning points resulting in a strong and significant correlation (R 2 value = 0.82) between ground and estimated maize LAI.
Additionally, the retrieved versus ground LAI were relatively well distributed around the 1:1 line (Fig. 7). However, the retrieved LAI was slightly overestimated at early crop growth stages and slightly underestimated at late crop growth stages. Such a trend of over and underestimation of retrieved LAI through PROSAIL inversion was observed from previous studies (Adeluyi et al., 2021;Bacour et al., 2006;Si et al., 2012). The PROSAIL model does not take into account the shading effect of row crops (Berger et al., 2018;Dorigo, 2012;Richter et al., 2011). Duan et al. (2014) used hyperspectral images acquired by UAV in different flight directions to estimate LAI through a PROSAIL inversion. Their results showed that shading enhanced reflectance in the backward scattering direction and reduced in the forward scattering direction. Similar results were reported from multi-angular compact high resolution imaging spectrometer (CHRIS) data to estimate LAI (Verrelst et al., 2012). Another assumption could be the fact that maize is a typical row Fig. 9. XY cross-validation graphs of estimated vs ground maize biomass and GY based on retrieved LAI and VIs empirical models. crop and affected by leaf clumping in row direction while the PROSAIL model assumes that leaves are randomly distributed within the canopy volume (Duan et al., 2014;Jacquemoud et al., 2009). Yao et al. (2008) proposed a row structure model for early maize growth stage and homogeneous canopy model for later stages to reduce the effect of row structure on LAI estimation. Moreover, in this study the spectral reflectance was extracted from each sampling point with a buffer of 3 m diameter, which is almost 7 pixels. The sampling area covers almost 4 rows which could be reasonable for reducing the row effects, especially at early stages.
According to the time series analysis for R 2 values between Biomass and GY against retrieved LAI and VIs, the maize vegetative growth stage at V16 resulted in the highest accuracies for yield estimations. This result is in accordance with previous studies for late vegetative stages (Peralta et al., 2016;Schwalbert et al., 2018) while other studies suggested to use the grain filling stages R2-R4 for maize yield prediction (Aguate et al., 2017;Kayad et al., 2019). The difference between the studies reporting on the most suitable time for maize yield prediction could be due to the difference in experimental conditions such as maize hybrids, sensors, methodologies, scale and environmental conditions. However, most of the literature suggests the range between late vegetative to early reproductive growth stages for maize yield prediction through remotely sensed data, which is in agreement with results found in this study.
Maize biomass and GY estimation through VIs and retrieved LAI showed different accuracy levels at different growth stages. Fig. 8 shows R 2 values up to 0.88 for biomass and 0.87 for GY at early stages till 97 DAS, then they declined gradually to < 0.2 before harvest. In general, all indices followed the same trend of high correlations at early stages, decreasing at later growth stages, while NDRE and GNDVI showed better performance than other indices at 123 DAS. This result is in agreement with Kayad et al. (2019), who suggested GNDVI for maize yield estimation at R4-R6 growth stage between 105 and 135 DAS. However, their study didn't consider early growth stages. The lower correlation from NDVI at this specific stage could be explained by the fact of NDVI saturation at medium to high LAI values (Hanna et al., 1999;Kayad et al., 2016;Nguy-Robertson et al., 2012;Trotter et al., 2008). Additionally, for the retrieved LAI, the low range of variability at this stage could explain the low correlation at 123 DAS as shown in Fig. 4.
The selected growth stage of V16 (late vegetative stage) for yield estimation provides a balanced decision considering the findings of previous studies. Although, the R 2 values were almost the same for the selected growth stage at 73 DAS for all studied indices except NDVI (Fig. 8), yield estimation accuracies changed as shown in Fig. 9. It is worth mentioning that in Fig. 9, NDRE estimations trend line were mostly overlaid with the 1:1 line, compared to higher deviation in case of other studied indices, which prove the robustness of NDRE approach compared to other approaches. Kayad et al. (2019) reported same indices for within-field maize yield prediction with Sentinel-2, while their R 2 value was ˂0.5 and reached 0.6 when using machine learning techniques. Schwalbert et al. (2018) reported that NDRE retrieved from Sentinel-2 images was the most sensitive VI for maize yield prediction, followed by GNDVI and NDVI with R 2 values ranging between 0.32 and 0.68. Additionally, NDRE outperformed GNDVI and NDVI for maize yield prediction at mid-season from high resolution RapidEye satellite imagery with R 2 value ranging between 0.29 and 0.70 (Peralta et al., 2016). NDRE could avoid the saturation issues at medium to high LAI (Schwalbert et al., 2018). Furthermore, the higher spatial and spectral resolution of the aerial hyperspectral images used in this study could explain the better estimation accuracies than those achieved with Sentinel-2 and RapidEye satellite images.
In general, NDRE, retrieved LAI and GNDVI showed high accuracy for maize yield estimation, while NDRE outperformed the retrieved LAI for two reasons. Firstly, LAI is a biophysical variable that determines the photosynthetic active radiation (PAR), better describing the vegetative canopy structure and its photosynthetic activity. However, other biophysical variables such as chlorophyll content are also important for describing the whole canopy vigour and subsequently crop yield, whereas NDRE could provide a better proxy for the combination of both. Secondly, PROSAIL is a physically-based approach that has many approximations to simulate the biological canopy properties. Such approximations may face many challenges especially to assess within-field variability of the same crop/variety and considering the previously explained limitations of row and shading effects.
It is worth mentioning that the spatial and temporal constraints to reapply VIs based on empirical models were reported by many previous studies (Báez-González et al., 2002;Lobell, 2013;Schwalbert et al., 2018). Although VIs showed high correlations in this study, these correlations may decrease when such empirical equations are applied to a data set that represents a wide range of conditions (Peralta et al., 2016;Schwalbert et al., 2018). On the contrary, retrieved LAI based on RTM inversion, can be applied in different fields for LAI estimation, subsequently allowing for more robust yield estimations and potentially reducing the number of in-situ measurements to a regional level assessment. Having said so, retrieving LAI from RTM inversion applied into RS imagery show its potential for accessing good crop biomass and yield proxies. However, absolute yield values will always require in-situ measurements specially from different crop varieties. The advantage of using RTM inversion is to derive an accurate and robust estimation for biophysical variables such as LAI and once calibrated, being potentially applied to different fields of the same calibrated crop for subsequently crop yield estimation.
Finally, research applying physically based radiative transfer models and testing their transferability at different spatial scales, from plot experiments to farmers' fields, are needed to demonstrate the robustness of this approach for plant biophysical attributes retrieval. At plot experiment scales, low altitude remote sensing platforms used for high throughput phenotyping can make use of RTM models for plant traits retrieval, such as canopy chlorophyll content (Delloye et al., 2018) and carotenoids (Berger et al., 2018;Jiang et al., 2018), which can be used as proxies for resistant genotype selection under biotic and/or abiotic stresses. At farmers' field scales, the use of such biophysical plant variables can potentially open unprecedented opportunities for within-field variability assessment of yield and quality of the product, besides supporting crop management practices related to fertilizer recommendations.

Conclusions
A field study was conducted in two different maize fields located in north-western Mexico to investigate the potential of PROSAIL model inversion to estimate LAI, for subsequently biomass and GY assessment. Maize ground LAI measured at different growth stages along with hyperspectral imagery signals were used to calibrate and validate PROSAIL and through its inversion, to retrieve LAI. Finally, retrieved LAI and VIs were used for maize biomass and GY estimations. The main findings can be summarized as follows: 1. PROSAIL model inversion was capable to retrieve maize LAI, presenting reasonable R 2 value (0.5) and considerably low RMSE value (0.8); 2. maize yield estimation showed better performance at the late vegetative growth stages (V16), around 73 DAS; 3. maize biomass and grain yield estimations through NDRE showed higher performance in comparison with retrieved LAI, GNDVI and NDVI, when applied to an independent dataset.

Funding
This work was funded partially by the CGIAR Research Program on Maize (www.maize.org) and by the Spurring Transformation in Agriculture Research (STARS) project under number 1094229-2014 (www. starsproject.org). We're most grateful to Lorena Gonzalez Perez and the Crop Nutrition team for their assistance with field measurements, flight campaign and image processing; to the maize growers Rosario Caballero and Juan Escamilla, for allowing us to access to their crop fields. Much of the data analytics was conducted whilst the senior author was on a visiting scholarship with CIMMYT funded by the University of Padova, Italy.

Declaration of Competing Interest
The authors declare no conflict of interest.