Integration of synthetic aperture radar and optical satellite data for corn biomass estimation

Efforts to use satellites to monitor the condition and productivity of crops, although extensive, can be challenging to operationalize at field scales, in part because higher resolution space-based sensors revisit infrequently relative to the pace of an actively growing crop canopy. The presence of clouds and cloud shadows further impedes the exploitation of high resolution optical sensors for operational monitoring of crop development. The objective of this research was to present an option that provides more frequent observing opportunities for monitoring the accumulation of corn biomass, by integrating biomass products from Synthetic Aperture Radar (SAR) and optical satellite sensors. To accomplish this integration, a transfer function was developed using a Neural Network algorithm to relate corn biomass estimated from SAR to that estimated from optical data. With this approach, end users can exploit biomass products to monitor corn development, regardless of the source of satellite data.
• The Water Cloud Model (WCM) was calibrated (parameterized) for horizontal transmit and horizontal receive (HH) and horizontal transmit and vertical receive (HV) C-band SAR backscatter using a least squares algorithm.
• Biomass and volumetric soil moisture were estimated from dual-polarized RADARSAT-2 images without any ancillary input data.
• A feed-forward backpropagation Neural Network algorithm was trained as a transfer function between the biomass estimates from RADARSAT-2 and the biomass estimates from RapidEye.


Subject area: Agricultural and Biological Sciences
More specific subject area: Crop biophysical parameters modeling
Method name: Empirical model, semi-empirical model, machine learning model
Name and reference of original method: Water Cloud Model [1]. Vegetation modelled as a water cloud. Radio Science, Vol. 13, pp. 357-364.
Resource availability: https://smapvex12.espaceweb.usherbrooke.ca/intranet.php

Method details
The Water Cloud Model (WCM) is a semi-empirical model that has been frequently used by researchers to estimate crop biophysical parameters from SAR data [2, 5, 8]. The compact form of the model is introduced in Eq. (1) [4],
where σ⁰ is the total backscatter in power units, L is biomass, Mv is volumetric soil moisture, θ is the incidence angle, and A, B, C, D, E1 and E2 are the coefficients.
The WCM model has six coefficients (A, B, C, D, E1 and E2) and two unknown variables (i.e. biomass and volumetric soil moisture). The calibration of the model, which parameterizes the six coefficients, and its inversion, which estimates biomass and soil moisture, are explained in the following sections.
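For orientation, one commonly used compact form of the WCM that is consistent with the symbols defined above is sketched below. The arrangement of terms is an assumption rather than a transcription of Eq. (1); in particular, which of C and D multiplies Mv should be checked against [4].

```latex
\sigma^{0} = A\,L^{E_1}\cos\theta
  \left[1-\exp\!\left(\frac{-2B\,L^{E_2}}{\cos\theta}\right)\right]
  + \left(C\,M_v + D\right)\exp\!\left(\frac{-2B\,L^{E_2}}{\cos\theta}\right)
```

In this form the first term represents scattering from the vegetation layer, and the second term represents the soil contribution attenuated by the two-way passage through the canopy.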

WCM model calibration
The WCM model has six coefficients, and therefore calibration of the model requires at least six calibration points, each with ground measurements (i.e. biomass and soil moisture) and satellite observations (i.e. backscatter and incidence angle). However, to develop a robust model, more data are needed over a wide range of biomass and soil moisture conditions. In this research, 23 calibration points were used, with soil moisture ranging from 0.039 m³ m⁻³ to 0.379 m³ m⁻³, dry biomass from 0.003 kg m⁻² to 1.16 kg m⁻², wet biomass from 0.04 kg m⁻² to 7.1 kg m⁻² and SAR incidence angles from 21.025° to 31.9592°. A least squares method [7] was used to calibrate the WCM model. The nlinfit function in MATLAB (version R2018b) was used to run the least squares method and estimate the six coefficients.
[Beta, R] = nlinfit(X, Y, @ModelFun, Beta0)
In the above code, nlinfit is a function that applies the least squares method to a non-linear regression function and estimates its coefficients. Beta is the vector of estimated coefficients and its size is 6 × 1 in this study. R is the vector of residuals of the fitted model, with one element per calibration point (23 × 1). X is the matrix of independent variables, including biomass, soil moisture and incidence angle. The size of this matrix is 23 × 3. Y is a vector (23 × 1) of the dependent variable, which in this study is total backscatter. ModelFun is the function for the WCM model. Beta0 is the vector (6 × 1) of initial values for the six coefficients. In this study, the initial values of the coefficients were random numbers between 0 and 1. The nlinfit function works iteratively, improving the initial coefficients (i.e. Beta0) in every iteration. The iteration terminates when the change in the sum of squares of the residuals falls below the default tolerance value of 10⁻⁸, or the number of iterations reaches 100.
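The calibration call described above can be sketched as follows. This is a minimal illustration and not the authors' script: the data are synthetic stand-ins for the 23 calibration points, the compact WCM form inside `wcm` is assumed, and `betaTrue` holds hypothetical coefficient values used only to generate test data. It requires the Statistics and Machine Learning Toolbox.

```matlab
% Sketch only: synthetic data stand in for the 23 calibration points.
rng(1);                                        % reproducible random numbers
n     = 23;
L     = 0.04  + (7.1   - 0.04)  * rand(n, 1);  % wet biomass, kg m^-2
Mv    = 0.039 + (0.379 - 0.039) * rand(n, 1);  % soil moisture, m^3 m^-3
theta = deg2rad(21 + 11 * rand(n, 1));         % incidence angle, rad
X     = [L, Mv, theta];                        % 23 x 3 matrix of predictors

% Assumed compact WCM form; b = [A B C D E1 E2]
wcm = @(b, X) b(1) .* X(:,1).^b(5) .* cos(X(:,3)) ...
      .* (1 - exp(-2 .* b(2) .* X(:,1).^b(6) ./ cos(X(:,3)))) ...
      + (b(3) .* X(:,2) + b(4)) ...
      .* exp(-2 .* b(2) .* X(:,1).^b(6) ./ cos(X(:,3)));

betaTrue = [0.05; 0.10; 0.30; 0.02; 0.50; 0.80];   % hypothetical values
Y = wcm(betaTrue, X) .* (1 + 0.02 * randn(n, 1));  % noisy backscatter (power)

opts         = statset('nlinfit');             % start from nlinfit defaults
opts.MaxIter = 100;                            % limits quoted in the text
opts.TolFun  = 1e-8;
Beta0        = rand(6, 1);                     % random start in (0, 1)
[Beta, R]    = nlinfit(X, Y, wcm, Beta0, opts);
```

Because the starting values are random, different runs may converge to different coefficient sets; repeating the fit from several starting points and comparing the residual sums of squares is one way to check stability.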

WCM model inversion
A goal of this research was to estimate biomass and soil moisture by inverting the WCM model without requiring any additional input data. Because the WCM model has two unknown variables (i.e. biomass and soil moisture), the model was calibrated (parameterized) for two polarizations, HH and HV. With these two equations (i.e. one for each polarization), both biomass and soil moisture can be derived simultaneously using the Levenberg-Marquardt algorithm [6]. Using the fsolve function in MATLAB, this algorithm was implemented for all calibration and validation points.
In the fsolve call, V is the vector (2 × 1) of estimated variables (i.e. biomass and soil moisture). V0 holds the initial values for the estimated variables and has the same dimensions as V. In this study, the initial values for biomass and soil moisture were 1 kg m⁻² and 0.2 m³ m⁻³, respectively. Fun is a system of two WCM equations (one for each polarization). The fsolve function, like the nlinfit function, needs initial values for the variables and improves these initial values with every iteration. The iterations stopped when the difference between the derived variables of two successive iterations was less than 10⁻⁶, or the number of iterations reached 400.

Table 1. Biomass models based on optical vegetation indices. a1, a2, a3, a4, b1, b2, b3 and b4 are empirically derived coefficients. Separate sets of coefficients were estimated for wet and dry biomass.
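The two-equation inversion described in this section can be sketched in MATLAB as follows. This is an illustration under stated assumptions: the compact WCM form is assumed, the coefficient vectors `bHH` and `bHV` are hypothetical, the backscatter values are simulated from a known state `vTrue`, and the `max(..., eps)` guard is added here only to keep biomass positive during the iterations. It requires the Optimization Toolbox.

```matlab
% Hypothetical coefficient sets [A B C D E1 E2], one per polarization.
bHH = [0.05 0.10 0.30 0.020 0.50 0.80];
bHV = [0.02 0.15 0.05 0.005 0.70 0.60];

% Assumed compact WCM form; v = [biomass; soil moisture], th in radians.
% max(v(1), eps) is a guard added for this sketch to avoid negative biomass.
tau = @(b, v, th) exp(-2 * b(2) * max(v(1), eps)^b(6) / cos(th));
wcm = @(b, v, th) b(1) * max(v(1), eps)^b(5) * cos(th) * (1 - tau(b, v, th)) ...
      + (b(3) * v(2) + b(4)) * tau(b, v, th);

theta = deg2rad(30);                 % incidence angle, rad
vTrue = [2.0; 0.25];                 % state used to simulate the observations
sigHH = wcm(bHH, vTrue, theta);      % simulated HH backscatter (power)
sigHV = wcm(bHV, vTrue, theta);      % simulated HV backscatter (power)

% System of two equations: modelled minus observed backscatter.
Fun = @(v) [wcm(bHH, v, theta) - sigHH; ...
            wcm(bHV, v, theta) - sigHV];

% Levenberg-Marquardt must be requested explicitly; limits follow the text.
opts = optimoptions('fsolve', 'Algorithm', 'levenberg-marquardt', ...
                    'StepTolerance', 1e-6, 'MaxIterations', 400, ...
                    'Display', 'off');
V0 = [1; 0.2];                       % initial biomass and soil moisture
V  = fsolve(Fun, V0, opts);          % V(1) = biomass, V(2) = soil moisture
```

Since fsolve does not use Levenberg-Marquardt by default, setting the `Algorithm` option is what ties the sketch to the method named in the text. Whether the solution matches `vTrue` depends on the coefficients and the starting point, so the result should be checked against the residuals `Fun(V)`.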

Calibration of optical models
The optical models (Table 1) were based on four vegetation indices: Normalized Difference Vegetation Index (NDVI), Red-Edge Triangular Vegetation Index (RTVI), Simple Ratio (SR) and Red-edge Simple Ratio (SRre). These indices were calculated from RapidEye reflectance data. As with the calibration of the WCM model, the nlinfit function in MATLAB was used to calibrate the optical models. In this function, X is a 23 × 1 vector of the vegetation index and Y is a 23 × 1 vector of biomass measurements. ModelFun is the optical model (Table 1). Beta0 is the vector of initial values for the two coefficients and its size is 2 × 1. The initial values of the coefficients were random numbers between 0 and 1. As before, the coefficients were estimated iteratively. The iteration stopped when the sum of squares of the residuals reached the tolerance value of 10⁻⁸, or the number of iterations reached 100.
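As a rough illustration of this step, the sketch below computes three of the indices from band reflectances using their standard definitions and then fits a two-coefficient model with nlinfit. The reflectance values and the 23 synthetic points are invented for the example, and the exponential form `biomass = a * exp(b * VI)` is an assumption, since the model forms of Table 1 are not reproduced here. RTVI is omitted for the same reason.

```matlab
% Hypothetical RapidEye surface reflectances for three pixels.
red     = [0.08; 0.05; 0.04];
redEdge = [0.15; 0.14; 0.13];
nir     = [0.30; 0.40; 0.45];

NDVI = (nir - red) ./ (nir + red);   % Normalized Difference Vegetation Index
SR   = nir ./ red;                   % Simple Ratio
SRre = nir ./ redEdge;               % Red-edge Simple Ratio

% Assumed two-coefficient model form (not taken from Table 1):
% biomass = a * exp(b * VI), with coef = [a; b]
expModel = @(coef, x) coef(1) .* exp(coef(2) .* x);

% Synthetic calibration set of 23 points, for illustration only.
rng(2);
x = 0.2 + 0.7 * rand(23, 1);                              % index values
y = expModel([0.05; 5], x) .* (1 + 0.05 * randn(23, 1));  % noisy biomass

Beta0 = rand(2, 1);                  % random start in (0, 1)
Beta  = nlinfit(x, y, expModel, Beta0);
```

The same call pattern applies to each index in turn, with separate fits for wet and dry biomass as noted in the Table 1 caption.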

Transfer function
A transfer function between the biomass estimates from RADARSAT-2 and the biomass estimates from RapidEye was developed. The purpose of this function is to allow users to derive biomass from satellite data regardless of the source. The transfer function was a two-layer feed-forward backpropagation Neural Network model with 10 hidden neurons [3]. To train the model, the biomass estimates from RADARSAT-2 (from the calibration points) were used as input, with the corresponding estimates from RapidEye as output. The model was trained with the Levenberg-Marquardt algorithm using the MATLAB Neural Net Fitting tool. Of the calibration points, 70% (i.e. 17 points) were used to develop the Neural Network, with the remainder (6 points) reserved to validate the trained model. After the network was developed, it was used to adjust the SAR-based biomass estimates for the 43 validation points, using MATLAB code of the form K = abs(NNModel(L)). Here, K is the vector (43 × 1) of biomass estimates from the Neural Network model, NNModel is the trained Neural Network function, and L is the input to the Neural Network, which contains the biomass estimates from the WCM model. The abs function returns the absolute value of the estimate.
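A command-line equivalent of this workflow might look like the sketch below. It is not the authors' code: the biomass vectors are synthetic, and assigning the held-out 30% to the validation subset (rather than the test subset) is an assumption about how the Neural Net Fitting tool was configured. It requires the Deep Learning Toolbox (Neural Network Toolbox in older releases).

```matlab
rng(3);                                         % reproducible random numbers

% Synthetic stand-ins for the 23 calibration points (samples in columns).
sarBiomass = 0.04 + 7 * rand(1, 23);            % WCM (RADARSAT-2) estimates
optBiomass = 0.9 * sarBiomass + 0.1 * randn(1, 23);  % RapidEye estimates

net = fitnet(10, 'trainlm');                    % 10 hidden neurons, LM training
net.divideParam.trainRatio = 0.70;              % share used for training
net.divideParam.valRatio   = 0.30;              % assumed: remainder validates
net.divideParam.testRatio  = 0;
net.trainParam.showWindow  = false;             % suppress the training GUI
NNModel = train(net, sarBiomass, optBiomass);   % trained transfer function

% Apply the transfer function to 43 synthetic validation inputs.
L = 0.04 + 7 * rand(1, 43);                     % SAR-based biomass estimates
K = abs(NNModel(L))';                           % 43 x 1 adjusted estimates
```

Taking the absolute value keeps the adjusted biomass non-negative, which matters because a small network trained on few points can return slightly negative values near the lower end of the biomass range.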

Supplementary material
The measured dry and wet biomass, measured soil moisture, and satellite observations (HH and HV backscatter, incidence angles and NDVI) are reported for all 66 points (calibration and validation) in Table 2. The table is sorted such that the first 23 points are the calibration points and the remaining 43 points are the validation points.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.