Assessing the benefit of satellite-based Solar-Induced Chlorophyll Fluorescence in crop yield prediction

Large-scale crop yield prediction is critical for early warning of food insecurity, agricultural supply chain management, and economic market prediction. Satellite-based Solar-Induced Chlorophyll Fluorescence (SIF) products have revealed hot spots of photosynthesis over global croplands, such as in the U.S. Midwest. However, the extent to which these satellite-based SIF products can enhance crop yield prediction when benchmarked against other existing satellite data remains unclear. Here we assessed the benefits of using three satellite-based SIF products in yield prediction for maize and soybean in the U.S. Midwest: gap-filled SIF from the Orbiting Carbon Observatory 2 (OCO-2), new SIF retrievals from the TROPOspheric Monitoring Instrument (TROPOMI), and coarse-resolution SIF retrievals from the Global Ozone Monitoring Experiment-2 (GOME-2). Yield prediction performance using SIF data was benchmarked against that using satellite-based vegetation indices (VIs), namely the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), and near-infrared reflectance of vegetation (NIRv), as well as land surface temperature (LST). Five machine-learning algorithms were used to build yield prediction models with both remote-sensing-only and combined climate and remote sensing variables. We found that the high-resolution SIF products from OCO-2 and TROPOMI outperformed the coarse-resolution GOME-2 SIF product in crop yield prediction. Using high-resolution SIF products gave the best forward predictions for both maize and soybean yields in 2018, indicating the great potential of satellite-based high-resolution SIF products for crop yield prediction. However, the currently available high-resolution SIF products did not guarantee consistently better yield prediction performance than other satellite-based remote sensing variables in all the evaluated cases. The relative performance of different remote sensing variables in yield prediction depended on crop type (maize or soybean), out-of-sample testing method (five-fold cross-validation or forward), and record length of the training data. We also found that NIRv generally led to better yield prediction performance than NDVI, EVI, or LST, and achieved similar or even better performance than the OCO-2 or TROPOMI SIF products. We conclude that satellite-based SIF products could be beneficial in crop yield prediction as more high-resolution, good-quality SIF products accumulate in the future.


Introduction
Crop yield forecasting at regional to global scales is important for early warning of food insecurity, agricultural supply chain management, and economic market prediction (Everingham et al. 2002; Hansen and Indeje 2004; Isengildina-Massa et al. 2008). Generally, a crop yield forecasting system can be based on either physical or statistical models. The physical-model-based approach usually uses a crop model to dynamically simulate crop growth and yield formation processes (Brown et al. 2018; Jones et al. 2017; Jones et al. 2003; Peng et al. 2020; Rosenzweig et al. 2013; Shelia et al. 2019). However, due to the complexity and relatively lower performance of these physical models at large scales, statistical models are widely used in operational large-scale crop yield forecasting systems (Chipanshi et al. 2015; Newlands et al. 2014; Peng et al. 2018b).
Statistical crop yield models are data-driven, and thus the type, volume, and quality of input data are among the key factors determining model performance. Earlier studies developing statistical models for crop yield forecasting mainly relied on environmental factors as inputs, such as climate and soil conditions (Legler et al. 1999; Phillips et al. 1999; Potgieter et al. 2002; Qian et al. 2009). Later, satellite data proved to be beneficial in operational crop yield forecasting systems. Using satellite data alone, or adding satellite data to environmental information, generally leads to better yield estimation than traditional statistical crop yield models that use only environmental factors. The application of various remote sensing products across a diverse spectral range in crop yield estimation has been extensively explored (Guan et al. 2017), including (but not limited to) surface reflectance (You et al. 2017), vegetation indices (Bolton and Friedl 2013; Cai et al. 2019; Chipanshi et al. 2015; Johnson 2014; Lobell et al. 2015; Newlands et al. 2014; Peng et al. 2018b), land surface temperature (LST) (Cai et al. 2017; Johnson 2014; You et al. 2017), fraction of photosynthetically active radiation (fPAR) (Bastiaanssen and Ali 2003; Jiang et al. 2004), gross primary productivity (GPP) (He et al. 2018), evapotranspiration (Anderson et al. 2016), active-microwave-based backscattering, and passive-microwave-based vegetation optical depth (Chaparro et al. 2018; Guan et al. 2017).
Satellite-based Solar-Induced Chlorophyll Fluorescence (SIF) has recently been demonstrated to be effective in capturing the spatial and temporal variability of terrestrial carbon uptake (Frankenberg et al. 2011; Guanter et al. 2014; Joiner et al. 2013; Parazoo et al. 2013; Shiga et al. 2018). Although agricultural areas are always hot spots on satellite-based SIF maps during the peak growing season, few studies have directly explored the use of satellite SIF data in crop yield estimation. Guanter et al. (2014) and Guan et al. (2016) were the first to indirectly link SIF retrievals and crop yield. They first estimated GPP through linear scaling, with or without accounting for stoichiometry and photosynthetic pathways, and then benchmarked it against aggregated net primary productivity (NPP) estimated from the production of all crops, rather than against the yield of individual crops, mainly because of the coarse spatial resolution (0.5 degree) of the Global Ozone Monitoring Experiment-2 (GOME-2) gridded SIF products used in their studies. A recent study using SIF products from the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY) and GOME-2 for wheat yield prediction in Australia showed that SIF was no better than the Enhanced Vegetation Index (EVI), largely due to the low spatial and temporal resolution and the low signal-to-noise ratio of these coarse-resolution SIF products (Cai et al. 2019). More recently, the Orbiting Carbon Observatory 2 (OCO-2) and the TROPOspheric Monitoring Instrument (TROPOMI) have provided SIF retrievals at much higher spatial resolutions (Köhler et al. 2018), which opens the opportunity for large-scale crop yield estimation using satellite-based SIF. Although the original OCO-2 SIF product is limited by its sparse sampling swath and long revisit cycle, data-driven gap-filling of OCO-2 SIF can provide spatially continuous, high-resolution (0.05 degree) SIF products (Li and Xiao 2019; Yu et al. 2018; Zhang et al. 2018). However, no studies have yet directly used these satellite-based high-resolution SIF products for crop yield prediction, or compared their performance with that of coarse-resolution SIF products, traditional vegetation indices, or LST. Therefore, the benefits of using satellite-based high-resolution SIF products in operational crop yield estimation remain unclear. Besides the above advancements in generating new high-resolution SIF products, the near-infrared reflectance of vegetation (NIRv) (Badgley et al. 2017), a new combination of red and near-infrared band reflectance from the Moderate Resolution Imaging Spectroradiometer (MODIS), has been found to be well correlated with SIF and can lead to improved GPP estimates relative to the normalized difference vegetation index (NDVI) and fPAR, which indicates that NIRv could also potentially be applied to crop yield estimation. To the best of our knowledge, no studies have yet tested the performance of NIRv in crop yield prediction.
This study aims to assess the potential application of satellite-based high-resolution SIF products from gap-filled OCO-2 and TROPOMI in estimating maize and soybean yield in the U.S. Midwest. The yield prediction performance using satellite-based high-resolution SIF is benchmarked against that using coarse-resolution GOME-2 SIF, several MODIS-based vegetation indices (NDVI, EVI, and NIRv), and LST. Through this assessment and intercomparison, we aim to answer the following questions: (1) Do high-resolution SIF products perform better than coarse-resolution SIF products in crop yield prediction? (2) Do high-resolution SIF products perform better than vegetation indices and LST in crop yield prediction? Answers to these questions could guide the development of operational yield prediction systems using multi-source remote sensing data.

Study Area
We focused on rainfed maize and soybean yield estimation over 12 states in the U.S. Midwest: Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota, and Wisconsin (Fig. 1). In 2018, the harvested area in these 12 states accounted for 87% and 83% of the U.S. total harvested area for maize and soybean, respectively, which corresponded to 89% and 84% of U.S. total maize and soybean production. The rainfed maize and soybean harvested area in these 12 states accounted for 66% and 73% of the U.S. total harvested area, which corresponded to 69% and 74% of U.S. total production in 2018 for corn and soybean, respectively (Fig. 2).

Historical crop yield and acreage data from USDA NASS
We obtained county-level harvested yield and acreage data for both rainfed maize and soybean from the U.S. Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) Quick Stats Database (quickstats.nass.usda.gov). For counties without any irrigated yield records, their yield was considered to be rainfed. For irrigated counties, we only included yield records that were explicitly reported as "nonirrigated" as rainfed yield.

Historical climate data
Historical climate data were obtained from the Parameter-elevation Relationships on Independent Slopes Model (PRISM), which has a 4-km spatial resolution (Daly et al. 2008). We used monthly mean temperature (Tair), precipitation (Prec), and vapor pressure deficit (VPD) as climate variables in the yield prediction models. Tair and Prec are commonly used in building crop yield prediction models as they represent the basic meteorological conditions over a region (Lobell and Burke 2010; Lobell et al. 2015; Peng et al. 2018b). VPD has been found to be a dominant factor indicating crop water stress over the U.S. Midwest (Lobell et al. 2014). Although VPD is highly correlated with Tair, adding VPD to Tair and Prec still improved yield forecasting performance (Peng et al. 2018b). The original 4-km data were aggregated to the county level without differentiating crop types.
B. Peng, et al. Int J Appl Earth Obs Geoinformation 90 (2020)
We used the SIF_OCO2_005 product (Yu et al. 2018), which is a machine learning prediction of OCO-2 nadir SIF using MODIS Nadir Bidirectional Reflectance Distribution Function (BRDF)-Adjusted Reflectance (NBAR) products (MCD43A4 and MCD43C4). The original instantaneous OCO-2 SIF retrievals at 757 nm and 771 nm were scaled to 757 nm using $(\mathrm{SIF}_{757} + 1.5 \times \mathrm{SIF}_{771})/2$ to improve accuracy, and were then converted to daily mean SIF before being used for model training and prediction. Biome- and time-step-specific feedforward artificial neural network (ANN) models were trained and cross-validated using co-located OCO-2 footprints and MODIS NBAR data. This product shows high quality when benchmarked against independent airborne SIF measurements (Yu et al. 2018). The data are available from September 2014 onward, and we used all the data from 2015 to 2018.
We also used the TROPOMI ungridded daily SIF product with a footprint size of about 7 km × 3.5 km at nadir. TROPOMI is on board the Sentinel-5 Precursor (S-5P). SIF retrieval was conducted in a spectral window of 743-758 nm, which is a subset of TROPOMI's band 6 (725-775 nm). TROPOMI SIF data are available from February 2018 onward. We gridded the footprint-level TROPOMI SIF data to 0.05° to match the spatial resolution of the SIF_OCO2_005 product used in this study. A SIF value contributes to a grid cell average if its footprint covers the center of that grid cell. We converted the instantaneous TROPOMI SIF to a daily average by applying the day length correction factor contained in the TROPOMI SIF product, which assumes cloud-free conditions as a first-order approximation (Frankenberg et al. 2011; Köhler et al. 2018).
Besides the SIF_OCO2_005 and TROPOMI SIF products, we also used a coarse-resolution SIF product from GOME-2 (Köhler et al. 2015). GOME-2 is on board EUMETSAT's polar-orbiting Meteorological Operational Satellites (MetOp-A and MetOp-B) and is a nadir-scanning medium-resolution UV/VIS spectrometer with a spectral range between 240 and 790 nm. A subchannel ranging from 720 to 758 nm was used to retrieve the SIF signal at 740 nm from GOME-2 observations (Köhler et al. 2015). Before being gridded into the 0.5° product, the instantaneous SIF retrievals were converted to daily mean values using the daily correction factor approach (Frankenberg et al. 2011), the same approach used for the TROPOMI SIF product. This product has been available since 2007, and we used all data from 2015 to 2018.
Following Köhler et al. (2018), we scaled SIF_OCO2_005 at 757 nm to the TROPOMI SIF retrieval channel (around 740 nm) by multiplying by a factor of 1.56, which was determined from a reference SIF emission shape derived from leaf-level measurements (Magney et al. 2017). The scaled SIF_OCO2_005, the TROPOMI gridded SIF product at 740 nm, and the GOME-2 SIF product were then aggregated to monthly, county-level mean values for maize and soybean separately using crop fractions determined from the yearly cropland data layer (CDL, see section 2.4.3 for details).
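As a rough illustration of the two wavelength adjustments described above, the following Python sketch combines the 757 nm and 771 nm retrievals and then rescales the result to 740 nm. The function names are hypothetical and are not taken from the original study; only the 1.5 weight and the 1.56 factor come from the text.

```python
def combine_oco2_sif(sif_757, sif_771):
    """Combine OCO-2 retrievals at 757 nm and 771 nm into one 757 nm-equivalent value.

    Uses the weighting quoted in the text: (SIF_757 + 1.5 * SIF_771) / 2.
    """
    return (sif_757 + 1.5 * sif_771) / 2.0


def scale_757_to_740(sif_757, factor=1.56):
    """Rescale 757 nm SIF to the ~740 nm TROPOMI retrieval channel.

    The default factor of 1.56 is the value quoted in the text.
    """
    return sif_757 * factor


if __name__ == "__main__":
    combined = combine_oco2_sif(1.0, 0.6)  # illustrative retrieval values
    print(scale_757_to_740(combined))
```

Keeping the two steps as separate functions mirrors the text, where the band combination belongs to the gap-filled product and the 740 nm scaling is applied afterwards for comparability with TROPOMI.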

MODIS data
We used NDVI, EVI, NIRv, and LST data from MODIS as additional remote-sensing-based predictors for crop yield estimation. Both NDVI and EVI were from the Terra 16-day global vegetation indices product with a spatial resolution of 250 m (MOD13Q1.006), which contains the best available vegetation index values from all the MODIS acquisitions within each 16-day period. NIRv was calculated from the daily MODIS Nadir BRDF-Adjusted Reflectance (NBAR) product with a spatial resolution of 500 m (MCD43A4.006) following the definition $\mathrm{NIRv} = \frac{\mathrm{NIR} - \mathrm{R}}{\mathrm{NIR} + \mathrm{R}} \times \mathrm{NIR}$, where NIR and R represent the surface reflectance in the near-infrared and red bands, respectively (Badgley et al. 2017). Daily NIRv was then composited into 16-day data following a maximum-value approach similar to that used for NDVI and EVI. Daytime LST data were from the Aqua 8-day global land surface temperature and emissivity product with a spatial resolution of 1 km (MYD11A2.006). We chose the Aqua daytime LST product because the Aqua satellite crosses the equator at approximately 1:30 P.M. local time, which is closer to the time of maximum canopy temperature, maximum incoming solar radiation, and the most likely stress conditions for crops on clear days than the 10:30 A.M. overpass of the Terra satellite. We did not use the nighttime LST product as it has little correlation with crop yield (Johnson 2014). All 8-day or 16-day MODIS data were first aggregated to a monthly scale and then aggregated to county-level mean values for maize and soybean separately, using crop area fractions determined from the yearly 30 m USDA NASS Cropland Data Layer (CDL, see section 2.4.3 for details). All MODIS pixels with an area fraction of corn or soybean larger than 50% within a county were averaged to obtain the county-level mean values.
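The NIRv calculation and the maximum-value compositing can be sketched as follows. This is a minimal illustration that assumes the commonly cited form NIRv = NDVI × NIR (Badgley et al. 2017) and fixed, non-overlapping 16-day windows; the function names and the handling of missing days are assumptions rather than details from the study.

```python
def nirv(nir, red):
    """Near-infrared reflectance of vegetation, assumed here to be NDVI * NIR."""
    denominator = nir + red
    if denominator == 0:
        return None  # undefined when both reflectances are zero
    ndvi = (nir - red) / denominator
    return ndvi * nir


def max_value_composite(daily_values, window=16):
    """Composite a daily series into fixed windows by keeping the maximum value.

    Missing days are passed as None; a window with no valid day yields None.
    """
    composites = []
    for start in range(0, len(daily_values), window):
        valid = [v for v in daily_values[start:start + window] if v is not None]
        composites.append(max(valid) if valid else None)
    return composites


if __name__ == "__main__":
    # Illustrative reflectances for three days, composited in one window
    daily = [nirv(0.40, 0.08), None, nirv(0.45, 0.06)]
    print(max_value_composite(daily, window=16))
```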

CDL from USDA NASS
The USDA NASS CDL was used to aggregate the remote sensing variables to the county level for corn and soybean separately. The CDL is a yearly, multi-satellite-based crop type classification product generated with a supervised decision-tree classifier at 30 m spatial resolution. The classification accuracy for maize and soybean is above 95% over the U.S. Midwest (Boryan et al. 2011). For MODIS VIs and LST data, we averaged all the MODIS pixels with a corn or soybean fraction larger than 50% within a county. For SIF data, we computed a weighted average of all the 5 km grid cells within a county, using the corn or soybean area fraction as weights.
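The two aggregation rules described above can be illustrated with a short Python sketch. It assumes that each county is represented by parallel lists of pixel (or grid cell) values and crop area fractions; the function names and this data layout are hypothetical.

```python
def thresholded_county_mean(values, crop_fractions, min_fraction=0.5):
    """Plain mean over pixels whose crop fraction exceeds the threshold (MODIS VIs and LST)."""
    selected = [v for v, f in zip(values, crop_fractions)
                if v is not None and f > min_fraction]
    return sum(selected) / len(selected) if selected else None


def weighted_county_mean(values, crop_fractions):
    """Mean over grid cells weighted by crop area fraction (SIF)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for v, f in zip(values, crop_fractions):
        if v is None or f <= 0:
            continue  # skip missing cells and cells without the crop
        weighted_sum += v * f
        weight_total += f
    return weighted_sum / weight_total if weight_total > 0 else None


if __name__ == "__main__":
    print(thresholded_county_mean([0.25, 0.5, 1.0], [0.4, 0.6, 0.9]))
    print(weighted_county_mean([1.0, 3.0], [0.25, 0.75]))
```

The threshold rule suits fine-resolution pixels that can be dominated by one crop, whereas weighting suits 5 km cells that almost always mix several land covers.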

Crop yield model development
We used five different machine learning algorithms to develop the crop yield model: least absolute shrinkage and selection operator regression (LASSO) (Tibshirani 1996), ridge regression (RIDGE) (Hoerl and Kennard 1970), support vector regression (SVR) (Smola and Schölkopf 2004), random forest regression (RF) (Breiman 2001), and artificial neural network (ANN) (Gardner and Dorling 1998; Specht 1991). LASSO and RIDGE are both regularized regression methods; they differ in that LASSO uses L1 regularization while RIDGE uses L2 regularization (Fu 1998; Tibshirani 1996). Both LASSO and RIDGE have the same penalty parameter to be tuned. SVR is a kernel-based regression method that solves nonlinear regression problems by mapping the data to a higher-dimensional space through a kernel function. We used the radial basis function (RBF) kernel for SVR (Suykens and Vandewalle 1999) as it usually gives better accuracy than linear and polynomial kernels. RF is a binary-tree-based machine learning algorithm that builds an ensemble of decision trees with different subsets of variables. The ANN is based on a collection of artificial neurons, which loosely model the neurons in a biological brain: they receive inputs, change their internal states (activations) according to the inputs, and produce outputs depending on the inputs and activations. We used the multilayer perceptron (MLP) regressor (Gardner and Dorling 1998), which is a feedforward ANN trained using backpropagation with no activation function in the output layer. We chose the L-BFGS method to optimize the squared loss as it converges faster and performs better for small datasets. All these methods have been previously explored for crop yield estimation at various scales (Cai et al. 2019; Jeong et al. 2016; Jiang et al. 2004).
The main purpose of using multiple algorithms with varied complexity here is to test whether the differences in yield predictability using different remote sensing variables are consistent when using different algorithms to build the crop yield model.
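For reference, the difference between the L1 and L2 penalties mentioned above can be written in one common textbook form. The notation is an assumption for illustration: $y_i$ is the yield of sample $i$, $x_i$ its predictor vector, $\beta$ the coefficients, and $\lambda \ge 0$ the penalty parameter to be tuned. Software packages may scale the loss term differently.

```latex
\[
\hat{\beta}_{\mathrm{LASSO}} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^2 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\]
\[
\hat{\beta}_{\mathrm{RIDGE}} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^2 + \lambda \sum_{j=1}^{p} \beta_j^{2}
\]
```

The L1 penalty can shrink some coefficients exactly to zero, which acts as variable selection, while the L2 penalty shrinks all coefficients smoothly without removing any.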

Experiment design
We conducted two groups of experiments: one group used only remote sensing variables, while the other used both climate and remote sensing variables as predictors. Remote sensing variables included monthly SIF, NDVI, EVI, NIRv, and LST during the growing season. In the second group of experiments, we used monthly air temperature, precipitation, and VPD during the growing season as climate variables, as these variables combined provide reasonable prediction performance among all the climate-only models (Peng et al. 2018b). The growing season in this study was defined as May to September, which aligns with the actual growing season of corn and soybean in the U.S. Midwest. All variables were standardized by removing their mean values and scaling to unit variance before being used for model training and testing, which helps avoid poor performance when individual features do not resemble standard normally distributed data.
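The standardization step can be sketched in a few lines of Python. This is an illustrative version under one assumption: the mean and standard deviation are estimated on the training data and then reused for the testing data, which the text does not state explicitly. Function names are hypothetical.

```python
import math


def fit_standardizer(values):
    """Return the mean and population standard deviation of one predictor column."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    # Guard against constant columns, which would otherwise divide by zero
    return mean, (std if std > 0 else 1.0)


def apply_standardizer(values, mean, std):
    """Remove the mean and scale to unit variance using previously fitted statistics."""
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    train = [18.0, 21.0, 24.0]  # illustrative monthly temperatures
    test = [20.0, 26.0]
    mean, std = fit_standardizer(train)
    print(apply_standardizer(train, mean, std))
    print(apply_standardizer(test, mean, std))
```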
Two out-of-sample validation methods were used to quantify yield estimation performance: the repeated five-fold cross-validation (FFCV) method and the forward method. The repeated FFCV method runs FFCV n times; each run randomly splits the whole dataset into 5 folds and, within the FFCV loop, uses 4 folds for training and 1 fold for testing. We chose n = 100, corresponding to 500 training-testing splits in total, which balanced accuracy against computational burden. The forward method used all data from years before the prediction year as the training dataset. For both the repeated FFCV and forward methods, all five algorithms were automatically optimized by tuning their hyperparameters using FFCV on their training dataset. The training data were shuffled in a consistent way to avoid the impact of internal structures (both spatial and temporal) in the training data on FFCV. The prediction performance was then assessed using the testing dataset. We used the coefficient of determination (R²), root mean square error (RMSE), and mean absolute bias (MAB) as statistical metrics in performance assessment. For the repeated FFCV method, we reported both the mean and standard deviation of these metrics evaluated over the 500 training-testing splits.
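The two validation schemes and the three metrics can be sketched as below. This is a simplified illustration, not the study's code: hyperparameter tuning is omitted, R² is assumed to be 1 − SS_res/SS_tot, MAB is assumed to be the mean of |predicted − observed|, and all function names are hypothetical.

```python
import math
import random


def repeated_kfold_splits(n_samples, n_folds=5, n_repeats=100, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        for k in range(n_folds):
            test_idx = order[k::n_folds]
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield train_idx, test_idx


def forward_split(years, prediction_year):
    """Train on all records before the prediction year, test on that year."""
    train_idx = [i for i, y in enumerate(years) if y < prediction_year]
    test_idx = [i for i, y in enumerate(years) if y == prediction_year]
    return train_idx, test_idx


def r_squared(observed, predicted):
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mab(observed, predicted):
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


if __name__ == "__main__":
    splits = list(repeated_kfold_splits(n_samples=50))
    print(len(splits))  # 100 repeats x 5 folds
    print(forward_split([2015, 2016, 2017, 2018], prediction_year=2018))
```

With the defaults, the generator produces the 500 training-testing splits described in the text, so the mean and standard deviation of each metric can be taken over that list.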
To better demonstrate the benefit of using SIF_OCO2_005 and TROPOMI SIF in yield prediction, we evaluated the performance using data during 2015-2018 (4-year case hereafter) and only in 2018 (1-year case hereafter), mainly considering the data availability of SIF _

Correlation of crop yield with climate and remote sensing variables in 2018
The spatial patterns of crop yield in 2018 were better correlated with remote sensing variables (SIF, NDVI, EVI, NIRv, and LST) than with climate factors (Tair, Prec, and VPD). Among the tested remote sensing variables, SIF from OCO-2 and TROPOMI, EVI, and NIRv in July, and SIF from OCO-2 and TROPOMI and NIRv in August, showed correlation coefficients larger than 0.8 with maize yield. Similarly, SIF from OCO-2 and TROPOMI and NIRv in August also showed correlation coefficients larger than 0.8 with soybean yield. The correlation coefficient between GOME-2 SIF and crop yield was smaller than those between OCO-2 or TROPOMI SIF and crop yield, and sometimes even ranked the lowest among all the remote sensing variables, such as in August for both maize and soybean. LST was negatively correlated with crop yield. For maize, the correlation coefficient between LST and yield was larger in July than in August. For soybean, the correlation coefficient between LST and yield was larger in August than in July. Among the three climate factors, VPD was negatively correlated with crop yield and precipitation was positively correlated with crop yield for both corn and soybean, while air temperature was negatively correlated with yield for maize and positively correlated with yield for soybean. VPD showed higher correlation coefficients than Tair and Prec with both maize and soybean yields. For example, the correlation coefficients between VPD and maize yield were -0.69 and -0.60 in July and August, respectively, while those between Tair and maize yield were only -0.20 and -0.03.
There were also strong correlations among the climate and remote sensing variables themselves. Strong positive correlations were observed among SIF and the VIs for both maize and soybean, while LST was negatively correlated with the other remote sensing variables. VPD was also negatively correlated with all the remote sensing variables except LST, with which it was positively correlated, indicating that LST and VPD are good crop stress indicators when crop growth conditions are sub-optimal. Compared with VPD, the correlations between Tair and the remote sensing variables were relatively weak.
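The correlation coefficients discussed above can be reproduced in principle with a Pearson correlation across counties. The sketch below assumes Pearson's r was the statistic used, which the text does not state; the function name and sample numbers are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return None  # undefined for a constant series
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    july_vpd = [12.0, 15.0, 18.0, 21.0]          # illustrative county values
    maize_yield = [190.0, 185.0, 170.0, 150.0]   # illustrative yields, bu/acre
    print(pearson_r(july_vpd, maize_yield))
```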

FFCV of yield prediction performance using only remote sensing variables
We first evaluated the testing performance of the models using only remote sensing variables with the FFCV out-of-sample validation method. The results for training and testing with data during 2015-2018 and in 2018 only are shown in Figs. 3 and 4, respectively. For maize and soybean yield prediction during 2015-2018, NIRv performed consistently better than the other remote sensing variables, with the highest R² and lowest RMSE. The performance of SIF_OCO2_005 in maize yield prediction was slightly better than that of NDVI, EVI, and LST, while GOME-2 SIF had the lowest R² and largest RMSE in crop yield prediction among all the remote sensing variables (Fig. 3). When these models were trained and tested using data in 2018 only, GOME-2 SIF still showed the lowest performance, while the performances of the other remote sensing variables were quite similar, especially when using nonlinear machine learning algorithms. Overall, we still observed that NIRv, SIF_OCO2_005, and TROPOMI SIF performed better than the other remote sensing variables (GOME-2 SIF, NDVI, EVI, and LST) in maize and soybean yield prediction (Fig. 4). For maize, NIRv performed consistently better than the other remote sensing variables, with the highest R² and lowest RMSE across the five algorithms. For soybean, SIF_OCO2_005 had a slightly better mean performance than NIRv and TROPOMI SIF, but the performance differences among these three variables were marginal. Results from the MAB metric were consistent with the above results from R² and RMSE (Figs. S1 and S2), i.e., SIF_OCO2_005, TROPOMI SIF, and NIRv had the lowest MAB among all the remote sensing variables for both maize and soybean yield prediction.

FFCV of yield prediction performance using combined climate and remote sensing variables
We then evaluated the testing performance of the models using combined climate and remote sensing variables with the FFCV out-of-sample validation method. The results for training and testing with data during 2015-2018 and in 2018 only are shown in Figs. 5 and 6, respectively. For both maize and soybean yield prediction during 2015-2018, NIRv performed best among all the remote sensing variables, with the highest R² and lowest RMSE (Fig. 5). SIF_OCO2_005 showed performance similar to that of EVI or NDVI in yield prediction for both maize and soybean. When the models were trained and tested using data in 2018 only, NIRv still performed best for maize yield prediction. For soybean yield prediction, NIRv performed best with linear yield prediction algorithms, while SIF_OCO2_005 and the VIs performed similarly with nonlinear yield prediction algorithms. As in the remote-sensing-only experiments, GOME-2 SIF had the lowest performance in crop yield prediction with the combined climate and remote sensing models. MAB results showed that SIF_OCO2_005, TROPOMI SIF, NDVI, EVI, and NIRv had comparable MABs, especially in soybean yield prediction with the combined climate and remote sensing models, while GOME-2 SIF and LST had much larger MABs (Figs. S3 and S4).

Forward yield prediction in 2018
The spatial patterns of forward yield prediction in 2018 using different combined climate and remote sensing variables and random forest models are shown in Figs. 7 and 8 for maize and soybean, respectively. We show the results from the random forest models because they had the best yield prediction performance, as shown in Figs. 3 to 6. The models were trained using data from 2015-2017. For SIF, we trained the model using SIF_OCO2_005 data during 2015-2017 and validated it using both SIF_OCO2_005 and TROPOMI data in 2018. For both maize and soybean, using the SIF_OCO2_005 and TROPOMI SIF products gave the best yield prediction performance. For example, using SIF_OCO2_005 in yield prediction in 2018 achieved an R² of 0.77 and RMSE of 18.11 bu/acre (1.14 t/ha) for maize, and an R² of 0.78 and RMSE of 5.31 bu/acre (0.36 t/ha) for soybean. Using TROPOMI SIF gave performance similar to that of SIF_OCO2_005 in 2018, with some degradation because the models were trained using SIF_OCO2_005. Using GOME-2 SIF gave the lowest performance in maize yield prediction, with an R² of 0.53 and RMSE of 25.95 bu/acre (1.63 t/ha), while its performance was slightly better than that of LST for soybean yield prediction. Apart from SIF, NIRv performed best among the remote sensing variables.
The bias distributions showed that using SIF_OCO2_005, TROPOMI SIF, and NIRv led to narrower bias distributions more closely centered on zero than the other remote sensing variables for both corn and soybean (Fig. 9).

Fig. 3. Testing performance of maize (top panels) and soybean (bottom panels) yield prediction using only remote sensing variables, evaluated with the five-fold cross-validation method during 2015-2018. The performance metrics (left panels for R² and right panels for RMSE) are calculated for 500 random training-testing splits, and both means (filled bars) and standard deviations (error bars) of the metrics are derived. SIF(OCO2) and SIF(GOME2) in the legend represent the SIF_OCO2_005 and GOME-2 SIF products, respectively.

Potential of using satellite-based SIF products in crop yield prediction
In this study, we demonstrated that using high-resolution SIF products from OCO-2 and TROPOMI could significantly improve yield prediction performance compared with using the coarse-resolution SIF product from GOME-2. This is mainly because the higher resolution of SIF_OCO2_005 and TROPOMI SIF enables better quantification of SIF signals from cropland, and because GOME-2 has a lower signal-to-noise ratio. Our study also demonstrated that high-resolution SIF products from OCO-2 and TROPOMI could bring benefits in crop yield prediction. For example, SIF products from OCO-2 and TROPOMI achieved the best yield prediction performance for both maize and soybean with either five-fold cross-validation in 2018 (Fig. 4) or forward prediction in 2018 (Figs. 7, 8, and 9). However, our results also showed that using the current high-resolution SIF products did not guarantee consistently better yield prediction performance than using other remote sensing variables in all the evaluated cases. The relative performance of different remote sensing variables in yield prediction depended on crop type (maize or soybean), out-of-sample testing method (five-fold cross-validation or forward), and length of the training data.

Fig. 4. Testing performance of maize (top panels) and soybean (bottom panels) yield prediction using only remote sensing variables, evaluated with the five-fold cross-validation method in 2018. The performance metrics (left panels for R² and right panels for RMSE) are calculated for 500 random training-testing splits, and both means (filled bars) and standard deviations (error bars) of the metrics are derived. SIF(OCO2), SIF(TROPOMI), and SIF(GOME2) in the legend represent the SIF_OCO2_005, TROPOMI, and GOME-2 SIF products, respectively.
However, considering that the high-reslution SIF products we used here have a spatial resolution of 5 km while other MODIS-based variables are at finer (≤ 1 km) resolutions, we are still optimistic in the performance of using SIF data for crop yield prediction since higher spatial resolution of SIF data would allow better separation of corn and soybean than the current 5 km data we used in this study. There are several possible ways that can lead to potential improvement for the yield prediction performance using SIF. Firstly, different ways of using SIF data in building crop yield prediction models may lead to different performances. For example, considering that the SIF signal is integrable over time, we may also use growing season accumulated SIF or maximum SIF in crop yield prediction. Converting satellite-observed SIF (angular SIF from top canopy) into whole-canopy total emitted SIF has been found to better correlated with canopy photosynthesis (Liu et al. 2019;Yang and van der Tol 2018;Zeng et al. 2019), which may improve the crop yield prediction too. The total emitted SIF from chlorophyll is attenuated by reabsorption and scattering within the leaf and canopy making the observed canopy SIF is a variable fraction of total emitted SIF. The conversion from satelliteobserved SIF to total emitted SIF is non-trivial, as we need to estimate the escape ratio, which is determined by sun-sensor geometry, canopy structure, and leaf optical properties. Recent work by Zeng et al. (2019) proposed a practical approach to approximate the escape ratio for near infrared SIF using the NIRv-to-fPAR ratio, making conversion from satellite-observed SIF to total emitted SIF feasible at large scale. To test the performances of these alternative ways of using SIF data in crop yield prediction, we compared the performances of using monthly SIF, growing season maximum and accumulated SIF, and the monthly total emitted SIF from SIF _ OCO2 005 in yield prediction during 2015-2018 (Fig. 
10). We used scaling relationships between NDVI and fPAR to calculate fPAR (Gitelson et al. 2014), and subsequently estimated the escape ratio and total emitted SIF (Zeng et al. 2019). We observed that using monthly mean SIF actually achieved better performance than the alternative ways of using SIF data for both maize and soybean yield prediction. The reason may be that monthly mean SIF provides more temporal information to capture crop stress at various stages than growing season maximum and accumulated SIF, and that the estimations of both fPAR and escape ratio introduce uncertainties when deriving the total emitted SIF. Whether total emitted SIF could improve yield prediction performance once better fPAR estimates and new approximations of the escape ratio become available deserves further investigation.

Testing performance of maize (top panels) and soybean (bottom panels) yield prediction using combined climate and remote sensing variables, evaluated with the five-fold cross-validation method during 2015-2018. The performance metrics (left panels for R² and right panels for RMSE) are calculated for 500 random training-testing splits, and then both the means (filled bars) and standard deviations (error bars) of the metrics are derived. SIF(OCO2) and SIF(GOME2) in the legend represent SIF_OCO2_005 and GOME-2 SIF products, respectively.

B. Peng, et al. Int J Appl Earth Obs Geoinformation 90 (2020) 102126

Secondly, we note that the yield prediction performance reported here was optimized with only limited training data. For a fair comparison between SIF and other remote sensing predictors, we only used data from 2015 to 2018 for model training, which may not be enough for operational applications aiming at higher yield prediction performance. With more training data, the yield prediction performance of SIF products may be further improved.
To test this hypothesis, we trained random forest models with different numbers of years of training data before 2018 and tested the model performances using data from 2018 (Fig. 11). The models used EVI, NDVI, NIRv, and LST as remote sensing variables, all of which have been available since the early 2000s. These experiments were restricted to six states (Illinois, Indiana, Iowa, Nebraska, North Dakota, and Wisconsin) with CDL data available since 2003. For maize yield prediction, an additional 4 to 6 years of training data (7 to 9 years in total) further improved the yield prediction performance in 2018. Beyond that, the yield prediction performance was relatively stable, and more training data did not necessarily lead to better performance. For soybean, the change in yield prediction performance with increasing years of training data was relatively noisy. These results suggest that the maize yield prediction performance using SIF could be further improved as more training data accumulate. Besides new data from OCO-2/3 and TROPOMI, recent advances in developing gap-filled OCO-2 SIF products since 2000 (Li and Xiao 2019) and attempts to reconcile inconsistencies among multi-sensor observations over the last two decades (Parazoo et al. 2019; Wen et al. 2020) may also help accumulate more SIF data for training crop yield prediction models, although the uncertainties in the reconstructed SIF data may propagate to the final yield prediction, which needs further investigation in future studies.
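The expanding-window experiment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `training_windows` and `learning_curve`, the record layout, and the `fit`/`score` callbacks are hypothetical, and the toy records are synthetic. The sketch only shows how the x-axis value n in Fig. 11 maps to the n years immediately before the test year.

```python
def training_windows(test_year=2018, first_year=2003):
    """Map n -> the n consecutive years immediately before test_year.

    For example, n = 3 gives 2015-2017 and n = 15 gives 2003-2017.
    """
    max_years = test_year - first_year
    return {n: list(range(test_year - n, test_year))
            for n in range(1, max_years + 1)}


def learning_curve(records, fit, score, test_year=2018, first_year=2003):
    """Train on each window and evaluate on the held-out test year.

    records: list of dicts, each with at least a "year" key.
    fit:     callable(train_records) -> model          (hypothetical interface)
    score:   callable(model, test_records) -> float    (hypothetical interface)
    """
    test = [r for r in records if r["year"] == test_year]
    curve = {}
    for n, years in training_windows(test_year, first_year).items():
        train = [r for r in records if r["year"] in years]
        curve[n] = score(fit(train), test)
    return curve


# Synthetic toy data: one record per year, with a made-up yield value.
toy_records = [{"year": y, "yield": float(y - 2000)} for y in range(2003, 2019)]

# Stand-in "model": the mean training yield. A real study would fit a
# random forest (or another regressor) on climate and remote sensing inputs.
def mean_fit(train):
    return sum(r["yield"] for r in train) / len(train)

def abs_error(model, test):
    return abs(model - test[0]["yield"])

toy_curve = learning_curve(toy_records, mean_fit, abs_error)
```

In practice `fit` would wrap the chosen regressor and `score` would return R² or RMSE on the 2018 data, so that the resulting dictionary can be plotted against the number of training years.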
Thirdly, better quality of future SIF products may further improve the performance in yield prediction. New satellite missions, such as the FLuorescence EXplorer (FLEX) (Drusch et al. 2016), can provide SIF products with higher spatial resolutions than existing SIF products. Statistical downscaling also has the potential to further improve the spatial resolution of existing SIF products, although previous efforts mainly focused on downscaling coarse-resolution SIF products, such as those from GOME-2 (Duveiller and Cescatti 2016; Duveiller et al. 2019).

Fig. 6. Testing performance of maize (top panels) and soybean (bottom panels) yield prediction using combined climate and remote sensing variables, evaluated with the five-fold cross-validation method in 2018. The performance metrics (left panels for R² and right panels for RMSE) are calculated for 500 random training-testing splits, and then both the means (filled bars) and standard deviations (error bars) of the metrics are derived. SIF(OCO2), SIF(TROPOMI), and SIF(GOME2) in the legend represent SIF_OCO2_005, TROPOMI, and GOME-2 SIF products, respectively.
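As a rough illustration of the alternative SIF predictors discussed under the first point, the sketch below derives growing season maximum and accumulated SIF from monthly values, and converts observed SIF to total emitted SIF by dividing by an escape ratio approximated as NIRv/fPAR (after Zeng et al. 2019). Several things here are assumptions rather than the authors' implementation: the function names are hypothetical, accumulated SIF is taken as an unweighted sum of monthly means, any constant angular factor in the conversion is omitted, fPAR is taken as a given input rather than derived from NDVI, and all numbers are made up.

```python
def seasonal_sif_features(monthly_sif):
    """Summarize monthly SIF (e.g. May to September) into two predictors.

    Returns the growing season maximum and the accumulated (summed) SIF.
    Summing monthly means without day-count weights is a simplification.
    """
    return {"max": max(monthly_sif), "accumulated": sum(monthly_sif)}


def escape_ratio(nirv, fpar, eps=1e-6):
    """Approximate the near-infrared SIF escape ratio as NIRv / fPAR.

    eps guards against division by zero for pixels with near-zero fPAR.
    """
    return nirv / max(fpar, eps)


def total_emitted_sif(sif_obs, nirv, fpar, eps=1e-6):
    """Convert observed canopy SIF to total emitted SIF.

    Observed SIF is treated as (escape ratio) x (total emitted SIF), so the
    total is recovered by division. Constant angular factors are ignored.
    """
    fesc = max(escape_ratio(nirv, fpar, eps), eps)
    return sif_obs / fesc


# Made-up monthly SIF values for May to September at one location.
example_monthly_sif = [0.4, 0.9, 1.5, 1.2, 0.6]
example_features = seasonal_sif_features(example_monthly_sif)

# Made-up July inputs: observed SIF 1.5, NIRv 0.3, fPAR 0.6.
example_total = total_emitted_sif(1.5, 0.3, 0.6)
```

Because the conversion divides by NIRv/fPAR, errors in either quantity pass directly into the total emitted SIF, which is consistent with the uncertainty noted above for this predictor.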

Performance variation among different remote sensing variables
We also observed differences in the yield prediction performances when using different VIs and LST. Among the tested VIs, NIRv had the best overall performance in predicting maize and soybean yield, indicating the great potential of NIRv in crop yield prediction. Compared with traditional remote-sensing-based VIs, NIRv has a more direct physical interpretation, as it approximates the proportion of NIR light reflected by the vegetation canopy (Badgley et al. 2017). NIRv also minimizes the impacts of soil background and sun-canopy-sensor geometry (Badgley et al. 2017). To our knowledge, our study is the first to use NIRv for crop yield prediction at large scales. NDVI seemed to be a good indicator for predicting soybean yield, but not maize yield. We also found that LST showed better predictability for maize yield than for soybean yield (Fig. 11), which may be partly because soybean yield is less sensitive to high temperature and VPD than maize yield, mainly owing to the relatively higher optimal growth temperature and stable sowing density of soybean over the last two decades (Lobell et al. 2014).
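For reference, the three VIs compared here can be computed from surface reflectances as in the sketch below. It uses the standard NDVI and EVI formulas and takes NIRv as the product of NDVI and NIR reflectance (Badgley et al. 2017). The function names and the reflectance values are illustrative assumptions, and the sketch does not reproduce the MODIS processing used in this study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue):
    """Enhanced vegetation index with the standard MODIS coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def nirv(nir, red):
    """Near-infrared reflectance of vegetation: NDVI times NIR reflectance."""
    return ndvi(nir, red) * nir


# Made-up reflectances for a dense crop canopy.
example_nir, example_red, example_blue = 0.4, 0.1, 0.05
example_vis = {
    "NDVI": ndvi(example_nir, example_red),
    "EVI": evi(example_nir, example_red, example_blue),
    "NIRv": nirv(example_nir, example_red),
}
```

Multiplying NDVI by NIR reflectance scales the index by how bright the scene is in the near-infrared, which is how NIRv approximates the portion of NIR reflectance attributable to vegetation.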

Performance variation among different machine-learning algorithms
Although we did not aim to compare the performances of different machine-learning algorithms in crop yield estimation in this study, we did see performance differences among the five selected algorithms. Generally, the nonlinear algorithms (RF, SVM, and ANN) performed better than the linear algorithms (LASSO and RIDGE). LASSO and RIDGE performed similarly, and the three nonlinear algorithms achieved similar performances in yield prediction for both maize and soybean, which is consistent with previous studies (Cai et al. 2019). Other advanced machine-learning algorithms, such as deep learning algorithms (Oliveira et al. 2018; You et al. 2017), may have the potential to improve the absolute performance in crop yield prediction. However, we expect that the relative performance of SIF and other remote-sensing-based predictors would not change even when more advanced algorithms are used to build crop yield prediction models, although this needs further testing.

Fig. 7. Comparison between the spatial patterns of observed and predicted maize yield in 2018 using the random forest model with combined climate and remote sensing variables as inputs. The models were trained using data from 2015-2017. SIF(OCO2), SIF(TROPOMI), and SIF(GOME2) represent SIF_OCO2_005, TROPOMI, and GOME-2 SIF products, respectively. For SIF_OCO2_005 and TROPOMI SIF, we trained the model using SIF_OCO2_005 data during 2015-2017, and validated the model using both SIF_OCO2_005 and TROPOMI data in 2018. RMSE values outside and inside the parentheses are in bu/acre and t/ha, respectively.

Conclusion
With more satellite-based high-resolution SIF products becoming available, there is a need to assess the potential benefits of using these SIF products in operational crop yield prediction. In this study, we evaluated the relative performances of high-resolution SIF products from OCO-2 and TROPOMI, the coarse-resolution SIF product from GOME-2, and MODIS-based VIs (NDVI, EVI, and NIRv) and LST in predicting maize and soybean yield in the U.S. Midwest. Both remote-sensing-only and climate-remote-sensing-combined yield prediction models were built using five machine-learning algorithms: LASSO, RIDGE, SVM, RF, and ANN. We found that the high-resolution SIF products from OCO-2 and TROPOMI outperformed the coarse-resolution SIF product from GOME-2 in yield prediction. We also found that the high-resolution SIF products from OCO-2 and TROPOMI gave the best forward predictions for both maize and soybean yields in 2018, indicating the great potential of satellite-based high-resolution SIF products for crop yield prediction. However, the currently available high-resolution SIF products did not guarantee consistently better yield prediction performance than other satellite-based remote sensing variables in all the evaluated cases, indicating that there are still opportunities to improve the quality and resolution of satellite-based high-resolution SIF products. We also found that NIRv generally led to better yield prediction performance than NDVI, EVI, or LST, and achieved similar or even better yield prediction performance than the two high-resolution SIF products. These findings indicate that satellite-based high-resolution SIF products could become more beneficial for crop yield prediction as more high-resolution, good-quality SIF data accumulate in the future, and that NIRv is very promising for crop yield prediction. To the best of our knowledge, this study is the first to compare the yield prediction performances of different SIF products (high-resolution versus coarse-resolution) with those of optical-based VIs (including the newly developed NIRv) and thermal-based LST, which can provide insights for developing operational crop yield forecasting systems using multi-source remote sensing data. Similar studies outside the U.S. Corn Belt are also needed to assess the performances of different remote sensing data in crop yield prediction.

Fig. 8. Comparison between the spatial patterns of observed and predicted soybean yield in 2018 using the random forest model with combined climate and remote sensing variables as inputs. The models were trained using data from 2015-2017. SIF(OCO2), SIF(TROPOMI), and SIF(GOME2) represent SIF_OCO2_005, TROPOMI, and GOME-2 SIF products, respectively. For SIF_OCO2_005 and TROPOMI SIF, we trained the model using SIF_OCO2_005 data during 2015-2017, and validated the model using both SIF_OCO2_005 and TROPOMI data in 2018. RMSE values outside and inside the parentheses are in bu/acre and t/ha, respectively.

Fig. 9. Bias distribution of predicted maize (left) and soybean (right) yield in 2018 using the random forest model with combined climate and remote sensing variables as inputs. The models were trained using data from 2015-2017. SIF(OCO2), SIF(TROPOMI), and SIF(GOME2) represent SIF_OCO2_005, TROPOMI, and GOME-2 SIF products, respectively. For SIF_OCO2_005 and TROPOMI SIF, we trained the model using SIF_OCO2_005 data during 2015-2017, and validated the model using both SIF_OCO2_005 and TROPOMI data in 2018.

Fig. 10. Testing performance of maize (top panels) and soybean (bottom panels) yield prediction using climate variables plus monthly SIF, growing season maximum and accumulated SIF during May to September, or monthly total emitted SIF from SIF_OCO2_005. The performances were evaluated with the five-fold cross-validation method during 2015-2018. The performance metrics (left panels for R² and right panels for RMSE) are calculated for 500 random training-testing splits, and then both the means (filled bars) and standard deviations (error bars) of the metrics are derived.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 11. Performance change with increasing years of training data. The random forest models with climate-remote-sensing-combined variables were trained using data before 2018 and validated using data in 2018. The numbers on the x-axis represent the number of years before 2018. For example, 3 and 15 on the x-axis mean that the training data are from 2015 to 2017 and from 2003 to 2017, respectively. Note that model training and validation were conducted over six states with CDL data since 2003: Illinois, Indiana, Iowa, Nebraska, North Dakota, and Wisconsin.

nassgeodata.gmu.edu/CropScape/. MODIS products are available at https://e4ftl01.cr.usgs.gov/. TROPOMI footprint SIF data is available at ftp://fluo.gps.caltech.edu/data/tropomi/ungridded/. SIF_OCO2_005 is available at https://cornell.app.box.com/s/cavtg50y80udbdirg022gm5whugmth02. GOME-2 SIF product is available at ftp://fluo.gps.caltech.edu/data/Philipp/GOME-2/. PRISM weather data is available at http://www.prism.oregonstate.edu/.