Establishment of Plot-Yield Prediction Models in Soybean Breeding Programs Using UAV-Based Hyperspectral Remote Sensing

: Yield evaluation of breeding lines is the key to successful release of cultivars, which is becoming a serious issue due to soil heterogeneity in enlarged ﬁeld tests. This study aimed at establishing plot-yield prediction models using unmanned aerial vehicle (UAV)-based hyperspectral remote sensing for yield-selection in large-scale soybean breeding programs. Three sets of soybean breeding lines (1103 in total) were tested in blocks-in-replication experiments for plot yield and canopy spectral reﬂectance on 454~950 nm bands at di ﬀ erent growth stages using a UAV-based hyperspectral spectrometer (Cubert UHD185 Fireﬂy). The four elements for plot-yield prediction model construction were studied respectively and concluded as: the suitable reﬂectance-sampling unit-size in a plot was its 20%–80% central part; normalized di ﬀ erence vegetation index (NDVI) and ration vegetation index (RVI) were the best combination of vegetation indices; the initial seed-ﬁlling stage (R5) was the best for a single stage prediction, while another was the best combination for a two growth-stage prediction; and multi-variate linear regression was suitable for plot-yield prediction. In model establishment for each material-set, a random half was used for modelling and another half for veriﬁcation. Twenty-one two growth-stage two vegetation-index prediction models were established and compared for their modelling coe ﬃ cient of determination ( R M 2 ) and root mean square error of the model ( RMSE M ), veriﬁcation R V 2 and RMSE V , and their sum R S 2 and RMSE S . Integrated with the coincidence rate between the model predicted and the practical yield-selection results, the models, M A1-2 , M A4-2 and M A6-2 , with coincidence rates of 56.8%, 58.5% and 52.4%, respectively, were chosen for yield-prediction in yield-test nurseries. The established model construction elements and methods can be used as local models for pre-harvest yield-selection and post-harvest integrated yield-selection in advanced breeding nurseries as well as yield potential prediction in plant-derived-line nurseries. Furthermore, multiple models can be used jointly for plot-yield prediction in soybean breeding programs.


Introduction
In plant-breeding programs, accurate yield evaluation of breeding lines is the key to release of novel cultivars since yield is always the most important trait among those targeted [1,2]. In conventional breeding processes, two major links are involved, one is derivation of breeding populations with targeted variants through hybridization and/or mutagenesis, the other is selection for the targeted variants through successive field tests integrated with some necessary lab evaluations. In the selection step, precise yield evaluation of the variants or breeding lines mainly depends on precise field experiments, which is often influenced by complicated environmental conditions (mainly soil uniformity) acting on the field plots, especially as the tested number of lines increased. In fact, yield evaluation of breeding lines is the key to successful release of cultivars, which is becoming a serious issue due to soil heterogeneity in enlarged field tests. For raising the experiment precision, a series of experiment designs and corresponding statistical methods were developed in last century, including various incomplete block designs such as blocks in replication design and lattice design which can test hundreds of breeding lines in a same experiment [3]. However, even so, the soil heterogeneity is still difficult to overcome with for a yield test comprising thousands of breeding lines. That requires new techniques to improve the yield-test precision and yield-selection efficiency and effectiveness.
Remote-sensing images have been widely used in the measurement of crop traits, such as plant height, chlorophyll content, leaf area index (LAI), disease susceptibility, drought stress sensitivity, nitrogen content, yield and etc. [4][5][6][7][8][9]. These are based on the differences in spectral reflectance of the canopy among varieties for the above traits [10]. The most of the field-based research for yield estimation models using canopy reflectance and canopy temperature measurements were focused on 2 or 3 band indices, which can be highly variable and inconsistent among breeding lines or varieties [11]. The biophysical/biochemical components of the field population, such as the canopy and leaf structures, may not allow the plant to fully reach its genetic yield potential [4]. Some researchers have suggested that yield gains observed in field crops can be attributed to more efficient photosynthetic parameters in addition to genetic reasons [12][13][14][15]. Studies that focus on utilizing full spectrum instruments in prediction models have been reported in recent years in wheat [16][17][18], corn [19], rice [20], cotton [21] and soybean [22,23] in optimal and/or drought environments.
Hyperspectral remote sensing provides a continuous spectrum with plentiful band information and high-resolution images. Hyperspectral imaging has become a common method used to predict crop traits and yields [24,25]. Hyperspectral remote-sensing data acquired from the ground [26][27][28], unmanned aerial vehicles [10,[29][30][31], airborne platforms [32] and satellite platforms [33] can capture crop canopy spectra in narrow bands and thereby provide information on the biophysical/biochemical composition of the canopy. Low-altitude and flexible unmanned aerial vehicles (UAV) provide an important, affordable and low-cost approach to quantify the components of crop phenotyping [34,35] and precision agriculture [36,37]. Therefore, the UAV are becoming critical in high-throughput crop phenotyping of large number of plots and field trials in a near real-time and dynamic way.
In using UAV equipped with an imaging spectrometer, to find band regions that most significantly contribute to yield estimation is the key to the prediction accuracy. Early researches focused on finding new wavelengths and spectral regions that correlated to plant function. Tucker [38] proposed 5 primary and 2 transition regions of the visible and near infrared spectrum to characterize plant functions. Signal-to-noise ratio in remote-sensing research is always a concern, and sensing the reflectance of plant canopies increases this ratio [19]. In addition, not all the hyperspectral reflectance data in a plot, but those from certain sizes of plot can be used to avoid the border influence between plots, thus the optimized sampling area of hyperspectral reflectance in a plot should be identified to minimize the prediction error.
In the study on hyperspectral remote-sensing technology, strategies for high-throughput field-based phenotyping were investigated with different methods, which showed an obvious difference in estimation accuracy. Vegetation indices are used usually to maximize the relationship between certain reflectance wavelengths and plant function when the effect of background noise is well-controlled [39,40].
The most of the vegetation indices correlate with plant parameters such as pigment status, grain yield. OSAVI (optimized soil-adjusted vegetation index), EVI (enhanced vegetation index), RVI (ration vegetation index), PVI (perpendicular vegetation index) and DVI (difference vegetation index) can be used to estimate leaf nitrogen content [8,41], while NDVI (normalized difference vegetation index), RVI and GNDVI (green and near infrared difference vegetation index) have been used extensively to predict yield and other plant functions in many crops using hyperspectral and satellite imagery [42][43][44][45][46][47][48][49][50][51][52][53]. The 10 vegetation indices often used for yield prediction are listed with their full names, formulae and references in Table 1. In literature, the yield prediction model that was constructed from NDVI at the flowering, podding, and seed-filling stages of all breeding lines was the best, with a coefficient of determination (R 2 ) value of 0.66 [28]. Sensitive vegetation indices in the form of NDVI and RVI based on canopy spectral reflectance were suggested to predict the grain yield of soybean by Ma et al [45] and Qi [54], where NDVI was found to have the highest correlation with soybean yield.
In our breeding programs in north China, more than 1000 breeding lines are yield-tested usually for cultivar releasing each year. To make a precise yield evaluation and selection, we took incomplete block design (Blocks in Replication design), additional check plots, precise field management and careful plot-yield harvest. But the plot yield of lines were still fluctuated obviously. We considered using UAV-based hyperspectral remote sensing to predict plot-yield of lines as an auxiliary yield selection tool in addition to the above fine experiment procedures. Thus, the present study aimed at to explore how to establish prediction models for plot-yields in breeding programs for soybeans using UAV-based hyperspectral remote sensing, to establish, validate and select optimal plot-yield prediction models, and then to demonstrate their efficiency and effectiveness in real breeding programs. To fulfill the objective, the major elements in model construction, including the optimal plot sizes for a representative reflectance data set, suitable vegetation indices with their optimal spectral bands, appropriate growth stage for hyperspectral remote sensing data collection and appropriate regression models corresponding to the vegetation indices and spectral bands were studied using plot-yield and UAV-remote-sensing data on four sets of large number of breeding lines in a real soybean breeding program. The selected prediction models were examined for utilization in a real breeding program.

Materials and Methods
The whole process of the study includes the following five linked steps: (i) four sets of breeding lines in the real breeding programs were tested and UAV remote-sensed; (ii) the optimal plot sizes for representative reflectance data set were determined; (iii) based on the three of four yield-test breeding line data sets, the optimal vegetation indices along with their optimal spectral bands were analyzed and selected (another set of plant-derived-line data was for validation); (iv) a series of regression models using different vegetation indices (and their combinations) extracted from single or multiple stage hyperspectral remote-sensing data were analyzed for their precision; (v) the selected prediction models were examined with the verification root mean square error (RMSE V ) and real breeding selection results. Finally, three best models were selected for yield prediction in 1st-and 2nd-year yield-test as well as for plant-derived-line evaluation which may be used in comprehensive yield selection integrated with the harvested yield records. Please see the flowchart for the UAV data-processing process (Supplementary material Figure S1).

Plant Materials and Field Experiments
The study was taken along with a real soybean breeding program at Shandong Shofine Seed Technology Co. Ltd. The first-year yield-test in 2015 (1stYYT 2015) with 532 breeding lines, second-year yield-test in 2015 (2ndYYT 2015) with 274 breeding lines, and the second-year yield-test in 2016 (2ndYYT 2016) with 297 breeding lines in a total of 1103 breeding lines were used for establishment and verification of yield prediction models. In addition, a recombinant inbred lines population derived from NN1138-2×KF-1 (named NJRIKY) with 441 lines were used for verification of the prediction models to imitate the selection for plant-derived-line at early breeding stage (Supplementary material Table S1). The experiments were designed and conducted at Shofine Seed Technology Co. Ltd. in Jining, Shandong, China (E 116 • 22 10~20", N 35 • 25 50"~26 10") in 2015-2016 as indicated in Table S1 and Figure 1A,B. Each set of lines were tested in a blocks in replication design experiment with three replications using randomized complete blocks design analysis as an approximation [3]. The detailed allocations are listed in Table S1. These breeding lines vary obviously in yield, plant height, growth period, and other agronomic traits. In 2ndYYT 2015, 48 breeding lines were retained in the 2ndYYT 2016, and 165 breeding lines of the 1stYYT 2015 were promoted to the 2ndYYT 2016, these two groups of lines having two years of spectral reflectance and corresponding yield data, therefore, can offer more information for establishing and validation of the prediction models. The planting density was approximately 190,000 plants ha −1 . The plot seed yield was measured by harvesting plots with the seed moisture adjusted to 13%, recorded as t ha −1 .

4
second-year yield-test in 2015 (2ndYYT 2015) with 274 breeding lines, and the second-year yield-test in 2016 (2ndYYT 2016) with 297 breeding lines in a total of 1103 breeding lines were used for establishment and verification of yield prediction models. In addition, a recombinant inbred lines population derived from NN1138-2×KF-1 (named NJRIKY) with 441 lines were used for verification of the prediction models to imitate the selection for plant-derived-line at early breeding stage (Supplementary material Table S1).
The experiments were designed and conducted at Shofine Seed Technology Co. Ltd. in Jining, Shandong, China (E 116°22′10～20″, N 35°25′50″～26′10″) in 2015-2016 as indicated in Table S1 and Figure 1A, B. Each set of lines were tested in a blocks in replication design experiment with three replications using randomized complete blocks design analysis as an approximation [3]. The detailed allocations are listed in Table S1. These breeding lines vary obviously in yield, plant height, growth period, and other agronomic traits. In 2ndYYT 2015, 48 breeding lines were retained in the 2ndYYT 2016, and 165 breeding lines of the 1stYYT 2015 were promoted to the 2ndYYT 2016, these two groups of lines having two years of spectral reflectance and corresponding yield data, therefore, can offer more information for establishing and validation of the prediction models. The planting density was approximately 190,000 plants ha −1 . The plot seed yield was measured by harvesting plots with the seed moisture adjusted to 13%, recorded as t ha −1 . The resolution of the UAV is 0.01 m while the flight altitude is 50 m. The extraction area of each plot is 2.1~8.1 m 2 in different yield-tests, the number of spectral points collected per plot was 21,000~81,000 (2.1~8.1 m 2 /(0.01 × 0.01)).

Assembly of the Unmanned Aerial Vehicle (UAV)-Based Hyperspectral Remote-Sensing System
An UAV with eight rotors, flying height around 50 m, equipped with a Cubert UHD185, a Sony digital camera and a position-orientation system, was assembled for taking the hyperspectral reflectance ( Figure 2). The total weight of the attached equipment was approximately 470 g and its housing was measured approximately 28 × 6.5 × 7cm. The instrument had a spectral range of 454 nm to 950 nm, a 4-nm spectral sampling interval, an 8-nm spectral resolution at 532 nm, and a total of 125 spectral channels. For each band, a 50 × 50 pixel image with a 12-bit dynamic range (4,096 digital numbers, DN) was created. Inside the camera, the different bands were projected to different parts of a charged coupled device (CCD). At the same time, as the hyper spectrum (HS) image was being recorded, a grayscale image with a resolution of 990 × 1000 pixels was captured.
Before the UAV flight, de-noising and lens distortion correction were completed. A black and white board was used for radiation calibration of the UHD185 (Table S2). For stitching the

Assembly of the Unmanned Aerial Vehicle (UAV)-Based Hyperspectral Remote-Sensing System
An UAV with eight rotors, flying height around 50 m, equipped with a Cubert UHD185, a Sony digital camera and a position-orientation system, was assembled for taking the hyperspectral reflectance ( Figure 2). The total weight of the attached equipment was approximately 470 g and its housing was measured approximately 28 × 6.5 × 7cm. The instrument had a spectral range of 454 nm to 950 nm, a 4-nm spectral sampling interval, an 8-nm spectral resolution at 532 nm, and a total of 125 spectral channels. For each band, a 50 × 50 pixel image with a 12-bit dynamic range (4,096 digital numbers, DN) was created. Inside the camera, the different bands were projected to different parts of a charged coupled device (CCD). At the same time, as the hyper spectrum (HS) image was being recorded, a grayscale image with a resolution of 990 × 1000 pixels was captured.
Before the UAV flight, de-noising and lens distortion correction were completed. A black and white board was used for radiation calibration of the UHD185 (Table S2). For stitching the hyperspectral image, a certain degree of image was overlapped (heading overlap >70%, side overlap >30%). To obtain stable data, the reflectance was taken on the day with calm and cloudless weather. hyperspectral image, a certain degree of image was overlapped (heading overlap >70%, side overlap >30%). To obtain stable data, the reflectance was taken on the day with calm and cloudless weather. A DJI Spreading Wings S1000+ equipped with Cubert UHD185 (for obtaining stable soybean canopy hyperspectral reflectance data) and Sony DSC-QX100 (For hyperspectral image stitching correction).

Processing of the UAV Hyperspectral Reflectance and Determination of the Reflectance-Sampling Unit-Size in Plots
The software, Cubert-Pilot (Version1.1, Cubert GmbH, Ulm, Germany) and Agisoft Photoscan Pro (Version 1.4, Agisoft LLC, Russia) were used to realize the image mosaic [55][56]. All graphs of the maximum area vector of each plot were used to fit the ArcGIS software (Version10.0, ESRI, Redlands, USA) on the spliced hyperspectral image. The POS (position and orientation system) data, digital image and hyperspectral data were aligned and fused. DOM (digital orthophoto map) data were obtained after four steps of pre-processing: including (i) aligning photos, (ii) building the dense cloud, (iii) building the mesh, (iv) building the texture. The process of obtaining the UAV hyperspectral reflectance data is shown in Figure S1.
Based on the image mosaic, reflectance-sampling unit-size (area in a testing plot) was studied to avoid the boarder influence. The sample unit was centered on the geometric center of each plot to avoid the space vector region beyond the plot boundary. Then, the bandmath module of ENVI software (Version 4.8, HARRIS geospatial, Wokingham, UK) combined with IDL (Interactive Data Language) was used to scale the length and width of the maximum area of each plot by 20 times (unit-sizes were designed using ENVI procedure, Table S3), and thus, 21 reflectance-sampling unit areas were defined, which were then used as vector images to obtain 21 reflectance-sampling unit data sets. The coefficient of variation (CV) of the top three vegetation indexes (NDVI, RVI and VOGI) were calculated for all the 21 reflectance-sampling units using the analysis of variance (ANOVA) procedure (SAS Institute Inc., NC, USA). Based on the relationship between the CV and reflectance-sampling unit-size, the best unit-size was chosen for further analysis.

Optimization of the Vegetation Indices along with Corresponding Hyperspectral Bands
All two-band combinations (R(x1) and R(x2)) for the 10 most popular vegetation indices related to yield prediction reported in previous literature, including NDVI, RVI, VOGI (Vogelmann red edge index 1) and others in Table 1, within the spectral range of 454~950 nm, were screened and constructed according to the vegetation index formulas.

Processing of the UAV Hyperspectral Reflectance and Determination of the Reflectance-Sampling Unit-Size in Plots
The software, Cubert-Pilot (Version1.1, Cubert GmbH, Ulm, Germany) and Agisoft Photoscan Pro (Version 1.4, Agisoft LLC, Russia) were used to realize the image mosaic [55,56]. All graphs of the maximum area vector of each plot were used to fit the ArcGIS software (Version10.0, ESRI, Redlands, USA) on the spliced hyperspectral image. The POS (position and orientation system) data, digital image and hyperspectral data were aligned and fused. DOM (digital orthophoto map) data were obtained after four steps of pre-processing: including (i) aligning photos, (ii) building the dense cloud, (iii) building the mesh, (iv) building the texture. The process of obtaining the UAV hyperspectral reflectance data is shown in Figure S1.
Based on the image mosaic, reflectance-sampling unit-size (area in a testing plot) was studied to avoid the boarder influence. The sample unit was centered on the geometric center of each plot to avoid the space vector region beyond the plot boundary. Then, the bandmath module of ENVI software (Version 4.8, HARRIS geospatial, Wokingham, UK) combined with IDL (Interactive Data Language) was used to scale the length and width of the maximum area of each plot by 20 times (unit-sizes were designed using ENVI procedure, Table S3), and thus, 21 reflectance-sampling unit areas were defined, which were then used as vector images to obtain 21 reflectance-sampling unit data sets. The coefficient of variation (CV) of the top three vegetation indexes (NDVI, RVI and VOGI) were calculated for all the 21 reflectance-sampling units using the analysis of variance (ANOVA) procedure (SAS Institute Inc., NC, USA). Based on the relationship between the CV and reflectance-sampling unit-size, the best unit-size was chosen for further analysis.

Optimization of the Vegetation Indices along with Corresponding Hyperspectral Bands
All two-band combinations (R(x1) and R(x2)) for the 10 most popular vegetation indices related to yield prediction reported in previous literature, including NDVI, RVI, VOGI (Vogelmann red edge index 1) and others in Table 1, within the spectral range of 454~950 nm, were screened and constructed according to the vegetation index formulas. The contour map of determination coefficients (R 2 , Equation (1)) was plotted according to the value of R 2 completed with "plsregress" function in MATLAB R2010b (MathWorks, Inc., Natick, MA, USA). From the contour map, the sensitive wavebands along with the corresponding indices were identified according to their largest R 2 values. The R 2 are calculated as follows: where x i and y i are the measured yield value and the predicted yield value, respectively, x, is the average value of x i i varies from 0 to n, where n is the number of tested breeding lines.

Establishment and Verification of the Yield Prediction Models
In this study, three sets of breeding lines in a total of 1103 (1stYYT 2015, 2ndYYT 2015 and 2ndYYT 2016) were tested for their plot yield and canopy spectral reflectance at the full flowering stage (R2), the full podding stage (R4), the initial seed filling stage (R5), and the full seed filling stage (R6) using a UAV-based hyperspectral spectrometer. The software "plsregress" function in MATLAB R2010b randomly takes half of the lines in each test for establishing the yield prediction model and takes the other half for validation of the established model. The soybean yield prediction models were established from the different material sets based on linear and non-linear (curvilinear) regressions. The formula of the root-mean square-error (RMSE) was used to evaluate the precision of the established models, which is as follows: where x i and y i are the measured yield value and the predicted yield value, respectively, i varies from 0 to n while n is the number of tested breeding lines. The coefficient of determination in model construction is designated as R M 2 and the root mean square error of the model as RMSE M ; the coefficient of determination calculated from the other half of the lines is used for validation and designated as R V 2 and the root mean square error of the model as RMSE V ; both sets of determination coefficient and root mean square error are used to assess the yield prediction model. For a comprehensive evaluation to balance the modelling and verification, these two parts were summed up as R S 2 and RMSE S . Yield predictions with higher R 2 and lower RMSE are deemed to be better ones.

Superior Plot-Yield Prediction Models Selected for Breeding Programs
Twofold methods were used to verify all of the established yield prediction models. In the first method, all the models were evaluated with the three sets of yield-tested breeding lines using RMSE V summed over the three sets of breeding lines. In the second method, in breeders' actual yield selection, the breeding line with average yield in a single year less than 3.00 t ha −1 is treated as low-yielding line to be eliminated (Eli), that with average yield in a single year more than 3.75 t ha −1 is treated as high-yielding line to be promoted (Pro) and that with average yield in a single year between 3.00 and 3.75 t ha −1 is treated as intermediate line to be reserved for further observation (Res). According to the selection classification, the prediction values of lines in each of the three sets of tests (1stYYT 2015, 2ndYYT 2015 and 2ndYYT 2016) were grouped into the respective categories and compared with the actual selection results. The coincidence rate between the predicted classification and actual breeding classification was calculated for each of the three yield-tests as well as the total value of the three tests.
Based on the results from the two methods, the superior plot-yield prediction models were determined and also were checked for their utilization in plant-derived line selection.

Field Experiment Precision and Variation among the Tested Breeding Lines
The yield distribution, variation, and coefficient of variation (CV) of the four sets of breeding lines are summarized in Table 2. The average plot seed yield of the breeding lines of the 1stYYT 2015 ranged between 1.83 and 4.99 t ha −1 , that of 2ndYYT 2015 ranged from 1.65 to 4.91 t ha −1 , that of the 2ndYYT 2016 ranged between 1.72 and 4.41 t ha −1 and the average plot yield of the plant-to-lines of NJRIKY ranged from 1.08 to 3.39 t ha −1 , with their genotypic coefficient of variation values of 34.85%, 29.35%, 26.90% and 33.15%, and their error coefficient of variation of 19.18%, 15.89%, 12.81% and 33.31%, respectively. These results indicated that large yield variation existed in the three sets of breeding lines with small experimental errors while there was larger yield variation and experimental error but less mean yield for the NJRIKY plant-derived lines population. Therefore, the data of the three sets of breeding lines were used for the establishment and validation of yield-prediction models from which the established models can fit a relatively wide situation, while those of the NJRIKY was to be used for calibration of the established prediction models to imitate the plant-derived line prediction and selection.

Analysis for Sensitive Wavebands and Optimal Vegetation Indices for Breeding Line Yield-Prediction
For identifying the hyperspectral reflectance wavebands sensitive to yield, the yield of 2ndYYT 2015 ( Figure 3A), 1stYYT 2015 ( Figure 3B), NJRIKY test 2015 ( Figure 3C) and their corresponding average hyperspectral data at R2, R4, R5 and R6 were analyzed, The wavelengths with maximum and minimum correlation coefficients between spectral reflectance and seed yield were 750~950 nm and 454~710 nm, respectively, for the tests (Figure 3).  Figure 3C) and their corresponding average hyperspectral data at R2, R4, R5 and R6 were analyzed, The wavelengths with maximum and minimum correlation coefficients between spectral reflectance and seed yield were 750~950 nm and 454~710 nm, respectively, for the tests (Figure 3). Based on the 2ndYYT 2015 ( Figure 4A) and 1stYYT 2015 ( Figure 4B) data, the contour maps of determination coefficients of linear regression between the two-band NDVI, RVI at R5 stage and yield were established using the "plsregress" function in MATLAB procedure. The dark red area presented the highest correlation zone, and the best sensitive bands for yield-prediction concentrated in the range of 550 nm to 750 nm. Based on the 2ndYYT 2015 ( Figure 4A) and 1stYYT 2015 ( Figure 4B) data, the contour maps of determination coefficients of linear regression between the two-band NDVI, RVI at R5 stage and yield were established using the "plsregress" function in MATLAB procedure. The dark red area presented the highest correlation zone, and the best sensitive bands for yield-prediction concentrated in the range of 550 nm to 750 nm. Based on the 2ndYYT 2015 ( Figure 4A) and 1stYYT 2015 ( Figure 4B) data, the contour maps of determination coefficients of linear regression between the two-band NDVI, RVI at R5 stage and yield were established using the "plsregress" function in MATLAB procedure. The dark red area presented the highest correlation zone, and the best sensitive bands for yield-prediction concentrated in the range of 550 nm to 750 nm. The results of the relationship between vegetation indices and yield at different single growth stages analyzed using MATLAB procedure are listed in Table 3  The results of the relationship between vegetation indices and yield at different single growth stages analyzed using MATLAB procedure are listed in Table 3; the sensitive bands of the 1stYYT 2015 at R2, R4, R5 and R6 growth stages were 750 nm and 770 nm, 750 nm and 770 nm, 634 nm and 674 nm and 550 nm and 710 nm, respectively, while those of the 2ndYYT 2015 were 482 nm and 590 nm, 514 nm and 606 nm, 514 nm and 606 nm and 550 nm and 710 nm, respectively. This indicated that the sensitive bands varied greatly between the two breeding line tests for the same growth stage, while the sensitive bands also varied at the different growth stages even for a same yield-test.  Table 3 also shows that the yields of the two tests were all highly correlated with canopy reflectance at R5 stage, with the maximum R 2 up to 0.68 and 0.50 respectively, and therefore, the best growth stage to collect the UAV hyperspectral reflectance data for yield-prediction using vegetation indices was at R5, while the spectral sensitive bands for soybean yield-prediction were in 454~850 nm. The other growth stages, R2, R6 and R4, were in turn not as good as R5. The 10 vegetation indices were ranked for each of the growth stages in the two yield-tests according to their determination coefficients, NDVI and RVI were all ranked the top two (Table 3). Since NDVI and RVI based on filtered optimized bands are the two most sensitive indices, they were selected for the establishment of yield-prediction models in this research.

Optimized Reflectance-Sampling Unit-Size for Organizing the UAV Hyperspectral Reflectance Data
From the UAV reflectance data set of the breeding lines, the hyperspectral data of each plot were obtained using the vector image georeferenced with the hyperspectral image. Twenty-one reflectance-sampling unit-sizes were designed using ENVI procedure combined with IDL language (Table S3), each plot image and vector map at each spatial scale were read, and the 21 datasets of the average spectral reflectance in each plot were extracted ( Figure S2). It could be seen that the spectral reflectance of the canopy corresponding to different spatial sampling unit areas was of no significant difference in 550~750 nm of the visible light bands, but the difference was significant in the 750~850 nm near-infrared region. To select the best sampling unit-size of hyperspectral reflectance and eliminate plot marginal effects, the hyperspectral reflectance plot data of 2ndYY T2015, 1stYYT 2015 and NJRIKY at R5 growth stage were used. The CVs of red and near-infrared band, NDVI, RVI and VOG1 of the three tests were also calculated from the spectral information extracted from the 21 different reflectance-sampling unit-sizes. The smaller the value of the coefficient of variation, the better the reflectance-sampling unit-size. Figure 5 showed that the CV of red-band, near-infrared, NDVI, RVI and VOG1 distributed between 0.15~0. 18 Figure 5). Thus, when the proportion of the sampling unit-size in that of the total plot was between about 20% to 80%, the canopy reflectance data obtained could be used for plot-yield prediction. In the establishment of prediction model below, the upper-side of the optimal sampling unit-area was preferred since all the hyperspectral data can be obtained from one flight and no additional expense was needed. 10 5). Thus, when the proportion of the sampling unit-size in that of the total plot was between about 20% to 80%, the canopy reflectance data obtained could be used for plot-yield prediction. In the establishment of prediction model below, the upper-side of the optimal sampling unit-area was preferred since all the hyperspectral data can be obtained from one flight and no additional expense was needed.

Identification of Major Factors for the Establishment of Plot-Yield Prediction Models
In the establishment of plot-yield prediction models, all the material sets were separated into two subsets for mutual checks which was done automatically by the MATLAB software. The materials, in a total of 1,103 lines, were organized and coded as 1stYYT 2015 (A1 + B1), 2ndYYT 2015 (A2 + B2), and 2ndYYT 2016 (A3 + B3) (Table S1) (Table S1, S4,  S5).
The 17 material sets were used to screen for major factors to be included in yield-prediction models. The exponential, linear and logarithmic regressions with one vegetation index (RVI or NDVI) at R5 were established using Excel 2007 procedure (Table S4, S5). The results showed that the difference of R 2 between the RVI and NDVI were not significant and the R 2 of linear function of all material sets were somewhat larger and more stable. Among the models in Table S4, the linear regression y = 3E-05x + 0.6526 (x = RVI (618,674)) with R 2 of 0.61 and y = -2E-05x + 0.2055 (x = NDVI
The 17 material sets were used to screen for major factors to be included in yield-prediction models. The exponential, linear and logarithmic regressions with one vegetation index (RVI or NDVI) at R5 were established using Excel 2007 procedure (Tables S4 and S5). The results showed that the difference of R 2 between the RVI and NDVI were not significant and the R 2 of linear function of all material sets were somewhat larger and more stable. Among the models in Table S4, the linear regression y = 3E-05x + 0.6526 (x = RVI (618, 674)) with R 2 of 0.61 and y = −2E-05x + 0.2055 (x = NDVI (618, 674)) with R 2 of 0.61 both for A1 + B1 (1stYYT 2015); the two linear regressions composed of NDVI or RVI both with R 2 of 0.49 for A2 + B2 (2ndYYT 2015). The similar situation was observed for other material groups, such as A1, B1, A2, B2, etc., which indicates both NDVI and RVI were relevant in the construction of plot-yield prediction models. Based on the aforementioned, a linear function with two vegetation indices (namely NDVI and RVI) at R5 stage was established for the second round of the yield-prediction models assessment (Table S6).

Establishment and Evaluation of Yield-Prediction Models Using Normalized Difference Vegetation Index (NDVI) and Ration Vegetation Index (RVI) at R5
The second round yield-prediction models were established from the 17 material groups and listed in Table 4 (the model equations listed in Table S7). As indicated above, the program took a random half of the lines for establishing yield-prediction model and the other random half for validation of the established model. Linear models composed of NDVI and RVI at R5 were established for each of the 17 material groups. In Table 4, the established models were evaluated based on their modelling precision, including the modelling determination coefficient R M 2 and the modelling root mean square error (RMSE M ) and their verification precision, including the verification determination coefficient R V 2 and the verification root mean square error (RMSE V ). For a comprehensive evaluation to balance the modelling and verification, these two parts were summed up as R S 2 and RMSE S , respectively.
In   Table S7. λ1 and λ2 are the two sensitive bands. RMSE is root mean square error. In the Model Precision column, R M 2 is the model determination coefficient, RMSE M is the model root mean square error calculated from the difference between the predicted value and the observed value in the lines set from which the model is developed. In the Verification Precision column, R V 2 is the verification determination coefficient, RMSE V is the verification root mean square error calculated from the difference between the value predicted from the established model and the observed value in the lines set used for verification. In the Sum Precision column, R S 2 and RMSE S are sums of model and verification determination coefficient (R M 2 + R V 2 ) and root mean square error (RMSE M + RMSE V ), respectively. The same is true for later tables. 1 These two material sets were tested two years, therefore, the number of observations for modelling and verification are two times of the number of lines.

Establishment and Evaluation of Yield-Prediction Models Using NDVI and RVI at Multiple Stages
The 17 material sets and yield-prediction models in Table 4 involved only two vegetation indices at a single growth stage R5, utilization of more vegetation indices at multiple growth stages might improve the model precision, which was conducted using the MATLAB procedure. From the 1stYYT 2015 and 2ndYYT 2015 data, all the 10 vegetation indices and growth stages were screened for best plot-yield prediction-models, the maximum coefficient of determination for models with the 10 vegetation indices reached 0.69 and 0.59. Since 1stYYT 2015 (A1 and B1) in Table 4 was the material set from which the best model came, its major results are introduced here. The yield-prediction models based on combinations of two growth stages and three growth stages of vegetation index when 9 VIs involved, the maximum of the model R 2 was 0.73. The best combination of the three growth stages were R2, R5, and R6; when 10 vegetation index variables were introduced, the maximum model R 2 was 0.74. As the number of growth stages and vegetation indices increased, the model R M 2 increased but not significantly, Table   S6 shows that two growth stages models are better than single-stage models, the combinations of R2 and R5, then R6 and R5, R4 and R5 are in turn better than the others among the two-stage models. However, not very large difference was among the vegetation index numbers involved, so less number (2 vegetation indices) was preferred for simplicity of the models. Based on the above results, the third round yield-prediction models for the 17 material sets with two growth stages (R5 + R4 for each material set and R5 + R2, R5 + R6 and R5 + R4 for A1 and A6 material sets) and two vegetation indices (NDVI and RVI), in a total of 21 yield-prediction models were established using the MATLAB procedure and then evaluated further. As indicated before, half a set of breeding lines was used for modelling and half set for validation. The results were summarized in Table 5 (the model equations listed in Table S8).  Table S8.  Table 4. P: P values of model significant test, expressed in exponential notation, such as, 2.68E-63, that is 2.68 multiplied by 10 −63 . 1 These two material sets were tested for two years, therefore, the number of observations for modelling and verification are two times the number of lines.
Based on the results that the precision of the yield-prediction models composed of two vegetation indices at two growth stages were better than those composed of two vegetation indices at R5 single growth stage in term of determination coefficient (R M 2 , R V 2 and R S 2 ) and root mean squares error

Further Comparison and Selection of Best-Fitted Plot-Yield Prediction Models for Yield Breeding Programs
The verification of the models in Table 5 was limited in using the other half of breeding lines in the same material set, while the recognized yield-prediction model was to be used for a broad range of breeding materials, so these models should be further validated with more breeding materials. Our method was twofold: one was to evaluate the verification root-mean-square-errors (RMSE V ) for all the breeding line sets tested (in a total of 1103 lines), the other was to evaluate the coincidence between the model-predicted and breeders' actual yield selection results. Table 6 shows the results from the evaluation of verification root-mean-square-errors (RMSE V ). All the models were evaluated with the three sets of yield-tested breeding lines 1stYYT 2015 (A1 + B1), 2ndYYT 2015 (A2 + B2), 2ndYYT 2016 (A3 + B3) and their total set (A4 + B4). The models M A1-2 , M A2+B2-2 , M A2-2 and M A6-2 are models with less RMSE V for all the four breeding line sets in addition to higher determination coefficient in Table 5, while the models M A4+B4-2 , M A4-2 and M B4-2 were of small RMSE V for all the four material sets but with medium size of determination coefficient in Table 5.
The results of evaluation of the coincidence between the model-predicted and breeders' actual yield selection are shown in Table 7. The coincidence was good in the four material sets for the above 4 models (M A1-2 , M A2+B2-2 , M A2-2 and M A6-2 ) selected from Table 6. After a further comparison comprehensively, the models of M A1-2 , M A6-2 and M A4-2 were good in coincidence rates for all the selection categories (eliminated, reserved and promoted) in all the populations and were chosen for utilization in plot-yield prediction in yield breeding programs (see Table 7 and its notes for details). Table 6. Comparisons of the verification RMSE in A1 + B1, A2 + B2, A3 + B3 and A4 + B4 among models in Table 5.

Model
Growth Period Range (d)

Yield
Range/ (t ha −1 )    Note: Comparisons of consistence between the breeders' actual yield selection results and the model-predicted yield selection results among the 21 models are listed in this table; the breeding lines were treated as to be eliminated (Eli, yields lower than 3.00 t ha −1 ) to be reserved (Res, yields between 3.00 t ha −1 and 3.75 t ha −1 ) and to be promoted (Pro, yields above 3.75 t ha −1 ) in A1 + B1, A2 + B2 and A3 + B3. The models used in model-predicted yield selection are those listed in Table 5 and Table S8. M A1-2 is a linear model derived from the material set which is a first part with 133 breeding lines of the 1stYYT 2015, with its yield ranging between 1.836 and 4.680 t ha −1 , growth period ranging between 99 d and 112 d. M A4-2 is also a linear model derived from the material set which is a first part with 275 breeding lines of the three sets of tests, with its yield ranging between 1.656 and 4.757 t ha −1 , growth period ranging between 96 d and 116 d. M A6-2 is also a linear model derived from the material set which is a group of the selected and retained breeding lines from 1stYYT 2015 and 2ndYYT 2015 with two years' data of 106 breeding lines, with its yield ranging between 2.380 and 4.925 t ha −1 , growth period ranging between 101 d and 116 d. The formulae of the three recommended and other prediction models are listed in Table S8 with their corresponding hyperspectral reflectance bands. The three plot-yield prediction models can be used for breeding lines in yield-test nurseries within the corresponding yield and growth period range, single model or all the three models can be used simultaneously in a same yield-test nursery.
In addition, the 21 models in Table 5 were also validated with the NJRIKY (A + B) population to imitate the plant-to-line selection precision. Tables S9 and S10 showed that the above models of M A1-2 and M A4-2 (but not M A6-2 ) were also suitable for yield-prediction of the plant-derived-line selection.

Discussion
From the above, in order to establish prediction models for plot-yields in soybean breeding programs using UAV-based hyperspectral remote sensing, the optimal techniques of the four major elements in model construction were explored, then the plot-yield prediction models were established after five linked steps, with the optimal models selected, such as M A1-2 , M A4-2 and M A6-2 for yield-test nurseries and the former two for plant-derived line nursery.
Comparing the present results to those in the literature, our four element results used in five linked steps to obtain the three optimal prediction models based on UAV-hyperspectral reflection are more systematic and advanced in comparison to others. Zhang et al. [28] used the active remote-sensor GreenSeeker to measure the canopy NDVI by the seedling, flowering, podding, and seed-filling stages in a total of 1,272 soybean lines, including the breeding lines and recombinant inbred lines. Among the single stage yield prediction models, the seed-filling stage was the best, having the highest coefficient of determination and lower standard errors. The yield prediction model that was constructed from NDVI at the flowering, podding, and seed-filling stages of all breeding lines was the best with an R 2 value of 0.66. Wu et al. [27] obtained the canopy spectral reflectance information of 30 soybean cultivars using the FieldSpec Pro FR2500 Analytical SpectralDevice (ASD) and constructed a large number of spectral parameters. The multiple regression values of the yield obtained with NPH1280 at flowering stage (R2), V_Area1190 at full podding stage (R4), and NPH560 at initial seed filling stage (R5) were found to provide the best yield prediction with an R 2 value of 0.68. Qi [54] systematically studied a method to monitor soybean yields based on the FieldSpec Pro FR2500 hyperspectral spectrometer, but the application of the method is limited due to its low accuracy, low efficiency, and inability to obtain the data in real time and for a large area. Sankaran et al [67] found that the vehicle-mounted platform achieved rapid and non-destructive acquisition of plant phenotype information under field conditions. However, this method has a limitation in a crop-planting scheme and low operational efficiency in large areas. Anyway, our results on vegetation index selection (NDVI and RVI), R5 growth stage for remote-sensing are basically in accordance with the previous results, but our results on regression type was consistently multiple linear regression model while curvilinear regressions involved in other reports. The especially meaningful element results in the present study is those of the sampling-unit size of hyperspectral reflectance which is a specific requirement due to the UAV-based remote sensing covering a whole plot influenced by neighboring plots.

The Major Elements and Potential Utilization of the Established Plot-Yield Prediction Models
Remote sensing (RS) measurements can provide timely information on plant growth and development, responses to dynamic weather conditions and management practices and, therefore, the final crop yield potentials [33]. Based on crop-specific spectral reflectance features, crop yields can be predicted by constructing remote-sensing models that incorporate multiple vegetation indices [68][69][70]. In the present study for establishing plot-yield prediction model based on the capture of UAV-based hyperspectral reflectance data, the major elements were considered as the reflectance-sampling unit-size in a plot, selection of vegetation indices along with their corresponding reflectance spectrum, selection of growth period for capture of hyperspectral reflectance data (these three elements involving hyperspectral reflectance data capture) and selection of regression pattern, combination of vegetation indices and combination of growth periods (these three elements involving hyperspectral reflectance data analysis).
As for the potential utilization of the established plot-yield prediction models, reliable and early assessment of the breeding lines' yield is of paramount importance. Plant breeders have to rapidly obtain plot yields of a large numbers of lines under field conditions [71][72][73]. The soybean breeders both private and public have to make a decision before starting the winter nursery on their breeding lines to be promoted into the higher rank nursery (promoted), or eliminated, or retained as a repeater, especially since a large number of breeding lines have to be treated in modern plant-breeding programs. The plot-yield prediction models can be of relevant help before harvesting and after harvesting for breeders to treat their breeding lines in a short time. For pre-harvest utilization in the Shandong Shofine Seed Technology Co. Ltd., as the plot-yield prediction models recommended from the present study is concerned, M A1-2 , M A6-2 and M A4-2 all involve R5 and R4 two growth stages for remote sensing, there is enough time (about one month from R5 to harvesting) for calculating the predicted plot-yield and field checking. Based on the prediction, the selection plan for breeding lines can be prepared, and some of the inferior lines can be eliminated in advance to save labor for harvesting.
After harvesting with the plot-yield results come out, breeders can make a direct selection of the elite breeding lines according to the harvested plot-yield with reference to the predicted plot-yields. This is especially helpful if the field experiment was damaged due to some reasons and could not provide an exact yield measurement. As indicated above, the models of M A1-2 , M A6-2 and M A4-2 can be used for model-assisted selection for yield-tests in higher ranks of nurseries. While M A1-2 and M A6-2 can be used for plant-derived line selection in early nursery. At this stage, the plant-derived-line experiment is usually without replication, therefore, the real field selection with reference to model-based selection must be more efficient and effective than the ordinary procedure.
However, since the environment and breeding lines vary from program to program, the best models may be different from each other, but the established model construction elements and methods can be used to establish local models for pre-harvest yield-selection and post-harvest integrated yield-selection in advanced breeding nurseries as well as yield potential prediction in plant-derived-line nurseries in soybean breeding programs.

Potential Improvement of Plot-Yield Prediction Models in Soybean Breeding Program
From the present results, different material tests may provide different model precision, such as M A1+B1-2 better than M A3+B3-2 , and M A1-2 and M B1-2 better than M A2-2 and M B2-2 . Different years (environment) may cause different model precision even for a same material set, such as M A6-2 better than M B6-2 . Thus it was recognized that the model precision depends on their source breeding lines, those from a same test was usually better than those from different tests even the sample size (number of total lines) increased, such as M A1+B1 and M A2+B2 (but not M A3+B3 ) were better than M A4+B4 , and different material tests may provide different model precision, such as M A1+B1 is better than M A3+B3 , and M A1 and M B1 better than M A2 and M B2 , and different year (environment) may cause different model precision even for a same set of breeding lines, such as M A6 better than M B6 . Based on the above points, the optimal models were selected as M A1-2 , M A2+B2-2 , M A2-2 and M A6-2 with less RMSE V and higher R V  (Table 5). It is obvious that the determination coefficients are not very high even the RMSEs are relatively low. Therefore, we have to consider how to improve the models for a more precise prediction. Since in the present study we have noticed with regard to the optimal combination of the model construction elements that an increase of vegetative indices in a model did not increase R M 2 very much (Table S6) and an increase of hyperspectral reflectance stages did not increase R M 2 very much but increased R V 2 obviously (Tables 4 and 5), two additional elements might be potential for the improvement of model precision. One is the precision of the experiment, the other is the representativeness of the breeding lines used for model establishment. In the present study, the error term CVs were 19.18% and 15.89% for 1stYYT 2015 and 2ndYYT 2015, respectively, this is a somewhat larger experiment error, it may have caused the not high enough determination coefficient. However, the error term CV of the 2ndYYT 2016 was 12.81% which is less than the other two yield-tests. The models established from 1stYYT 2015 (A1 + B1) are all better in R M 2 , but the models constructed from 2ndYYT 2016 (A3 + B3) are all poorer in R M 2 . While the models based on A4 + B4 which were combined from the three set of the tested breeding lines are all good in R M 2 .
Therefore, experimental precision is not the only reason, it must be related to the representativeness of the breeding lines. Thus, for the establishment of a precise prediction model, both experiment precision and the representativeness of the breeding lines should be well-controlled.

Innovation Potential of Plant Breeding Nursery System Using UAV-Based Hyperspectral Reflectance Techniques
From the above, it is commonly understood that plant breeding efficiency can be improved by using UAV-based remote-sensing platforms which exhibit a large potential to provide yield-prediction even before harvesting so that the next breeding plan can be arranged in advance [67,74,75]. In the present study, an eight-rotors UAV deployed with digital camera and hyperspectral camera was used for field-based phenotyping for an experiment with thousands of breeding plots. In the results on sampling unit-size, Figure 5 shows that the relationship between the coefficients of variation of the red-band, near-infra-red band, NDVI, RVI and VOG1 and the reflectance-sampling unit-sizes varied like a concave basin, very high CVs at the very small size and the very large size of sampling area, this means too small a sampling unit caused large fluctuation due to the heterogeneity of the canopy and too large sampling unit caused large fluctuation due to the influence of border area between neighboring plots. While the sampling unit located in the central part with size between about 20% to 80% of the plot (2.1~8.1 m 2 , 1.2~5.2 m 2 and 1.0~2.7 m 2 for three different tests), the CVs were about the similar without very large difference, indicating the homogeneity of the hyperspectral reflectance between the central 20% to 80% of a plot if the plant in a plot has a normal uniform growth. This means that even 20% of the plot size can obtain the hyperspectral reflectance data as precise as 80% of the plot size. To make sure of the data precision and full-use of the data, we used the larger sampling unit data in our model establishment.
However, the flat or near-flat CV distribution in the central 20%-80% of a plot ( Figure 5) implies that in using UAV-based hyperspectral reflectance for plot-yield prediction, the plot size can be reduced to certain size providing the border area influence excluded, and that even the replication number can be reduced if a single plot can be of representativeness. If so, there might be potential in increasing the breeding lines tested or increasing the breeding scope, especially the breeding test scope can be enlarged without worrying about the soil homogeneity challenge to breeding programs. In addition, the yield-testing ability at the plant-derived-line stage can be raised and even the prediction model for first and second-year yield-test can be used for plant-derived-line selection, like the present results that the prediction M A1-2 and M A6-2 can fit for plant-derived line yield prediction. The reason for that is the high density of the hyperspectral reflectance points and canopy homogeneity in a small area.

Conclusions
In the establishment of plot-yield prediction models for soybean breeding programs using UAV-based hyperspectral remote sensing, four model construction elements were studied individually with the results being: (i) the suitable sampling unit-size in a plot was the central part of 20%-80% plot size (the high end was used in model construction to make a full use of the information); (ii) NDVI and RVI and their combination along with their best spectra combinations of near-infrared and red spectrum were the best vegetation indices for yield-prediction; (iii) R5 was the best growth stage for a single-period model, while R5 and R4 were the best combination for a two-period prediction model; (iv) linear regression was suitable for plot-yield prediction model construction in comparison to exponential and logarithm regression. Seventeen prediction models composed of NDVI and RVI vegetation indices at R5 growth stage and then 21 prediction models composed of the two vegetation indices at two growth stages (R5 plus another one) were established. In choosing the best models, the modelling R M 2 and modelling RMSE M , verification R V 2 and verification RMSE V , and their sums R S 2 and RMSE S were evaluated and compared. Integrated with the coincidence rate between the model-predicted results and the real selection results, the models of M A1-2 , M A6-2 and M A4-2 were chosen for utilization in real breeding programs. Here M A1-2 is a linear model appropriate for local yield in 1.836~4.680 t ha −1 , a growth period in 99 d~112 d; M A4-2 is also a linear model appropriate for local yield in 1.656~4.757 t ha −1 , a growth period in 96 d~116 d; M A6-2 is also a linear model appropriate for local yield in 2.380~4.925 t ha −1 , a growth period in 101 d~116 d. The established model construction elements and methods could be used in the establishment of local models for pre-harvest yield-selection and post-harvest integrated yield-selection in advanced breeding nurseries as well as yield potential prediction in plant-derived-line nurseries, furthermore, these models can be used jointly for plot-yield prediction in soybean breeding programs.  Table S8. The established major plot-yield prediction models using NDVI and RVI constructed from two growth-period UAV hyperspectral reflectance data, Table S9. Comparisons of the verification RMSE in NJRIKY among models listed in Table 5, Table S10. Comparisons of coincidence between the breeders' actual yield selection results and the model-predicted selection results among the 21 models listed in Table 5 for the NJRIKY yield test. (Coincidence rate expressed in % while actual selection results expressed in number of lines). Figure  S1. Flowchart showing the UAV data processing, Figure S2.