Spatial Heterogeneity of Winter Wheat Yield and Its Determinants in the Yellow River Delta, China

Understanding spatial differences of crop yields and quantitatively exploring the relationship between crop yields and influencing factors are of great significance in increasing regional crop yields, promoting sustainable development of regional agriculture and ensuring regional food security. This study investigates spatial heterogeneity of winter wheat yield and its determinants in the Yellow River Delta (YRD) region. The spatial pattern of winter wheat in 2015 was mapped through time series similarity analysis. Winter wheat yield was estimated by integrating phenological information into yield model, and cross-validation was performed using actual yield data. The geographical detector method was used to analyze determinants influencing winter wheat yield. This study concluded that the overall classification accuracy for winter wheat is 88.09%. The estimated yield agreed with actual yield, with R2 value of 0.74 and root mean square error (RMSE) of 1.02 t ha−1. Cumulative temperature, soil salinity and their interactions were key determinants affecting winter wheat yield. Several measures are recommended to ensure sustainable crop production in the YRD region, including improving irrigation and drainage systems to reduce soil salinity, selecting salt-tolerant winter wheat varieties, and improving agronomy techniques to extend effective cumulative temperature.


Introduction
Agricultural production is a critical issue related to food security guarantee, regional economic development and social stability for local government [1][2][3]. Approximately 14.29% of the global land-surface area is covered by cultivated land [4]. Due to limited cultivated land resources and continued population growth, global agricultural production faces enormous challenges. Owing to continuously growing population from seven billion in 2010 to nine billion in 2050 and limited land resources, a global food gap will appear [5]. To fill the yield gap between supply and demand, it is imperative to increase crop yield. Affected by internal and external factors, the crop growth and its ultimate yield usually involve complex procedures, which have a significant spatial heterogeneity [6][7][8][9][10]. Hence, identifying spatial heterogeneity of crop yields, determining the influencing factors and quantifying theirs effects on crop yield are of great significance for improving crop yield and achieving sustainable development of agriculture.
As important research fields in precision agriculture, accurate crop identification and yield estimation are preconditions for studying spatial heterogeneity of crop yields and their determinants [11]. Commonly depending on crop acreage, yield estimation is an important basis for establishing agriculture effectively quantified through traditional method. Moreover, the observation scale of traditional method is relatively small, mainly focusing on field scale [58]. Actually, crop yield and its influencing factors have significant spatial variability at large scale, which was not considered in traditional methods. As a new statistical method, the geographical detector is developed to detect spatial heterogeneity and explore its driving forces [59]. The geographical detector method is implemented based on the assumption that if an independent variable has a significant impact on a dependent variable, the spatial pattern of independent variable and the dependent variable should be similar [60]. The method has been widely used in social sciences [61,62], human health [63,64], natural sciences [65] and environmental sciences [66,67] etc.
To better explain the spatial variance in crop yield linked to influencing factors, it is important to describe and understand the spatial pattern of crop yield. Given the spatial heterogeneity of wheat yield, the Yellow River Delta (YRD) region, the newest land in China, was selected as study area. The aim of this study was to explore spatial heterogeneity of winter wheat yield and quantitatively distinguish its determinants at regional scale. To achieve this goal, we classified winter wheat by computing Minkowski distance based on time-series similarity analysis, estimated winter wheat yield by using time-series phenological information, and quantitatively investigate determinants and interactions affecting spatial heterogeneity of winter wheat yield by using geographical detector in YRD region. This study is beneficial for agricultural policy maker in formulating appropriate measures to improve regional crop yield and ensure food security.

Study Area
The Yellow River Delta (YRD), a new alluvial land in Bohai Bay, is the transitional area formed from the land-sea interaction. The YRD region consists of Dongying city and Binzhou city, and covers an area of 15,564 km 2 , approximately extending from 36.68 • to 38.21 • N latitude and 117.25 • to 119.21 • E longitude ( Figure 1). This region has a typical warm and temperate monsoon climate, with sufficient light and heat. Clear seasonal variations in temperature, humidity and evaporation make it suitable for crop production. The YRD region has the most potential productivity in Shandong province and even China [68]. Coastal lowland with flat terrain is another characteristic of the study area. The elevation in the study region ranges from 1 to 20 m above sea level. With a short time of land formation and high ecological vulnerability, crop growth is largely restricted in many parts of YRD region, and the spatial heterogeneity of crop production is obvious [69][70][71]. Agriculture is the dominant industry in the YRD region, and the cropland area accounts for more than 65% of the region's total area [11]. The main crop types are winter wheat, summer maize and cotton. Among them, winter wheat is the major grain in YRD region. The YRD region is a blended-planting region, consisting of double cropping and single cropping. Double cropping system is alternated by winter wheat and summer maize, which is commonly practiced [72]. Single cropping system is mainly planting cotton. There also very little spring wheat is cultivated. Due to soil salinization, it is hard to grow winter wheat in northeastern part of the region, where cotton is cultivated. Winter wheat is planted in late September to early October, and becomes mature in early to middle June of next year (Figure 2). The duration time for winter wheat is approximately 230-260 days [73]. Several phenological phases forms a complete growth cycle of winter wheat in one year, they are germination, seedling, tillering, overwintering, green-up, jointing, booting, heading, flowering, milk, the dough phase and maturity [11].
Sustainability 2020, 12, x FOR PEER REVIEW  4 of 21 planting region, consisting of double cropping and single cropping. Double cropping system is alternated by winter wheat and summer maize, which is commonly practiced [72]. Single cropping system is mainly planting cotton. There also very little spring wheat is cultivated. Due to soil salinization, it is hard to grow winter wheat in northeastern part of the region, where cotton is cultivated. Winter wheat is planted in late September to early October, and becomes mature in early to middle June of next year ( Figure 2). The duration time for winter wheat is approximately 230-260 days [73]. Several phenological phases forms a complete growth cycle of winter wheat in one year, they are germination, seedling, tillering, overwintering, green-up, jointing, booting, heading, flowering, milk, the dough phase and maturity [11].

MODIS Data and Pre-Processing
The moderate resolution imaging spectroradiometer (MODIS) data was provided by the National Aeronautics and Space Administration's Earth Observation System Clearing House Warehouse Inventory Search Tool (https://ladsweb.nascom.nasa.gov/search). The MOD09Q1 surface reflectance data from 2014 to 2015 were selected to extract winter wheat. The MOD09Q1 products were a re-project to Albers equal-area conic projection and resampled by the MODIS Reprojection Tool (MRT) (NASA, Washington, USA). Red and near-infrared reflectance were used to calculate the NDVI, which were provided at 250-m resolution from MOD09Q1 product every eight days. The NDVI was described using Equation (1): where and are the visible red (620-670 nm) and near-infrared (841-876 nm) reflectance for the MODIS bands respectively. The pixel quality was described by using quality assurance (QA) data of MOD09Q1. High quality corresponds to pixel values ranging from 0 to 3. Medium quality corresponds to pixel values ranging from 4 to 5. Low quality corresponds to pixel values ranging from 6 to 16. The corresponding weights were assigned to different quality levels. Weight values of 1, 0.5 and 0 were assigned to high quality, medium quality and low quality respectively.
Before winter wheat identification, smoothing algorithms should be used to reduce noise of time-series NDVI. The maximum value compositing (MVC) process can eliminate thick clouds, however, noise still exists which often generated by sensor parameter, solar altitude and aerosols. The Savitzky-Golay (SG) filter method was applied to reduce noises because of its simplicity and reliability. The SG filter achieves better denoising results than other mainstream algorithm in effectively reserving relative maximum, minimum and width [74]. The performance of the SG filter

MODIS Data and Pre-Processing
The moderate resolution imaging spectroradiometer (MODIS) data was provided by the National Aeronautics and Space Administration's Earth Observation System Clearing House Warehouse Inventory Search Tool (https://ladsweb.nascom.nasa.gov/search). The MOD09Q1 surface reflectance data from 2014 to 2015 were selected to extract winter wheat. The MOD09Q1 products were a re-project to Albers equal-area conic projection and resampled by the MODIS Reprojection Tool (MRT) (NASA, Washington, USA). Red and near-infrared reflectance were used to calculate the NDVI, which were provided at 250-m resolution from MOD09Q1 product every eight days. The NDVI was described using Equation (1): where ρ NIR and ρ Red are the visible red (620-670 nm) and near-infrared (841-876 nm) reflectance for the MODIS bands respectively. The pixel quality was described by using quality assurance (QA) data of MOD09Q1. High quality corresponds to pixel values ranging from 0 to 3. Medium quality corresponds to pixel values ranging from 4 to 5. Low quality corresponds to pixel values ranging from 6 to 16. The corresponding weights were assigned to different quality levels. Weight values of 1, 0.5 and 0 were assigned to high quality, medium quality and low quality respectively.
Before winter wheat identification, smoothing algorithms should be used to reduce noise of time-series NDVI. The maximum value compositing (MVC) process can eliminate thick clouds, however, noise still exists which often generated by sensor parameter, solar altitude and aerosols. The Savitzky-Golay (SG) filter method was applied to reduce noises because of its simplicity and reliability. The SG filter achieves better denoising results than other mainstream algorithm in effectively reserving relative maximum, minimum and width [74]. The performance of the SG filter is exhibited in Figure 3. The SG filter smooths and computes derivatives of a set of consecutive values by executing simplified least squares-fit convolution [75]. For better performance of filtering, the data quality levels assigning with corresponding weights were used to identify spurious observations for the SG filter. Subsequently, the SG filter was applied by setting a window width of four, one envelope iterations, and an adaptation strength of two. Fitting iterations of SG filter were set to 2 instead of 1 in other similar study. However, the NDVI of double rice in rotation fallow period will be smoothed out at some double cropping field when the envelope iteration was set greater than 1. After several experiments and visual inspection, all the optimal fitting settings were set. The SG filter is described using Equation (2): where Y denotes the original NDVI value, Y * j is the filtered NDVI value, C i refers to the coefficient (smoothing window), j is the running index of the original ordinate data table, and m is the half-width of the smoothing window [76].
Sustainability 2020, 12, x FOR PEER REVIEW 5 of 21 is exhibited in Figure 3. The SG filter smooths and computes derivatives of a set of consecutive values by executing simplified least squares-fit convolution [75]. For better performance of filtering, the data quality levels assigning with corresponding weights were used to identify spurious observations for the SG filter. Subsequently, the SG filter was applied by setting a window width of four, one envelope iterations, and an adaptation strength of two. Fitting iterations of SG filter were set to 2 instead of 1 in other similar study. However, the NDVI of double rice in rotation fallow period will be smoothed out at some double cropping field when the envelope iteration was set greater than 1. After several experiments and visual inspection, all the optimal fitting settings were set. The SG filter is described using Equation (2): where Y denotes the original NDVI value, * is the filtered NDVI value, refers to the coefficient (smoothing window), is the running index of the original ordinate data table, and m is the halfwidth of the smoothing window [76].

Field Survey Data and Statistical Data
Two hundred and fifty pure pixels were manually selected from Google Earth. The pure pixels consisted of five vegetation types and each type had 50 pixels. The NDVI temporal profiles were constructed by using NDVI from September 2014 to June 2015. After averaging processing, and standard NDVI temporal profiles for each vegetation types were obtained.
One hundred and sixty-four field survey data were collected in March 2015 (Figure 1), and used to verify the accuracy of winter wheat classification. The filed site dataset had the same geographic coordinates, projected coordinates and location as MODIS data. A handheld GPS device Trimble GeoExplorer 3000 series (Trimble Navigation, Ltd, California, USA) was used to record geographical location and land use types. There were 84 samples of winter wheat, 33 samples of cotton, 18 samples of grassland, and 29 samples of woodland in the filed site dataset respectively. The results of classification and yield estimation were validated in two ways, verification with field sites and local agricultural statistics.
Ground-based LAI measurements from 2014 to 2015 aimed to collect data for calibration of LAI estimation models from the time series NDVI dataset. Thirty-six winter wheat plots were selected to measure LAI and its corresponding yield ( Figure 1). We randomly divided the plots into two parts. Two-thirds of the plots were used for training, and one-third of the plots were used for testing.

Field Survey Data and Statistical Data
Two hundred and fifty pure pixels were manually selected from Google Earth. The pure pixels consisted of five vegetation types and each type had 50 pixels. The NDVI temporal profiles were constructed by using NDVI from September 2014 to June 2015. After averaging processing, and standard NDVI temporal profiles for each vegetation types were obtained.
One hundred and sixty-four field survey data were collected in March 2015 (Figure 1), and used to verify the accuracy of winter wheat classification. The filed site dataset had the same geographic coordinates, projected coordinates and location as MODIS data. A handheld GPS device Trimble GeoExplorer 3000 series (Trimble Navigation, Ltd, California, USA) was used to record geographical location and land use types. There were 84 samples of winter wheat, 33 samples of cotton, 18 samples of grassland, and 29 samples of woodland in the filed site dataset respectively. The results of classification and yield estimation were validated in two ways, verification with field sites and local agricultural statistics.
Ground-based LAI measurements from 2014 to 2015 aimed to collect data for calibration of LAI estimation models from the time series NDVI dataset. Thirty-six winter wheat plots were selected to measure LAI and its corresponding yield ( Figure 1). We randomly divided the plots into two parts. Two-thirds of the plots were used for training, and one-third of the plots were used for testing. Regional LAI estimation model was established by using training plots, and the estimation result was validated by testing plots. To ensure the comparison of the extraction of representative mean LAI values with MODIS data at 250 m pixel resolution, the area for each plot is not less than 250 m × 250 m. Mean values of plots were used to represent the measured samples. Field location information was also recorded. The investigation times of LAI were the beginning of overwintering, green-up, jointing, heading and maturity, respectively. A LAI-2200 instrument (LI-COR Biotechnology, Ltd, Nebraska, USA) was used to collect LAI, and five measurements were done in each homogeneous area of MODIS pixel. LAI samples were randomly set and measured within each plot at a 45 • angle between rows. A mean value of the five samples represented the plot value. To establish and verify the yield estimation model of winter wheat, the same thirty-six plots were selected to obtain the yields on 8 June 2015 during maturity stages. Five representative samples were selected within the scope of 250 m × 250 m. The mean value of each five samples within the scope represented the field value. An area of 1 m 2 in each wheat field was reaped and measured. The samples were taken back to the laboratory. After procedures of drying, weighing and threshing, the unit yield of winter wheat in each field was obtained. Two-thirds of the yield plots were used for training, and one-third of the plots were used for testing.
Meteorological data of nine meteorological stations covering the study area from 2014 to 2015 were provided by China Meteorological Data Service Center (http://data.cma.cn/). A daily meteorological dataset was selected, including air temperature, precipitation, sunshine duration, and radiation. Cumulative temperature of the green-up phase and the heading phase were calculated respectively by accumulating daily air temperature during the corresponding time period. A kriging method were used to interpolate meteorological data with a spatial resolution of 30 m. Cross-validation was performed and the validation results all met accuracy requirements.
The aridity index (AI) was defined as the ratio of annual potential evapotranspiration to precipitation. AI has been widely used to describe annual evaporation ratio in many studies. The AI dataset is provided by Data Center for Resources and Environmental Sciences, Chinese Academy of Sciences (RESDC) (http://www.resdc.cn). AI data was projected to an Albers equal area conic projection and extracted according to spatial extent of study area.
Soil properties such as soil salt content and soil organic matter content were determined from field soil samples which were collected in May 2015. A fishnet of 5 km × 5 km rectangular cells was created by ArcGIS 10.4 (ESRI, California, USA). Three hundred and forty-five soil samples at 0~20 cm plough layers were obtained with a soil drill. A kriging method was used to interpolate soil salt content and soil organic matter content with a spatial resolution of 30 m. Cross-validation was performed and the validation results all met accuracy requirements.
Digital Elevation Model (DEM) data was derived from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) global digital elevation model (GDEM) Version 002 data with a spatial resolution of 30 m, which was provided by the Land Processes Distributed Active Archive Center (LP DAAC) of United States Geological Survey (USGS) (https://glovis.usgs.gov/). DEM data was projected to an Albers equal area conic projection and extracted to mask according to spatial extent of study area. On this basis, slope map was generated using Slope function of 3D Analyst tools in ArcGIS 10.4.
The spatial distribution of Yellow River was derived from the land-use map of the study area. The buffer analysis was done by distance function in ArcGIS 10.4. Distance to Yellow River was drawn by setting 1.0 km buffer radius.
Population data was from the sixth national census of China in 2010, which was collected at a community level. The Voronoi diagram was build and kriging interpolation was performed. Population density was obtained with a spatial resolution of 30 m.
The socio-economic status can be assessed by nighttime light data [77]. The visible infrared imaging radiometer suite (VIIRS) data was used, which was provided by the Earth Observation Group (EOG), National Geophysical Data Center (NGDC) at the National Oceanic and Atmospheric Sustainability 2020, 12, 135 7 of 21 Administration (NOAA) (https://ngdc.noaa.gov/eog/download.html). The annual nighttime light image in 2015 was projected to an Albers equal area conic projection and extracted to study area with a spatial resolution of 500 m.

Winter Wheat Classification Based on Time-Series Similarity Measurement
In this study, there were two steps to performing the wheat discriminations. One was analysis of NDVI temporal profiles and the other was time-series similarity classification based on Minkowski distance.

NDVI Time Series Profiles Analysis
Each vegetation type had their unique temporal profiles behaviors, exhibiting their dynamic response to the environmental changes. Figure 4 exhibits the average temporal profiles for different vegetation types from September 2014 to June 2015 during the growing period of winter wheat. There were distinguishing features for woodland, grasslands, winter jujube, cotton and winter wheat in this period. As for winter wheat, the number of peaks had obvious characteristics.

Winter Wheat Classification Based on Time-Series Similarity Measurement
In this study, there were two steps to performing the wheat discriminations. One was analysis of NDVI temporal profiles and the other was time-series similarity classification based on Minkowski distance.

NDVI Time Series Profiles Analysis
Each vegetation type had their unique temporal profiles behaviors, exhibiting their dynamic response to the environmental changes. Figure 4 exhibits the average temporal profiles for different vegetation types from September 2014 to June 2015 during the growing period of winter wheat. There were distinguishing features for woodland, grasslands, winter jujube, cotton and winter wheat in this period. As for winter wheat, the number of peaks had obvious characteristics.
The extracted NDVI temporal curves of winter wheat conformed to the typical phenological characteristics, which exhibited two consecutive periods of increase and decrease, with a maximum NDVI value of 0.8. The first peak appeared at the beginning of the overwintering, and the second peak existed in early May of the following year. Winter wheat could be separated from other vegetation based on the characteristic of NDVI temporal profile. Figure 4 exhibits the differences of time series NDVI profiles between winter wheat and other vegetation types.

Time-Series Similarity Measurement
A time-series similarity measurement basing on Minkowski distance was applied to discriminate winter wheat. The standard time series curves were obtained by selecting several pure pixels from each vegetation type. Figure 5 exhibits the principle of the Minkowski distance.
Suppose there are two groups of time series Q and S, whose lengths are , namely = , , … , , = , , … , , we need to calculate the path = , , … , between two groups of time series. is used to get the distance metric ( , ) which can be described by Equation (3): The extracted NDVI temporal curves of winter wheat conformed to the typical phenological characteristics, which exhibited two consecutive periods of increase and decrease, with a maximum NDVI value of 0.8. The first peak appeared at the beginning of the overwintering, and the second peak existed in early May of the following year. Winter wheat could be separated from other vegetation based on the characteristic of NDVI temporal profile. Figure 4 exhibits the differences of time series NDVI profiles between winter wheat and other vegetation types.

Time-Series Similarity Measurement
A time-series similarity measurement basing on Minkowski distance was applied to discriminate winter wheat. The standard time series curves were obtained by selecting several pure pixels from each vegetation type. Figure 5 exhibits the principle of the Minkowski distance.
Suppose there are two groups of time series Q and S, whose lengths are n, namely Q = q 1 , q 2 , . . . , q n , S = {s 1 , s 2 , . . . , s n }, we need to calculate the path P = p 1 , p 2 , . . . , p n between two groups of time series. P is used to get the distance metric Dmink(Q, S) which can be described by Equation (3): where f Q t is the f Q (t) time series value at time t, r is a user-defined integer, and n is the images number available for the yearly rice cropping system classification. In our study, r was set to 2. When the parameter r was set to 2, the Minkowski distance can be regarded as the Euclidean distance. Due to its nonlinear character, Euclidean distance is more sensitive to outlier values and useful to distinguish other objects. The Minkowski distances between standard time series curves and curves to be processed were measured by a programming technique in MATLAB (2017a) (MathWorks, Natick, MA, USA).
where is the ( ) time series value at time , is a user-defined integer, and is the images number available for the yearly rice cropping system classification. In our study, was set to 2. When the parameter was set to 2, the Minkowski distance can be regarded as the Euclidean distance. Due to its nonlinear character, Euclidean distance is more sensitive to outlier values and useful to distinguish other objects. The Minkowski distances between standard time series curves and curves to be processed were measured by a programming technique in MATLAB (2017a) (MathWorks, Natick, USA).

Time Series LAI Estimation
Ground-measured LAI and its corresponding NDVI were used to establish statistical model of different growth period. The retrieval effect of LAI differed in each growth period. The simulation results of three models (unitary linear model, quadratic polynomial model and exponential model) were analyzed. The modelling results indicated that the correlation coefficients between NDVI and LAI in vegetative growth period were higher than that in whole growing period. The correlation coefficients in reproductive growth period were not significant. LAI inversion during vegetative growth period was modelled. The estimation accuracy was evaluated by R 2 and root mean square error (RMSE). Figure 6 shows the regression results of NDVI and ground-measured LAI. The regression function was described by Equation (4): The regression model exhibited a high degree of fitting to the equation and small estimation error. R 2 and RMSE were 0.7536 and 0.21 respectively. According to the best-fitting model of LAI, the simulated LAI at eight-day intervals were obtained.
The LAI curve exhibited in Figure 7 indicates the growth cycle of winter wheat, where the LAI value lined up, and the abscissa describes time. The LAI time series curves of winter wheat were obtained, reflecting the LAI dynamic changes in the growing period. Features of time series curves agreed with reality, showing the information of wheat growth. The LAI retrieval accuracy of winter wheat is relatively high which could meet the needs of yield estimation for winter wheat.

Time Series LAI Estimation
Ground-measured LAI and its corresponding NDVI were used to establish statistical model of different growth period. The retrieval effect of LAI differed in each growth period. The simulation results of three models (unitary linear model, quadratic polynomial model and exponential model) were analyzed. The modelling results indicated that the correlation coefficients between NDVI and LAI in vegetative growth period were higher than that in whole growing period. The correlation coefficients in reproductive growth period were not significant. LAI inversion during vegetative growth period was modelled. The estimation accuracy was evaluated by R 2 and root mean square error (RMSE). Figure 6 shows the regression results of NDVI and ground-measured LAI. The regression function was described by Equation (4): The regression model exhibited a high degree of fitting to the equation and small estimation error. R 2 and RMSE were 0.7536 and 0.21 respectively. According to the best-fitting model of LAI, the simulated LAI at eight-day intervals were obtained. The LAI curve exhibited in Figure 7 indicates the growth cycle of winter wheat, where the LAI value lined up, and the abscissa describes time. The LAI time series curves of winter wheat were obtained, reflecting the LAI dynamic changes in the growing period. Features of time series curves agreed with reality, showing the information of wheat growth. The LAI retrieval accuracy of winter wheat is relatively high which could meet the needs of yield estimation for winter wheat.

Yield Modeling Based on LAI Phenological Features
To verify the validity of phenological information in yield estimation of winter wheat, the yield estimation model based on phenological features of LAI was established. Inspired by Chul et al. [11], seven important phenological metrics were chosen for their closely relation with yields, including LAI in green-up phase, maximum LAI in whole growing period, accumulated LAI from green-up phase to maturity phase, average LAI in whole growing period, standard deviation of LAI in whole growing period, changing rate of LAI in vegetative growth period, and changing rate of LAI in reproductive growth period.
LAI in green-up phase represents the LAI value at the beginning of vigorous growth for winter wheat. The regional LAI in average green-up date represents the LAI in green-up phase. Maximum LAI, by definition, is the maximum value of LAI generally occurs during the period from heading to

Yield Modeling Based on LAI Phenological Features
To verify the validity of phenological information in yield estimation of winter wheat, the yield estimation model based on phenological features of LAI was established. Inspired by Chul et al. [11], seven important phenological metrics were chosen for their closely relation with yields, including LAI in green-up phase, maximum LAI in whole growing period, accumulated LAI from green-up phase to maturity phase, average LAI in whole growing period, standard deviation of LAI in whole growing period, changing rate of LAI in vegetative growth period, and changing rate of LAI in reproductive growth period.
LAI in green-up phase represents the LAI value at the beginning of vigorous growth for winter wheat. The regional LAI in average green-up date represents the LAI in green-up phase. Maximum LAI, by definition, is the maximum value of LAI generally occurs during the period from heading to

Yield Modeling Based on LAI Phenological Features
To verify the validity of phenological information in yield estimation of winter wheat, the yield estimation model based on phenological features of LAI was established. Inspired by Chul et al. [11], seven important phenological metrics were chosen for their closely relation with yields, including LAI in green-up phase, maximum LAI in whole growing period, accumulated LAI from green-up phase to maturity phase, average LAI in whole growing period, standard deviation of LAI in whole growing period, changing rate of LAI in vegetative growth period, and changing rate of LAI in reproductive growth period. LAI in green-up phase represents the LAI value at the beginning of vigorous growth for winter wheat. The regional LAI in average green-up date represents the LAI in green-up phase. Maximum LAI, by definition, is the maximum value of LAI generally occurs during the period from heading to flowering, representing the formation of nutrient organs and the transporting ability from substance to wheat grains. The LAI in average heading date could represent the maximum LAI in whole growing period. Accumulated LAI from green-up phase to maturity phase represents the overall status of vegetation. Average LAI in whole growing period represents the constant vegetation status. The standard deviation of LAI in whole growing period is defined as the change of the average LAI value in whole growth period.
The changing rate of LAI in vegetative growth period refers to the growing speed or frequency from green-up phase to heading phase. The number of spikes and grains are finally determined in the vegetative growth period which period is a critical period. The changing rate of LAI in vegetative growth period can be described by Equation (5): where RCvg is the changing rate of LAI in vegetative growth period, f (x, a) is the polynomial function for LAI time-series profile, Day max denotes the day of year when LAI value reaches the maximum (usually in heading date), Day rev denotes the day of year when winter wheat enters into green-up date, and N is the number of images (also the difference between Day max and Day rev ) during vegetative growth period. Changing rate of LAI in reproductive growth period, by definition, is the growing speed or frequency from heading phase to maturity phase. The reproductive growth period is the key period for determining grain quality and forming final yield. The changing rate of LAI in reproductive growth period can be described by Equation (6): where RCrg is the changing rate of LAI in reproductive growth period, f (x, a) is the polynomial function for LAI time-series profile, Day max denotes the day of year when LAI value reaches the maximum (usually in heading date), Day mat denotes the day of year when winter wheat enters into harvest time, and N is the number of images (also the difference between Day max and Day mat ) during reproductive growth period. The most efficient parameters were selected as elements to construct the ultimate yield estimation model through the stepwise multi-elemental regression method. Regional yield of winter wheat was obtained and the estimation accuracy was then validated. The multi-element linear regression equation can be described by Equation (7) where y denotes the yield, LAI max refers to maximum LAI in whole growing period, LAI stnd is standard deviation of LAI in whole growing period, LAI greenup is LAI in green-up phase, LAI sum is accumulated LAI from green-up phase to maturity phase, RCrg is the changing rate of LAI in vegetative growth period, and RCrg is the changing rate of LAI in reproductive growth period. Note that the phenological feature of average LAI was excluded in the model building due to relatively low estimation accuracy.

Geographical Detector
Spatial heterogeneity is one of the basic characteristics of geographical phenomena. The geographical detector is developed to detect spatial heterogeneity [59]. This method overcomes limitations of traditional mathematical statistical models, such as many assumptions and large data requirements. Both numerical data and qualitative data can be detected in geodetector method. There are four functions in geodetector, including risk detector, factor detector, ecological detector and interaction detector.
Factor detector measures spatial stratified heterogeneity of dependent variable Y, and determinant power of independent variables X on the spatial heterogeneity of Y both linearly and nonlinearly [78], which can be accomplished by q-statistic and described by Equation (8): where q indicates the explanatory power of variable X on event Y. h = 1, . . . , L is the strata (classification) of the explanatory variable X, N h and N stand for the number of units in strata h and the whole region, respectively. σ 2 h and σ 2 refer to the variance of Y in strata h and the whole region. SSW and SST represent within sum of squares and total sum of squares. The value of q ranges from 0 to 1. There is no association between Y and X when q value equals to 0. Factor X completely controls Y when q value equals to 1. The larger the q value, the stronger the explanatory power of X on Y, and vice versa. Note that X should be a categorical variable, and the number of strata L might range from 2 to 10. Stratification should be done when X is numerical.
A fishnet of 250 m × 250 m rectangular cells was created by ArcGIS 10.4. Density of grid points was determined according to actual spatial pattern of winter wheat. To investigate relationships between influencing factors and winter wheat yield, 1456 sampling points were selected. Each factor was stratified by natural breakpoint method with seven grades in each category. The yields and corresponding stratification values of each factor were extracted to points, and then calculation was performed by the geo-detector tool.

Spatial Pattern of Winter Wheat
The spatial distribution of winter wheat was mapped by calculating Minkowski distance based on time-series similarity analysis. The threshold of calculation was set as 2.64 to extract winter wheat. Figure 8 exhibits the spatial pattern of winter wheat. Generally, winter wheat was mainly distributed in the southwest of the study region, and a small amount of winter wheat was found along the rivers in the northeastern part of the study region.
Due to the high similarity between grassland and winter wheat in time-series profiles, some grasslands were misclassified into winter wheat. The separability of winter wheat from the grassland should consider the similarity characteristics of time series profile in future studies. The mixed pixel problems can be clearly noticed in northeastern part of study region, where winter wheat was grown in small areas with sufficient water near the riverbank. Compared to the spatial resolution of MODIS (6.25 ha), the area of winter wheat fields in this area is quite small (usually less than 1 hectare). Therefore, such a small area cannot be detected by MODIS.
The classification results were also validated with agricultural statistics in 2015 provided by the municipal government. The acreage accuracy of winter wheat was 89.75% at the municipal level, verifying the ability of proposed method in winter wheat identification.

Winter Wheat Yield Estimation Based on Phenological Features
The regional distribution of winter wheat yield was obtained by integrating each LAI phenological features in multi-element linear regression model. To avoid overfitting, the Leave One Out Cross Validation (LOOCV) method of cross-validation was used to validate the established yield estimation. There was a good correlation between measured yield and estimated yield, exhibiting a R 2 value of 0.74 at significant level (P < 0.01), and a RMSE of 1.02 t ha −1 . Note that when the actual yield was relatively small (less than 8 t ha −1 ), the estimated yield turned to be larger than field measured records. When the actual planting area was large, the estimated yield turned to be smaller ( Figure 9).
The spatial pattern of winter wheat yield showed a significant heterogeneity from south to north ( Figure 10). Winter wheat yield exhibited a decreasing trend south to north, which conformed to the The classification accuracy was validated with field dataset and agricultural statistical data respectively. The percentage of pixels correctly classified was defined as the overall classification accuracy. Together with the kappa coefficient, the classification results were validated with ground data. Table 1 indicates the confusion matrix for winter wheat classification, showing a good agreement with the pixel-by-pixel comparisons. The overall accuracy was 88.09% and the Kappa coefficient was 0.81. There were 74 sampling points were classified correctly as winter wheat among the 84 sampling points. There were 10 points were misclassified as winter wheat among the 80 non-winter wheat crop field points. Due to the high similarity between grassland and winter wheat in time-series profiles, some grasslands were misclassified into winter wheat. The separability of winter wheat from the grassland should consider the similarity characteristics of time series profile in future studies. The mixed pixel problems can be clearly noticed in northeastern part of study region, where winter wheat was grown in small areas with sufficient water near the riverbank. Compared to the spatial resolution of MODIS (6.25 ha), the area of winter wheat fields in this area is quite small (usually less than 1 hectare). Therefore, such a small area cannot be detected by MODIS.
The classification results were also validated with agricultural statistics in 2015 provided by the municipal government. The acreage accuracy of winter wheat was 89.75% at the municipal level, verifying the ability of proposed method in winter wheat identification.

Winter Wheat Yield Estimation Based on Phenological Features
The regional distribution of winter wheat yield was obtained by integrating each LAI phenological features in multi-element linear regression model. To avoid overfitting, the Leave One Out Cross Validation (LOOCV) method of cross-validation was used to validate the established yield estimation. There was a good correlation between measured yield and estimated yield, exhibiting a R 2 value of 0.74 at significant level (P < 0.01), and a RMSE of 1.02 t ha −1 . Note that when the actual yield was relatively small (less than 8 t ha −1 ), the estimated yield turned to be larger than field measured records. When the actual planting area was large, the estimated yield turned to be smaller (Figure 9).
The spatial pattern of winter wheat yield showed a significant heterogeneity from south to north ( Figure 10). Winter wheat yield exhibited a decreasing trend south to north, which conformed to the actual distribution of winter wheat yield in YRD region. The average yield per unit area of winter wheat was 6.9 t ha −1 and the standard deviation was 0.76 t ha −1 , which is generally consist with local actual yields, providing a simpler and reliable way to estimate yield based on time series phenological features.
Sustainability 2020, 12, x FOR PEER REVIEW 14 of 21 actual distribution of winter wheat yield in YRD region. The average yield per unit area of winter wheat was 6.9 t ha −1 and the standard deviation was 0.76 t ha −1 , which is generally consist with local actual yields, providing a simpler and reliable way to estimate yield based on time series phenological features.

Analysis of Determinants for Spatial Heterogeneity of Winter Wheat Yield
To investigate which factors was most closely related to spatial heterogeneity of winter wheat yield in YRD region, the relationships between thirteen factors and winter wheat yield were analyzed, and the q value of each influencing factor on winter wheat yield was calculated. The factor detector actual distribution of winter wheat yield in YRD region. The average yield per unit area of winter wheat was 6.9 t ha −1 and the standard deviation was 0.76 t ha −1 , which is generally consist with local actual yields, providing a simpler and reliable way to estimate yield based on time series phenological features.

Analysis of Determinants for Spatial Heterogeneity of Winter Wheat Yield
To investigate which factors was most closely related to spatial heterogeneity of winter wheat yield in YRD region, the relationships between thirteen factors and winter wheat yield were analyzed, and the q value of each influencing factor on winter wheat yield was calculated. The factor detector results indicated that there was a significant difference in explanatory power of influencing factors

Analysis of Determinants for Spatial Heterogeneity of Winter Wheat Yield
To investigate which factors was most closely related to spatial heterogeneity of winter wheat yield in YRD region, the relationships between thirteen factors and winter wheat yield were analyzed, and the q value of each influencing factor on winter wheat yield was calculated. The factor detector results indicated that there was a significant difference in explanatory power of influencing factors on spatial heterogeneity of winter wheat yield ( Table 2). The corresponding q value of each influencing factor, ranging from highest to lowest, were cumulative temperature for heading phase (X2), soil salinity (X7), cumulative temperature for green-up phase (X1), precipitation (X3), population density (X12), sunshine duration (X4), aridity index (X6), organic matter (X8), distance to Yellow River (X11), radiation (X5), DEM (X9), slope (X10) and nighttime light (X13). Cumulative temperature for heading phase, soil salinity and cumulative temperature for green-up phase were three dominant determinants, which explanatory powers account for 0.49, 0.46 and 0.32 respectively. Precipitation, population density, sunshine duration, aridity index, organic matter and distance to Yellow River were secondary determinants, which explanatory powers were above 0.2. Winter wheat yield was less affected by radiation, DEM, slope and nighttime light, with explanatory power of less than 0.1.
An interaction detector was used to investigate the interactions between any two influencing factors. The interaction module results indicated that the explanatory power of any two-factor interaction on wheat yield was greater than that of a single factor, indicating that the spatial heterogeneity of winter wheat yield was largely affected by interactions between factors. The explanatory power of the interactions was sorted according to statistical results, where q value were greater than 0.4. Table 3 exhibited dominant interactions with corresponding q values. The enhancement between cumulative temperature for the green-up phase and soil salinity, cumulative temperature for the heading phase and soil salinity, and cumulative temperature for green-up phase and aridity index were the most significant interactions, and the q values of the interactions were higher than single factor.
The interaction between cumulative temperature for the green-up phase and soil salinity was the most dominant controlling factor for spatial heterogeneity of winter wheat yield, which has highest explanatory power, with a q value of 0.69. The winter wheat yield would vary obviously under conditions of different soil salinity and similar cumulative temperature for the green-up phase, or different cumulative temperature for green-up phase and similar soil salinity. For example, the winter wheat yield with soil salt content of 0.05% was significantly different from that of 0.49% when the cumulative temperature for green-up phase was 155 degree days, and the winter wheat yield with cumulative temperature for green-up phase of 125 degree days obviously differed from that of 158 degree days when the soil salt content was 0.05%. The second dominant controlling factor was the interaction between cumulative temperature for the heading phase and soil salinity with a q value of 0.64. The third dominant controlling factor was cumulative temperature for green-up phase and aridity index, which had a q value of 0.60. The interactions ranking from first to sixth all were combinations of cumulative temperature and other factors. In addition, the combination of soil salinity and other factors greatly increased the explanatory power of a single factor on wheat yield, which showed nonlinear enhancement.

Discussion
According to the results of Section 3.1, we found that the spatial pattern of winter wheat is consistent with previous studies, indicating a decreasing trend according to the distance to the sea [11]. Figure 10 exhibits the distribution shift for winter wheat, indicating the ability of remote sensing to capture the spatial heterogeneity in the YRD region. However, mixed pixels in remote sensing data led to the uncertainties of identifying winter wheat. When identifying winter wheat and estimating its yield, precise spatial information is required. Indeed, for monitoring crop growth and estimating crop area by remote sensing over a Chinese site, spatial resolution of 120 m and 30 m are often needed respectively [79]. Time-series MODIS has been widely used to monitor crop condition where the field areas are much larger than the 250 m pixel size, allowing for individual crops classification [49,80]. However, the results of classification accuracy for winter wheat would be affected by the spatial resolution of 250 m. We should note that the extraction accuracy was not satisfying in northeastern areas of study region. There was a small amount of winter wheat scattering along the rivers where winter wheat was cultivated on a small scale (usually less than 1 ha). The mixed pixel caused by small-scale cultivation would affect classification accuracy. Besides the requirement of spatial information, the temporal information was also of great importance in crop yield estimation. The 8-d time-series used in our study had more accurate temporal resolution than other remote sensing sources, however, these cannot fully reflect the phenological characteristics for winter wheat. In addition, the usage of DPC (the day of pixel composite) information would significantly improve accuracy of agricultural monitoring. However, DPC information is not contained in the MOD09Q1 product, which also may impact the results accuracy. The Sentinel-2 satellite with 10 m spatial resolution and 5-d temporal resolution will be a good solution for mixed pixel at regional scale in our future research, which satellite launched in June, 2015.
According to the results of geodetector, cumulative temperature and soil salinity were key determinants greatly affecting winter wheat yield. Many scholars showed evidences of the two factors on crop yields [81][82][83][84][85][86], which are consistent with our results. Since the complexity of winter wheat growth processes, an influencing factor often did not work independently but interacted together. The results from interaction module of geodetector indicate that the explanatory power of interaction of cumulative temperature for green-up phase and soil salinity was over 69%. Chu et al. [73] found that cumulative temperature and soil salinity significantly influenced phenology of winter wheat. The delay of the green-up phase and the advance of the heading phase will shorten the time for vegetative growth of winter wheat, and ultimately affected the winter wheat yield with the increase of soil salt content. This finding showed good agreement with our study, which indicated that the interaction of cumulative temperature and soil salinity largely affects the spatial heterogeneity of winter wheat yield. Hence, several measures are recommended to effectively increase and stabilize wheat yields in the YRD region, including improving irrigation and drainage systems to reduce soil salinity [87,88], selecting salt-tolerant winter wheat varieties [89,90], and improving agronomy techniques to extend effective cumulative temperature, such as mulching and reasonable arrangements for sowing [91].
The process of crop growth and its ultimate yield is complex, and seasonal soil salinity dynamics lacked consideration in our study. In YRD region, the soil salinity is highly related with groundwater dynamics and has obvious seasonal fluctuations [92]. It is characterized by soil salinity accumulation in spring, desalination in summer, pick-up in autumn, and latent salinity in winter. In the seasonal variation during a year, the salt deposition period is consistent with the growth period of winter wheat, which seriously affects the growth and yield of winter wheat. In addition, soil salinity significantly affects wheat growth throughout the growing period [85]. An interesting problem we hope to address in the near future is identifying phenological periods in which soil salinity has the greatest impact on wheat yield under the condition of constant cumulative temperature.

Conclusions
The current study explored spatial heterogeneity of winter wheat yield and its determining factors in the YRD region by using a yield estimation model integrated with phenological information and the geographical detector method. Applying these methodologies, several conclusions were gained. Winter wheat was mapped through the time-series similarity measurement basing on Minkowski distance with a relatively high accuracy, exhibiting a concentrated distribution in the southwest of the study area. Winter wheat yield was estimated by combining phenological information into the yield estimation model, showing a decreasing trend from south to north. Cumulative temperature, soil salinity and the interaction between cumulative temperature and soil salinity were three key determinants for spatial heterogeneity of winter wheat yield. According to the above results, several measures should be taken to increase winter wheat yield, such as improving irrigation and drainage systems to reduce soil salinity, planting salt-tolerant winter wheat varieties, mulching and reasonable arrangements for sowing. The findings offer the potential to explore spatial heterogeneity of winter wheat yield and its determinants, and provide scientific basis for improving regional wheat yield, as well as effective decision-making support to local government for agricultural planning and management, ensuring food security and promoting the sustainable agricultural development.