Determination of Effective Weather Parameters on Rainfed Wheat Yield Using Backward Multiple Linear Regressions Based on Relative Importance Metrics

Wheat (Triticum aestivum L.) is the most important crop for human food and is planted in numerous countries under rainfed conditions in semiarid zones. Decision-makers and governments need to predict the yield of rainfed wheat before harvest and to determine the effect of the major factors on it. Different methods with various levels of accuracy have been suggested for forecasting yield. One of these approaches is the statistical regression model, which is simple and applicable to regions where little data is available. Since weather is the most important factor affecting wheat production, particularly in rainfed cultivation, regression models using weather parameters are very common. However, the coefficients of these models are location based and should be determined locally. Therefore, in this research, the backward multiple linear regression (BMLR) technique based on relative importance metrics was used to determine which of 11 weather parameters most strongly affect rainfed wheat production in Fars Province, south of Iran, during 2006–2013. The influence of each parameter in the final model was analyzed using the values of the LMG relative importance metric. The results indicated that sunshine hours had the largest LMG (34.73%) and, therefore, was the most effective parameter. Among the other considered parameters, rainy days, minimum relative humidity, and average relative humidity, with LMG values of 21.97%, 21.69%, and 21.62%, respectively, had the greatest effects on rainfed wheat yield in the studied area. All parameters except sunshine hours positively affected rainfed wheat yield. The most likely reason for the significance of these parameters is the prevailing dry and semidry climate in the southern areas of Iran. The proposed model for determining the effects of weather parameters on rainfed wheat could be a useful guide for stakeholders such as farmers, decision-makers, and governments.


Introduction
Understanding the climate and the requirements of agricultural plants is among the most important factors contributing to crop production. Understanding and managing the effect of weather parameters on crop production could lead to an increase in yield. This issue is especially crucial in rainfed farming, because climate has its greatest impact on yield under rainfed conditions [1]. Wheat is a globally vital crop and a strategic product in Iran. Fars Province, located in the south of Iran, ranks first in wheat production in the country. Rainfed cultivation accounts for a large portion of this production: of about 550,000 hectares under wheat in this region, 150,000 hectares are rainfed [2]. The predominant climate of this province is dry and semidry, and water resources are limited [3]. Therefore, identifying the weather parameters that affect plant growth and crop yield is necessary. Several factors affect the variability of yield and product quality in the field, but data collection and analysis are costly, time-consuming, and laborious [4]. Many researchers have tried to analyze these factors and have proposed different methods to forecast yield. For example, Mumtaz et al. estimated wheat yield from weather parameters using remote sensing information for the Chakwal rainfed croplands, Punjab Province, Pakistan [5]. In another study, Sabzevary et al. investigated the effects of climatic parameters on rainfed and irrigated wheat yields using bivariate linear regression analysis at selected stations of Hamedan State, Iran. They concluded that the rainfed wheat yield index was more sensitive to atmospheric and agroclimatic factors than that of irrigated wheat [6].
To analyze the plant response to weather parameters, three arbitrary categories of models have been suggested: simple statistical models, parameterization models, and analog-physical models [7]. Among these, most statistical models are crop yield-weather models, whose main advantage is the simple and straightforward relation between yield and one or more weather factors. Consequently, several studies have developed regression relationships between weather parameters and rainfed crop yield [8–17].
Drought has a significant impact on the production of wheat [18]. The study of Wu et al. showed that rainfed yield was related to drought severity and decreased with increased temperature and reduced precipitation [19]. Zarei et al. identified the time period that most affects changes in the annual yield of rainfed wheat under drought in the northwest of Iran. They correlated the SPEI drought index, calculated at different time scales, with the annual yield simulated by the AquaCrop model, using the backward multiple generalized estimation equation method [20].
Some researchers have tried to estimate wheat yield in different regions of Iran and under different weather conditions. Mehnatkesh et al. determined the most significant variables for rainfed wheat yield by applying sensitivity analysis in Central Zagros, Iran [21]. They used 54 parameters covering ground properties, soil physicochemical characteristics, precipitation, and weed biomass as inputs to an artificial neural network, with wheat grain and biomass yield as the targets. The sensitivity analysis revealed that all the variables affected grain yield and that weekly precipitation had the greatest impact. Zarei and Mahmoudi evaluated the impact of climatic parameters on the annual yield of rainfed wheat based on records from 10 stations across Iran from 1967 to 2016. They concluded that, at all stations, wind speed and minimum temperature were the most effective variables and sunshine hours was the least effective variable for annual yield [22]. Kazmi and Rasul predicted agrometeorological rainfed wheat yield in the Potohar area, Pakistan, using a linear regression model. In their study region, the final yield was forecast reliably from minimum temperature, sunshine duration, and rainfall depth in January (tillering and stem extension stage) [23].
Siosemarde and Sakine predicted rainfed wheat yield from weather variables in the Khoy region of West Azarbaijan State, Iran. Their results indicated that the average temperature in October and the number of frost days in April affected yield directly, whereas the average maximum relative humidity in December affected it indirectly [24]. Khoorani et al. modeled and predicted rainfed wheat (Triticum aestivum) yield in Kurdistan Province, Iran, during 1991-2003 using linear regression models and the bootstrap resampling method, with five weather parameters as independent variables: total precipitation, number of days with precipitation, maximum wind speed, mean evapotranspiration, and average daily temperature. Their results indicated that bootstrap resampling increased the internal accuracy of the models [25]. Previous research shows that the significant weather parameters and their coefficients in the proposed regression models are location based and depend on the climate of the study region. Therefore, using these regression techniques without local calibration would suffer from low accuracy, particularly when the record is short. In addition, no significant attempt has so far been made to estimate rainfed wheat yield in Fars Province. Accordingly, the purpose of this research is to provide a more accurate statistical model for estimating rainfed wheat yield from weather parameters. In other words, we evaluated the feasibility of using backward multiple linear regression (BMLR) based on relative importance metrics to determine the most important meteorological parameters, as independent variables, for rainfed wheat yield in Fars Province, south of Iran, during 2006-2013.

Study Area.
This research was conducted for Fars Province, situated between 27°02′ and 31°42′ N and between 50°42′ and 55°38′ E (Figure 1). Fars is in the south of Iran and has three distinct climatic zones: (a) the north and northwest include mountains that have moderately cold winters and mild summers; (b) the center of the province has relatively rainy mild winters and hot dry summers; (c) the south and southeast areas have cold winters with hot summers.
Fars Province, with an area of about 133,000 km², is the fourth largest province of Iran. The important districts with their major cities are Shiraz, Marvdasht, Jahrom, Fasa, Abadeh, Eghlid, Estahban, Firouzabad, Kazeroun, and Lar. The population of the province in 2017 was 4,851,274, of which 67.6% were urban dwellers, 32.1% were rural dwellers, and 0.3% were nomadic tribes. The major activities of the inhabitants are industry, agriculture, and the service sector. Wheat, barley, fig, walnut, citrus fruits (especially lemon), dates, apple, pomegranate, beet, cotton, various grains, and saffron are the major agricultural products. The chemical and petrochemical, metal, electrical and electronics, leather, cellulose, food, and medicinal industries are the main industrial activities in this province.

Data.
The rainfed wheat yield data for the Fars Province districts of Abadeh, Eghlid, Estahban, Jahrom, Darab, Zarindasht, Sepidan, Shiraz, Farashband, Kazeroun, Lar, Lamerd, and Marvdasht were obtained from the Agriculture Organization of Fars Province for the period 2006-2013. This crop is cultivated in Fars Province under irrigated and rainfed conditions from October to June. The yield was expressed as the average grain production (kg/ha) for the harvested area.
Furthermore, the necessary weather parameters, including minimum, maximum, and average temperature ($T_{\min}$, $T_{\max}$, and $T_{\text{avg}}$), minimum, maximum, and average relative humidity ($RH_{\min}$, $RH_{\max}$, and $RH_{\text{avg}}$), wind speed, sunshine hours, reference evapotranspiration ($ET_0$), rain, and rainy days of the regions over 2006-2013, were obtained from the I.R. of Iran Meteorological Organization (IRIMO). A simple average method was applied to fill the missing data. The standard normal homogeneity test at a 5% significance level was carried out on the dataset to detect any nonhomogeneity. The descriptive statistics of the climate parameters at the studied stations are given in Table 1.
The climate conditions of the studied stations were determined using the De Martonne aridity index [26].
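For reference, the De Martonne aridity index is commonly written as annual precipitation divided by mean annual temperature plus 10; lower values correspond to drier climates. The numbers in the worked line below are illustrative assumptions, not station data from this study.

```latex
% De Martonne aridity index
% P: mean annual precipitation (mm), T: mean annual air temperature (deg C)
I_{DM} = \frac{P}{T + 10}

% Illustrative values (assumed): P = 300 mm, T = 18 deg C
I_{DM} = \frac{300}{18 + 10} \approx 10.7
```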

Data Analysis.
The obtained data were introduced to the model one by one according to their values and were analyzed using the Minitab and R programs. To determine the effect of the factors on yield, a set of backward multiple linear regressions based on relative importance metrics was run.

Multiple Linear Regression.
Multiple linear regression (MLR) is a flexible technique that is suitable whenever a dependent parameter $Y$ is to be investigated in relation to other parameters $X_1, X_2, \ldots, X_k$ (the independent parameters). The general form of MLR is

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik} + \varepsilon_i, \tag{1}$$

where $\beta_0, \beta_1, \ldots, \beta_k$ are the coefficients of the equation and $\varepsilon_i$, $i = 1, \ldots, n$, are the random components of the equation, which follow independent normal distributions with mean 0 and variance $\sigma^2$.
The coefficients $\beta_0, \beta_1, \ldots, \beta_k$ were estimated from the dataset. The predictive form of the MLR technique is

$$\hat{Y}_i = b_0 + b_1 X_{i1} + b_2 X_{i2} + \cdots + b_k X_{ik}, \tag{2}$$

where $b_0, b_1, \ldots, b_k$ are estimates of the model coefficients and $\hat{Y}_i$ is the predicted value of $Y_i$. The MLR technique can be rewritten in the following matrix form:

$$Y = X\beta + \varepsilon, \tag{3}$$

where $Y = (y_1, \ldots, y_n)^T$ is the response vector, $\beta = (\beta_0, \beta_1, \ldots, \beta_k)^T$ is the coefficient vector, $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)^T$ is the error vector, and $X$ is an $n \times (k + 1)$ full-rank design matrix whose first column is $(1, \ldots, 1)^T$ and whose remaining columns contain the observations of the predictors $X_1, \ldots, X_k$. In a model without an intercept ($\beta_0 = 0$), the column $(1, \ldots, 1)^T$ should be removed from $X$. The ordinary least squares (maximum likelihood) estimate of the coefficient vector $\beta$ is

$$\hat{\beta} = (X^T X)^{-1} X^T Y. \tag{4}$$

This usual procedure assumes that there are enough observations to say something meaningful about $\beta$.
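The least squares estimate described above can be illustrated with a minimal sketch. It solves the normal equations $(X^T X) b = X^T Y$ in plain Python so that it runs without third-party packages; the function names and the toy data are assumptions made for illustration and are not taken from the study, which used Minitab and R.

```python
def solve_linear_system(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular: the design matrix is not full rank")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_fit(rows, y, intercept=True):
    """Return the coefficient estimates b solving (X'X) b = X'y."""
    # Prepend the column of ones unless a no-intercept model is requested.
    design = [([1.0] if intercept else []) + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Toy data (hypothetical, not from the study): y = 1 + 2*x1 + 3*x2 exactly.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefficients = ols_fit(rows, y)  # intercept first, then one slope per predictor
```

Because the toy response is an exact linear function of the predictors, the recovered coefficients match the generating values up to floating-point error. In practice a numerically more stable routine (for example a QR-based least squares solver) would normally be preferred over forming $X^T X$ explicitly.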

Backward Multiple Linear Regression (BMLR) Based on Relative Importance Metrics.
As can be observed, the MLR technique includes the linear effects of $X_1, X_2, \ldots, X_k$. However, because of collinearity among $X_1, X_2, \ldots, X_k$, some of these effects may not be significant (p > 0.05). In this case, the backward method (BMLR) is applied and the nonsignificant parameters are eliminated step by step. The final equation has a parsimonious set of parameters and acceptable accuracy. The usual BMLR technique eliminates predictors from the model based on their p values. However, the reported regression estimates do not take the model-building process into account, and it is therefore advised to stop this common practice [27]. There are numerous widely discussed limitations of stepwise methods, such as misleadingly small p values that are not adjusted for the iterative fitting and biased $R^2$ measures [28]. For choosing an optimal model, there are systematic criteria including the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and adjusted $R^2$. There are many recommendations on the application of variable selection methods, e.g., [29]. In the usual BMLR technique, the relative importance of predictors is investigated using standardized regression coefficients. However, this raises several issues, listed in [30,31]:
(i) Under multicollinearity, regression coefficients, including standardized regression coefficients, are not interpretable.
(ii) High multicollinearity may lead not only to serious distortions in the estimated magnitudes of the regression coefficients but also to reversals in their signs.
(iii) Under multicollinearity, regression coefficients are not reliable indicators of relative importance, because they do not provide a natural decomposition of $R^2$.
To solve this problem, several measures have been proposed for decomposing $R^2$. The LMG measure proposed by Lindeman, Merenda, and Gold is one of the recommended metrics and is available through the R package "relaimpo" [32]. Several measures have also been benchmarked against each other for variable selection [33,34].
In this work, the BMLR technique based on LMG relative importance metric (BMLR-LMG) was applied to analyze the observed dataset.
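To make the LMG idea concrete, the sketch below computes LMG shares in plain Python. LMG assigns each predictor the increase in $R^2$ obtained when it is added to a model, averaged over all orderings of the predictors; the code uses the equivalent weighted sum over subsets. It is an illustrative stand-in, not the "relaimpo" implementation used in the study, and the function names and toy data are assumptions.

```python
from itertools import combinations
from math import factorial


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit (with intercept) of y on the given predictor columns."""
    n = len(y)
    mean_y = sum(y) / n
    sst = sum((v - mean_y) ** 2 for v in y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * v for d, v in zip(design, y)) for i in range(p)]
    b = solve(xtx, xty)
    sse = sum((v - sum(bj * dj for bj, dj in zip(b, d))) ** 2
              for d, v in zip(design, y))
    return 1.0 - sse / sst


def lmg(predictors, y):
    """LMG share of R^2 for each predictor in the dict name -> column."""
    names = list(predictors)
    p = len(names)
    cache = {}

    def r2(subset):
        key = frozenset(subset)
        if key not in cache:
            cache[key] = r_squared([predictors[k] for k in sorted(key)], y)
        return cache[key]

    shares = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(p):
            # Fraction of orderings in which exactly `size` predictors precede `name`
            # for one particular preceding set.
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                total += weight * (r2(subset + (name,)) - r2(subset))
        shares[name] = total
    return shares


# Toy data (hypothetical, not from the study).
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [2, 1, 4, 3, 6, 5, 8, 7],
    "x3": [5, 3, 6, 2, 7, 1, 8, 4],
}
y = [3, 4, 8, 7, 12, 10, 17, 14]
shares = lmg(predictors, y)
full_r2 = r_squared([predictors[k] for k in sorted(predictors)], y)
```

By construction the shares are non-negative and add up to the $R^2$ of the full model, which is the "natural decomposition" that standardized coefficients lack. The subset form needs $2^k$ model fits, which remains manageable for the 11 predictors considered here.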

Results and Discussion
The first subsection presents the descriptive statistics (means and standard deviations) of the variables under investigation. The second subsection reports the results of the BMLR procedure used to investigate the effect of the factors on yield.

Descriptive Statistics.
The descriptive statistics of the studied factors are presented in Table 2. According to these data, during 2006-2013, the rainfed wheat yield in Fars Province ranged from 0 to 2431.22 kg/ha. The average yield of rainfed wheat in this region was 619.20 kg/ha, which is low compared with the average of 1181 kg/ha for Iran. The low yield of rainfed wheat in Fars Province compared with Iran as a whole is mainly due to successive droughts and poor agricultural management [2].

Effective Parameters on Yield.
In this part, the impact of different weather parameters on yield was investigated. The yield was the response variable and the other parameters were continuous predictors. The general form of the MLR was as follows:

$$\text{Yield}_i = \beta_0 + \sum_{j=1}^{11} \beta_j X_{ij} + \varepsilon_i, \tag{5}$$

where the independent variables $X_1, \ldots, X_{11}$ are the meteorological parameters. The BMLR-LMG technique was applied using R software. At first, all parameters were introduced and the MLR model was run. The results are summarized in Table 3.
The results showed that, because of collinearity, some of the variables were nonsignificant (p value > 0.05). Therefore, BMLR-LMG was applied to eliminate the worst parameter (the one with the smallest LMG, $T_{\min}$) in the next run. $T_{\min}$ was eliminated, and the model was run again. This operation proceeded step by step until only significant parameters remained. The parameters omitted at each step of the BMLR-LMG technique are summarized in Table 4.
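The elimination loop just described can be sketched as follows. The study stops when all remaining parameters are significant by p value; computing p values needs a t distribution that the Python standard library does not provide, so this sketch substitutes an assumed stopping rule: stop once every remaining predictor holds at least `min_share` of the explained variance. That rule, the function names, and the toy data are all illustrative assumptions, not the authors' procedure.

```python
from itertools import combinations
from math import factorial


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit (with intercept) of y on the given predictor columns."""
    n = len(y)
    mean_y = sum(y) / n
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * v for d, v in zip(design, y)) for i in range(p)]
    b = solve(xtx, xty)
    sse = sum((v - sum(bj * dj for bj, dj in zip(b, d))) ** 2
              for d, v in zip(design, y))
    return 1.0 - sse / sum((v - mean_y) ** 2 for v in y)


def lmg(predictors, y):
    """LMG share of R^2 per predictor (weighted sum over subsets)."""
    names = list(predictors)
    p = len(names)
    shares = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(p):
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                with_name = r_squared([predictors[k] for k in subset + (name,)], y)
                without = r_squared([predictors[k] for k in subset], y)
                total += weight * (with_name - without)
        shares[name] = total
    return shares


def backward_by_lmg(predictors, y, min_share=0.20):
    """Repeatedly drop the predictor with the smallest LMG share.

    Assumed stopping rule: stop when the weakest remaining predictor
    accounts for at least min_share of the explained variance.
    """
    kept = dict(predictors)
    removed = []
    while len(kept) > 1:
        shares = lmg(kept, y)
        weakest = min(shares, key=shares.get)
        if shares[weakest] / sum(shares.values()) >= min_share:
            break
        removed.append(weakest)
        del kept[weakest]
    return list(kept), removed


# Toy data (hypothetical): orthogonal coded predictors; x3 contributes very little.
x1 = [1, 1, 1, 1, -1, -1, -1, -1]
x2 = [1, 1, -1, -1, 1, 1, -1, -1]
x3 = [1, -1, 1, -1, 1, -1, 1, -1]
y = [10 + 3 * a + 2 * b + 0.1 * c + 0.5 * a * b * c for a, b, c in zip(x1, x2, x3)]
kept, removed = backward_by_lmg({"x1": x1, "x2": x2, "x3": x3}, y)
```

In the toy example only `x3` falls below the threshold, so it is dropped in the first pass and the loop stops with `x1` and `x2` retained. The 20% default merely echoes the LMG level reported for the retained parameters in this study; it is not a general recommendation.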
Finally, Table 5 presents the results of the final run. The influence of each parameter in the final model was analyzed using the LMG values. According to these results, the relative importance of $RH_{\min}$, $RH_{\text{avg}}$, sunshine hours, and rainy days for yield was significant (LMG above 20%). Sunshine hours had the largest LMG (34.73%) and was, therefore, the most effective parameter, followed by rainy days (21.97%), $RH_{\min}$ (21.69%), and $RH_{\text{avg}}$ (21.62%).

(6)
According to the final model, rainy days, $RH_{\min}$, and $RH_{\text{avg}}$ have, in that order, the strongest positive effects on rainfed wheat yield in Fars Province. In other words, the higher the number of rainy days, the minimum relative humidity, and the average relative humidity, the higher the yield. Pishbahar and Darparnian found that, for dry-farmed wheat in warm climates of Iran, lack of adequate heat during planting time (October), overheating during the initial growth period (December and January), and lack of adequate rainfall during the initial growth period (November and December) were the systematic risk factors [35]. That rainy days has the largest positive impact on rainfed wheat yield in this study reflects the nonuniform temporal distribution of rainfall in the southern areas of Iran [3]. Barkley et al. analyzed the effect of weather parameters on wheat yield across Kansas. Their results indicated that the most decisive parameter for wheat yield is often rainfall distribution [36]. In fact, in rainfed regions, crop production depends completely on the frequency and distribution of rainfall. An excess or scarcity of rainfall can, therefore, adversely affect yield, particularly at crucial wheat growth stages [5]. Similar results have been reported by Abi Saab et al. [37].
Also, many researchers conclude that rainfall is the most significant climatic parameter affecting crop growth and production in rainfed areas [5,16,38–40]. Holman et al. showed the impact of growing season rainfall on wheat yields in western Kansas [41]. In fact, total rainfall is not the only parameter that drives increases or reductions in yield; an adequate amount of rainfall at different crop growth stages is essential for maximizing yield.
Given the prevailing dry and semidry climate of the studied area, minimum relative humidity and average relative humidity play a strong positive role in rainfed wheat productivity.
In this study, the only weather parameter with a negative effect on rainfed wheat yield was sunshine hours. This negative effect may be because cereal crops after anthesis are more susceptible to temperature and sunshine, which act principally on the production of carbohydrate to fill the grain rather than on the sink capacity of the grain [26]. In the studied area, wheat is planted near the onset of autumn rainfall and fills its grains during spring, when rainfall is declining and evaporation is increasing. Therefore, the crop may be exposed to a postanthesis water deficit. Lollato et al. showed that cumulative solar radiation and average $T_{\max}$ had, respectively, a strong positive and a strong negative impact on wheat yield during the anthesis-physiological maturity period [42]. Contrary to our findings, Chaurasia et al. found a positive effect of sunshine hours on wheat yield in central Punjab, Pakistan [43]. Sunshine hours is an effective parameter for preventing conditions favorable to the multiplication of pests and diseases, even in areas with good cloud cover [38]. In general, solar radiation in Fars Province is high and the plants do not suffer from a shortage of radiation energy. Therefore, in all regions of the province, especially in the northern areas, more cloudiness and fewer sunshine hours are directly related to rainfall. Consequently, in such arid and semiarid areas, the occurrence of rainfall and the satisfaction of the water requirement of rainfed crops are very important. Finally, the goodness of fit of the model was investigated using the coefficient of determination ($R^2$), the adjusted coefficient of determination ($R^2_{\text{adj}}$), the root mean square error (RMSE), residual analysis, and a comparison between the real and fitted values of rainfed wheat yield.
A lower RMSE and higher $R^2$ and $R^2_{\text{adj}}$ indicate a better fit; the values obtained here were 12.4, 0.944, and 0.895, respectively. The performance of the proposed model in determining the effects of weather parameters on rainfed wheat yield compares favorably with other studies in the literature (Table 6). The end of the wheat growing season falls in summer. In the study area, because of the absence of precipitation and the lack of cloudy days at this time, the meteorological parameters in the proposed model vary little from year to year. Since reliable early wheat production forecasts are very useful for policymakers, these parameters can be approximated from long-term statistics or forecast data and then used to predict the yield before harvest. This information can be very important for producers and planners. Next, residual analysis was used to investigate the goodness of fit. Independent, normally distributed residuals with constant variance indicate a well-fitted predictive model. The quantile-quantile (QQ) plot and the Kolmogorov-Smirnov (K-S) normality test were used to assess the normality of the residuals. According to Figure 2, the normal probability plot supports the normality of the residuals because the points lie close to the line y = x. Normality was also supported by the K-S test (p value > 0.05).
In addition, according to Figures 3 and 4, the residuals plotted against time and against fitted values were scattered randomly about the horizontal axis (y = 0). Hence, the independence and constant variance of the residuals were satisfied. Figure 5 shows the real values versus the fitted values of rainfed wheat yield. As can be seen, the points lie close to the 1 : 1 line. Consequently, the BMLR-LMG model fitted the rainfed wheat yield in Fars Province during 2006-2013 well.
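The three summary statistics used in the goodness-of-fit assessment can be computed as in the following sketch. It assumes the common definitions $R^2 = 1 - SSE/SST$, $R^2_{\text{adj}} = 1 - (1 - R^2)(n - 1)/(n - k - 1)$, and $\text{RMSE} = \sqrt{SSE/n}$; the source does not state which RMSE variant was used, and the function name and numbers are illustrative.

```python
from math import sqrt


def goodness_of_fit(observed, fitted, n_predictors):
    """Return (R^2, adjusted R^2, RMSE) for a model with an intercept."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))  # residual sum of squares
    sst = sum((o - mean_obs) ** 2 for o in observed)           # total sum of squares
    r2 = 1.0 - sse / sst
    # Penalize R^2 for the number of predictors k relative to the sample size n.
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = sqrt(sse / n)
    return r2, r2_adj, rmse


# Toy values (hypothetical, not from the study).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, r2_adj, rmse = goodness_of_fit(observed, fitted, n_predictors=1)
```

Adjusted $R^2$ is always at most $R^2$ and falls further below it as predictors are added without a matching gain in fit, which is why it is reported alongside $R^2$ when models with different numbers of parameters are compared.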
To assess the validity and robustness of the BMLR-LMG model, its final equation was applied to estimate the rainfed wheat yield of the year 2014. Figure 6 shows the real values versus the predicted values of yield. The results were also compared with the usual backward MLR (BMLR) and forward MLR (FMLR) techniques. As can be observed in Table 7, the BMLR-LMG model had the maximum value of $R^2$ and the minimum value of RMSE. Therefore, the BMLR-LMG model is a robust and capable technique for estimating rainfed wheat yield compared with the alternatives.

Conclusion
The weather parameters contributing most to rainfed wheat yield were examined using backward multiple linear regression analysis based on relative importance metrics in arid and semiarid Fars Province, southern Iran, during 2006-2013. As Pishbahar and Darparnian concluded, the cultivation of rainfed wheat is at higher risk in warm climates than in moderate and cold areas [35]. The results of the current study indicated that rainy days, minimum relative humidity, and average relative humidity have a significant positive impact on rainfed wheat yield. Sunshine hours was highly significant and negatively correlated with rainfed wheat yield.
Because of the lack of appropriate rainfall distribution in the arid regions of southern Iran [44,45], the most significant positive parameter is the number of rainy days. The significant positive effect of minimum and average relative humidity is due to the prevailing dry and semidry climate in the study area and, consequently, the crucial role of relative humidity in reducing water deficit stress in wheat. Sunshine hours is the only weather parameter that has a significant negative effect on rainfed wheat yield in the study area. This may be due to the susceptibility of cereals to temperature and sunshine after anthesis.
As the findings show, this technique advances the direct application of weather parameters in the linear regression method. Moreover, data on these weather variables are readily accessible; hence, this simple linear regression technique can be regarded as a suitable tool for predicting rainfed wheat yield in Fars Province, southern Iran.

Data Availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Disclosure
The sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Conflicts of Interest
The authors declare no conflicts of interest.

Figure 6: Real values of yield versus predicted values for the BMLR-LMG model.