A Committee Machine with Intelligent Systems for Estimating Monthly Mean Reference Evapotranspiration in an Arid Region

The aim of this research is to estimate the reference evapotranspiration (ET0), as given by the FAO-56 Penman-Monteith (PM) equation, for Basrah City, southern Iraq, using several climatic inputs: maximum monthly mean air temperature, minimum monthly mean air temperature, monthly mean relative humidity and monthly mean wind speed. Three artificial intelligence systems, namely the generalized regression neural network (GRNN), the multi-layer perceptron (MLP) and the adaptive neuro-fuzzy inference system (ANFIS), were used to predict ET0. Root mean squared error and coefficient of determination were used as criteria for comparing the performance of all the developed models. The results showed that, for the different input combinations, the MLP models perform better than the ANFIS models and slightly better than the GRNN models. A Committee Machine with Intelligent Systems (CMIS) was then constructed to estimate ET0 by integrating the ET0 predictions of GRNN, MLP and ANFIS, each with a weight factor representing its contribution to the overall estimate. The results illustrated that the CMIS performs better than any of the individual artificial intelligence systems in predicting ET0.


INTRODUCTION
Evapotranspiration (ET) denotes all processes that convert water at the Earth's surface into water vapor. ET is an essential component of the global water, energy and carbon cycles and thus provides a link between the atmosphere and the Earth's surface (Tang et al., 2014). Accurate estimation of ET is important for studying the hydrological water balance, designing irrigation systems, simulating crop yield and planning water resources projects efficiently (Kumar et al., 2011). However, ET is a complex process because it depends on several factors, such as weather conditions and the growth stage of the crop (Trajkovic and Kolakovic, 2009). To avoid the need to calibrate a separate ET equation for each crop and growth stage, the concept of reference evapotranspiration (ET0) was introduced by Allen et al. (1998). ET0 is defined as the rate of ET from a hypothetical crop with an assumed height of 0.12 m, a fixed surface resistance of 70 s/m and an albedo of 0.23, which closely resembles ET from an extensive surface of green grass of uniform height that is actively growing, well watered and completely shading the ground.
The importance of ET0 in hydrological and agricultural studies has led to the development of different instruments and methodologies to estimate it. ET0 can be measured directly, using either a field lysimeter or a water balance approach, or estimated indirectly from climatological data (Kumar et al., 2011). Unfortunately, lysimeter data are very limited or sometimes non-existent in developing countries. Because of these difficulties, indirect ET0 estimation methods, which depend essentially on easily obtained meteorological data, have become more popular. In recent decades, numerous methods, classified as temperature-based, radiation-based, pan-evaporation-based and combination-type, have been developed for estimating ET0 (Trajkovic and Kolakovic, 2009). One of the most widely used is the FAO Penman-Monteith method.
The Penman-Monteith method is an accurate method for estimating evapotranspiration and can be used in different regimes (Kumar et al., 2002). Its effectiveness for estimating ET0 and for evaluating other equations has been indicated by many studies (Pereira and Pruitt, 2004; López-Urrea et al., 2006; Gavilán et al., 2006). The main advantages of the Penman-Monteith equation are (Landeras et al., 2008):
• It is applicable in different environments under different climatic scenarios without local calibration
• The adapted equation has been validated using lysimeter data under a wide range of climatic conditions.
The main disadvantage of this method is that it requires a large number of climatic variables, such as air temperature, relative humidity, solar radiation and wind speed, which are not always available at meteorological stations or are missing for certain periods. To fill this gap, many researchers have used artificial intelligence techniques such as Artificial Neural Networks (ANNs), Adaptive Neuro-Fuzzy Inference Systems (ANFIS) and Genetic Programming to estimate ET0, with promising and successful results.
Most previous studies applied one or more techniques for estimating ET0 independently. A Committee Machine (CM), or committee neural network, has a parallel architecture that produces a final output by combining the results of individual experts (Haykin, 1991). The experts may be neural networks, empirical formulas or other algorithms (Chen and Lin, 2006). The main advantage of the CM technique is that it can lead to significant improvements in performance on new data with little extra computational effort; the combined response of the CM is generally superior to those of its constituent experts. The efficacy of CM for estimating ET0 has not yet been investigated. Therefore, the objective of this study is to use three intelligent systems, namely the Generalized Regression Neural Network (GRNN), the Multi-Layer Perceptron (MLP) and the Adaptive Neuro-Fuzzy Inference System (ANFIS), together with a CM to develop a more accurate model for estimating ET0 from available meteorological data in an arid region. Basrah City in southern Iraq was selected to demonstrate the adopted methodology. The optimal weights of the CM are computed, for the first time, using the Pattern Search (PS) optimization technique.

MODELING TECHNIQUES
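Penman-Monteith method: The FAO-56 PM method is recommended as a highly accurate method for determining ET0. It is a physically based approach that requires measurements of air temperature, relative humidity, solar radiation and wind speed as inputs. In this study, the FAO-56 PM method was used as the reference model for assessing the performance of the approaches used. The FAO-56 PM equation is given by Allen et al. (1998):

$$ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,U_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,U_2)} \qquad (1)$$

where,
ET_0 : The reference evapotranspiration (mm/day)
R_n : The net radiation at the crop surface (MJ/m^2/day)
G : The soil heat flux density (MJ/m^2/day)
T : The mean air temperature at 2 m height (°C)
U_2 : The wind speed at 2 m height (m/sec)
e_s, e_a : The saturation and actual vapor pressure, respectively (kPa)
Δ : The slope of the vapor pressure curve (kPa/°C)
γ : The psychrometric constant (kPa/°C)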
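The FAO-56 PM calculation can be illustrated with a short Python sketch. The function names and the example inputs are hypothetical: the inputs are illustrative values, not Basrah records, and the default psychrometric constant assumes a site near sea level. The saturation vapor pressure and slope formulas are the standard FAO-56 expressions.

```python
import math

def sat_vapour_pressure(t_c):
    """Saturation vapour pressure (kPa) at air temperature t_c (deg C), FAO-56 form."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))

def fao56_pm_et0(t_max, t_min, ea, u2, rn, g=0.0, gamma=0.0674):
    """Reference evapotranspiration (mm/day) from the FAO-56 PM equation.

    t_max, t_min : mean maximum / minimum air temperature (deg C)
    ea           : actual vapour pressure (kPa)
    u2           : wind speed at 2 m (m/s)
    rn, g        : net radiation and soil heat flux (MJ/m^2/day)
    gamma        : psychrometric constant (kPa/deg C); default assumes ~sea level
    """
    t_mean = (t_max + t_min) / 2.0
    # Mean saturation vapour pressure from the two temperature extremes
    es = (sat_vapour_pressure(t_max) + sat_vapour_pressure(t_min)) / 2.0
    # Slope of the vapour pressure curve at the mean temperature
    delta = 4098.0 * sat_vapour_pressure(t_mean) / (t_mean + 237.3) ** 2
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))

# Illustrative monthly inputs for a warm, humid site (not data from this study)
et0 = fao56_pm_et0(t_max=34.8, t_min=25.6, ea=2.85, u2=2.0, rn=14.33, g=0.14)
print(f"ET0 = {et0:.2f} mm/day")
```

In this study the relative humidity would first have to be converted to an actual vapor pressure and the net radiation estimated from the station data; those steps are omitted here.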

Generalized Regression Neural Network (GRNN):
GRNN is a variation of radial basis neural networks designed for function approximation and regression (Alilou and Yaghmaee, 2015). GRNN is a universal approximator for smooth functions, so it can approximate any such function and estimate any continuous variable given enough data (Disorntetiwat, 2001). It is a one-pass learning algorithm with a highly parallel structure (Specht, 1991). GRNN consists of four layers (Fig. 1): the input layer, the pattern layer, the summation layer and the output layer (Barzegar et al., 2016). The number of input units in the input layer depends on the total number of observation parameters (Hannan et al., 2010). The summation layer has a numerator part and a denominator part, and their ratio yields the predicted value for an unknown input vector x (Specht, 1991):

$$Y(x) = \frac{\sum_{i=1}^{n} W_i \exp[-D(x, x_i)]}{\sum_{i=1}^{n} \exp[-D(x, x_i)]} \qquad (2)$$

$$D(x, x_i) = \sum_{k=1}^{m} \left(\frac{x_k - x_{ik}}{\sigma}\right)^2 \qquad (3)$$

where,
W_i : The weight connection between the i-th neuron in the pattern layer and the summation neuron
n : The number of training patterns
D : The Gaussian function
m : The number of elements of an input vector
x_k, x_ik : The k-th element of x and x_i, respectively
σ : The spread parameter, whose optimal value is determined experimentally

During the training process, the error is measured by the Mean Squared Error (MSE). Training is repeated several times with different spread factors until the network is optimized according to the minimum MSE or a pre-defined threshold value (Kisi et al., 2015).
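The GRNN estimate described above, a kernel-weighted average of the training targets, can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the study: the function name and the toy data are hypothetical, and the pattern-layer weights W_i are assumed to be the training targets, as in Specht's formulation.

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, sigma):
    """GRNN output for each query row, with spread parameter sigma.

    x_train : (n, m) training inputs, y_train : (n,) training targets
    x_query : (q, m) inputs to estimate
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))
    # D(x, x_i) = sum_k ((x_k - x_ik) / sigma)^2 for every query/pattern pair
    diff = (x_query[:, None, :] - x_train[None, :, :]) / sigma
    d = np.sum(diff ** 2, axis=2)
    # Subtracting the row minimum avoids underflow; it cancels in the ratio
    kernel = np.exp(-(d - d.min(axis=1, keepdims=True)))
    # Numerator (weighted sum of targets) over denominator (sum of kernels)
    return kernel @ y_train / kernel.sum(axis=1)

# Toy one-dimensional example (illustrative data only)
x_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0.0, 1.0, 4.0, 9.0])
pred = grnn_predict(x_train, y_train, [[1.0], [1.5]], sigma=0.3)
print(pred)
```

A small spread makes the estimate follow the nearest training pattern closely, while a large spread pulls it toward the mean of all targets, which is why the spread is tuned against the MSE.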

Multi-Layer Perceptron (MLP):
The limitations of single-layer artificial neural networks led to the development of multi-layer feed-forward networks with one or more hidden layers, called Multi-Layer Perceptron (MLP) networks, which overcome many of those limitations. An MLP is a feed-forward artificial neural network that maps sets of input data onto a set of appropriate outputs; its computation is performed by many simple units with weighted connections between them. It consists of multiple layers of nodes, each layer fully connected to the next. Each node is a neuron (or processing element) with a nonlinear activation function. MLP is trained with a supervised learning technique called back-propagation (Rumelhart and McClelland, 1986).
Learning occurs in the perceptron by changing the connection weights after each piece of data is processed, based on the error in the output compared with the expected result (target). Figure 2 shows a two-layer feed-forward neural network with sigmoid hidden neurons and linear output neurons. This network includes a nonlinear activation function. The important point is that the nonlinearity is smooth (i.e., differentiable everywhere), as opposed to the hard limiting used in Rosenblatt's perceptron. A commonly used nonlinearity that satisfies this requirement is the sigmoid defined by the logistic function in Eq. (4):

$$y_j = \frac{1}{1 + \exp(-v_j)} \qquad (4)$$

where,
y_j : The output of neuron j
v_j : The induced local field of neuron j (i.e., the weighted sum of all synaptic inputs plus the bias)
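The explicit expression for an output value of the MLP is given by Eq. (5) (Nourani and Babakhani, 2013):

$$\hat{y}_k = f_o\left[\sum_{j=1}^{M_N} w_{kj}\, f_h\left(\sum_{i=1}^{N_N} w_{ji}\, x_i + w_{j0}\right) + w_{k0}\right] \qquad (5)$$

where x_i is the i-th input variable, \hat{y}_k is the computed output, w_ji and w_j0 are the weights and bias of the j-th hidden neuron, w_kj and w_k0 are the weights and bias of the k-th output neuron, f_h and f_o are the activation functions of the hidden and output neurons, and N_N and M_N are the numbers of neurons in the input and hidden layers, respectively.

MLP has been applied successfully to difficult problems in different fields by training it with the widely used error back-propagation algorithm, which is based on the error-correction learning rule.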
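The forward computation of a two-layer network with logistic hidden neurons and a linear output neuron, as described above, can be sketched as follows. The function names, the random weights and the input vector are hypothetical placeholders; a trained network would obtain its weights from back-propagation.

```python
import numpy as np

def logistic(v):
    """Logistic sigmoid applied to the induced local field v."""
    return 1.0 / (1.0 + np.exp(-v))

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: sigmoid hidden layer followed by a linear output layer."""
    hidden = logistic(w_hidden @ x + b_hidden)   # hidden-layer activations
    return w_out @ hidden + b_out                # linear output neuron(s)

# Four inputs and 20 hidden neurons, with untrained random weights
rng = np.random.default_rng(0)
n_inputs, n_hidden = 4, 20
w_hidden = rng.normal(size=(n_hidden, n_inputs))
b_hidden = rng.normal(size=n_hidden)
w_out = rng.normal(size=(1, n_hidden))
b_out = np.array([0.5])

x = np.array([0.2, -0.1, 0.7, 0.3])   # illustrative scaled inputs
y = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
print(y)
```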

Adaptive Neuro-Fuzzy Inference Systems (ANFIS):
The Adaptive Neuro-Fuzzy Inference System (ANFIS) is a fuzzy mapping algorithm based on the Takagi-Sugeno fuzzy inference system. It integrates both neural network and fuzzy logic principles (Loukas, 2001). The parameters associated with the membership functions change through the learning process. The computation (or adjustment) of these parameters is facilitated by a gradient vector, which provides a measure of how well the fuzzy inference system models the input/output data for a given set of parameters. Once the gradient vector is obtained, any of several optimization routines can be applied to adjust the parameters so as to reduce an error measure, usually defined as the sum of the squared differences between actual and desired outputs. In neuro-fuzzy systems, the shape of the membership functions is obtained by training them with input/output data rather than by specifying them manually. ANFIS consists of five layers (Fig. 3), whose basic functions are input, fuzzification, rule inference, normalization and defuzzification.
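The sequence of fuzzification, rule inference, normalization and defuzzification can be sketched for a first-order Takagi-Sugeno system as below. This is a minimal sketch under stated assumptions: Gaussian membership functions, a product T-norm, and hypothetical parameter arrays (`centers`, `sigmas`, `coeffs`) that a trained ANFIS would supply.

```python
import numpy as np

def gaussian_mf(x, center, sigma):
    """Gaussian membership grade of x for a fuzzy set with the given center and width."""
    return np.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def sugeno_forward(x, centers, sigmas, coeffs):
    """Output of a first-order Takagi-Sugeno system.

    x       : (n_inputs,) input vector
    centers : (n_rules, n_inputs) membership centers, one row per rule
    sigmas  : (n_rules, n_inputs) membership widths
    coeffs  : (n_rules, n_inputs + 1) consequent parameters [constant, slopes...]
    """
    x = np.asarray(x, dtype=float)
    memberships = gaussian_mf(x, centers, sigmas)    # fuzzification
    firing = memberships.prod(axis=1)                # rule inference (product T-norm)
    normalized = firing / firing.sum()               # normalization
    rule_out = coeffs[:, 0] + coeffs[:, 1:] @ x      # linear rule consequents
    return float(normalized @ rule_out)              # defuzzification (weighted sum)

# Two rules, two inputs, constant consequents of 1 and 3 (illustrative values)
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
sigmas = np.ones((2, 2))
coeffs = np.array([[1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
print(sugeno_forward([0.5, 0.5], centers, sigmas, coeffs))
```

At a point equidistant from both rule centers the two rules fire equally, so the output is the average of their consequents.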
ANFIS can be represented as a linear arrangement of the input variables and a constant term, as described by Eq. (6) (Hossen et al., 2013):

$$f_i = a_{i0} + \sum_{j=1}^{n} a_{ij}\, x_{jk}, \qquad w_i = \prod_{j=1}^{n} \mu_{A_{ij}}(x_{jk}), \qquad y_k = \frac{\sum_{i=1}^{M} w_i f_i}{\sum_{i=1}^{M} w_i} \qquad (6)$$

where,
i = 1, 2, …, M : The i-th fuzzy rule
j = 1, 2, …, n : The j-th input variable of the k-th pattern vector
A_ij : A fuzzy variable of the j-th input variable in the i-th rule
Π : A fuzzy T-norm operator
w_i : The rule firing strength of the i-th rule
f_i : The i-th rule output, with consequent parameters a_i0 and a_ij
y_k : The overall output

A clustering algorithm is used in this research. Clustering is usually employed to discover cluster centers, that is, to locate the heart (center) of each cluster (Stoffel et al., 2012). It provides a way of grouping the data points that populate a multidimensional space into a specific number of clusters (Elleithy, 2010).

… Arabia and Islamic Republic of Iran. Basrah is located between longitudes 47° 30' and 48° 30' and latitudes 30° 00' and 30° 30', as shown in Fig. 5. Basrah has a hot desert climate, like the rest of the surrounding region, though it receives slightly more precipitation than inland locations because of its proximity to the Arabian Gulf. During the summer months, from June to August, Basrah is consistently one of the hottest cities in the world, with temperatures regularly exceeding 50°C in July and August. In winter, Basrah experiences mild weather, with average high temperatures around 20°C; on some winter nights, minimum temperatures may fall to 0°C. The city experiences high humidity, sometimes exceeding 90%, because of its location close to the Arabian Gulf. Basrah is a relatively agricultural area where palm trees, fruit and vegetables are grown. It is also known for growing tomatoes in winter in the Safwan-Al Zubair area (south-west of the city center), which meets the tomato demand of other Iraqi provinces.
The climate data used in this research were obtained from the meteorological station in Hi Al-Hussain at the center of Basrah City (Fig. 5). The data set consists of 22 years (1991-2012) of monthly records of maximum monthly mean air temperature (Tmax), minimum monthly mean air temperature (Tmin), monthly mean Relative Humidity (RH) and monthly mean wind speed at 2 m above the ground surface (U2). A statistical summary of these variables, together with the ET0 obtained from the Penman-Monteith equation (FAO-56 PM), is presented in Table 1. RH shows low variation compared with T and U2. On the other hand, U2 has the lowest correlation with ET0 and a highly skewed distribution. Judging by the correlation values, all variables appear to have an effect on ET0. The inputs T, RH and U2 and the output ET0 were used to construct the intelligent models. Three models with different input combinations were developed for each intelligent system. The details and input variables of these models are shown in Table 2.
Two statistical measures, namely the Root Mean Squared Error (RMSE) and the Coefficient of Determination (R^2), are used to evaluate the performance of the developed models. They are computed as shown in Eq. (17) and (18):

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(O_i - P_i\right)^2} \qquad (17)$$

$$R^2 = \left[\frac{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)^2\,\sum_{i=1}^{N}\left(P_i - \bar{P}\right)^2}}\right]^2 \qquad (18)$$

where N is the number of observations, O_i and P_i are the FAO-56 PM and model-estimated ET0 values, respectively, and \bar{O} and \bar{P} are their means.

The RMSE shows the goodness of fit relevant to high values. The R^2 shows the degree to which two variables are linearly related (Karunanidhi et al., 1994).

In the case of GRNN, the output value is estimated as a weighted average of the training data set, where each weight is calculated from the Euclidean distance between the training and testing data. If the distance is large, the weight is very small; if the distance is small, the corresponding training output receives more weight. The decision required for each input combination is the selection of an appropriate smoothing factor. For the different input combinations, the optimum spread of the GRNN model was determined according to the MSE criterion. The resulting spread values are shown in Table 3.
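The two evaluation criteria can be computed with a few lines of NumPy. In this sketch R^2 is taken as the squared Pearson correlation between reference and estimated values, which is an assumption based on its description as a measure of linear relation; the function names and sample values are illustrative only.

```python
import numpy as np

def rmse(target, estimate):
    """Root mean squared error between reference and estimated values."""
    target = np.asarray(target, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean((target - estimate) ** 2)))

def r_squared(target, estimate):
    """Squared Pearson correlation between reference and estimated values."""
    r = np.corrcoef(target, estimate)[0, 1]
    return float(r ** 2)

# Illustrative values (not results from this study)
reference = [2.0, 4.0, 6.0, 8.0]
estimated = [2.5, 3.5, 6.5, 7.5]
print(rmse(reference, estimated), r_squared(reference, estimated))
```

Note that a squared correlation of one only indicates a perfect linear relation, so it is read together with the RMSE, which measures the actual size of the errors.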
A two-layer feed-forward network with sigmoid hidden neurons and linear output neurons is used in the present research. The network is trained with the Levenberg-Marquardt back-propagation algorithm. Many researchers have employed the Levenberg-Marquardt algorithm, which is an approximation to Newton's method, for adjusting the weights of ANN models because it is more powerful than conventional gradient descent techniques (Kişi, 2007). The optimal number of neurons in the hidden layer was determined by trial and error and found to be 20. The model is evaluated on a testing data set that is not used during the training phase. The 264 observations are divided into three parts. The training set (60%, 158 samples) is presented to the network during training, and the network is adjusted according to its error. The validation set (20%, 53 samples) is used to measure the generalization of the network and to halt training when generalization stops improving. The testing set is the same size as the validation set (20%, 53 samples); it has no effect on training and so provides an independent measure of network performance during and after training.
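The 158/53/53 partition of the 264 monthly records can be sketched as below. The source does not state whether the samples were assigned randomly or chronologically, so the random permutation, the seed and the function name are assumptions made only for illustration.

```python
import numpy as np

def split_indices(n_samples, n_val, n_test, seed=0):
    """Return disjoint train/validation/test index arrays from a random permutation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    test = order[:n_test]
    val = order[n_test:n_test + n_val]
    train = order[n_test + n_val:]   # everything left over is used for training
    return train, val, test

train_idx, val_idx, test_idx = split_indices(264, n_val=53, n_test=53)
print(len(train_idx), len(val_idx), len(test_idx))
```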
In this research, a subtractive clustering method is used to extract the clusters and the fuzzy if-then rules for the ANFIS model. Subtractive clustering is an attractive approach to the synthesis of ANFIS networks because it estimates the number of clusters and their locations automatically. In this method, each sample point is seen as a potential cluster center. The computation time is proportional to the data size but independent of the dimension of the problem under consideration. The key parameter in subtractive clustering, which controls the number of clusters and fuzzy if-then rules, is the clustering radius. This parameter lies in the range (0, 1).
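The idea that every sample is a candidate center, and that the radius controls how many centers are found, can be sketched with a simplified subtractive clustering routine. This is not the implementation used in the study: the single stopping ratio and the squash factor of 1.5 are assumptions, the full algorithm uses separate accept and reject thresholds, and this version builds the pairwise distance matrix explicitly for clarity.

```python
import numpy as np

def subtractive_clustering(data, radius, squash=1.5, stop_ratio=0.15):
    """Simplified subtractive clustering; returns the selected cluster centers."""
    data = np.asarray(data, dtype=float)
    # Scale every input to [0, 1] so that one radius applies to all dimensions
    span = data.max(axis=0) - data.min(axis=0)
    unit = (data - data.min(axis=0)) / np.where(span > 0, span, 1.0)
    sq_dist = ((unit[:, None, :] - unit[None, :, :]) ** 2).sum(axis=2)
    # Potential of each point: high when many neighbours lie within the radius
    potential = np.exp(-4.0 * sq_dist / radius ** 2).sum(axis=1)
    first_peak = potential.max()
    centers = []
    while True:
        k = int(np.argmax(potential))
        peak = potential[k]
        if peak < stop_ratio * first_peak:
            break   # remaining points are too weak to form a new cluster
        centers.append(data[k])
        # Subtract the influence of the new center from all potentials
        potential = potential - peak * np.exp(-4.0 * sq_dist[k] / (squash * radius) ** 2)
    return np.array(centers)

# Two tight groups of points (illustrative data)
points = np.array([[0.0, 0.0], [0.05, 0.0], [0.0, 0.05], [0.05, 0.05],
                   [1.0, 1.0], [0.95, 1.0], [1.0, 0.95], [0.95, 0.95]])
print(len(subtractive_clustering(points, radius=0.5)))
print(len(subtractive_clustering(points, radius=0.05)))
```

With the larger radius each group collapses to one center, while the smaller radius yields more centers and therefore more rules.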
The training error can be controlled by adjusting the clustering radius. Specifying a smaller cluster radius usually yields smaller clusters and more rules, whereas a large radius approaching one yields a few large clusters in the data and few rules. The optimum clustering radius is determined by running subtractive clustering several times with radius values in (0, 1), each of which leads to a different number of if-then rules. The best fuzzy model is selected according to the RMSE. The observations are divided into two parts. The training set (80%, 211 samples) is presented to the network during training, and the network is adjusted according to its error. The checking set (20%, 53 samples) is used for both checking and testing the parameters of the fuzzy inference system. Here, chkRMSE is the root mean square error of the system on the checking data. Table 4 shows that the best clustering radius, 0.3, is associated with the lowest chkRMSE, 1.1228, for model No. (7). The corresponding results for the other ANFIS models are given in Tables 5 and 6.

A Gaussian membership function (mf) is assigned to the extracted input clusters. The Gaussian function f(x) is given by Eq. (19):

$$f(x) = \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right] \qquad (19)$$

where,
μ, σ : The parameters of the normal distribution, representing the mean and standard deviation of the data, respectively.
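The mean represents the cluster center, while the standard deviation is calculated by the following function:

$$\sigma = \frac{r\,(x_{max} - x_{min})}{\sqrt{8}}$$

where r is the clustering radius and x_max and x_min are the maximum and minimum values of the input variable. The Gaussian membership function parameters for the ANFIS models are shown in Table 7.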
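The relation between the clustering radius and the width of a Gaussian membership function can be illustrated as follows. The width rule used here, radius times data range divided by the square root of eight, is the convention commonly used with subtractive clustering and is an assumption as far as this study is concerned; the function names and the temperature range are hypothetical.

```python
import math

def gaussian_mf(x, mean, sigma):
    """Gaussian membership grade; equals 1 at the cluster center."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

def sigma_from_radius(radius, x_min, x_max):
    """Membership width derived from the clustering radius and the data range."""
    return radius * (x_max - x_min) / math.sqrt(8.0)

# Illustrative input ranging from 10 to 50 with a clustering radius of 0.3
sigma = sigma_from_radius(0.3, x_min=10.0, x_max=50.0)
print(sigma, gaussian_mf(30.0, 30.0, sigma), gaussian_mf(30.0 + sigma, 30.0, sigma))
```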
The pattern search (PS) method was used to determine the optimal combination of weights for constructing the CMIS. The fitness function for PS is expressed in terms of the three weights assigned to the GRNN, MLP and ANFIS estimates.
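How a pattern search can assign weights to the three experts is sketched below. Several choices here are assumptions rather than details taken from the study: the fitness is taken to be the mean squared error of the weighted sum, the weights are unconstrained, the search is a basic compass (coordinate) variant of pattern search, and the expert outputs are synthetic stand-ins for the GRNN, MLP and ANFIS estimates.

```python
import numpy as np

def cmis_mse(weights, expert_outputs, target):
    """Mean squared error of the weighted combination of expert outputs."""
    combined = expert_outputs @ weights
    return float(np.mean((combined - target) ** 2))

def pattern_search(objective, start, step=0.25, tol=1e-6, max_iter=10_000):
    """Compass search: poll +/- step along each axis, halve the step on failure."""
    point = np.asarray(start, dtype=float)
    best = objective(point)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(point.size):
            for direction in (1.0, -1.0):
                trial = point.copy()
                trial[i] += direction * step
                value = objective(trial)
                if value < best:   # accept any improving poll point
                    point, best, improved = trial, value, True
        if not improved:
            step *= 0.5            # no poll point improved: refine the mesh
    return point, best

# Synthetic experts: the target plus independent noise of different sizes
rng = np.random.default_rng(42)
target = rng.uniform(2.0, 12.0, size=2000)
experts = target[:, None] + rng.normal(0.0, [0.4, 0.6, 0.8], size=(2000, 3))

start = np.full(3, 1.0 / 3.0)      # begin from simple averaging
start_mse = cmis_mse(start, experts, target)
weights, best_mse = pattern_search(lambda w: cmis_mse(w, experts, target), start)
single_mse = [cmis_mse(np.eye(3)[i], experts, target) for i in range(3)]
print(weights, best_mse, single_mse)
```

The search needs only function values, not gradients, and because it never accepts a worse point the final combination is at least as good as the simple average it starts from.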

Fig. 2: Two-layered feed-forward neural network
w_ji : A weight in the hidden layer connecting the i-th neuron in the input layer and the j-th neuron in the hidden layer
w_j0 : The bias for the j-th hidden neuron
f_h : The activation function of the hidden neuron
w_kj : A weight in the output layer connecting the j-th neuron in the hidden layer and the k-th neuron in the output layer
w_k0 : The bias for the k-th output neuron
f_o : The activation function for the output neuron
x_i : The i-th input variable for the input layer
\hat{y}_k : The computed output variable
N_N, M_N : The number of neurons in the input and hidden layers, respectively

Fig. 3: ANFIS

Fig. 4: Schematic

Tables 5 and 6 show the optimum value of the clustering radius with the lowest chkRMSE for model No. (8) and model No. (9), respectively.

Table 1: Monthly statistical summary of the data set used in this study

Table 3: GRNN spread values for different input combinations

Table 4: Clustering radius with the root mean square error generated by the checking data for model No. (7)

Table 7: Gaussian membership function parameters for the ANFIS models