Novel Hybrid Data-Intelligence Model for Forecasting Monthly Rainfall with Uncertainty Analysis

: In this research, three different evolutionary algorithms (EAs), namely, particle swarm optimization (PSO), genetic algorithm (GA) and differential evolution (DE), are integrated with the adaptive neuro-fuzzy inference system (ANFIS) model. The developed hybrid models are proposed to forecast rainfall time series. The capability of the proposed evolutionary hybrid ANFIS was compared with the conventional ANFIS in forecasting monthly rainfall for the Pahang watershed, Malaysia. To select the optimal model, sixteen different combinations of six different lag attributes taking into account the effect of monthly, seasonal, and annual history were considered. The performances of the forecasting models were assessed using various forecasting skill indicators. Moreover, an uncertainty analysis of the developed forecasting models was performed to evaluate the ability of the hybrid ANFIS models. The bound width of 95% conﬁdence interval (d-factor) and the percentage of observed samples which was enveloped by 95% forecasted uncertainties (95PPU) were used for this purpose. The results indicated that all the hybrid ANFIS models performed better than the conventional ANFIS and for all input combinations. The obtained results showed that the models with best input combinations had the (95PPU and d-factor) values of (91.67 and 1.41), (91.03 and 1.41), (89.74 and 1.42), and (88.46 and 1.43) for ANFIS-PSO, ANFIS-GA, ANFIS-DE, and the conventional ANFIS, respectively. Based on the 95PPU and d-factor, it is concluded that all hybrid ANFIS models have an acceptable degree of uncertainty in forecasting monthly rainfall. The results of this study proved that the hybrid ANFIS with an evolutionary algorithm is a reliable modeling technique for forecasting monthly rainfall.


Introduction
It is worth mentioning that rainfall is the only input element source for the hydrologic cycle. However, excessive rainfall and the scarcity of it on Earth affect the tremendous flooding and severe drought that occur over short and long intervals [1]. Rainfall forecasting, therefore, can prevent the many natural disasters, thereby saving human lives. The accuracy of rainfall forecasting can also help the preparation of efficient structural and non-structural designs for disaster management. The accuracy of rainfall time series forecasting depends on the methods used (e.g., stochastic or deterministic) for uncertainty mitigation. Deterministic dynamical forecasting models are developed based on the physical laws related to land-ocean-atmosphere interactions and are thus able to predict changes in rainfall due to changes in Earth's atmosphere. However, the forecasted rainfall provided by dynamical models is often prone to large error at the local scale [2]. Statistical models are simple to implement and operate and have been found to forecast more efficiently the smooth changes in rainfall at the local scale. Therefore, statistical models are often preferred for rainfall forecasting at the local scale [3][4][5][6]. Time series modeling has been a very interesting topic not only for hydrological problems, but also for numerous science and engineering applications [7,8]. For this research, the motivation was to explore reliable and robust hybrid intelligence models that were capable of mimicking the existing non-linear pattern in rainfall by analyzing the historical information and understanding the internal mechanism of the time series data.
Research studies on hybrid techniques using data-driven models, such as artificial neural networks (ANNs), genetic programming (GP), and adaptive neuro-fuzzy inference systems (ANFISs), integrated with different optimization methods (e.g., particle swarm optimization (PSO), genetic algorithm (GA), and differential evolution (DE) algorithm) [9] have been published over the past decades and demonstrated positive outcomes for solving hydrology and water resource problems, such as rainfall runoff, river stage, evaporation, sediment, and groundwater, etc. [10][11][12][13][14][15]. Abrahart et al. [16] used a pruning algorithm (PA) and a genetic algorithm (GA) to optimize data-driven models for runoff forecasting. They compared model performances using different forecasting methods. Chau [17] applied a particle swarm optimization (PSO) algorithm to train ANNs for river stage prediction. He found that PSO could optimize the applied ANNs. In other research, Chau [18] developed a split-step particle swarm optimization (SSPSO) algorithm to train a multilayer perceptron (MLP) for forecasting river stage using different lag times. The author concluded that MLP-SSPSO forecasted the river stage more accurately compared to MLP-PSO. Chen and Chang [19] applied GA and a scaled conjugate gradient algorithm (SCGA) to train feedforward neural networks (FFNN) for forecasting reservoir inflow. They demonstrated that the accuracy of FFNN-GA and FFNN-SCGA was superior to stochastic models. A combined artificial bee colony (ABC) algorithm with ANNs was conducted by [20] to demonstrate the discharge-suspended sediment relationship. They concluded that ANNs-ABC can derive the relationship between discharge and suspended sediment more effectively compared with ANFIS, ANNs, and the rating-curve model. Asadnia et al. [21] developed the improved particle swarm optimization (IPSO) algorithm to train ANNs for predicting river stage. They concluded that the results of ANNs-IPSO were better compared to those of ANNs-PSO, ANNs-conjugate gradient (ANNs-CG), ANNs-gradient descent (ANNs-GD), and ANNs-Levenberg Marquardt (ANNs-LM), respectively. Sudheer et al. [22] applied PSO to train support vector machines (SVMs) for forecasting river discharge. They compared SVMs-PSO with autoregressive moving average model (ARMA) and ANNs and concluded that SVMs-PSO was better than ARMA and ANNs. Taormina and Chau [23] introduced the multi-objective fully informed PSO (MOFIPSO) to train ANNs for predicting river discharge. They showed that the results of ANNs-MOFIPSO were better than ANNs-SPSO. Kalteh [24] combined wavelet GA (WGA) with the support vector regression (SVR) for forecasting long-term discharge. The author reported that SVR-WGA provided better forecasting compared to SVR-GA. Annaty et al. [25] used PSO to train ANFIS for predicting scour depths. They found that ANFIS-PSO was better than ANNs-PSO, FDOT, and HEC-18, respectively. Although there have been previous research studies on the joint use of data-driven models and optimization methods, specific investigations on rainfall forecasting using hybrid methods have been limited so far. The hybridization of evolutionary algorithms with a data-driven model showed a remarkable enhancement in the modeling of hydrological processes [26][27][28][29]. Hence, to the best of the authors' knowledge, and for the first time, a hybrid ANFIS model integrated with three evolutionary algorithms is developed for forecasting rainfall process in a tropical environment.
This paper proposes the hybrid models by combining the ANFIS model and different optimization methods (e.g. particle swarm optimization algorithm (PSO), genetic algorithm (GA), and differential evolutionary (DE) algorithm) for forecasting monthly rainfall. The primary objective focuses on the development of hybrid models, namely, ANFIS-PSO, ANFIS-GA, and ANFIS-DE, which have not been investigated previously for rainfall forecasting. The feasibility of the proposed models is validated against the conventional ANFIS model. The modeling performances are evaluated using model efficiency indices and graphical comparisons. The applied hybrid methods can be used for flood management and analysis in the Pahang River, Malaysia. Located in a tropical environment of Malaysia, the river basin has experienced a few serious floods in the past two decades which emphasizes the need for rainfall trend forecasting.

Adaptive Neuro-Fuzzy Inference System
The fuzzy set approach was first introduced by Zaheh [30], based on a combination of neural network (NN) and the concept of fuzzy logic (FL), so that it could possibly catch the advantages of both in a solitary structure. The integration of FL with fuzzy inference systems (FISs) can be used for model development with fewer or ambiguous data. Among all the fuzzy systems, Sugeno's system is the most prevailing approach [31]. In contrast to NN, FL has a higher ability in the learning procedure to adjust its condition. Consequently, the NN can be utilized to naturally change the membership function (MFs) and decrease the error rate in the designation of rules in FL.
The development of the ANFIS model structure is based on two components, (i) the learning process (nodes) in which the use of the membership function (MF) in the conversion of the input attributes to the fuzzy values that vary between 0 and 1 is usually conducted [32] and (ii) the ANFIS model uses fuzzy rules (i.e., IF-THEN conditions) to certify the relationship between the inputs and the target. The MF is the main part that influences the predictability performance of the ANFIS model and, therefore, the appropriate selection of the MF is one of the predefined stages for the modeling [33]. The structure of an ANFIS network includes five different layers: fuzzification, product, normalization, de-fuzzification, and output. These are shown for a network with two input variables (x 1 and x 2 ) in Figure 1a. In the first layer, each node adjusts to a function parameter. The output of the "fuzzification" layer is a value of membership degree that is estimated using the inputs of MFs.
where Ai and Bi are fuzzy sets, whereas µ A i and µ B i−2 are the MF degree for Ai and Bi.
The ordinarily used MFs are triangular-shaped, trapezoidal-shaped, generalized bell-shaped and Gaussian. In the current study, the Gaussian MF is used due to the lower parameters of this MF in relation to the generalized bell-shape MF. The Gaussian-shape MF is defined as follows: where {ci and σi} is the set of MF parameters that can alter the MF shape. The parameters in this layer are referred to as antecedent parameters. The rules in the ANFIS network are designated due to the premise and consequent parts and are put away in fuzzy rule-based systems. The nonlinear premise and linear consequent parameters are related to layers 1 and 2, respectively. To attain an optimum model, the values of these linear and nonlinear parameters must be adjusted using an optimization algorithm. Due to the classical algorithm's drawbacks, such as being stuck in local minima and slow convergences, three different evolutionary algorithms, namely, PSO, GA, and DE, are employed in this study to optimize the ANFIS model's design.

Particle Swarm Optimization (PSO)
Kennedy and Eberhart [34] introduced the particle swarm optimization, which was modeled after the social characteristics of birds and fish. The main advantages of PSO are the simple procedure, the small amount of calculation, and instant convergences, which have made it suitable for solving different complex and nonlinear engineering problems. Briefly, PSO considers that every particle changes its position in the search space concerning the best position that it has ever been in and the best position near its neighbor. For particles to achieve an optimal or limited circumstance calculation, they usually change their positions within the multi-dimensional search space. PSO determines the initial particle swarm, P(k), in such a way that each particle's x is (k) position (P i P(k)) in the hyperspace equals k = 0, which is the first step. This is followed by the second step where each particle's F function performance is evaluated using the position of the particle (x i (k)): where pbest is the best position of the ith particle. The optimal performance of individual particles is evaluated in the third step in the following ways: where gbest is the global best value attained by the different particles of swarm.
In the fourth step, the velocity vector of each particle is changed using the equation: where r 1 and r 2 represent random parameters which lie within 0 and 1; and the parameters (C1 and C2 = 2) [27]. In Equation (6), w represents the inertia weight parameter. A balance between the local and global swarm performance can be achieved by a careful selection of these parameters to reduce the number of iterations. As per [34], the value of w can be determined as follows: where the initial and final weights are represented as w min and w max , respectively, while the maximum iteration value and iteration number are represented by itr max and itr, respectively. The particles are transformed to their new locations in the fifth stage using the following equation:

Genetic Algorithm (GA) Optimization
The conceptual structure of the genetic algorithm (GA) was first introduced by [28], based on Darwin's theory. The GA as a derivative-free stochastic method of optimization is one of the most well-known and oldest evolutionary algorithms and most widely employed EA approach in solving many engineering problems. It can be used to solve nonlinear, stochastic, and non-differentiable problems that may seem impossible by using gradient-based methods [35]. Goldberg and Holland [36] asserted that the population points for each iteration in the GA are randomly generated, and the best population point desires the optimum solution similar to the final outcome. The number of the population size for this algorithm was selected similarly to the process considered in the PSO which is equal to 100. The basic steps of GA include three important components. The first component is the creation of an initial population using the randomly selected mth individual, giving rise to the first population. Entering the mth individual and generating the output is the second component. Each of the outputs is evaluated based on the objective function known as a fitness function. The expected demand from each individual to achieve the desired objective is determined by the evaluation. From the fittest individual in the previous generation, a new generation is created.
In the reproduction process, the selection of chromosomes from the current generation based on the fitness of each chromosome to produce new generation offspring is done using the "selection" operator. The chromosomes with higher probabilities are chosen to modify and be employed in the next generation. Finally, the crossover operator which is a pivotal operator in the genetic algorithm is defined to generate child chromosomes from two different parent chromosomes. Indeed, using this operator, two new chromosomes are generated which have a higher fitness than the two entering chromosomes (parents). The crossover explores and exploits the new and old solutions, respectively, using a trial and error procedure. The crossover rate in this study is considered 0.75.

Differential Evolution (DE) Optimization
Storn and Price [37] presented the differential evolution (DE), a random population-based algorithm [31]. This method is differentiated from the other methods because it uses differential mutation. When a population problem with n-dimensional space must be solved, the fixed vector numbers are randomly created to understand the different search spaces as well as to reach the minimum objective function through evolution over time. In DE, a mutation function (F: I µ → I µ ) involves the production of a mutated vector (µ) using the equation: where r 1 , r 2 , r 3 ∈ [1, 2, ..., µ] are randomly designated. F ∈ [0,2] is a fixed parameter which affects the vector's differential variation. The capacity of the global search algorithm is usually increased by larger F or population size (µ) quantities. In DE, the crossover operator (CR: I µ → I µ ) modifies the to generate a trial combination of vectors using the following formula: where randb(j) ∈ [0,1] equals the randomized generator evaluation by the jth, while rnbr (i) (rnbr (i) ∈ 1,2, ..., d) is a random selection index. CR ∈ [0,1] represents the crossover parameter which increases the individual variations in the population. The population size for DE is considered similar to PSO and GA, equal to 100. The mutation and crossover constants which are obtained using a trial and error process are equal to 0.25 and 0.85, respectively.

Hybridization of ANFIS Model
This subsection describes the hybridization of the ANFIS model with three evolutionary algorithms (i.e., PSO, GA, and DE). The hybrid ANFIS models, encoded in the MATLAB environment, were used for rainfall time series forecasting. The structure of the proposed hybrid techniques is presented in Figure 1a. For the modeled rainfall time series, the initial hybrid models (ANFIS-PSO, ANFIS-GA, and ANFIS-DE) were developed for the study area using a training dataset. Subsequently, each previously mentioned evolutionary algorithm was adopted to optimize the ANFIS models by looking for the optimum values of premise and consequent parameters. After finding the optimum values of premise and consequent parts, the evolutionary-based hybrid methods of ANFIS regression were inferred and employed in rainfall time series forecasting.
The rainfall time series was split into two subsets: training and testing. All the employed monthly rainfall datasets were collected during the period 2000-2014. In this study, the first nine years of datasets were considered as training datasets and the rest of the data (2009-2014) were employed as testing datasets. Moreover, the number of inputs in the training phase varied from 1 to 6 lag times (antecedent months), which are presented in 16 different input combination structures as follows: Model 4: t−1, t−6 (14) Model 16: t−1, t−2, t−3, t−6, t−12, t−24 (26) All forecasting models have only one output (i.e., rainfall as the objective variable). The antecedent's values are obtained using the auto-correlation function statistical approach.
After determining the training dataset and input combinations, the model configuration was presented. Using training datasets, an initial ANFIS-EA (PSO, GA, and DE) was generated. As the premise and consequent parameters related to the initial ANFIS models were not optimized, the value of these parameters was optimized through an optimizing process using EA (PSO, GA, and DE). In this study, the fuzzy c-means (FCM) clustering approach was considered to generate the fuzzy inference system (FIS). Using different clusters, the fuzzy if-then rules were produced, and the optimum values of premise and consequent parameters were obtained during the optimization process performed by the PSO, GA, and DE algorithms. Moreover, it should be noted that the Gaussian membership function (MF), which is smooth and has the lowest parameters in relation to other smooth MFs such as the bell-shape MF, was used. Readers are recommended to look over the literature on the adapted hybrid models for more informative details [38].

Modeling Performance Indicators
Predictive models such as mathematical models (i.e., machine learning models) are usually evaluated using numerical indicators. The hybrid models developed in this study were inspected using four statistical metrics, including root mean square error (RMSE), mean absolute error (MAE), correlation coefficient (CC), and Willmott's index (WI) [39][40][41]. The mathematical expression can be presented as follows: where R o and R f are the observed and forecasted values of rainfall data, whereas R o and R f are the mean value of the observed and forecasted values of rainfall data.

Uncertainty Analysis
The Monte Carlo simulation (MCS) was used in this study to estimate the uncertainty bounds of a predictor model. Indeed, MCS was simulated using ANFIS-based methods.
To achieve a range of numbers, where the results of a model can be located on that range, a result-dependent factor called the uncertainty analysis is used. Indeed, the uncertainty analysis represents the probability bounds in the model estimations that envelopes the values of target parameters in its range. Two different indices which are considered in uncertainty analyses are 95 percent predicted uncertainties (95PPU) and the d-factor. The 95PPU indicates that estimated percentage of predicted data that is limited by 95 percent predicted uncertainties and the d-factor assesses the average of the confidence band width.

Case Study and Hydrological Data Description
Malaysia is considered a tropical region that experiences several events of monsoon rainfall. Rainfall process is the main trigger for flood events that consequently affect society infrastructure and human lives directly. Hence, establishing a hybrid intelligence predictive model can provide the basic knowledge for protecting the studied watershed. In this research, Pahang watershed metrological information was used to build the predictive model. The Pahang watershed is located in Pahang, the largest state of Peninsular Malaysia. It is limited by Kelantan in the north, Johor in the south, Terengganu and the South China Sea in the east and Selangor and Negeri Sembilan in the west. The total area of the watershed is approximately 36,137 km 2 (Figure 2). The climate of the area is dominated by the northeast monsoon rainfall influence. The average annual rainfall ranges between 1609 and 2132 mm in the basin. In general, the extreme rainfall events occur between November and March. In this study, the monthly rainfall data for the period 2000-2014 were used for the development of hybrid models. The rainfall time series was obtained from the Department of Irrigation and Drainage (DID). It is worth mentioning that the modeling carried out in this research was based on the univariate concept that included only the rainfall information to forecast the rainfall itself. This is highly significant, especially in a watershed that comprises several metrological information sources. The highly stochastic behavior of rainfall patterns often produces floods, and thus the main motivation for this research was to develop an accurate model for rainfall forecasting, which could be used for anticipating the possible occurrence of floods. For this purpose, ANFIS combinations with a couple of evolutionary algorithms were developed.

Application and Analysis
ANFIS learning process implementation was conducted based on the RMSE criterion to determine the optimal number of membership functions of the ANFIS model. The statistical performance of hybrid and classical ANFIS models during training and test phases are tabulated in Table 1a-d. It can be seen from the table that the performance of different models significantly increased when all inputs were incorporated (i.e., model 16), compared to other input combinations.

Application and Analysis
ANFIS learning process implementation was conducted based on the RMSE criterion to determine the optimal number of membership functions of the ANFIS model. The statistical performance of hybrid and classical ANFIS models during training and test phases are tabulated in Table 1a-d. It can be seen from the table that the performance of different models significantly increased when all inputs were incorporated (i.e., model 16), compared to other input combinations. The RMSE values were found to vary in the range of 0.99-5.32, 0.47-5.31, 0.83-5.31, and 0.70-5.29 mm for ANFIS-based, ANFIS-PSO, ANFIS-GA, and ANFIS-DE models, respectively. In quantitative analysis, the hybrid ANFIS-PSO model enhanced the performance of the absolute error measures (RMSE and MAE) over the ANFIS-based model by 38-43% during training and 52-56% during testing phases. The ANFIS-GA model showed an enhancement over the ANFIS-based model by 14-24% during the training phase and 16-21% during the testing phase, whereas ANFIS-DE displayed an enhancement over the ANFIS-based model by 18-26% during the training phase and 26-41% over the testing phase. It can be observed that the major increase in the quantitative measures was presented for the ANFIS-PSO. It important to indicate that the root mean square error is the essential numerical indictor to determine the performance of machine learning modeling for time series forecasting [42][43][44][45].
The minimum values of RMSE obtained for the ANFIS-PSO, ANFIS-GA, and ANFIS-DE were 0.47, 0.83, and 0.70 mm, respectively. The results attained for all hybrid ANFIS models showed an acceptable error level that could be managed by water resource and climatology decision makers using the threshold-based design [46].
The proficiency of the different models was also examined using some graphical presentations, such as observed rainfall values versus forecasted values in scatterplot graphs and Taylor diagrams. The time variation and scatterplots of observed versus simulated hybrid models are provided in  The RMSE values were found to vary in the range of 0.99-5.32, 0.47-5.31, 0.83-5.31, and 0.70-5.29 mm for ANFIS-based, ANFIS-PSO, ANFIS-GA, and ANFIS-DE models, respectively. In quantitative analysis, the hybrid ANFIS-PSO model enhanced the performance of the absolute error measures (RMSE and MAE) over the ANFIS-based model by 38-43% during training and 52-56% during testing phases. The ANFIS-GA model showed an enhancement over the ANFIS-based model by 14-24% during the training phase and 16-21% during the testing phase, whereas ANFIS-DE displayed an enhancement over the ANFIS-based model by 18-26% during the training phase and 26-41% over the testing phase. It can be observed that the major increase in the quantitative measures was presented for the ANFIS-PSO. It important to indicate that the root mean square error is the essential numerical indictor to determine the performance of machine learning modeling for time series forecasting [42][43][44][45]. The minimum values of RMSE obtained for the ANFIS-PSO, ANFIS-GA, and ANFIS-DE were 0.47, 0.83, and 0.70 mm, respectively. The results attained for all hybrid ANFIS models showed an acceptable error level that could be managed by water resource and climatology decision makers using the threshold-based design [46].
The proficiency of the different models was also examined using some graphical presentations, such as observed rainfall values versus forecasted values in scatterplot graphs and Taylor diagrams. The time variation and scatterplots of observed versus simulated hybrid models are provided in    Figure 4 shows a graphical presentation using Taylor diagrams [47]. A Taylor diagram is a method for graphically condensing how intently a model (or several models) matches observations. The similarity between the predictive models and the observation records is quantified in terms of their correlation coefficient and standard deviations. The figures outline the proposed hybrid models and the comparable ANFIS-based model in terms of the mentioned indictors to denote the degree of the prediction skills. The distance from the reference point (observed) is a measure of the centered RMSE [47]. Accordingly, a perfect model (being in full concurrence with the observations) is set apart by the reference point with the correlation coefficient equivalent to 1, and a similar abundancy of varieties contrasted with the observations [48]. The visualization of results revealed that the ANFIS-  Figure 4 shows a graphical presentation using Taylor diagrams [47]. A Taylor diagram is a method for graphically condensing how intently a model (or several models) matches observations. The similarity between the predictive models and the observation records is quantified in terms of their correlation coefficient and standard deviations. The figures outline the proposed hybrid models and the comparable ANFIS-based model in terms of the mentioned indictors to denote the degree of the prediction skills. The distance from the reference point (observed) is a measure of the centered RMSE [47]. Accordingly, a perfect model (being in full concurrence with the observations) is set apart by Water 2019, 11, x FOR PEER REVIEW 20 of 26 As previously discussed, one of the main aims of this study was to examine the uncertainty analysis of the different ANFIS-based methods using d-factor and 95PPU criteria. The reduction in average values of lower and upper bounds (lower than the standard deviation (SD) of observed data) and the increase in observed data in 95PPU in uncertainty result in a more desirable uncertainty for each model. The uncertainty indices of d-factor and 95PPU for testing samples are presented in Table  2 for 64 models (four ANFIS-based models and 16 different input combinations, Equations (11)(12)(13)(14)(15)(16)(17)(18)(19)(20)(21)(22)(23)(24)(25)(26)). In addition, the 95PPU values bracketed for Model 16 of each proposed model are given in Figure 5. As seen in Table 2 and Figure 5  As previously discussed, one of the main aims of this study was to examine the uncertainty analysis of the different ANFIS-based methods using d-factor and 95PPU criteria. The reduction in average values of lower and upper bounds (lower than the standard deviation (SD) of observed data) and the increase in observed data in 95PPU in uncertainty result in a more desirable uncertainty for each model. The uncertainty indices of d-factor and 95PPU for testing samples are presented in Table 2 for 64 models (four ANFIS-based models and 16 different input combinations, Equations (11)- (26)). In addition, the 95PPU values bracketed for Model 16 of each proposed model are given in Figure 5. As seen in Table 2 and Figure 5, minimum values bracketed by 95PPU are approximately 11%-12% and maximum values bracketed by this index are approximately 91.67%, 91.03%, 89.74%, and 88.46% for ANFIS-PSO, ANFIS-GA, ANFIS-DE, and ANFIS, respectively. Moreover, the maximum and minimum values of the d-factor for ANFIS-PSO, ANFIS-GA, ANFIS-DE, and ANFIS are 0.14-1.43, 0.14-1.43, 0.14-1.44, and 0.15-1.44, respectively.  * Bold numbers present the best input combination accuracy results. In accordance with the established research [49], the computed value for 95PPU should be considered at least 80% of predicted values, whereas, for some areas, the use of 50% of data in the 95PPU bound is acceptable. Due to the values attained for all ANFIS-based models for the d-factor and 95PPU, it can be derived that all observed and predicted values are within the 95PPU band for all models except model 1 which used lag 1 as the input variable in rainfall forecasting so that 50% of the observed data fell within this bound. By considering the minimum value of 95PPU as 80% of predicted values, models 12 to 16 for all ANFIS-based models are within this 95PPU level. Therefore, for all models except model 1 and especially for model 12s to 16, the "passing" degree of uncertainty is achieved in rainfall forecasting. Simulation results for hybrid ANFIS models are better than individual ones because the percentages of values bracketed by 95PPU for hybrid evolutionary ANFIS are higher than those of the ANFIS. In addition, the best and worst uncertainty values are found for ANFIS-PSO and the ANFIS which have the lowest and highest (respectively) values for different input combinations.
Generally, there are three kinds of uncertainty in the simulation of a real-world problemuncertainty appearing from data, from local knowledge, and from a simulator model-so that the level of each uncertainty differs considerably with the type of problem. In this study, the uncertainty arising from data, from local knowledge and from the ANFIS-based models is coming from machine and human errors and unknown problems.
It is worth highlighting that the performance of models used for forecasting rainfall time series should be evaluated according to their ability to forecast rainfall on a multiple temporal scale, such as monthly, seasonally, annually, or inter-annually [50]. However, considering that the stochastic forecasting models developed in this study were used to forecast monthly rainfall one month ahead, only the capability of the models in forecasting monthly rainfall is assessed in this study. Hence, an extension of the current study could be devoted to the investigation of multiple temporal scales.
On the other hand, a general circulation model (GCM) tool could be used to simulate global change with respect to greenhouse emissions that cause rainfall changes on a continental scale based on global warming. Simulating climate on a smaller scale (i.e., a river basin area) is complex due to limitations in the horizontal resolution and parameterization of GCMs [51]. Some techniques were developed for downscaling the GCM scenarios, such as nesting regional climate models (RCMs) In accordance with the established research [49], the computed value for 95PPU should be considered at least 80% of predicted values, whereas, for some areas, the use of 50% of data in the 95PPU bound is acceptable. Due to the values attained for all ANFIS-based models for the d-factor and 95PPU, it can be derived that all observed and predicted values are within the 95PPU band for all models except model 1 which used lag 1 as the input variable in rainfall forecasting so that 50% of the observed data fell within this bound. By considering the minimum value of 95PPU as 80% of predicted values, models 12 to 16 for all ANFIS-based models are within this 95PPU level. Therefore, for all models except model 1 and especially for model 12s to 16, the "passing" degree of uncertainty is achieved in rainfall forecasting. Simulation results for hybrid ANFIS models are better than individual ones because the percentages of values bracketed by 95PPU for hybrid evolutionary ANFIS are higher than those of the ANFIS. In addition, the best and worst uncertainty values are found for ANFIS-PSO and the ANFIS which have the lowest and highest (respectively) values for different input combinations.
Generally, there are three kinds of uncertainty in the simulation of a real-world problemuncertainty appearing from data, from local knowledge, and from a simulator model-so that the level of each uncertainty differs considerably with the type of problem. In this study, the uncertainty arising from data, from local knowledge and from the ANFIS-based models is coming from machine and human errors and unknown problems.
It is worth highlighting that the performance of models used for forecasting rainfall time series should be evaluated according to their ability to forecast rainfall on a multiple temporal scale, such as monthly, seasonally, annually, or inter-annually [50]. However, considering that the stochastic forecasting models developed in this study were used to forecast monthly rainfall one month ahead, only the capability of the models in forecasting monthly rainfall is assessed in this study. Hence, an extension of the current study could be devoted to the investigation of multiple temporal scales.
On the other hand, a general circulation model (GCM) tool could be used to simulate global change with respect to greenhouse emissions that cause rainfall changes on a continental scale based on global warming. Simulating climate on a smaller scale (i.e., a river basin area) is complex due to limitations in the horizontal resolution and parameterization of GCMs [51]. Some techniques were developed for downscaling the GCM scenarios, such as nesting regional climate models (RCMs) within a GCM and statistical methods. RCMs can explicitly handle the physical processes that control the regional climate. Such physical processes have been applied indirectly using the linkage between large scale parameters and regional scale climate in statistical models. Statistical techniques are computationally inexpensive and can be easily used for any GCM output and adapted to a regional area compared to dynamic models [52]. Dynamic models based on nesting a fine-scale climate model within a coarse-scale model can produce spatially complete fields of climatological variables (i.e., rainfall) and maintain the spatial correlation and physical relationship between variables. However, due to the intensive computation of dynamic methods, their applications are limited and long-period simulations using various global climate models and multiple greenhouse gas emission scenarios are infeasible [53].

Conclusions
The current research investigated particle swarm optimization (PSO), genetic algorithm (GA), and differential evolution (DE)-based ANFIS models for forecasting monthly rainfall time series. Different efficient bio-inspired paradigms, namely, PSO, GA, and DE, were compared to identify which method was more appropriate to train the ANFIS model. A good generalization from the ANFIS basically depends on the ANFIS training procedure. The use of an evolutionary-based ANFIS to forecast monthly rainfall was tested for a tropical environment. Rainfall forecasting using an ANFIS trained by PSO, GA, and DE was compared with the classical ANFIS model. In this study, the developed hybrid models were used for forecasting monthly rainfall values using antecedent rainfall values. The obtained results indicated that ANFIS-PSO, ANFIS-GA, and ANFIS-DE were appropriate models for forecasting monthly rainfall. However, it was found that ANFIS-PSO was superior to ANFIS-GA, ANFIS-DE, and the pure ANFIS models. Moreover, the dependability of the ANFIS-based model forecasting was computed using uncertainty analysis. Due to the calculated values of two indices, 95PPU and d-factor, it was concluded that the hybrid evolutionary ANFIS model had an allowable degree of uncertainty in rainfall simulations. Furthermore, after comparing all ANFIS-based models, the hybrid ANFIS (especially ANFIS-PSO) indicated a lower degree of uncertainty in comparison with the classical ANFIS.