Application of Metaheuristic Algorithms and ANN Model for Univariate Water Level Forecasting

Mohammed, Sarah J.; Zubaidi, Salah L.; Al-Ansari, Nadhir; Mohammed Ridha, Hussein; Dulaimi, Anmar; Al-Khafaji, Ruqayah

doi:https://doi.org/10.1155/2023/9947603

Advances in Civil Engineering


Research Article | Open Access

Volume 2023 | Article ID 9947603 | https://doi.org/10.1155/2023/9947603


Sarah J. Mohammed,¹ Salah L. Zubaidi,¹,² Nadhir Al-Ansari,³ Hussein Mohammed Ridha,⁴,⁵ Anmar Dulaimi,²,⁶ and Ruqayah Al-Khafaji⁷

Academic Editor: Zaher Mundher Yaseen

Received 27 Jan 2023

Revised 08 May 2023

Accepted 17 May 2023

Published 25 May 2023

Abstract

With the rapid development of machine learning (ML) models, the artificial neural network (ANN) is being increasingly applied for forecasting hydrological processes. However, researchers have not treated hybrid ML models in much detail. To address this gap, this study proposes a novel methodology to forecast the monthly water level (WL) of the Tigris River in Al-Kut, Iraq, over ten years, based on multiple lags. The methodology comprises data preprocessing methods and an ANN model optimised with the marine predator algorithm (MPA). In the optimisation procedure, to decrease uncertainty and expand the predicting range, the slime mould algorithm (SMA-ANN), constriction coefficient-based particle swarm optimisation and chaotic gravitational search algorithm (CPSOCGSA-ANN), and particle swarm optimisation (PSO-ANN) are applied to compare and validate the MPA-ANN model performance. Analysis of results revealed that the data pretreatment methods improved the original data quality and selected the ideal predictors’ scenario by singular spectrum analysis and mutual information methods, respectively. For example, the correlation coefficient of the first lag improved from 0.648 to 0.938. Based on various evaluation metrics, MPA-ANN tends to forecast WL better than the SMA-ANN, PSO-ANN, and CPSOCGSA-ANN algorithms, with coefficients of determination of 0.94, 0.81, 0.85, and 0.90, respectively. Evidence shows that the proposed methodology yields excellent results, with a scatter index equal to 0.002. The research outcomes represent an additional step towards evolving various hybrid ML techniques, which are valuable to practitioners wishing to forecast WL data and manage water resources in light of environmental shifts.

1. Introduction

The river water level (WL) is commonly measured to determine the water volume flowing through the river. It is also used to understand flood or drought scenarios. As a result, a timely and accurate forecast of WL is critical for water resource planning and disaster risk reduction [1, 2]. WL prediction is an essential responsibility of hydrologists, engineers, and relevant authorities in developing a workable conceptual design of water infrastructures and drought management measures and in assessing river/lake/reservoir behaviour for operational purposes [3, 4].

Iraq is particularly susceptible to the effects of global warming as a country in an arid to semiarid environment. In general, this country is suffering from a shortage of water, which is anticipated to worsen in the future due to various issues, including climate change, the growth of the oil industry, urbanisation, and rapid population growth [5]. Iraq’s primary freshwater sources are the Euphrates and Tigris Rivers. From 2009 to 2014, both rivers suffered from significant water scarcity. This situation is expected to deteriorate further due to climate change and the policies of water source nations such as Iran and Turkey, which have been building and operating new dams along the rivers’ routes. In addition, after 2003, terrorism targeted several barrages and dams in Iraq, adversely affecting river water management [6].

Artificial intelligence (AI) models have proven to be reliable tools for capturing nonlinear patterns, so researchers have used them in various hydrological studies [7]. AI models have shown exceptional performance in forecasting hydrological factors in recent years. These models can handle a substantial quantity of data and can be applied to numerous climatic parameters and other hydrological boundary factors [8]. According to the published literature, multiple AI models for water level modelling have been built, including the artificial neural network (ANN) [9], adaptive neuro-fuzzy inference system (ANFIS) [10], support vector machine (SVM) [11], and random forests (RFs) [12]. The advantages and disadvantages of these main ML models have been covered for different fields of hydrology, such as drought [13] and water quality [14].

One of the most significant drawbacks of prior studies is that some of them use a trial-and-error procedure, which is time-consuming, while others have not improved the data through preprocessing.

Among the various AI approaches, ANN can be viewed as an effective technique for solving nonlinear problems and producing precise predictions [15, 16]. However, ANN training may converge to a local rather than a global minimum, resulting in a suboptimal solution, or may fail to use the proper network structure or hyperparameters. To avoid these disadvantages, different approaches, such as metaheuristic algorithms, have been incorporated with the ANN model, and various hybrid models have been suggested [15]. Hybridisation has emerged as a promising technique for eliminating the apparent disadvantages of standalone approaches while also improving forecasting accuracy [17].

Additionally, articles on watershed-level predictions published between 2014 and 2021 were analysed by Mohammed et al. [18]. The study revealed that hybrid strategies outperform single methods for all cases, for instance, Ebtehaj et al. [19], Imran et al. [20], and Nguyen et al. [21]. Multiple studies recommended applying the hybrid models in water level forecasting, for example, Ghorbani et al. [22], Wang et al. [23], Zhu et al. [24], and Dat et al. [25]. Moreover, according to Zhang et al. [26], who reviewed different univariate WL prediction models, the analysis of univariate data-driven models is widely utilised due to their simplicity and low data requirements, for example, Lineros et al. [27], Liu et al. [28], Mohammadi et al. [7], and Ghorbani et al. [22].

Numerous fields of engineering face the traditional optimisation challenge of finding the optimum solution within a wide and uncertain space. When an analytical solution is impractical or time-consuming, numerical techniques can be helpful. However, they cannot ensure a globally optimum outcome due to the high likelihood of settling for a local minimum. Many metaheuristic algorithms are motivated primarily by nature. In addition to creating new metaheuristic algorithms, hybridising algorithms is another tactic for enhancing algorithm performance [29]. Additionally, it was stated in the no free lunch theorem (NFLT) [30] that no universally applicable algorithm can efficiently handle every likely optimisation scenario. In other words, the performance of an optimisation method may be very good on some problems, but it is likely to be quite bad on others. This has resulted in the scientific community proposing numerous strategies for resolving optimisation challenges.

A variety of metaheuristic algorithms have been used in hydrology to estimate the hyperparameters of machine learning (ML) techniques instead of the trial-and-error approach [17]. One example is the slime mould algorithm (SMA), which was developed by Li et al. [31]. SMA has been used to solve various optimisation problems, including optimal power flow problems [32] and water demand forecasting [33]. Also, the marine predator algorithm (MPA) was introduced by Faramarzi et al. [34]. It is a population-based metaheuristic algorithm [35], which has been applied to multiple optimisation topics, such as power resources [36] and estimation of greenhouse gas emissions [37]. Additionally, particle swarm optimisation (PSO) has been employed in multiple hydrology areas, for example, WL [38] and streamflow [39]. Moreover, the constriction coefficient-based particle swarm optimisation and chaotic gravitational search algorithm (CPSOCGSA) was proposed by Rather and Bala [40], and it has been used in the prediction of drought [41]. The CPSOCGSA algorithm was proposed under the strategy of hybridising existing algorithms. In contrast, the MPA and SMA algorithms were developed under the strategy of creating new nature-inspired optimisation methods.

In the same context, data preprocessing has also been emphasised in the literature to improve time series quality and find optimal predictors. In recent years, cleansing data has become increasingly crucial. This has led to the implementation of various signal pretreatment strategies to decrease the impact of noise in the water level time series, for example, wavelet transform (WT) [42] and singular spectrum analysis (SSA) [15]. Another crucial part of data pretreatment is selecting the optimal input data, for example, mutual information (MI) [43], for a univariate scenario. Employing a nonlinear statistical reliance technique, for example, MI, is appropriate for choosing inputs to ANN models [44].

Hajirahimi and Khashei [17] recently surveyed numerous hybridisations of hybrid structures for time series forecasting. Findings from this study highlight the significance of pretreatment methods and optimisation algorithms in the hybridisation process. The hybridisation of hybrid models has been suggested as a novel idea to achieve high-accuracy prediction, where two or several combined classes are hybridised instead of using the normal individual prediction techniques. One of these procedures that has been used effectively is the hybridisation of parameter optimisation-based with preprocessing-based hybrid models (HOPH). The survey also concluded that there is still room for improvement in data pretreatment approaches and optimisation algorithms. Accordingly, knowledge gaps and promising new research directions related to the hybridisation of hybrid models need to be investigated. Also, Mohammed et al. [18] reviewed the watershed-level prediction articles published from 2014 to 2021 and recommended employing all three data preprocessing approaches to improve original data quality and select the optimal predictors, using SSA for denoising the data. Moreover, they recommended utilising the HOPH technique to forecast WL data because it can optimise both the model and the data, noting that there is still space for improvement in WL forecasting.

This study, therefore, considers a novel hybrid methodology called HOPH to predict the WL of the Tigris River (upstream of the Al-Kut barrage) by utilising a set of preprocessing techniques, an ANN model, and metaheuristic algorithms. The study’s significance or value is that the Kut Barrage manages and provides freshwater from the Tigris River to the Dujaila and Al-Gharraf branches by lifting the WL in the barrage upstream. These two branches deliver water for several cities’ growth and prosperity in the south of Iraq, which are already under the stress of a water shortage.

The following steps will be taken in order to accomplish these goals:
(1) To use the SSA approach to enhance the raw data quality and the MI method to choose the optimal predictor (lags) scenario during the preprocessing stage
(2) To incorporate the MPA algorithm with the ANN technique to forecast monthly WL data
(3) To evaluate the MPA-ANN algorithm’s performance using hybrid SMA-ANN, PSO-ANN, and CPSOCGSA-ANN
(4) To use the HOPH strategy for estimating monthly water levels depending on multiple time lags
(5) To extend the predicting range and decrease the level of uncertainty in monthly water level simulation results by testing various update optimisation algorithms (i.e., two update algorithms and hybridisation of two existing ones)

To the authors’ knowledge, this is the first time this hybrid technique has been employed to simulate the water level. It is also the first attempt to predict Al-Kut barrage upstream water levels based on several lags. This is crucial for the local authority because it is responsible for managing and providing freshwater for all the cities in the south of Iraq, which are already under water stress.

2. Case Study and Data Used

Al-Kut city is the centre of Wasit Province, Iraq. It is located on an essential site on the Tigris River. In terms of spatial location and relationships with nearby regions, two river branches (Al-Gharaf and Al-Dujaili) branch out close to the north of the city, as indicated in Figure 1. These two branches deliver freshwater to different cities in Wasit and Thi-Qar provinces for residential, irrigation, commercial, and industrial purposes [6].

Historical monthly water level time series (metre, m) of the Tigris River in Al-Kut city (Al-Kut barrage upstream) was collected from the directorate of water resources in Wasit Province for the period (2011–2020). Figure 2 illustrates the monthly raw WL time series data.

3. Methodology

The suggested methodology for forecasting monthly WL data falls under four headings (Figure 3): (1) data preprocessing, (2) MPA algorithm, (3) ANN model, and (4) model performance criteria.

3.1. Data Preprocessing

According to Maier and Dandy [45], for an ANN model to function properly, the data must be pretreated into an appropriate format. Such pretreatment ensures that every input receives equal attention in the training phase. This research conducts three stages: normalisation, cleaning, and selecting the optimal predictors. Tabachnick and Fidell [46] stated that univariate outliers could be mitigated by first transforming the variable and then changing the outliers’ scores if any are found. In order to reduce multicollinearity between predictors, raw water level data were normalised using the natural logarithm method [46]. The cleaning approach aims to improve the value of the regression coefficient and reduce the error scale by treating the outliers and denoising the data [15]. This research used the box-whisker technique to identify outliers lying more than 1.5 × IQR beyond the quartiles (IQR = Q3 − Q1, where Q3 is the 3rd quartile and Q1 is the 1st quartile) [41]. This analysis was carried out using the SPSS 24 statistical package. Signal preprocessing is one of the most effective ways of denoising the raw dataset; here, the singular spectrum analysis (SSA) approach is used to split the series into multiple components.
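The normalisation and outlier screening described above can be sketched in a few lines. This is a minimal illustration only: the study used SPSS 24, whose quartile definition may differ from the one assumed here, and the function names and sample values are hypothetical.

```python
import math
import statistics

def log_normalise(levels):
    """Natural-log transform of strictly positive water levels (metres)."""
    return [math.log(x) for x in levels]

def iqr_outliers(values, k=1.5):
    """Return indices of points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Assumption: 'inclusive' quartiles; other packages may interpolate differently.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical monthly levels in metres; the last value is deliberately extreme.
sample = [14.2, 14.5, 14.1, 14.8, 15.0, 14.6, 14.3, 14.7, 14.4, 19.5]
flagged = iqr_outliers(log_normalise(sample))
```

In this made-up sample only the last point is flagged; the study reports that the real WL series contained no outliers before or after normalisation.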

The SSA is a comparatively practical technique for decomposing the raw data into several principal components (PCs). Each PC characterises some measure of variation in the original data, with the first component accounting for the most variance and the last for the least. Selecting the PCs that account for the most variance and disregarding the PCs that account for the least variance is one way in which the SSA can be used to reduce noise in data [47]. This method can analyse linear and nonlinear time series with long, medium, and short sample sizes. It does not require statistical assumptions like error normality, linearity, or series stationarity [48]. More information about the SSA approach can be found in a study by Golyandina and Zhigljavsky [49].
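A basic SSA decomposition can be sketched as follows, assuming the textbook embedding, SVD, and diagonal-averaging steps. The window length, the synthetic series, and the choice of two leading components are illustrative assumptions, not the settings used in this study.

```python
import numpy as np

def ssa_components(series, window):
    """Decompose a 1-D series into elementary SSA components (one per singular value)."""
    x = np.asarray(series, dtype=float)
    k = x.size - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = []
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: the mean of each anti-diagonal maps the matrix back to a series.
        comp = [np.mean(elem[::-1].diagonal(d)) for d in range(-window + 1, k)]
        components.append(comp)
    return np.array(components)

# Synthetic 120-month series: a seasonal cycle plus noise (hypothetical data).
rng = np.random.default_rng(0)
t = np.arange(120)
noisy = np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(t.size)
components = ssa_components(noisy, window=12)
denoised = components[:2].sum(axis=0)  # keep the two leading components (assumed choice)
```

Summing all components recovers the input series exactly, so denoising amounts to choosing which leading components to keep.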

Choosing relevant predictors is a crucial part of building a prediction model structure [50]. This research applied the mutual information (MI) approach to determine the ideal explanatory variables (lags). MI assesses the statistical relationship between the lagged components and the time series. This method helps choose the lagged components that are most strongly related to the target, i.e., those with higher MI [15].
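As an illustration of lag screening with MI, the sketch below estimates the average mutual information between a series and its lagged copy from a simple two-dimensional histogram. The histogram estimator and the number of bins are assumptions made for this example; the study does not state which estimator was used.

```python
import math
from collections import Counter

def average_mutual_information(series, lag, bins=8):
    """Histogram estimate of MI (in nats) between x[t] and x[t - lag], for lag >= 1."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # guard against a constant series
    codes = [min(int((v - lo) / width), bins - 1) for v in series]
    pairs = list(zip(codes[lag:], codes[:-lag]))
    n = len(pairs)
    joint = Counter(pairs)
    p_now = Counter(a for a, _ in pairs)
    p_lag = Counter(b for _, b in pairs)
    # Sum of p(a, b) * log(p(a, b) / (p(a) * p(b))) over observed bin pairs.
    return sum((c / n) * math.log(c * n / (p_now[a] * p_lag[b]))
               for (a, b), c in joint.items())

# Hypothetical seasonal series; AMI is evaluated for lags of 1 to 12 months.
series = [math.sin(2 * math.pi * t / 12) for t in range(120)]
ami_by_lag = {lag: average_mutual_information(series, lag) for lag in range(1, 13)}
```

Plotting `ami_by_lag` against the lag gives a curve of the kind used to pick the lag scenario.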

3.2. Overview of the Marine Predator Algorithm (MPA) for ANN Optimisation

Faramarzi et al. [34] suggested MPA as one of the most recent optimisation algorithms inspired by nature. The MPA algorithm was inspired by the motion of marine predators such as sunfish and sharks. MPA has been employed to solve different optimisation problems, including solving the economic emission dispatch [51] and estimating photovoltaic module parameters [52].

3.2.1. Step One: Prey’s Population Initialisation

The MPA begins by generating an initial population of random solutions, sampled uniformly inside the search domain:

$$X_0 = X_{\min} + \mathrm{rand}\,(X_{\max} - X_{\min})$$

where $X_{\min}$ and $X_{\max}$ are a variable's lowest and highest possible bounds, and rand is a random vector with elements in (0, 1).
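The initialisation step can be sketched as below, assuming uniform sampling between per-variable bounds as in the original MPA. The two decision variables and their bounds (number of hidden nodes and learning rate) are hypothetical values chosen only for illustration.

```python
import random

def init_prey(n_agents, lower, upper, seed=None):
    """Sample n_agents candidate solutions uniformly within [lower, upper] per dimension."""
    rng = random.Random(seed)
    return [[lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
            for _ in range(n_agents)]

# Hypothetical search space: hidden nodes in [1, 20], learning rate in [0.01, 1].
population = init_prey(10, lower=[1.0, 0.01], upper=[20.0, 1.0], seed=42)
```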

3.2.2. Step Two: Predator Matrix Creation

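Predators and prey are both regarded as search agents in the MPA because both search for their own food. The elite is the top predator among the search agents, and it is normally more skilled than the other search agents. The elite matrix is mathematically modified depending on information about prey locations. The elite and prey matrices are built as follows:

$$\mathrm{Elite} = \begin{bmatrix} X^{I}_{1,1} & \cdots & X^{I}_{1,d} \\ \vdots & \ddots & \vdots \\ X^{I}_{n,1} & \cdots & X^{I}_{n,d} \end{bmatrix}_{n \times d}, \qquad \mathrm{Prey} = \begin{bmatrix} X_{1,1} & \cdots & X_{1,d} \\ \vdots & \ddots & \vdots \\ X_{n,1} & \cdots & X_{n,d} \end{bmatrix}_{n \times d}$$

where $X^{I}$ is the top-predator vector, replicated $n$ times to form the elite matrix, $n$ is the number of search agents, and $d$ is the number of dimensions.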

3.2.3. Step Three: MPA Optimisation Process

After creating the prey and elite matrices, the prey and predator locations are modified in three stages. These stages are determined by the velocity ratio between the prey and the predator. The high-velocity ratio, unit velocity ratio, and low-velocity ratio are the three phases.

3.2.4. Phase One: High-Velocity Ratio

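The predator moves faster than the prey during this phase. Additionally, the step size of the prey movements is altered as shown in the following equations:

$$\mathrm{stepsize}_i = R_B \otimes (\mathrm{Elite}_i - R_B \otimes \mathrm{Prey}_i), \qquad \mathrm{Prey}_i = \mathrm{Prey}_i + P \cdot R \otimes \mathrm{stepsize}_i$$

where $P$ is a constant number ($P = 0.5$ in the original MPA [34]) and $R$ is a random vector whose elements all have values between 0 and 1.

Brownian motion is represented by the random vector $R_B$.

The symbol $\otimes$ represents element-wise multiplication.

This stage happens in the first third of the total number of iterations ($\mathrm{Iter} < \tfrac{1}{3}\mathrm{Max\_Iter}$).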
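A single phase-one update can be sketched as follows. The sketch assumes the form used in the original MPA description, in which P is a constant (0.5), R is a uniform random number drawn per element, and the Brownian vector R_B is drawn from a standard normal distribution; the function name is hypothetical.

```python
import random

def phase_one_update(prey, elite, p=0.5, rng=None):
    """High-velocity-ratio step: prey moves with Brownian-scaled steps relative to the elite."""
    rng = rng or random.Random()
    new_prey = []
    for prey_i, elite_i in zip(prey, elite):
        row = []
        for x, e in zip(prey_i, elite_i):
            r_b = rng.gauss(0.0, 1.0)   # Brownian component: standard normal draw
            r = rng.random()            # uniform random number in [0, 1)
            step = r_b * (e - r_b * x)  # element of R_B * (Elite - R_B * Prey)
            row.append(x + p * r * step)
        new_prey.append(row)
    return new_prey

# Two hypothetical agents in two dimensions, both attracted to the same elite.
moved = phase_one_update([[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]],
                         rng=random.Random(1))
```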

3.2.5. Phase Two: Unit Velocity Ratio

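This phase is meant to mimic the hunt for food or prey. Levy flight represents the movement of the prey, while Brownian motion represents the predator’s movement. This phase happens during the second third of all iterations (i.e., $\tfrac{1}{3}\mathrm{Max\_Iter} < \mathrm{Iter} < \tfrac{2}{3}\mathrm{Max\_Iter}$). The following equations represent the first 50% of the population:

$$\mathrm{stepsize}_i = R_L \otimes (\mathrm{Elite}_i - R_L \otimes \mathrm{Prey}_i), \qquad \mathrm{Prey}_i = \mathrm{Prey}_i + P \cdot R \otimes \mathrm{stepsize}_i$$

where $R_L$ is a vector of random numbers drawn from the Levy distribution. For the remaining fifty percent of the population, the following equations are used:

$$\mathrm{stepsize}_i = R_B \otimes (R_B \otimes \mathrm{Elite}_i - \mathrm{Prey}_i), \qquad \mathrm{Prey}_i = \mathrm{Elite}_i + P \cdot CF \otimes \mathrm{stepsize}_i, \qquad CF = \left(1 - \frac{\mathrm{Iter}}{\mathrm{Max\_Iter}}\right)^{2\,\mathrm{Iter}/\mathrm{Max\_Iter}}$$

where CF is the parameter that controls the step size of the predator’s movement.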
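Levy-distributed steps are commonly generated with Mantegna's method, and the sketch below follows that approach. The exponent of 1.5 and the scale factor of 0.05 are assumptions taken from common MPA implementations, not values reported in this study.

```python
import math
import random

def levy_sigma(alpha=1.5):
    """Standard deviation of the numerator normal variate in Mantegna's method."""
    num = math.gamma(1 + alpha) * math.sin(math.pi * alpha / 2)
    den = math.gamma((1 + alpha) / 2) * alpha * 2 ** ((alpha - 1) / 2)
    return (num / den) ** (1 / alpha)

def levy_step(rng, alpha=1.5, scale=0.05):
    """Draw one heavy-tailed step: scale * x / |y|^(1/alpha), with x and y normal."""
    x = rng.gauss(0.0, levy_sigma(alpha))
    y = rng.gauss(0.0, 1.0)
    return scale * x / abs(y) ** (1 / alpha)

rng = random.Random(0)
steps = [levy_step(rng) for _ in range(5)]
```

Most draws are small, with occasional long jumps, which is what lets the prey alternate between local search and relocation.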

3.2.6. Phase Three: Low-Velocity Ratio

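This stage is the final one in the optimisation process and models the predator’s motion when it is quicker than the prey. It happens in the last third of all iterations (i.e., $\mathrm{Iter} > \tfrac{2}{3}\mathrm{Max\_Iter}$):

$$\mathrm{stepsize}_i = R_L \otimes (R_L \otimes \mathrm{Elite}_i - \mathrm{Prey}_i), \qquad \mathrm{Prey}_i = \mathrm{Elite}_i + P \cdot CF \otimes \mathrm{stepsize}_i$$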
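The three-phase schedule and the step-size control parameter CF can be sketched as below. The sketch assumes equal thirds of the iteration budget and the CF expression from the original MPA, CF = (1 − Iter/Max_Iter)^(2·Iter/Max_Iter); the function names are hypothetical.

```python
def mpa_phase(iteration, max_iter):
    """Return 1, 2, or 3 for the high-, unit-, and low-velocity-ratio phases."""
    if iteration < max_iter / 3:
        return 1
    if iteration < 2 * max_iter / 3:
        return 2
    return 3

def step_control(iteration, max_iter):
    """Adaptive parameter CF, decaying from 1 at the start to 0 at the last iteration."""
    ratio = iteration / max_iter
    return (1 - ratio) ** (2 * ratio)

# Phase and CF at a few points of a hypothetical 90-iteration run.
schedule = [(it, mpa_phase(it, 90), step_control(it, 90)) for it in (0, 30, 60, 90)]
```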

3.2.7. Step Four: Eddy Formation and FADs

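It is also possible to include environmental effects in the simulation, such as fish aggregation devices (FADs) and eddy formation. The FADs’ impact is

$$\mathrm{Prey}_i = \begin{cases} \mathrm{Prey}_i + CF\,[X_{\min} + R \otimes (X_{\max} - X_{\min})] \otimes U, & r \le \mathrm{FADs} \\ \mathrm{Prey}_i + [\mathrm{FADs}(1 - r) + r]\,(\mathrm{Prey}_{r1} - \mathrm{Prey}_{r2}), & r > \mathrm{FADs} \end{cases}$$

where r is a random value in the range (0, 1), r1 and r2 are random indices of the prey matrix, FADs is the probability of the FADs’ effect, and U is a binary vector.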
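The FADs perturbation can be sketched as follows, assuming the two-branch rule of the original MPA with a FADs probability of 0.2. How the binary vector U is built differs between published descriptions, so the convention used here (an entry is zero when its random draw falls below FADs) is an explicit assumption.

```python
import random

def fads_effect(prey, lower, upper, cf, fads=0.2, rng=None):
    """Apply the FADs/eddy perturbation to every prey row and return the new matrix."""
    rng = rng or random.Random()
    n = len(prey)
    updated = []
    for row in prey:
        r = rng.random()
        if r <= fads:
            # Long jump: add a CF-scaled random point of the domain on selected dimensions.
            new_row = []
            for x, lo, hi in zip(row, lower, upper):
                u = 0 if rng.random() < fads else 1  # assumed construction of U
                new_row.append(x + cf * (lo + rng.random() * (hi - lo)) * u)
        else:
            # Short move along the difference between two randomly chosen prey.
            r1, r2 = rng.randrange(n), rng.randrange(n)
            scale = fads * (1 - r) + r
            new_row = [x + scale * (a - b) for x, a, b in zip(row, prey[r1], prey[r2])]
        updated.append(new_row)
    return updated

same = [[1.0, 2.0]] * 4
unchanged = fads_effect(same, [0.0, 0.0], [5.0, 5.0], cf=0.5, fads=0.0,
                        rng=random.Random(1))
```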

3.2.8. Step Five: Marine Memory

Marine predators remember successful foraging positions well. This memory is simulated by saving the fitness values of the solutions after every iteration and comparing them with the fitness values from subsequent iterations.
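A minimal sketch of this memory step is given below, assuming a minimisation problem (such as the MSE used later) in which each agent keeps whichever of its previous and current positions has the lower fitness value. The function name is hypothetical.

```python
def apply_memory(old_prey, old_fitness, new_prey, new_fitness):
    """Keep, for each agent, the position with the lower (better) fitness value."""
    kept_prey, kept_fitness = [], []
    for old_row, old_fit, new_row, new_fit in zip(old_prey, old_fitness,
                                                  new_prey, new_fitness):
        if new_fit < old_fit:
            kept_prey.append(new_row)
            kept_fitness.append(new_fit)
        else:
            kept_prey.append(old_row)
            kept_fitness.append(old_fit)
    return kept_prey, kept_fitness

# The first agent improved, the second did not, so only the first is replaced.
prey, fitness = apply_memory([[1.0], [2.0]], [0.30, 0.10],
                             [[1.5], [2.5]], [0.20, 0.40])
```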

3.3. Artificial Neural Network (ANN)

ANN is one of the most common ML models that are successfully used in various science and engineering applications. A major advantage of the ANN model is its capability to simulate nonlinear relationships. In recent years, the multilayer feed-forward neural network (ML-FF-NN) has been shown to have strong predictive ability in a variety of hydrology-related domains. The ML-FF-NN architecture consists of at least three layers: the input, the hidden/middle, and the output [53–56].

Thomas et al. [57] explored whether utilising ML-FF-NN with two hidden layers improves generalisation compared to employing just one hidden layer. According to the study, two-hidden-layer networks generalised better in nine out of ten cases. Additionally, ANNs with two hidden layers have been shown to be effective in capturing the nonlinear relationship between estimated and actual values in several research studies, such as Zubaidi et al. [15], Tortajada et al. [58], and Farzad and El-Shafie [59]. The input layer contains all of the parameters the user enters. The calculations then take place in the hidden layers, and the final output vector is calculated at the output layer [60].

Accordingly, in this study, the ANN uses four layers (including two hidden layers). The nodes in the input layer correspond to the lagged water levels, and the output layer corresponds to the target water level (Figure 4).

The learning algorithm’s primary function is to fine-tune the network’s parameters, such as its weights and biases [61]. This adjustment ensures that the prediction errors stay within tolerable limits. For this reason, the fitness function is often referred to as the error signal, expressed here via the mean squared error (MSE) [60]. The Levenberg–Marquardt (LM) algorithm is often used for ML-FF-NN [50, 62], and the linear and tan-sigmoid activation functions were adopted in the output and hidden layers, respectively.
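The forward pass of such a network and its MSE fitness can be sketched as follows. The sketch uses tanh for the hidden layers, which is mathematically equivalent to the tan-sigmoid function, and a linear output layer. It does not implement Levenberg–Marquardt training, and the layer sizes and weights are hypothetical.

```python
import math

def forward(x, layers):
    """Propagate one input vector; tanh on hidden layers, linear on the last layer."""
    a = x
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, a)) + b for row, b in zip(weights, biases)]
        a = z if idx == len(layers) - 1 else [math.tanh(v) for v in z]
    return a

def mse(targets, predictions):
    """Mean squared error used as the fitness (error signal)."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

# Hypothetical 4-3-2-1 network with zero weights, so the output equals the output bias.
layers = [
    ([[0.0] * 4 for _ in range(3)], [0.0] * 3),  # hidden layer 1
    ([[0.0] * 3 for _ in range(2)], [0.0] * 2),  # hidden layer 2
    ([[0.0] * 2], [0.7]),                        # linear output layer
]
output = forward([2.6, 2.7, 2.65, 2.62], layers)
```

In the hybrid scheme, a metaheuristic would propose hyperparameters, the network would be trained, and the resulting MSE would be returned as the fitness of that proposal.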

In addition, there were a number of significant difficulties and issues with ANN modelling that required additional studies [60], such as the number of neurons in each hidden layer and the learning rate coefficient. This research determined the ANN hyperparameters using the recent metaheuristic algorithms (i.e., MPA, CPSOCGSA, PSO, and SMA).

The time series was divided into three sets, following Zubaidi et al. [15]: the training set (70%, or 82 data points), the testing set (15%, or 17 data points), and the validation set (15%, or 17 data points).
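The split sizes follow from the data length: ten years of monthly data give 120 values, and four lags leave 116 usable input-target samples. The sketch below builds the lagged samples and reproduces the 82/17/17 counts, assuming the testing and validation sizes are rounded down and the remainder goes to training. Whether the authors split the samples chronologically or randomly is not stated, so only the counts are computed here.

```python
def make_lagged(series, n_lags=4):
    """Build inputs [x(t-1), ..., x(t-n_lags)] and targets x(t)."""
    inputs = [series[t - n_lags:t][::-1] for t in range(n_lags, len(series))]
    targets = list(series[n_lags:])
    return inputs, targets

def split_counts(n_samples, test_frac=0.15, val_frac=0.15):
    """Return (train, test, validation) sizes; training takes the remainder."""
    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)
    return n_samples - n_test - n_val, n_test, n_val

series = [float(i) for i in range(120)]  # placeholder for 120 monthly WL values
inputs, targets = make_lagged(series)
sizes = split_counts(len(inputs))
```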

3.4. Model Performance Criteria

Prediction errors are crucial for choosing the right models and for providing information that can be used to suggest changes to current models that will lower forecast deviations in the future [63]. The model performance evaluation metrics chosen include the mean absolute error (MAE), root mean squared error (RMSE), mean bias error (MBE), mean absolute relative error (MARE), coefficient of determination (R²), and scatter index (SI), given in equations (9)–(14), respectively. In addition, the residual analysis plot test and the Taylor diagram test were utilised. In these metrics, N is the size of the data, O_i and F_i are the observed and simulated WL data, $\bar{O}$ is the average value of the actual WL data, and $\bar{F}$ is the mean value of the estimated WL data. For the MARE, RMSE, MBE, and MAE criteria, the best model is the one with values closest to zero [64, 65]. The model performs well when R² exceeds 0.85 [66]. The model’s efficiency is excellent if SI is less than 10%, good if it is from 10% to 20%, acceptable if it is from 20% to 30%, and a failure if it is greater than 30% [67, 68].
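The six criteria can be computed as in the sketch below, which uses common textbook definitions. Several details are assumptions rather than the authors' exact formulas: MBE is taken as the mean of forecast minus observed, R² as the squared Pearson correlation, and SI as RMSE divided by the mean observed value.

```python
import math

def evaluate(observed, forecast):
    """Return MAE, RMSE, MBE, MARE, R2, and SI for paired observed/forecast values."""
    n = len(observed)
    errors = [f - o for o, f in zip(observed, forecast)]
    mean_o = sum(observed) / n
    mean_f = sum(forecast) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, forecast))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": rmse,
        "MBE": sum(errors) / n,  # assumed sign: positive means over-forecasting
        "MARE": sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n,
        "R2": cov ** 2 / (var_o * var_f),
        "SI": rmse / mean_o,
    }

def si_rating(si):
    """Map a scatter index (as a fraction) to the qualitative classes quoted above."""
    percent = si * 100
    if percent < 10:
        return "excellent"
    if percent <= 20:
        return "good"
    if percent <= 30:
        return "acceptable"
    return "failure"

metrics = evaluate([1.0, 2.0, 3.0, 4.0], [1.1, 2.1, 3.1, 4.1])
```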

4. Results and Discussion

4.1. Input Data Analysis

First, time series for water levels were normalised (raw water level data are free of outliers and still free after normalisation). Figure 5 illustrates the normalised WL dataset.

The normalised time series was then decomposed using SSA into several components to obtain a noise-free time series. Figure 6 depicts the normalised data (first row), the improved time series data (second row), and the noise signals (third and fourth rows).

In addition, as shown in Figure 7, the MI approach was utilised to select the ideal input (lags) scenario for the forecasting model. According to the literature, the time lag is chosen at the first minimum of the average mutual information (AMI) [69]. Based on the AMI figure, four monthly lags (Lag_t−1 to Lag_t−4) of historical WL data were utilised to estimate future WL data.

Table 1 displays the correlation coefficients between the target (i.e., future WL) and the independent components (lags of WL) for the raw and pretreated data. The table shows that the data preprocessing approach considerably enhanced data quality, for example, raising the coefficient of correlation for Lag_t−1 from 0.648 to 0.938. The improvement comes from normalisation, which reduces the variance, and from denoising, which removes noise. After that, the data were divided according to Section 3.3.

4.2. Application of Hybrid Algorithm-ANN Methods

The MPA, SMA, CPSOCGSA, and PSO algorithms were utilised to improve the ANN technique by locating the optimum hyperparameters of the ANN (using the MATLAB toolbox). All algorithms, using population sizes of 10, 20, 30, 40, and 50, searched for the optimum number of hidden nodes and the optimum learning rate coefficient of the ANN technique. To decrease uncertainty and increase the forecasting range, every swarm size was run 5 times; for example, see Figure 8 for the MPA-ANN technique. For the swarm size of 10, the fourth MPA-ANN run is ideal since it has the lowest error. It was selected and grouped with the ideal runs for the other swarm sizes.
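The replicate-and-select procedure can be sketched as below. The `run_once` callable stands in for one complete optimiser-plus-ANN run that returns its final MSE; it is a hypothetical placeholder, and the toy error function exists only to make the example executable.

```python
def best_configuration(run_once, swarm_sizes=(10, 20, 30, 40, 50), repeats=5):
    """Run each swarm size several times; keep the lowest error per size and overall."""
    best_per_size = {}
    for size in swarm_sizes:
        errors = [run_once(size, rep) for rep in range(repeats)]
        best_per_size[size] = min(errors)
    best_size = min(best_per_size, key=best_per_size.get)
    return best_size, best_per_size

def toy_run(size, rep):
    """Stand-in for a real training run: a made-up error surface with a minimum at 40."""
    return abs(size - 40) / 1000 + 0.001 * rep

best_size, best_per_size = best_configuration(toy_run)
```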

As shown in Figures 9(a)–9(d), a swarm size of 40 provides the optimal solution for the CPSOCGSA-ANN, SMA-ANN, and MPA-ANN algorithms, whereas a swarm size of 50 delivers the optimal solution for the PSO-ANN algorithm. Analysing the fitness function values for each algorithm in detail reveals that the MSE for the MPA-ANN algorithm is 0.005701 (after 70 iterations). In contrast, the SMA-ANN and CPSOCGSA-ANN algorithms did not improve beyond MSE values of 0.006308 and 0.005756, respectively. The PSO-ANN technique only achieves its best MSE of 0.005722 after 196 iterations. Although this research focuses on accuracy, run times for each algorithm were recorded. Run times differed between algorithms for each swarm size; for example, the MPA-ANN algorithm took about 12 minutes and 58 seconds with a swarm size of 10, while the CPSOCGSA-ANN, SMA-ANN, and PSO-ANN algorithms took about 6 minutes and 37 seconds, 11 minutes and 54 seconds, and 9 minutes and 50 seconds, respectively. Based on the above results, Table 2 summarises the hyperparameters of the ANN techniques for the optimal swarm size of every optimisation algorithm.


4.3. Evaluating and Comparing the Techniques’ Performance

In accordance with the methodology of Tao et al. [2] and Ghorbani et al. [22], the hyperparameters in Table 2 were used to configure four ANN models. Multiple iterations of each ANN method were performed to identify the best network that consistently solves the problem. Five statistical metrics were used to evaluate the methods’ efficacy (see Section 3.4 for more information). Table 3 presents the statistical criteria for every technique. The MPA-ANN, CPSOCGSA-ANN, and PSO-ANN approaches offered R² equal to or greater than 0.85, which are good findings according to Dawson et al. [66]. In contrast, SMA-ANN yielded R² less than 0.85. The MPA-ANN model outperforms the other three models, with an R² of 0.94. Furthermore, MPA-ANN outperforms the other methods in the MAE, MARE, RMSE, and MBE tests; for example, the MBE values of MPA-ANN, PSO-ANN, SMA-ANN, and CPSOCGSA-ANN are 0.0006, 0.0062, 0.0111, and 0.0057, respectively. This table highlights that the MPA-ANN approach is the ideal one for predicting WL time series in the validation phase.

Also, the Taylor diagrams for four different ANN forecasting models are shown in Figure 10: (A) MPA-ANN, (B) CPSOCGSA-ANN, (C) SMA-ANN, and (D) PSO-ANN. The level of concordance between observed and predicted behaviour is summarised graphically in this figure, taking into consideration the root mean square error difference (RMSD), standard deviation (SD), and correlation coefficient (R) [2, 70]. The reference point depicts the measured water level on the Taylor diagram’s X-axis. A technique close to the observed point (reference) is thought to be better. It thus provides an effective evaluation of the comparative performance of various models. According to Figure 10, the MPA-ANN model achieved better R and lower RMSD and SD when compared to the measured point. The outcomes, as revealed in Figure 10, confirm the outcomes of Table 3 and reveal the superiority of the MPA-ANN model in predicting the WL data.

Moreover, an error analysis was performed to check the prediction models’ goodness of fit. Figure 11 shows the error scatterplots against the number of samples for WL data. Three important patterns can be inferred from the data presented in the figure: (1) the average error for the MPA-ANN model was much nearer to zero than for the other models, (2) the distribution pattern does not follow any noticeable trend, and (3) the error density follows a regular distribution across all data. Additionally, with MPA-ANN, the margin of error was only (−0.009 to 0.013) m, in comparison to the CPSOCGSA-ANN, SMA-ANN, and PSO-ANN models, with (−0.005 to 0.016) m, (−0.003 to 0.023) m, and (−0.013 to 0.020) m, respectively. The result thus obtained for error analysis is compatible with previous results.

Overall, the MPA-ANN model outperforms the other hybrid models. Accordingly, SI is employed to check this model’s efficiency and durability. According to the boundaries in Section 3.4, the MPA-ANN has excellent results with SI = 0.002. Additionally, the MPA-ANN model was further supported by the residual analysis. The findings illustrate that the residual data of the MPA-ANN model have a normal distribution based on the significance values of the Shapiro–Wilk and Kolmogorov–Smirnov tests. The data are considered normally distributed when the significance (p value) > 0.05 [71].

Faramarzi et al. [34] designed the MPA to locate the global solution by incorporating multiple methods and strategies during the optimisation process. The different foraging strategies in the biological relationship between predators and prey have had a major impact on MPA. As a result, the Brownian and Levy flight (LF) distributions were adopted to balance exploration and exploitation and to enhance search ability significantly. This allowed the MPA approach to accurately identify the global optimum solutions to the optimisation problems investigated here. The outcomes of the present research support the hypothesis of the HOPH technique. They are also consistent with the previous literature in the hydrological fields, such as Tikhamarine et al. [72] and Wang et al. [73].

These are the most notable results of this study:
(1) These findings emphasise the SSA’s potential utility in enhancing raw data quality by removing the noise from time series, and that of the mutual information technique in determining the ideal model input scenario.
(2) MPA has proven to be a trustworthy algorithm when combined with the ANN method for estimating WL data compared with the CPSOCGSA, PSO, and SMA algorithms.
(3) Multiple statistical criteria analyses (i.e., MAE, RMSE, R², MBE, graphical tests, and residual analysis) showed that the proposed methodology (i.e., HOPH) accurately predicted the WL time series.
(4) Four metaheuristic algorithms were integrated with the ANN model; each algorithm was run with five swarm sizes, and each swarm size was run five times, leading to an increase in the forecasting range and lower uncertainty.
(5) The research findings offer valuable scientific information that helps decision-makers forecast WL information with low uncertainty.

5. Conclusion

This study uses a novel hybrid model combining data pretreatment with a recent optimisation algorithm (MPA) and an ANN model to simulate the monthly WL of the Tigris River, Al-Kut city. The ability of the MPA algorithm to enhance ANN model performance was compared to the CPSOCGSA, SMA, and PSO algorithms. Figure 3 summarises this methodology, and the main conclusions are as follows:
(i) The results show that data pretreatment techniques successfully enhanced data quality (i.e., denoising time series by SSA) and chose the optimal model input scenario (i.e., selecting lags by MI).
(ii) Based on several statistical criteria, MPA-ANN tends to be superior to the other hybrid techniques. Generally, the proposed methodology provides excellent to good performance in forecasting monthly WL.
(iii) These outcomes can provide local governments, stakeholders, and managers with valuable information that can improve the water sector’s irrigation system administration and its service and resource management.
(iv) For future work, it is advised to conduct more studies on HOPH prediction models (especially the ANN model) because preprocessing approaches and the specification of the hyperparameters of soft computing models have much room for improvement.

Data Availability

The data were provided by Wasit Province/the Directorate of Water Resources.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

Acknowledgments

The researchers would like to thank Wasit Province/the Directorate of Water Resources for supplying the Tigris water level data.

Copyright

Copyright © 2023 Sarah J. Mohammed et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
