Projected Water Levels and Identified Future Floods: A Comparative Analysis for Mahaweli River, Sri Lanka

The Rainfall-Runoff (R-R) relationship is essential to the hydrological cycle. Sophisticated hydrological models can accurately investigate R-R relationships; however, they require large amounts of data. Therefore, machine learning and soft computing techniques have attracted attention where hydrological, meteorological, and geological data are limited. The accuracy of such models depends on various parameters, including the quality of the inputs and outputs and the algorithms used. However, identifying a perfect algorithm is still challenging. This study develops a fuzzy logic-based algorithm called Cascaded-ANFIS to accurately predict runoff based on rainfall. The model was compared against three regression algorithms: Long Short-Term Memory, Gated Recurrent Unit, and Recurrent Neural Networks. These algorithms were selected due to their outstanding performance in similar studies. The models were tested on the Mahaweli River, the longest in Sri Lanka. The results show that the Cascaded-ANFIS-based model outperforms the other algorithms. The correlation coefficients of the predictions were 0.9330, 0.9120, 0.9133, 0.8915, 0.6811, 0.6811, and 0.6734 for the Cascaded-ANFIS, LSTM, GRU, RNN, Linear, Ridge, and Lasso regression models, respectively. Hence, this study concludes that the proposed algorithm is 21% more accurate than the second-best LSTM algorithm. In addition, Shared Socio-economic Pathways (SSP2-4.5 and SSP5-8.5 scenarios) were used to generate future rainfalls, forecast the near-future and mid-future water levels, and identify potential flood events. The forecasting results indicate a decrease in flood events and magnitudes in both the SSP2-4.5 and SSP5-8.5 scenarios. Furthermore, the SSP5-8.5 scenario shows drought conditions from May to August each year. The results of this study can be used to manage and control water resources and to mitigate flood damage.


I. INTRODUCTION
Natural disasters often occur due to recent climate changes. Several studies have focused on detecting climate change and its effects, and remote sensing methods are widely used in these methodologies [71].
Floods are among the most frequently observed natural disasters, and they are a direct outcome of the rainfall-runoff (R-R) process [77]. Due to their severity and frequent occurrence, flood prediction has received significant attention in R-R modelling [1]. Even though floods are natural disasters, their severity is affected by anthropogenic activities. Ongoing urbanization drastically changes flow hydrographs, producing higher peaks that arrive more quickly [2], [3], [4]. Flash floods often occur in urbanized areas [5], [6]. Hence, urbanization is one of the most influential factors in today's floods.
The associate editor coordinating the review of this manuscript and approving it for publication was Akin Tascikaraoglu.
Therefore, accurate modelling of the rainfall-runoff relationships of catchments is in high demand. It is important to note that each catchment has to be modelled individually to find its R-R relationship. Commercial and non-commercial hydrological computer packages are available to simulate the R-R relationships of catchments. However, these packages require various data, including digital elevation models, soil data, meteorological data, and discharge data [17]. The accuracy of catchment models varies widely with the quality of the catchment data [39]. Only some catchments are gauged to provide meteorological and discharge data and other catchment characteristics on a temporal and spatial basis. Thus, catchment models often struggle to achieve the accuracy required to model the runoff and then predict floods.
When data are limited, soft computing [38], [40] and machine learning techniques [19], [20], [21], [22] are helpful for modelling R-R processes. These processes can be modelled using only the known rainfall and measured discharges and, importantly, without any catchment characteristics. Hence, numerous soft computing and machine learning methodologies have been developed using various algorithms and case studies. One of these data-driven methods is the artificial neural network (ANN), which has been used in various fields, including hydrology and water resources. It has gained popularity because it can address, model, and forecast stochastic and nonlinear behaviour in a system [41], [42], [43], [44], [45], [46], [47]. ANNs do not replace conceptual watershed modelling, because they cannot describe the catchment's internal structure or handle distributed data on its physical properties. Nevertheless, they have gained acceptance as a practical substitute for conceptual models in forecasting because of numerous benefits, such as the ability to produce simple and accurate models [48] and their computation speed [49]. Additionally, ANNs have demonstrated their strength and ability to mimic hydrological events. As a result, ANN models are suggested for rainfall-runoff modelling due to their straightforward designs and accuracy, which help address water resources management issues.
To create ANN models, most studies have used feedforward backpropagation (FFBP) networks. Although these networks are relatively well known for their ability to anticipate floods, their performance in a particular application has not been established [43]. Since several learning methods may be used to improve ANNs, a wide range of possibilities remains. Gradient descent (GD) is frequently used in neural network training at the backpropagation stage [50]. GD has been used in recent years to increase the potential of the backpropagation algorithm. However, GD may experience problems with convergence, slow training, overfitting, and becoming stuck in local minima. The performance of the training algorithm can degrade when the structure of the model is complex and the parameter set is large [11], [51], [52].
Moreover, feed-forward deep neural networks (FF-DNNs) have been used widely in climate change-related studies. A case study of Kastoria Lake in Greece used an FF-DNN to predict dissolved oxygen and obtained a maximum NSE of 0.89 [70]. Forecasting of dissolved oxygen was also studied using three methods: the autoregressive integrated moving average (ARIMA) method, the transfer function (TF) method, and the NN method [72]. That study concluded that the ARIMA method provides significant results compared to TF and NN. Additionally, a combination of tools such as remote sensing, weather forecasting, and artificial intelligence was used to improve irrigation management in Mediterranean basins. This study suggests that using these tools together can rapidly enhance an irrigation system [73].
Recently, several novel evaluations of CNN models were implemented: the Extreme Gradient Boosting (XGBoost) and CNN-transformer. These algorithms have been widely tested on uncertain and nonlinear data. Many studies recommend ANFIS as a highly accurate algorithm for predictions [38], [40]. Xuan-Nam et al. [39] proposed an ML model for predicting blast-induced ground vibration in quarries. They employed several state-of-the-art algorithms, such as Moth-flame optimization-based ANFIS, XGBoost, ANN, and SVM. The study showed that the ANFIS-based algorithm outperformed the other models with an accuracy of 98.62%. Moreover, two environmental studies employing the ANFIS and XGBoost algorithms have been presented by Hameed et al. [38] and Junliang et al. [40].
On the other hand, several investigations in the hydrological sciences have used Genetic Algorithms (GA), specifically natural code GAs, to train ANN rainfall-runoff models that predict daily flow more accurately than backpropagation-based ANN models [59]. In conjunction with intelligence approaches, the GA has developed into a potent tool for modelling and optimizing complicated processes [56], [57], [58]. It is commonly used with ANNs to enhance efficiency by tuning the parameters [54], [55]. Roy and Singh [11] developed a novel hybrid metaheuristic method for simulating the rainfall-runoff process that integrates Biogeography-Based Optimization (BBO), Particle Swarm Optimization (PSO), and the grey wolf optimizer (GWO) with ANN and Adaptive Network-based Fuzzy Inference Systems (ANFIS). Moreover, three optimization algorithms integrated with ANFIS were introduced for rainfall-runoff prediction, namely, Differential Evolution algorithm based ANFIS (ANFIS-DE), Particle Swarm Optimization based ANFIS (ANFIS-PSO), and Genetic Algorithm based ANFIS (ANFIS-GA) [53]. Investigating and contrasting these models in hydrology is strongly advised because the different algorithms have various advantages and distinct methods for modelling complex phenomena. Such investigations in hydrology, particularly in rainfall-runoff modelling, are still in the early stages. Hence, the computational analysis has to be conducted comprehensively for a better outcome. Therefore, this research study aims to contribute to the scientific community by achieving the following objectives.
VOLUME 11, 2023
The ANFIS system combines a neural network (NN) and a fuzzy inference system (FIS) and benefits from the advantages of both. Another crucial benefit of this network is that the system can be converted to straightforward if-then rules. The if-then control structure gives ANFIS the capacity to handle non-linear functions. ANFIS has been applied in several fields of study and yields generally effective outcomes. It is well known that ANFIS may be combined with many algorithms to reduce the training-phase error. For instance, the least squares approach and gradient descent can increase the efficiency of finding the optimal parameters. ANFIS functions similarly to the fuzzy system that Takagi and Sugeno presented in 1985 [30]. The consequent parameters are determined in the forward pass using a least-squares method.
Input, membership function, fuzzification, defuzzification, normalization, and output are the five general layers that make up ANFIS. The explanation that follows is based on Figure 1 and assumes that the ANFIS system has two inputs, x and y, while the output is denoted as f in Equations (1) and (2).
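Equations (1) and (2) are not reproduced in this text. For a two-input, first-order Takagi-Sugeno system, the rule base is conventionally written as below; this is the textbook form, assumed here rather than recovered from the original equations, using the symbols x, y, f, A_i, B_i, p_i, q_i, and r_i introduced in the surrounding paragraphs.

```latex
\begin{align}
\text{Rule 1: if } x \text{ is } A_1 \text{ and } y \text{ is } B_1,\ \text{then } f_1 &= p_1 x + q_1 y + r_1 \tag{1}\\
\text{Rule 2: if } x \text{ is } A_2 \text{ and } y \text{ is } B_2,\ \text{then } f_2 &= p_2 x + q_2 y + r_2 \tag{2}
\end{align}
```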
Fuzzy sets A_i and B_i are used here, and design parameters p_i, q_i, and r_i are used, where i = 1, 2. The membership layer is the first layer of the ANFIS structure. This layer's nodes are adaptive. For each input, membership grades are created in this layer. Its functionality can be explained as follows: the nodes' linguistic labels are A_i and B_i when x and y are the inputs. The membership grades for the sets A_1, A_2, B_1, and B_2 are µ_Ai(x) and µ_Bj(y), respectively, which are adaptive. For instance, the following equation is used when the bell-shaped membership function is employed.
In this case, the bell-shaped function's corresponding parameters are a_i, b_i, and c_i. Simple multiplication is carried out in the following layer, which comprises fixed nodes. The layer's mathematical expression is presented next.
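The bell-shaped membership equation is missing from this text. The generalized bell function in its standard form is given below as an assumed reconstruction; it uses the three parameters a_i, b_i, and c_i that the next paragraph names.

```latex
\mu_{A_i}(x) = \frac{1}{1 + \left| \dfrac{x - c_i}{a_i} \right|^{2 b_i}}, \qquad i = 1, 2
```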
A fixed node normalization layer comes after that. This layer is where the output from the second layer is normalized. The operation is demonstrated by the equation below.
Here, w_i denotes the firing strength of node i. The fourth layer is adaptive: it takes the normalized output of the third layer and combines it with the design parameters. The outcome of this adaptive layer may be shown using the equation below.
Only one fixed node makes up the final layer. This node adds up all the inputs that are received. Ultimately, the complete result can be extracted by applying the equation below.
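The layer equations referred to above did not survive in this text, so the following is a minimal Python sketch of the five-layer forward pass for the two-rule, two-input case. It assumes the standard first-order Sugeno formulation and a generalized bell membership function; the function names and the parameter values in the usage note are illustrative and not taken from the study.

```python
def bell(x, a, b, c):
    """Generalized bell membership grade, in the range (0, 1]."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def anfis_forward(x, y, premise_x, premise_y, consequents):
    """Forward pass of a two-input, first-order Sugeno ANFIS.

    premise_x, premise_y: one (a, b, c) tuple per rule for inputs x and y.
    consequents: one (p, q, r) tuple per rule.
    Returns the output f and the normalized firing strengths.
    """
    # Layer 1: membership grades of each input (adaptive nodes).
    mu_a = [bell(x, *params) for params in premise_x]
    mu_b = [bell(y, *params) for params in premise_y]
    # Layer 2: firing strength of each rule (product of the grades).
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalize the firing strengths so they sum to one.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weight each rule's linear output (adaptive nodes).
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: a single fixed node sums all incoming signals.
    return sum(rule_out), w_bar
```

For example, with `premise_x = premise_y = [(1, 1, 0), (1, 1, 2)]` and `consequents = [(1, 0, 0), (0, 1, 2)]`, the call `anfis_forward(1.0, 1.0, premise_x, premise_y, consequents)` weights both rules equally, because the input lies midway between the two bell centres.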
Since back-propagation and least squares techniques improve the method's accuracy and speed up convergence, ANFIS has a substantial capacity for learning. As previously stated, this system uses six tunable parameters (when a bell-shaped function is used). The primary goal of this ANFIS system is to tune these parameters to obtain the lowest cost. The first layer's parameters are adjusted through back-propagation, and the fourth layer's parameters are adjusted by the least squares technique [34].
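To illustrate the least squares half of this hybrid scheme, the sketch below estimates the fourth-layer parameters (p_i, q_i, r_i) while the first-layer parameters are held fixed. It is a simplified batch version that solves the normal equations in pure Python; ANFIS implementations often use a recursive least squares update instead, and the helper names here are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * pv for v, pv in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_consequents(samples, targets, w_bars):
    """Least squares estimate of (p_i, q_i, r_i) for fixed premise parameters.

    samples: list of (x, y) inputs; targets: list of desired outputs f;
    w_bars: normalized firing strengths of every rule, one tuple per sample.
    """
    # The output is linear in the consequents:
    # f = sum_i w_bar_i * (p_i * x + q_i * y + r_i),
    # so each design-matrix row is [w1*x, w1*y, w1, w2*x, w2*y, w2, ...].
    rows = [[v for wb in w for v in (wb * x, wb * y, wb)]
            for (x, y), w in zip(samples, w_bars)]
    k = len(rows[0])
    # Normal equations: (R^T R) theta = R^T f.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    theta = solve_linear(ata, atb)
    return [tuple(theta[i:i + 3]) for i in range(0, k, 3)]
```

The normal equations are solvable only when the design matrix has full column rank, which requires the firing strengths to vary across the samples.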
With two main inputs and one main output, the Cascaded-ANFIS algorithm is a repeated ANFIS algorithm. The critical difference between the Cascaded-ANFIS algorithm and the classic ANFIS algorithm is that the output of one conventional ANFIS stage is used as the input for subsequent applications of the conventional ANFIS method. Figure 2 presents the construction of this method. The Cascaded-ANFIS algorithm comprises two major parts: the pair selection method and the training method. The Pair Selection module solves the first significant issue with ANFIS. However, the inner layers of the ANFIS model use fuzzy logic, just like the standard ANFIS technique. Membership functions convert numerical data into fuzzy members and are used to achieve fuzzification.
However, the original method uses each characteristic to build a robust model, equally valid for noisy data sets. The novel Cascaded-ANFIS method manages computational complexity through its training module.
The Pair Selection module takes advantage of sequential feature selection (SFS). This technique employs a 2-input, 1-output ANFIS model to find the best match for each input. The training module also makes use of the 2-input ANFIS model. The ANFIS module may receive the input variables directly, since they are paired with the best match found by the preceding module, and it produces the current outputs and a Root Mean Squared Error (RMSE) for each data pair. The RMSE is then compared with a pre-determined target error. If the target error is attained, the procedure finishes. If not, the algorithm advances to the next iteration.

III. METHODOLOGY
A. PROBLEM FORMULATIONS
The relationship shown in Equation 10 was modelled using the Cascaded-ANFIS algorithm. The relationship was trained using the ground-measured rainfall at the i-th station and the water level. Subscript t in Equation 10 denotes the time domain of the R-R relationship.
However, the runoff from a rainfall event lags behind the rainfall itself because of catchment characteristics such as river length, catchment area, land use patterns, and soil type. The travel time of a particular rainfall event therefore has to be clearly understood. Figure 3 shows the flowchart of the developed Cascaded-ANFIS model. As shown in the figure, the rainfall data are used as the primary input of the system. The input data are then re-arranged with delays of one day and two days. Inputs were then filtered out based on the correlation between each input and the output water level. A minimum correlation of 0.40 between an input and the output was required in this case. The selection methodology of inputs is discussed in later sections.
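The lagging and filtering steps described above can be sketched as follows. This is a minimal illustration, assuming a Pearson correlation, lags of zero, one, and two days, and a filter on the signed correlation; the function names and the data layout (a dictionary of station series) are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / math.sqrt(var_a * var_b)


def lagged_inputs(rainfall, level, lags=(0, 1, 2), min_corr=0.40):
    """Build lagged rainfall series and keep those correlated with the level.

    rainfall: dict mapping station name -> list of daily rainfall values.
    level: list of daily water levels, same length as each rainfall list.
    Returns {(station, lag): correlation} for the inputs that pass the filter.
    """
    max_lag = max(lags)
    target = level[max_lag:]  # drop the first days so every lag is defined
    kept = {}
    for station, series in rainfall.items():
        for lag in lags:
            # Rainfall observed `lag` days before each target day.
            shifted = series[max_lag - lag:len(series) - lag]
            r = pearson(shifted, target)
            if r >= min_corr:
                kept[(station, lag)] = r
    return kept
```

Each retained `(station, lag)` pair would then become one input column of the R-R model.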

B. COMPARATIVE ANALYSIS TO IDENTIFY THE BEST ALGORITHM
Six regression algorithms (Linear Regression, Ridge Regression, Lasso Regression, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Recurrent Neural Network (RNN)) together with the Cascaded-ANFIS algorithm were used to formulate the R-R relationship. These ML algorithms were considered in this study for a few specific reasons, such as their similarity and ease of implementation. Moreover, they are lightweight and can be run on a general-purpose computer without GPU support. Table 1 shows the parameter values considered when tuning the hyperparameters of Ridge, Lasso, LSTM, GRU, and RNN. These parameters were selected by trial and error. Each parameter was tested with the datasets used in this study, and the optimum value was employed.
The Cascaded-ANFIS used three Gaussian membership functions for each input in the system. Ten cascade stages were used to achieve satisfactory accuracy and error values.

C. THE MAHAWELI RIVER SUB-CATCHMENT ANALYSIS
Localized floods can be observed in the sub-catchments in Figure 5 without major floods appearing downstream, due to the catchment characteristics. Therefore, the downstream river gauge may not record any flood situation even though upstream sub-catchments have experienced localized floods. It is therefore essential to divide larger catchments into sub-catchments and analyze them separately. This scenario was analyzed in this research work, and Equation 10 was formulated for the sub-catchments.

D. FLOOD IDENTIFICATION
According to the DesInventar dataset of natural disasters [78], there has been significant damage due to flooding in Sri Lanka. In most cases, the damage has increased due to unexpected heavy rainfall and poor irrigation management. The database reveals that in the past events from 2005 to 2018, there was at least one death due to flooding. The highest numbers of deaths, injured persons, and missing persons were recorded in 2017, with 67, 73, and 63, respectively.
Historical water levels were analyzed to define threshold water levels for identifying floods in the basin. Water levels were considered here because the authorities recorded the data as water levels instead of water flows. If the water level or stream flow exceeds the threshold, that flow may be a flood. However, this can be confirmed with the ground-measured discharge data and by comparison with flood data for the catchment. Nevertheless, many countries do not have such flood databases, so there can be some issues with the accuracy [79].
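The threshold rule can be illustrated with a short sketch. Grouping consecutive days above the threshold into one event is an assumption made here for readability; the text only states that a level exceeding the threshold may indicate a flood. The function name and sample values are hypothetical.

```python
def flood_events(dates, levels, threshold):
    """Group consecutive days above `threshold` into candidate flood events.

    Returns a list of (start_date, end_date, peak_level) tuples.
    """
    events = []
    current = None  # [start, end, peak] of the event being built
    for day, level in zip(dates, levels):
        if level > threshold:
            if current is None:
                current = [day, day, level]
            else:
                current[1] = day
                current[2] = max(current[2], level)
        elif current is not None:
            events.append(tuple(current))
            current = None
    if current is not None:  # the series ended while still above the threshold
        events.append(tuple(current))
    return events
```

Each returned event is only a candidate; as noted above, it would still need to be confirmed against measured discharge or recorded flood data.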

E. SHARED SOCIO-ECONOMIC PATHWAYS (SSP) CLIMATE DATA EXTRACTION
IPCC's sixth report [60] presented a new set of scenarios based on greenhouse gas emissions to project the future climates until 2100. Practitioners who engage with future climate data may investigate climate changes across a range of quite diverse futures thanks to the availability of climate forecasts for numerous Shared Socio-Economic Pathways (SSPs). These SSPs are titled SSP1, SSP2, SSP3, SSP4, and SSP5 under several Socioeconomic Pathways. SSPs describe potential future growth pathways for human cultures. A set of models combine assumptions on the ambitions for reducing the impact of climate change with predictions about how population, education, energy usage, technology, and other factors may evolve over the next century. Various conceivable future climates, from a pessimistic high-carbon scenario to a low-carbon one that satisfies the goals of the 2015 Paris Agreement, are described in the climate change forecasts from these scenarios [25], [26].
SSP-based scenarios improve upon the Representative Concentration Pathways (RCPs), the earlier projections of greenhouse gas concentrations. RCPs were created explicitly for the community of climate modellers to investigate the consequences of various emission trajectories or emission concentrations. It is challenging to relate social trends such as population growth, educational attainment, and government policies to climate objectives like limiting global warming to below 2 °C, since the socioeconomic factors used to establish RCPs were not standardized. To address this, SSPs outline how societal decisions might alter radiative forcing towards the end of the century. As a result, SSPs were built on RCPs to enable a uniform comparison of societal decisions and the degrees of climate change they cause. These SSP data are used in various recent research studies, such as flood forecasting [35], land use optimization [36], and prediction of future air pollution [37]. According to these studies, the reliability of SSP data is much higher than that of RCP data. Therefore, this study employed SSP projections for daily rainfall data acquisition [27], [28]. Two SSP scenarios have been used for the data acquisition, namely SSP2-4.5 and SSP5-8.5. SSP2-4.5 represents the low carbon impact globally, while SSP5-8.5 is the high carbon scenario.

F. BIAS CORRECTION
The extracted rainfall data under SSP2-4.5 and SSP5-8.5 were corrected using linear bias correction factors. Usually, the data extracted from climate models may have some systematic errors [61]. Therefore, the model's extracted climate data are corrected for bias using the ground-measured climate data. Various bias correction techniques are available [62]; however, the linear bias correction method was selected in this research work. Equation 11 gives the simple mathematical formulation for linear bias correction. More details on this can be found in Chaturanika et al. [63].
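Equation 11 is not reproduced in this text. The multiplicative linear scaling form commonly used for rainfall is shown below as an assumed reconstruction, written with the symbols that the following sentence defines; the original equation may differ in detail.

```latex
\begin{align}
RF^{*}_{his,d} &= RF_{his,d} \times \frac{\mu_m\!\left(RF_{obs,d}\right)}{\mu_m\!\left(RF_{his,d}\right)} \\
RF^{*}_{sim,d} &= RF_{sim,d} \times \frac{\mu_m\!\left(RF_{obs,d}\right)}{\mu_m\!\left(RF_{his,d}\right)}
\end{align}
```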
where RF, d, µ_m, his, obs, and sim denote rainfall, daily, long-term monthly mean, raw SSP data, observed/measured data, and raw RCM forecast, respectively. The symbol * denotes the bias-corrected datasets.
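A minimal sketch of linear bias correction is given below. It assumes the common multiplicative form, in which each simulated daily value is scaled by the ratio of the observed to the raw historical long-term monthly mean; the function names and data layout are illustrative only.

```python
from collections import defaultdict


def monthly_means(months, values):
    """Long-term mean of `values` for each calendar month (1-12)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for m, v in zip(months, values):
        totals[m] += v
        counts[m] += 1
    return {m: totals[m] / counts[m] for m in totals}


def linear_scaling(obs, hist, sim):
    """Multiplicative linear scaling of daily rainfall.

    Each argument is a pair (months, values) of equal-length lists, where
    months[i] is the calendar month of the daily value values[i].
    Returns the bias-corrected copy of the `sim` values.
    """
    mu_obs = monthly_means(*obs)
    mu_hist = monthly_means(*hist)
    # One factor per month: observed mean over raw historical model mean.
    factor = {m: mu_obs[m] / mu_hist[m]
              for m in mu_hist if m in mu_obs and mu_hist[m] > 0}
    sim_months, sim_values = sim
    # Months without a usable factor are left uncorrected (factor 1.0).
    return [v * factor.get(m, 1.0) for m, v in zip(sim_months, sim_values)]
```

Because the factor is multiplicative, dry days in the simulated series stay dry after correction.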

G. PROJECTED WATER LEVELS AND FLOODS
Bias-corrected SSP rainfall data were fed to the developed R-R relationship in Equation 10. Based on these future rainfalls under the two SSP scenarios, the stream flows, in terms of water levels, were predicted for future years. The predicted water levels for the whole catchment were tested for extreme values in the time series, and localised and downstream floods were then identified. These predicted floods are given for the near future (2022-2030) and the mid-future (2031-2050).

IV. CASE STUDY
Sri Lanka is a country blessed with water resources. However, heavy monsoon rainfall drives many rivers into flood, and annual floods are common [64]. Sri Lanka has many rivers, tanks, and lakes, and these watersheds are flooded during the monsoon periods. Several deaths and extensive structural damage are reported annually due to extreme weather conditions. Sri Lanka has 103 rivers, and their total length is around 4500 km. The longest river in Sri Lanka is the Mahaweli River. It is 335 km long and has a river basin of 10,488 km², which covers almost one-fifth of the total area of the island [65], [66]. The river has several branches along the way to the sea. Hydropower generated by the Mahaweli River provides 40% of Sri Lanka's total electricity demand. In addition, the Mahaweli River is known to provide a vast water supply for the cultivation of crops such as rice and vegetables [67]. Several Mahaweli River developments have been carried out for hydroelectric generation and irrigation purposes. Many dams were constructed along the river to enhance energy generation, which changed the flood risk. The Kothmale dam was one of those constructed to generate electricity; however, it has indirectly mitigated floods downstream [68]. The Mahaweli River was selected for this research study due to its importance for many utilities and its frequent floods in the northeastern monsoon period (from December to February).

A. STUDY AREA AND SUB-CATCHMENTS
The Mahaweli River starts from the central hills of Sri Lanka with several small creeks. Agra Oya from Horton Plains is one of the starting creeks of the Mahaweli River. The river reaches the Bay of Bengal on the southwestern side of Trincomalee Bay. The bay includes the first of several submarine canyons, making Trincomalee one of the finest deep-sea harbours in the world. As part of the Mahaweli Development program, the river and its tributaries are dammed at several locations to allow irrigation in the dry zone, with almost 1,000 km² (386 sq mi) of land irrigated. Figure 5 shows the primary catchment and sub-catchments, whereas Figure 4 shows the catchment of the Mahaweli River basin. Two sub-catchments were identified along two tributaries of the Mahaweli River. The catchment above Peradeniya (for Kothmala Oya and other upstream creeks of the Mahaweli River) is given in sub-figure (a). In contrast, the catchment above Thaldena for Badulu Oya is given in sub-figure (b) in Figure 5. The sub-catchment at Peradeniya is in the wet zone of the country; thus, heavy rainfall can be experienced. However, the sub-catchment at Thaldena is in the wet and intermediate zone. Thus, the rainfall in that sub-catchment is not as high as that at Peradeniya. However, these two sub-catchments are important in terms of terrain, land use, and urbanization. In addition, two flow gauges can also be found in these two sub-catchments.
B. DATA
Figure 4 shows the rain gauges for the Mahaweli River basin. Due to the unavailability of complete data in most years, the daily rainfall data from 2000 to 2017 were purchased from the Department of Meteorology, Sri Lanka. The missing data percentage for the selected years was less than 1%. The rain gauges were selected to represent the whole catchment, covering as much of its area as possible. In addition, the stream flow gauge at Manampitiya was selected to model the R-R relationship. This is the most downstream stream flow gauge available. The water levels at the station were purchased from the Department of Irrigation, Sri Lanka. Furthermore, two water level measuring stations were identified for the two selected sub-catchments: Peradeniya and Thaldena (refer to Figure 5). The water levels for these two stations were also purchased for the same period from the Department of Irrigation, Sri Lanka. A descriptive analysis of the dataset used in this analysis is shown in Table 2. There were 6207 data samples in the dataset. The selected dataset was divided at a ratio of 7:3 for training and testing. These sub-dataset samples were used to train and test the algorithms used in this study. The water levels are presented in centimetres, whereas the rainfalls are in millimetres. Moreover, several homogeneity tests were conducted, namely the Standard normal homogeneity test (SNHT), the Buishand range (BR) test, the Pettitt test, and the von Neumann ratio (VNR) test, to evaluate the dataset before employing it in training models.
Because data were missing over a significant time frame, a few rainfall stations were omitted from the evaluation of the case study. Missing data were present at Huruluwewa, Dambuluoya, Ulhitiya, Minipe LB, and Rantembe. Therefore, as shown in Table 2, data from 13 rainfall stations were considered as the inputs.
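The 7:3 division can be sketched as below. The text does not say whether the split was chronological or random; a chronological split is assumed here because the samples form a daily time series, and the function name is hypothetical.

```python
def chronological_split(samples, train_parts=7, test_parts=3):
    """Split an ordered series into leading train and trailing test parts."""
    # Integer arithmetic avoids floating-point rounding of the cut index.
    cut = len(samples) * train_parts // (train_parts + test_parts)
    return samples[:cut], samples[cut:]
```

Under this assumption, the 6207 samples would yield 4344 training samples and 1863 testing samples.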
The correlation calculation described in subsection III-C is given in Table 3. The selected inputs, which have a minimum correlation of 0.4, are highlighted. Twelve inputs were selected using the correlation method to train the R-R model. The correlation threshold was chosen by trial and error: the maximum accuracy was obtained at a correlation value of 0.40. Then the general structure of the Cascaded-ANFIS was used to generate the final outputs of predicted water levels. Additionally, according to the literature, a correlation of 0.30 or below is considered negligible. Therefore, values of 0.40 and above were considered safe marginal inputs in the system [80]. Figure 6 shows the annual water level measurements at each of the observation points: the primary catchment of the Mahaweli River (Manampitiya) and the sub-catchments of the Mahaweli River (Peradeniya and Thaldena). The Manampitiya water outlet records higher water levels than the sub-catchments. The sub-catchments Peradeniya and Thaldena showed some high water levels comparable to those at Manampitiya; however, some differences can also be observed (refer to Table 4). Thaldena did not show a significantly higher water level in 2012, yet higher water levels were observed at Manampitiya during the same time (t1, t2, and t3 in sub-figure (a) in Figure 6). Similar trends can be observed at Peradeniya too. Therefore, the analysis of sub-catchments for floods is highly justified. Comparable observations led the authors to define flood thresholds for Peradeniya and Thaldena. The threshold for Peradeniya was taken as 6 m, while 3 m was taken for Thaldena. The flood events identified at Peradeniya are presented as t1, t2, and t3 in sub-figure (b) in Figure 6 (6.7 m on 03/06/2013, 6.9 m on 14/09/2013, and 6.7 m on 26/12/2014).
In comparison, two incidents were identified for Thaldena and are presented as t1 and t2 in sub-figure (c) in Figure 6 (3.1 m on 02/02/2011 and 3.5 m on 26/12/2014).
where u(t) is the predicted parameter, ū(t) is the mean of the predicted parameter, v(t) is the measured parameter, k is the population size, and v̄(t) is the mean of the measured parameter. The correlation coefficient (R) represents the goodness of fit. It varies from -1 to 1, and the best value is 1. Bias describes the difference between predicted and measured values. The ideal bias value is 0, and 1 is the worst. NSE measures how closely the predictions match the actual values. NSE can vary between minus infinity, the worst, and 1, the ideal [75]. KGE is a combined measure of three components: correlation, bias, and variability. It has been used increasingly in hydrological model performance calculations in recent years [74].
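The metric equations themselves are missing from this text. The sketch below implements the standard definitions of R, RMSE, NSE, and KGE for paired predicted and measured series. The KGE follows the widely used form based on correlation, variability ratio, and bias ratio. The bias is computed as the mean difference, which is an assumption, since the study's exact bias formula is not shown.

```python
import math


def metrics(pred, meas):
    """Return R, RMSE, NSE, KGE, and mean bias for paired series."""
    k = len(meas)
    mean_p, mean_m = sum(pred) / k, sum(meas) / k
    ss_p = sum((u - mean_p) ** 2 for u in pred)
    ss_m = sum((v - mean_m) ** 2 for v in meas)
    cov = sum((u - mean_p) * (v - mean_m) for u, v in zip(pred, meas))
    sq_err = sum((u - v) ** 2 for u, v in zip(pred, meas))

    r = cov / math.sqrt(ss_p * ss_m)   # correlation coefficient
    rmse = math.sqrt(sq_err / k)       # root mean squared error
    nse = 1.0 - sq_err / ss_m          # Nash-Sutcliffe efficiency
    alpha = math.sqrt(ss_p / ss_m)     # variability ratio (std. deviations)
    beta = mean_p / mean_m             # bias ratio (means)
    kge = 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    bias = mean_p - mean_m             # assumed definition: mean difference
    return {"R": r, "RMSE": rmse, "NSE": nse, "KGE": kge, "bias": bias}
```

The function assumes that neither series is constant and that the measured mean is non-zero; otherwise the ratios are undefined.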

B. PERFORMANCE EVALUATION
The river in this case study is the longest in Sri Lanka. According to the geographical experts in Sri Lanka, it is considered that the maximum time duration of travelling water from the start to the end of the river is less than three days. However, there are several reservoirs and dams along the river. Hence, we have considered 1-day, 2-day, and 3-day lags to include all corresponding scenarios in the calculation.

1) CORRELATION OF COEFFICIENTS CALCULATION FOR THE PRIMARY CATCHMENT
The primary catchment of the Mahaweli River consists of 13 rain gauges, all of which were used to predict the water level at Manampitiya. As mentioned in the previous sections, the experiment was designed to identify the best R-R prediction algorithm. Figure 7 shows the coefficient of correlation of the predicted water level to the ground-measured water level at the Manampitiya river gauge. Figure 8 shows the prediction accuracy under the combined scenarios initially identified as per Table 3 for the predicted water levels at Manampitiya. Figures 9 and 10 show the prediction accuracy of water levels for each algorithm for the sub-catchments Peradeniya and Thaldena.

2) CORRELATION OF COEFFICIENTS CALCULATION FOR SUB-CATCHMENTS
Additionally, other parameters were used to evaluate the results, such as Bias, NSE, RMSE, and KGE. The evaluation results are presented in Table 5.
This is very surprising. This can be due to several reasons, including the future data quality and the bias correction technique. However, these strange results imply that researchers should conduct more extensive projected flood analyses based on the ground-measured flow situations. In addition, the R-R model can be implemented for Representative Concentration Pathways (RCPs), and the differences can then be analyzed.

A. MODEL EVALUATIONS
According to the sub-figure (i) in Figure 7, it can be seen herein that the GRU algorithm with an R of 0.9301 performed the best prediction. In addition, the LSTM algorithm with 2-day back rainfall data (t-2 scenario) performed as the second best with 0.9265 (refer to the sub-figure (l) in Figure 7). Interestingly, as per sub-figure (b) in Figure 7, the Cascaded-ANFIS algorithm showcased its highest R-value at 0.9140 for 1-day back rainfall data (t-1 scenario). However, it can be clearly understood that three scenarios separately cannot be used to model the R-R relationship. In other words, the rainfall which occurs two days back for the most upstream location can reach Manampitiya on the current day. Similarly, rainfall received one day back in another location can reach Manampitiya on the current day. Therefore, a combination of these three scenarios has to be considered.
As in the selected rainfall gauge analysis, it was clear that the results were more consistent and accurate. The Cascaded-ANFIS algorithm-based prediction model had   were outperformed by Cascaded-ANFIS. Therefore, the Cascaded-ANFIS algorithm can be used effectively to predictions of water levels.  The sub-catchment correlation coefficient analysis in Figures 9 and 10 shows that the Cascaded-ANFIS algorithm has outperformed the other three algorithms in predicting water levels at the sub-catchment level. In Figure 10, the correlation coefficients were found to be 0.9188 for Cascaded-ANFIS, 0.8894 for LSTM, 0.9082 for GRU, and 0.8594 for RNN. Therefore, the water level prediction for the Thaldena sub-catchment also succeeded by the prediction model developed based on the Cascaded-ANFIS algorithm.
The proposed algorithm shows the lowest RMSE, 0.66. It also scored the highest NSE and KGE values, 0.87 and 0.90, respectively. The GRU algorithm was second best, with RMSE, NSE, and KGE of 0.79, 0.83, and 0.88. Regarding the bias of the predicted outputs, the Cascaded-ANFIS model shows a notably low value of 1.52, which indicates that the model can predict the water levels with higher accuracy and lower bias. The overall results are shown in Table 5. It is also clear that the Linear, Ridge, and Lasso algorithms perform significantly worse than the LSTM, GRU, RNN, and Cascaded-ANFIS algorithms.
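For reference, the evaluation metrics named above can be computed as in the following sketch. The paper does not state its exact formulas here, so the definitions are assumptions: KGE follows the common formulation with correlation, variability ratio, and mean ratio, and bias is taken as percent bias. The function name `evaluate` is hypothetical.

```python
import numpy as np

def evaluate(obs, sim):
    """Return R, RMSE, NSE, KGE, and percent bias for two 1-D series.

    obs : observed water levels
    sim : predicted water levels
    Definitions are assumed for illustration, not taken from the paper.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    err = sim - obs

    r = np.corrcoef(obs, sim)[0, 1]                      # Pearson correlation
    rmse = np.sqrt(np.mean(err ** 2))                    # root mean square error
    nse = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    alpha = sim.std() / obs.std()                        # variability ratio
    beta = sim.mean() / obs.mean()                       # mean (bias) ratio
    kge = 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    pbias = 100.0 * np.sum(err) / np.sum(obs)            # percent bias (assumed form)

    return {"R": r, "RMSE": rmse, "NSE": nse, "KGE": kge, "PBIAS": pbias}

# Toy series (placeholders, not data from the study).
toy_obs = np.array([1.0, 2.0, 3.0, 4.0])
perfect = evaluate(toy_obs, toy_obs)
shifted = evaluate(toy_obs, toy_obs + 1.0)
```

A perfect prediction gives RMSE of 0 and NSE and KGE of 1. A constant offset keeps R at 1 but lowers NSE and KGE, which is why these metrics are reported alongside the correlation coefficient.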

B. FORECASTING OF THE RIVER WATER LEVEL
Assume that the predictions are accurate. Then there is a severe issue in the water levels, and thus in the river flow, at Manampitiya. The average water level at Manampitiya is around 10 m (from its historical data). However, the projected water levels are around 6 m (60% of the average). Therefore, drought conditions can be projected. The predicted outcomes of the trained model may be a result of the dataset, which covers only a short range of rainfall data. A larger sample may therefore be needed to train a reliable R-R model. However, this cannot be considered a conclusion of this study. Even though the prediction accuracy of the Cascaded-ANFIS model is good, the quality of the future data is critical for a solid prediction. Therefore, Figure 11 cannot be treated as a conclusion of this study.
Figure 12 shows the forecast water levels at Manampitiya for the projected rainfalls. The forecasts for 2031 to 2050 are shown in sub-figures (c) and (d) of Figure 12 for SSP2-4.5 and SSP5-8.5, respectively. The X-axis contains 365 ticks representing the days of the year, and the scale bar on the right side of Figure 12 indicates the water level. During the northeast monsoon (December to February), higher water levels are predicted at Manampitiya. However, the SSP5-8.5 scenario projects lower water levels for mid-year, reaching less than 1 m. These can be droughts. Note that SSP5-8.5 is a high-end scenario of fossil-fueled development. This observation cannot be seen in the SSP2-4.5 scenario. The key observations are marked with black and white squares, where black marks periods of lower water levels and white marks periods of higher water levels. Nevertheless, as discussed, more research is needed for a solid conclusion on future water levels.
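The low-water periods discussed above can be located programmatically from one year of daily forecasts. The sketch below is illustrative only: the function name `low_water_runs`, the 1 m threshold, and the minimum run length are assumptions for the example, not criteria defined in this study.

```python
import numpy as np

def low_water_runs(levels, threshold, min_days=1):
    """Find consecutive-day runs where the water level is below a threshold.

    levels    : 1-D sequence of daily water levels for one year (m)
    threshold : level below which a day counts as low water (assumed value)
    min_days  : shortest run, in days, that is reported

    Returns a list of (start_day, end_day) pairs, 1-based and inclusive.
    """
    below = np.asarray(levels, dtype=float) < threshold
    runs = []
    start = None
    for day, flag in enumerate(below, start=1):
        if flag and start is None:
            start = day                      # a low-water run begins
        elif not flag and start is not None:
            if day - start >= min_days:      # run covered days start..day-1
                runs.append((start, day - 1))
            start = None
    # Close a run that lasts until the final day of the series.
    if start is not None and len(below) + 1 - start >= min_days:
        runs.append((start, len(below)))
    return runs

# Toy week of levels in metres (placeholders, not forecast values).
toy_levels = [5.0, 0.5, 0.8, 6.0, 0.9, 0.7, 0.6]
all_runs = low_water_runs(toy_levels, threshold=1.0)
long_runs = low_water_runs(toy_levels, threshold=1.0, min_days=3)
```

Applying the same function with the comparison reversed, and a flood threshold taken from historical records, would flag high-water periods in the same way.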

VII. CONCLUSION
An R-R prediction model was developed using the Cascaded-ANFIS algorithm for the Mahaweli River, the longest river in Sri Lanka. The R-R model was also developed at the sub-catchment level. The dataset used in the case study was evaluated using four homogeneity tests: the Standard normal homogeneity test (SNHT), the Buishand range (BR) test, the Pettitt test, and the von Neumann ratio (VNR) test. The algorithm was tested against six regression algorithms used in most past studies: Linear regression, Ridge regression, Lasso regression, GRU, LSTM, and RNN. The results were compared using the correlation coefficient, bias, RMSE, NSE, and KGE. The highest correlation coefficient, 0.933, was recorded by the Cascaded-ANFIS when the selected rainfall gauges were used to train the models, whereas Linear, Ridge, Lasso, GRU, LSTM, and RNN showed R values of 0.6811, 0.6811, 0.6734, 0.9133, 0.9120, and 0.8915, respectively.
Moreover, the bias value of the proposed algorithm is significantly low (1.52) compared with the other algorithms. The Cascaded-ANFIS model scored 0.66, 0.87, and 0.90 for RMSE, NSE, and KGE, respectively. On these metrics, it outperformed the other algorithms used in this study.
According to the overall results, it can be concluded that the Cascaded-ANFIS-based prediction model outperformed the other six algorithms. The second-best algorithm in prediction was the GRU algorithm. However, the Cascaded-ANFIS algorithm has advantages over the black-box regression models: it is lightweight, has a lower computational cost, is easy to implement in real time, and is efficient. Therefore, the Cascaded-ANFIS algorithm can predict the water levels of various catchments, provided that measured rainfalls and water levels are available. More importantly, the model can be developed with rainfall inputs mixed along the timeline, which accounts for the travel time of upstream water to the river's downstream.
Overall, the prediction model based on the Cascaded-ANFIS algorithm predicts accurate results using the ground-measured rainfalls. The future water levels were projected under two SSP scenarios for the Manampitiya station. However, promising results were only found under the near-future and mid-future SSP rainfalls. None of the years was projected to have unacceptable floods (judged against the records). Therefore, this research does not provide any conclusions about the future projected water levels. More research is needed for a solid outcome on future water levels.