Comparative Analysis of Artificial Neural Network (ANN) and Wavelet Integrated Artificial Neural Network (W-ANN) Approaches for Rainfall Modeling of Southern Rajasthan, India

This paper addresses the challenge of predicting erratic rainfall in the state of Rajasthan, India, particularly in its southern regions. Reliable rainfall predictions are crucial for water resource management and agricultural planning. The research involved selecting 58 stations across seven districts of southern Rajasthan and identifying the best-fit artificial neural network (ANN) and wavelet integrated artificial neural network (W-ANN) architectures based on performance metrics. Different combinations of input parameters, hidden layer neurons, learning algorithms


Introduction
PARADKAR & MITTAL, Curr. World Environ., Vol. 18(3) 1123-1137 (2023)
Rainfall holds significant importance for agricultural, domestic, industrial, and recreational purposes. The per capita availability of surface water in India was 2309 cubic meters in 1991 and decreased to 1902 cubic meters in 2001. Projections indicate a further decline to 1401 cubic meters by 2025 and 1191 cubic meters by 2050 [1]. The mounting pressure on water resources is a consequence of the escalating population [2]. The formation of rainfall involves intricate interplays of dynamic, thermodynamic, and cloud microphysical processes across vast spatial and temporal scales [3]. The stochastic nature of rainfall makes its prediction a formidable challenge. The complexity is amplified by the difficulties in accurate measurement at scales relevant to hydrology and climatology [4]. Prolonged dry spells or heavy rains during critical crop growth stages can lead to substantial yield reductions, significantly impacting the national economy [5]. Predicting rainfall patterns becomes pivotal in assessing the overall consequences of climate change [6]. Globally, the surge in urbanization, industrialization, and population growth has heightened water demand [7]. Studying extreme rainfall events holds paramount significance for water resources management [8]. Rajasthan relies heavily on rainfall, averaging 594.9 mm annually, with significant variability and sporadic dry spells, particularly in the west [9]. Southwest monsoons contribute 75 to 95% of the yearly rain, primarily between June and September, which is crucial for local farmers [10]. Any deficiency in the monsoon, mostly because of climate change, causes more frequent droughts in parts of the country such as Rajasthan, as often as once every four years [11]. Limited water resources, erratic rainfall, and repetitive droughts may lead to reduced agricultural and economic conditions in some parts [12].
Therefore, there is a need to adopt a proactive approach by strengthening the scientific advancement in predicting rainfall. Hydrological data, crucial for predicting rainfall, often exhibits a non-linear character [13]. Traditional models such as regression models have limitations because they operate under the assumption that data is both linear and stationary. Such models do not deal with nonlinearities in the data [14]. An Artificial Neural Network (ANN) can be defined as a computer program which mimics the brain's information processing using interconnected artificial neurons. These neurons form layers and are linked by coefficients, creating a neural structure. ANNs excel at capturing complex nonlinear relationships in data, especially when conventional mathematical models fall short [15]. They have proven highly beneficial in tasks like rainfall forecasting and runoff modelling due to their nonlinear nature. ANNs have been in use as forecasting models for the past two decades in various scientific areas [16]. Recently, the combination of wavelet analysis and artificial neural networks, known as 'W-ANN', has gained attention for its superior predictive accuracy compared to individual ANN analysis [17,18]. Breaking down a non-stationary data series into various levels through wavelet decomposition provides a way to understand the underlying structure of the series and extract meaningful historical information [19]. By incorporating wavelet-transformed series into forecasting models, a hybrid wavelet-ANN approach enhances predictive ability across different resolution levels [20].
The effect of wavelet function types on ANN model performance remains underexplored. Initial instances of the wavelet-ANN model were applied to financial time series forecasting [21], groundwater level prediction [22], and rainfall-runoff modelling [23], yielding varying degrees of accuracy. Focusing on these aspects, the present study was conducted to evaluate various ANN and hybrid W-ANN models for rainfall prediction in southern Rajasthan, India.

Study Area
Southern Rajasthan is an important physiographic unit of the state of Rajasthan, situated amid the Aravalli mountain ranges. It consists of seven districts, six of which (Banswara, Dungarpur, Pratapgarh, Udaipur, Chittorgarh, and Rajsamand) form the Udaipur division. Bhilwara district, although not situated within the Udaipur division, is considered a component of southern Rajasthan [24]. This region experiences an average annual rainfall ranging from 400 to 1100 mm. It is categorized into two agro-climatic zones: IV A, characterized as Sub-humid Southern, and IV B, classified as Humid Southern [25]. The geographical coordinates of the region span from 23°01'10" to 26°01'15" N latitude and 73°01'10" to 75°43'30" E longitude, covering an expanse of 50,510 km². The seven districts in the study area, viz. Banswara, Dungarpur, Pratapgarh, Udaipur, Chittorgarh, Rajsamand, and Bhilwara, occupy areas of 5037, 3770, 4117, 11724, 10856, 4551 and 10455 km², respectively. The study area extends around 210 km from its southernmost to its northernmost point and approximately 240 km from its westernmost to its easternmost point. Figure 1 shows the location of the study area.

Fig. 1: A chart depicting the geographical location of the study area

Artificial Neural Networks (ANNs)
The ANN technique is successful in hydrological modelling due to its flexibility, efficiency with nonlinear and noisy data, and superior accuracy compared to other models [15]. ANNs are mathematical models inspired by brain activity, utilizing distributed storage and parallel processing [16].

ANN Architecture
An ANN functions as a computational system comprised of artificial neurons, each serving as a processing element. This dynamic mathematical framework is adept at discerning intricate non-linear connections within input and output datasets [26]. An ANN consists of interconnected units, with each unit possessing input and output capabilities, executing localized computations or functions. The output of a unit is dictated by its input/output characteristics. The architecture of an ANN is shaped by the inter-neuron weights, an activation function governing output generation in each neuron, and learning laws that specify the significance of weights concerning input to a neuron. Incoming signals undergo multiplication with corresponding weights as they progress toward the neurons. These signals aggregate at the neurons, and the resulting net input is subjected to the activation function to generate the output. A typical ANN architecture is depicted in Figure 2.
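As a minimal illustration of the computation described above, the following Python sketch evaluates a network with one hidden layer: each neuron forms the weighted sum of its inputs and passes it through an activation function. The logistic sigmoid, the bias terms, the random weights, the layer sizes and the function names are assumptions made for illustration only; they do not reproduce the networks fitted in this study.

```python
import numpy as np

def logsig(x):
    """Logistic sigmoid activation (assumed here for illustration)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a network with one hidden layer and one output neuron."""
    hidden_net = w_hidden @ x + b_hidden      # weighted sum at each hidden neuron
    hidden_out = logsig(hidden_net)           # activation applied to the net input
    out_net = w_out @ hidden_out + b_out      # weighted sum at the output neuron
    return logsig(out_net)

# Hypothetical layout (10 inputs, 16 hidden neurons, 1 output), untrained weights.
rng = np.random.default_rng(0)
x = rng.random(10)
y = forward(x, rng.normal(size=(16, 10)), rng.normal(size=16),
            rng.normal(size=16), rng.normal())
```

Because the output neuron in this sketch also uses the sigmoid, `y` always lies between 0 and 1, so rainfall targets would need to be scaled to that range before training.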

Fig. 2: Neural Network Structure with Single Hidden Layer
Let X_i (where i = 1, 2, 3, …, m) represent the input parameters and w_i (where i = 1, 2, 3, …, m) denote their respective weights. The node's net input can be expressed by,

$$net = \sum_{i=1}^{m} w_i X_i \qquad (1)$$

In this study, ANN models were formulated using the ANN toolbox of MATLAB (R2014a) software. The most widely used neural networks are Multilayer Perceptrons (MLPs). As the hydrological state of the catchment area determines the catchment's response to a rainfall event, and due to the intricate nature of the atmospheric events that generate precipitation, previous rainfall values are often used as input variables to ANN models [3]. In situations where comprehensive data on the required temporal and spatial scales is lacking, past rainfall values are commonly employed as inputs for ANNs. This approach is chosen because past rainfall serves as an indirect indicator of the hydrological state. The study focused on developing three distinct ANN model categories for predicting rainfall at specific stations within the designated area (Table 1).

ANN Model A (Input Layer with Six Neurons)
This model incorporated six neurons in the input layer, considering the rainfall data from the corresponding week over the previous three years and the rainfall of the three preceding weeks from the same year.

ANN Model B (Input Layer with Eight Neurons)
Model B featured eight neurons in its input layer, encompassing the rainfall information for the same week over the previous four years and the rainfall of the four weeks preceding the current week in the same year.

ANN Model C (Input Layer with Ten Neurons)
Model C was designed with ten neurons in its input layer, utilizing the rainfall data from the same week over the previous five years and the rainfall of the five preceding weeks from the current year.
In Table 1, P_{m-n} is the precipitation of the nth preceding week of the same year; R_{M-n} is the precipitation of the same week in the nth preceding year.
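The input layouts of Models A, B and C can be sketched in Python as follows. The sketch assumes the weekly rainfall of one station is stored as a (years, weeks) array and that target weeks lacking k preceding weeks or k preceding years are simply skipped; the function name `build_inputs` and the synthetic data are hypothetical and do not come from the study.

```python
import numpy as np

def build_inputs(rain, k):
    """Build input/target pairs from a (years, weeks) array of weekly rainfall.

    For each target rain[y, w], the inputs are the k preceding weeks of the
    same year and the same week of the k preceding years (2*k inputs in total).
    """
    n_years, n_weeks = rain.shape
    X, t = [], []
    for y in range(k, n_years):
        for w in range(k, n_weeks):
            prev_weeks = rain[y, w - k:w][::-1]      # P_{m-1}, ..., P_{m-k}
            prev_years = rain[y - k:y, w][::-1]      # R_{M-1}, ..., R_{M-k}
            X.append(np.concatenate([prev_weeks, prev_years]))
            t.append(rain[y, w])
    return np.array(X), np.array(t)

# Synthetic data standing in for station records: 8 years x 21 weeks.
rain = np.arange(8 * 21, dtype=float).reshape(8, 21)
X_a, t_a = build_inputs(rain, k=3)   # Model A layout: 6 inputs
X_c, t_c = build_inputs(rain, k=5)   # Model C layout: 10 inputs
```

Setting k to 3, 4 or 5 gives the six, eight and ten inputs of Models A, B and C; a larger k leaves fewer usable training samples from the same record.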

An essential consideration in ANNs involves determining the optimal configuration, including the number of hidden layers and the number of neurons within each layer. There is no systematic method, and a trial and error procedure is still the preferred choice of most users [27]. The number of neurons in a hidden layer is influenced by variables such as the input parameters, output layer size, training cycles, noise in the data, architecture, activation functions, and learning algorithms. While rules such as using twice the number of input neurons can be helpful [28,29], a more accurate approach is to multiply the sum of input and output neurons by 0.67 [30] or to use (2n+1) neurons, where n is the number of input neurons [31]. However, the most effective method is experimenting with various hidden unit counts during training [32]. In this study, the number of neurons in the hidden layer ranged from 1 to 20. The number of training cycles to use depends on the learning algorithm employed and the non-linearity between input and output. In the present study, a log-sigmoid transfer function was used for generating outputs in ANNs. All neural network architectures were trained with the goal of a mean square error of 0.01 during both training and validation.
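The three rules of thumb mentioned above can be evaluated with a short Python sketch. Rounding the 0.67 rule to the nearest integer and the function name `hidden_neuron_rules` are assumptions made here for illustration.

```python
def hidden_neuron_rules(n_inputs, n_outputs=1):
    """Return hidden-layer sizes suggested by three rules of thumb."""
    return {
        "twice_inputs": 2 * n_inputs,
        # 0.67 times the sum of input and output neurons, rounded (assumed)
        "two_thirds_sum": round(0.67 * (n_inputs + n_outputs)),
        "2n_plus_1": 2 * n_inputs + 1,
    }

for n in (6, 8, 10):   # input sizes of Models A, B and C
    print(n, hidden_neuron_rules(n))
```

For ten inputs and one output, the rules suggest 20, 7 and 21 hidden neurons, respectively, which brackets the range of 1 to 20 neurons explored by trial and error.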

Learning Algorithms
A learning or training algorithm is a mathematical procedure that adjusts the connection weights so as to minimize an error function. Two common algorithms used in this study are the Levenberg-Marquardt (L-M) algorithm and the Resilient Backpropagation (R-P) algorithm. The L-M algorithm, also denoted by trainlm, is a network training function that adjusts and refines the weights and bias values as per the L-M optimization [33]. It is the fastest backpropagation algorithm in the ANN toolbox and needs fewer learning cycles [34]. The resilient backpropagation algorithm, also denoted by trainrp, is a network training function that adjusts and refines the weights and biases as per the R-P algorithm [35]. This algorithm requires only a modest increase in memory [36]. In the present study, the rainfall data from the year 1973 to 2022, i.e. 50 years of data, was used for rainfall forecasting. Out of this, two-thirds part i.e.

Wavelet Based Artificial Neural Networks (W-ANNs)
While artificial neural networks (ANNs) offer flexibility in modeling hydrological time series, they present limitations when dealing with highly non-stationary signals in hydrologic processes that exhibit seasonal variations. To address this challenge, the wavelet transform emerges as a valuable tool, capable of decomposing non-stationary data series into sublevels at various scales [37]. This decomposition aids in enhancing the interpretation of the hydrological process. In recent years, the wavelet transform has demonstrated success in various engineering applications [38]. Its application extends to the investigation of the time-frequency characteristics of long-term climatic data [39]. The wavelet transform's ability to provide insightful decompositions of primary time series allows for an improved understanding of the underlying processes at various resolution levels [20]. Consequently, coupling an ANN with a wavelet function results in a hybrid structure known as the wavelet-ANN (denoted as W-ANN). This hybrid approach proves effective in simultaneously capturing frequency and time-domain information from the signal, offering a robust framework for predicting hydrological processes [40].

Wavelet Transform
Wavelets are mathematical functions that provide a representation of time series data in terms of time scales and their interrelationships [18]. The process of wavelet analysis involves employing a mother wavelet function for the transformation. Wavelet transforms can be conducted in a continuous form, known as the continuous wavelet transform (CWT), or a discrete form, referred to as the discrete wavelet transform (DWT). Overall, wavelet transforms serve as a valuable tool for investigating time series, offering insights that contribute to forecasting and other empirical analyses [39].

Continuous Wavelet Transformation
The time-scale wavelet transform of a continuous signal, f(t), is defined as follows [41],

$$W(a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{+\infty} f(t)\, g^{*}\!\left(\frac{t-b}{a}\right) dt \qquad (2)$$

Where, g* denotes the complex conjugate and g(t) is defined as the mother wavelet. The parameter a plays the role of a dilation coefficient, while b is a time shift of the wavelet, which allows the signal f(t) to be examined locally around b.

Discrete Wavelet Transformation
One straightforward discretization method for the CWT involves employing the trapezoidal rule. In this approach, N² coefficients are generated from a dataset of length N. However, it results in redundant information encapsulated within these coefficients, which may or may not be advantageous [42]. To address this redundancy, an alternative is to adopt logarithmic spacing for the discretization of the scale a, with a correspondingly coarser resolution of the b locations, so that N transform coefficients can describe a signal of length N. A discrete wavelet can be represented by,

$$g_{m,n}(t) = \frac{1}{\sqrt{a_0^{m}}}\, g\!\left(\frac{t - n b_0 a_0^{m}}{a_0^{m}}\right) \qquad (3)$$

Where, m and n denote integers which control the dilation and translation, respectively; a_0 is a particular dilation step greater than 1; and b_0 denotes the location parameter, which must be greater than 0. The commonly used and simple choice for these parameters is a_0 = 2 and b_0 = 1 [43]. The Haar wavelet is the most suitable wavelet for modeling applications because it shows the shift invariant property [44]. It has better localization properties because it is a low pass filter concentrated over the narrowest support band [45]. Therefore, in this study, by the Haar wavelet decomposition process, the original data series was hierarchically converted into n-level sub series at different frequency bands to reduce the noise. The sub series, i.e. the decomposed details derived from the original time series by using discrete wavelet transforms, were used as input for the ANN to develop hybrid wavelet based ANN models. Thereafter, the procedure adopted for training and validation of ANN models was also adopted for W-ANN models. A typical W-ANN architecture is shown in Figure 3. Previous studies [21,22] derived the following equation to find the appropriate decomposition scale of the main time series.

$$Y = \operatorname{int}[\log(Z)] \qquad (4)$$
Where, Y and Z are the decomposition scale and the length of the series, respectively. The performance of the formulated W-ANN models was evaluated in terms of the same statistical indices as used for evaluating ANN models. The performance of the W-ANN models was compared with that of the ANN architectures.
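The decomposition scale formula can be evaluated as in the short Python sketch below. The text does not state the base of the logarithm; base 10 is assumed here, and the series lengths are hypothetical values chosen only for illustration.

```python
import math

def decomposition_level(series_length):
    """Decomposition scale Y = int[log(Z)], with log taken as base 10 (assumed)."""
    return int(math.log10(series_length))

# Hypothetical series lengths, for illustration only.
for z in (500, 1050, 2600):
    print(z, decomposition_level(z))
```

Under the base-10 assumption, any series with between 1000 and 9999 values yields a decomposition scale of 3.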

Rainfall Forecasting using Artificial Neural Network (ANN) Techniques
It was observed that significant rainfall occurred from meteorological week 22

Performance Evaluation of ANN Models for Varying Number of Inputs
It was observed that Model C was the best fit for 74% of the selected stations, followed by Model B (21% of stations) and Model A (5% of stations), respectively. Model A with six inputs was the lowest in terms of performance parameters. The performance measures R, MAE and RMSE for the best fit ANN models in all three categories for the Arthuna station (as an example) are shown in Table 2.
The comparison between actual and forecasted rainfall for the Arthuna station using the best fitted ANN architecture 10-16-1 (Model C) during the second stage of the validation phase (2016-2022) is shown in Figure 4. Therefore, ANN architecture 10-16-1 (Model C) was the best fit for the Arthuna station. A similar process was undertaken for finding the optimum number of neurons in the hidden layer for all stations in the study area.

Rainfall Prediction using Hybrid Wavelet based Artificial Neural Network (W-ANN)
A black-box structure such as an ANN tends to underestimate peak values in time series data when confronted with sudden extreme inputs, such as heavy rainfall. In contrast, models like W-ANN, which incorporate information from current and past time steps with a focus on long-term periodicity memory, leverage historical data of extreme events to improve the accuracy of peak value forecasts. This approach allows for a more nuanced understanding and prediction of extreme occurrences in the time series [46]. In the present study, the main rainfall data series was decomposed at various levels using the wavelet toolbox in MATLAB R2014a software. The decomposed details from the discrete wavelet transform were then used as inputs to the ANN models as stated earlier, and the performance of the models was examined by the same statistical metrics as used for the simple ANN models. Only the best fit simple ANN models were used to formulate hybrid W-ANN models, and their performance was compared.

Development of Hybrid W-ANN Models
Wavelet analysis was employed during the data pre-processing phase, enabling the extraction of low-frequency data over extended time intervals and high-frequency data over shorter time intervals. In the decomposition process using the Haar wavelet, the original input data series was hierarchically converted into 3-level sub series at different frequency bands for reducing the noise according to equation 4. The original and approximation time series at decomposition level 3 in the validation phase for the best fit ANN model (10-16-1) for Arthuna station are shown in Figure 6.
The first data series is the original signal, followed by the decompositions at level 3 using the wavelet transform.
The main signal underwent decomposition at level 3 using the Haar wavelet, resulting in 4 sub-signals (the level 3 approximation and the level 1, 2, and 3 details). These four sub-signals serve as input layer neurons for the development of optimal ANN models for each station.
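The level 3 Haar decomposition described above can be sketched in plain NumPy as follows. The study itself used the MATLAB wavelet toolbox; this hand-written orthonormal Haar transform, the function names and the synthetic series are assumptions for illustration. The sketch returns decimated coefficient arrays of different lengths and does not attempt to show how the sub-signals were aligned with the ANN inputs.

```python
import numpy as np

def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    pairs = signal.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)   # low-pass (scaled sums)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)   # high-pass (scaled differences)
    return approx, detail

def haar_decompose(signal, level=3):
    """Return [A_level, D_level, ..., D_1] for a length divisible by 2**level."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(level):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

# Synthetic series of 16 values standing in for weekly rainfall (mm).
series = np.array([0, 2, 10, 35, 60, 22, 8, 0, 1, 5, 18, 40, 55, 30, 6, 0], dtype=float)
a3, d3, d2, d1 = haar_decompose(series, level=3)
```

Here `a3` is the level 3 approximation and `d3`, `d2`, `d1` are the level 3, 2 and 1 details, giving the four sub-signals mentioned in the text.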

Performance Evaluation of W-ANN Models
The evaluation of W-ANN architectures was done using the same statistical measures as used for ANN models. The performance measures for the best fit W-ANN model 10-16-1 (Model C) for Arthuna station are shown in Table 3. It can be seen that the R, MAE and RMSE values were 0.868, 3.267 and 1.125, respectively. The value of R in the training phase (0.861) increased in the validation phase (0.875) due to the shorter period of data used in the validation phase than in the training phase. The pictorial representation of actual and predicted rainfall using the best fit W-ANN architecture during the second stage of the validation phase (on unseen data from 2016-2022) for selected stations is shown in Figure 7. The use of a simulated network (only input values) in the validation phase may cause a lower value of R. Therefore, a higher number of inputs substantially enhanced the model's performance across the majority of stations. Earlier studies [47,48,49] also reported improvements in ANN model performance with an increased number of inputs.
Increasing the number of neurons in the hidden layer up to 10 did not have a significant impact on the model's performance. However, between 10 and 20 neurons, performance improved in terms of R, MAE and RMSE for most of the selected stations. Earlier studies [14,50,51] reported improvement in ANN model performance between 10 and 20 neurons in the hidden layer. Thus, all three categories of models performed better with a larger number of neurons in the hidden layer.
Hybrid W-ANN models were found superior to simple ANN models for rainfall forecasting of southern Rajasthan. Earlier studies [18,41,39,20] also reported that wavelet based hybrid ANN models provide better accuracy than ANN models due to the useful decomposition of the original time series for extraction of information with reduced noise. The comparative performance of ANN and wavelet coupled ANN models in all phases for the Arthuna station is graphically depicted in Figure 3.5. Overall, the W-ANN model was found superior to the simple ANN model in all phases due to the useful decomposition of the model inputs.

Conclusion
In this research, various ANN and wavelet based ANN architectures were employed for rainfall forecasting of southern Rajasthan. Model category C with ten neurons in the input layer was the best fit for 74% of the selected stations, followed by Model B (21% of stations) and Model A (5% of stations), respectively. Model A with six inputs was the lowest in terms of performance parameters. Therefore, the increased number of inputs significantly improved the performance of models at most of the stations. The models showed consistent performance as the number of neurons in the hidden layer increased up to 10. However, a remarkable enhancement in performance was observed as the number of neurons increased from 10 to 20. All three categories of models performed better with more neurons in the hidden layer. Hence, the most effective category of ANN model for predicting weekly rainfall in southern Rajasthan used the rainfall data from the same week over the past five years, along with the rainfall data from the preceding five weeks of the same year, as input variables. Out of 58 stations, for 47 stations (81% of stations) the performance of W-ANN models improved in terms of statistical indices due to useful decomposition and extraction of information at the appropriate resolution level. For 7 stations (12% of stations) the performance of wavelet coupled ANN models was at par with that of ANN models, while for the remaining 4 stations (7% of stations) the performance of wavelet coupled ANN models was poorer than that of the ANN architectures. Therefore, hybrid W-ANN models were found superior to simple ANN models for rainfall forecasting of southern Rajasthan. The comparison of actual and forecasted values of rainfall revealed that both ANN and W-ANN models forecast weekly rainfall satisfactorily. The hybrid W-ANN model provides greater accuracy than the ANN model for rainfall forecasting, based on higher values of R and lower values of MAE and RMSE.
33 years of data (from 1973 to 2005) was used for model development and one-third part i.e. 17 years of data (from 2006 to 2022) was used for model validation. ANN models were trained by adjusting the interconnection weights for input-output matching. Training was stopped when the error on the validation dataset began to increase. This approach helped select the best-performing ANN model. Performance was assessed on the validation set using Pearson's correlation coefficient (R), root mean square error (RMSE), and mean absolute error (MAE).
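The three performance measures named above can be computed as in the following Python sketch. The function name `evaluate` and the rainfall values are made up for illustration and are not results from the study.

```python
import numpy as np

def evaluate(observed, predicted):
    """Return Pearson's R, RMSE and MAE for paired observed/predicted values."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs
    r = np.corrcoef(obs, pred)[0, 1]          # Pearson's correlation coefficient
    rmse = np.sqrt(np.mean(err ** 2))         # root mean square error
    mae = np.mean(np.abs(err))                # mean absolute error
    return r, rmse, mae

# Made-up weekly rainfall values (mm), for illustration only.
obs = [0.0, 12.0, 40.0, 65.0, 20.0, 5.0]
pred = [2.0, 10.0, 35.0, 60.0, 25.0, 5.0]
r, rmse, mae = evaluate(obs, pred)
```

For any given set of errors, RMSE is never smaller than MAE, because squaring gives larger errors more weight.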

Fig. 3: Wavelet based Neural Network Architecture with Single Hidden Layer

(MW-22) to meteorological week 42 (MW-42). The investigation into rainfall distribution in southern Rajasthan indicated that the weekly mean rainfall during the monsoon period (MW-22 to MW-42) exhibited a range from 2.1 to 68.4 mm, accompanied by corresponding standard deviations ranging from 9.3 to 79.7 mm. The selection of a suitable ANN architecture for weekly rainfall forecasting was done based on training and validation data sets. The statistical performance of different models was evaluated for the training and validation phases. For Model A, the training period was 33 years (1976 to 2008) and the validation period was 14 years (2009 to 2022). Similarly, for Model B and Model C, the training period was 32 years (1977 to 2008) and 31 years (1978 to 2008), respectively.

Fig. 6: Original and Approximation Rainfall Time Series (Validation Phase) for Best Fit ANN Model 10-16-1 at Decomposition Level 3 for Arthuna Station