PREDICTIVE ANALYTICS MODIFICATIONS IN WAVELET: CASE STUDY ON SONGKHLA LAKE BASIN RUNOFF

Accurate short-term rainfall-runoff prediction is important for flood mitigation and the safety of infrastructure in Southern Thailand. This study analyzes and forecasts runoff by combining the wavelet technique with regression and artificial neural network models. Daily rainfall and runoff data were collected over 1,031 days from January 2017 to October 2019 in the Songkhla Lake Basin, Thailand. Model performance in calibration and validation was evaluated with three statistical measures: the coefficient of determination (R²), root mean square error (RMSE) and Nash-Sutcliffe efficiency coefficient (ENS). The results of daily runoff series modeling indicated that the wavelet artificial neural network model performed best among the models compared. This model achieved a coefficient of determination, Nash-Sutcliffe efficiency and root mean square error of 0.9999, 0.9998 and 0.0037, respectively. These values indicate that the model can describe 99.99% of the variation of the current runoff in the Songkhla Lake Basin.


INTRODUCTION
Over the past 40 years, Thailand has been hit by many weather extremes. Floods are among the most serious natural hazards and a major social concern in Thailand, causing massive damage to lives and property. In 2011, a great flood covered more than one-third of the provinces in Thailand and reduced Gross Domestic Product by 14% [11].
Runoff forecasting plays an important role in water management and flood prediction.
Many recent papers have explained that runoff is affected by many factors, such as rainfall, land use and soil infiltration rates. The relationship between those factors and runoff is complicated and non-linear. Because of this non-linearity, no statistical model can describe the complexity of the relationship, and there have been many attempts to find the most effective model for runoff forecasting.
Artificial neural network (ANN) models have been widely used in studies of rainfall-runoff modeling [1,2,4,12,13,14,15,16]. Neural network models are machine learning models based on studies of the brain and nervous system. Moreover, ANN has a flexible mathematical structure and is advantageous in various fields of science because it can model both linear and non-linear relationships without the assumptions required by statistical models [3,18].
In the past decade, the wavelet transform has been used successfully with highly non-linear models. These studies showed that combining the wavelet transform with other models provided higher accuracy than the models alone [1,2,4,5,9,10,14,16]. In this work, a combination of the wavelet transform with artificial neural network and regression models is presented as a rainfall-runoff model. According to the coefficient of determination (R²), root mean square error (RMSE) and Nash-Sutcliffe efficiency (ENS), the artificial neural network model with wavelet transform is more efficient than the regular artificial neural network and regression models.
This paper aims to compare the performance of the regular regression and artificial neural network models with that of the wavelet regression and wavelet artificial neural network models for forecasting runoff in the Songkhla Lake Basin, Thailand.

Study Area:
The Songkhla Lake Basin is located between latitudes 6°45′ and 8°00′ North and longitudes 99°30′ and 100°45′ East. Its total area is approximately 8,484.35 km², divided into a land area of 7,652.81 km² and a lake area of 831.54 km². The basin is influenced by the northeast monsoon that blows across the South China Sea and the Gulf of Thailand. As a result, the basin receives moist air, which brings heavy rainfall from October to December. The average annual rainfall and runoff of the Songkhla Lake Basin are 1,992 mm and 4,808 million m³, respectively. Continuous rainfall of 90.1 mm or more within 24 hours can cause a flash flood. Interpretation of THEOS and LANDSAT-5 TM satellite data showed that most of the flooded areas were lowlands, especially the plain around the basin. The map of Songkhla Lake Basin is shown in

Data Collection:
In this study, the daily rainfall and runoff time series covered 1,031 days, from January 4th, 2017 to October 31st, 2019, in the Songkhla Lake Basin, Songkhla, Thailand. The rainfall data were obtained from the Climate Information Service, Meteorological Department of Thailand [6], and the runoff data from the Southern region irrigation hydrology center, Bureau of Water Management and Hydrology [17]. The first 721 days (70% of the data) were used for calibrating the models and the remaining 310 days (30% of the data) were used for validation.
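The calibration and validation split described above can be sketched as follows. This is a minimal illustration, assuming the split is chronological (no shuffling); the function name `chronological_split` is hypothetical, and only the dates and the 721/310 counts come from the text.

```python
from datetime import date, timedelta

def chronological_split(records, n_calibration):
    """Split an ordered series into calibration and validation parts without shuffling."""
    return records[:n_calibration], records[n_calibration:]

# Daily time stamps for the study period stated in the text.
start, end = date(2017, 1, 4), date(2019, 10, 31)
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]

# First 721 days for calibration, the remainder for validation.
calibration, validation = chronological_split(days, 721)

print(len(days), len(calibration), len(validation))
print(round(100 * len(calibration) / len(days), 1), "% used for calibration")
```

Keeping the validation period strictly after the calibration period avoids leaking future information into the fitted model, which matters for lagged-input forecasting.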

Wavelet analysis
Wavelets are functions that are used to decompose other functions. The wavelet transform decomposes a function into a family of wavelets and creates a representation of the function in both the time and frequency domains, thereby allowing efficient access to localized information about the function. It is therefore a useful mathematical tool for data analysis. The family of wavelets contains the dilated and translated versions of a single function, which is called the mother wavelet. The scale and translation parameters determine how the mother wavelet dilates and translates along the time or space axis. A scale factor greater than one corresponds to a dilation of the mother wavelet along the horizontal axis, and a positive shift corresponds to a translation of the scaled wavelet to the right along the horizontal axis. There are two types of wavelet transform: the continuous wavelet transform and the discrete wavelet transform. The continuous wavelet transform (CWT) uses every possible wavelet over a range of scales and locations, i.e. an infinite number of scales and locations, while the discrete wavelet transform (DWT), developed by Ingrid Daubechies, uses a countable set of wavelets defined at a particular set of scales and locations. The discrete wavelet transform can be described by multiresolution analysis as follows.

Let $\psi(x)$ be a mother wavelet function. For integers $j$ and $k$, define the operations of translation and dilation of the mother wavelet, which generate the wavelet family $\psi_{j,k}(x)$ [8]. The coefficients of $f(x)$ with respect to the family $\psi_{j,k}(x)$, indexed by $j$ and $k$, are the wavelet transform coefficients of $f(x)$.
Every orthonormal mother wavelet has an auxiliary function, the scaling function. Every function $f \in L^2(\mathbb{R})$ can be projected into the subspace $V_0$ so that the development of this projection in terms of the translated scaling function is an approximation of $f(x)$. In general, for each integer $j$, we have the closed subspaces $V_j$. Therefore, every function $f \in L^2(\mathbb{R})$ can be projected into the subspace $V_j$ so that the development of this projection in terms of the translated scaling function is an approximation of $f(x)$ at resolution $j$. Consequently, $f(x)$ can be written in terms of an approximation in the subspace $V_j$ at resolution $j$, which is formulated on the scaling function (low-pass filter), plus the orthogonal terms containing the finer details associated with $W_j$, which are formulated on the mother wavelet $\psi(x)$ (high-pass filter). As in Figure 2, the multiresolution analysis builds a structure that requires an iterative application of the scaling function and the mother wavelet, so that the function is split into the low-frequency approximation and the high-frequency details, respectively.
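For reference, the multiresolution relations described above are commonly written in the following form. This is a sketch in one common textbook convention, not necessarily the notation of [8]: the normalization $2^{j/2}$, the dyadic indexing, and the symbols $\phi$, $a_{j,k}$ and $d_{j,k}$ are assumptions introduced here for illustration.

```latex
% Dilated and translated wavelets generated from the mother wavelet \psi
\psi_{j,k}(x) = 2^{j/2}\,\psi\!\left(2^{j}x - k\right), \qquad j,k \in \mathbb{Z}

% Wavelet transform (detail) coefficients of f
d_{j,k} = \langle f, \psi_{j,k} \rangle = \int_{-\infty}^{\infty} f(x)\,\psi_{j,k}(x)\,dx

% Scaling functions and the nested approximation subspaces
\phi_{j,k}(x) = 2^{j/2}\,\phi\!\left(2^{j}x - k\right), \qquad
V_j = \overline{\operatorname{span}}\{\phi_{j,k} : k \in \mathbb{Z}\}, \qquad
\cdots \subset V_{-1} \subset V_0 \subset V_1 \subset \cdots \subset L^2(\mathbb{R})

% Detail subspace W_j as the orthogonal complement of V_j in V_{j+1}
V_{j+1} = V_j \oplus W_j

% Approximation at resolution j plus the finer details
f(x) = \sum_{k} a_{j,k}\,\phi_{j,k}(x) + \sum_{i \ge j} \sum_{k} d_{i,k}\,\psi_{i,k}(x),
\qquad a_{j,k} = \langle f, \phi_{j,k} \rangle
```

Under this convention a larger $j$ corresponds to a finer resolution, so the double sum collects exactly the detail terms that the approximation in $V_j$ leaves out.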

The Artificial Neural Network
The Artificial Neural Network is a technique of Artificial Intelligence that simulates the functioning of the nervous system in the human brain. The structure of the neural network is shown in the figure below.

Model Performance
The performance of the models during calibration and validation was evaluated using three statistical indices: the coefficient of determination (R²), root mean square error (RMSE) and Nash-Sutcliffe efficiency (ENS). The formula of each statistical index is presented below:
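The three indices can be sketched in code as follows. This is a minimal illustration using only the Python standard library. It assumes that R² is taken as the squared Pearson correlation between observed and simulated runoff (one common convention in hydrology) and that RMSE and ENS follow their usual definitions; the function names and the sample values are hypothetical.

```python
import math

def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nash_sutcliffe(obs, sim):
    """ENS = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread

def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R^2)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)

# Illustrative values only, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, simulated))
print(nash_sutcliffe(observed, simulated))
print(r_squared(observed, simulated))
```

A perfect forecast gives RMSE = 0 and ENS = R² = 1, while ENS below zero means the model performs worse than simply predicting the observed mean.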

RESULTS AND DISCUSSION
In this study, the input variables were the rainfall of the current day and of the past 1, 2 and 3 days, and the runoff of the previous 1, 2 and 3 days, expressed as P_t, P_{t-1}, P_{t-2}, P_{t-3}, Q_{t-1}, Q_{t-2} and Q_{t-3}, respectively. These variables were used in the regular regression model, the regular artificial neural network model, the wavelet regression model and the wavelet artificial neural network model.
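The construction of these seven lagged inputs can be sketched as below. This is an illustration under stated assumptions: the forecast target is taken to be the current runoff Q_t, the function name `build_lagged_inputs` is hypothetical, and the numbers are made up rather than taken from the study data.

```python
def build_lagged_inputs(rain, runoff, max_lag=3):
    """Build rows [P_t, P_t-1, P_t-2, P_t-3, Q_t-1, Q_t-2, Q_t-3] with target Q_t."""
    rows, targets = [], []
    # The first max_lag days have no complete history, so they are skipped.
    for t in range(max_lag, len(runoff)):
        rain_terms = [rain[t - lag] for lag in range(0, max_lag + 1)]
        runoff_terms = [runoff[t - lag] for lag in range(1, max_lag + 1)]
        rows.append(rain_terms + runoff_terms)
        targets.append(runoff[t])
    return rows, targets

# Illustrative daily values only.
rain = [0.0, 5.0, 12.0, 3.0, 0.0, 8.0]
runoff = [1.0, 1.2, 2.5, 2.0, 1.4, 1.9]

rows, targets = build_lagged_inputs(rain, runoff)
for row, target in zip(rows, targets):
    print(row, "->", target)
```

Each row contains only information available up to day t (rainfall) and day t-1 (runoff), so the target runoff never appears among its own inputs.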
Both the regular and the wavelet artificial neural networks consisted of 12 nodes in the hidden layer and were trained for 5,000 epochs.
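A network of this shape can be sketched with the standard library alone. Only the 12 hidden nodes and the 5,000 epochs come from the text. The tanh activation, linear output, full-batch gradient descent, learning rate, weight initialization and the synthetic data are all assumptions made for illustration, and `train_mlp` is a hypothetical name. The demonstration runs fewer epochs to stay fast.

```python
import math
import random

def train_mlp(X, y, hidden=12, epochs=5000, lr=0.02, seed=0):
    """Train a one-hidden-layer network (tanh hidden, linear output) on mean squared error."""
    rng = random.Random(seed)
    n, n_in = len(X), len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    history = []  # mean squared error at the start of each epoch
    for _ in range(epochs):
        gW1 = [[0.0] * n_in for _ in range(hidden)]
        gb1, gW2, gb2, loss = [0.0] * hidden, [0.0] * hidden, 0.0, 0.0
        for x, target in zip(X, y):
            # Forward pass.
            h = [math.tanh(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
                 for j in range(hidden)]
            err = sum(W2[j] * h[j] for j in range(hidden)) + b2 - target
            loss += err * err / n
            # Backward pass: accumulate gradients of the mean squared error.
            gb2 += 2 * err / n
            for j in range(hidden):
                gW2[j] += 2 * err * h[j] / n
                dh = 2 * err * W2[j] * (1 - h[j] ** 2) / n
                gb1[j] += dh
                for i in range(n_in):
                    gW1[j][i] += dh * x[i]
        history.append(loss)
        # Gradient descent update.
        b2 -= lr * gb2
        for j in range(hidden):
            W2[j] -= lr * gW2[j]
            b1[j] -= lr * gb1[j]
            for i in range(n_in):
                W1[j][i] -= lr * gW1[j][i]

    def predict(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
             for j in range(hidden)]
        return sum(W2[j] * h[j] for j in range(hidden)) + b2

    return predict, history

# Synthetic data: 20 samples with 7 inputs scaled to [0, 1] (not study data).
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(7)] for _ in range(20)]
y = [0.5 * row[0] + 0.3 * row[4] for row in X]

predict, history = train_mlp(X, y, epochs=200)
print("first epoch MSE:", history[0])
print("last epoch MSE:", history[-1])
```

In practice the inputs would be the seven lagged rainfall and runoff variables, scaled to a comparable range before training.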
The original time series of daily runoff was decomposed into details and approximations by the discrete wavelet transform algorithm. According to the trend of the series, the Daubechies 3 wavelet was used as the mother wavelet with decomposition level 5. Consequently, D1, D2, D3, D4 and D5 were the detail components, and A5 was the approximation component. The decomposed detail and approximation components, together with the original daily runoff data, are shown in Figures 4 and 5.
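A five-level Daubechies 3 decomposition of this kind can be sketched without external libraries. The sketch below makes several assumptions: it uses periodic boundary handling, requires the series length to be a multiple of 2^5 = 32, uses the closed-form Daubechies 3 low-pass filter, and runs on a synthetic series. It returns wavelet coefficients for D1 to D5 and A5 rather than reconstructed components, and the function names are hypothetical. Wavelet libraries typically handle arbitrary lengths, such as the 1,031-day series, by signal extension.

```python
import math

def db3_lowpass():
    """Closed-form Daubechies 3 scaling (low-pass) filter with 6 taps."""
    z = math.sqrt(10.0)
    w = math.sqrt(5.0 + 2.0 * z)
    raw = [1 + z + w, 5 + z + 3 * w, 10 - 2 * z + 2 * w,
           10 - 2 * z - 2 * w, 5 + z - 3 * w, 1 + z - w]
    return [c * math.sqrt(2.0) / 32.0 for c in raw]

def dwt_step(signal, low):
    """One DWT level with periodic extension: returns (approximation, detail)."""
    n, taps = len(signal), len(low)
    # High-pass filter derived from the low-pass filter (quadrature mirror relation).
    high = [(-1) ** m * low[taps - 1 - m] for m in range(taps)]
    approx, detail = [], []
    for k in range(n // 2):
        window = [signal[(2 * k + m) % n] for m in range(taps)]
        approx.append(sum(h * x for h, x in zip(low, window)))
        detail.append(sum(g * x for g, x in zip(high, window)))
    return approx, detail

def decompose(signal, levels=5):
    """Return A<levels> and a dict of detail coefficients D1..D<levels>."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("length must be a multiple of 2**levels in this sketch")
    low = db3_lowpass()
    approx, details = list(signal), {}
    for level in range(1, levels + 1):
        approx, detail = dwt_step(approx, low)
        details[f"D{level}"] = detail
    return approx, details

# Synthetic runoff-like series of 128 days (not study data).
series = [10 + 5 * math.sin(2 * math.pi * t / 32) + (3 if t % 17 == 0 else 0)
          for t in range(128)]
a5, details = decompose(series, levels=5)

energy_in = sum(x * x for x in series)
energy_out = sum(c * c for c in a5) + sum(c * c for d in details.values() for c in d)
print({name: len(coeffs) for name, coeffs in details.items()}, "A5:", len(a5))
print(energy_in, energy_out)
```

Because the filters are orthonormal, the total energy of the coefficients matches that of the input series, which gives a quick sanity check on the decomposition.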
As can be seen from

CONCLUSIONS
In this study, both the regression model and the artificial neural network model were found to be able to reveal the complex relationship between runoff and rainfall. When the wavelet transform was used, it significantly improved the performance of those models. Therefore, adopting both models could provide timely alerts or warnings of flash flooding to the residents of the Songkhla Lake Basin.
This study used only two influencing factors. The researchers recommend investigating other factors that influence runoff, such as water flow rate, soil water absorption rate, evapotranspiration and land use, to optimize long-term forecasting.
Regarding the efficiency of the daily runoff forecast models for the Songkhla Lake Basin, the predictive models obtained from this study may be used by organizations dealing with water resource management and flooding. It was found that both of the wavelet

ACKNOWLEDGMENT
Thanks to the Climate Information Services for the daily rainfall data and the Water crisis prevention center for the daily runoff data in Songkhla Basin, Thailand, to Mr. Sukrit Kirtsaeng for suggesting the study area, and to Ms. Ratchaya Weerakarn for editing this article. Thanks also to the established program of Smart Analytics for SME and Community (SASMEC).