Weather Parameters Forecasting as Variables for Rainfall Prediction using Adaptive Neuro Fuzzy Inference System (ANFIS) and Support Vector Regression (SVR)

Weather anomalies can have negative impacts such as flooding, which paralyzes community economic activity, disrupts transportation, and damages public infrastructure. This research forecasts weather parameters as variables for predicting the amount of rainfall using the Adaptive Neuro Fuzzy Inference System (ANFIS) and Support Vector Regression (SVR), with the aim of providing information on future weather conditions quickly and accurately so that people can prepare themselves and the equipment needed to deal with them. Rainfall is predicted from synoptic data, namely relative humidity, wind, and temperature. Each parameter is first forecast using ANFIS, and the results are then used to predict rainfall. Prediction accuracy is measured using MSE and RMSE. The ANFIS forecasts of the parameters that affect rainfall have an RMSE of 1.975004 for wind speed, 0.742332 for temperature, and 3.871590 for relative humidity. Rainfall prediction with SVR, based on data pre-processed with the nearest method, produces an MSE of 0.0928.


Introduction
Indonesia is a tropical country with two seasons, rainy and dry [1]. Weather conditions are closely tied to human life because nearly all activities depend on the prevailing weather. Weather and climate information is also needed in many fields, such as agriculture, transportation, and tourism. Factors that influence the weather include temperature, humidity, air pressure, wind speed, total cloud cover, and solar radiation.
Weather anomalies have many negative impacts, one of which is flooding. Floods cripple community economic activity, disrupt transportation, damage public infrastructure, and cause many other adverse effects. Precise information about rainfall is therefore an important part of weather information, which is why this research predicts the occurrence of rain.
This research was conducted in the northern Surabaya region because it is one of the centers of the city of Surabaya. The area has important public facilities such as port docks, hospitals, schools, a center for religious tourism, and many offices and industrial sites. Given the vital importance of northern Surabaya for the city, information about the amount of rainfall is needed to deal with flooding that might occur because of the limited flow capacity of the area.
Weather conditions have been determined with a variety of existing methods and models. These include the characterization of rainfall and of clouds that produce high daily rainfall based on micro rain radar data [2], weather prediction on time series data using the Adaptive Neuro Fuzzy Inference System (ANFIS) [3], rainfall prediction using the Modified Nearest Neighbour method [4], rainfall prediction in northern Surabaya using a fuzzy inference system, weather prediction using data mining [5], and rain prediction in the dry season using Support Vector Regression modelling based on SOI and NINO 3.4 [6].
In the modern era, people are expected to anticipate what will happen and to plan for predicted events. A good prediction is based on symptoms or patterns that have been observed repeatedly, not on speculation without concrete reasons. Various tools and methods can be used for prediction, including time series methods and artificial intelligence-based methods such as the Support Vector Machine (SVM) [7]. The Support Vector Machine addresses the classification problem, in which the goal is to separate two or more classes with functions induced from the available examples [8]. Examples of research that uses Support Vector Classification include a diabetic retinopathy detection system using the Support Vector Machine [9], an analysis of the influence of the Support Vector Machine (SVM) on the classification of microarray data for cancer detection [10], and the implementation of the Support Vector Machine method for the identification of cabbage plant leaf disease.
Support vectors can also be applied to the regression problem by using the concept of a loss function, which gives Support Vector Regression (SVR). Loss functions used in SVR include the ɛ-insensitive, quadratic, Huber, and Laplace loss functions. In previous studies, SVR has been used for prediction cases, including forecasting crude palm oil using Support Vector Regression with a radial basis kernel [11], SVR analysis in predicting the exchange rate of the rupiah against the US dollar (Amanda, Yasin, & Prahutama, 2014), Support Vector Regression for forecasting the demand and supply of pulpwood [12], and Support Vector Regression modelling for rainfall prediction in the dry season based on the Southern Oscillation Index and NINO 3.4 [6].
Based on these conditions, this research predicts rainfall using the Adaptive Neuro Fuzzy Inference System (ANFIS) and Support Vector Regression (SVR), with relative humidity, wind, and temperature data obtained from the Perak II Surabaya Maritime Meteorological Station of the Meteorology, Climatology, and Geophysics Agency (BMKG).

Rain
Rain is precipitation in liquid form. It requires a thick atmospheric layer in which temperatures near the surface of the earth are above the melting point of ice [13]. Rain occurs when water vapor in the atmosphere condenses into droplets dense enough to fall. Cooling of the air and the addition of water vapor to the air both encourage rain because they make the air more saturated. Based on the process of occurrence, rain is divided into four types, namely zenithal, orographic, frontal, and cyclonal rain [14].
1. Zenithal rain. Zenithal rain, or convection rain, occurs because air containing water vapor rises vertically. The rising air cools, so the water vapor it contains condenses into droplets that fall to earth as rain.
2. Orographic rain. Orographic rain occurs because a mass of air containing water vapor is forced to climb a mountain slope, so it is also called mountain rain.
3. Frontal rain. Frontal rain occurs where a mass of hot air meets a mass of cold air. The hot air mass rises above the cold air mass.
4. Cyclonal rain. Cyclonal rain occurs because of the influence of cyclone winds. Rain caused by a cyclone is very dangerous because it can bring tornadoes and tropical cyclones (hurricanes) [14].

Rainfall
Rainfall is the amount of rainwater that falls on the ground surface during a certain period, measured as the height of water above a horizontal surface, assuming none is lost through evaporation, drainage, or infiltration. Rainfall is measured in millimeters (mm). Rainfall differs from one region to another. For example, one region may be at the peak of the rainy season, with increasing rainfall, while rainfall in another region is declining, although two regions can still record the same rainfall [7]. According to BMKG, rainfall is grouped into four categories:
1. Low rainfall (0-100 mm)
2. Medium rainfall (101-300 mm)
3. High rainfall (301-400 mm)
4. Very high rainfall (> 400 mm)
Rainfall in an area can be influenced by several parameters, including wind, relative humidity, and temperature.
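The BMKG grouping above can be expressed as a simple threshold function. The sketch below is illustrative only: the function name `rainfall_category` is hypothetical, and treating fractional values between the listed integer bounds (for example 100.5 mm) as belonging to the lower group is an assumption, since the source lists integer ranges.

```python
def rainfall_category(rainfall_mm):
    """Map a rainfall amount in mm to the four BMKG groups listed in the text.

    Assumption: values that fall between the integer bounds (e.g. 100.5)
    are assigned to the next group up, because they exceed the upper bound.
    """
    if rainfall_mm < 0:
        raise ValueError("rainfall cannot be negative")
    if rainfall_mm <= 100:
        return "low"
    if rainfall_mm <= 300:
        return "medium"
    if rainfall_mm <= 400:
        return "high"
    return "very high"


if __name__ == "__main__":
    for amount in (0, 150, 350, 420):
        print(amount, "mm ->", rainfall_category(amount))
```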

Adaptive Neuro Fuzzy Inference System (ANFIS)
ANFIS (Adaptive Neuro-Fuzzy Inference System) is a hybrid algorithm that combines Fuzzy Inference System (FIS) mechanisms configured in a neural network architecture. ANFIS has two sets of parameters, namely the premise parameters and the consequent parameters. Its hybrid training consists of a forward step and a backward step [15].
For a network with radial basis functions to be equivalent to a first-order Sugeno fuzzy rule-based model, the following restrictions are needed:
a. Both must use the same aggregation method (weighted average or weighted sum) to derive all outputs.
b. The number of activation functions must equal the number of fuzzy rules (IF-THEN).
c. If the rule base has multiple inputs, each activation function must be the same as the membership function of the corresponding input.
d. The activation functions and the fuzzy rules must have the same functions for the neurons and the rules on the output side.
The neuro-fuzzy system has five layers, whose functions and equations are as follows [16] [17].
Layer 1: Fuzzification layer. This layer forms the fuzzy sets using membership functions. Several membership functions can be used, including bell, Gaussian, trapezoidal, and triangular functions. For an input x with fuzzy set A_i (and likewise for an input y with fuzzy set B_i), the output of layer 1 can be expressed as:
$$O_{1,i} = \mu_{A_i}(x) \tag{1}$$
Layer 2: Product layer. This layer combines the information from layer 1 by multiplying all incoming signals and sending the product out. The output of this layer is:
$$O_{2,i} = \mu_{A_i}(x)\,\mu_{B_i}(y) = w_i \tag{2}$$
Layer 3: Normalization layer. The results of layer 2 are normalized in layer 3. The output of layer 3 is:
$$O_{3,i} = \bar{w}_i = \frac{w_i}{\sum_j w_j} \tag{3}$$
Layer 4: Defuzzification layer. The output of layer 4 is calculated as:
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i\,(p_i x + q_i y + r_i) \tag{4}$$
where {p_i, q_i, r_i} is the set of consequent parameters.
Layer 5: Output layer. In the standard first-order Sugeno ANFIS, the overall output is the sum of all signals coming from layer 4:
$$O_{5} = \sum_i \bar{w}_i f_i \tag{5}$$

Support Vector Regression (SVR)
The Support Vector Machine (SVM) [18] has been applied by both scientists and practitioners to solve real problems in daily life. Its applications include gene expression analysis, financial prediction, weather, and the medical field [19].
SVR is the application of SVM to regression. SVM looks for the best separating function (hyperplane) among an infinite number of candidate functions to separate two classes of objects, whereas SVR looks for a regression function, in the form of a hyperplane, that fits all input data within an error ɛ while keeping ɛ as small as possible [20]. In its standard form, the regression function of the SVR method is
$$f(x) = w^{T}\varphi(x) + b$$
where w is the weight vector, φ(x) maps the input x to the feature space, and b is the bias. A function that expresses whether and how an error is penalized is called a loss function. The simplest loss function is the ɛ-insensitive loss function, which has the following formulation:
$$L_{\varepsilon}(y, f(x)) = \begin{cases} 0 & \text{for } |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon & \text{for others} \end{cases}$$
Several kernel functions are used to solve linear and non-linear SVM problems, among them:
a. Linear kernel function
$$k(x_i, x_j) = x_i \cdot x_j$$
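To make the ɛ-insensitive loss and the linear kernel concrete, the following is a minimal sketch in plain Python. The function names are hypothetical, and the loss uses the standard definition (zero inside the ɛ tube, linear outside it), which is assumed here to be the formulation the text refers to.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Standard epsilon-insensitive loss: errors up to eps are not penalized."""
    return max(0.0, abs(y_true - y_pred) - eps)


def linear_kernel(xi, xj):
    """Linear kernel k(xi, xj) = xi . xj (dot product of two feature vectors)."""
    return sum(a * b for a, b in zip(xi, xj))


def gram_matrix(samples):
    """Kernel matrix K with K[i][j] = k(x_i, x_j) for the linear kernel."""
    return [[linear_kernel(xi, xj) for xj in samples] for xi in samples]


if __name__ == "__main__":
    # Illustrative feature vectors (wind speed, humidity, temperature); made-up values.
    samples = [[3.0, 80.0, 28.0], [5.0, 75.0, 29.5]]
    print(gram_matrix(samples))
    print(eps_insensitive_loss(10.0, 10.05, eps=0.1))
```

Errors smaller than ɛ contribute nothing to the loss, which is why only observations on or outside the tube end up as support vectors.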

Research Methods
This is quantitative research. It begins with a literature study of the problem, followed by data collection and pre-processing, in which blank data are filled in. The parameters that affect rainfall are then predicted using the TS-ANFIS method, rainfall is predicted using the Support Vector Regression (SVR) method, and conclusions are drawn from the results. The flow of this study is shown in Figure 1.
ANFIS and SVR can be combined in this research because rainfall is predicted using SVR from the weather parameters (wind speed, relative humidity, and temperature), and those weather parameters are themselves predicted using ANFIS. In the parameter prediction stage, after pre-processing, the data are first constructed into time series data, each parameter is forecast using ANFIS, and the results are divided into training data and test data. Once the predicted parameter data have been obtained with TS-ANFIS, rainfall is predicted using SVR: the data are first normalized, then divided into training data and test data, and the parameters are initialized for training and testing until the optimal model for predicting rainfall is obtained.

Results
Weather data comprising wind speed, temperature, humidity, and rainfall were obtained from the online data of the Meteorology, Climatology, and Geophysics Agency (BMKG) Perak II Surabaya Maritime station. Blank data are filled in (preprocessing), and the weather parameters that affect rainfall are then predicted using TS-ANFIS. The TS-ANFIS predictions are then used to predict rainfall using Support Vector Regression (SVR). The results of each of these processes are as follows:

Preprocessing
At this stage, the input is weather data (temperature, humidity, wind speed) from January 2009 to December 2018. Some values are missing because no measurement was taken, so preprocessing is needed.
Blanks in the data are filled with the nearest value that is not missing (nearest method), so that the output contains no blanks. The data prepared in this way include wind speed, temperature, humidity, and rainfall. Samples of the weather parameter data are shown in Table 1. Some entries in Table 1 carry the code 9999, which means that the value for that day was not measurable or not observed. These entries are filled in so that the subsequent processes give better results. In Table 2, the entries of Table 1 that contained the code 9999 have been filled with new values obtained using the nearest method [21].
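The nearest-value filling described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fill_nearest` is hypothetical, "nearest" is assumed to mean nearest in time (index distance), and resolving ties in favour of the earlier observation is likewise an assumption.

```python
MISSING_CODE = 9999  # code used in the source data for unobserved values


def fill_nearest(values, missing=MISSING_CODE):
    """Replace each missing entry with the nearest observed entry by index.

    Assumption: when two observed neighbours are equally close,
    the earlier one is used.
    """
    observed = [i for i, v in enumerate(values) if v != missing]
    if not observed:
        raise ValueError("no observed values to copy from")
    filled = list(values)
    for i, v in enumerate(values):
        if v == missing:
            # Smallest index distance wins; the index itself breaks ties.
            nearest = min(observed, key=lambda j: (abs(j - i), j))
            filled[i] = values[nearest]
    return filled


if __name__ == "__main__":
    # Made-up daily temperatures with gaps coded as 9999.
    temperature = [9999, 27.1, 9999, 9999, 28.4, 9999]
    print(fill_nearest(temperature))
```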

Prediction using ANFIS
Weather parameter prediction with TS-ANFIS is based on data patterns that have occurred before. The weather parameter data are arranged in time series format: the input variables are the values at t-2, t-1, and t, and the output is the value at t+1. The forecasting results for wind speed, temperature, humidity, and rainfall can be reviewed in Table 3, Table 4, and Table 5.
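The time series format described above, with inputs at t-2, t-1, and t and the output at t+1, amounts to a sliding window over each parameter. A minimal sketch, with a hypothetical function name `make_windows` and made-up sample values:

```python
def make_windows(series, n_lags=3):
    """Build (inputs, targets) pairs from a 1-D series.

    Each input holds the values at t-2, t-1 and t (for n_lags=3),
    and the matching target is the value at t+1.
    """
    inputs, targets = [], []
    for t in range(n_lags - 1, len(series) - 1):
        inputs.append(series[t - n_lags + 1 : t + 1])
        targets.append(series[t + 1])
    return inputs, targets


if __name__ == "__main__":
    humidity = [78.0, 80.0, 82.0, 79.0, 77.0]  # made-up daily values
    x, y = make_windows(humidity)
    for row, target in zip(x, y):
        print(row, "->", target)
```

A series of n observations yields n - 3 training pairs, since the first two days have no complete input window and the last day has no target.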
The initial step of ANFIS is the inference stage. After the inference stage, the ANFIS model undergoes hybrid learning, with forward learning using the recursive LSE (least squares estimation) method and backward learning using gradient descent. The predictions of relative humidity, temperature, and wind using ANFIS can be reviewed in Figure 2 to Figure 4. In these figures, some of the data look different. The wind prediction has an RMSE of 1.975004, the temperature prediction has an RMSE of 0.742332, and the relative humidity prediction has an RMSE of 3.871590.
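The inference stage mentioned above corresponds to one forward pass through the five ANFIS layers. The sketch below is a minimal illustration under stated assumptions: Gaussian membership functions, one fuzzy set per input per rule, and the standard first-order Sugeno output (sum of the weighted rule outputs) in layer 5. All names and parameter values are hypothetical, and the hybrid learning step that tunes the parameters is not shown.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership function (one of the options named in the text)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_forward(inputs, premise, consequent):
    """One forward pass of a first-order Sugeno ANFIS.

    premise[r]    : list of (center, sigma) pairs, one per input, for rule r
    consequent[r] : linear coefficients for rule r, one per input plus a constant
    """
    # Layer 1: membership degree of each input in each rule's fuzzy set.
    memberships = [
        [gaussian_mf(x, c, s) for x, (c, s) in zip(inputs, rule)]
        for rule in premise
    ]
    # Layer 2: firing strength w_r is the product of the rule's memberships.
    w = [math.prod(m) for m in memberships]
    # Layer 3: normalize the firing strengths.
    total = sum(w)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    w_bar = [wr / total for wr in w]
    # Layer 4: weight each rule's linear consequent by its normalized strength.
    f = [
        sum(p * x for p, x in zip(coef[:-1], inputs)) + coef[-1]
        for coef in consequent
    ]
    layer4 = [wb * fr for wb, fr in zip(w_bar, f)]
    # Layer 5: the overall output is the sum of the layer 4 signals.
    return sum(layer4)


if __name__ == "__main__":
    # Three inputs (values at t-2, t-1, t) and two rules; parameters are made up.
    premise = [
        [(27.0, 1.0), (27.0, 1.0), (27.0, 1.0)],
        [(30.0, 1.0), (30.0, 1.0), (30.0, 1.0)],
    ]
    consequent = [[0.2, 0.3, 0.5, 0.0], [0.1, 0.2, 0.7, 0.1]]
    print(anfis_forward([27.5, 28.0, 28.2], premise, consequent))
```

In hybrid learning, the forward step would fit the `consequent` coefficients by least squares with the premise parameters held fixed, and the backward step would adjust the `(center, sigma)` pairs by gradient descent.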

Prediction using SVR
After the results of the previous process are obtained, the data are processed using the Support Vector Regression (SVR) method, which is used to predict rainfall based on wind speed, temperature, and relative humidity.
Before prediction with the SVR algorithm, the temperature, relative humidity, wind speed, and rainfall data are used for training and testing. Training is conducted first to find the model, and testing is then conducted to check whether the resulting model is optimal. In this research, 87.5% of the data was used to build the prediction model, and the remaining 12.5% was used as test data. Each element of the kernel matrix K is the result of $x_i \cdot x_j$. The kernel matrix K is then used in the next process to find the alpha values and the bias using Quadratic Programming with the help of software, which gives:
$$\alpha = [2.04373945497679,\ -0.421145981740259,\ 0.693980872226408]$$
$$\text{Bias} = 6.93778959270997$$
If either $\alpha_i$ or $\alpha_i^{*}$ is not zero, the corresponding observation is called a support vector. After these values are obtained, the SVR model can be used as a predictor. The results of predictions using SVR can be reviewed in Figure 5.
Figure 5 has two lines: the blue line represents the actual rainfall data, and the red line represents the rainfall predicted using Support Vector Regression (SVR). The predictions for day 1 to day 50 differ considerably from the actual data, and rainfall reaches its peak between day 125 and day 150. The figure shows differences between the red and blue lines, and the difference between day 275 and day 300 is the largest contributor to the prediction error. In the actual data, rainy periods occur at intervals of 20 days, whereas the predicted rainy periods occur at closer intervals. In the figure, the rainy season is represented by nonzero values and the remainder is the dry season. The MSE of the rainfall prediction model is 0.0928.
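The data split, the use of the alpha values and bias as a predictor, and the error measure described above can be sketched in plain Python. Several points are assumptions rather than details given in the text: the split is taken in chronological order, normalization is min-max scaling with bounds from the training data, and the predictor uses the common dual form with one coefficient per support vector and the linear kernel. All function names and sample values are hypothetical.

```python
import math


def chronological_split(rows, train_fraction=0.875):
    """Use the first 87.5% of the rows for training and the rest for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def min_max_scale(values, lo, hi):
    """Scale values to [0, 1] with bounds lo and hi (assumed normalization)."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def svr_predict(x, support_vectors, dual_coefs, bias):
    """Dual-form SVR prediction with a linear kernel: sum_i c_i (x_i . x) + b."""
    return sum(
        c * sum(a * b for a, b in zip(sv, x))
        for c, sv in zip(dual_coefs, support_vectors)
    ) + bias


def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))


if __name__ == "__main__":
    days = list(range(16))
    train, test = chronological_split(days)
    print(len(train), "training rows,", len(test), "test rows")
    print(mse([0.0, 0.2, 0.4], [0.1, 0.2, 0.1]))
```

Because MSE is computed on whatever scale the targets are in, an MSE on normalized rainfall is not directly comparable to an RMSE reported in physical units.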

Conclusion
Based on the implementation and trials of rainfall prediction using TS-ANFIS-SVR, it can be concluded that the prediction of the parameters that affect rainfall using the TS-ANFIS method has an RMSE of 1.975004 for wind speed, 0.742332 for temperature, and 3.871590 for relative humidity. Rainfall prediction using the Support Vector Regression (SVR) method, based on data pre-processed with the nearest method, produces an MSE of 0.0928.