Rainfall prediction using fuzzy inference system for preliminary micro-hydro power plant planning

East Kalimantan is rich in water sources in the form of rivers that branch into remote areas. This natural potential offers an alternative solution for areas not yet reached by the electricity supply of the State Electricity Company. River water at a selected location (catchment area), channelled through a canal, pipeline or penstock, can be used to drive a waterwheel or turbine. The amount of power obtained depends on the water discharge and the head (the effective height between the reservoir and the turbine). The water discharge is strongly influenced by the amount of rainfall. Rainfall is the depth of water that falls on a flat horizontal surface over a certain period, measured in mm, in the absence of evaporation, run-off and infiltration. In this study, rainfall is predicted for East Kalimantan, which has 13 watersheds that, in principle, have the potential for the construction of Micro Hydro Power Plants. The rainfall time series is modelled using an AR (Auto Regressive) model based on a FIS (Fuzzy Inference System). The FIS structure obtained from training is then used to predict rainfall for the next two years.


Introduction
Hydropower is a system that converts the energy of flowing water into electrical or mechanical energy. There are various ways to harness this energy, one of which is the run-of-river system, which does not require a large reservoir. This system is known as microhydro, and a power plant that uses it is called a Micro Hydro Power Plant (MHPP). MHPP is an alternative renewable energy system that meets energy conservation policies, especially in areas with great potential for water energy sources.
East Kalimantan is rich in water sources in the form of rivers that branch into remote areas. This natural potential offers an alternative solution for areas not yet reached by the electricity supply of the State Electricity Company. River water at the selected location (catchment area), channelled through a canal, pipeline or penstock, can be used to drive a waterwheel or turbine. The amount of power obtained in this way depends on the water discharge and the head (the effective height between the reservoir and the turbine). The water discharge (m³/s) is determined by the flow velocity of the river water (m/s) and the cross-sectional area of the river (m²). Water discharge is strongly influenced by the amount of rainfall. Rainfall is the depth of water that falls on a flat horizontal surface over a certain period, measured in mm, in the absence of evaporation, run-off and infiltration. Instantaneous water discharge is measured at the watershed location where the dam will be built in order to determine, through hydrological studies, the smallest discharge during the dry season. Rainfall prediction is therefore very important for the initial planning of MHPP development.
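The dependence of power on discharge and head can be made concrete with the standard hydropower relation P = η ρ g Q H, where Q = v A. The sketch below is illustrative only: the function names, the default efficiency and the sample site values are assumptions, not figures from this study.

```python
RHO = 1000.0  # density of water, kg/m^3
G = 9.81      # gravitational acceleration, m/s^2


def discharge(velocity_m_s, area_m2):
    """Water discharge Q (m^3/s) from flow velocity (m/s) and cross-sectional area (m^2)."""
    return velocity_m_s * area_m2


def hydro_power_watts(q_m3_s, head_m, efficiency=0.7):
    """Power (W) available from discharge Q and effective head H.

    The default efficiency is an assumed placeholder, not a value from the paper.
    """
    return efficiency * RHO * G * q_m3_s * head_m


# Hypothetical site: 2 m/s flow through a 1.5 m^2 section with a 10 m head.
q = discharge(2.0, 1.5)
p = hydro_power_watts(q, 10.0)
print(f"Q = {q:.2f} m^3/s, P = {p / 1000:.1f} kW")
```

Because power scales linearly with Q, an error in the predicted dry-season discharge carries over directly to the estimated plant capacity.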
In this study, rainfall is predicted for East Kalimantan, which has 13 watersheds that in principle have the potential for MHPP construction. The rainfall time series is modelled using an AR (Auto Regressive) model based on a FIS (Fuzzy Inference System). Training is carried out to obtain the best FIS structure, which is then used to predict rainfall for the next two years.

Method
The number of rainy days in a month depends on the amount of rainfall (mm), and it is the number of rainy days that affects the river water discharge. Rainfall, or precipitation, is a hydrometeor: a collection of liquid or solid particles that fall or float in the atmosphere as a result of the condensation of water vapor (cloud) [1]. Rainfall data can be categorized as time series data with seasonal and non-stationary patterns.

Fuzzy Logic
Fuzzy logic provides an inference structure that approximates human reasoning. It offers an inference mechanism under cognitive uncertainty and is based on the notion of relative graded membership, as are the functions of mentation and cognitive processes. The utility of fuzzy sets lies in their ability to model uncertain or ambiguous data.
Fuzziness in a fuzzy set is characterized by its membership functions, which classify the elements of the set, whether discrete or continuous. Membership functions can also be represented graphically, using different shapes, subject to certain restrictions on the shapes used. The rules formed to represent the fuzziness in an application are also fuzzy [2]. The shape of the membership function is an important criterion to consider. Many membership functions (MF) can be used, one of which is the triangular MF shown in Figure 1. The fuzzy set is then declared on the basis of Figure 1.

FIS (Fuzzy Inference System)
A FIS is a system that maps a given input to an output using fuzzy logic. The FIS is shown in Figure 2.

2.2.1. Fuzzify inputs.
The first step is to take the inputs and determine the degree to which they belong to each of the appropriate fuzzy sets via membership functions. The input is always a crisp numerical value limited to the universe of discourse of the input variable, and the output is a fuzzy degree of membership in the qualifying linguistic set. Fuzzification of the input is performed by either a table lookup or a function evaluation. In this way, each input is fuzzified over all the qualifying membership functions required by the rules.
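Fuzzification by function evaluation can be sketched as follows. This is a minimal illustration, assuming a triangular membership function in the common three-parameter form (left foot a, peak b, right foot c) and three hypothetical fuzzy sets on a 0 to 100 universe; the actual sets of Figure 3 are not reproduced here. The crisp values 20 and 65 are borrowed from the example given later in this section.

```python
def triangular_mf(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set (feet a and c, peak b)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


# Hypothetical fuzzy sets (assumed, not taken from the paper's figures).
FUZZY_SETS = {
    "Low": (-50.0, 0.0, 50.0),
    "Medium": (0.0, 50.0, 100.0),
    "High": (50.0, 100.0, 150.0),
}


def fuzzify(x, fuzzy_sets=FUZZY_SETS):
    """Map a crisp input to its membership degree in every linguistic label."""
    return {label: triangular_mf(x, *abc) for label, abc in fuzzy_sets.items()}


degrees_a = fuzzify(20.0)
degrees_b = fuzzify(65.0)
print(degrees_a)
print(degrees_b)
```

With these assumed sets, each crisp input belongs partially to two neighbouring labels, which is what allows several rules to fire at once.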

2.2.2. Applying Fuzzy Operator.
After the inputs are fuzzified, the degree to which each part of the antecedent is satisfied is known for each rule. If the antecedent of a given rule has more than one part, the fuzzy operator is applied to obtain one number that represents the result of the antecedent for that rule. This number is then applied to the output function. The input to the fuzzy operator is two or more membership values from fuzzified input variables, and the output is a single truth value. Two fuzzy operators are commonly used: OR (max) and AND (min). For example, suppose there are two inputs (A and B) and one output (C), each with the linguistic variables Low, Medium and High and the same fuzzy sets, as shown in Figure 3. If A(x) = 20 and B(x) = 65, then the evaluation of rule (2) is illustrated in Figure 4.

2.2.3. Applying Implication Method.
Before applying the implication method, the rule's weight must be determined. Every rule has a weight (a number between 0 and 1), which is applied to the number given by the antecedent. Generally, this weight is 1 (as it is in this example) and thus has no effect on the implication process. After proper weighting has been assigned to each rule, the implication method is implemented. A consequent is a fuzzy set represented by a membership function, which appropriately weights the linguistic characteristics attributed to it. The consequent is reshaped using a function associated with the antecedent (a single number). The input to the implication process is a single number given by the antecedent, and the output is a fuzzy set. Implication is implemented for each rule. The AND (min) method is used for the implication operation, as shown in Figure 5.

2.2.4. Aggregating All Outputs.
Because decisions are based on the testing of all of the rules in a FIS, the rules must be combined in some manner in order to make a decision. Aggregation is the process by which the fuzzy sets that represent the outputs of each rule are combined into a single fuzzy set. Aggregation occurs only once for each output variable, just prior to the fifth and final step, defuzzification. The input to the aggregation process is the list of truncated output functions returned by the implication process for each rule. The output of the aggregation process is one fuzzy set for each output variable. For the Mamdani FIS, the aggregation stage uses the OR (max) method. An example is shown in Figure 6.
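The operator, implication and aggregation steps can be traced numerically. The sketch below is a minimal Mamdani-style illustration under stated assumptions: the triangular sets, the sampled output universe and both rules are hypothetical stand-ins (the paper's rule (2) and Figures 3 to 6 are not reproduced here), and only the inputs A = 20 and B = 65 come from the text.

```python
def tri(x, a, b, c):
    """Triangular membership function (feet a and c, peak b)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Assumed fuzzy sets, shared by inputs A, B and output C.
SETS = {"Low": (-50, 0, 50), "Medium": (0, 50, 100), "High": (50, 100, 150)}
UNIVERSE = list(range(0, 101, 5))  # sampled output universe for C


def degree(x, label):
    return tri(x, *SETS[label])


def imply(strength, label, weight=1.0):
    """Implication by min: truncate the consequent set at the weighted rule strength."""
    s = weight * strength
    return [min(s, degree(z, label)) for z in UNIVERSE]


def aggregate(*rule_outputs):
    """Aggregation by max: pointwise maximum over all truncated consequents."""
    return [max(values) for values in zip(*rule_outputs)]


a, b = 20, 65
# Hypothetical rule 1: IF A is Low AND B is Medium THEN C is Medium
s1 = min(degree(a, "Low"), degree(b, "Medium"))   # AND (min)
# Hypothetical rule 2: IF A is Medium OR B is High THEN C is High
s2 = max(degree(a, "Medium"), degree(b, "High"))  # OR (max)

aggregated = aggregate(imply(s1, "Medium"), imply(s2, "High"))
print(s1, s2, max(aggregated))
```

The aggregated list is a sampled fuzzy set over the output universe; it is this set, not the individual rule strengths, that is passed on to defuzzification.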

2.2.5. Defuzzify.
The input to the defuzzification process is a fuzzy set (the aggregate output fuzzy set) and the output is a single number. As much as fuzziness helps the rule evaluation during the intermediate steps, the final desired output for each variable is generally a single number. However, the aggregate fuzzy set encompasses a range of output values, and so must be defuzzified to resolve a single output value from the set. There are various defuzzification methods; one of the most popular is the centroid method, which returns the center of the area under the curve. For a fuzzy set X with a finite universe of discourse, the centroid method is expressed by:

$$c_x = \frac{\sum_{i} x_i \, \mu_X(x_i)}{\sum_{i} \mu_X(x_i)}$$

where $c_x$ is the centroid value, $x_i$ is the ith element of the universe of discourse and $\mu_X(x_i)$ is its degree of membership in X.

The algorithm of prediction using the AR model based on FIS is shown in Figure 7.

Table 1. Rainfall data.

The training performance is measured by the mean absolute percentage error (MAPE):

$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{yt_i - yt_i^{fuzzy}}{yt_i} \right| \qquad (6)$$

where $N$ is the number of training data, $yt_i$ is the ith training target and $yt_i^{fuzzy}$ is the ith output of the FIS. The FIS is implemented in MATLAB.
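Both the centroid method and the MAPE are short enough to sketch directly. The Python below is an illustration only (the study itself uses Matlab); the function names and sample values are assumptions, and the centroid is written in its usual discrete form, a membership-weighted average over a sampled universe.

```python
def centroid(xs, memberships):
    """Discrete centroid: membership-weighted mean of the sampled universe xs."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("fuzzy set is empty: all membership degrees are zero")
    return sum(x * m for x, m in zip(xs, memberships)) / total


def mape(targets, outputs):
    """Mean absolute percentage error (%) between targets and model outputs."""
    if len(targets) != len(outputs) or not targets:
        raise ValueError("targets and outputs must be non-empty and equal in length")
    n = len(targets)
    return 100.0 / n * sum(abs((t - o) / t) for t, o in zip(targets, outputs))


# Illustrative values only, not data from the study.
c = centroid([0, 50, 100], [0.0, 0.6, 0.4])
err = mape([100.0, 200.0], [90.0, 220.0])
print(c, err)
```

Note that MAPE is undefined when a target is zero, so months with no recorded rainfall would need separate handling.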
All the training data are arranged sequentially by year and then by month, giving a single series of 132 data points. Because a 2nd-order AR model is used, each variable then consists of 130 data points. With 5 linguistic variables used for initialization, the interval width of each data label is expressed in terms of Lvar, the number of linguistic variables used for initialization. Using the rainfall data shown in Table 1, the result is shown in Figure 8, and the prediction result for the next 2 years is shown in Figure 9.
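The arrangement of the series into 2nd-order AR form, and the labelling of each value, can be sketched as follows. Two points are assumptions rather than the paper's stated method: the interval width is taken as the observed range divided by the number of linguistic variables, and the series is a synthetic placeholder, not the rainfall data of Table 1. With a 2nd-order model, a series of 132 values yields the 130 data per variable mentioned above.

```python
def make_ar_samples(series, order=2):
    """Pair each value y(t) with its `order` predecessors (y(t-order), ..., y(t-1))."""
    return [(tuple(series[t - order:t]), series[t]) for t in range(order, len(series))]


def interval_width(series, n_labels):
    """Assumed equal-width partition: observed range divided by the label count."""
    return (max(series) - min(series)) / n_labels


def label_of(value, lowest, width, n_labels):
    """Index (0 .. n_labels-1) of the interval that contains value."""
    return min(int((value - lowest) / width), n_labels - 1)


# Synthetic placeholder series of 132 monthly values (not real rainfall data).
series = [100.0 + (i * 37) % 200 for i in range(132)]

samples = make_ar_samples(series, order=2)
width = interval_width(series, n_labels=5)
labels = [label_of(y, min(series), width, 5) for y in series]
print(len(samples), width, labels[:6])
```

Each labelled sample (two input labels and one target label) is then a candidate rule, which is why the size of the rule base depends on both the amount of training data and the number of linguistic variables.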

Result and Discussion
A summary of the training process is shown in Table 2, and the training performance in terms of MAPE is shown in Figure 10. Training achieved a MAPE of 6.28% with 300 linguistic variables, for which the final interval width is 3.05. Table 2 and Figure 10 show that changing the number of linguistic variables changes the MAPE. It also changes the interval width of each data label and the number of rules in the rule base.
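One way to picture the relationship between the number of linguistic variables and the MAPE is a simple search that increases the count until a target error is met. The sketch below is only an assumed reading of the training loop: the step size, the stopping rule and the stand-in error function `toy_mape` are all hypothetical and do not reproduce Table 2.

```python
def tune_label_count(evaluate_mape, target_mape, start=5, step=5, max_labels=1000):
    """Increase the number of linguistic variables until the MAPE target is met.

    evaluate_mape is a caller-supplied function that trains a model with the
    given label count and returns its training MAPE in percent.
    """
    n = start
    while n <= max_labels:
        error = evaluate_mape(n)
        if error <= target_mape:
            return n, error
        n += step
    raise ValueError("target MAPE not reached within max_labels")


def toy_mape(n_labels):
    """Synthetic stand-in for training: error shrinks as labels are added."""
    return 500.0 / n_labels


best_n, best_error = tune_label_count(toy_mape, target_mape=10.0)
print(best_n, best_error)
```

A finer partition lowers the training error but also enlarges the rule base, so the MAPE target acts as the stopping criterion.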
To predict the next two years (2 × 12 = 24 predicted values), each prediction result is in turn used to add to the rule base, and the process is repeated until 24 predicted values have been obtained.
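This step-by-step procedure corresponds to a recursive forecast, sketched below for a 2nd-order model. The names `predict_one` and `add_rule` are hypothetical: the first stands in for the trained FIS, and the second for whatever rule-base update the implementation performs after each prediction. The averaging predictor is a placeholder, not the paper's model.

```python
def forecast_recursive(history, predict_one, steps=24, order=2, add_rule=None):
    """Predict `steps` values ahead, feeding each prediction back as an input."""
    window = list(history[-order:])
    predictions = []
    for _ in range(steps):
        y_next = predict_one(tuple(window))
        if add_rule is not None:
            add_rule(tuple(window), y_next)  # let the caller extend its rule base
        predictions.append(y_next)
        window = window[1:] + [y_next]       # slide the lag window forward
    return predictions


def mean_predictor(lags):
    """Placeholder model: the mean of the lagged inputs."""
    return sum(lags) / len(lags)


added_rules = []
predictions = forecast_recursive(
    [10.0, 20.0],
    mean_predictor,
    steps=24,
    add_rule=lambda lags, y: added_rules.append((lags, y)),
)
print(predictions[:3], len(added_rules))
```

Because later predictions are built on earlier ones, errors can accumulate over the 24 steps, which is worth keeping in mind when reading the two-year forecast.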

Conclusion
Time series prediction using an AR model based on FIS depends on the order of the AR model, which determines the number of FIS inputs; the FIS has a single output. The interval width of each data label depends on the number of linguistic variables. The number of rules in the rule base depends on the amount of labelled training data and on the number of linguistic variables. The number of linguistic variables is changed according to the specified MAPE target.