Intelligent decision support system based on rough set and fuzzy logic approach for efficacious precipitation forecast

Article history: Received February 25, 2016. Received in revised format: March 28, 2016. Accepted June 26, 2016. Available online June 26, 2016.
Weather forecasting is an essential and demanding scientific task of meteorological services across the world. It is a complex procedure that draws on many specialized technological fields of study. Prediction is an intricate process in meteorology because all decisions are made under the uncertainty associated with weather systems. This research introduces a novel rough fuzzy computing approach for short term rainfall forecasts. The model consists of a rough set based optimal weather parameter selection module and a fuzzy rule based classification module. The proposed fuzzy decision support model is compared with benchmark classification approaches. The fuzzy classification model used in the fuzzy decision support system is trained and tested using the reduct sets generated by the proposed maximum frequency weighted feature reduction technique. The optimal reduct set, comprising the weather parameters minimum temperature, relative humidity and solar radiation, achieved better prediction accuracy than the complete feature set and the other reducts. Most of the classification models showed better accuracy when trained using the selected subsets of the target input. Thorough evaluation of the proposed model revealed that coupling a fuzzy decision support system with rough set based pre-processing techniques was a better approach than traditional techniques. The experimental results showed the proposed rough fuzzy model to be a better approach for modeling short range rainfall forecasts.
© 2017 Growing Science Ltd. All rights reserved.


Introduction
Daily weather forecast reports play a significant role in strategic decision support in fields such as agriculture, aeronautics and marine engineering. Weather prediction mostly relies on the observatory records of previous events. Weather datasets usually consist of records of several features or parameters. Not every feature in the dataset influences the prediction outcome, and at times irrelevant features may even reduce the accuracy of the forecast. Thus, there is an ever growing need for new techniques in this field. Knowledge discovery is the process of retrieving hidden underlying knowledge from large databases. As a recent trend in data mining and knowledge discovery systems, various computational intelligence approaches have been implemented along with data mining. Intelligent computing systems have their own strengths and weaknesses, and a fusion of these techniques may at times worsen the model outcomes. The potential benefits of hybridizing intelligent systems depend on selecting suitable techniques that complement each other. In the proposed model, a sequential fusion of rough set and fuzzy set approaches is implemented to achieve better results.
The intent of this research is to investigate the potential of hybridizing rough set theory, data mining and fuzzy set theory for daily rainfall prediction. Pawlak introduced Rough Set Theory (RST) in 1982 as a mathematical model for dealing with vague and imperfect knowledge. Rough set theory does not require any prior knowledge or additional information about the data. In rough set theory, data analysis starts from a table, referred to as the decision table or information table of an information system. Rough set theory has a wide scope; it has been adopted in a wide range of scientific and medical applications (Pawlak et al., 1995), especially in the fields of pattern recognition, data mining, machine learning, process control and knowledge representation systems (Pawlak, 2002). Pawlak and Slowinski (1994) presented a tool for the analysis of vague descriptions of decision situations. The core is the indispensable element of rough set theory (Pawlak, 1997). Pawlak and Skowron (2007) stated that rough set theory can be viewed as a particular implementation of Frege's idea of vagueness.
According to rough set theory, an information system is a tuple $I = (Z, A, V, G)$, where $Z$ is the universal set that constitutes the domain objects of the system; $A = C \cup D$ is the feature set, which includes the condition features $C$ and the decision feature $D$; $V = \bigcup_{r \in A} V_r$ is the set of feature values, with $V_r$ the value range of feature $r \in A$; and $G: Z \times A \to V$ is the information function, which assigns each object its value for each feature. The two fundamental concepts of rough set theory are core and reduct (Pawlak, 2007). A reduct {R} is a minimal subset of features that can discern all objects as well as the complete feature set {A} does. The core is the set of indispensable features, that is, the features common to all reducts. Rough set theory is suitable for multi-criteria data analysis (Pawlak, 1982). Sudha and Valarmathi (2013) reaffirmed rough computing as an effective data reduction tool. Greco et al. (2001) defined the usage of rough computing for multi-criteria data analysis. Yao and Zhao (2009) described a reduct construction method based on discernibility matrix simplification as an approximate tool for attribute reduction. Shen and Jensen (2007) stated that feature reduction methods are categorized as filter, wrapper and hybrid approaches. Qablan (2012) described reduct computation based on evolutionary approaches. Sudha and Valarmathi (2014) described the importance of feature reduction in modeling multi-criteria decision support systems.
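The notions of reduct and core can be illustrated on a very small decision table. The following is a minimal brute-force sketch, not the genetic algorithm used in this work: the table values, the feature names `a`, `b`, `c` and the helper names are all hypothetical, and the table is assumed to be consistent. A subset of condition features is accepted when objects that agree on it always agree on the decision, and only minimal such subsets are kept as reducts.

```python
from itertools import combinations

# Toy decision table (hypothetical values): condition features a, b, c and decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 1, "d": 1},
    {"a": 1, "b": 0, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
]
CONDITIONS = ("a", "b", "c")
DECISION = "d"


def is_consistent(features, table=TABLE, decision=DECISION):
    """True if objects that agree on `features` always agree on the decision."""
    seen = {}
    for row in table:
        key = tuple(row[f] for f in features)
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False
    return True


def reducts(conditions=CONDITIONS):
    """All minimal consistent feature subsets, found by exhaustive search."""
    found = []
    for size in range(1, len(conditions) + 1):
        for subset in combinations(conditions, size):
            # Skip supersets of an already found reduct: they are not minimal.
            if any(set(r) <= set(subset) for r in found):
                continue
            if is_consistent(subset):
                found.append(subset)
    return found


def core(all_reducts):
    """Features shared by every reduct."""
    sets = [set(r) for r in all_reducts]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    found = reducts()
    print(found)        # [('a', 'b'), ('b', 'c')]
    print(core(found))  # {'b'}
```

Exhaustive search is exponential in the number of features, which is one reason heuristic or evolutionary reduct generation is preferred on larger feature sets.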
The proposed feature reduction model is constructed on the basis of rough set theory. It reduces the feature space of the rainfall dataset in order to identify the features that influence the prediction. A rough set based genetic algorithm approach has been adopted for feature reduct generation. Thirteen feature reducts were generated for the given input data using the Rosetta software. The significant features were then determined using the proposed maximum frequency weighted feature reduction technique. The detailed working model and the performance outcomes are discussed in the model development section. The optimal reduct set is determined on the basis of prediction accuracy, and the classification model that exhibits the highest prediction accuracy is taken as the most suitable model for this rainfall data. The last module deals with rule induction using fuzzy rule learning. Since its introduction (Zadeh, 1965), fuzzy set theory has been used in a wide variety of application fields, such as data analysis, system control, pattern recognition, soil evaluation, agricultural development and automotive applications. Because conventional non-fuzzy rules have "sharp boundaries", they produce an abrupt transition between the supports of classes. The proposed approach therefore uses fuzzy sets for rule induction instead of a conventional rule generation approach. A fuzzy round robin RIPPER algorithm (FR3), which is an extension of the RIPPER algorithm, is implemented for rule induction. Experimental analysis was carried out in the Knowledge Extraction based on Evolutionary Learning (KEEL) software environment. KEEL is an open source software tool used for data mining problems such as regression, classification, unsupervised learning and statistical classification. It includes evolutionary learning and fuzzy rule learning algorithms based on different approaches (Alcala-Fdez et al., 2008).
Al-Matarneh et al. (2014) proposed an automated weather forecasting model that predicts the daily temperature using two techniques, artificial neural networks and fuzzy logic. They developed different weather forecasting models based on the two techniques for different regions. Their model implements a feed forward neural network architecture trained using the back propagation algorithm. Experimental results revealed that the proposed models produce more accurate forecasts. Kusiak et al. (2013) examined the performance of the neural network multi-layer perceptron, random forest, classification and regression tree, support vector machine and k-nearest neighbor algorithms. From the results they concluded that data mining techniques are suitable for constructing prediction models for normal as well as time series radar data. The prediction model built with the multi-layer perceptron showed better accuracy in rainfall prediction than the other techniques. Olaiya and Adeyemo (2012) noted that artificial neural networks, decision trees, genetic algorithms, rule induction, the nearest neighbor method, memory-based reasoning, logistic regression and discriminant analysis have been extensively applied in predictive data analytics. In particular, the artificial neural network approach and tree pruning techniques are appropriate for precipitation prediction analysis. Lee et al. (2012) described a feature selection approach using a genetic algorithm for heavy rain prediction in South Korea, an approach similar to the evolutionary computation based feature reduction used here. They used weather data collected from the European Centre for Medium-Range Weather Forecasts between 1989 and 2009. Alcala-Fdez et al. (2011) described a fuzzy association rule-based classification model for high-dimensional problems with genetic rule selection and lateral tuning.
Talei et al. (2010) discussed neuro-fuzzy systems, an approach that combines two powerful artificial intelligence techniques, for rainfall runoff modeling. They evaluated the performance of the proposed model on roughly two years of rainfall and runoff data from 66 separate rainfall events for a sub-catchment of the Kranji basin in Singapore. The experimental results revealed that this adaptive neuro-fuzzy system performs better than the physically based Storm Water Management Model (SWMM). Dai and Xu (2013) discussed a new fuzzy based feature reduction approach for medical datasets for tumour diagnosis. They described a feature selection method based on the fuzzy gain ratio under the framework of fuzzy rough set theory, together with experimental results. Hasan et al. (2008) proposed a fuzzy inference model for rainfall prediction based on scan data from the Soil Climate Analysis Network Station at the Alabama Agricultural and Mechanical University (AAMU) campus for the year 2004. The experimental results showed that the percentage of error was small when the calculated amount of rainfall was compared with the actual amount of rainfall. Lee et al. (2007) developed a new method for temperature prediction based on mixed techniques, fuzzy logical relationships and genetic algorithms. Sharma and Manoria (2006) introduced a weather forecasting system based on a neuro-fuzzy system to predict meteorological conditions on the basis of weather system dimensions. They considered atmospheric pressure the most important parameter, followed by atmospheric temperature and relative humidity. Seo and Kim (2012) addressed the significance of rainfall prediction and the complications of knowledge representation from meteorological data. Li and Liu (2005) proposed a hybrid rough fuzzy neural network model for weather forecasting problems. They used data from the World Meteorological Organization for experimental analysis. In the learning stage of the fuzzy neural network, the rough sets method was introduced to determine the number of rules and the original weights, using the least square algorithm (LSA).
For any fuzzy inference model, knowledge acquisition is the main concern in building an expert system using fuzzy rules. Knowledge is discovered in the form of IF-THEN rules extracted from the given input data. Each rule has an antecedent part and a consequent part. The antecedent part is a collection of conditions connected by the AND, OR and NOT logic operators, and the consequent part represents the decision (Pant & Ashwagosh, 2004). Wong et al. (2003) developed a fuzzy rule-based rainfall prediction model and compared its performance with an established radial basis function network model. They concluded that fuzzy rule-based methods perform comparably to the established method. Still, the fuzzy rule-based method has the advantage of letting the forecaster understand the model better through its fuzzy rules. Consequently, the benefits of the fuzzy set model for predicting rainfall justify adopting FR3 in the proposed rough fuzzy computing approach for short term rainfall prediction. Liu et al. (2001) proposed a feature selection approach using a genetic algorithm to select the major features in their study, and the selected features were used for prediction using data mining. AliKhashashneh et al. (2013) described the Johnson reduction algorithm and the Object Reduct using Feature weighting technique (ORAW) for reduct computation. Both algorithms aim at reducing the number of features in the dataset based on the discernibility matrix.
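The difference between a conventional rule with sharp boundaries and a fuzzy IF-THEN rule can be sketched as follows. This is an illustration only: the triangular membership functions, their breakpoints, the crisp thresholds and the function names are assumptions made for the example and are not taken from this study. The AND connective is modeled with the minimum operator, which is one common choice.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (0 outside [left, right])."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets; the breakpoints are illustrative only.
def humidity_high(rh):
    """Degree to which relative humidity (percent) is 'high'."""
    return triangular(rh, 50.0, 100.0, 150.0)


def temperature_low(t):
    """Degree to which minimum temperature (degrees Celsius) is 'low'."""
    return triangular(t, 10.0, 18.0, 26.0)


def fuzzy_rule_rain(rh, t):
    """IF humidity is high AND temperature is low THEN rain (AND as minimum)."""
    return min(humidity_high(rh), temperature_low(t))


def crisp_rule_rain(rh, t):
    """Conventional rule with sharp boundaries, for comparison."""
    return 1.0 if rh >= 75.0 and t <= 22.0 else 0.0


if __name__ == "__main__":
    for rh in (74.9, 75.0):
        print(rh, crisp_rule_rain(rh, 22.0), round(fuzzy_rule_rain(rh, 22.0), 3))
```

A change of 0.1 in humidity flips the crisp rule from 0 to 1, while the firing degree of the fuzzy rule changes only slightly, which is the gradual transition that motivates fuzzy rule induction.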

Materials and Methods
A hybrid rough-fuzzy decision support system is developed for short range rainfall forecasting in the Coimbatore region. The model is evaluated using the observatory dataset recorded at Coimbatore, India for 29 years between 1984 and 2013. This rainfall dataset consists of eight atmospheric parameters, or features, labelled as follows: f1 → maximum temperature, f2 → minimum temperature, f3 → relative humidity1, f4 → relative humidity2, f5 → wind speed, f6 → solar radiation, f7 → sunshine, f8 → evapotranspiration. The decision variable RF = 1 denotes a rainfall day and RF = 0 a no-rain day. The raw dataset is pre-processed in order to improve the quality of the targeted input dataset, and the model is evaluated before and after feature reduction. The proposed rough fuzzy model for rainfall prediction, represented in Fig. 1, is designed to support effective meteorological data mining and decision support. It consists of a rough set weather parameter selection module and a fuzzy rule induction module. The rough set based feature selection algorithm generates an initial set of reducts for the given input dataset. The significance of each individual feature is then estimated to determine its weight, using a novel maximum frequency weighted feature reduction technique, in order to find the minimal significant reduct set.

Fig. 1. Intelligent Rough Fuzzy Rainfall Prediction Model
The minimal reduct set consists of a subset of the features in the complete feature list. The minimal reduct set that has the highest prediction index is referred to as the optimal reduct. The optimal reduct set is the final reduct set, containing the most influential parameters. By applying the rough set concepts of core and reduct, the features of the optimal reduct set are revealed as the significant weather parameters. The features in the optimal reduct are the recorded weather parameters that influence rainfall prediction. The performance of the optimal reduct set is evaluated using widely adopted classification models to identify the classification model best suited for rainfall prediction.

Rough Set based Weather Parameter Selection Module
The concepts of core and reduct in rough set theory are applied in the implementation of the proposed model. Among all possible subsets of features, the reduct sets are the indispensable parts of the complete feature list. A reduct set {Rf} can classify the input as well as the complete feature set {Cf}. The core is the indispensable feature of the reducts; the influencing parameter is defined as the core of {Cf}. Fig. 2 represents the proposed rough computing based feature reduction module. The significant weather parameter identification method applies the genetic algorithm approach in Rosetta, followed by maximum frequency weighted feature reduction.
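The idea of weighting features by how often they occur across the generated reducts can be sketched as follows. The exact procedure of the maximum frequency weighted feature reduction technique is not spelled out in this section, so this is only one plausible reading: the weight of a feature is assumed to be the fraction of reducts that contain it, and features at or above a threshold are retained. The reducts listed, the threshold of 0.5 and the function names are hypothetical and are not the thirteen reducts produced by Rosetta.

```python
from collections import Counter


def feature_weights(reducts):
    """Fraction of reducts in which each feature occurs (assumed weighting)."""
    counts = Counter(f for r in reducts for f in set(r))
    return {f: counts[f] / len(reducts) for f in counts}


def frequent_features(reducts, threshold=0.5):
    """Features whose weight is at least `threshold`, in sorted order."""
    weights = feature_weights(reducts)
    return sorted(f for f, w in weights.items() if w >= threshold)


# Hypothetical reducts, for illustration only.
HYPOTHETICAL_REDUCTS = [
    ("f1", "f2", "f5"),
    ("f2", "f5", "f7"),
    ("f2", "f3", "f7"),
    ("f5", "f7", "f8"),
]

if __name__ == "__main__":
    print(feature_weights(HYPOTHETICAL_REDUCTS))
    print(frequent_features(HYPOTHETICAL_REDUCTS))
```

With these made-up reducts, the features f2, f5 and f7 each occur in three of the four reducts and are retained, while the remaining features occur only once and are dropped.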

Fuzzy Inference Module
The fuzzy based rule induction module is shown in Fig. 3. The structural learning algorithm on vague environment (SLAVE-C) is an evolutionary approach that learns fuzzy rules iteratively, with feature selection based on a genetic algorithm (Gonzalez & Perez, 2001).

Fig. 3. Fuzzy Rule based Classification Model
SLAVE selects the significant features itself, but the search space becomes too large and the execution time is high when it is applied to high dimensional data. Therefore, the input for SLAVE-C is subjected to feature reduction using rough sets in order to improve the efficiency of SLAVE. Feature reduction is also part of this fuzzy classifier (Blum & Langley, 1997): the genetic algorithm works with individuals, each representing a single rule, that are composed of two structures. The first structure represents the relevance status of the variables involved in the rule, and the second represents the assigned values. With this structure, SLAVE-C generates a minimal rule set with maximal accuracy when a feature reduct is used as its input. According to Bardossy et al. (1995), fuzzy rule-based classification can be considered a suitable weather modeling approach. The evaluations were conducted using the structural learning algorithm on vague environment (SLAVE-C), the fuzzy round robin RIPPER (FR3) and the chi-RW fuzzy classifier in KEEL (Knowledge Extraction based on Evolutionary Learning) and Weka. Weka is a widely adopted machine learning and data mining software tool. The algorithms included in Weka can either be applied directly to a dataset from its own interface or be called from user defined Java code (Witten & Frank, 2005). The performance outcomes revealed that SLAVE-C was a suitable fuzzy based classifier for this rainfall prediction data. The prediction accuracy of the rule set induced by SLAVE-C is better than that of the other fuzzy classifier methods.
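The two-structure representation of a rule described above can be sketched as a small data structure. This is a deliberately simplified illustration and not the encoding used by SLAVE-C in KEEL: here each variable carries one relevance flag and one label index, and the linguistic labels, the class name `RuleIndividual` and its methods are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical linguistic labels shared by all variables.
LABELS = ("low", "medium", "high")


@dataclass
class RuleIndividual:
    relevant: List[bool]  # first structure: is variable i used in the rule?
    values: List[int]     # second structure: label index assigned to variable i
    consequent: str       # class predicted by the rule

    def antecedent(self, names: Sequence[str]) -> str:
        """Join the conditions of the relevant variables with AND."""
        parts = [
            f"{name} is {LABELS[value]}"
            for name, used, value in zip(names, self.relevant, self.values)
            if used
        ]
        return " AND ".join(parts)

    def describe(self, names: Sequence[str]) -> str:
        """Render the individual as a readable IF-THEN rule."""
        return f"IF {self.antecedent(names)} THEN {self.consequent}"


if __name__ == "__main__":
    rule = RuleIndividual(relevant=[True, False, True], values=[0, 1, 2], consequent="RF = 1")
    print(rule.describe(("f2", "f4", "f6")))
```

Keeping relevance separate from the assigned values lets a genetic operator switch a variable off without discarding its value, which is how feature selection can take place inside rule learning.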

Results and Discussions
Experimental results in Table 2 present the subsets of maximum frequency weighted feature reducts generated by the rough set weather parameter selection module. {f2f4f5f6f7} was determined as the maximum frequency weighted feature reduct. All possible combinations of MFR{} were then generated to study the influence of each weather parameter on prediction. Every subset was evaluated using six classifiers. Table 3 shows the experimental results in terms of the classification accuracy of the subsets with 4 features.
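The enumeration of candidate subsets of the maximum frequency weighted feature reduct can be sketched as follows. The feature labels come from the text above; the helper names and the scoring function used in the usage lines are hypothetical stand-ins, since the actual scores in this study are the classification accuracies obtained from the six classifiers.

```python
from itertools import combinations

# Maximum frequency weighted feature reduct reported in the text.
MFR = ("f2", "f4", "f5", "f6", "f7")


def subsets_of_size(features, size):
    """All feature subsets of the given size, in lexicographic order."""
    return list(combinations(features, size))


def best_subset(candidates, score):
    """Candidate with the highest score; `score` is supplied by the caller."""
    return max(candidates, key=score)


if __name__ == "__main__":
    for size in (4, 3, 2):
        print(size, len(subsets_of_size(MFR, size)))

    # Placeholder score for demonstration only; a real run would return the
    # cross-validated accuracy of a classifier trained on the subset.
    def placeholder_score(subset):
        return len(set(subset) & {"f2", "f4", "f6"})

    print(best_subset(subsets_of_size(MFR, 3), placeholder_score))
```

A reduct with five features yields 5 subsets of size 4, 10 of size 3 and 10 of size 2, so exhaustive evaluation remains inexpensive at this scale.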

Table 4a
Minimal reduct subset (3 features)
The experimental results of the subsets with 3 features are presented in Table 4 and Table 5. However, the subsets with fewer than 3 features show poor prediction accuracy. Therefore, on the basis of the experimental evaluation, reducts with at least 3 features were chosen.
The minimal feature reduct and the optimal feature reduct are the two core concepts of the proposed rough set based maximum frequency weighted feature reduction. This procedure made it possible to identify the most significant reduct set from the complete feature set, referred to as the optimal feature reduct.

Conclusions
Experimental results of the proposed intelligent rough fuzzy hybridization have revealed that the model is suitable for meteorological decision support. The model identified Minimum Temperature {f2}, Relative Humidity2 {f4} and Solar Radiation {f6} as the effective weather parameters for this rainfall forecast data. The experimental study has revealed that subsets containing at least one of these parameters gave better results than subsets that did not include any of these three parameters. The subsets containing two or three of these effective parameters have shown substantially good accuracy. It is concluded that the feature reduct subset {f2f4f6} is the optimal reduct of this input; this reduct obtained higher accuracy than any other subset. Similarly, the evaluation of fuzzy based classifiers for rule induction revealed that the structural learning algorithm on vague environment, with feature selection based on a genetic algorithm approach, is a suitable fuzzy rule learner for rainfall forecasting.
consists of evaluation of performance of the optimal reduct set using fuzzy based classification techniques. Structural learning algorithm on vague

Table 8
Experimental Outcome of Intelligent Rough Fuzzy Model