Age Prediction and Performance Comparison by Adaptive Network based Fuzzy Inference System using Subtractive Clustering

Abstract: To integrate the best features of fuzzy systems and neural networks, a data mining approach based on the Adaptive Network based Fuzzy Inference System (ANFIS) is applied to all features of the Abalone and Monk's problem datasets. The main aim of this research is to reduce the root mean square error (RMSE) with fewer rules, in order to achieve higher speed and lower time consumption in both the learning and application phases. To obtain a low RMSE, an adaptive fuzzy inference system with subtractive clustering is proposed. The input space is partitioned effectively and the data are loaded into the ANFIS editor. A fuzzy inference system is generated using subtractive clustering, and the training and testing RMSE are calculated by a hybrid approach that combines back propagation and the least squares method. The resulting structure shows the input and output data along with the number of fuzzy rules. The lower RMSE obtained with fewer rules shows that ANFIS is well suited for age prediction of abalone and for performance comparison of learning algorithms.


I. INTRODUCTION
The architecture and learning procedure underlying ANFIS, a fuzzy inference system implemented in the framework of adaptive networks, is presented. Using a hybrid approach, the proposed ANFIS can construct an input-output mapping based on both human knowledge (in the form of fuzzy if-then rules) and stipulated input-output data pairs. ANFIS offers a good tradeoff between neural and fuzzy systems, providing both smoothness and adaptability. The objective of this research is to reduce the RMSE with fewer rules, in order to achieve higher speed and lower time consumption in both the learning and application phases.

Neural networks and fuzzy systems are both stand-alone systems, and ANFIS is one of the neuro-fuzzy models that combine them. As the complexity of the process being modeled increases, so does the difficulty of developing dependable fuzzy rules and membership functions. This has led to the development of another approach, mostly known as the ANFIS approach. The hybrid system named ANFIS was proposed by Jang (1993). It has the benefits of both fuzzy logic [Junhong Nie & Derek Linkens, 1998] and neural networks [James A. Anderson, 2002]. Fuzzy inference in this system is realized with the aid of a training algorithm, which makes it possible to tune the parameters of the fuzzy system.

In this paper, the abalone dataset [archive.ics.uci.edu/ml/datasets.html] and the Monk's problem dataset [archive.ics.uci.edu/ml/datasets.html] are first divided into training and testing data, which are then loaded into the ANFIS editor. After the training and testing data are loaded, a fuzzy inference system is generated using subtractive clustering. The network is trained using a hybrid approach that combines back propagation and the least squares method, and the RMSE is calculated. The training RMSE is then noted against the testing RMSE. Finally, a structure is obtained that shows the number of inputs and outputs along with the number of fuzzy rules. The system is developed in the MATLAB® V7.9.0.529 (R2009b) [MATLAB] environment.
generate fuzzy if-then rules directly from training patterns with no time-consuming tuning procedures. Shibendu Shekhar Roy (2005) proposed ANFIS for predicting the surface roughness in a turning operation for a given set of cutting parameters. Two different membership functions, triangular and bell shaped, were adopted during the training process of ANFIS in order to compare the prediction accuracy of surface roughness obtained with each. Sean N. Ghazavi & Thunshun W. Liao (2008) presented a study of medical data mining that involves eleven feature selection methods and three fuzzy modeling methods; the objective is to determine which combination of feature selection and fuzzy modeling method has the best performance for a given dataset. Essam Al-Daoud (2010) proposed a modified fuzzy c-means radial basis function network that needs three rules to obtain a classification rate of 97%.
Pejman Tahmasebi & Ardeshir Hezarkhani (2010) applied the adaptive neuro-fuzzy inference system as a new technique for copper grade estimation in the Sarcheshmeh porphyry copper system. Fuzzy logic was introduced by Zadeh (2010) for handling uncertain and imprecise knowledge in real-world applications. It has proved to be a powerful tool for decision-making and for handling and manipulating imprecise and noisy data. Mehdi Khashei et al. (2011) proposed a hybrid approach that combines artificial neural networks with fuzzy logic, in order to benefit from the unique advantages of fuzzy logic and the classification power of artificial neural networks and to construct an efficient and accurate hybrid classifier for situations where little data is available. Jesmin Nahar et al. (2012) presented a rule extraction experiment on heart disease data using different rule mining algorithms (Apriori, Predictive Apriori and Tertius). Further rule-mining-based analysis was undertaken by categorising the data based on gender and significant risk factors for heart disease. Castanho et al. (2012) proposed a fuzzy expert system for predicting the pathological stage of prostate cancer, with the fuzzy rules and membership functions tuned by a genetic algorithm. The approach reached better precision than some related studies.

III. FUZZY NEURO SYSTEMS
This section describes the fuzzy inference system along with the ANFIS architecture.

Fuzzy Inference System
A fuzzy inference system is composed of five functional blocks: a rule base containing a number of fuzzy if-then rules; a database which defines the membership functions; a decision-making unit which performs the inference; a fuzzification interface which transforms the crisp inputs into degrees of match with linguistic values; and a defuzzification interface which transforms the fuzzy results of the inference into a crisp output.
The steps of inference operations upon fuzzy if-then rules (fuzzy reasoning) performed by fuzzy inference systems are:
1. To obtain the membership values (or compatibility measures) of each linguistic label, compare the input variables with the membership functions on the premise part. This step is known as fuzzification.
2. To get the firing strength (weight) of each rule, combine the membership values on the premise part through a specific T-norm operator, usually multiplication or min.
3. Depending on the firing strength, generate the qualified consequent (either fuzzy or crisp) of each rule.
4. To produce a crisp output, aggregate the qualified consequents. This step is known as defuzzification.
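The four inference steps can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the experiments in this paper: the Gaussian membership functions, the two-rule base `RULES`, the constant consequents, and the weighted-average aggregation.

```python
import math

def gauss_mf(x, center, sigma):
    """Gaussian membership function (an assumed shape for this sketch)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

# Hypothetical rule base over two inputs x and y.
# Each rule: ((center, sigma) for x, (center, sigma) for y, crisp consequent z).
RULES = [
    ((0.0, 1.0), (0.0, 1.0), 1.0),  # IF x is LOW  and y is LOW  THEN z = 1
    ((2.0, 1.0), (2.0, 1.0), 3.0),  # IF x is HIGH and y is HIGH THEN z = 3
]

def infer(x, y, t_norm=lambda a, b: a * b):
    """Run the four fuzzy reasoning steps and return a crisp output."""
    strengths, qualified = [], []
    for (cx, sx), (cy, sy), z in RULES:
        # Step 1: fuzzification of each input against the premise MFs.
        mu_x = gauss_mf(x, cx, sx)
        mu_y = gauss_mf(y, cy, sy)
        # Step 2: firing strength through a T-norm (product by default).
        w = t_norm(mu_x, mu_y)
        strengths.append(w)
        # Step 3: qualified (crisp) consequent of the rule.
        qualified.append(w * z)
    # Step 4: aggregate the qualified consequents into one crisp value
    # (weighted average; assumes at least one rule fires).
    return sum(qualified) / sum(strengths)

product_output = infer(1.0, 1.0)
min_output = infer(1.0, 1.0, t_norm=min)
```

Passing `t_norm=min` switches step 2 from multiplication to the min operator without changing the other steps.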

ANFIS Architecture
To facilitate learning and adaptation, ANFIS is a fuzzy Sugeno model put in the framework of adaptive systems. The Sugeno fuzzy model was proposed by Takagi & Sugeno in an effort to formalize a systematic approach to generating fuzzy rules from an input-output data set.
There are broadly three types of fuzzy reasoning models: Mamdani fuzzy models, Sugeno fuzzy models (TSK model) and Tsukamoto fuzzy models. The Mamdani and Sugeno fuzzy models are the most widely used. The Sugeno model is widely used in ANFIS because its rules are tunable based on input parameters.
A typical fuzzy rule in a Sugeno fuzzy model has the format

IF x is A and y is B THEN z = f(x, y),

where A and B are fuzzy sets in the antecedent and z = f(x, y) is a crisp function in the consequent. Usually f(x, y) is a polynomial in the input variables x and y, but it can be any other function that appropriately describes the output of the system within the fuzzy region specified by the antecedent of the rule. If f(x, y) is a first-order polynomial, the model is called the first-order Sugeno fuzzy model. If f is a constant, it is called the zero-order Sugeno fuzzy model, which can be viewed either as a special case of the Mamdani fuzzy inference system, where each rule's consequent is specified by a fuzzy singleton, or as a special case of Tsukamoto's fuzzy model, where each rule's consequent is specified by a membership function in the form of a step function centered at the constant. Moreover, a zero-order Sugeno fuzzy model is functionally equivalent to a radial basis function network under certain minor constraints.

For two inputs x and y and two rules (i = 1, 2), the ANFIS architecture consists of the following five layers.

Layer 1: Each node i in this layer generates the membership grade of a linguistic label. For instance, the node function of the i-th node may be a generalized bell membership function:

$$\mu_{A_i}(x) = \frac{1}{1 + \left| \frac{x - c_i}{a_i} \right|^{2 b_i}}$$

where {a_i, b_i, c_i} is the premise parameter set that determines the shape of the membership function.

Layer 2: The nodes in this layer are fixed. Each node calculates the firing strength of a rule via multiplication:

$$w_i = \mu_{A_i}(x) \, \mu_{B_i}(y), \quad i = 1, 2$$

Layer 3: The nodes are fixed nodes. They are labeled N, indicating that they normalize the firing strengths from the previous layer. The outputs of this layer can be represented as:

$$\bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2$$

Layer 4: The nodes in this layer are adaptive. The output of each node is simply the product of the normalized firing strength and a first-order polynomial (for a first-order Sugeno model). Thus, the outputs of this layer are given as:

$$O_i^4 = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i)$$

where {p_i, q_i, r_i} is the consequent parameter set.

Layer 5: There is only one single fixed node, labeled Σ. This node performs the summation of all incoming signals:

$$O^5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$
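As an illustration of the five layers, the following Python sketch computes one forward pass of a two-input, two-rule first-order Sugeno ANFIS. It assumes the usual generalized bell form 1 / (1 + |(x - c) / a|^(2b)) for the Layer 1 membership functions. The function names and all parameter values are hypothetical and chosen only so that the result is easy to check by hand.

```python
def gbell(x, a, b, c):
    """Generalized bell membership function with premise parameters a, b, c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, premise_x, premise_y, consequent):
    """One forward pass through the five ANFIS layers.

    premise_x, premise_y: one (a, b, c) tuple per rule, for inputs x and y.
    consequent: one (p, q, r) tuple per rule.
    """
    # Layer 1: membership grades of the linguistic labels.
    mu_a = [gbell(x, *params) for params in premise_x]
    mu_b = [gbell(y, *params) for params in premise_y]
    # Layer 2: firing strength of each rule via multiplication.
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: normalized strength times the first-order polynomial.
    o4 = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: summation of all incoming signals.
    return sum(o4)

# Hypothetical parameters for illustration only.
premise_x = [(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)]
premise_y = [(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)]
consequent = [(1.0, 1.0, 0.0), (2.0, 2.0, 1.0)]
output = anfis_forward(1.0, 1.0, premise_x, premise_y, consequent)
```

At x = y = 1 both rules fire equally in this example, so the output is the plain average of the two rule polynomials.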

IV. METHODS
This section covers the ANFIS learning method and subtractive clustering.

ANFIS Learning Method
Neural networks use the back propagation algorithm to learn, that is, to adjust the weights on the connections between neurons from input-output training samples. The ANFIS learning algorithm consists of adjusting the premise and consequent parameters.
In the ANFIS structure, the premise and consequent parameters play the role of weights. Specifically, the shape of the membership functions in the "IF" part of the rules is determined by a finite number of parameters. These parameters are known as premise parameters, whereas the parameters in the "THEN" part of the rules are referred to as consequent parameters.
For ANFIS, a combination of back propagation and Least Square Estimation (LSE) is used. Back propagation is used to learn the premise parameters, and LSE is used to determine the parameters in the rules' consequents. Each step of the learning procedure has two passes, a forward pass and a backward pass. In the forward pass, node outputs go forward and the premise parameters remain fixed while the consequent parameters {pi, qi, ri} are estimated by the least squares method. In the backward pass, the error signals are propagated backwards and the consequent parameters remain fixed while back propagation is used to modify the premise parameters {ai, bi, ci}. This combination of least squares and back propagation is used to train the FIS membership function parameters to model a given set of input/output data. The performance of the system is evaluated using the root mean square error (RMSE), that is, the difference between the FIS output and the training/testing data output, which is defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left( y_k - \hat{y}_k \right)^2}$$

where N is the number of data pairs, y_k is the output of the k-th training or testing sample and \hat{y}_k is the corresponding FIS output.
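The forward pass of the hybrid procedure and the RMSE evaluation can be sketched in Python as follows. This is only a sketch under stated assumptions: it uses a single input and two rules to keep the linear system small, generalized bell membership functions of the usual form, and synthetic noise-free data generated from hypothetical parameters (`premise`, `true_theta`). It solves the normal equations directly, whereas a practical implementation would more likely use a numerically robust or recursive least squares routine. The backward pass (gradient updates of the premise parameters) is not shown.

```python
import math

def gbell(x, a, b, c):
    """Generalized bell membership function."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def design_row(x, premise):
    """With the premise fixed, the output is linear in the consequents:
    the row is [w1_bar * x, w1_bar, w2_bar * x, w2_bar, ...]."""
    w = [gbell(x, *params) for params in premise]
    total = sum(w)
    row = []
    for wi in w:
        row += [wi / total * x, wi / total]
    return row

def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * sol[k] for k in range(i + 1, n))
        sol[i] = (m[i][n] - tail) / m[i][i]
    return sol

def fit_consequents(xs, ys, premise):
    """Forward pass: least squares estimate of {p_i, r_i}, premise held fixed."""
    rows = [design_row(x, premise) for x in xs]
    n = len(rows[0])
    # Normal equations (A^T A) theta = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve(ata, aty)

def predict(x, premise, theta):
    return sum(c * t for c, t in zip(design_row(x, premise), theta))

def rmse(targets, outputs):
    """Root mean square error between data outputs and FIS outputs."""
    n = len(targets)
    return math.sqrt(sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n)

# Synthetic check with hypothetical parameters (not data from this paper).
premise = [(1.0, 2.0, 1.0), (1.0, 2.0, 3.0)]   # fixed {a_i, b_i, c_i}
true_theta = [2.0, 1.0, -1.0, 4.0]             # [p_1, r_1, p_2, r_2]
xs = [0.2 * k for k in range(21)]
ys = [predict(x, premise, true_theta) for x in xs]
theta = fit_consequents(xs, ys, premise)
train_rmse = rmse(ys, [predict(x, premise, theta) for x in xs])
```

Because the synthetic targets are generated by the same model structure, the fitted model should reproduce them almost exactly. On real data the training RMSE would be nonzero and would be compared against the testing RMSE.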

Subtractive Clustering
Subtractive clustering is used when there is no clear idea of how many clusters there should be for a given set of data. It operates by finding the optimal data point to define as a cluster center, based on the density of surrounding data points. All data points within the radius distance of this point are then removed in order to determine the next data cluster and its center. This process is repeated until all of the data is within the radius distance of a cluster center. The method is used for rule generation when the number of inputs is large, and it gives optimized rules by taking the specified radii into account. It is a fast, one-pass algorithm for estimating the number of clusters and the cluster centers in a set of data. It generates the FIS structure by scatter partitioning. Here the rules are predetermined by fixing the number of centers, and the membership functions are assigned automatically by the software. Reducing the number of rules therefore also reduces the number of tuning parameters.
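The procedure described above can be sketched in Python as follows. The sketch follows the simplified description in this section (select the densest point, remove everything within the radius, repeat). The Gaussian density weight exp(-4 d^2 / r^2) is an assumption borrowed from the common potential-based formulation of subtractive clustering, which lowers the potential of nearby points instead of deleting them. The function names and sample points are hypothetical.

```python
import math

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def subtractive_clustering(points, radius):
    """Return cluster centers chosen by local density.

    Simplified variant: after a center is selected, every point within
    `radius` of it is removed before the next center is sought.
    """
    remaining = list(points)
    centers = []
    r2 = radius ** 2
    while remaining:
        # Density of a candidate: closeness-weighted count of remaining points.
        def density(p):
            return sum(math.exp(-4.0 * dist2(p, q) / r2) for q in remaining)
        center = max(remaining, key=density)
        centers.append(center)
        # Drop the center and all points inside its radius.
        remaining = [q for q in remaining if dist2(center, q) > r2]
    return centers

# Two well separated groups of hypothetical 2-D points.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers = subtractive_clustering(points, radius=1.0)
```

Each center found this way would seed one fuzzy rule, so a larger radius yields fewer centers and therefore fewer rules.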

Abalone Dataset
The Abalone dataset contains 4177 entries, each of which records the features of an abalone together with its age as the desired output. It contains 8 features describing an abalone's physical measurements, with no missing data, and 28 classes corresponding to ages from 1 to 29 years. The age of an abalone is determined by cutting the shell through the cone, staining it and counting the number of rings through a microscope, which is a tedious and time-consuming task. Other measurements, which are easier to obtain, are used to predict the age.

The data are partitioned into training and testing sets, and the number of epochs is kept fixed. The nonlinear (premise) parameters are held fixed during the forward pass and updated by steepest descent during the backward pass. The linear (consequent) parameters are estimated by least squares during the forward pass and held fixed during the backward pass. The lower RMSE, obtained with fewer rules on the training and testing data of the abalone and Monk's problem datasets, shows that ANFIS is well suited for age prediction of abalone and for performance comparison of various learning algorithms.
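The partitioning of the data into training and testing sets can be sketched in Python as follows. The 70/30 split ratio, the fixed seed and the function name are assumptions for illustration; the proportions actually used in the experiments are not stated here. Integer row indices stand in for the 4177 dataset entries.

```python
import random

def split_train_test(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows reproducibly and split them into two disjoint sets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in for the 4177 abalone entries (row indices only).
rows = list(range(4177))
train, test = split_train_test(rows)
```

Fixing the seed keeps the partition identical across runs, so RMSE values obtained with different rule counts remain comparable.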

VI. CONCLUSION
An adaptive fuzzy inference system with neural learning using subtractive clustering is proposed and evaluated by its RMSE. Effective partitioning of the input space together with subtractive clustering decreases the number of rules and increases the speed of both the learning and application phases. The system also provides smoothness due to fuzzy control interpolation and adaptability due to neural network back propagation. A lower RMSE with fewer rules results in less time consumed and better performance, along with higher speed in the learning and application phases of ANFIS. This shows that ANFIS is well suited for age prediction and for performance comparison of learning algorithms. In future work, the datasets can be preprocessed with various techniques before being applied to ANFIS.

Table 1: Results on Abalone Dataset

Table 2: Results on Monk's Problem Dataset