Hybrid decision tree and naïve Bayes classifiers for multi-class classification tasks
Introduction
Over the past decade, computational intelligence researchers have proposed a large number of data mining algorithms for solving real-world classification and clustering problems (Farid et al., 2013, Liao et al., 2012, Ngai et al., 2009). Classification is a data mining function that describes and distinguishes data classes or concepts; its goal is to accurately predict the class labels of instances whose attribute values are known but whose class values are unknown. Clustering is the task of grouping a set of instances so that instances within a cluster are highly similar to one another but very dissimilar to instances in other clusters. It analyzes instances without consulting a known class label, grouping them on the principle of maximizing intraclass similarity and minimizing interclass similarity. The performance of data mining algorithms in most cases depends on the quality of the dataset, since low-quality training data may lead to overfitting or fragile classifiers. Data preprocessing techniques, which prepare the data for mining, are therefore needed; they can improve the quality of the data and thereby the accuracy and efficiency of the mining process. Common preprocessing techniques include (a) data cleaning: removal of noisy data, (b) data integration: merging data from multiple sources, (c) data transformation: normalization of data, and (d) data reduction: reducing the data size by aggregating and eliminating redundant features.
This paper presents two independent hybrid algorithms for scaling up the classification accuracy of decision tree (DT) and naïve Bayes (NB) classifiers in multi-class classification problems. The DT is a classification tool commonly used in data mining; well-known DT induction algorithms include ID3 (Quinlan, 1986), ID4 (Utgoff, 1989), ID5 (Utgoff, 1988), C4.5 (Quinlan, 1993), C5.0 (Bujlow, Riaz, & Pedersen, 2012), and CART (Breiman, Friedman, Stone, & Olshen, 1984). The goal of a DT is to create a model that predicts the value of the target class for an unseen test instance based on several input features (Loh and Shih, 1997, Safavian and Landgrebe, 1991, Turney, 1995). Compared with other data mining methods, DTs have several advantages: they are (a) simple to understand, (b) easy to implement, (c) in need of little prior knowledge, (d) able to handle both numerical and categorical data, (e) robust, and (f) able to deal with large and noisy datasets. The NB classifier is a simple probabilistic classifier based on (a) Bayes' theorem, (b) a strong (naïve) independence assumption, and (c) independent feature models (Farid et al., 2011, Farid et al., 2010, Lee and Isa, 2010). It is an important classifier applied in many real-world classification problems because of its high classification performance. Like the DT, the NB classifier has several advantages: it is (a) easy to use, (b) requires only one scan of the training data, (c) handles missing attribute values, and (d) handles continuous data.
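To make the naïve independence assumption concrete, the sketch below shows a minimal categorical NB classifier in Python. It is an illustrative implementation with Laplace smoothing, not code from the paper: the prediction picks the class c that maximizes log P(c) + Σ_j log P(x_j | c).

```python
from collections import Counter, defaultdict
import math

def train_nb(X, y):
    # Estimate class priors P(c) and per-attribute value counts for P(x_j = v | c).
    priors = Counter(y)
    cond = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = [len({row[j] for row in X}) for j in range(len(X[0]))]
    for row, c in zip(X, y):
        for j, v in enumerate(row):
            cond[(j, c)][v] += 1
    return priors, cond, domains, len(y)

def predict_nb(x, priors, cond, domains, n):
    # Argmax over classes of log P(c) + sum_j log P(x_j | c),
    # with Laplace smoothing so unseen attribute values get nonzero probability.
    best, best_lp = None, float("-inf")
    for c, cc in priors.items():
        lp = math.log(cc / n)
        for j, v in enumerate(x):
            lp += math.log((cond[(j, c)][v] + 1) / (cc + domains[j]))
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

Working in log space avoids numerical underflow when many attribute likelihoods are multiplied, which is the standard trick for NB implementations.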
In this paper, we propose two hybrid algorithms, one for a DT classifier and one for a NB classifier, for multi-class classification tasks. The first proposed hybrid DT algorithm uses a NB classifier to find the troublesome instances in the training data and removes them from the training set before constructing the learning tree for decision making; otherwise, the DT may overfit such noisy instances and its accuracy may decrease. Moreover, computing the naïve assumption of class conditional independence with a NB classifier is computationally expensive for a dataset with many attributes. Our second proposed hybrid NB algorithm therefore uses DT induction to find the most crucial subset of attributes and calculates weights for the attributes selected by the DT; only these most important attributes, with their corresponding weights, are then employed in the calculation of the naïve assumption of class conditional independence. We evaluate the performance of the proposed hybrid algorithms against existing DT and NB classifiers using classification accuracy, precision, sensitivity–specificity analysis, and 10-fold cross validation on 10 real benchmark datasets from the UCI (University of California, Irvine) machine learning repository (Frank & Asuncion, 2010). The experimental results show that the proposed methods produce very promising results on challenging real-world multi-class classification problems. These methods also allow us to automatically extract the most representative, high-quality training instances and to identify the most important attributes for characterizing instances in a large amount of noisy training data with high-dimensional attributes.
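One plausible way to realize the attribute-selection step of the second hybrid algorithm is sketched below, assuming information gain (the splitting criterion of ID3/C4.5) as the ranking measure and normalized gains as the attribute weights; the paper's exact weighting scheme may differ. The returned (attribute, weight) pairs would then scale each attribute's log-likelihood term, w_j · log P(x_j | c), in the NB sum.

```python
from collections import Counter
import math

def entropy(labels):
    # Shannon entropy of a class-label list, in bits.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(X, y, j):
    # Information gain of attribute j, as used for DT split selection:
    # H(y) minus the weighted entropy of y after partitioning on attribute j.
    by_val = {}
    for row, lab in zip(X, y):
        by_val.setdefault(row[j], []).append(lab)
    n = len(y)
    return entropy(y) - sum(len(labs) / n * entropy(labs) for labs in by_val.values())

def select_and_weight(X, y, k):
    # Rank attributes by information gain; keep the top k, with
    # normalized gains serving as illustrative attribute weights.
    gains = sorted(((info_gain(X, y, j), j) for j in range(len(X[0]))), reverse=True)
    total = sum(g for g, _ in gains[:k]) or 1.0
    return [(j, g / total) for g, j in gains[:k]]
```

On a toy dataset where attribute 0 perfectly separates the classes and attribute 1 is pure noise, `select_and_weight(X, y, 1)` keeps attribute 0 with weight 1.0.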
The rest of the paper is organized as follows. Section 2 gives an overview of the work related to DT and NB classifiers. Section 3 introduces the basic DT and NB classification techniques. Section 4 presents our two proposed hybrid algorithms for multi-class classification problems, based on DT and NB classifiers respectively. Section 5 provides experimental results and a comparison against existing DT and NB algorithms using 10 real benchmark datasets from the UCI machine learning repository. Finally, Section 6 concludes with the findings and directions for future work.
Related work
In this section, we review recent research on decision trees and naïve Bayes classifiers for various real world multi-class classification problems.
Supervised classification
Classification is one of the most popular data mining techniques that can be used for intelligent decision making. In this section, we discuss some basic techniques for data classification using decision tree and naïve Bayes classifiers. Table 1 summarizes the most commonly used symbols and terms throughout the paper.
The proposed hybrid learning algorithms
In this paper, we have proposed two independent hybrid algorithms, one for the decision tree classifier and one for the naïve Bayes classifier, to improve classification accuracy in multi-class classification tasks. These algorithms are described in Sections 4.1 and 4.2, respectively. Algorithm 1 describes the proposed hybrid DT induction, which employs a NB classifier to remove any noisy instances from the training data before the decision tree is constructed.
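The noise-removal step of Algorithm 1 can be sketched as follows; this is an assumed, minimal realization, not the paper's exact procedure: a Laplace-smoothed categorical NB is trained on the full training set, and any instance it misclassifies is dropped before the tree is induced.

```python
from collections import Counter
import math

def nb_predict(x, X, y):
    # Compact categorical NB with Laplace smoothing, trained on (X, y).
    n = len(y)
    priors = Counter(y)
    domains = [len({row[j] for row in X}) for j in range(len(x))]
    best, best_lp = None, float("-inf")
    for c, cc in priors.items():
        lp = math.log(cc / n)
        for j, v in enumerate(x):
            match = sum(1 for row, lab in zip(X, y) if lab == c and row[j] == v)
            lp += math.log((match + 1) / (cc + domains[j]))
        if lp > best_lp:
            best, best_lp = c, lp
    return best

def remove_noisy(X, y):
    # Keep only instances that the NB classifier, trained on all the data,
    # labels correctly; the cleaned set would then be passed to DT induction.
    kept = [(xi, yi) for xi, yi in zip(X, y) if nb_predict(xi, X, y) == yi]
    return [xi for xi, _ in kept], [yi for _, yi in kept]
```

For example, given eight consistently labeled instances plus one instance whose label contradicts the rest, the contradictory instance is the one filtered out.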
Experiments
In this section, we describe the test datasets and experimental environments, and present the evaluation results for both of the proposed hybrid decision tree and naïve Bayes classifiers.
Conclusions
In this paper, we have proposed two independent hybrid algorithms for DT and NB classifiers. The proposed methods improved the classification accuracy of both DT and NB classifiers in multi-class classification tasks. The first proposed hybrid DT algorithm used a NB classifier to remove the noisy, troublesome instances from the training set before DT induction, while the second proposed hybrid NB classifier used DT induction to select a subset of attributes, with corresponding weights, for the computation of the naïve assumption of class conditional independence.
Acknowledgment
We appreciate the support for this research received from the European Union (EU) sponsored (Erasmus Mundus) cLINK (Centre of Excellence for Learning, Innovation, Networking and Knowledge) project (Grant No. 2645).
References
A co-evolving decision tree classification method. Expert Systems with Applications (2008).
Classification by clustering decision tree-like classifier based on adjusted clusters. Expert Systems with Applications (2011).
Robust approach for estimating probabilities in naïve Bayesian classifier for gene expression data. Expert Systems with Applications (2011).
Moving towards efficient decision tree construction. Information Sciences (2009).
Fuzzifying Gini index based decision trees. Expert Systems with Applications (2009).
Feature selection for text classification with naïve Bayes. Expert Systems with Applications (2009).
Using decision trees to summarize associative classification rules. Expert Systems with Applications (2009).
Partition-conditional ICA for Bayesian classification of microarray data. Expert Systems with Applications (2010).
An adaptive ensemble classifier for mining concept drifting data streams. Expert Systems with Applications (2013).
Decision tree induction using a fast splitting attribute selection for large datasets. Expert Systems with Applications (2011).
Extended naïve Bayes classifier for mixed data. Expert Systems with Applications.
A network intrusion detection system based on a hidden naïve Bayes multiclass classifier. Expert Systems with Applications.
Automatically computed document dependent weighting factor facility for naïve Bayes classification. Expert Systems with Applications.
Data mining techniques and applications – a decade review from 2000 to 2011. Expert Systems with Applications.
Application of data mining techniques in customer relationship management: A literature review and classification. Expert Systems with Applications.
A novel hybrid intelligent method based on C4.5 decision tree classifier and one-against-all approach for multi-class classification problems. Expert Systems with Applications.
Job performance prediction in a call center using a naïve Bayes classifier. Expert Systems with Applications.