Exception rules in association rule mining
Introduction
Exception rule mining has attracted considerable research interest [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. Exception rules have been defined as rules with low support and high confidence [4]. A classic example is the rule Champagne ⇒ Caviar. Both items are expensive, so they appear in few transactions and the rule has low support; but they are always bought together, so the rule has high confidence. Exception rules therefore provide valuable knowledge about database patterns.
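As an illustration of the low-support, high-confidence pattern, the following minimal sketch computes both measures for Champagne ⇒ Caviar. The transaction list is invented for the example, and the helper names `support` and `confidence` are assumptions, not code from the paper.

```python
from typing import FrozenSet, Sequence

Transaction = FrozenSet[str]


def support(transactions: Sequence[Transaction], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    transactions: Sequence[Transaction],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> float:
    """Support of the whole rule divided by support of the antecedent."""
    sup_antecedent = support(transactions, antecedent)
    if sup_antecedent == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / sup_antecedent


# Invented toy data: only 2 of 20 baskets contain champagne,
# and both of those baskets also contain caviar.
transactions = (
    [frozenset({"champagne", "caviar"})] * 2
    + [frozenset({"bread", "milk"})] * 18
)

rule_support = support(transactions, frozenset({"champagne", "caviar"}))
rule_confidence = confidence(
    transactions, frozenset({"champagne"}), frozenset({"caviar"})
)
print(rule_support, rule_confidence)
```

On this toy data the rule covers only a tenth of the baskets, yet every basket with champagne also contains caviar, which is the combination the definition in [4] describes.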
This paper presents exception rule mining based on association rules in databases. Exception rules describe unusual, contradictory knowledge in the database. We explore the interconnection between exception rules and association rules, and generate the exception rules from knowledge about the association rules in the database. We consider that association rules may exist in the form of positive association as well as negative association [13], [14], [15]. Since exception rules are the opposite of association rules, they too exist in the form of negative as well as positive association. We propose a novel exceptionality measure to evaluate exception rules; the exceptions with high exceptionality are the reliable exception rules.
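To illustrate what a negative association looks like in terms of counts, the sketch below evaluates a rule of the form X ⇒ ¬Y. It assumes the common convention that a transaction supports X ⇒ ¬Y when it contains all of X but not all of Y, so that supp(X ∪ ¬Y) = supp(X) − supp(X ∪ Y). The records and the function name `negative_rule_stats` are invented for illustration.

```python
from typing import FrozenSet, Sequence, Tuple


def negative_rule_stats(
    transactions: Sequence[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> Tuple[float, float]:
    """Return (support, confidence) of the negative rule antecedent => NOT consequent."""
    n = len(transactions)
    with_x = [t for t in transactions if antecedent <= t]
    # Assumed convention: "NOT consequent" means the consequent is not fully present.
    x_not_y = [t for t in with_x if not (consequent <= t)]
    sup = len(x_not_y) / n if n else 0.0
    conf = len(x_not_y) / len(with_x) if with_x else 0.0
    return sup, conf


# Invented toy data: 10 credit applications.
records = (
    [frozenset({"jobless"})] * 3                 # jobless, credit not granted
    + [frozenset({"jobless", "granted"})] * 1    # jobless, credit granted
    + [frozenset({"employed", "granted"})] * 5   # employed, credit granted
    + [frozenset({"employed"})] * 1              # employed, credit not granted
)

neg_support, neg_confidence = negative_rule_stats(
    records, frozenset({"jobless"}), frozenset({"granted"})
)
print(neg_support, neg_confidence)
```

Here the negative rule jobless ⇒ ¬granted holds for three of the four jobless applicants, while the single remaining record is the kind of case a positive exception rule would describe.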
The significance of exception rules has been highlighted in a number of research works [4], [7], [8], [9], [10], [16]. Something that contradicts a user’s common belief is bound to be interesting. Hussain et al. [4] state that “exceptions can take an important role in making critical decisions”. Most researchers, however, focus on association rules, which represent common phenomena that occur with high support and confidence. As a result, exceptions are still foreign to many users, despite their value and their important role in decision making.
Liu et al. [16] maintain that “reliable exceptions are unknown, unexpected, or contradictory to what the user believes. Hence, they are novel and potentially more interesting than strong patterns to the user”. For example, the rule ‘jobless applicants are granted credit’ will be more novel than the rule ‘jobless applicants are not granted credit’. They stress that “an exception rule is often beneficial since it differs from a common sense rule, which is often a basis for people’s daily activity”.
The examples above demonstrate the importance of exception rules in data mining. Association rule mining and exception rule mining discover different kinds of rules: association rules present commonsense knowledge, whereas exception rules represent surprising and unusual facts in the data.
The rest of this paper is organized as follows. Section 2 summarizes existing work on exception rules. Section 3 describes exception rules in detail. Section 4 presents our proposed exceptionality measure. Section 5 describes our proposed algorithm and explains the proposed methods with a detailed example. Section 6 presents our experimental results, and Section 7 gives the conclusions.
Section snippets
Existing work
Existing work on the discovery of exception rules can be classified as either (i) directed or (ii) undirected. A directed search obtains a set of exception rules each of which contradicts a user-specified belief [5], [17], [18]. An undirected search obtains a set of pairs of an exception rule and a general rule [4], [8], [9], [10], [16].
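As a rough illustration of a directed search, the sketch below filters a list of mined rules down to those that contradict a user-specified belief. The notion of contradiction used here (the rule's antecedent covers the belief's antecedent, the consequent is the same, and the sign is opposite) is an assumption made for the example, not the definition used in [5], [17], [18]. The `Rule` type, the function names, and the sample rules are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Rule:
    antecedent: FrozenSet[str]
    consequent: str
    negated: bool = False  # True means "antecedent => NOT consequent"


def contradicts(rule: Rule, belief: Rule) -> bool:
    """Assumed test: same consequent, opposite sign, antecedent covers the belief's."""
    return (
        belief.antecedent <= rule.antecedent
        and rule.consequent == belief.consequent
        and rule.negated != belief.negated
    )


def directed_search(mined: List[Rule], beliefs: List[Rule]) -> List[Rule]:
    """Keep only the mined rules that contradict at least one belief."""
    return [r for r in mined if any(contradicts(r, b) for b in beliefs)]


# Hypothetical belief: jobless applicants are not granted credit.
belief = Rule(frozenset({"jobless"}), "credit_granted", negated=True)

# Hypothetical mined rules.
mined = [
    Rule(frozenset({"jobless"}), "credit_granted", negated=True),
    Rule(frozenset({"jobless", "has_guarantor"}), "credit_granted"),
    Rule(frozenset({"employed"}), "credit_granted"),
]

exceptions = directed_search(mined, [belief])
for rule in exceptions:
    print(sorted(rule.antecedent), "=>", rule.consequent)
```

An undirected search differs in that no belief is supplied up front; the general rule that plays the role of `belief` would itself have to be discovered from the data.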
In a directed search of exception rules, user-specified beliefs are obtained first. Each of the discovered exception rules contradicts the user-supplied
Motivating examples
The general definition of an exception is ‘something unusual, something that does not conform to the rule, or something deviating from the norm’. The key terms in this general definition of an exception are the words rule and norm. Therefore, to discover an exception in the given environment, we need to know what the common rules or the norm are in the given environment. Before we start searching for exceptions in a database, we have to discover the strong rules in the database, whether they be
Proposed exceptionality measure
We propose a novel measure to distinguish the reliable exception rules from all other positive and negative exception rules. We name the novel measure the exceptionality measure. The minimum exceptionality minexcep is specified by a user along with the minimum support value minsup and minimum confidence value minconf. Exceptionality of an exception rule ExcRule given the corresponding association rule AssocRule is defined by the formula below:
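The formula itself is not reproduced in this excerpt, so the following sketch does not compute exceptionality. It only illustrates how the three user-specified thresholds minsup, minconf and minexcep might be applied once an exceptionality score is available for each candidate. The comparison directions (support below minsup, confidence and exceptionality at or above their thresholds) are an assumption based on the description of exception rules as low-support, high-confidence rules; the `CandidateRule` type and all numeric values are invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CandidateRule:
    description: str
    support: float
    confidence: float
    exceptionality: float  # assumed to be computed elsewhere


def reliable_exceptions(
    candidates: List[CandidateRule],
    minsup: float,
    minconf: float,
    minexcep: float,
) -> List[CandidateRule]:
    """Keep candidates with low support, high confidence and high exceptionality."""
    return [
        r
        for r in candidates
        if r.support < minsup          # assumed: exceptions are infrequent
        and r.confidence >= minconf
        and r.exceptionality >= minexcep
    ]


# Invented candidates with made-up scores.
candidates = [
    CandidateRule("A", support=0.02, confidence=0.95, exceptionality=0.8),
    CandidateRule("B", support=0.30, confidence=0.90, exceptionality=0.9),
    CandidateRule("C", support=0.03, confidence=0.40, exceptionality=0.7),
    CandidateRule("D", support=0.01, confidence=0.85, exceptionality=0.2),
]

kept = reliable_exceptions(candidates, minsup=0.1, minconf=0.7, minexcep=0.5)
print([r.description for r in kept])
```

With these made-up thresholds only candidate A survives: B is too frequent, C is not confident enough, and D falls below the exceptionality threshold.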
Algorithm and examples
In this section, we present the algorithm for mining reliable exception rules. The reliable exception rules generated by the algorithm are the exception rules with high exceptionality. The algorithm is then followed by a walk-through example.
Performance evaluation
The test database was downloaded from the UCI Repository of machine learning databases [26]. It is the Intrusion Detection database, formerly the KDD Cup 1999 data, which is used to distinguish attacks on the network from other database records. The database represents the parameters of a network over a period of time. The original database included 40 parameters and a vast number of records. Most of the parameters are continuous. In our experimentation, the simplified model of 10
Conclusion and future work
In this paper, a novel method has been developed for mining exception rules. The interconnection between strong positive and negative association rules and exception rules is explored: an exception rule is formed if it contradicts a strong rule and also satisfies the exceptionality measure. The proposed exceptionality measure, whose desired performance has been demonstrated, is used to evaluate the candidate exception rules.
In the future, we are going to consider temporal exceptions, which
References (33)
- et al., The impact of load balancing to object-oriented query execution scheduling in parallel machine environment, Information Sciences (2003)
- et al., Query execution scheduling in parallel object-oriented databases, Information and Software Technology (1999)
- et al., Performance analysis of parallelization models for path expression queries, Information Sciences (1999)
- et al., Performance evaluation of the object-relational transformation methodology, Data and Knowledge Engineering (2001)
- et al., A methodology for transforming inheritance relationships in an object-oriented conceptual model to relational tables, Information and Software Technology (2000)
- et al., Parallel database sorting, Information Sciences (2002)
- et al., Global parallel indexing for multi-processors database systems, Information Sciences (2004)
- Learning rules and their exceptions, Journal of Machine Learning Research (2002)
- B. Grosof, T. Poon, SweetDeal: representing agent contracts with exceptions using XML rules, ontologies, and process...
- et al., Discovering actionable patterns in event data, IBM Systems Journal (2002)
- Finding interesting patterns using user expectations, IEEE Transactions on Knowledge and Data Engineering
- Scheduled discovery of exception rules, Discovery Science, Lecture Notes in Artificial Intelligence
- In pursuit of interesting patterns with undirected discovery of exception rules, Progress in Discovery Science, Lecture Notes in Computer Science
- Undirected discovery of interesting exception rules, International Journal of Pattern Recognition and Artificial Intelligence
Cited by (66)
OOIMASP: Origin based association rule mining with order independent mostly associated sequential patterns
2018, Expert Systems with Applications
Citation excerpt: Association rules having low support and high confidence are exception rules. Taniar, Rahayu, Lee, and Daly (2008) proposed a novel approach to find exception rules. In the first phase, candidate exception rules are generated and based on exceptionality measure, the final ruleset is computed.
Needle in a Haystack: Generating Audit Hypotheses for Clinical Audits of Hospitals
2022, SN Computer Science
Big data fuzzy C-means algorithm based on bee colony optimization using an Apache Hbase
2021, Journal of Big Data
Development of a framework for preserving the disease-evidence-information to support efficient disease diagnosis
2021, International Journal of Data Warehousing and Mining
Error checking of large land quality databases through data mining based on low frequency associations
2020, Land Degradation and Development