Further Pruning for Efficient Association Rule Discovery

Zhang, Songmao; Webb, Geoffrey I.

doi:10.1007/3-540-45656-2_52

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2256)

Included in the following conference series:

Australian Joint Conference on Artificial Intelligence

Abstract

The frequent itemset approach of the Apriori algorithm has become the standard way to discover association rules. However, its computational requirements are prohibitive for dense data, and it cannot discover infrequent associations. OPUS_AR is an efficient algorithm for association rule discovery that does not use frequent itemsets and hence avoids these problems. It can reduce search time by applying additional constraints on the search space as well as constraints on itemset frequency. However, the efficiency of its search depends on the effectiveness of the pruning rules applied during search. This paper presents and analyses pruning rules for use with OPUS_AR. We demonstrate that OPUS_AR is feasible for a number of datasets for which the frequent itemset approach is infeasible, and that the new pruning rules can reduce compute time by more than 40%.
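
To illustrate how a frequency constraint can prune a search space, the following is a minimal Python sketch of a depth-first enumeration of candidate antecedents that discards a branch as soon as its coverage falls below a threshold. It is not OPUS_AR and does not implement the pruning rules presented in the paper; it only shows the general principle that adding items to an itemset can never increase the proportion of records that contain it. The names `coverage`, `search_antecedents`, `min_coverage`, `max_size` and the toy data in `DEMO` are assumptions introduced for this illustration.

```python
from typing import FrozenSet, List, Sequence, Tuple

Itemset = FrozenSet[str]


def coverage(itemset: Itemset, transactions: Sequence[Itemset]) -> float:
    """Proportion of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def search_antecedents(
    transactions: Sequence[Itemset],
    min_coverage: float,
    max_size: int,
) -> List[Tuple[Itemset, float]]:
    """Enumerate itemsets depth first, pruning on a coverage threshold.

    Hypothetical helper for illustration only. Each itemset is visited
    at most once because a branch may only add items that come later
    in a fixed ordering of the items.
    """
    items = sorted(set().union(*transactions))
    results: List[Tuple[Itemset, float]] = []

    def expand(current: Itemset, remaining: List[str]) -> None:
        for i, item in enumerate(remaining):
            candidate = current | {item}
            cov = coverage(candidate, transactions)
            if cov < min_coverage:
                # No superset of `candidate` can have higher coverage,
                # so the whole branch below it is skipped.
                continue
            results.append((candidate, cov))
            if len(candidate) < max_size:
                expand(candidate, remaining[i + 1:])

    expand(frozenset(), items)
    return results


# Toy data (an assumption, not taken from the paper's experiments).
DEMO = [
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"a", "b", "c"}),
    frozenset({"b"}),
]
found = search_antecedents(DEMO, min_coverage=0.5, max_size=3)
```

On this toy data the itemset {b, c} falls below the threshold, so nothing beneath it is explored. The abstract indicates that OPUS_AR can also apply constraints other than itemset frequency, which this sketch does not attempt to model.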

Author information

Authors and Affiliations

School of Computing and Mathematics, Deakin University, 3217, Geelong, Victoria, Australia
Songmao Zhang & Geoffrey I. Webb

Editor information

Editors and Affiliations

School of Computer and Information Science, University of South Australia, Mawson Lakes, 5095, SA, Australia
Markus Stumptner & Dan Corbett
Department of Computer Science, University of Adelaide, 5001, Adelaide, SA, Australia
Mike Brooks

About this paper

Cite this paper

Zhang, S., Webb, G.I. (2001). Further Pruning for Efficient Association Rule Discovery. In: Stumptner, M., Corbett, D., Brooks, M. (eds) AI 2001: Advances in Artificial Intelligence. AI 2001. Lecture Notes in Computer Science, vol 2256. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45656-2_52

DOI: https://doi.org/10.1007/3-540-45656-2_52
Published: 14 February 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42960-9
Online ISBN: 978-3-540-45656-8
eBook Packages: Springer Book Archive