Abstract
The FP-growth algorithm is an efficient algorithm for mining frequent patterns. It scans the database only twice and does not need to generate and test candidate sets, a step that is quite time-consuming. FP-growth outperforms previously developed algorithms. However, it must recursively generate a huge number of conditional FP-trees, which requires much more memory and costs more time.
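The two database scans mentioned above can be illustrated with a minimal sketch of FP-tree construction. It assumes transactions are given as lists of hashable items and that support is an absolute count; the names `FPNode` and `build_fp_tree` are hypothetical and chosen for illustration only. It does not include the header table or the recursive mining step.

```python
from collections import Counter


class FPNode:
    """One prefix-tree node: an item, its count, and links to parent/children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a plain FP-tree in two passes (illustrative helper, not from the paper)."""
    # Scan 1: count the support of every single item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    # Fix one global order: descending support, ties broken by item value.
    ordered = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(ordered)}

    # Scan 2: insert each transaction's frequent items along a shared prefix path.
    root = FPNode(None)
    for t in transactions:
        path = sorted((i for i in set(t) if i in rank), key=rank.__getitem__)
        node = root
        for item in path:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent
```

Because transactions that share a prefix of frequent items share tree nodes, the tree is usually much smaller than the database, and no candidate itemsets are enumerated while it is built.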
In this paper, we present an algorithm, CFPmine, that is inspired by several previous works and combines several advantages of existing techniques. First, it uses constrained subtrees of a compact FP-tree to mine frequent patterns, so that it does not need to construct conditional FP-trees during the mining process. Second, it uses an array-based technique to reduce the time spent traversing the CFP-tree. Third, it implements unified memory management. The experimental evaluation shows that CFPmine is a high-performance algorithm: it outperforms Apriori, Eclat, and FP-growth, and requires less memory than FP-growth.
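The abstract does not specify the layout of the array used by CFPmine. As a hedged illustration of the general idea behind array-based techniques in prefix-tree mining, the sketch below precomputes co-occurrence counts for pairs of frequent items, so that the items frequent together with a given item can be read from a table instead of being found by an extra tree traversal. The function names `pair_count_array` and `frequent_with` are hypothetical, and this is an assumption about the technique, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def pair_count_array(transactions, min_support):
    """Return the frequent items in rank order and a triangular pair-count array."""
    support = Counter(i for t in transactions for i in set(t))
    order = sorted((i for i in support if support[i] >= min_support),
                   key=lambda i: (-support[i], i))
    rank = {item: r for r, item in enumerate(order)}

    # counts[j][i], for i < j, is the number of transactions that contain
    # both order[i] and order[j]. Row j therefore has exactly j entries.
    counts = [[0] * j for j in range(len(order))]
    for t in transactions:
        ranks = sorted(rank[i] for i in set(t) if i in rank)
        for i, j in combinations(ranks, 2):
            counts[j][i] += 1
    return order, counts


def frequent_with(j, order, counts, min_support):
    """Items ranked before order[j] that co-occur with it at least min_support times."""
    return [order[i] for i, c in enumerate(counts[j]) if c >= min_support]
```

With such a table, the frequent items for the subproblem of one item are available by a single row lookup, which is the kind of saving in traversal time that the abstract attributes to its array-based technique.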
Copyright information
© 2005 International Federation for Information Processing
About this paper
Cite this paper
Qin, LX., Luo, P., Shi, ZZ. (2005). Efficiently Mining Frequent Itemsets with Compact FP-Tree. In: Shi, Z., He, Q. (eds) Intelligent Information Processing II. IIP 2004. IFIP International Federation for Information Processing, vol 163. Springer, Boston, MA. https://doi.org/10.1007/0-387-23152-8_51
DOI: https://doi.org/10.1007/0-387-23152-8_51
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-23151-8
Online ISBN: 978-0-387-23152-5
eBook Packages: Computer Science, Computer Science (R0)