DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints

Bucilă, Cristian; Gehrke, Johannes; Kifer, Daniel; White, Walker

doi:10.1023/A:1024076020895

Published: July 2003

Data Mining and Knowledge Discovery, Volume 7, pages 241–272 (2003)

Abstract

Recently, constraint-based mining of itemsets for questions like “find all frequent itemsets whose total price is at least $50” has attracted much attention. Two classes of constraints, monotone and antimonotone, have been very useful in this area. There exist algorithms that efficiently take advantage of either one of these two classes, but no previous algorithms can efficiently handle both types of constraints simultaneously. In this paper, we present DualMiner, the first algorithm that efficiently prunes its search space using both monotone and antimonotone constraints. We complement a theoretical analysis and proof of correctness of DualMiner with an experimental study that shows the efficacy of DualMiner compared to previous work.
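
The two constraint classes named in the abstract can be illustrated with a small sketch. This is not DualMiner itself, only a brute-force illustration of the example query "find all frequent itemsets whose total price is at least $50". The item prices, the transactions, and the support threshold are hypothetical values chosen for the example, and all function names are assumptions. Minimum support is treated as the antimonotone constraint (if an itemset fails it, every superset fails too), and the price threshold as the monotone one (if an itemset satisfies it, every superset does too, assuming non-negative prices).

```python
from itertools import combinations

# Hypothetical toy data: item prices and a handful of transactions.
PRICES = {"A": 10, "B": 20, "C": 30, "D": 40}
TRANSACTIONS = [
    {"A", "B", "C"},
    {"B", "C", "D"},
    {"A", "C", "D"},
    {"B", "C"},
]
MIN_SUPPORT = 2  # assumed support threshold (antimonotone constraint)
MIN_PRICE = 50   # price threshold from the abstract's example query (monotone)


def support(itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def is_frequent(itemset):
    # Antimonotone: an infrequent itemset has only infrequent supersets.
    return support(itemset) >= MIN_SUPPORT


def is_expensive(itemset):
    # Monotone (prices are non-negative): once satisfied, all supersets satisfy it.
    return sum(PRICES[i] for i in itemset) >= MIN_PRICE


def all_itemsets(items):
    """Yield every non-empty subset of `items` as a frozenset."""
    items = sorted(items)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def naive_answer():
    """Brute force: test both constraints on every non-empty itemset."""
    return {s for s in all_itemsets(PRICES) if is_frequent(s) and is_expensive(s)}


def check_closure():
    """Verify the two closure properties on the toy data."""
    sets = list(all_itemsets(PRICES))
    for s in sets:
        for t in sets:
            if s < t:  # s is a proper subset of t
                assert not is_frequent(t) or is_frequent(s)    # antimonotone
                assert not is_expensive(s) or is_expensive(t)  # monotone
    return True


if __name__ == "__main__":
    print(sorted(sorted(s) for s in naive_answer()))
    print(check_closure())
```

On this toy data the itemsets {B, C} and {C, D} are the only ones that satisfy both constraints. The brute-force search tests all 15 non-empty itemsets; an algorithm that exploits the closure properties checked in `check_closure` can skip supersets of an infrequent set and accept supersets of an expensive set without re-evaluating the price constraint.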

Cristian Bucilă is a Ph.D. student at Cornell University. He received his Bachelor's degree in computer science at the Technical University of Cluj-Napoca, Romania. His current research interests are in Data Mining.
Johannes Gehrke is an Assistant Professor in the Department of Computer Science at Cornell University. He obtained his Ph.D. in computer science from the University of Wisconsin-Madison in 1999. Johannes’ research interests are in the areas of data mining and novel distributed database technology. Johannes has received a National Science Foundation Career Award, an Alfred P. Sloan Fellowship, an IBM Faculty Award, and the Cornell College of Engineering James and Mary Tien Excellence in Teaching Award. He co-authored the textbook “Database Management Systems” (McGraw-Hill, currently in its third edition).
Daniel Kifer is a Ph.D. student at Cornell University. He received a Bachelor's degree in mathematics and in computer science at New York University. His current research interests are Databases and Data Mining.
Walker White is an assistant professor in the mathematics department at the University of Dallas, a liberal arts college, where he is responsible for developing their new computer science program. He received his Bachelor's degree in mathematics from Dartmouth College and both a Ph.D. in mathematics and Master's in computer science from Cornell University. His primary research is in mathematical logic and its applications to computer science.

Author information

Authors and Affiliations

Department of Computer Science, Cornell University, USA
Cristian Bucilă, Johannes Gehrke & Daniel Kifer
Department of Mathematics, University of Dallas, USA
Walker White

About this article

Cite this article

Bucilă, C., Gehrke, J., Kifer, D. et al. DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints. Data Mining and Knowledge Discovery 7, 241–272 (2003). https://doi.org/10.1023/A:1024076020895

Issue Date: July 2003
DOI: https://doi.org/10.1023/A:1024076020895
