Constraint-Based Rule Mining in Large, Dense Databases

Bayardo, Roberto J.; Agrawal, Rakesh; Gunopulos, Dimitrios

doi:10.1023/A:1009895914772

Published: July 2000

Data Mining and Knowledge Discovery, Volume 4, pages 217–240 (2000)

Abstract

Constraint-based rule miners find all rules in a given dataset that meet user-specified constraints such as minimum support and confidence. We describe a new algorithm that directly exploits all user-specified constraints, including minimum support, minimum confidence, and a new constraint that ensures every mined rule offers a predictive advantage over any of its simplifications. Our algorithm maintains efficiency even at low supports on data that is dense (e.g., relational tables). Previous approaches such as Apriori and its variants exploit only the minimum support constraint and, as a result, are ineffective on dense data due to a combinatorial explosion of “frequent itemsets”.
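
The three constraints named in the abstract can be illustrated with a small brute-force sketch. This is not the algorithm the paper describes; it enumerates every antecedent, which is exactly the cost a constraint-exploiting miner is meant to avoid. It assumes that "predictive advantage over any of its simplifications" means the smallest confidence gain of a rule over any rule with the same consequent and a proper subset of its antecedent. The function names, the thresholds, and the toy rows are all hypothetical.

```python
from itertools import combinations


def support(rows, items):
    """Number of rows that contain every item in `items`."""
    items = frozenset(items)
    return sum(1 for row in rows if items <= row)


def confidence(rows, antecedent, consequent):
    """Fraction of rows matching the antecedent that also hold the consequent."""
    covered = support(rows, antecedent)
    if covered == 0:
        return 0.0
    return support(rows, set(antecedent) | {consequent}) / covered


def improvement(rows, antecedent, consequent):
    """Smallest confidence gain over any proper sub-rule (assumed definition)."""
    antecedent = tuple(antecedent)
    own = confidence(rows, antecedent, consequent)
    gains = [
        own - confidence(rows, sub, consequent)
        for k in range(len(antecedent))          # proper subsets only
        for sub in combinations(antecedent, k)   # k = 0 is the empty antecedent
    ]
    # An empty antecedent has no simplification to compare against.
    return min(gains) if gains else float("inf")


def mine(rows, consequent, min_sup, min_conf, min_imp):
    """Exhaustively list antecedents whose rule meets all three constraints."""
    items = sorted({i for row in rows for i in row} - {consequent})
    found = []
    for k in range(1, len(items) + 1):
        for ante in combinations(items, k):
            if support(rows, set(ante) | {consequent}) < min_sup:
                continue
            if confidence(rows, ante, consequent) < min_conf:
                continue
            if improvement(rows, ante, consequent) < min_imp:
                continue
            found.append(ante)
    return found


# Toy data: each row is a set of items; "c" plays the role of the consequent.
ROWS = [
    frozenset("abc"), frozenset("abc"), frozenset("ac"),
    frozenset("a"), frozenset("b"), frozenset("bc"),
]

print(mine(ROWS, "c", min_sup=2, min_conf=0.7, min_imp=0.1))  # [('a', 'b')]
```

In this toy data the rules a → c and b → c each pass the support and confidence thresholds (confidence 0.75), but they improve on the empty-antecedent rule (confidence 4/6) by only about 0.08, so the improvement threshold filters them out. Only {a, b} → c, with confidence 1.0 and improvement 0.25, survives.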

Author information

Authors and Affiliations

IBM Almaden Research Center, San Jose, CA, 95120, USA
Roberto J. Bayardo Jr
IBM Almaden Research Center, San Jose, CA, 95120, USA
Rakesh Agrawal
IBM Almaden Research Center, San Jose, CA, 95120, USA
Dimitrios Gunopulos

About this article

Cite this article

Bayardo, R.J., Agrawal, R. & Gunopulos, D. Constraint-Based Rule Mining in Large, Dense Databases. Data Mining and Knowledge Discovery 4, 217–240 (2000). https://doi.org/10.1023/A:1009895914772

Issue Date: July 2000
DOI: https://doi.org/10.1023/A:1009895914772
