Constraint-Based Rule Mining in Large, Dense Databases

Data Mining and Knowledge Discovery

Abstract

Constraint-based rule miners find all rules in a given data-set meeting user-specified constraints such as minimum support and confidence. We describe a new algorithm that directly exploits all user-specified constraints including minimum support, minimum confidence, and a new constraint that ensures every mined rule offers a predictive advantage over any of its simplifications. Our algorithm maintains efficiency even at low supports on data that is dense (e.g. relational tables). Previous approaches such as Apriori and its variants exploit only the minimum support constraint, and as a result are ineffective on dense data due to a combinatorial explosion of “frequent itemsets”.
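The three constraints named in the abstract can be made concrete with a small brute-force sketch. It is not the algorithm the paper describes, which exploits the constraints during search; it only enumerates candidates and checks each constraint afterwards. The function names, the toy rows, and the thresholds are illustrative assumptions, and "improvement" is read here as the smallest confidence gain of a rule over any rule with the same consequent and a proper subset of its antecedent.

```python
from itertools import combinations

def support(rows, items):
    """Number of rows that contain every item in `items`."""
    return sum(1 for row in rows if items <= row)

def confidence(rows, antecedent, consequent):
    """sup(antecedent + consequent) / sup(antecedent), or 0.0 if the antecedent never occurs."""
    covered = support(rows, antecedent)
    if covered == 0:
        return 0.0
    return support(rows, antecedent | {consequent}) / covered

def improvement(rows, antecedent, consequent):
    """Smallest confidence gain over any proper sub-antecedent, including the empty one.

    Assumes a non-empty antecedent.
    """
    conf = confidence(rows, antecedent, consequent)
    items = sorted(antecedent)
    return min(
        conf - confidence(rows, frozenset(sub), consequent)
        for k in range(len(items))
        for sub in combinations(items, k)
    )

def mine(rows, consequent, min_sup, min_conf, min_imp, max_len=3):
    """Enumerate antecedents up to max_len items and keep those meeting all three constraints."""
    universe = sorted(set().union(*rows) - {consequent})
    kept = []
    for k in range(1, max_len + 1):
        for combo in combinations(universe, k):
            ante = frozenset(combo)
            if support(rows, ante | {consequent}) < min_sup:
                continue
            if confidence(rows, ante, consequent) < min_conf:
                continue
            if improvement(rows, ante, consequent) < min_imp:
                continue
            kept.append(ante)
    return kept

# Hypothetical toy data: each row is a set of items, and "c" is the consequent.
rows = [frozenset(r) for r in (
    {"a", "b", "c"}, {"a", "b", "c"}, {"a", "c"}, {"a"}, {"b"}, {"b", "c"},
)]

rules = mine(rows, "c", min_sup=2, min_conf=0.7, min_imp=0.1)
print([sorted(r) for r in rules])  # prints [['a', 'b']]
```

On this toy data the rules a → c and b → c each reach confidence 0.75, but they gain only about 0.08 over the empty-antecedent baseline of 4/6, so the improvement threshold rejects them. The rule {a, b} → c has confidence 1.0 and improvement 0.25, so it is the only one kept. Enumeration of this kind grows exponentially with the number of items, which is the cost the abstract attributes to dense data.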

About this article

Cite this article

Bayardo, R.J., Agrawal, R. & Gunopulos, D. Constraint-Based Rule Mining in Large, Dense Databases. Data Mining and Knowledge Discovery 4, 217–240 (2000). https://doi.org/10.1023/A:1009895914772


  • DOI: https://doi.org/10.1023/A:1009895914772
