Abstract
Let \({\mathcal D}\) be a database of transactions on n attributes, where each attribute specifies a (possibly empty) real closed interval \(I = [a,b] \subseteq {\mathbb R}\). Given an integer threshold t, a multi-dimensional interval \(I = ([a_1,b_1], \ldots, [a_n,b_n])\) is called t-frequent if (every component interval of) I is contained in (the corresponding component of) at least t transactions of \({\mathcal D}\); otherwise, I is said to be t-infrequent. We consider the problem of generating all minimal t-infrequent multi-dimensional intervals for a given database \({\mathcal D}\) and threshold t. This problem may arise, for instance, in the generation of association rules for a database of time-dependent transactions. We show that this problem can be solved in quasi-polynomial time. This is established by developing a quasi-polynomial time algorithm for generating maximal independent elements for a set of vectors in the product of lattices of intervals, a result which may be of independent interest. In contrast, the generation problem for maximal frequent intervals turns out to be NP-hard.
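The definition of t-frequency above can be illustrated with a small sketch. It only checks whether a given multi-dimensional interval is t-frequent by counting supporting transactions; it is not the quasi-polynomial generation algorithm of the paper. The names `contains`, `support`, `is_frequent`, the use of `None` for the empty interval, and the toy database are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

# A closed interval [a, b] is a pair (a, b); None stands for the empty interval.
# This representation is an assumption of this sketch, not taken from the paper.
Interval = Optional[Tuple[float, float]]
Box = Sequence[Interval]  # one interval per attribute


def contains(outer: Interval, inner: Interval) -> bool:
    """Return True if the closed interval `inner` is a subset of `outer`."""
    if inner is None:   # the empty interval is contained in every interval
        return True
    if outer is None:   # a non-empty interval is not contained in the empty one
        return False
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def support(database: Sequence[Box], box: Box) -> int:
    """Count the transactions that contain `box` in every component."""
    return sum(
        all(contains(o, i) for o, i in zip(transaction, box))
        for transaction in database
    )


def is_frequent(database: Sequence[Box], box: Box, t: int) -> bool:
    """A box is t-frequent if at least t transactions contain it."""
    return support(database, box) >= t


# Toy database on n = 2 attributes (hypothetical data).
D = [
    ((0, 10), (0, 5)),
    ((2, 8), (1, 4)),
    ((3, 6), None),     # second attribute is the empty interval
]
query = ((3, 6), (2, 3))
```

Here `support(D, query)` counts the first two transactions only, since the third has an empty second component, so `query` is 2-frequent but 3-infrequent.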
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Elbassioni, K.M. (2006). Finding All Minimal Infrequent Multi-dimensional Intervals. In: Correa, J.R., Hevia, A., Kiwi, M. (eds) LATIN 2006: Theoretical Informatics. LATIN 2006. Lecture Notes in Computer Science, vol 3887. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11682462_40
DOI: https://doi.org/10.1007/11682462_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32755-4
Online ISBN: 978-3-540-32756-1
eBook Packages: Computer Science, Computer Science (R0)