Finding All Minimal Infrequent Multi-dimensional Intervals

Elbassioni, Khaled M.

doi:10.1007/11682462_40

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3887)

Included in the following conference series:

Latin American Symposium on Theoretical Informatics

Abstract

Let \({\mathcal D}\) be a database of transactions on \(n\) attributes, where each attribute specifies a (possibly empty) real closed interval \(I = [a,b] \subseteq {\mathbb R}\). Given an integer threshold \(t\), a multi-dimensional interval \(I = ([a_1,b_1], \ldots, [a_n,b_n])\) is called \(t\)-frequent if (every component interval of) \(I\) is contained in (the corresponding component of) at least \(t\) transactions of \({\mathcal D}\); otherwise, \(I\) is said to be \(t\)-infrequent. We consider the problem of generating all minimal \(t\)-infrequent multi-dimensional intervals for a given database \({\mathcal D}\) and threshold \(t\). This problem may arise, for instance, in the generation of association rules for a database of time-dependent transactions. We show that this problem can be solved in quasi-polynomial time. This is established by developing a quasi-polynomial time algorithm for generating maximal independent elements for a set of vectors in the product of lattices of intervals, a result which may be of independent interest. In contrast, the generation problem for maximal frequent intervals turns out to be NP-hard.
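The definition of t-frequency can be illustrated with a minimal Python sketch. It is not the paper's generation algorithm; it only counts how many transactions contain a given multi-dimensional interval. The names `contains`, `support`, and `is_frequent` are hypothetical. Representing the empty interval by `None`, and treating the empty interval as contained in every interval, are assumptions made here for illustration.

```python
from typing import Optional, Sequence, Tuple

# A closed interval [a, b]; None stands for the empty interval (an assumption).
Interval = Optional[Tuple[float, float]]
Box = Sequence[Interval]  # one interval per attribute


def contains(outer: Interval, inner: Interval) -> bool:
    """Return True if inner is a subset of outer."""
    if inner is None:   # the empty interval is contained in every interval
        return True
    if outer is None:   # a non-empty interval is never inside the empty one
        return False
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def support(database: Sequence[Box], box: Box) -> int:
    """Count the transactions that contain box in every component."""
    return sum(
        all(contains(o, i) for o, i in zip(transaction, box))
        for transaction in database
    )


def is_frequent(database: Sequence[Box], box: Box, t: int) -> bool:
    """A box is t-frequent if at least t transactions contain it."""
    return support(database, box) >= t


if __name__ == "__main__":
    # Three transactions on n = 2 attributes (made-up data).
    db = [
        ((0, 10), (0, 5)),
        ((2, 8), (1, 4)),
        ((3, 6), None),
    ]
    box = ((3, 5), (2, 3))
    print(support(db, box))         # 2
    print(is_frequent(db, box, 2))  # True
    print(is_frequent(db, box, 3))  # False
```

Shrinking any component of a box can only keep or increase its support, which is why infrequency is the property whose minimal elements are of interest.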

Author information

Authors and Affiliations

Max-Planck-Institut für Informatik, Saarbrücken, Germany
Khaled M. Elbassioni

Editor information

Editors and Affiliations

School of Business, Universidad Adolfo Ibáñez, Chile
José R. Correa
Dept. of Computer Science, University of Chile, Blanco Encalada 2120, 3er piso, Santiago, Chile
Alejandro Hevia
Dept. Ing. Matemática & Ctr. de Modelamiento Matemático, UMI 2807 U. Chile–CNRS, Chile
Marcos Kiwi

About this paper

Cite this paper

Elbassioni, K.M. (2006). Finding All Minimal Infrequent Multi-dimensional Intervals. In: Correa, J.R., Hevia, A., Kiwi, M. (eds) LATIN 2006: Theoretical Informatics. LATIN 2006. Lecture Notes in Computer Science, vol 3887. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11682462_40

DOI: https://doi.org/10.1007/11682462_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32755-4
Online ISBN: 978-3-540-32756-1
eBook Packages: Computer Science (R0)
