An Efficient Algorithm for Proportionally Fault-Tolerant Data Mining

Chen, Tianding

doi:10.1007/978-3-540-72909-9_74

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4537)

Abstract

The frequent pattern mining problem has been studied for several years, but few works discuss fault-tolerant pattern mining. Fault-tolerant data mining extracts more interesting information from real-world data, which may be polluted by noise. However, the few previous works either do not define the problem maturely or restrict it to finding patterns that tolerate a fixed number of faulty items. In this paper, the problem of mining proportionally fault-tolerant frequent patterns is discussed, and two algorithms are proposed to solve it. The first algorithm applies the FT-Apriori heuristic and finds all FT-patterns for every possible number of faults. The second algorithm divides all FT-patterns into several groups by their number of tolerable faults and mines the content patterns of each group separately. The experimental results show that our approach extracts more potential fault-tolerant patterns. Our contribution is a different type of fault-tolerant frequent pattern, in which the number of tolerated faults is proportional to the length of the pattern. This gives the user another choice when the results of traditional fault-tolerant frequent pattern mining cannot satisfy them.
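
The idea of a fault tolerance that grows with pattern length can be illustrated with a small sketch. The abstract does not give formal definitions, so everything below is an assumption made for illustration: the rule floor(ratio × pattern length) for the number of tolerated faults, the notion that a transaction "FT-contains" a pattern when it misses at most that many of its items, and the hypothetical names `tolerated_faults`, `ft_contains`, `ft_support`, and `group_by_faults`. It is not the paper's FT-Apriori-based algorithm, only a brute-force picture of the quantity being mined and of the grouping step described for the second algorithm.

```python
from math import floor


def tolerated_faults(pattern, ratio):
    """Number of missing items allowed for a pattern.

    Assumed rule: proportional to the pattern length, rounded down.
    """
    return floor(ratio * len(pattern))


def ft_contains(transaction, pattern, ratio):
    """True if the transaction misses at most the tolerated number of items."""
    missing = len(set(pattern) - set(transaction))
    return missing <= tolerated_faults(pattern, ratio)


def ft_support(transactions, pattern, ratio):
    """Count the transactions that FT-contain the pattern."""
    return sum(ft_contains(t, pattern, ratio) for t in transactions)


def group_by_faults(patterns, ratio):
    """Group patterns by their number of tolerable faults."""
    groups = {}
    for p in patterns:
        groups.setdefault(tolerated_faults(p, ratio), []).append(p)
    return groups


# Toy data, invented for this example only.
transactions = [
    {"a", "b", "c", "d"},
    {"a", "b", "c"},
    {"a", "b"},
    {"d"},
]
pattern = ("a", "b", "c", "d")

exact = ft_support(transactions, pattern, ratio=0.0)      # no faults allowed
tolerant = ft_support(transactions, pattern, ratio=0.25)  # 1 fault per 4 items
groups = group_by_faults(
    [("a",), pattern, ("a", "b", "c", "d", "e", "f", "g", "h")], ratio=0.25
)
```

With a ratio of 0.25, the four-item pattern tolerates one missing item, so the second transaction also counts toward its support, while a single-item pattern tolerates none. A fixed fault count would instead apply the same allowance to patterns of every length, which is the restriction the abstract attributes to earlier work.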

Author information

Authors and Affiliations

Institute of Communications and Information Technology, Zhejiang Gongshang University, Hangzhou 310035
Tianding Chen

Editor information

Kevin Chen-Chuan Chang, Wei Wang, Lei Chen, Clarence A. Ellis, Ching-Hsien Hsu, Ah Chung Tsoi, Haixun Wang

About this paper

Cite this paper

Chen, T. (2007). An Efficient Algorithm for Proportionally Fault-Tolerant Data Mining. In: Chang, K.CC., et al. Advances in Web and Network Technologies, and Information Management. APWeb WAIM 2007. Lecture Notes in Computer Science, vol 4537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72909-9_74

DOI: https://doi.org/10.1007/978-3-540-72909-9_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72908-2
Online ISBN: 978-3-540-72909-9
eBook Packages: Computer Science, Computer Science (R0)
