
An Efficient Algorithm for Proportionally Fault-Tolerant Data Mining

  • Conference paper
Advances in Web and Network Technologies, and Information Management (APWeb 2007, WAIM 2007)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4537))

Abstract

The frequent pattern mining problem has been studied for several years, but few works discuss fault-tolerant pattern mining. Fault-tolerant data mining extracts more interesting information from real-world data that may be polluted by noise. However, the few previous works either do not define the problem maturely or restrict it to finding patterns that tolerate a fixed number of faulty items. In this paper, the problem of mining proportionally fault-tolerant frequent patterns is discussed, and two algorithms are proposed to solve it. The first algorithm applies the FT-Apriori heuristic and finds all FT-patterns for every possible number of faults. The second algorithm divides all FT-patterns into several groups by their number of tolerable faults and mines the patterns contained in each group separately. The experimental results show that our approach extracts more potential fault-tolerant patterns. Our contribution is a different type of fault-tolerant frequent pattern, in which the number of tolerated faults is proportional to the length of the pattern. This gives users another choice when the results of traditional fault-tolerant frequent pattern mining cannot satisfy them.
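To make the idea of a proportional fault budget concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that a transaction "fault-tolerantly contains" a pattern when it misses at most floor(fault_ratio × pattern length) of the pattern's items. The names `tolerated_faults`, `ft_contains`, `ft_support`, and the parameter `fault_ratio` are hypothetical, and any further conditions the paper may impose (for example, per-item support thresholds) are left out.

```python
from math import floor

def tolerated_faults(pattern, fault_ratio):
    # Assumption: the fault budget grows with pattern length, rounded down.
    return floor(fault_ratio * len(pattern))

def ft_contains(transaction, pattern, fault_ratio):
    # A transaction matches if it misses no more items than the budget allows.
    missing = len(set(pattern) - set(transaction))
    return missing <= tolerated_faults(pattern, fault_ratio)

def ft_support(transactions, pattern, fault_ratio):
    # Count the transactions that fault-tolerantly contain the pattern.
    return sum(ft_contains(t, pattern, fault_ratio) for t in transactions)

# Toy data, made up for illustration only.
transactions = [
    {"a", "b", "c", "d"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
]

long_pattern = {"a", "b", "c", "d"}
short_pattern = {"a", "b"}

support_long = ft_support(transactions, long_pattern, 0.25)    # budget: 1 fault
support_short = ft_support(transactions, short_pattern, 0.25)  # budget: 0 faults
```

With a ratio of 0.25, the four-item pattern may miss one item while the two-item pattern must match exactly, which is the contrast with a fixed fault count that the abstract describes.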


Editor information

Kevin Chen-Chuan Chang, Wei Wang, Lei Chen, Clarence A. Ellis, Ching-Hsien Hsu, Ah Chung Tsoi, Haixun Wang

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, T. (2007). An Efficient Algorithm for Proportionally Fault-Tolerant Data Mining. In: Chang, K.CC., et al. Advances in Web and Network Technologies, and Information Management. APWeb WAIM 2007 2007. Lecture Notes in Computer Science, vol 4537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72909-9_74

  • DOI: https://doi.org/10.1007/978-3-540-72909-9_74

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72908-2

  • Online ISBN: 978-3-540-72909-9

  • eBook Packages: Computer Science, Computer Science (R0)
