Abstract
Frequent pattern mining has been studied for many years, but few works discuss fault-tolerant pattern mining. Fault-tolerant data mining extracts more interesting information from real-world data that may be polluted by noise. However, the few previous works either do not define the problem maturely or restrict it to finding patterns that tolerate a fixed number of faulty items. This paper discusses the problem of mining proportionally fault-tolerant frequent patterns and proposes two algorithms to solve it. The first algorithm applies the FT-Apriori heuristic and finds all FT-patterns for every possible number of faults. The second algorithm divides the FT-patterns into groups by their number of tolerable faults and mines the patterns contained in each group separately. The experimental results show that our approach extracts more potential fault-tolerant patterns. Our contribution is a different type of fault-tolerant frequent pattern, in which the number of tolerated faults is proportional to the length of the pattern. This gives users another choice when the results of traditional fault-tolerant frequent pattern mining cannot satisfy them.
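To make the idea of a proportional fault tolerance concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes that a pattern of length n tolerates floor(n * fault_ratio) missing items and that a transaction supports the pattern when it misses no more than that many items. The names `tolerated_faults`, `ft_support`, and `fault_ratio` are hypothetical, and any further constraints the paper may impose (for example a minimum support for each individual item) are omitted.

```python
from math import floor


def tolerated_faults(pattern_length, fault_ratio):
    """Number of missing items a pattern may tolerate (assumed rule: floor of length * ratio)."""
    return floor(pattern_length * fault_ratio)


def ft_support(transactions, pattern, fault_ratio):
    """Count transactions that contain the pattern with at most the tolerated number of faults."""
    pattern = set(pattern)
    max_faults = tolerated_faults(len(pattern), fault_ratio)
    count = 0
    for transaction in transactions:
        # A "fault" here is a pattern item that is absent from the transaction.
        missing = len(pattern - set(transaction))
        if missing <= max_faults:
            count += 1
    return count


# Toy data, invented purely for illustration.
transactions = [
    {"a", "b", "c", "d"},
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "c"},
    {"e"},
]

exact = ft_support(transactions, {"a", "b", "c", "d"}, 0.0)
tolerant = ft_support(transactions, {"a", "b", "c", "d"}, 0.25)
print(exact, tolerant)
```

With a ratio of 0.25, the four-item pattern tolerates one missing item while a two-item pattern tolerates none, so longer patterns are allowed more faults than shorter ones, in contrast to a fixed fault count.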
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, T. (2007). An Efficient Algorithm for Proportionally Fault-Tolerant Data Mining. In: Chang, K.CC., et al. Advances in Web and Network Technologies, and Information Management. APWeb WAIM 2007. Lecture Notes in Computer Science, vol 4537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72909-9_74
DOI: https://doi.org/10.1007/978-3-540-72909-9_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72908-2
Online ISBN: 978-3-540-72909-9
eBook Packages: Computer Science, Computer Science (R0)