Sequential pattern mining -- approaches and algorithms

Published: 12 March 2013

Abstract

Sequences of events, items, or tokens occurring in an ordered metric space appear often in data, and the requirement to detect and analyze frequent subsequences is a common problem. Sequential Pattern Mining arose as a subfield of data mining to focus on this problem. This article surveys the approaches and algorithms proposed to date.
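
To make the notion of a "frequent subsequence" concrete, the following is a minimal illustrative sketch and not one of the algorithms covered by the survey. It assumes each sequence is a simple list of single items (no itemsets or timestamps) and that a pattern is frequent when it occurs, in order and with gaps allowed, in at least min_support sequences. The function names (is_subsequence, support, frequent_subsequences) and the max_length cap are hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if the items of pattern occur in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # 'item in it' advances the iterator, so later items must appear after earlier ones.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Count how many sequences in the database contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_subsequences(database, min_support, max_length=3):
    """Brute-force enumeration of frequent subsequences up to max_length items.

    This is exponential in sequence length and is only meant to illustrate
    the problem definition on tiny inputs.
    """
    counts = Counter()
    for seq in database:
        seen = set()
        for k in range(1, max_length + 1):
            # combinations() preserves input order, so each tuple is a subsequence.
            seen.update(combinations(seq, k))
        # Each sequence contributes at most once to a pattern's support.
        counts.update(seen)
    return {p: c for p, c in counts.items() if c >= min_support}


# Toy database of four event sequences (assumed data for illustration only).
database = [list("abcd"), list("acd"), list("abd"), list("bc")]
frequent = frequent_subsequences(database, min_support=3)
```

On this toy database, with a minimum support of 3, the only frequent pattern longer than one item is ("a", "d"), which appears in the first three sequences. Practical approaches avoid this exhaustive enumeration, for example by pruning candidates whose sub-patterns are already infrequent.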

    • Published in

      ACM Computing Surveys  Volume 45, Issue 2
      February 2013
      417 pages
      ISSN:0360-0300
      EISSN:1557-7341
      DOI:10.1145/2431211

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 March 2013
      • Revised: 1 September 2011
      • Accepted: 1 September 2011
      • Received: 1 September 2009

      Qualifiers

      • research-article
      • Research
      • Refereed
