Pattern Matching in Text Compressed by Using Antidictionaries

  • Conference paper
Combinatorial Pattern Matching (CPM 1999)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1645)

Abstract

This paper addresses compressed pattern matching for text compressed using antidictionaries, a compression scheme recently proposed by Crochemore et al. (1998). We present an algorithm that preprocesses a pattern of length m and an antidictionary M in O(m² + ‖M‖) time, and then scans a compressed text of length n in O(n + r) time to find all occurrences of the pattern, where ‖M‖ is the total length of the strings in M and r is the number of pattern occurrences.
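To make the setting concrete, the following is a minimal sketch of antidictionary compression on a binary alphabet, assuming the basic idea of the scheme of Crochemore et al.: an antidictionary is a set of words that never occur in the text, so whenever the text read so far ends with all but the last bit of a forbidden word, the next bit is forced and can be omitted from the output. The function names `predicted_bit`, `compress`, and `decompress` are hypothetical, the suffix check is deliberately naive rather than automaton-based, and this is not the matching algorithm of the paper, which works on the compressed text directly.

```python
def predicted_bit(prefix, antidict):
    """Return the bit forced after `prefix` by the antidictionary, or None."""
    for word in antidict:
        stem, last = word[:-1], word[-1]
        if prefix.endswith(stem):
            # `stem + last` is forbidden, so the next bit must be the other one.
            return "1" if last == "0" else "0"
    return None


def compress(text, antidict):
    """Drop every bit of `text` that the antidictionary makes predictable."""
    out = []
    for i, bit in enumerate(text):
        forced = predicted_bit(text[:i], antidict)
        if forced is None:
            out.append(bit)  # not predictable: must be stored
        elif forced != bit:
            raise ValueError("text contains a forbidden word")
    return "".join(out)


def decompress(code, length, antidict):
    """Rebuild the text; the original length must be supplied separately."""
    text = ""
    stored = iter(code)
    while len(text) < length:
        forced = predicted_bit(text, antidict)
        text += forced if forced is not None else next(stored)
    return text


# Illustrative data (an assumption, not taken from the paper).
antidict = ["00", "0110", "1011", "1111"]
text = "01110101"
code = compress(text, antidict)
restored = decompress(code, len(text), antidict)
print(code, restored)
```

A straightforward way to search such a text is to decompress it and then run an ordinary matcher; the bounds stated in the abstract refer instead to scanning the compressed text of length n without that intermediate step.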

References

  1. A.V. Aho and M. Corasick. Efficient string matching: An aid to bibliographic search. Comm. ACM, 18(6):333–340, 1975.

  2. A. Amir and G. Benson. Efficient two-dimensional compressed matching. In Proc. Data Compression Conference’92, page 279, 1992.

  3. A. Amir and G. Benson. Two-dimensional periodicity and its application. In Proc. 3rd Ann. ACM-SIAM Symp. on Discrete Algorithms, pages 440–452, 1992.

  4. A. Amir, G. Benson, and M. Farach. Let sleeping files lie: Pattern matching in Z-compressed files. Journal of Computer and System Sciences, 52:299–307, 1996.

  5. A. Amir, G. Benson, and M. Farach. Optimal two-dimensional compressed matching. Journal of Algorithms, 24(2):354–379, 1997.

  6. A. Amir, G.M. Landau, and U. Vishkin. Efficient pattern matching with scaling. Journal of Algorithms, 13(1):2–32, 1992.

  7. M. Crochemore, F. Mignosi, and A. Restivo. Minimal forbidden words and factor automata. In L. Brim, J. Gruska, and J. Zlatuska, editors, Proc. 23rd International Symp. on Mathematical Foundations of Computer Science, volume 1450 of Lecture Notes in Computer Science, pages 665–673. Springer-Verlag, 1998.

  8. M. Crochemore, F. Mignosi, A. Restivo, and S. Salemi. Text compression using antidictionaries. Technical Report IGM-98-10, Institut Gaspard-Monge, 1998.

  9. E.S. de Moura, G. Navarro, N. Ziviani, and R. Baeza-Yates. Direct pattern matching on compressed text. In Proc. 5th International Symp. on String Processing and Information Retrieval, pages 90–95. IEEE Computer Society, 1998.

  10. E.S. de Moura, G. Navarro, N. Ziviani, and R. Baeza-Yates. Fast sequential searching on compressed texts allowing errors. In Proc. 21st Ann. International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 298–306. York Press, 1998.

  11. T. Eilam-Tzoreff and U. Vishkin. Matching patterns in strings subject to multi-linear transformations. Theoretical Computer Science, 60(3):231–254, 1988.

  12. M. Farach and M. Thorup. String-matching in Lempel-Ziv compressed strings. In Proc. 27th Ann. ACM Symp. on Theory of Computing, pages 703–713, 1995.

  13. S. Fukamachi, T. Shinohara, and M. Takeda. String pattern matching for compressed data using variable length codes. Submitted, 1998.

  14. L. Gąsieniec, M. Karpinski, W. Plandowski, and W. Rytter. Efficient algorithms for Lempel-Ziv encoding. In Proc. 4th Scandinavian Workshop on Algorithm Theory, volume 1097 of Lecture Notes in Computer Science, pages 392–403. Springer-Verlag, 1996.

  15. M. Karpinski, W. Rytter, and A. Shinohara. An efficient pattern-matching algorithm for strings with short descriptions. Nordic Journal of Computing, 4:172–186, 1997.

  16. T. Kida, M. Takeda, A. Shinohara, and S. Arikawa. Shift-And approach to pattern matching in LZW compressed text. In Proc. 10th Ann. Symp. on Combinatorial Pattern Matching, Lecture Notes in Computer Science. Springer-Verlag, 1999. To appear.

  17. T. Kida, M. Takeda, A. Shinohara, M. Miyazaki, and S. Arikawa. Multiple pattern matching in LZW compressed text. In Proc. Data Compression Conference’98, pages 103–112. IEEE Computer Society, 1998.

  18. U. Manber. A text compression scheme that allows fast searching directly in the compressed file. In Proc. 5th Ann. Symp. on Combinatorial Pattern Matching, volume 807 of Lecture Notes in Computer Science, pages 113–124. Springer-Verlag, 1994.

  19. M. Miyazaki, S. Fukamachi, M. Takeda, and T. Shinohara. Speeding up the pattern matching machine for compressed texts. Transactions of Information Processing Society of Japan, 39(9):2638–2648, 1998. (in Japanese).

  20. M. Miyazaki, A. Shinohara, and M. Takeda. An improved pattern matching algorithm for strings in terms of straight-line programs. In Proc. 8th Ann. Symp. on Combinatorial Pattern Matching, volume 1264 of Lecture Notes in Computer Science, pages 1–11. Springer-Verlag, 1997.

  21. Y. Shibata, T. Kida, S. Fukamachi, M. Takeda, A. Shinohara, T. Shinohara, and S. Arikawa. Byte pair encoding: a text compression scheme that accelerates pattern matching. Technical Report DOI-TR-161, Department of Informatics, Kyushu University, April 1999.

  22. M. Takeda. Pattern matching machine for text compressed using finite state model. Technical Report DOI-TR-142, Department of Informatics, Kyushu University, October 1997.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shibata, Y., Takeda, M., Shinohara, A., Arikawa, S. (1999). Pattern Matching in Text Compressed by Using Antidictionaries. In: Crochemore, M., Paterson, M. (eds) Combinatorial Pattern Matching. CPM 1999. Lecture Notes in Computer Science, vol 1645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48452-3_3

  • DOI: https://doi.org/10.1007/3-540-48452-3_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66278-5

  • Online ISBN: 978-3-540-48452-3

  • eBook Packages: Springer Book Archive
