Abstract
We develop a new exact bit-parallel string matching algorithm, based on the Shift-Or algorithm (Baeza-Yates & Gonnet, 1992). Assuming that the pattern representation fits into a single computer word, this algorithm has optimal O(n log_σ m / m) average running time, as well as optimal O(n) worst-case running time, where n, m and σ are the sizes of the text, the pattern, and the alphabet, respectively. We also study several implementation details. The experimental results show that our algorithm is the fastest in most of the cases where it can be applied, displacing even the long-standing BNDM (Navarro & Raffinot, 2000) family of algorithms. Finally, we show how to adapt our techniques for the Shift-Add algorithm (Baeza-Yates & Gonnet, 1992), obtaining optimal time for searching under Hamming distance.
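For orientation, the classic Shift-Or algorithm that the abstract builds on can be sketched as follows. This is only the textbook baseline of Baeza-Yates & Gonnet, not the new algorithm of this paper. The function name `shift_or_search` and the `word_bits` parameter are illustrative assumptions, and a Python integer stands in for the single computer word.

```python
from typing import Dict, List


def shift_or_search(pattern: str, text: str, word_bits: int = 64) -> List[int]:
    """Return the start positions of all exact occurrences of pattern in text.

    Illustrative sketch of classic Shift-Or; word_bits models the assumption
    that the pattern representation fits into one computer word.
    """
    m = len(pattern)
    if m == 0 or m > word_bits:
        raise ValueError("pattern length must be between 1 and word_bits")

    all_ones = (1 << m) - 1

    # masks[c] has bit i cleared exactly when pattern[i] == c.
    masks: Dict[str, int] = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, all_ones) & ~(1 << i)

    # Bit i of state is 0 iff pattern[0..i] matches the text ending at the
    # current position. Initially no prefix matches, so all bits are 1.
    state = all_ones
    match_bit = 1 << (m - 1)
    hits: List[int] = []

    for j, c in enumerate(text):
        # The shift brings a 0 into bit 0 (the empty prefix always matches);
        # the OR sets every bit whose pattern character differs from c.
        state = ((state << 1) | masks.get(c, all_ones)) & all_ones
        if (state & match_bit) == 0:
            hits.append(j - m + 1)

    return hits
```

Each text character costs one shift, one OR and one test, which gives the O(n) scan when m fits in a word. Unlike the algorithm described in the abstract, this baseline reads every text character, so it does not reach the sublinear average case.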
References
Baeza-Yates, R.A., Gonnet, G.H.: A new approach to text searching. Commun. ACM 35(10), 74–82 (1992)
Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Commun. ACM 20(10), 762–772 (1977)
Crochemore, M., Iliopoulos, C., Navarro, G., Pinzon, Y., Salinger, A.: Bit-parallel (δ,γ)-matching suffix automata. Journal of Discrete Algorithms (JDA) 3(2–4), 198–214 (2005)
Crochemore, M., Rytter, W.: Text algorithms. Oxford University Press, Oxford (1994)
He, L., Fang, B.: Linear nondeterministic dawg string matching algorithm. In: Apostolico, A., Melucci, M. (eds.) SPIRE 2004. LNCS, vol. 3246, pp. 70–71. Springer, Heidelberg (2004)
Holub, J., Durian, B.: Fast variants of bit parallel approach to suffix automata. In: Talk given in The Second Haifa Annual International Stringology Research Workshop of the Israeli Science Foundation (2005), http://www.cri.haifa.ac.il/events/2005/string/presentations/Holub.pdf
Horspool, R.N.: Practical fast searching in strings. Softw. Pract. Exp. 10(6), 501–506 (1980)
Knuth, D.E., Morris Jr., J.H., Pratt, V.R.: Fast pattern matching in strings. SIAM J. Comput. 6(1), 323–350 (1977)
Navarro, G.: A guided tour to approximate string matching. ACM Computing Surveys 33(1), 31–88 (2001)
Navarro, G.: NR-grep: a fast and flexible pattern matching tool. Softw. Pract. Exp. 31, 1265–1312 (2001)
Navarro, G., Raffinot, M.: Fast and flexible string matching by combining bit-parallelism and suffix automata. ACM Journal of Experimental Algorithmics (JEA) 5(4) (2000), http://www.jea.acm.org/2000/NavarroString
Peltola, H., Tarhio, J.: Alternative algorithms for bit-parallel string matching. In: Nascimento, M.A., de Moura, E.S., Oliveira, A.L. (eds.) SPIRE 2003. LNCS, vol. 2857, pp. 80–93. Springer, Heidelberg (2003)
Sunday, D.M.: A very fast substring search algorithm. Commun. ACM 33(8), 132–142 (1990)
Sutinen, E., Tarhio, J.: On using q-gram locations in approximate string matching. In: Spirakis, P.G. (ed.) ESA 1995. LNCS, vol. 979, pp. 327–340. Springer, Heidelberg (1995)
Takaoka, T.: Approximate pattern matching with samples. In: Du, D.-Z., Zhang, X.-S. (eds.) ISAAC 1994. LNCS, vol. 834, pp. 236–242. Springer, Heidelberg (1994)
Wu, S., Manber, U.: Fast text searching allowing errors. Commun. ACM 35(10), 83–91 (1992)
Yao, A.C.: The complexity of pattern matching for a random string. SIAM J. Comput. 8(3), 368–387 (1979)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fredriksson, K., Grabowski, S. (2005). Practical and Optimal String Matching. In: Consens, M., Navarro, G. (eds) String Processing and Information Retrieval. SPIRE 2005. Lecture Notes in Computer Science, vol 3772. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11575832_42
DOI: https://doi.org/10.1007/11575832_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29740-6
Online ISBN: 978-3-540-32241-2
eBook Packages: Computer Science (R0)