Full-Fledged Real-Time Indexing for Constant Size Alphabets

Algorithmica

Abstract

In this paper we describe a data structure that supports pattern matching queries on a dynamically arriving text T over an alphabet of constant size. Each new symbol can be prepended to T in O(1) worst-case time. At any moment, we can report all occurrences of a pattern P in the current text in \(O(|P|+k)\) time, where |P| is the length of P and k is the number of occurrences. This resolves, under the assumption of a constant-size alphabet, the long-standing open problem of whether a real-time indexing method for string matching exists (see Amir and Nor in Real-time indexing over fixed finite alphabets, pp. 1086–1095, 2008).
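
To make the interface concrete, the following is a minimal brute-force sketch of the two operations named in the abstract: prepending a symbol and reporting all occurrences of a pattern. It is not the data structure of the paper and achieves none of its time bounds; queries here rescan the whole text. The class name `NaivePrependIndex` and its method names are hypothetical, chosen only for illustration.

```python
class NaivePrependIndex:
    """Brute-force stand-in with the same interface as a real-time index.

    Assumption: this only illustrates the semantics of "prepend" and
    "report occurrences"; a query costs time proportional to the text
    length, not O(|P| + k).
    """

    def __init__(self):
        # Symbols are kept in arrival order; the current text is their reverse.
        self._arrived = []

    def prepend(self, symbol):
        # Appending to the arrival list prepends to the logical text.
        self._arrived.append(symbol)

    def text(self):
        # Materialize the current text T (most recent symbol first).
        return "".join(reversed(self._arrived))

    def occurrences(self, pattern):
        # Report every starting position of pattern in the current text,
        # including overlapping ones, by repeated scanning.
        t = self.text()
        hits = []
        start = t.find(pattern)
        while start != -1:
            hits.append(start)
            start = t.find(pattern, start + 1)
        return hits


index = NaivePrependIndex()
for symbol in "ananab":      # arrival order; the text grows at the front
    index.prepend(symbol)
print(index.text())              # banana
print(index.occurrences("ana"))  # [1, 3]
```

Positions are given relative to the current text, so each prepend shifts every previously reported position by one.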

Notes

  1. Henceforth, \(\log ^{(3)}n=\log \log \log n\).

  2. For simplicity we assume that \(\log ^{(3)}n\) and \(\log \log n\) are integers and \(\log ^{(3)}n \) divides \(\log \log n\). If this is not the case, we can find \(d'\) and d that satisfy these requirements such that \(\log \log n\le d\le 2\log \log n\) and \(\log ^{(3)}n\le d'\le 2\log ^{(3)}n\).

  3. In fact, the query time is even slightly better.

  4. In fact, it would suffice to store \(3d-1\) most recently read symbols in compact form.

References

  1. Amir, A., Kopelowitz, T., Lewenstein, M., Lewenstein, N.: Towards real-time suffix tree construction. In: Consens, M., Navarro, G. (eds.) Proceedings of International Symposium on String Processing and Information Retrieval (SPIRE), volume 3772 of Lecture Notes in Computer Science, pp. 67–78. Springer, Berlin (2005)

  2. Amir, A., Nor, I.: Real-time indexing over fixed finite alphabets. In: Proceedings of 19th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2008), pp. 1086–1095 (2008)

  3. Breslauer, D., Grossi, R., Mignosi, F.: Simple real-time constant-space string matching. In: Giancarlo, R., Manzini, G. (eds.) Combinatorial Pattern Matching. Lecture Notes in Computer Science, vol. 6661, pp. 173–183. Springer, Berlin (2011)

  4. Breslauer, D., Italiano, G.F.: Near real-time suffix tree construction via the fringe marked ancestor problem. In: Proceedings of 18th International Symposium on String Processing and Information Retrieval (SPIRE 2011), pp. 156–167 (2011)

  5. Cole, R., Hariharan, R.: Dynamic LCA queries on trees. SIAM J. Comput. 34(4), 894–923 (2005)

  6. Dietz, P.F., Sleator, D.D.: Two algorithms for maintaining order in a list. In: Proceedings of 19th Annual ACM Symposium on Theory of Computing (STOC 1987), pp. 365–372 (1987)

  7. Fischer, J., Gawrychowski, P.: Alphabet-dependent string searching with wexponential search trees. CoRR, abs/1302.3347 (2013)

  8. Fredman, M.L., Willard, D.E.: Trans-dichotomous algorithms for minimum spanning trees and shortest paths. J. Comput. Syst. Sci. 48(3), 533–551 (1994)

  9. Galil, Z.: String matching in real time. J. ACM 28(1), 134–149 (1981)

  10. Giora, Y., Kaplan, H.: Optimal dynamic vertical ray shooting in rectilinear planar subdivisions. ACM Trans. Algorithms (2009). doi:10.1145/1541885.1541889

  11. Kopelowitz, T.: On-line indexing for general alphabets via predecessor queries on subsets of an ordered list. In: Proceedings of 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS 2012), pp. 283–292 (2012)

  12. Kosaraju, S.R.: Real-time pattern matching and quasi-real-time construction of suffix trees (preliminary version). In: Proceedings of 26th Annual ACM Symposium on Theory of Computing (STOC 1994), pp. 310–316. ACM (1994)

  13. Kucherov, G., Nekrich, Y., Starikovskaya, T.: Cross-document pattern matching. In: Kärkkäinen, J., Stoye, J. (eds.) Proceedings of the 23rd Annual Symposium on Combinatorial Pattern Matching (CPM), July 3–5, 2012, Helsinki (Finland), volume 7354 of Lecture Notes in Computer Science, pp. 196–207. Springer (2012)

  14. Mortensen, C.W.: Fully-dynamic two dimensional orthogonal range and line segment intersection reporting in logarithmic time. In: Proceedings of 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2003), pp. 618–627 (2003)

  15. Navarro, G., Nekrich, Y.: Top-k document retrieval in optimal time and linear space. In: Proceedings of 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2012), pp. 1066–1077 (2012)

  16. Slisenko, A.: String-matching in real time: some properties of the data structure. In: Mathematical Foundations of Computer Science 1978, Proceedings, 7th Symposium, Zakopane, Poland, September 4–8, 1978, volume 64 of Lecture Notes in Computer Science, pp. 493–496. Springer (1978)

  17. van Emde Boas, P., Kaas, R., Zijlstra, E.: Design and implementation of an efficient priority queue. Math. Syst. Theory 10, 99–127 (1977)

  18. Willard, D.E.: A density control algorithm for doing insertions and deletions in a sequentially ordered file in good worst-case time. Inf. Comput. 97(2), 150–204 (1992)

Acknowledgments

GK has been supported by the Labex Bézout program funded by the French government. This work was done during the visit of YN to the Laboratoire d’Informatique Gaspard Monge, supported by Université Paris-Est Marne-la-Vallée and CNRS. We thank the anonymous reviewers for helpful comments.

Author information

Corresponding author

Correspondence to Yakov Nekrich.

About this article

Cite this article

Kucherov, G., Nekrich, Y. Full-Fledged Real-Time Indexing for Constant Size Alphabets. Algorithmica 79, 387–400 (2017). https://doi.org/10.1007/s00453-016-0199-7

  • DOI: https://doi.org/10.1007/s00453-016-0199-7
