Abstract
This paper proposes a simple data structure, called a prefix list, which maintains all prefixes of a string in reverse lexicographic order. It can be constructed incrementally on-line in time and space linear in the string length. It is strongly related to suffix trees and suffix arrays, and may share applications with these existing structures. A suffix array can be built via the corresponding prefix list in linear time. Particular applications of the prefix list lie in source-coding problems that require on-line right-to-left string matching. We apply the prefix list to on-line estimation of source entropy and to context-based symbol-ranking text compression algorithms.
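As a rough illustration of the ordering only, the following sketch keeps the prefixes of a growing string sorted by comparing them from right to left. It assumes that "reverse lexicographic order" means ordinary lexicographic order on the reversed prefixes; the class name `NaivePrefixList` and its methods are hypothetical, and the insertion is a plain binary search over string copies, so it does not achieve the linear time and space bounds claimed for the prefix list itself.

```python
class NaivePrefixList:
    """Naive stand-in for a prefix list (illustration only, not linear time).

    Each prefix is identified by its length; `order` holds those lengths
    sorted by the reversed prefix, i.e. compared from right to left.
    """

    def __init__(self):
        self.text = ""
        self.order = []

    def _key(self, length):
        # Reversed prefix of the given length (assumed sort key).
        return self.text[:length][::-1]

    def append(self, symbol):
        """Extend the string by one symbol and insert the new prefix."""
        self.text += symbol
        new_len = len(self.text)
        new_key = self._key(new_len)
        lo, hi = 0, len(self.order)
        while lo < hi:  # binary search for the insertion point
            mid = (lo + hi) // 2
            if self._key(self.order[mid]) < new_key:
                lo = mid + 1
            else:
                hi = mid
        self.order.insert(lo, new_len)

    def prefixes(self):
        """Return the prefixes in their current sorted order."""
        return [self.text[:k] for k in self.order]


pl = NaivePrefixList()
for ch in "banana":
    pl.append(ch)

print(pl.order)       # [2, 4, 6, 1, 3, 5]
print(pl.prefixes())  # ['ba', 'bana', 'banana', 'b', 'ban', 'banan']

# Under the same assumption, the sorted prefix lengths map directly to the
# suffix array of the reversed string: the prefix of length k corresponds
# to the suffix starting at position n - k.
n = len(pl.text)
print([n - k for k in pl.order])  # [4, 2, 0, 5, 3, 1]
```

The last line hints at why the two structures are related: sorting prefixes right to left is the same problem as sorting the suffixes of the reversed string, although the sketch makes no attempt to perform that conversion efficiently.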
Partially supported by the Kayamori Foundation of Informational Science Advancement and by the Okawa Foundation for Information and Telecommunications.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yokoo, H. (1999). A Dynamic Data Structure for Reverse Lexicographically Sorted Prefixes. In: Crochemore, M., Paterson, M. (eds) Combinatorial Pattern Matching. CPM 1999. Lecture Notes in Computer Science, vol 1645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48452-3_12
DOI: https://doi.org/10.1007/3-540-48452-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66278-5
Online ISBN: 978-3-540-48452-3
eBook Packages: Springer Book Archive