Nearly k-Universal Words - Investigating a Part of Simon’s Congruence

  • Conference paper
Descriptional Complexity of Formal Systems (DCFS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13439)

Abstract

Determining the index of Simon’s congruence is a long-standing open problem. Two words u and v are called Simon congruent if they have the same set of scattered factors (also known as subwords or subsequences), which are parts of the word in the correct order but not necessarily consecutive, e.g., \(\mathtt{oath}\) is a scattered factor of \(\mathtt{logarithm}\) but \(\mathtt{tail}\) is not. Following the idea of scattered factor k-universality (also known as k-richness), we investigate nearly k-universality, i.e., words where exactly one scattered factor of length k is absent. We present a full characterisation as well as the index of the congruence in this special case and the shortlex normal form for each such class. Moreover, we extend the definition to m-nearly k-universality (exactly m scattered factors of length k are absent), show some results for \(m>1\), and give a full combinatorial characterisation of m-nearly k-universal words which are additionally \((k-1)\)-universal.
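The definitions in the abstract can be illustrated with a small brute-force sketch. It is not the characterisation developed in the paper: it simply enumerates all words of length k over a given alphabet and counts those that are absent as scattered factors. The function names (`is_scattered_factor`, `absent_factors`, `is_m_nearly_k_universal`) and the explicit `alphabet` parameter are assumptions made for this illustration only.

```python
from itertools import product


def is_scattered_factor(u, w):
    """Return True if u occurs in w in order, not necessarily consecutively."""
    it = iter(w)
    # `c in it` advances the iterator up to the first match, so the
    # letters of u must be found in w from left to right.
    return all(c in it for c in u)


def absent_factors(w, k, alphabet):
    """List all length-k words over `alphabet` that are NOT scattered factors of w.

    Brute force over |alphabet|**k candidates; only suitable for tiny inputs.
    """
    candidates = ("".join(p) for p in product(alphabet, repeat=k))
    return [u for u in candidates if not is_scattered_factor(u, w)]


def is_m_nearly_k_universal(w, k, alphabet, m=1):
    """True if exactly m scattered factors of length k are absent from w.

    m = 1 corresponds to "nearly k-universal"; m = 0 to k-universal.
    """
    return len(absent_factors(w, k, alphabet)) == m


if __name__ == "__main__":
    # The example from the abstract.
    print(is_scattered_factor("oath", "logarithm"))   # True
    print(is_scattered_factor("tail", "logarithm"))   # False

    # Over the alphabet {a, b}, "aba" misses only "bb" among words of length 2.
    print(absent_factors("aba", 2, "ab"))             # ['bb']
    print(is_m_nearly_k_universal("aba", 2, "ab"))    # True
```

Because the enumeration is exponential in k, the sketch is only meant to make the definitions concrete on small alphabets and short words.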



Author information

Corresponding author

Correspondence to Pamela Fleischmann.


Copyright information

© 2022 IFIP International Federation for Information Processing

About this paper

Cite this paper

Fleischmann, P., Haschke, L., Huch, A., Mayrock, A., Nowotka, D. (2022). Nearly k-Universal Words - Investigating a Part of Simon’s Congruence. In: Han, YS., Vaszil, G. (eds) Descriptional Complexity of Formal Systems. DCFS 2022. Lecture Notes in Computer Science, vol 13439. Springer, Cham. https://doi.org/10.1007/978-3-031-13257-5_5

  • DOI: https://doi.org/10.1007/978-3-031-13257-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-13256-8

  • Online ISBN: 978-3-031-13257-5

  • eBook Packages: Computer Science, Computer Science (R0)
