The greedy algorithm for the minimum common string partition problem

Published: 01 October 2005

Abstract

In the Minimum Common String Partition problem (MCSP), we are given two input strings and wish to partition them into the same collection of substrings while minimizing the number of substrings in the partition. This problem is NP-hard, even for the special case, denoted 2-MCSP, where each letter occurs at most twice in each input string. We study a greedy algorithm for MCSP that at each step extracts a longest common substring from the given strings. We show that the approximation ratio of this algorithm is between Ω(n^0.43) and O(n^0.69). In the case of 2-MCSP, we show that the approximation ratio is equal to 3. For 4-MCSP, we give a lower bound of Ω(log n).
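
The greedy step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `greedy_mcsp`, the quadratic dynamic program used to find a longest common substring, and the tie-breaking rule (the first longest match found wins) are all assumptions made for illustration. The sketch also assumes both inputs contain the same multiset of letters, since otherwise no common partition exists.

```python
from collections import Counter


def greedy_mcsp(a: str, b: str) -> list[str]:
    """Greedy common partition: repeatedly extract a longest common substring.

    Hypothetical helper for illustration; ties are broken by taking the
    first longest match found, which is an arbitrary choice.
    """
    if Counter(a) != Counter(b):
        raise ValueError("inputs must contain the same multiset of letters")

    n = len(a)
    used_a = [False] * n  # positions of a already assigned to a block
    used_b = [False] * n  # positions of b already assigned to a block
    blocks = []
    remaining = n

    while remaining > 0:
        best_len, best_i, best_j = 0, 0, 0
        # prev[j] = length of the longest common run of unused letters
        # ending at a[i-2] and b[j-1]; used positions reset the run to 0.
        prev = [0] * (n + 1)
        for i in range(1, n + 1):
            cur = [0] * (n + 1)
            if not used_a[i - 1]:
                for j in range(1, n + 1):
                    if not used_b[j - 1] and a[i - 1] == b[j - 1]:
                        cur[j] = prev[j - 1] + 1
                        if cur[j] > best_len:
                            best_len, best_i, best_j = cur[j], i, j
            prev = cur

        # Mark the chosen substring as used in both strings.
        for k in range(best_len):
            used_a[best_i - 1 - k] = True
            used_b[best_j - 1 - k] = True
        blocks.append(a[best_i - best_len:best_i])
        remaining -= best_len

    return blocks


if __name__ == "__main__":
    print(greedy_mcsp("abcab", "ababc"))  # ['abc', 'ab']
```

Each round costs O(n^2) time here, so the sketch favors clarity over speed. The number of blocks returned is the size of the greedy partition, which is the quantity the approximation bounds above compare against the optimum.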

Published in

ACM Transactions on Algorithms, Volume 1, Issue 2
October 2005
190 pages
ISSN: 1549-6325
EISSN: 1549-6333
DOI: 10.1145/1103963

Copyright © 2005 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
