The longest common subsequence problem for arc-annotated sequences

https://doi.org/10.1016/S1570-8667(03)00080-7
Under an Elsevier user license
open archive

Abstract

Arc-annotated sequences are useful for representing the structural information of RNA and protein sequences. The Longest Arc-Preserving Common Subsequence (LAPCS) problem was recently introduced by Evans [P.A. Evans, Algorithms and complexity for annotated sequence analysis, PhD Thesis, University of Victoria, 1999; P.A. Evans, Finding common subsequences with arcs and pseudoknots, in: Proceedings of 10th Annual Symposium on Combinatorial Pattern Matching (CPM'99), in: Lecture Notes in Comput. Sci., vol. 1645, 1999, pp. 270–280] as a framework for studying the similarity of arc-annotated sequences. In this paper, we consider arc-annotated sequences with various arc structures and present new algorithmic and complexity results on the LAPCS problem. Some of our results answer an open question posed in those two works, and others improve the hardness results given there.

Keywords

RNA structural similarity comparison
Sequence annotation
Longest common subsequence
Maximum independent set
MAX SNP-hard
Approximation algorithm
Dynamic programming

An extended abstract of this work appears in Proceedings of the 11th Annual Symposium on Combinatorial Pattern Matching (CPM 2000), LNCS 1848, 2000, pp. 154–165.

1 Supported in part by NSERC Research Grant OGP0046613, a CITO grant, and a UCR startup grant.

2 Work done while at the Department of Computer Science, University of Waterloo and the Department of Computing and Software, McMaster University. Supported in part by NSERC Research Grant OGP0046613 and a CITO grant.

3 Work done while at the Department of Computer Science, University of Waterloo. Supported in part by NSERC Research Grant OGP0046506 and a CGAT grant.

4 Supported in part by NSERC Research Grant OGP0046373.