Abstract
Multiple sequence alignment is a central problem in bioinformatics. A known integer programming approach applies branch-and-cut to exponentially large graph-theoretic models. This paper describes a new integer programming formulation that generates models small enough to be passed to generic solvers. The formulation is a hybrid that relates the sparse alignment graph to a compact encoding of the alignment matrix via channelling constraints. Alignments obtained with a SAT-based local search algorithm are competitive with those of state-of-the-art algorithms, though execution times are much longer.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Prestwich, S., Higgins, D., O’Sullivan, O. (2003). A SAT-Based Approach to Multiple Sequence Alignment. In: Rossi, F. (eds) Principles and Practice of Constraint Programming – CP 2003. CP 2003. Lecture Notes in Computer Science, vol 2833. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45193-8_83
DOI: https://doi.org/10.1007/978-3-540-45193-8_83
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20202-8
Online ISBN: 978-3-540-45193-8
eBook Packages: Springer Book Archive