Abstract
This article introduces the topic of bioinformatics to an audience of computer scientists. We discuss the definition of bioinformatics, give a classification of the problem areas which bioinformatics addresses, and illustrate these in detail with examples. We highlight those areas which we believe to be suitable for the application of constraint solving techniques, or where similar techniques are already used. Finally, we give some advice for computer scientists who are considering getting involved in bioinformatics, and provide a resource list and a reading list.
Cite this article
Backofen, R., Gilbert, D. Bioinformatics and Constraints. Constraints 6, 141–156 (2001). https://doi.org/10.1023/A:1011477420926
DOI: https://doi.org/10.1023/A:1011477420926