  • Opinion
Bioinformatics goes back to the future

Abstract

The need to turn raw data into knowledge has led the bioinformatics field to focus increasingly on the manipulation of information. By drawing parallels with both cryptography and artificial intelligence, we can develop an understanding of the changes that are occurring in bioinformatics, and how these changes are likely to influence the bioinformatics job market.

Author information

Correspondence to Crispin J. Miller.

Related links

FURTHER INFORMATION

MGED

TrEMBL

SWISS-PROT

Glossary

ANNOTATION

The explanatory notes accompanying a database entry that describe, for example, the function of a protein (either determined experimentally, or by the predictions of bioinformatics algorithms), bibliographic references and links to other databases.

CRYPTOGRAPHY

The practice and study of encryption and decryption: encoding data so that it can be decoded only by specific individuals.

MACHINE TRANSLATION

The use of computers to provide automatic translation from one natural language to another.

ONTOLOGY

A machine-readable representation of complex information that can be manipulated by a computer program.

REASONING ENGINE

The software required to form logical relationships between entries in an ontology and to make deductions about them.

SEMANTICS

The meaning of a fragment of language (as opposed to syntax, which describes how symbols in that language might be legitimately combined, independent of their meaning). The distinction between syntax and semantics allows linguists to deal with sentences such as 'colourless green ideas sleep furiously', which are syntactically correct, but meaningless.

SYMBOL

An entity represented in such a way that a computer program can manipulate it. This might be a word, such as 'cat' or 'dog', a node in a graph or an entry in a database. Because they 'stand for' something else, such representations are referred to as symbols and the programs that deal with them do so by a process of 'symbol manipulation'.

TERMINOLOGY

A predefined set of keywords and terms used, for instance, to annotate a database.
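
As a rough illustration of how the glossary's TERMINOLOGY, ONTOLOGY and REASONING ENGINE entries relate to one another, the following Python sketch may help. It is not taken from the article: the term names, the `IS_A` table and the helper functions are all hypothetical, and a real ontology would carry far richer relationships than a single 'is-a' link.

```python
from collections import deque

# Hypothetical terminology: the fixed set of keywords allowed in annotations.
TERMINOLOGY = {"protein kinase", "kinase", "enzyme"}

# Hypothetical ontology: each symbol maps to its direct 'is-a' parents.
IS_A = {
    "protein kinase": {"kinase"},
    "kinase": {"enzyme"},
}


def ancestors(term):
    """Return every symbol reachable from `term` by following is-a links."""
    seen = set()
    queue = deque(IS_A.get(term, ()))
    while queue:
        parent = queue.popleft()
        if parent not in seen:
            seen.add(parent)
            queue.extend(IS_A.get(parent, ()))
    return seen


def is_a(term, candidate):
    """A toy 'reasoning engine': deduce whether `term` is a kind of `candidate`."""
    return term == candidate or candidate in ancestors(term)


def unknown_terms(annotation):
    """Return the annotation keywords that are not in the terminology."""
    return sorted(set(annotation) - TERMINOLOGY)
```

Here `is_a("protein kinase", "enzyme")` is deduced by following two links even though that fact is never stored directly, which is the kind of symbol manipulation the glossary describes; `unknown_terms` simply checks an annotation against the predefined keyword set.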

About this article

Cite this article

Miller, C., Attwood, T. Bioinformatics goes back to the future. Nat Rev Mol Cell Biol 4, 157–162 (2003). https://doi.org/10.1038/nrm1013

  • DOI: https://doi.org/10.1038/nrm1013
