The Origin and Early Reception of Sequence Databases

Part of the book series: Methods in Molecular Biology (MIMB, volume 696)

Abstract

Emerging areas of scientific research never arise in a social or intellectual vacuum, but must establish themselves in relation to well-established disciplines. This necessity poses challenges for scientists, who must not only create a new disciplinary identity but also defend their research from criticism and even condescension from other scientists. The early use of sequence databases provides an excellent case study for examining the challenges facing novel sciences. The need for sequence databases grew out of protein sequencing in biochemistry beginning in the late 1950s. The rapid increase in the number of sequences made databases an attractive resource, but protein biochemists often considered building, managing, and doing research with databases a “second-rate” science. Similarly, computational biologists who used databases and digital computers to study evolutionary phenomena faced criticism from more traditional evolutionary biologists. In retrospect, one can see this early computational biology as laying important foundations for the bioinformatics, molecular evolution, and molecular systematics of today. However, within the context of the 1960s, establishing a scientific identity posed serious challenges for Margaret Dayhoff, Walter Fitch, Russell Doolittle, and other computational biologists who used computers and databases to investigate evolutionary problems.

Author information

Corresponding author

Correspondence to Joel B. Hagen.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Hagen, J.B. (2011). The Origin and Early Reception of Sequence Databases. In: Hamacher, M., Eisenacher, M., Stephan, C. (eds) Data Mining in Proteomics. Methods in Molecular Biology, vol 696. Humana Press. https://doi.org/10.1007/978-1-60761-987-1_4

  • DOI: https://doi.org/10.1007/978-1-60761-987-1_4

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-986-4

  • Online ISBN: 978-1-60761-987-1

  • eBook Packages: Springer Protocols
