Skip to main content

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • Article
  • Published:

Evolutionary information for specifying a protein fold

Abstract

Classical studies show that for many proteins, the information required for specifying the tertiary structure is contained in the amino acid sequence. Here, we attempt to define the sequence rules for specifying a protein fold by computationally creating artificial protein sequences using only statistical information encoded in a multiple sequence alignment and no tertiary structure information. Experimental testing of libraries of artificial WW domain sequences shows that a simple statistical energy function capturing coevolution between amino acid residues is necessary and sufficient to specify sequences that fold into native structures. The artificial proteins show thermodynamic stabilities similar to natural WW domains, and structure determination of one artificial protein shows excellent agreement with the WW fold at atomic resolution. The relative simplicity of the information used for creating sequences suggests a marked reduction to the potential complexity of the protein-folding problem.

This is a preview of subscription content, access via your institution

Access options

Rent or buy this article

Prices vary by article type

from$1.95

to$39.95

Prices may be subject to local taxes which are calculated during checkout

Figure 1: SCA-based protein design.
Figure 2: Experimental evaluation of artificial proteins (part 1).
Figure 3: Experimental evaluation of artificial proteins (part 2).
Figure 4: Summary of experiments on all natural and artificial WW sequences.
Figure 5: NMR structure determination of CC45, an artificial WW domain.

Similar content being viewed by others

References

  1. Anfinsen, C. B. Principles that govern the folding of protein chains. Science 181, 223–230 (1973)

    Article  ADS  CAS  Google Scholar 

  2. Daggett, V. & Fersht, A. The present view of the mechanism of protein folding. Nature Rev. Mol. Cell Biol. 4, 497–502 (2003)

    Article  CAS  Google Scholar 

  3. Dinner, A. R., Sali, A., Smith, L. J., Dobson, C. M. & Karplus, M. Understanding protein folding via free-energy surfaces from theory and experiment. Trends Biochem. Sci. 25, 331–339 (2000)

    Article  CAS  Google Scholar 

  4. Hidalgo, P. & MacKinnon, R. Revealing the architecture of a K+ channel pore through mutant cycles with a peptide inhibitor. Science 268, 307–310 (1995)

    Article  ADS  CAS  Google Scholar 

  5. Horovitz, A. & Fersht, A. R. Co-operative interactions during protein folding. J. Mol. Biol. 224, 733–740 (1992)

    Article  CAS  Google Scholar 

  6. Chen, J. & Stites, W. E. Higher-order packing interactions in triple and quadruple mutants of staphylococcal nuclease. Biochemistry 40, 14012–14019 (2001)

    Article  CAS  Google Scholar 

  7. LiCata, V. J. & Ackers, G. K. Long-range, small magnitude nonadditivity of mutational effects in proteins. Biochemistry 34, 3133–3139 (1995)

    Article  CAS  Google Scholar 

  8. Luque, I., Leavitt, S. A. & Freire, E. The linkage between protein folding and functional cooperativity: two sides of the same coin? Annu. Rev. Biophys. Biomol. Struct. 31, 235–256 (2002)

    Article  CAS  Google Scholar 

  9. Gerstein, M. & Chothia, C. Packing at the protein-water interface. Proc. Natl Acad. Sci. USA 93, 10167–10172 (1996)

    Article  ADS  CAS  Google Scholar 

  10. Richards, F. M. & Lim, W. A. An analysis of packing in the protein folding problem. Q. Rev. Biophys. 26, 423–498 (1993)

    Article  CAS  Google Scholar 

  11. Lichtarge, O., Bourne, H. R. & Cohen, F. E. An evolutionary trace method defines binding surfaces common to protein families. J. Mol. Biol. 257, 342–358 (1996)

    Article  CAS  Google Scholar 

  12. Lockless, S. W. & Ranganathan, R. Evolutionarily conserved pathways of energetic connectivity in protein families. Science 286, 295–299 (1999)

    Article  CAS  Google Scholar 

  13. Hatley, M. E., Lockless, S. W., Gibson, S. K., Gilman, A. G. & Ranganathan, R. Allosteric determinants in guanine nucleotide-binding proteins. Proc. Natl Acad. Sci. USA 100, 14445–14450 (2003)

    Article  ADS  CAS  Google Scholar 

  14. Shulman, A. I., Larson, C., Mangelsdorf, D. J. & Ranganathan, R. Structural determinants of allosteric ligand activation in RXR heterodimers. Cell 116, 417–429 (2004)

    Article  CAS  Google Scholar 

  15. Suel, G. M., Lockless, S. W., Wall, M. A. & Ranganathan, R. Evolutionarily conserved networks of residues mediate allosteric communication in proteins. Nature Struct. Biol. 10, 59–69 (2003)

    Article  Google Scholar 

  16. Peterson, F. C., Penkert, R. R., Volkman, B. F. & Prehoda, K. E. Cdc42 regulates the Par-6 PDZ domain through an allosteric CRIB-PDZ transition. Mol. Cell 13, 665–676 (2004)

    Article  CAS  Google Scholar 

  17. Fuentes, E. J., Der, C. J. & Lee, A. L. Ligand-dependent dynamics and intramolecular signalling in a PDZ domain. J. Mol. Biol. 335, 1105–1115 (2004)

    Article  CAS  Google Scholar 

  18. Estabrook, R. A. et al. Statistical coevolution analysis and molecular dynamics: identification of amino acid pairs essential for catalysis. Proc. Natl Acad. Sci. USA 102, 994–999 (2005)

    Article  ADS  CAS  Google Scholar 

  19. Ota, N. & Agard, D. A. Intramolecular signalling pathways revealed by modeling anisotropic thermal diffusion. J. Mol. Biol. 351, 345–354 (2005)

    Article  CAS  Google Scholar 

  20. Fodor, A. A. & Aldrich, R. W. Influence of conservation on calculations of amino acid covariance in multiple sequence alignments. Proteins 56, 211–221 (2004)

    Article  CAS  Google Scholar 

  21. Fodor, A. A. & Aldrich, R. W. On evolutionary conservation of thermodynamic coupling in proteins. J. Biol. Chem. 279, 19046–19050 (2004)

    Article  CAS  Google Scholar 

  22. Russ, W. P., Lowery, D. M., Mishra, P., Yaffe, M. B. & Ranganathan, R. Natural-like function in artificial WW domains. Nature doi:10.1038/nature03990 (this issue)

  23. Macias, M. J. et al. Structure of the WW domain of a kinase-associated protein complexed with a proline-rich peptide. Nature 382, 646–649 (1996)

    Article  ADS  CAS  Google Scholar 

  24. Bork, P. & Sudol, M. The WW domain: a signalling site in dystrophin? Trends Biochem. Sci. 19, 531–533 (1994)

    Article  CAS  Google Scholar 

  25. Jager, M., Nguyen, H., Crane, J. C., Kelly, J. W. & Gruebele, M. The folding mechanism of a β-sheet: the WW domain. J. Mol. Biol. 311, 373–393 (2001)

    Article  CAS  Google Scholar 

  26. Koepf, E. K., Petrassi, H. M., Sudol, M. & Kelly, J. W. WW: An isolated three-stranded antiparallel β-sheet domain that unfolds and refolds reversibly; evidence for a structured hydrophobic cluster in urea and GdnHCl and a disordered thermal unfolded state. Protein Sci. 8, 841–853 (1999)

    Article  CAS  Google Scholar 

  27. Bashford, D., Chothia, C. & Lesk, A. M. Determinants of a protein fold. Unique features of the globin amino acid sequences. J. Mol. Biol. 196, 199–216 (1987)

    Article  CAS  Google Scholar 

  28. Wintjens, R. et al. 1H NMR study on the binding of Pin1 Trp-Trp domain with phosphothreonine peptides. J. Biol. Chem. 276, 25150–25156 (2001)

    Article  CAS  Google Scholar 

  29. Kanelis, V., Rotin, D. & Forman-Kay, J. D. Solution structure of a Nedd4 WW domain-ENaC peptide complex. Nature Struct. Biol. 8, 407–412 (2001)

    Article  CAS  Google Scholar 

  30. Schreiber, G. & Fersht, A. R. Energetics of protein-protein interactions: analysis of the barnase-barstar interface by single mutations and double mutant cycles. J. Mol. Biol. 248, 478–486 (1995)

    CAS  PubMed  Google Scholar 

  31. Eriksson, A. E. et al. Response of a protein structure to cavity-creating mutations and its relation to the hydrophobic effect. Science 255, 178–183 (1992)

    Article  ADS  CAS  Google Scholar 

  32. Bowie, J. U., Reidhaar-Olson, J. F., Lim, W. A. & Sauer, R. T. Deciphering the message in protein sequences: tolerance to amino acid substitutions. Science 247, 1306–1310 (1990)

    Article  ADS  CAS  Google Scholar 

  33. Liang, J. & Dill, K. A. Are proteins well-packed? Biophys. J. 81, 751–766 (2001)

    Article  ADS  CAS  Google Scholar 

  34. Lindorff-Larsen, K., Best, R. B., Depristo, M. A., Dobson, C. M. & Vendruscolo, M. Simultaneous determination of protein structure and dynamics. Nature 433, 128–132 (2005)

    Article  ADS  CAS  Google Scholar 

  35. Benkovic, S. J. & Hammes-Schiffer, S. A perspective on enzyme catalysis. Science 301, 1196–1202 (2003)

    Article  ADS  CAS  Google Scholar 

  36. Eisenmesser, E. Z., Bosco, D. A., Akke, M. & Kern, D. Enzyme dynamics during catalysis. Science 295, 1520–1523 (2002)

    Article  ADS  CAS  Google Scholar 

  37. Frauenfelder, H., McMahon, B. H. & Fenimore, P. W. Myoglobin: the hydrogen atom of biology and a paradigm of complexity. Proc. Natl Acad. Sci. USA 100, 8615–8617 (2003)

    Article  ADS  CAS  Google Scholar 

  38. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997)

    Article  CAS  Google Scholar 

  39. Thompson, J. D., Higgins, D. G. & Gibson, T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994)

    Article  CAS  Google Scholar 

  40. Huang, X. et al. Structure of a WW domain containing fragment of dystrophin in complex with β-dystroglycan. Nature Struct. Biol. 7, 634–638 (2000)

    Article  CAS  Google Scholar 

  41. Kasanov, J., Pirozzi, G., Uveges, A. J. & Kay, B. K. Characterizing Class I WW domains defines key specificity determinants and generates mutant domains with novel specificities. Chem. Biol. 8, 231–241 (2001)

    Article  CAS  Google Scholar 

  42. John, D. M. & Weeks, K. M. van't Hoff enthalpies without baselines. Protein Sci. 9, 1416–1419 (2000)

    Article  CAS  Google Scholar 

  43. Sklenar, V., Piotto, M., Leppik, R. & Saudek, V. Gradient-tailored water suppression for 1H-15N HSQC experiments optimized to retain full sensitivity. J. Magn. Reson. A 102, 241–245 (1993)

    Article  ADS  CAS  Google Scholar 

  44. Delaglio, F. et al. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293 (1995)

    Article  CAS  Google Scholar 

  45. Johnson, B. A. & Blevins, R. A. NMRView: a computer program for the visualization and analysis of NMR data. J. Biomol. NMR 4, 603–614 (1994)

    Article  CAS  Google Scholar 

  46. Brünger, A. T. et al. Crystallography and NMR system (CNS): a new software system for macromolecular structure determination. Acta Crystallogr. D 54, 905–921 (1998)

    Article  Google Scholar 

  47. Linge, J. P., O'Donoghue, S. I. & Nilges, M. Methods in Enzymology 71–90 (Academic Press, 2001)

    Google Scholar 

  48. Cornilescu, G., Delaglio, F. & Bax, A. Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J. Biomol. NMR 13, 289–302 (1999)

    Article  CAS  Google Scholar 

  49. Laskowski, R. A., Rullman, J. A. C., MacArthur, M. W., Kaptein, R. & Thornton, J. M. AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR 8, 477–486 (1996)

    Article  CAS  Google Scholar 

  50. Koradi, R., Billeter, M. & Wüthrich, K. MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph. 14, 51–55 (1996)

    Article  CAS  Google Scholar 

  51. Delano, W. L. The PyMOL Molecular Graphics Systemhttp://www.pymol.org (2002).

Download references

Acknowledgements

We thank A. G. Gilman, M. Rosen, and members of the Ranganathan laboratory for advice and critical review of the manuscript, and E. Olson for contributing to this project. R.R. and S.W.L. thank the students of the 2001 Woods Hole physiology course for participating in an initial phase of this work. This study was supported by the Robert A. Welch foundation (R.R. and K.H.G.), the Mallinckrodt Foundation Scholar Award (R.R.), and the NIH (K.H.G.). W.P.R. is an associate and R.R. is an investigator of the Howard Hughes Medical Institute.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Rama Ranganathan.

Ethics declarations

Competing interests

Atomic coordinates for CC45 have been deposited in the Protein Data Bank under accession code 1YMZ. Reprints and permissions information is available at npg.nature.com/reprintsandpermissions. The authors declare no competing financial interests.

Supplementary information

Supplementary Notes

Supplementary Table S1, Supplementary Table S2, Supplementary Methods and Supplementary Figure Legends. (DOC 156 kb)

Supplementary Figure S1

Statistical coupling analysis for one site in the WW domain family. (PDF 578 kb)

Supplementary Figure S2

Summary of all sequences constructed and experimentally tested. (PDF 2191 kb)

Supplementary Figure S3

Assessment of expression and solubility of all sequences constructed. (PDF 12056 kb)

Supplementary Figure S4

Thermal denaturation and renaturation studies. (PDF 5249 kb)

Supplementary Figure S5

One-dimensional proton NMR spectra for natural WW domains (a), CC sequences (b), and IC sequences (c). (PDF 6444 kb)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Socolich, M., Lockless, S., Russ, W. et al. Evolutionary information for specifying a protein fold. Nature 437, 512–518 (2005). https://doi.org/10.1038/nature03991

Download citation

  • Received:

  • Accepted:

  • Issue Date:

  • DOI: https://doi.org/10.1038/nature03991

This article is cited by

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Search

Quick links

Nature Briefing

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

Get the most important science stories of the day, free in your inbox. Sign up for Nature Briefing