Abstract
Identification and characterization of protein functional surfaces are important for predicting protein function, understanding enzyme mechanism, and docking small compounds to proteins. Because protein sequence information accumulates far faster than structural information, constructing accurate models of protein functional surfaces and identifying their key elements are becoming increasingly important. A promising approach is to build comparative models from sequences using known structural templates, such as those obtained from structural genome projects. Here we assess how well this approach works in modeling binding surfaces. By systematically building three-dimensional comparative models of proteins using Modeller, we determine how accurately functional surfaces can be reproduced. We use an alpha-shape-based pocket algorithm to compute all pockets on the modeled structures, and conduct a large-scale computation of similarity measurements (pocket RMSD and fraction of functional atoms captured) for 26,590 modeled enzyme protein structures. Overall, we find that when the sequence fragment of the binding surface has more than 45% identity to that of the template protein, the modeled surfaces have on average an RMSD of 0.5 Å and contain 48% or more of the binding surface atoms, with nearly all of the important atoms in the signatures of binding pockets captured.
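The two similarity measurements named above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the modeled and template pockets have already been superimposed (the superposition step is omitted), that atoms are matched by a hypothetical `(residue number, atom name)` key, and it uses made-up toy coordinates. The function names `pocket_rmsd` and `fraction_captured` are likewise illustrative.

```python
import math

def pocket_rmsd(model_xyz, template_xyz):
    """RMSD (in the input units, e.g. Å) over atoms found in both pockets.

    Assumption: both coordinate sets are already superimposed.
    """
    shared = sorted(set(model_xyz) & set(template_xyz))
    if not shared:
        raise ValueError("pockets share no atoms")
    squared = sum(math.dist(model_xyz[a], template_xyz[a]) ** 2 for a in shared)
    return math.sqrt(squared / len(shared))

def fraction_captured(model_atoms, reference_atoms):
    """Fraction of the reference pocket's atoms that also appear in the model pocket."""
    reference = set(reference_atoms)
    if not reference:
        raise ValueError("reference pocket is empty")
    return len(reference & set(model_atoms)) / len(reference)

# Toy data (hypothetical): keys are (residue number, atom name), values are x, y, z.
template = {
    (57, "NE2"): (0.0, 0.0, 0.0),
    (102, "OD1"): (3.0, 0.0, 0.0),
    (195, "OG"): (0.0, 4.0, 0.0),
    (193, "N"): (1.0, 1.0, 1.0),
}
model = {
    (57, "NE2"): (0.3, 0.0, 0.0),
    (102, "OD1"): (3.0, 0.4, 0.0),
    (195, "OG"): (0.0, 4.0, 0.0),
    (214, "O"): (5.0, 5.0, 5.0),  # model-only atom, ignored by both measures
}

print(f"pocket RMSD: {pocket_rmsd(model, template):.3f}")
print(f"fraction of template pocket atoms captured: {fraction_captured(model, template)}")
```

In this toy case three of the four template pocket atoms appear in the model pocket, so the captured fraction is 0.75, and the RMSD is computed over those three shared atoms only.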
Abbreviations
- PDB: Protein data bank
- EC: Enzyme commission
- CASTp: Computed atlas of surface topography of proteins
- pRMSD: Pocket root mean square deviation
- pvSOAR: Pocket and void surface patterns of amino acid residues
- SOLAR: Signature of local active regions
Acknowledgments
This work is supported by grants from NIH (GM079804, GM081682, GM086145, GM055876-13) and NSF (DMS-0800257).
Cite this article
Zhao, J., Dundas, J., Kachalo, S. et al. Accuracy of functional surfaces on comparatively modeled protein structures. J Struct Funct Genomics 12, 97–107 (2011). https://doi.org/10.1007/s10969-011-9109-z
DOI: https://doi.org/10.1007/s10969-011-9109-z