Abstract
In computational structural proteomics, contact prediction has opened new prospects for solving the longstanding problem of ab initio protein structure prediction. In the last few years, the application of deep learning algorithms and the availability of large protein sequence databases, combined with improved methods for deriving contacts from multiple sequence alignments, have greatly increased the precision of contact prediction. Predicted contacts have also been used to build three-dimensional models from scratch.
In this chapter, we briefly discuss many elements of protein residue–residue contacts and the methods available for predicting them, focusing on a state-of-the-art contact prediction tool, DNcon. Using a case study, we describe how DNcon can be used to make ab initio contact predictions for a given protein sequence, and we discuss how the predicted contacts may be analyzed and evaluated.
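As a rough illustration of what evaluating predicted contacts can involve, the sketch below derives native contacts from coordinates and scores a ranked prediction list by precision. It rests on assumptions that are not taken from this abstract: two residues are treated as in contact when their representative atoms lie within 8 Å, a threshold commonly used in the contact-prediction literature, and the quality measure is the precision of the k highest-scoring pairs. The function names, the toy coordinates, and the scores are hypothetical, and none of this is DNcon's interface or output format.

```python
import math

def native_contacts(coords, threshold=8.0, min_sep=3):
    """Return the set of residue pairs (i, j), i < j, that are in contact.

    coords    : list of (x, y, z) positions, one representative atom per residue
    threshold : distance cutoff in angstroms (assumed convention, 8 A)
    min_sep   : minimum sequence separation j - i for a pair to be considered
    """
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(coords[i], coords[j]) < threshold:
                contacts.add((i, j))
    return contacts

def precision_top_k(predicted, native, k):
    """Fraction of the k highest-scoring predicted pairs that are native contacts.

    predicted : list of (i, j, score) tuples
    native    : set of (i, j) pairs with i < j
    """
    ranked = sorted(predicted, key=lambda p: p[2], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for i, j, _ in ranked if (min(i, j), max(i, j)) in native)
    return hits / len(ranked)

# Toy hairpin: residues 0-3 run along one strand, residues 4-7 run back
# along a parallel strand 5 A away (hypothetical coordinates).
coords = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0),
    (11.4, 5.0, 0.0), (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0),
]
native = native_contacts(coords)

# Hypothetical ranked predictions as (i, j, confidence score).
predicted = [(0, 7, 0.9), (1, 6, 0.8), (3, 7, 0.7), (2, 5, 0.6), (0, 4, 0.2)]
top4_precision = precision_top_k(predicted, native, 4)
print(top4_precision)
```

In practice the top-k cutoff is often tied to the sequence length, and pairs are frequently grouped by sequence separation before scoring; both choices are left as parameters here rather than fixed.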
Copyright information
© 2016 Springer Science+Business Media New York
About this protocol
Cite this protocol
Adhikari, B., Cheng, J. (2016). Protein Residue Contacts and Prediction Methods. In: Carugo, O., Eisenhaber, F. (eds) Data Mining Techniques for the Life Sciences. Methods in Molecular Biology, vol 1415. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3572-7_24
DOI: https://doi.org/10.1007/978-1-4939-3572-7_24
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-3570-3
Online ISBN: 978-1-4939-3572-7
eBook Packages: Springer Protocols