Protein Residue Contacts and Prediction Methods

Data Mining Techniques for the Life Sciences

Part of the book series: Methods in Molecular Biology (MIMB, volume 1415)

Abstract

In computational structural proteomics, contact prediction has opened new prospects for solving the longstanding problem of ab initio protein structure prediction. In the last few years, the application of deep learning algorithms and the availability of large protein sequence databases, combined with improved methods for deriving contacts from multiple sequence alignments, have substantially increased the precision of contact prediction. Predicted contacts have also been used to build three-dimensional models from scratch.

In this chapter, we briefly discuss several aspects of protein residue–residue contacts and the methods available for predicting them, focusing on a state-of-the-art contact prediction tool, DNcon. Using a case study, we describe how DNcon can be used to make ab initio contact predictions for a given protein sequence and discuss how the predicted contacts may be analyzed and evaluated.
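
As an illustration of what evaluating predicted contacts can involve, the sketch below computes the precision of the k most confident predictions against a set of known contacts. It is a minimal example under stated assumptions, not the chapter's protocol and not DNcon's output format: the function name top_k_precision, the (i, j, confidence) tuple layout, the toy data, and the default sequence-separation cutoff of 6 residues are all hypothetical choices made for illustration.

```python
from typing import Iterable, Set, Tuple

# Hypothetical layout: (residue i, residue j, predicted confidence)
PredictedContact = Tuple[int, int, float]


def top_k_precision(
    predicted: Iterable[PredictedContact],
    true_contacts: Set[Tuple[int, int]],
    k: int,
    min_separation: int = 6,  # assumed cutoff; adjust to the range of interest
) -> float:
    """Return the fraction of the k most confident predictions that are true."""
    # Drop pairs that are too close in sequence to be informative.
    filtered = [c for c in predicted if abs(c[0] - c[1]) >= min_separation]
    # Rank by confidence, highest first, and keep the top k.
    ranked = sorted(filtered, key=lambda c: c[2], reverse=True)[:k]
    if not ranked:
        return 0.0
    # Count predictions found in the true set; pairs are stored as (low, high).
    hits = sum(
        1 for i, j, _ in ranked if (min(i, j), max(i, j)) in true_contacts
    )
    return hits / len(ranked)


# Toy data, invented for this example only.
predicted = [(1, 10, 0.9), (2, 20, 0.8), (3, 5, 0.95), (4, 30, 0.4)]
true_contacts = {(1, 10), (4, 30)}

print(top_k_precision(predicted, true_contacts, k=2))
```

In practice k is often tied to the sequence length (for example a fraction of the number of residues), and the true contact set would be derived from an experimental structure using a chosen distance threshold; both choices are left to the caller here.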

Notes

  1. http://www.wwpdb.org/documentation/file-format

Author information

Corresponding author

Correspondence to Jianlin Cheng.

Copyright information

© 2016 Springer Science+Business Media New York

About this protocol

Cite this protocol

Adhikari, B., Cheng, J. (2016). Protein Residue Contacts and Prediction Methods. In: Carugo, O., Eisenhaber, F. (eds) Data Mining Techniques for the Life Sciences. Methods in Molecular Biology, vol 1415. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3572-7_24

  • DOI: https://doi.org/10.1007/978-1-4939-3572-7_24

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-3570-3

  • Online ISBN: 978-1-4939-3572-7

  • eBook Packages: Springer Protocols
