Using predicted shape string to enhance the accuracy of γ-turn prediction

  • Original Article
Amino Acids

Abstract

Numerous methods for predicting γ-turns in proteins have been developed, but their results are generally poor, with a Matthews correlation coefficient (MCC) ≤0.18. Here we attempt to improve the accuracy of γ-turn prediction. First, we use the geometric mean of sensitivity and specificity as the optimization criterion for evaluating support vector machine performance on the highly imbalanced γ-turn dataset. This metric rewards high sensitivity and high specificity while keeping the two balanced. Second, we designed a predictor that generates a protein shape string by structure alignment against the protein structure database, and we introduce the predicted shape string as a new variable for γ-turn prediction. On this basis, we developed a new method for γ-turn prediction. Trained and tested on the benchmark dataset of 320 non-homologous protein chains using fivefold cross-validation, the method achieves an overall prediction accuracy Q_total of 92.2% and an MCC of 0.38, both of which exceed the values reported for existing γ-turn prediction methods. Our results indicate that the protein shape string is useful for predicting protein tight turns and that it is reasonable to use dihedral angle information as a variable in machine learning to predict protein folding. The dataset used in this work and the software for generating predicted shape strings from the structure database are freely available from the anonymous FTP site ftp://cheminfo.tongji.edu.cn/GammaTurnPrediction/.
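
The abstract names the evaluation metrics but does not spell out their formulas. As a minimal sketch, assuming the standard confusion-matrix definitions of sensitivity, specificity, geometric mean, overall accuracy (Q_total) and MCC, the Python below shows why accuracy alone can mislead on an imbalanced dataset. The function name `confusion_metrics` and the residue counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Return standard binary-classification metrics from confusion counts.

    Positives are assumed to be gamma-turn residues, negatives non-turn residues.
    """
    sensitivity = tp / (tp + fn)          # fraction of true turns recovered
    specificity = tn / (tn + fp)          # fraction of non-turns recovered
    g_mean = math.sqrt(sensitivity * specificity)
    q_total = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
        "q_total": q_total,
        "mcc": mcc,
    }


# Hypothetical counts: 1000 residues, of which 50 are gamma-turn residues.
# A classifier that predicts "non-turn" everywhere scores high accuracy
# but a geometric mean of zero.
all_negative = confusion_metrics(tp=0, tn=950, fp=0, fn=50)

# A classifier that recovers 80% of each class has lower accuracy
# but a much higher geometric mean.
balanced = confusion_metrics(tp=40, tn=760, fp=190, fn=10)

for name, m in (("all-negative", all_negative), ("balanced", balanced)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

In this toy comparison the all-negative classifier wins on Q_total yet has zero sensitivity, which is the failure mode that selecting a model by geometric mean is meant to avoid.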

Acknowledgments

The authors would like to thank the National Natural Science Foundation of China (20675057, 20705024) for financial support.

Author information

Corresponding author

Correspondence to Tonghua Li.

About this article

Cite this article

Zhu, Y., Li, T., Li, D. et al. Using predicted shape string to enhance the accuracy of γ-turn prediction. Amino Acids 42, 1749–1755 (2012). https://doi.org/10.1007/s00726-011-0889-z

  • DOI: https://doi.org/10.1007/s00726-011-0889-z
