Abstract
Computational prediction of protein structural class based solely on sequence data remains a challenging problem in protein science. Existing methods differ in the protein sequence representation models and prediction engines adopted. In this study, a powerful feature extraction method, which combines position-specific score matrix (PSSM) with auto covariance (AC) transformation, is introduced. Thus, a sample protein is represented by a series of discrete components, which could partially incorporate the long-range sequence order information and evolutionary information reflected from the PSI-BLAST profile. To verify the performance of our method, jackknife cross-validation tests are performed on four widely used benchmark datasets. Comparison of our results with existing methods shows that our method provides the state-of-the-art performance for structural class prediction. A Web server that implements the proposed method is freely available at http://202.194.133.5/xinxi/AAC_PSSM_AC/index.htm.
Similar content being viewed by others
References
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25(17):3389–3402
Anand A, Pugalenthi G, Suganthan PN (2008) Predicting protein structural class by SVM with class-wise optimized features and decision probabilities. J Theor Biol 253(2):375–380
Cai YD, Feng KY, Lu WC, Chou KC (2006) Using LogitBoost classifier to predict protein structural classes. J Theor Biol 238(1):172–176
Cai YD, Liu XJ, Xu X, Zhou GP (2001) Support vector machines for predicting protein structural class. BMC Bioinformatics 2:3
Cai YD, Zhou GP (2000) Prediction of protein structural classes by neural network. Biochimie 82(8):783–785
Cao YF, Liu S, Zhang LD, Qin J, Wang J, Tang KX (2006) Prediction of protein structural class with Rough Sets. BMC Bioinformatics 7:20
Chang CC, Lin CJ (2001) LIBSVM: a library for support vector machines
Chen C, Tian YX, Zou XY, Cai PX, Mo JY (2006a) Using pseudo-amino acid composition and support vector machine to predict protein structural class. J Theor Biol 243(3):444–448
Chen C, Zhou X, Tian Y, Zou X, Cai P (2006b) Predicting protein structural class with pseudo-amino acid composition and support vector machine fusion network. Anal Biochem 357(1):116–121
Chen K, Kurgan LA, Ruan JS (2008) Prediction of protein structural class using novel evolutionary collocation-based sequence representation. J Comput Chem 29(10):1596–1604
Chen L, Lu L, Feng K, Li W, Song J, Zheng L, Yuan Y, Zeng Z, Lu W, Cai Y (2009) Multiple classifier integration for the prediction of protein structural classes. J Comput Chem 30(14):2248–2254
Chou KC (1999) A key driving force in determination of protein structural classes. Biochem Biophys Res Commun 264(1):216–224
Chou KC (2001) Prediction of protein cellular attributes using pseudo-amino acid composition. Proteins 43(3):246–255
Chou KC, Cai YD (2004) Predicting protein structural class by functional domain composition. Biochem Biophys Res Commun 321(4):1007–1009
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
Costantini S, Facchiano AM (2009) Prediction of the protein structural class by specific peptide frequencies. Biochimie 91(2):226–229
Dong QW, Zhou SG, Guan JH (2009) A new taxonomy-based protein fold recognition approach based on autocross-covariance transformation. Bioinformatics 25(20):2655–2662
Feng KY, Cai YD, Chou KC (2005) Boosting classifier for predicting protein domain structural class. Biochem Biophys Res Commun 334(1):213–217
Guo Y, Li M, Lu M, Wen Z, Huang Z (2006) Predicting G-protein coupled receptors-G-protein coupling specificity based on autocross-covariance transform. Proteins 65(1):55–60
Guo YZ, Yu LZ, Wen ZN, Li ML (2008) Using support vector machine combined with auto covariance to predict protein–protein interactions from protein sequences. Nucleic Acids Res 36(9):3025–3030
Kedarisetti KD, Kurgan L, Dick S (2006) Classifier ensembles for protein structural class prediction with varying homology. Biochem Biophys Res Commun 348(3):981–988
Kurgan L, Chen K (2007) Prediction of protein structural class for the twilight zone sequences. Biochem Biophys Res Commun 357(2):453–460
Kurgan L, Cios K, Chen K (2008a) SCPRED: accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences. BMC Bioinformatics 9:226
Kurgan LA, Homaeian L (2006) Prediction of structural classes for protein sequences and domains—impact of prediction algorithms, sequence representation and homology, and test procedures on accuracy. Pattern Recogn 39(12):2323–2343
Kurgan LA, Zhang T, Zhang H, Shen SY, Ruan JS (2008b) Secondary structure-based assignment of the protein structural classes. Amino Acids 35(3):551–564
Levitt M, Chothia C (1976) Structural Patterns in Globular Proteins. Nature 261(5561):552–558
Li ZC, Zhou XB, Dai Z, Zou XY (2009) Prediction of protein structural classes by Chou’s pseudo amino acid composition: approached using continuous wavelet transform and principal component analysis. Amino Acids 37(2):415–425
Li ZC, Zhou XB, Lin YR, Zou XY (2008) Prediction of protein structure class by coupling improved genetic algorithm and support vector machine. Amino Acids 35(3):581–590
Liu T, Jia C (2010) A high-accuracy protein structural class prediction algorithm using predicted secondary structural information. J Theor Biol 267(3):272–275
Liu T, Zheng X, Wang J (2010) Prediction of protein structural class for low-similarity sequences using support vector machine and PSI-BLAST profile. Biochimie 92(10):1330–1334
Luo RY, Feng ZP, Liu JK (2002) Prediction of protein structural class by amino acid and polypeptide composition. Eur J Biochem 269(17):4219–4225
Mizianty MJ, Kurgan L (2009) Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences. BMC Bioinformatics 10:414
Murzin AG, Brenner SE, Hubbard T, Chothia C (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247(4):536–540
Nakashima H, Nishikawa K, Ooi T (1986) The folding type of a protein is relevant to the amino acid composition. J Biochem 99(1):153–162
Qiu JD, Luo SH, Huang JH, Liang RP (2009) Using support vector machines for prediction of protein structural classes based on discrete wavelet transform. J Comput Chem 30(8):1344–1350
Shen HB, Yang J, Liu XJ, Chou KC (2005) Using supervised fuzzy clustering to predict protein structural classes. Biochem Biophys Res Commun 334(2):577–581
Sun XD, Huang RB (2006) Prediction of protein structural classes using support vector machines. Amino Acids 30(4):469–475
Vapnik V (1995) The nature of statistical learning theory. Springer, New York
Vapnik V (1998) Statistical learning theory. Wiley, New York
Wang ZX, Yuan Z (2000) How good is prediction of protein structural class by the component-coupled method? Proteins 38(2):165–175
Wold S, Jonsson J, Sjostrom M, Sandberg M, Rannar S (1993) DNA and peptide sequences and chemical processes multivariately modeled by principal component analysis and partial least-squares projections to latent structures. Anal Chim Acta 277(2):239–253
Wu J, Li M, Yu L, Wang C (2010) An ensemble classifier of support vector machines used to predict protein structural classes by fusing auto covariance and pseudo-amino acid composition. Protein J 29(1):62–67
Xiao X, Lin WZ, Chou KC (2008) Using grey dynamic modeling and pseudo amino acid composition to predict protein structural classes. J Comput Chem 29(12):2018–2024
Yang JY, Peng ZL, Chen X (2010) Prediction of protein structural classes for low-homology sequences based on predicted secondary structure. BMC Bioinformatics 11 Suppl 1:S9
Yang JY, Peng ZL, Yu ZG, Zhang RJ, Anh V, Wang DS (2009) Prediction of protein structural classes by recurrence quantification analysis based on chaos game representation. J Theor Biol 257(4):618–626
Zhang TL, Ding YS (2007) Using pseudo amino acid composition and binary-tree support vector machines to predict protein structural classes. Amino Acids 33(4):623–629
Zhang TL, Ding YS, Chou KC (2008) Prediction protein structural classes with pseudo-amino acid composition: approximate entropy and hydrophobicity pattern. J Theor Biol 250(1):186–193
Zheng X, Li C, Wang J (2010) An information-theoretic approach to the prediction of protein structural class. J Comput Chem 31(6):1201–1206
Zhou GP (1998) An intriguing controversy over protein structural class prediction. J Protein Chem 17(8):729–738
Acknowledgments
This work was partially supported by the National Natural Science Foundation of China (No. 10731040), Shanghai Leading Academic Discipline Project (No. S30405), Innovation Program of Shanghai Municipal Education Commission (No. 09zz134), and the Science and Technology Project of Provincial Education Department of Shandong of China (No. J08LI09).
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Liu, T., Geng, X., Zheng, X. et al. Accurate prediction of protein structural class using auto covariance transformation of PSI-BLAST profiles. Amino Acids 42, 2243–2249 (2012). https://doi.org/10.1007/s00726-011-0964-5
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00726-011-0964-5