iAFP-Ense: An Ensemble Classifier for Identifying Antifreeze Protein by Incorporating Grey Model and PSSM into PseAAC

The Journal of Membrane Biology

Abstract

Antifreeze proteins (AFPs), also known as thermal hysteresis proteins, are ice-binding proteins. They have been found in a wide range of organisms, including vertebrates, invertebrates, plants, bacteria, and fungi. Although AFPs share a common function, their sequences and structures are highly diverse. AFPs adsorb to the surface of ice crystals and inhibit crystal growth in solution, but the interaction between AFPs and ice is not yet fully understood. An automated, high-throughput tool for identifying AFPs in a timely manner is therefore highly desirable, and analyzing the physicochemical characteristics of AFP sequences is important for understanding the ice-protein interaction. In this manuscript, a predictor called “iAFP-Ense” was developed. Its prediction engine is an ensemble classifier that uses a voting system to fuse eleven different random forest classifiers built on extracted features. We also compared our predictor with AFP-PseAAC by tenfold cross-validation on the same benchmark dataset. The comparison with existing methods indicates that the new predictor is very promising, suggesting that it captures many important features that are deeply hidden in complicated protein sequences. The predictor is freely available at http://www.jci-bioinfo.cn/iAFP-Ense.
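The voting fusion mentioned in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names, the simple majority rule, and the threshold-based stand-in classifiers are hypothetical and are not the published implementation, which fuses eleven trained random forest classifiers.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical type: a base classifier maps a feature vector to a label
# (1 = predicted AFP, 0 = predicted non-AFP).
BaseClassifier = Callable[[Sequence[float]], int]


def majority_vote(votes: Sequence[int]) -> int:
    """Return the label that received the most votes.

    With an odd number of binary voters (such as eleven) a tie cannot occur.
    """
    return Counter(votes).most_common(1)[0][0]


def ensemble_predict(classifiers: List[BaseClassifier],
                     features: Sequence[float]) -> int:
    """Collect one vote per base classifier and fuse them by majority."""
    votes = [clf(features) for clf in classifiers]
    return majority_vote(votes)


def make_threshold_classifier(index: int, threshold: float) -> BaseClassifier:
    """Stand-in for a trained base classifier (assumption, not a random forest).

    It votes 1 when the chosen feature exceeds the threshold, else 0.
    """
    def classify(features: Sequence[float]) -> int:
        return 1 if features[index] > threshold else 0
    return classify


# Eleven stand-in voters with thresholds 1.0, 2.0, ..., 11.0.
classifiers = [make_threshold_classifier(i % 3, float(i + 1)) for i in range(11)]

if __name__ == "__main__":
    print(ensemble_predict(classifiers, [9.0, 9.0, 9.0]))
    print(ensemble_predict(classifiers, [5.0, 5.0, 5.0]))
```

In practice each stand-in would be replaced by a trained model's prediction function; the fusion step itself only needs the list of votes.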

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (No. 31260273, 61261027), the Jiangxi Provincial Foreign Scientific and Technological Cooperation Project (No. 20120BDH80023), the Natural Science Foundation of Jiangxi Province, China (No. 20114BAB211013, 20122BAB211033, 20122BAB201044, 20122BAB201020), the Department of Education of Jiangxi Province (GJJ12490, GJJ4642, GJJ14651, GJJ14640), the LuoDi plan of the Department of Education of Jiangxi Province (KJLD12083), and the Jiangxi Provincial Foundation for Leaders of Disciplines in Science (20113BCB22008).

Author information

Corresponding author

Correspondence to Xuan Xiao.

Cite this article

Xiao, X., Hui, M. & Liu, Z. iAFP-Ense: An Ensemble Classifier for Identifying Antifreeze Protein by Incorporating Grey Model and PSSM into PseAAC. J Membrane Biol 249, 845–854 (2016). https://doi.org/10.1007/s00232-016-9935-9

  • DOI: https://doi.org/10.1007/s00232-016-9935-9
