Abstract
Protein–protein interaction (PPI) site prediction helps identify the interface residues that participate in interaction processes. We propose a fuzzy support vector machine (F-SVM) as an effective method for this problem and show that the performance of the classical SVM can be enhanced with an interaction-affinity based fuzzy membership function. We evaluate both SVM and F-SVM on PPI databases of the Homo sapiens and E. coli organisms, and estimate the statistical significance of the developed method over the classical SVM and other fuzzy membership-based SVM methods available in the literature. Our membership function uses the residue-level interaction affinity scores for each pair of positive and negative sequence fragments. The average AUC scores in the 10-fold cross-validation experiments are 79.94% and 80.48% for Homo sapiens and E. coli, respectively. On the independent test datasets, the AUC scores are 76.59% and 80.17%, respectively. In almost all cases, the developed F-SVM method improves on the performance of the corresponding classical SVM and of the other classifiers available in the literature.
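The idea of weighting training samples by a fuzzy membership, and of scoring a classifier by AUC, can be illustrated with a minimal sketch. It assumes a common fuzzy SVM formulation in which each sample's hinge penalty is scaled by its membership s_i; whether the paper uses exactly this form is an assumption. The toy data, the membership values, and the function names (`fuzzy_hinge_objective`, `train_fuzzy_svm`, `auc`) are hypothetical and are not the paper's interaction-affinity scores, features, kernel, or solver.

```python
# Sketch only: a linear, membership-weighted SVM trained by subgradient descent,
# plus a rank-based AUC. All data and memberships below are made up for illustration.

def decision(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def fuzzy_hinge_objective(w, b, X, y, s, C=1.0):
    """0.5*||w||^2 + C * sum_i s_i * max(0, 1 - y_i * (w.x_i + b))."""
    reg = 0.5 * sum(wj * wj for wj in w)
    loss = sum(si * max(0.0, 1.0 - yi * decision(w, b, xi))
               for xi, yi, si in zip(X, y, s))
    return reg + C * loss

def train_fuzzy_svm(X, y, s, C=1.0, lr=0.01, epochs=500):
    """Batch subgradient descent on the membership-weighted objective."""
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0  # gradient of the regulariser is w itself
        for xi, yi, si in zip(X, y, s):
            if yi * decision(w, b, xi) < 1.0:  # sample violates the margin
                for j in range(dim):
                    grad_w[j] -= C * si * yi * xi[j]
                grad_b -= C * si * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [sc for sc, lab in zip(scores, labels) if lab == 1]
    neg = [sc for sc, lab in zip(scores, labels) if lab == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical 2-D samples: label +1 (interacting) and -1 (non-interacting).
X = [[2.0, 2.0], [3.0, 1.5], [2.5, 3.0], [-2.0, -1.0], [-1.5, -2.5], [-3.0, -2.0]]
y = [1, 1, 1, -1, -1, -1]
s = [1.0, 0.8, 0.6, 1.0, 0.7, 0.5]  # hypothetical memberships in (0, 1]

w, b = train_fuzzy_svm(X, y, s)
scores = [decision(w, b, xi) for xi in X]
print("training AUC:", auc(scores, y))
```

A sample with a low membership contributes less to the penalty term, so noisy or ambiguous fragments pull the decision boundary less than confidently labelled ones. In a 10-fold cross-validation setting, `auc` would be computed on each held-out fold and the ten values averaged.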
Acknowledgements
This project is partially supported by the CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India, PURSE project and FASTTRACK grant (SR/FTP/ETA-04/2012) of DST, India.
Additional information
[Sriwastava BK, Basu S and Maulik U 2015 Protein–Protein interaction site prediction in Homo sapiens and E. coli using an interaction-affinity based membership function in fuzzy SVM. J. Biosci.] DOI 10.1007/s12038-015-9564-y
Supplementary materials pertaining to this article are available on the Journal of Biosciences Website at http://www.ias.ac.in/jbiosci/oct2015/supp/Sriwastava.pdf
Cite this article
Sriwastava, B.K., Basu, S. & Maulik, U. Protein–Protein interaction site prediction in Homo sapiens and E. coli using an interaction-affinity based membership function in fuzzy SVM. J Biosci 40, 809–818 (2015). https://doi.org/10.1007/s12038-015-9564-y
DOI: https://doi.org/10.1007/s12038-015-9564-y