Abstract
Protein fold prediction is considered one of the most challenging tasks in molecular biology and one of the major unsolved problems in science. Recently, a variety of classification approaches have been proposed to solve it. In this study, we propose a fusion of three heterogeneous meta classifiers: LogitBoost, Random Forest, and Rotation Forest. The proposed approach aims to enhance protein fold prediction accuracy by enforcing diversity among its individual members through the use of diverse and accurate base classifiers. The employed classifiers are combined using five different algebraic combiners (combination policies): Majority Voting, Maximum of Probability, Minimum of Probability, Product of Probability, and Average of Probability. Our experimental results show that, on Ding and Dubchak’s dataset with Dubchak et al.’s feature set, the proposed approach achieves higher protein fold prediction accuracy than previous works found in the literature.
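The five algebraic combiners named in the abstract can be illustrated with a minimal sketch. It assumes each classifier outputs a class-probability vector for a given sample. The function name `combine`, the rule labels, and the tie-breaking behaviour are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from math import prod


def combine(probabilities, rule):
    """Fuse per-classifier class-probability vectors for one sample.

    probabilities: list of rows, one row per classifier, one column per class.
    rule: 'majority', 'max', 'min', 'product', or 'average' (hypothetical labels).
    Returns the index of the predicted class.
    """
    n_classes = len(probabilities[0])

    if rule == "majority":
        # Each classifier votes for its own most probable class.
        votes = Counter(
            max(range(n_classes), key=row.__getitem__) for row in probabilities
        )
        # Assumption: ties are broken in favour of the lowest class index.
        best = max(votes.values())
        return min(c for c, v in votes.items() if v == best)

    reducers = {
        "max": max,
        "min": min,
        "product": prod,
        "average": lambda column: sum(column) / len(column),
    }
    reduce_column = reducers[rule]
    # Score each class by reducing its column of per-classifier probabilities,
    # then pick the class with the highest fused score.
    scores = [
        reduce_column([row[c] for row in probabilities]) for c in range(n_classes)
    ]
    return max(range(n_classes), key=scores.__getitem__)


# Made-up outputs of three classifiers over three classes (not data from the paper).
example = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.2, 0.7],
]
for rule in ("majority", "max", "min", "product", "average"):
    print(rule, combine(example, rule))
```

On this toy input the rules disagree: one confident classifier dominates the maximum rule, while the minimum rule favours the class that no classifier rates very low. Differences of this kind are a reason to compare several combination policies.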
References
Shi, S.Y.M., Suganthan, P.N., Deb, K.: Multi class protein fold recognition using multi-objective evolutionary algorithms. In: IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, pp. 61–66 (2004)
Dehzangi, A., Khosravi, B.G.: Introducing Novel Physicochemical Based Features to Enhance Protein Fold Prediction Accuracy. In: Proceedings of the IEEE International Conference on Computer Design and Applications, pp. 592–596 (2010)
Ghanty, P., Pal, N.R.: Prediction of Protein Folds: Extraction of New Features, Dimensionality Reduction, and Fusion of Heterogeneous Classifiers. IEEE Transactions on Nanobioscience 8(1), 100–110 (2009)
Lin, K.L., Li, C.Y., Huang, C.D., Chang, H.M., Yang, C.Y., Lin, C.T., Tang, C.Y., Hsu, D.F.: Feature Selection and Combination Criteria for Improving Accuracy in Protein Structure Prediction. IEEE Transactions on Nanobioscience 6(2), 186–196 (2008)
Shen, H.B., Chou, K.C.: Ensemble Classifier for Protein Fold Pattern Recognition. Bioinformatics 22(14), 1717–1722 (2006)
Dehzangi, A., Amnuaisuk, S.P., Ng, K.H.: Investigating the Influence of Combined Features to Classifiers’ Performance: A Comparison Study on a Protein Fold Prediction Problem. In: 6th IEEE International Conference on Information Technology in Asia, pp. 213–217 (2009)
Hashemi, H.B., Shakery, A., Naeini, M.P.: Protein Fold Pattern Recognition Using Bayesian Ensemble of RBF Neural Networks. In: International Conference of Soft Computing and Pattern Recognition SOCPAR, pp. 436–441 (2009)
Kecman, V., Yang, T.: Protein Fold Recognition with Adaptive Local Hyperplane Algorithm. In: 6th Annual IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, Nashville, Tennessee, USA, pp. 75–78 (2009)
Nanni, L.: Ensemble of classifiers for protein fold recognition. In: New Issues in Neurocomputing: 13th European Symposium on Artificial Neural Networks, pp. 850–853 (2006)
Chen, Y., Zhang, X., Yang, M.Q., Yang, J.Y.: Ensemble of Probabilistic Neural Networks for Protein Fold Recognition. In: 7th IEEE International Conference on Bioinformatics and Bioengineering, pp. 66–70 (2007)
Dehzangi, A., Amnuaisuk, S.P., Dehzangi, O.: Using Random Forest for Protein Fold Prediction Problem: An Empirical Study. Journal of Information Science and Engineering 26(6) (2010)
Dehzangi, A., Amnuaisuk, S.P., Manafi, M., Safa, S.: Using rotation forest for protein fold prediction problem: An empirical study. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds.) EvoBIO 2010. LNCS, vol. 6023, pp. 217–227. Springer, Heidelberg (2010)
Krishnaraj, Y., Reddy, C.K.: Boosting methods for Protein Fold Recognition: An Empirical Comparison. In: IEEE International Conference on Bioinformatics and Biomedicine, pp. 393–396 (2008)
Lampros, C., Papaloukas, C., Exarchos, K., Fotiadis, D.I., Tsalikakis, D.: Improving the protein fold recognition accuracy of a reduced state-space hidden Markov model. Computers in Biology and Medicine 39(10), 907–914 (2009)
Bologna, G., Appel, R.D.: A comparison study on protein fold recognition. In: Proceedings of the Ninth International Conference on Neural Information Processing, pp. 2492–2496 (2002)
Ding, C., Dubchak, I.: Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics 17(4), 349–358 (2001)
Dubchak, I., Muchnik, I., Kim, S.K.: Protein folding class predictor for SCOP: approach based on global descriptors. In: Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology, pp. 104–107 (1997)
Dietterich, T.G.: Ensemble methods in machine learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 1–15. Springer, Heidelberg (2000)
Breiman, L.: Random Forests. Machine Learning 45(1), 5–32 (2001)
Breiman, L.: Bagging Predictors. Machine Learning 24, 123–140 (1996)
Livingston, F.: Implementation of Breiman’s Random Forest Machine Learning Algorithm, Machine Learning. ECE591Q (2005)
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA Data Mining Software: An Update. SIGKDD Explorations 11(1) (2009)
Rodríguez, J.J., Kuncheva, L.I., Alonso, C.J.: Rotation forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(10), 1619–1630 (2006)
Kuncheva, L.I., Rodríguez, J.J.: An experimental study on rotation forest ensembles. In: Haindl, M., Kittler, J., Roli, F. (eds.) MCS 2007. LNCS, vol. 4472, pp. 459–468. Springer, Heidelberg (2007)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
Duda, R.O., Hart, P.E., Stork, D.G.: Pattern classification. Wiley, New York (2001)
Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. Annals of Statistics 28(2), 337–407 (2000)
Freund, Y., Schapire, R.E.: A Short Introduction to Boosting. Journal of Japanese Society for Artificial Intelligence 14(5), 771–780 (1999)
Murzin, A.G., Brenner, S.E., Hubbard, T., Chothia, C.: SCOP: a structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology 247, 536–540 (1995)
Bauer, E., Kohavi, R.: An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants. Machine Learning 36, 105–139 (1999)
Damoulas, T., Girolami, M.A.: Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection. Bioinformatics 24, 1264–1270 (2008)
Bouchaffra, D., Tan, J.: Protein Fold Recognition using a Structural Hidden Markov Model. In: 18th International Conference on Pattern Recognition, pp. 186–189 (2006)
Shamim, M.T.A., Anwaruddin, M., Nagarajaram, H.A.: Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs. Bioinformatics 23(24), 3320–3327 (2007)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dehzangi, A., Foladizadeh, R.H., Aflaki, M., Karamizadeh, S. (2011). The Application of Fusion of Heterogeneous Meta Classifiers to Enhance Protein Fold Prediction Accuracy. In: Nguyen, N.T., Kim, CG., Janiak, A. (eds) Intelligent Information and Database Systems. ACIIDS 2011. Lecture Notes in Computer Science, vol 6591. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20039-7_54
DOI: https://doi.org/10.1007/978-3-642-20039-7_54
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20038-0
Online ISBN: 978-3-642-20039-7
eBook Packages: Computer Science, Computer Science (R0)