The Application of Fusion of Heterogeneous Meta Classifiers to Enhance Protein Fold Prediction Accuracy

Dehzangi, Abdollah; Foladizadeh, Roozbeh Hojabri; Aflaki, Mohammad; Karamizadeh, Sasan

doi:10.1007/978-3-642-20039-7_54

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6591)

Abstract

The protein fold prediction problem is considered one of the most challenging tasks in molecular biology and one of the biggest unsolved problems in science. Recently, a variety of classification approaches have been proposed to solve it. In this study, we propose a fusion of heterogeneous meta classifiers, namely LogitBoost, Random Forest, and Rotation Forest. The proposed approach aims to enhance protein fold prediction accuracy by enforcing diversity among its individual members through the use of diverse and accurate base classifiers. The employed classifiers are combined using five different algebraic combiners (combinational policies): Majority Voting, Maximum of Probability, Minimum of Probability, Product of Probability, and Average of Probability. Our experimental results show that, using Ding and Dubchak’s dataset and Dubchak et al.’s feature set, the proposed approach achieves higher protein fold prediction accuracy than previous works found in the literature.
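
The five algebraic combiners named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `combine`, the rule labels, the tie-breaking choice, and the probability values are all assumptions made for illustration. It assumes each meta classifier outputs one class-probability vector per sample.

```python
from collections import Counter
from math import prod


def combine(probs, rule):
    """Fuse per-classifier class-probability vectors for one sample.

    probs[k][c] is the probability that classifier k assigns to class c.
    Returns the index of the predicted class.
    """
    n_classes = len(probs[0])

    if rule == "majority":
        # Each classifier votes for its own most probable class.
        votes = Counter(max(range(n_classes), key=p.__getitem__) for p in probs)
        top = max(votes.values())
        # Ties go to the lowest class index (an arbitrary, assumed choice).
        return min(c for c, v in votes.items() if v == top)

    # The remaining rules reduce the probabilities of each class across
    # classifiers to one score, then pick the class with the highest score.
    reducers = {
        "max": max,
        "min": min,
        "product": prod,
        "average": lambda xs: sum(xs) / len(xs),
    }
    reduce_fn = reducers[rule]
    scores = [reduce_fn([p[c] for p in probs]) for c in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)


# Hypothetical outputs of three classifiers for one sample and three classes.
demo_probs = [
    [0.60, 0.30, 0.10],
    [0.10, 0.50, 0.40],
    [0.20, 0.45, 0.35],
]

for rule in ("majority", "max", "min", "product", "average"):
    print(rule, combine(demo_probs, rule))
```

With these invented numbers the maximum rule selects class 0 while the other four rules select class 1, which shows why the choice of combiner can change the fused prediction.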

References

Shi, S.Y.M., Suganthan, P.N., Deb, K.: Multi class protein fold recognition using multi-objective evolutionary algorithms. In: IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, pp. 61–66 (2004)
Dehzangi, A., Khosravi, B.G.: Introducing Novel Physicochemical Based Features to Enhance Protein Fold Prediction Accuracy. In: IEEE International Conference on Computer Design and Applications, pp. 592–596 (2010)
Ghanty, P., Pal, N.R.: Prediction of Protein Folds: Extraction of New Features, Dimensionality Reduction, and Fusion of Heterogeneous Classifiers. IEEE Transactions on Nanobioscience 8(1), 100–110 (2009)
Lin, K.L., Li, C.Y., Huang, C.D., Chang, H.M., Yang, C.Y., Lin, C.T., Tang, C.Y., Hsu, D.F.: Feature Selection and Combination Criteria for Improving Accuracy in Protein Structure Prediction. IEEE Transactions on Nanobioscience 6(2), 186–196 (2008)
Shen, H.B., Chou, K.C.: Ensemble Classifier for Protein Fold Pattern Recognition. Bioinformatics 22(14), 1717–1722 (2006)
Dehzangi, A., Amnuaisuk, S.P., Ng, K.H.: Investigating the Influence of Combined Features to Classifiers’ Performance: A Comparison Study on a Protein Fold Prediction Problem. In: 6th IEEE International Conference on Information Technology in Asia, pp. 213–217 (2009)
Hashemi, H.B., Shakery, A., Naeini, M.P.: Protein Fold Pattern Recognition Using Bayesian Ensemble of RBF Neural Networks. In: International Conference of Soft Computing and Pattern Recognition SOCPAR, pp. 436–441 (2009)
Kecman, V., Yang, T.: Protein Fold Recognition with Adaptive Local Hyperplane Algorithm. In: 6th Annual IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, Nashville, Tennessee, USA, pp. 75–78 (2009)
Nanni, L.: Ensemble of classifiers for protein fold recognition. In: New Issues in Neurocomputing: 13th European Symposium on Artificial Neural Networks, pp. 850–853 (2006)
Chen, Y., Zhang, X., Yang, M.Q., Yang, J.Y.: Ensemble of Probabilistic Neural Networks for Protein Fold Recognition. In: 7th IEEE International Conference on Bioinformatics and Bioengineering, pp. 66–70 (2007)
Dehzangi, A., Amnuaisuk, S.P., Dehzangi, O.: Using Random Forest for Protein Fold Prediction Problem: An Empirical Study. Journal of Information Science and Engineering 26(6) (2010)
Dehzangi, A., Amnuaisuk, S.P., Manafi, M., Safa, S.: Using rotation forest for protein fold prediction problem: An empirical study. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds.) EvoBIO 2010. LNCS, vol. 6023, pp. 217–227. Springer, Heidelberg (2010)
Krishnaraj, Y., Reddy, C.K.: Boosting methods for Protein Fold Recognition: An Empirical Comparison. In: IEEE International Conference on Bioinformatics and Biomedicine, pp. 393–396 (2008)
Lampros, C., Papaloukas, C., Exarchos, K., Fotiadis, D.I., Tsalikakis, D.: Improving the protein fold recognition accuracy of a reduced state-space hidden Markov model. Computers in Biology and Medicine 39(10), 907–914 (2009)
Bologna, G., Appel, R.D.: A comparison study on protein fold recognition. In: Proceedings of the Ninth International Conference on Neural Information Processing, pp. 2492–2496 (2002)
Ding, C., Dubchak, I.: Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics 17(4), 349–358 (2001)
Dubchak, I., Muchnik, I., Kim, S.K.: Protein folding class predictor for SCOP: approach based on global descriptors. In: Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology, pp. 104–107 (1997)
Dietterich, T.G.: Ensemble methods in machine learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 1–15. Springer, Heidelberg (2000)
Breiman, L.: Random Forests. Machine Learning 45(1), 5–32 (2001)
Breiman, L.: Bagging Predictors. Machine Learning 24, 123–140 (1996)
Livingston, F.: Implementation of Breiman’s Random Forest Machine Learning Algorithm, Machine Learning. ECE591Q (2005)
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA Data Mining Software: An Update. SIGKDD Explorations 11(1) (2009)
Rodríguez, J.J., Kuncheva, L.I., Alonso, C.J.: Rotation forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(10), 1619–1630 (2006)
Kuncheva, L.I., Rodríguez, J.J.: An experimental study on rotation forest ensembles. In: Haindl, M., Kittler, J., Roli, F. (eds.) MCS 2007. LNCS, vol. 4472, pp. 459–468. Springer, Heidelberg (2007)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
Duda, R.O., Hart, P.E., Stork, D.G.: Pattern classification. Wiley, New York (2001)
Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. Annals of Statistics 28(2), 337–407 (2000)
Freund, Y., Schapire, R.E.: A Short Introduction to Boosting. Journal of Japanese Society for Artificial Intelligence 14(5), 771–780 (1999)
Murzin, A.G., Brenner, S.E., Hubbard, T., Chothia, C.: SCOP: a structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology 247, 536–540 (1995)
Bauer, E., Kohavi, R.: An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants. Machine Learning 36, 105–139 (1999)
Damoulas, T., Girolami, M.A.: Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection. Bioinformatics 24, 1264–1270 (2008)
Bouchaffra, D., Tan, J.: Protein Fold Recognition using a Structural Hidden Markov Model. In: 18th International Conference on Pattern Recognition, pp. 186–189 (2006)
Shamim, M.T.A., Anwaruddin, M., Nagarajaram, H.A.: Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs. Bioinformatics 23(24), 3320–3327 (2007)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Faculty of Information Technology, Multi Media University, Cyberjaya, Selangor, Malaysia
Abdollah Dehzangi, Roozbeh Hojabri Foladizadeh & Sasan Karamizadeh
Faculty of Engineering, Multi Media University, Cyberjaya, Selangor, Malaysia
Mohammad Aflaki

Editor information

Editors and Affiliations

Wroclaw University of Technology, 50-370, Wroclaw, Poland
Ngoc Thanh Nguyen
Department of Computer Engineering, Yeungnam University, 712-749, Dae-Dong, Gyeungsan, Korea
Chong-Gun Kim
Institute of Informatics, Automation and Robotics, Wroclaw University of Technology, 50-370, Wrocław, Poland
Adam Janiak

About this paper

Cite this paper

Dehzangi, A., Foladizadeh, R.H., Aflaki, M., Karamizadeh, S. (2011). The Application of Fusion of Heterogeneous Meta Classifiers to Enhance Protein Fold Prediction Accuracy. In: Nguyen, N.T., Kim, CG., Janiak, A. (eds) Intelligent Information and Database Systems. ACIIDS 2011. Lecture Notes in Computer Science, vol 6591. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20039-7_54

DOI: https://doi.org/10.1007/978-3-642-20039-7_54
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20038-0
Online ISBN: 978-3-642-20039-7
eBook Packages: Computer Science, Computer Science (R0)
