Abstract
In bioinformatics, biological databases are growing at an exponential rate, and traditional data analysis procedures fail to classify data at this scale. Many classification techniques based on data mining are currently used to classify biological data such as protein sequences. This paper reviews the most popular of these techniques, including neural network-based classifiers, fuzzy ARTMAP-based classifiers, and rough set classifiers, together with their limitations. Their accuracy levels and computational times are also analyzed. Finally, an idea is proposed that can increase the accuracy level with low computational overhead.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Saha, S., Bhattacharya, T. (2020). Protein Sequence Classification Involving Data Mining Technique: A Review. In: Elçi, A., Sa, P., Modi, C., Olague, G., Sahoo, M., Bakshi, S. (eds) Smart Computing Paradigms: New Progresses and Challenges. Advances in Intelligent Systems and Computing, vol 767. Springer, Singapore. https://doi.org/10.1007/978-981-13-9680-9_17
DOI: https://doi.org/10.1007/978-981-13-9680-9_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9679-3
Online ISBN: 978-981-13-9680-9
eBook Packages: Intelligent Technologies and Robotics (R0)