Abstract
In this paper, we propose a genetic programming (GP) based approach to analyzing multiclass microarray datasets. A multiclass problem is first divided into a set of two-class problems. Instead of assigning a single tree to each two-class problem, we deploy a small-scale ensemble system containing a set of trees, which we call a sub-ensemble (SE). The SEs tackling the respective two-class problems are combined to form one individual of the GP, so that an individual can deal with a multiclass problem directly. In the experiments, the GP performs classification and feature selection at the same time. The results obtained on independent test sets show that our method is efficient at finding genes of great biological significance while also achieving high classification accuracy.
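The structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two-class problems are formed or how tree outputs are combined, so the sketch assumes a one-vs-rest decomposition, a positive tree output counting as a vote for the class, and a prediction rule that picks the class with the most votes. The names `Tree`, `SubEnsemble`, and `Individual` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# A "tree" is modelled as any function mapping an expression vector to a real score.
# In a real GP system this would be an evolved expression tree over gene values.
Tree = Callable[[Sequence[float]], float]


@dataclass
class SubEnsemble:
    """Hypothetical SE: a few trees that jointly answer one two-class question."""
    trees: List[Tree]

    def votes(self, x: Sequence[float]) -> int:
        # Assumption: a tree votes "sample belongs to this class" when its output is positive.
        return sum(1 for tree in self.trees if tree(x) > 0)


@dataclass
class Individual:
    """Hypothetical GP individual: one SE per class (one-vs-rest is assumed)."""
    sub_ensembles: List[SubEnsemble]

    def predict(self, x: Sequence[float]) -> int:
        # Assumption: choose the class whose SE collects the most positive votes;
        # ties go to the lowest class index.
        counts = [se.votes(x) for se in self.sub_ensembles]
        return max(range(len(counts)), key=counts.__getitem__)


# Toy individual for 3 classes over 3 "genes"; the trees are hand-written stand-ins.
individual = Individual([
    SubEnsemble([lambda x: x[0] - x[1], lambda x: x[0] - x[2], lambda x: x[0] - 1.0]),
    SubEnsemble([lambda x: x[1] - x[0], lambda x: x[1] - x[2], lambda x: x[1] - 1.0]),
    SubEnsemble([lambda x: x[2] - x[0], lambda x: x[2] - x[1], lambda x: x[2] - 1.0]),
])

sample = [0.2, 2.5, 0.7]
predicted_class = individual.predict(sample)
```

Because each tree reads only some gene values, the genes that appear in the trees of a well-performing individual form a selected feature subset, which is one way classification and feature selection can happen together.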
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xu, CG., Liu, KH. (2008). A GP Based Approach to the Classification of Multiclass Microarray Datasets. In: Huang, DS., Wunsch, D.C., Levine, D.S., Jo, KH. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence. ICIC 2008. Lecture Notes in Computer Science, vol 5227. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85984-0_42
DOI: https://doi.org/10.1007/978-3-540-85984-0_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85983-3
Online ISBN: 978-3-540-85984-0
eBook Packages: Computer Science, Computer Science (R0)