Abstract
It is important to develop computational methods that can effectively resolve two intrinsic problems in microarray data: high dimensionality and small sample size. In this paper, we propose a self-supervised learning framework for classifying microarray gene expression data using the Kernel Discriminant-EM (KDEM) algorithm. This framework applies self-supervised learning techniques in an optimal nonlinear discriminating subspace. It uses a large set of unlabeled data to compensate for the insufficiency of a small set of labeled data, and it extends the linear algorithm in DEM to a kernel algorithm so that nonlinearly separable data can be handled in a lower dimensional space. Extensive experiments on Plasmodium falciparum expression profiles show promising performance for the approach.
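The idea of letting unlabeled data compensate for a small labeled set can be illustrated with a minimal sketch. It is not the authors' KDEM implementation: it assumes the samples have already been projected to a single discriminant coordinate (the kernel projection step is omitted), models each class as a 1-D Gaussian, and runs a plain EM loop in which labeled points keep fixed class memberships. All function names and the toy data are hypothetical.

```python
import math

def log_gaussian(x, mean, var):
    # Log density of a 1-D Gaussian.
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def e_step(x, params):
    # Posterior class probabilities for one point, computed in log space.
    logs = [math.log(p) + log_gaussian(x, m, v) for p, m, v in params]
    top = max(logs)
    weights = [math.exp(l - top) for l in logs]
    total = sum(weights)
    return [w / total for w in weights]

def m_step(xs, resp, n_classes, min_var):
    # Re-estimate (prior, mean, variance) per class from soft memberships.
    n = float(len(xs))
    params = []
    for k in range(n_classes):
        w = sum(r[k] for r in resp)
        mean = sum(r[k] * x for x, r in zip(xs, resp)) / w
        var = sum(r[k] * (x - mean) ** 2 for x, r in zip(xs, resp)) / w
        params.append((w / n, mean, max(var, min_var)))
    return params

def semi_supervised_em(labeled, unlabeled, n_classes=2, n_iter=30, min_var=1e-3):
    # Assumption: every class has at least one labeled example.
    lab_x = [x for x, _ in labeled]
    lab_resp = [[1.0 if c == k else 0.0 for k in range(n_classes)]
                for _, c in labeled]
    # Initialise the class models from the labeled points only.
    params = m_step(lab_x, lab_resp, n_classes, min_var)
    xs = lab_x + list(unlabeled)
    for _ in range(n_iter):
        # Labeled memberships stay fixed; unlabeled ones are re-estimated.
        unl_resp = [e_step(x, params) for x in unlabeled]
        params = m_step(xs, lab_resp + unl_resp, n_classes, min_var)
    return params

def classify(x, params):
    resp = e_step(x, params)
    return max(range(len(resp)), key=resp.__getitem__)

# Toy data: four labeled and eight unlabeled projected samples.
labeled = [(-2.1, 0), (-1.8, 0), (1.9, 1), (2.2, 1)]
unlabeled = [-2.5, -2.0, -1.6, -1.2, 1.4, 1.7, 2.0, 2.6]
params = semi_supervised_em(labeled, unlabeled)
print(classify(-1.5, params), classify(1.5, params))
```

In this sketch the unlabeled points widen the class variances estimated from the four labeled points, which is the sense in which they compensate for the small labeled set. A fuller version would recompute the discriminant projection inside the loop.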
Keywords
- Plasmodium Falciparum
- Unlabeled Data
- Lower Dimensional Space
- Small Sample Size Problem
- Malaria Parasite Plasmodium Falciparum
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lu, Y., Tian, Q., Liu, F., Sanchez, M., Wang, Y. (2006). A Self-supervised Learning Framework for Classifying Microarray Gene Expression Data. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds) Computational Science – ICCS 2006. ICCS 2006. Lecture Notes in Computer Science, vol 3992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11758525_93
DOI: https://doi.org/10.1007/11758525_93
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34381-3
Online ISBN: 978-3-540-34382-0
eBook Packages: Computer Science, Computer Science (R0)