Abstract
DNA microarray technology has become the most extensively used functional genomics approach in bioinformatics after genome sequencing. Revealing the patterns concealed in gene expression data offers a valuable opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenge of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step in addressing this challenge is the use of clustering techniques. Many clustering methods have been devised and applied to microarray data, but less effort has gone into speeding up those methods algorithmically. This paper presents a quad tree based, high-speed, two dimensional hierarchical clustering method. In hierarchical clustering, constructing the closest pair data structure at each level is the main factor that determines the processing time. The proposed method uses a quad tree based data structure to find the closest pair of elements, which reduces the processing time and supports better analysis of gene expression data.
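The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea: a quad tree answers nearest-neighbour queries, and the closest pair found at each level is merged. It assumes each item has already been reduced to a 2D point, uses centroid linkage, and rebuilds the tree at every level; the names `QuadTree`, `closest_pair`, and `cluster` are hypothetical and are not taken from the paper.

```python
import math

class QuadTree:
    """Region quad tree over (x, y, id) points; a leaf holds at most `cap` points."""

    def __init__(self, x0, y0, x1, y1, cap=4, depth=0):
        self.box = (x0, y0, x1, y1)
        self.cap, self.depth = cap, depth
        self.points = []      # points stored while this node is a leaf
        self.children = None  # four sub-quadrants once the node has split

    def insert(self, p):
        if self.children is not None:
            self._child_for(p).insert(p)
            return
        self.points.append(p)
        # Split when over capacity; the depth limit guards against duplicate points.
        if len(self.points) > self.cap and self.depth < 32:
            x0, y0, x1, y1 = self.box
            mx, my = (x0 + x1) / 2, (y0 + y1) / 2
            a = (self.cap, self.depth + 1)
            self.children = [QuadTree(x0, y0, mx, my, *a), QuadTree(mx, y0, x1, my, *a),
                             QuadTree(x0, my, mx, y1, *a), QuadTree(mx, my, x1, y1, *a)]
            pts, self.points = self.points, []
            for old in pts:
                self._child_for(old).insert(old)

    def _child_for(self, p):
        x0, y0, x1, y1 = self.box
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        return self.children[(1 if p[0] >= mx else 0) + (2 if p[1] >= my else 0)]

    def _box_dist(self, q):
        # Smallest possible distance from q to any point inside this node's box.
        x0, y0, x1, y1 = self.box
        return math.hypot(max(x0 - q[0], 0.0, q[0] - x1), max(y0 - q[1], 0.0, q[1] - y1))

    def nearest(self, q, best=(math.inf, None)):
        """Return (distance, point) for the stored point nearest to q, ignoring q itself."""
        if self._box_dist(q) >= best[0]:
            return best  # nothing in this quadrant can beat the current best
        if self.children is None:
            for p in self.points:
                if p[2] != q[2]:
                    d = math.hypot(p[0] - q[0], p[1] - q[1])
                    if d < best[0]:
                        best = (d, p)
            return best
        for child in sorted(self.children, key=lambda c: c._box_dist(q)):
            best = child.nearest(q, best)
        return best

def closest_pair(points):
    """Build a quad tree over the points and return (distance, p, q) for the closest pair."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    tree = QuadTree(min(xs), min(ys), max(xs), max(ys))
    for p in points:
        tree.insert(p)
    best = (math.inf, None, None)
    for p in points:
        d, q = tree.nearest(p)
        if d < best[0]:
            best = (d, p, q)
    return best

def cluster(coords):
    """Agglomerate 2D points bottom-up; returns merges as (id_a, id_b, distance)."""
    points = {i: (x, y, i) for i, (x, y) in enumerate(coords)}
    size = {i: 1 for i in points}
    merges, next_id = [], len(coords)
    while len(points) > 1:
        d, a, b = closest_pair(list(points.values()))  # tree rebuilt at each level
        na, nb = size[a[2]], size[b[2]]
        # Assumption: centroid linkage, so a merged cluster is its weighted mean point.
        cx = (a[0] * na + b[0] * nb) / (na + nb)
        cy = (a[1] * na + b[1] * nb) / (na + nb)
        del points[a[2]], points[b[2]]
        points[next_id], size[next_id] = (cx, cy, next_id), na + nb
        merges.append((a[2], b[2], d))
        next_id += 1
    return merges
```

Calling `cluster([(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)])` returns four merges, with new clusters numbered from 5 onward. Rebuilding the tree at every level keeps the sketch short; an implementation aiming for speed would more plausibly update the tree in place after each merge.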
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Priscilla, R., Swamynathan, S. (2012). A High-Speed Two Dimensional Hierarchical Clustering of Microarray Gene Expression Data. In: Satapathy, S.C., Avadhani, P.S., Abraham, A. (eds) Proceedings of the International Conference on Information Systems Design and Intelligent Applications 2012 (INDIA 2012) held in Visakhapatnam, India, January 2012. Advances in Intelligent and Soft Computing, vol 132. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27443-5_62
DOI: https://doi.org/10.1007/978-3-642-27443-5_62
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27442-8
Online ISBN: 978-3-642-27443-5
eBook Packages: Engineering, Engineering (R0)