ABSTRACT
This paper describes an extension to the Restricted Growth Function grouping Genetic Algorithm, applied to the Consensus Clustering of a retinal nerve fibre layer data-set. Consensus Clustering is an optimisation-based method that combines the results of a number of data clustering methods, and is used when it is not known in advance which clustering method will perform best. Consensus Clustering has been shown to produce results that are better than the averaged results of the input methods, but it could benefit from a more efficient optimisation method. A Restricted Growth Function grouping Genetic Algorithm is a new method of grouping a number of objects into mutually exclusive subsets based upon a fitness function. This method does not suffer from degeneracy, and thus could be applied to the Consensus Clustering problem more efficiently than Simulated Annealing, the current optimisation method. This paper shows that this type of Genetic Algorithm can indeed improve the performance of Consensus Clustering, and that it can be improved further by taking advantage of some application-specific properties. These findings are demonstrated on a retinal nerve fibre layer data-set and on a synthetic data-set.
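To make the degeneracy point concrete, the following is a minimal sketch of the standard restricted growth function encoding of a partition; it is not the paper's implementation. In this encoding the first object gets label 0 and every later label is at most one more than the largest label seen so far, so each grouping has exactly one encoding. The function names `to_rgf` and `is_rgf` are hypothetical, chosen for illustration only.

```python
from typing import Hashable, List, Sequence


def to_rgf(labels: Sequence[Hashable]) -> List[int]:
    """Relabel clusters in order of first appearance (illustrative helper).

    Two label vectors that describe the same grouping map to the same
    restricted growth string, which removes label-permutation redundancy.
    """
    mapping = {}  # original label -> canonical label
    rgf = []
    for label in labels:
        if label not in mapping:
            # A previously unseen cluster gets the next unused integer.
            mapping[label] = len(mapping)
        rgf.append(mapping[label])
    return rgf


def is_rgf(seq: Sequence[int]) -> bool:
    """Check the restricted growth property: seq[0] == 0 and
    seq[i] <= max(seq[:i]) + 1 for every later position."""
    highest = -1
    for value in seq:
        if value < 0 or value > highest + 1:
            return False
        highest = max(highest, value)
    return True


if __name__ == "__main__":
    # Two different labellings of the same three-cluster grouping.
    a = [2, 2, 0, 1, 0]
    b = [1, 1, 2, 0, 2]
    print(to_rgf(a), to_rgf(b), to_rgf(a) == to_rgf(b))
```

Under a plain label-vector representation, `a` and `b` would count as distinct chromosomes even though they describe the same grouping; after canonicalisation they coincide, which is the sense in which a restricted growth representation avoids degeneracy.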