Abstract
Based on independent component analysis (ICA) and self-organizing maps (SOM), this paper proposes the ISOM-DH model for handling incomplete data in data mining. When the data are dependent and non-Gaussian, the model can make full use of the information in the given data to estimate the missing values, and it can visualize the resulting high-dimensional data. Compared with the mixture of principal component analyzers (MPCA), the mean method, and the standard SOM-based fuzzy map model, the ISOM-DH model applies to a wider range of cases, which demonstrates its advantage. The correctness and reasonableness of the ISOM-DH model are also validated by the experiment reported in this paper.
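The abstract does not give the details of ISOM-DH, so the following is only a generic illustration of two ideas it names: the mean method as a baseline, and SOM-based estimation of missing values. It is a minimal sketch under stated assumptions, not the authors' model: the ICA step is omitted entirely, the map is one-dimensional, missing entries are encoded as `None`, and all function names and parameters (`mean_impute`, `train_som`, `som_impute`, `n_units`, `epochs`, `lr0`) are hypothetical.

```python
import math
import random


def mean_impute(rows):
    """Mean method: replace each None with the mean of its column's observed values.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def masked_sq_dist(row, unit):
    """Squared Euclidean distance computed over the observed components only."""
    return sum((v - w) ** 2 for v, w in zip(row, unit) if v is not None)


def train_som(rows, n_units=4, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM on incomplete rows (illustrative, not ISOM-DH).

    Units are initialised from mean-imputed rows. During training, the
    best-matching unit is found with the masked distance, and only the
    observed components of each row pull the unit weights.
    """
    rng = random.Random(seed)
    filled = mean_impute(rows)
    units = [list(rng.choice(filled)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                       # decaying learning rate
        radius = max(1.0, (n_units / 2) * (1.0 - frac))  # shrinking neighbourhood
        for row in rows:
            bmu = min(range(n_units), key=lambda k: masked_sq_dist(row, units[k]))
            for k in range(n_units):
                h = math.exp(-((k - bmu) ** 2) / (2 * radius ** 2))
                for j, v in enumerate(row):
                    if v is not None:
                        units[k][j] += lr * h * (v - units[k][j])
    return units


def som_impute(rows, units):
    """Fill each None with the matching component of the row's best-matching unit."""
    out = []
    for row in rows:
        bmu = min(units, key=lambda u: masked_sq_dist(row, u))
        out.append([bmu[j] if v is None else v for j, v in enumerate(row)])
    return out
```

Calling `som_impute(data, train_som(data))` on a list of rows leaves observed values untouched and fills each gap from the unit closest to the row on its observed components, whereas `mean_impute(data)` fills every gap in a column with the same value.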
Cite this article
Peng, H., Zhu, S. Handling of incomplete data sets using ICA and SOM in data mining. Neural Comput & Applic 16, 167–172 (2007). https://doi.org/10.1007/s00521-006-0058-6
DOI: https://doi.org/10.1007/s00521-006-0058-6