Handling of incomplete data sets using ICA and SOM in data mining

  • Original Article
  • Neural Computing and Applications

Abstract

Based on independent component analysis (ICA) and self-organizing maps (SOM), this paper proposes an ISOM-DH model for handling incomplete data in data mining. When the data are dependent and non-Gaussian, the model makes full use of the information in the available data to estimate the missing values, and it can visualize the resulting high-dimensional data. Compared with the mixture of principal component analyzers (MPCA), the mean method, and the standard SOM-based fuzzy map model, the ISOM-DH model can be applied to more cases, which demonstrates its advantage. The correctness and reasonableness of the ISOM-DH model are also validated by the experiment reported in this paper.
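The ISOM-DH algorithm itself is not given in this preview, so the following is only a rough sketch of two of the ideas the abstract names: the mean method as a baseline, and a generic SOM-style estimate in which the best-matching unit is found from the observed components and its weights fill the gaps. It is not the authors' method and has no ICA step. All names (`mean_impute`, `train_som`, `som_impute`), the toy data, the 1-D map with four units, and the learning-rate and radius schedules are assumptions made for illustration.

```python
import math
import random


def mean_impute(rows):
    """Baseline 'mean method': replace each None with its column mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def masked_sq_dist(row, weight):
    """Squared Euclidean distance over the observed components only."""
    return sum((x - w) ** 2 for x, w in zip(row, weight) if x is not None)


def train_som(rows, n_units=4, epochs=50, seed=0):
    """Train a tiny 1-D SOM; missing components are skipped in each update."""
    rng = random.Random(seed)
    start = mean_impute(rows)  # only used to pick initial weight vectors
    weights = [list(rng.choice(start)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = 0.5 * frac                          # assumed linear decay
        radius = max(n_units / 2 * frac, 0.5)    # assumed shrinking radius
        for row in rows:
            bmu = min(range(n_units),
                      key=lambda k: masked_sq_dist(row, weights[k]))
            for k in range(n_units):
                h = math.exp(-((k - bmu) ** 2) / (2 * radius ** 2))
                for j, x in enumerate(row):
                    if x is not None:
                        weights[k][j] += lr * h * (x - weights[k][j])
    return weights


def som_impute(rows, weights):
    """Fill each None from the best-matching unit's weight vector."""
    filled = []
    for row in rows:
        bmu = min(weights, key=lambda w: masked_sq_dist(row, w))
        filled.append([w if x is None else x for x, w in zip(row, bmu)])
    return filled


# Toy data: two clusters and one record with a missing second component.
data = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
    [10.0, 10.1], [9.8, 10.0], [10.2, 9.9],
    [10.1, None],
]
by_mean = mean_impute(data)
by_som = som_impute(data, train_som(data))
print("mean method:", by_mean[-1])
print("SOM sketch: ", by_som[-1])
```

The mean method fills the gap with the global column mean regardless of which cluster the record belongs to, whereas a map-based estimate is drawn from a unit chosen using the record's observed components. How closely the second estimate tracks the local cluster depends on the assumed map size and training schedule.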

Author information

Correspondence to Hongyi Peng.

About this article

Cite this article

Peng, H., Zhu, S. Handling of incomplete data sets using ICA and SOM in data mining. Neural Comput & Applic 16, 167–172 (2007). https://doi.org/10.1007/s00521-006-0058-6

  • DOI: https://doi.org/10.1007/s00521-006-0058-6
