Enhancing Feature Selection with Density Cluster for Better Clustering

  • Conference paper
Computational and Statistical Methods in Intelligent Systems (CoMeSySo 2018)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 859)

Abstract

Feature selection is an important data analysis technique that is used to reduce feature redundancy and to exploit hidden information in high-dimensional data. In this paper, we propose a similarity-metric-based feature selection method named Fesim. We use the Euclidean distance to measure the similarity among all features, and then apply the density-based DBSCAN algorithm to cluster features that appear to be related. Moreover, we present a strategy that accurately chooses representative features from each cluster. We conducted comprehensive experiments to evaluate the proposed approach, and the results on different datasets demonstrate its superiority.
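The pipeline outlined in the abstract (Euclidean distance between features, DBSCAN over the features, one representative per cluster) can be illustrated with a minimal pure-Python sketch. It is not the authors' Fesim implementation. The function names, the `eps` and `min_pts` values, the choice of the cluster medoid as representative, and the decision to keep noise features are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: returns one label per point, -1 meaning noise."""
    n = len(points)
    # Each neighbourhood includes the point itself.
    neighbours = [
        [j for j in range(n) if euclidean(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # core point: keep expanding
    return labels


def select_features(rows, eps, min_pts):
    """Cluster the columns of `rows` and keep one feature per cluster.

    Assumptions (not taken from the paper): the representative is the
    medoid of each cluster, and noise features are kept as they are.
    """
    features = [list(col) for col in zip(*rows)]  # one point per feature
    labels = dbscan(features, eps, min_pts)
    selected = [i for i, lab in enumerate(labels) if lab == -1]
    for c in sorted(set(labels) - {-1}):
        members = [i for i, lab in enumerate(labels) if lab == c]
        medoid = min(
            members,
            key=lambda i: sum(euclidean(features[i], features[j]) for j in members),
        )
        selected.append(medoid)
    return sorted(selected), labels


# Toy data: features 0, 1 and 2 are near-duplicates; 3 and 4 are unrelated.
rows = [
    [1.0, 1.1, 0.9, 5.0, 9.0],
    [2.0, 2.1, 1.9, 5.0, 1.0],
    [3.0, 3.1, 2.9, 5.0, 9.0],
    [4.0, 4.1, 3.9, 5.0, 1.0],
]
selected, labels = select_features(rows, eps=0.5, min_pts=2)
print(labels)    # [0, 0, 0, -1, -1]
print(selected)  # [0, 3, 4]
```

In this toy run the three redundant features collapse into one cluster and only feature 0 is retained from it. On real data the features would normally be standardised first so that the Euclidean distance is not dominated by scale, and `eps` would need tuning per dataset.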

Acknowledgements

This research is supported by the National Natural Science Foundation of China (Grant No. 61462012, No. 61562010, No. U1531246), the Innovation Team of the Data Analysis and Cloud Service of Guizhou Province (Grant No. [2015]53), and the Science and Technology Project of the Department of Science and Technology in Guizhou Province (Grant No. LH [2016]7427).

Author information

Corresponding author

Correspondence to Hui Li.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Chen, Y., Li, H., Chen, M., Dai, Z., Li, H., Zhu, M. (2019). Enhancing Feature Selection with Density Cluster for Better Clustering. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds) Computational and Statistical Methods in Intelligent Systems. CoMeSySo 2018. Advances in Intelligent Systems and Computing, vol 859. Springer, Cham. https://doi.org/10.1007/978-3-030-00211-4_15
