A Clustering Strategy-Based Evolutionary Algorithm for Feature Selection in Classification

  • Conference paper
Advances and Trends in Artificial Intelligence. Theory and Applications (IEA/AIE 2023)

Abstract

Feature selection is a data pre-processing technique that selects the most relevant subset of features from a larger set, with the goal of improving classification performance. Evolutionary algorithms are commonly proposed for feature selection problems, but they can suffer from reduced population diversity and decreasing crowding distance, which can lead to suboptimal results. In this study, we propose a new evolutionary algorithm for feature selection in classification, called the clustering strategy-based evolutionary algorithm (CEA). CEA uses a clustering mechanism to group individuals into different clusters, and the crossover operation is driven by parents from different clusters, which enhances the exploration ability of the algorithm and helps the population avoid becoming trapped in locally optimal regions of the solution space. The performance of CEA was evaluated on 13 classification datasets and compared with four mainstream evolutionary algorithms. The experimental results showed that CEA achieved better classification performance using similar or fewer features than the other algorithms.
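
The abstract does not specify how CEA clusters individuals or how its crossover operator works. As a rough illustration of the general idea only, the Python sketch below assumes that individuals are binary feature masks, that clustering is a simple k-modes-style loop under Hamming distance, and that crossover is uniform. These choices, and all function and parameter names, are hypothetical and are not taken from the paper.

```python
import random


def hamming(a, b):
    """Number of positions at which two binary feature masks differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_population(population, k, rng, iterations=10):
    """Assign each mask to one of k clusters (assumed k-modes-style procedure)."""
    # Initial centers are k individuals sampled from the population.
    centers = [list(ind) for ind in rng.sample(population, k)]
    labels = [0] * len(population)
    for _ in range(iterations):
        # Assignment step: nearest center by Hamming distance.
        labels = [
            min(range(k), key=lambda c: hamming(ind, centers[c]))
            for ind in population
        ]
        # Update step: each center becomes the bitwise majority of its members.
        for c in range(k):
            members = [ind for ind, lab in zip(population, labels) if lab == c]
            if members:
                centers[c] = [
                    int(2 * sum(bits) >= len(members)) for bits in zip(*members)
                ]
    return labels


def cross_cluster_crossover(population, labels, rng):
    """Mate each individual with a parent from a different cluster (uniform crossover)."""
    offspring = []
    for ind, lab in zip(population, labels):
        others = [p for p, other in zip(population, labels) if other != lab]
        # Fall back to the whole population if every individual shares one cluster.
        mate = rng.choice(others) if others else rng.choice(population)
        child = [a if rng.random() < 0.5 else b for a, b in zip(ind, mate)]
        offspring.append(child)
    return offspring


rng = random.Random(0)
# Ten random masks over eight hypothetical features (1 = feature selected).
population = [[rng.randint(0, 1) for _ in range(8)] for _ in range(10)]
labels = cluster_population(population, k=3, rng=rng)
children = cross_cluster_crossover(population, labels, rng)
```

Choosing mates from other clusters means each child mixes material from dissimilar regions of the search space, which is one plausible way to obtain the exploration benefit the abstract describes. A complete method would also need a fitness function (for example, classifier accuracy on the selected features) and a selection step, neither of which is sketched here.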

This research was partially supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI under Grant JP22H03643, Japan Science and Technology Agency (JST) Support for Pioneering Research Initiated by the Next Generation (SPRING) under Grant JPMJSP2145, JST through the Establishment of University Fellowships towards the Creation of Science Technology Innovation under Grant JPMJFS2115, and Natural Science Foundation of Jiangsu Province (No. BK20210605).

Author information

Corresponding author

Correspondence to Shangce Gao.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, B., Wang, Z., Lei, Z., Yu, J., Jin, T., Gao, S. (2023). A Clustering Strategy-Based Evolutionary Algorithm for Feature Selection in Classification. In: Fujita, H., Wang, Y., Xiao, Y., Moonis, A. (eds) Advances and Trends in Artificial Intelligence. Theory and Applications. IEA/AIE 2023. Lecture Notes in Computer Science, vol 13925. Springer, Cham. https://doi.org/10.1007/978-3-031-36819-6_5

  • DOI: https://doi.org/10.1007/978-3-031-36819-6_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36818-9

  • Online ISBN: 978-3-031-36819-6

  • eBook Packages: Computer Science, Computer Science (R0)
