A novel hybrid feature selection method based on rough set and improved harmony search

  • Original Article
Neural Computing and Applications

Abstract

Feature selection is the process of selecting the subset of features that yields the most predictive outcome, and it is one of the essential steps in knowledge discovery. The difficulty is that not all features are important: many may be redundant, and others may be irrelevant or noisy. This paper presents a novel feature selection approach for dealing with high dimensionality in medical datasets. Medical datasets are typically characterized by a large number of measurements and a comparatively small number of patient records, and most of these measurements are irrelevant or noisy. The paper proposes a supervised feature selection method based on Rough Set Quick Reduct hybridized with the Improved Harmony Search algorithm. Rough set theory is one of the most successful methods used for feature selection. The proposed Rough Set Improved Harmony Search Quick Reduct (RS-IHS-QR) algorithm builds on harmony search, a relatively new population-based meta-heuristic optimization algorithm that imitates the music improvisation process, in which each musician adjusts their instrument’s pitch in search of a perfect state of harmony. The quality of the reduced data is measured by classification performance. The proposed algorithm is experimentally compared with the existing algorithms Rough Set Quick Reduct (RS-QR) and Rough Set Particle Swarm Optimization Quick Reduct (RS-PSO-QR). The number of features selected by the proposed method is comparatively low. The proposed algorithm achieves more than 90% classification accuracy in most cases, and the time taken to reduce the dataset is also lower than that of the existing methods. The experimental results demonstrate the efficiency and effectiveness of the proposed algorithm.
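The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of the two ingredients it names, not the authors' implementation. It assumes a discretized decision table, the standard rough set dependency degree (the fraction of records in the positive region), a greedy Quick Reduct, and a binary harmony search whose pitch adjustment rate grows linearly over the iterations, as in common improved harmony search variants. The function names, the parameters `hms`, `hmcr`, `par_min` and `par_max`, the lexicographic fitness, the seeding of the harmony memory with the full attribute set, and the toy data are all assumptions made for this example.

```python
import random
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Dependency degree: share of rows whose equivalence class has one label."""
    positive = sum(len(b) for b in partition(rows, attrs)
                   if len({labels[i] for i in b}) == 1)
    return positive / len(rows)

def quick_reduct(rows, labels):
    """Greedy Quick Reduct: add the attribute that raises dependency the most."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        best = max((a for a in all_attrs if a not in reduct),
                   key=lambda a: dependency(rows, labels, reduct + [a]))
        reduct.append(best)
    return reduct

def harmony_search_reduct(rows, labels, hms=10, hmcr=0.9,
                          par_min=0.1, par_max=0.5, iters=200, seed=0):
    """Binary harmony search over attribute subsets (illustrative variant)."""
    rng = random.Random(seed)
    n = len(rows[0])

    def fitness(bits):
        attrs = [a for a in range(n) if bits[a]]
        # Assumed fitness: dependency first, then fewer attributes.
        return (dependency(rows, labels, attrs), -len(attrs))

    # Harmony memory: random subsets plus the full attribute set, so the
    # best harmony always keeps the dependency of the complete table.
    memory = [[rng.randint(0, 1) for _ in range(n)] for _ in range(hms - 1)]
    memory.append([1] * n)

    for t in range(iters):
        # Pitch adjustment rate increases linearly with the iteration count.
        par = par_min + (par_max - par_min) * t / iters
        new = []
        for j in range(n):
            if rng.random() < hmcr:
                bit = rng.choice(memory)[j]      # memory consideration
                if rng.random() < par:
                    bit = 1 - bit                # pitch adjustment as a bit flip
            else:
                bit = rng.randint(0, 1)          # random selection
            new.append(bit)
        worst = min(memory, key=fitness)
        if fitness(new) > fitness(worst):
            memory[memory.index(worst)] = new    # replace the worst harmony

    best = max(memory, key=fitness)
    return [a for a in range(n) if best[a]]

# Toy table: the label is (attribute 0 AND attribute 1); attribute 2 is noise.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0),
        (0, 1, 0), (1, 0, 0), (1, 1, 1), (0, 0, 1)]
labels = [a and b for a, b, _ in rows]

print(quick_reduct(rows, labels))            # [0, 1]
print(harmony_search_reduct(rows, labels))   # a subset with full dependency
```

On this toy table the greedy pass needs both informative attributes, and the harmony search is guaranteed to return a subset with the same dependency as the full table because the full attribute set is kept in the initial memory. Whether it also reaches the smallest such subset depends on the random seed and the parameter settings.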

Corresponding author

Correspondence to Ahmad Taher Azar.

Cite this article

Inbarani, H.H., Bagyamathi, M. & Azar, A.T. A novel hybrid feature selection method based on rough set and improved harmony search. Neural Comput & Applic 26, 1859–1880 (2015). https://doi.org/10.1007/s00521-015-1840-0

  • DOI: https://doi.org/10.1007/s00521-015-1840-0
