
A Machine Learning Framework for Feature Selection in Heart Disease Classification Using Improved Particle Swarm Optimization with Support Vector Machine Classifier

Programming and Computer Software

Abstract

Machine learning is an effective decision-support tool in health diagnosis, a domain that involves large volumes of data. Analyzing such volumes consumes considerable resources and execution time. In addition, not all features in a dataset contribute to solving the given problem. Hence, an effective feature selection algorithm is needed to find the features that contribute most to diagnosing disease. Particle Swarm Optimization (PSO) is a metaheuristic algorithm used to find the best solution in less time. PSO is now used not only to select the most significant features but also to remove irrelevant and redundant features from the dataset. However, the traditional PSO algorithm has difficulty selecting the optimal weight for updating the velocity and position of the particles. To overcome this issue, this paper presents a novel function for identifying optimal weights on the basis of a population diversity function and a tuning function. We also propose a novel fitness function for PSO with the help of a Support Vector Machine (SVM). The objective of the fitness function is to minimize the number of attributes and maximize accuracy. The performance of the proposed PSO-SVM is compared with various existing feature selection algorithms: Info gain, Chi-squared, One attribute based, Consistency subset, Relief, CFS, Filtered subset, Filtered attribute, Gain ratio, and the PSO algorithm. The SVM classifier is also compared with several classifiers, namely Naive Bayes, Random forest, and MLP.
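The abstract does not give the weight function or the fitness function in closed form, so the following is only a minimal sketch of the general wrapper idea: a binary PSO searches over feature masks, and each mask is scored by a weighted sum of classifier accuracy and the fraction of features removed. Everything specific here is an assumption made for illustration: the weighting parameter `alpha`, the linearly decreasing inertia weight (a common baseline standing in for the paper's diversity-based weight function), the sigmoid transfer rule, and `toy_accuracy`, a hypothetical placeholder for the cross-validated accuracy an SVM would return on the selected features.

```python
import math
import random


def fitness(mask, accuracy_fn, alpha=0.9):
    """Weighted sum of accuracy and feature reduction (alpha is an assumed trade-off)."""
    n_selected = sum(mask)
    if n_selected == 0:
        return 0.0  # an empty feature subset cannot be classified
    reduction = 1.0 - n_selected / len(mask)
    return alpha * accuracy_fn(mask) + (1.0 - alpha) * reduction


def inertia(t, t_max, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight: a baseline, not the paper's function."""
    return w_max - (w_max - w_min) * t / t_max


def binary_pso(n_features, accuracy_fn, n_particles=10, n_iter=30,
               c1=2.0, c2=2.0, v_max=4.0, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness()."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, accuracy_fn) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for t in range(n_iter):
        w = inertia(t, n_iter)
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))  # clamp velocity
                # Sigmoid transfer: velocity becomes the probability of bit = 1
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i], accuracy_fn)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def toy_accuracy(mask, informative=(0, 2, 5)):
    """Hypothetical stand-in for SVM accuracy: only some features help."""
    hits = sum(mask[i] for i in informative)
    return 0.5 + 0.5 * hits / len(informative)


best_mask, best_fit = binary_pso(8, toy_accuracy)
```

In a real experiment, `toy_accuracy` would be replaced by a function that trains an SVM on the columns selected by `mask` and returns its validation accuracy. The search loop itself would not need to change.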



Author information

Corresponding author

Correspondence to J. Vijayashree.

Additional information

The article is published in the original.

About this article

Cite this article

Vijayashree, J., Sultana, H.P. A Machine Learning Framework for Feature Selection in Heart Disease Classification Using Improved Particle Swarm Optimization with Support Vector Machine Classifier. Program Comput Soft 44, 388–397 (2018). https://doi.org/10.1134/S0361768818060129

  • DOI: https://doi.org/10.1134/S0361768818060129
