Classification of patients with chronic disease by activation level using machine learning methods

Health Care Management Science

Abstract

The Patient Activation Measure (PAM) measures the activation level of patients with chronic conditions and correlates well with patient adherence behavior, health outcomes, and healthcare costs. PAM is increasingly used in practice to identify patients who need more support from the care team. We define PAM levels 1 and 2 as low PAM and investigate the performance of eight machine learning methods (Logistic Regression, Lasso Regression, Ridge Regression, Random Forest, Gradient Boosted Trees, Support Vector Machines, Decision Trees, Neural Networks) in classifying patients as low PAM or not. Primary data collected from adult patients (n=431) with Diabetes Mellitus (DM) or Hypertension (HT) attending Family Health Centers in Istanbul, Turkey, are used to test the methods. In this dataset, \(44.5\%\) of patients have a low PAM level. Classification performance with several feature sets was analyzed to understand the relative importance of different types of information and to provide insights. The most important features were found to be whether the patient performs self-monitoring, smoking and exercise habits, education, and socio-economic status. The best performance was achieved by the Logistic Regression algorithm, which reached an Area Under the Curve (AUC) of 0.72 with the best-performing feature set. Alternative feature sets with similar prediction performance are also presented. Prediction performance was inferior with an automated feature selection method, which supports the importance of using domain knowledge in machine learning.
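As a minimal sketch of the classification target and the evaluation metric named above, the snippet below binarizes PAM levels (levels 1 and 2 become the low-PAM class) and computes AUC as the fraction of (low-PAM, not-low-PAM) patient pairs that a score ranks correctly. The function names, the toy levels, and the scores are hypothetical and are not taken from the study; a four-level PAM scale is assumed.

```python
def is_low_pam(level: int) -> int:
    """Return 1 for PAM levels 1-2 (low activation), 0 for levels 3-4."""
    if level not in (1, 2, 3, 4):
        raise ValueError("PAM level must be 1, 2, 3, or 4")
    return 1 if level <= 2 else 0


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs in which the positive
    case receives the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy values for illustration only (not the study data).
pam_levels = [1, 2, 3, 4, 2, 3]
# Hypothetical classifier scores for the "low PAM" class.
predicted_risk = [0.9, 0.4, 0.5, 0.1, 0.7, 0.3]

labels = [is_low_pam(level) for level in pam_levels]
print(labels, auc(labels, predicted_risk))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a reference scale for the reported value of 0.72.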

Notes

  1. See the NHS webpage https://www.england.nhs.uk/2014/05/patient-activation/ and the endorsement by the CMS: https://cmit.cms.gov/CMIT_public/ViewMeasure?MeasureId=3277

  2. See https://www.england.nhs.uk/wp-content/uploads/2018/04/patient-activation-measure-quick-guide.pdf

  3. After encoding, 33 unique features correspond to 87 encoded features.

  4. We excluded neural networks from the recursive feature selection in our study, which uses the RFECV algorithm from the sklearn library. RFECV requires the estimator to expose a ‘coef_’ or ‘feature_importances_’ attribute, which neural network implementations do not provide. Moreover, a core idea behind neural networks is that the algorithm learns feature importances in a black-box manner, without the explicit need for feature selection.

  5. We count each categorical feature in the original survey as one unique feature while the classification algorithms use each level as one feature after one-hot-encoding. For example, education is a unique categorical feature, and a university degree is one of the features generated from this unique feature.

  6. As all features (except for BMI and number of drugs) are binary variables, we do not standardize the coefficients.

  7. Note that the model presented in Table 8 uses dummy coding for the variables, dropping one category level for each categorical variable. The one-hot encoding procedure used in the cross-validation classification algorithm in the previous sections did not drop levels for categories with more than two levels. Thus, the number of features in Table 8 is smaller than the number of features in Fig. 6. Nevertheless, the main results are consistent with each other.

  8. Note that the formula used in this prediction is not exactly the same as that given above; the model in Table 15 is fitted to the whole data, while the figures below use a model fitted to the training set. However, to use this model for prediction in a new data set, the whole study data should be used, which results in the model given in Table 15.

  9. Based on data for the January-October period of 2017, the total number of emergency service examinations is 84,545,429, which is more than the total population of Turkey.
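Notes 3, 4, 5, and 7 can be illustrated with a minimal sketch. It is not the study's code: the toy survey columns, their levels, the label, and all variable names are invented for illustration, and the pandas function get_dummies stands in for whatever encoding routine was actually used. The sketch contrasts one-hot encoding (every level kept) with dummy coding (one level dropped per categorical variable), then runs sklearn's RFECV with a logistic regression, an estimator that exposes the coef_ attribute RFECV relies on.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

# Toy survey data, invented for illustration (not the study data).
rng = np.random.default_rng(0)
n = 200
survey = pd.DataFrame({
    "education": rng.choice(["primary", "secondary", "university"], size=n),
    "smoker": rng.choice(["no", "yes"], size=n),
})

# Hypothetical binary label (1 = low PAM), loosely tied to smoking.
p_low = np.where(survey["smoker"] == "yes", 0.7, 0.3)
y = (rng.random(n) < p_low).astype(int)

categorical = ["education", "smoker"]

# One-hot encoding keeps every level: 3 + 2 = 5 encoded features
# generated from 2 unique survey features.
one_hot = pd.get_dummies(survey, columns=categorical, dtype=float)

# Dummy coding drops one reference level per variable: 2 + 1 = 3 features.
dummy = pd.get_dummies(
    survey, columns=categorical, drop_first=True, dtype=float
)

print(one_hot.shape[1], dummy.shape[1])

# Recursive feature elimination with cross-validation. RFECV ranks features
# through the fitted estimator's coef_ (or feature_importances_) attribute,
# which LogisticRegression provides.
selector = RFECV(
    estimator=LogisticRegression(max_iter=1000),
    step=1,
    cv=5,
    scoring="roc_auc",
)
selector.fit(one_hot, y)

selected = list(one_hot.columns[selector.support_])
print(selector.n_features_, selected)
```

The gap between the two column counts is the reason a dummy-coded regression table lists fewer features than a one-hot encoded feature-importance plot built from the same survey.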

References

  1. Turkish Ministry of Health (2017) Summary statistics for emergency room use. https://khgmistatistikdb.saglik.gov.tr/Eklenti/23496/0/2017-ocak-ekim-donemi-acil-servis-verileri2pdf.pdf. Accessed: 2021-07-04

  2. Allen LN, Feigl AB (2017) What’s in a name? a call to reframe non-communicable diseases. Lancet Global Health 5(2):e129–e130

  3. Alpaydin E (2020) Introduction to Machine Learning (Adaptive Computation and Machine Learning Series), 4th edition. The MIT Press

  4. Awan SE, Sohel F, Sanfilippo FM, Bennamoun M, Dwivedi G (2018) Machine learning in heart failure: ready for prime time. Curr Opinion Cardiol 33(2):190–195

  5. Bonnell LN, Littenberg B, Wshah SR, Rose GL (2020) A machine learning approach to identification of unhealthy drinking. J Am Board Family Med 33(3):397–406

  6. Carman KL, Dardess P, Maurer M, Sofaer S, Adams K, Bechtel C, Sweeney J (2013) Patient and family engagement: a framework for understanding the elements and developing interventions and policies. Health Affairs 32(2):223–231

  7. Chew S, Brewster L, Tarrant C, Martin G, Armstrong N (2018) Fidelity or flexibility: an ethnographic study of the implementation and use of the patient activation measure. Patient Educ Counsel 101(5):932–937

  8. Dentzer S (2013) Rx for the ‘blockbuster drug’ of patient engagement

  9. Fernández A, García S, Galar M, Prati RC, Krawczyk B, Herrera F (2018) Learning from imbalanced data sets, vol 10. Springer

  10. Gao J, Arden M, Hoo ZH, Wildman M (2019) Understanding patient activation and adherence to nebuliser treatment in adults with cystic fibrosis: responses to the uk version of pam-13 and a think aloud study. BMC Health Serv Res 19(1):1–12

  11. Graffigna G, Barello S, Bonanomi A, Lozza E (2015) Measuring patient engagement: development and psychometric properties of the patient health engagement (phe) scale. Front Psychol 6:274

  12. Greene J, Hibbard JH (2012) Why does patient activation matter? an examination of the relationships between patient activation and health-related outcomes. J General Int Med 27(5):520–526

  13. Greene J, Hibbard JH, Sacks R, Overton V, Parrotta CD (2015) When patient activation levels change, health outcomes and costs change, too. Health Affairs 34(3):431–437 (PMID: 25732493)

  14. Grinsztajn L, Oyallon E, Varoquaux G (2022) Why do tree-based models still outperform deep learning on tabular data? arXiv:2207.08815

  15. Hibbard JH, Greene J (2013) What the evidence shows about patient activation: better health outcomes and care experiences; fewer data on costs. Health Affairs 32(2):207–214

  16. Hibbard JH, Greene J, Overton V (2013) Patients with lower activation associated with higher costs; delivery systems should know their patients’ ‘scores’. Health Affairs 32(2):216–222

  17. Hibbard JH, Greene J, Sacks R, Overton V, Parrotta CD (2016) Adding a measure of patient self-management capability to risk assessment can improve prediction of high costs. Health Affairs 35(3):489–494

  18. Hibbard JH, Mahoney ER, Stockard J, Tusler M (2005) Development and testing of a short form of the patient activation measure. Health Serv Res 40(6p1):1918–1930

  19. Hibbard JH, Stockard J, Mahoney ER, Tusler M (2004) Development of the patient activation measure (pam): conceptualizing and measuring activation in patients and consumers. Health Serv Res 39(4p1):1005–1026

  20. Holman H, Lorig K (2004) Patient self-management: a key to effectiveness and efficiency in care of chronic disease. Public Health Report 119(3):239–243

  21. Karanasiou GS, Tripoliti EE, Papadopoulos TG, Kalatzis FG, Goletsis Y, Naka KK, Bechlioulis A, Errachid A, Fotiadis DI (2016) Predicting adherence of patients with hf through machine learning techniques. Healthcare Technol Lett 3(3):165–170

  22. Kilic B, Kalaca S, Unal B, Phillimore P, Zaman S (2015) Health policy analysis for prevention and control of cardiovascular diseases and diabetes mellitus in turkey. Int J Public Health 60(1):47–53

  23. Koesmahargyo V, Abbas A, Zhang L, Guan L, Feng S, Yadav V, Galatzer-Levy IR (2020) Accuracy of machine learning-based prediction of medication adherence in clinical research. Psychiatry Res 294:113558

  24. Kosar C, Besen DB (2019) Adaptation of a patient activation measure (pam) into turkish: reliability and validity test. African Health Sci 19(1):1811–1820

  25. Lindsay A, Hibbard JH, Boothroyd DB, Glaseroff A, Asch SM (2018) Patient activation changes as a potential signal for changes in health care costs: cohort study of us high-cost patients. J General Int Med 33(12):2106–2112

  26. Ma Y, He H (2013) Imbalanced learning: foundations, algorithms, and applications

  27. McAllister M, Dunn G, Payne K, Davies L, Todd C (2012) Patient empowerment: the need to consider it as a measurable patient-reported outcome for chronic conditions. BMC Health Serv Res 12(1):1–8

  28. McBain H, Shipley M, Newman S (2015) The impact of self-monitoring in chronic illness on healthcare utilisation: a systematic review of reviews. BMC Health Serv Res 15(1):1–10

  29. Murdock RJ, Kauwe SK, Wang AY-T, Sparks TD (2020) Is domain knowledge necessary for machine learning materials properties? Integrat Mater Manufact Innovation 9(3):221–227

  30. Norman P, Bennett P (1996) Health locus of control. In Predicting health behaviour: research and practice with social cognition models, pages 62–94. Open University Press

  31. Obermeyer Z, Emanuel EJ (2016) Predicting the future-big data, machine learning, and clinical medicine. New England J Med 375(13):1216

  32. O’Malley D, Dewan AA, Ohman-Strickland PA, Gundersen DA, Miller SM, Hudson SV (2018) Determinants of patient activation in a community sample of breast and prostate cancer survivors. Psycho-oncology 27(1):132–140

  33. Osborne RH, Batterham RW, Elsworth GR, Hawkins M, Buchbinder R (2013) The grounded psychometric development and initial validation of the health literacy questionnaire (hlq). BMC Public Health 13(1):1–17

  34. Poitras M-E, Maltais M-E, Bestard-Denommé L, Stewart M, Fortin M (2018) What are the effective elements in patient-centered and multimorbidity care? a scoping review. BMC Health Serv Res 18(1):1–9

  35. Protheroe J, Rowlands G, Bartlam B, Levin-Zamir D (2017) Health literacy, diabetes prevention, and self-management. J Diabetes Res 2017:1298315

  36. Queenan C, Cameron K, Snell A, Smalley J, Joglekar N (2019) Patient heal thyself: reducing hospital readmissions with technology-enabled continuity of care and patient activation. Product Oper Manag 28(11):2841–2853

  37. Rao I, Shaham A, Yavneh A, Kahana D, Ashlagi I, Brandeau ML, Yamin D (2020) Predicting and improving patient-level antibiotic adherence. Health Care Manag Sci 1–13

  38. Remmers C, Hibbard J, Mosen DM, Wagenfield M, Hoye RE, Jones C (2009) Is patient activation associated with future health outcomes and healthcare utilization among patients with diabetes? J Ambulat Care Manag 32(4):320–327

  39. Sakarya S, Kulak E, Gorcin Karaketir S, Dogan E, Akman M, Cifcili S, Gunes E, Ormeci L (2019) Factors associated with patient activation in a turkish population with diabetes and/or hypertension. Eur J Public Health 29(Supplement_4):ckz186–225

  40. Schwarzer R, Fuchs R et al (1996) Self-efficacy and health behaviours. Predicting health behavior: Research and practice with social cognition models 163:196

  41. Shively MJ, Gardetto NJ, Kodiath MF, Kelly A, Smith TL, Stepnowsky C, Maynard C, Larson CB (2013) Effect of patient activation on self-management in patients with heart failure. J Cardiovasc Nurs 28(1):20–34

  42. Simmons LA, Wolever RQ, Bechard EM, Snyderman R (2014) Patient engagement as a risk factor in personalized health care: a systematic review of the literature on chronic disease. Genome Med 6(2):1–13

  43. Sinha AP, Zhao H (2008) Incorporating domain knowledge into data mining classifiers: An application in indirect lending. Decis Support Syst 46(1):287–299

  44. Smith SG, Curtis LM, Wardle J, von Wagner C, Wolf MS (2013) Skill set or mind set? associations between health literacy, patient activation and health. PLoS One 8(9):e74373

  45. Son Y-J, Kim H-G, Kim E-H, Choi S, Lee S-K (2010) Application of support vector machine for prediction of medication adherence in heart failure patients. Healthcare Inf Res 16(4):253–259

  46. Tripoliti EE, Papadopoulos TG, Karanasiou GS, Naka KK, Fotiadis DI (2017) Heart failure: Diagnosis, severity estimation and prediction of adverse events through machine learning techniques. Comput Struct Biotechnol J 15:26–47

  47. Wallert J, Gustafson E, Held C, Madison G, Norlund F, von Essen L, Olsson EMG (2018) Predicting adherence to internet-delivered psychotherapy for symptoms of depression and anxiety after myocardial infarction: Machine learning insights from the u-care heart randomized controlled trial. J Med Internet Res 20(10):e10754

  48. Wang H, Zheng B, Yoon SW, Ko HS (2018) A support vector machine-based ensemble algorithm for breast cancer diagnosis. Eur J Oper Res 267(2):687–699

  49. Weil AR (2016) The patient engagement imperative. Health Affairs 34(4)

  50. Wilcox AB, Hripcsak G (2003) The role of domain knowledge in automating medical text report classification. J Am Med Inf Assoc 10(4):330–338

  51. Wong ST, Peterson S, Black C (2011) Patient activation in primary healthcare: a comparison between healthier individuals and those with a chronic illness. Med Care 469–479

  52. Zhou M, Fukuoka Y, Goldberg K, Vittinghoff E, Aswani A (2019) Applying machine learning to predict future adherence to physical activity programs. BMC Med Inf Decision Making 19(1):1–11

Acknowledgements

We sincerely thank the reviewers for their constructive comments that significantly improved this paper. We are grateful to the AXA Research Fund for the financial support provided through the AXA Award granted to the second author.

Author information

Corresponding author

Correspondence to Evrim D. Gunes.

Ethics declarations

Conflict of Interest

The authors have no competing interests to declare that are relevant to the content of this article.

Ethical standard

The study was approved by the Ethics Committee of Marmara University. The data is available on request.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 PAM Scale Questions

The following questions are included in the PAM, developed by [18]. Due to the proprietary nature of the instrument, scoring scales are not included:

Table 13 Summary statistics for the features in the data set

  1. When all is said and done, I am the person who is responsible for managing my health condition.

  2. Taking an active role in my own health care is the most important factor in determining my health and ability to function.

  3. I am confident that I can take actions that will help prevent or minimize some symptoms or problems associated with my health condition.

  4. I know what each of my prescribed medications do.

  5. I am confident that I can tell when I need to go get medical care and when I can handle a health problem myself.

  6. I am confident I can tell my health care provider concerns I have even when he or she does not ask.

  7. I am confident that I can follow through on medical treatments I need to do at home.

  8. I understand the nature and causes of my health condition(s).

  9. I know the different medical treatment options available for my health condition.

  10. I have been able to maintain the lifestyle changes for my health that I have made.

  11. I know how to prevent further problems with my health condition.

  12. I am confident I can figure out solutions when new situations or problems arise with my health condition.

  13. I am confident that I can maintain lifestyle changes like diet and exercise even during times of stress.

1.2 Additional Tables

Table 14 Feature importances for the LR classifier for the feature sets Habits+Health+Demo and Habits+Health, averaged over fifty cross-validation folds
Table 15 Coefficients of the LR Model for the feature set Habits+Health
Table 16 Logistic Regression Model for Set Demo+Health+self-control
Table 17 Logistic Regression Model for Set Demo+Health+Habit+Health Service

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Demiray, O., Gunes, E.D., Kulak, E. et al. Classification of patients with chronic disease by activation level using machine learning methods. Health Care Manag Sci 26, 626–650 (2023). https://doi.org/10.1007/s10729-023-09653-4

  • DOI: https://doi.org/10.1007/s10729-023-09653-4
