Optimized support vector regression predicting treatment duration among tuberculosis patients in Malaysia

Multimedia Tools and Applications

Abstract

Machine learning models have emerged as an advanced tool for predicting diseases and their outcomes. This study developed a machine learning model to predict treatment duration for Tuberculosis patients in Malaysia from a real-life patient dataset. Six regression models (Support Vector Regression, Linear Regression, Lasso Regression, Ridge Regression, Random Forest Regression, and Gradient Boosting Regression) were first developed and then optimized through hyperparameter tuning to determine the best predictive model. The models were built on a dataset of 435 Malaysian Tuberculosis patients, and the results were compared with those obtained on data from countries with high Tuberculosis prevalence rates, namely Belarus, Nigeria, and Georgia. Experiments showed that Support Vector Regression was the best-performing model, predicting treatment duration with the lowest error rates (Mean Absolute Error = 69.70; Root Mean Squared Error = 109.49). Eight significant risk factors were identified for the Malaysian dataset through Pearson correlation: treatment outcome, treatment status, fixed dose combination dosage, maintenance phase regimen, chest X-ray findings, tuberculin skin test, location of treatment initiation, and levofloxacin-based regimen. Comparison with data from the other countries confirmed the consistent performance of the optimized Support Vector Regression model in predicting Tuberculosis treatment duration, indicating that the model is generalizable. To the best of our knowledge, this is the first study to demonstrate the effectiveness of machine learning in predicting Tuberculosis treatment duration based on potential risk factors. These findings will help clinicians make informed decisions about the optimal treatment duration, manage patients' expectations, and estimate the cost of Tuberculosis treatment.
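To make the quantities named in the abstract concrete, the following minimal sketch computes a Pearson correlation screen and the two reported error metrics (Mean Absolute Error and Root Mean Squared Error) in plain Python. The feature names, the sample values, and the 0.5 correlation threshold are illustrative assumptions and are not taken from the study; the sketch is not the authors' implementation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference; penalizes large errors more."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical treatment durations (illustrative values only).
actual = [180, 200, 270, 365]
predicted = [190, 180, 300, 325]

# Hypothetical numeric features for the same four patients.
features = {
    "feature_a": [1, 2, 3, 4],    # rises with duration
    "feature_b": [1, -1, -1, 1],  # weakly related to duration
}

# Keep features whose absolute correlation with the target exceeds an
# assumed threshold of 0.5 (the study's actual criterion is not stated here).
THRESHOLD = 0.5
selected = [name for name, values in features.items()
            if abs(pearson_r(values, actual)) > THRESHOLD]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(selected, mae, round(rmse, 2))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which is consistent with the pair of values reported in the abstract.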

Data availability

The dataset (UM TB) generated and/or analysed during the current study is available from the corresponding author on reasonable request.

The TB Portal data are available at https://depot.tbportals.niaid.nih.gov/.

Notes

  1. https://scikit-learn.org/stable/modules/ensemble.html

  2. https://scikit-learn.org/stable/modules/linear_model.html

  3. https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html

  4. https://scikit-learn.org/stable/model_selection.html

  5. https://scikit-learn.org/stable/modules/model_evaluation.html

  6. https://depot.tbportals.niaid.nih.gov/
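The scikit-learn modules linked in these notes could be combined roughly as in the sketch below, which tunes a Support Vector Regression model and scores it with MAE and RMSE. This is a hedged illustration only: the data are synthetic (435 rows and 8 features are borrowed from the abstract purely as sizes), the grid search, the parameter grid, the scaling step, and the 80/20 split are all assumptions, and `sklearn.svm.SVR` is used because the abstract describes a regression task.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data; the real patient dataset is not public.
X, y = make_regression(n_samples=435, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale features before SVR, since the model is sensitive to feature ranges.
pipeline = make_pipeline(StandardScaler(), SVR())

# Assumed search space; the study's actual hyperparameters are not given here.
param_grid = {
    "svr__kernel": ["linear", "rbf"],
    "svr__C": [1.0, 10.0, 100.0],
    "svr__epsilon": [0.1, 1.0],
}

# Cross-validated grid search that selects the setting with the lowest MAE.
search = GridSearchCV(
    pipeline, param_grid, scoring="neg_mean_absolute_error", cv=5
)
search.fit(X_train, y_train)

# Evaluate the tuned model on the held-out split.
y_pred = search.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
print(search.best_params_, round(mae, 2), round(rmse, 2))
```

The same search could be repeated for the other five regressors named in the abstract by swapping the final pipeline step and its parameter grid, then comparing the held-out MAE and RMSE across models.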

Acknowledgements

The authors wish to extend their gratitude to Ng Kee Seong for assisting with technical editing of the manuscript.

Author information

Corresponding author

Correspondence to Vimala Balakrishnan.

Ethics declarations

Competing interests

No competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Balakrishnan, V., Ramanathan, G., Zhou, S. et al. Optimized support vector regression predicting treatment duration among tuberculosis patients in Malaysia. Multimed Tools Appl 83, 11831–11844 (2024). https://doi.org/10.1007/s11042-023-16028-y

  • DOI: https://doi.org/10.1007/s11042-023-16028-y
