Using Tree-Based Gradient Boosting to Distinguish Between Lymphoma and COVID-19

  • Conference paper
Intelligent Sustainable Systems

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 579)

Abstract

Over 600,000 new lymphoma cases and around 280,000 lymphoma-related deaths were reported in 2020. Delayed diagnosis of lymphoma has long been a problem, and the COVID-19 pandemic, which disrupted healthcare services worldwide, may have lengthened these delays further. Because lymphomas can affect the lungs and sometimes present with symptoms resembling those of COVID-19, there is also a risk of misdiagnosis. We collected 505 lymphoma and 180 COVID-19 case reports from ScienceDirect and applied tree-based gradient boosting methods to classify each patient as having COVID-19 or lymphoma based on the patient’s age, gender and reported symptoms. LightGBM achieved the highest ROC AUC (0.89), indicating that it differentiated best between the two diseases among the models compared. This model could therefore serve as a screening tool to reduce delays in lymphoma diagnosis and improve patients’ chances of survival.
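
The abstract ranks models by ROC AUC. As a rough illustration of what that number measures, the sketch below computes ROC AUC as the fraction of (positive, negative) pairs that a model scores in the correct order. It is not the authors' code. The function name roc_auc, the choice of lymphoma as the positive class, and the toy labels and scores are assumptions made up for this example; none of them come from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (assumed here to be lymphoma), 0 otherwise.
    scores: model scores, where a higher score means "more likely positive".
    A tie between a positive and a negative counts as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one example of each class")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Made-up toy labels and scores, for illustration only (not data from the paper).
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 5 of the 6 pairs are ordered correctly
```

Read this way, a ROC AUC of 0.89 means that a randomly chosen case of one disease receives a higher score than a randomly chosen case of the other about 89% of the time, while 0.5 corresponds to chance-level ranking.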

Author information

Corresponding author

Correspondence to Moanda Diana Pholo.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Pholo, M.D., Hamam, Y., Khalaf, A., Tu, C. (2023). Using Tree-Based Gradient Boosting to Distinguish Between Lymphoma and COVID-19. In: Nagar, A.K., Singh Jat, D., Mishra, D.K., Joshi, A. (eds) Intelligent Sustainable Systems. Lecture Notes in Networks and Systems, vol 579. Springer, Singapore. https://doi.org/10.1007/978-981-19-7663-6_43
