A Multi-label Classification Study for the Prediction of Long-COVID Syndrome

  • Conference paper
AIxIA 2023 – Advances in Artificial Intelligence (AIxIA 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14318)

Abstract

We present a study on the prediction of long-COVID sequelae through multi-label classification (MLC). Data on more than 300 patients were collected during a long-COVID study at Ospedale Maggiore of Novara (Italy), covering their baseline situation as well as their condition at acute COVID-19 onset. The goal is to predict the presence of specific long-COVID sequelae after a one-year follow-up. To make the analysis more representative, we carefully investigated the possibility of augmenting the dataset by considering situations in which different numbers of complications could arise. MLSmote has been applied under six different data augmentation policies, and a representative set of MLC approaches has been tested on all the available datasets. Results have been evaluated in terms of Accuracy, Exact match, Hamming Score and macro-averaged AUC; they show that MLC methods can be useful for the prediction of specific long-COVID sequelae under the different conditions represented by the considered datasets.
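The four evaluation measures named above can be illustrated with a minimal sketch. It assumes the definitions commonly used in the multi-label literature (Accuracy as the per-instance Jaccard index, Hamming Score as one minus the Hamming loss, Exact match as subset accuracy, macro-averaged AUC as the unweighted mean of per-label AUCs); the abstract does not spell out the definitions, so these are assumptions. The function names and the toy label matrices are hypothetical and are not taken from the study's data.

```python
from typing import List

Labels = List[List[int]]  # one row per patient, one 0/1 column per label


def exact_match(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of instances whose whole label vector is predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def hamming_score(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of individual label decisions that are correct (1 - Hamming loss)."""
    correct = sum(a == b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    return correct / (len(y_true) * len(y_true[0]))


def jaccard_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Per-instance |true AND pred| / |true OR pred|, averaged over instances."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        inter = sum(a and b for a, b in zip(t, p))
        union = sum(a or b for a, b in zip(t, p))
        # Assumption: an instance with no true and no predicted label counts as 1.0.
        total += inter / union if union else 1.0
    return total / len(y_true)


def macro_auc(y_true: Labels, y_score: List[List[float]]) -> float:
    """Unweighted mean over labels of the pairwise (Mann-Whitney) AUC."""
    aucs = []
    for j in range(len(y_true[0])):
        pos = [s[j] for t, s in zip(y_true, y_score) if t[j] == 1]
        neg = [s[j] for t, s in zip(y_true, y_score) if t[j] == 0]
        if not pos or not neg:
            continue  # AUC is undefined for a label with only one class present
        # Count positive/negative pairs ranked correctly; ties count as half.
        wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
        aucs.append(wins / (len(pos) * len(neg)))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: 4 instances, 3 labels (not from the study).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_score = [[0.9, 0.2, 0.8], [0.1, 0.7, 0.85], [0.8, 0.4, 0.3], [0.3, 0.1, 0.9]]
# Hard predictions obtained by thresholding the scores at 0.5.
y_pred = [[int(s >= 0.5) for s in row] for row in y_score]

print("Exact match  :", exact_match(y_true, y_pred))
print("Hamming score:", hamming_score(y_true, y_pred))
print("Accuracy     :", jaccard_accuracy(y_true, y_pred))
print("Macro AUC    :", macro_auc(y_true, y_score))
```

Exact match is the strictest of the four, since a single wrong label invalidates the whole instance, whereas Hamming Score credits every correct label decision. This is one reason such measures are usually reported together.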

Notes

  1. Actually, the collected data concerned many more hospitalized patients, but we were able to work only with those patients who decided to participate in the study and for whom reliable data were available [5].

References

  1. TECNOMED-HUB webpage. https://www.tecnomedhub.it. Accessed 30 June 2023

  2. Atkinson, A.: On the measurement of inequality. J. Econ. Theory 2(3), 244–263 (1970)

  3. Baarts, J., et al.: Multilabel classification of disease prediction in patients presenting with dyspnea. Eur. Respir. J. 58(suppl 65) (2021)

  4. Bellan, M., et al.: Long-term sequelae are highly prevalent one year after hospitalization for severe covid-19. Sci. Rep. 11(1), 22666 (2021)

  5. Bellan, M., Soddu, D., Balbo, P.E., Baricich, A., Zeppegno, P., et al.: Respiratory and psychophysical sequelae among patients with covid-19 four months after hospital discharge. JAMA Netw. Open 4(1), e2036142 (2021)

  6. Bogatinovski, J., Todorovski, L., Džeroski, S., Kocev, D.: Comprehensive comparative study of multi-label classification methods. Expert Syst. Appl. 203, 117215 (2022)

  7. Charte, F., Rivera, A., delJesus, M., Herrera, F.: MLSMOTE: approaching imbalanced multilabeled learning through synthetic instance generation. Knowl. Based Syst. 89, 385–397 (2015)

  8. Charte, F., Rivera, A., delJesus, M., Herrera, F.: Dealing with difficult minority labels in imbalanced mutilabel data sets. Neurocomputing 326–327, 39–53 (2019)

  9. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16(1), 321–357 (2002)

  10. Frank, E., Hall, M., Witten, I.: The WEKA workbench. In: Data Mining: Practical Machine Learning Tools and Techniques, 4th ed. (2016). (Online Appendix)

  11. Gibaja, E., Ventura, S.: A tutorial on multilabel learning. ACM Comput. Surv. (CSUR) 47(3), 1–38 (2015)

  12. Guo, Y., Gu, S.: Multi-label classification using conditional dependency networks. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011), pp. 1300–1305 (2011)

  13. Huang, Y., et al.: A multi-label learning prediction model for heart failure in patients with atrial fibrillation based on expert knowledge of disease duration. Appl. Intell., 1–12 (2023)

  14. Madjarov, G., Kocev, D., Gjorgjevikj, D., Džeroski, S.: An extensive experimental comparison of methods for multi-label learning. Pattern Recogn. 45(9), 3084–3104 (2012)

  15. Nalbandian, A., et al.: Post-acute covid-19 syndrome. Nat. Med. 27(4), 601–615 (2021)

  16. Panigutti, C., Guidotti, R., Monreale, A., Pedreschi, D.: Explaining multi-label black-box classifiers for health applications. In: Shaban-Nejad, A., Michalowski, M. (eds.) W3PHAI 2019. SCI, vol. 843, pp. 97–110. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-24409-5_9

  17. Rana, P., Sowmya, A., Meijering, E., Song, Y.: Imbalanced classification for protein subcellular localization with multilabel oversampling. Bioinformatics 39(1), btac841 (2023)

  18. Read, J., Pfahringer, B., Holmes, G., Frank, E.: Classifier chains for multi-label classification. Mach. Learn. 85, 333–359 (2011)

  19. Read, J., Pfahringer, B., Holmes, G.: Multi-label classification using ensembles of pruned sets. In: Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), pp. 995–1000 (2008)

  20. Read, J., Reutemann, P., Pfahringer, B., Holmes, G.: MEKA: a multi-label/multi-target extension to Weka. J. Mach. Learn. Res. 17(21), 1–5 (2016). http://meka.sourceforge.net/

  21. Robert, C.P., Casella, G.: Monte Carlo Statistical Methods, 2nd edn. Springer, Cham (2004). https://doi.org/10.1007/978-1-4757-4145-2

  22. Tabia, K.: Towards explainable multi-label classification. In: 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1088–1095 (2019). https://doi.org/10.1109/ICTAI.2019.00152

  23. Tarekegn, A., Giacobini, M., Michalak, K.: A review of methods for imbalanced multi-label classification. Pattern Recogn. 118, 107965 (2021)

  24. Tsoumakas, G., Katakis, I., Vlahavas, I.: Random k-labelsets for multi-label classification. IEEE Trans. Knowl. Data Eng. 23, 1079–1089 (2011)

  25. Zaragoza, J., Sucar, L., Morales, E., Bielza, C., Larranaga, P.: Bayesian chain classifiers for multidimensional classification. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011), pp. 2192–2197 (2011)

  26. Zhou, L., Zheng, X., Yang, D., Wang, Y., Bai, X., Ye, X.: Application of multi-label classification models for the diagnosis of diabetic complications. BMC Med. Inform. Decis. Making 21(1), 182 (2021)

Acknowledgments

M. Dossena and C. Irwin are supported by the National PhD program in Artificial Intelligence for Healthcare and Life Sciences (Campus Bio-medico University of Rome). We want to thank A. Chiocchetti and M. Bellan for having provided us with the long-COVID data and for several fruitful discussions about the case study. This work was funded by “Piano Riparti Piemonte”, Azione n. 173 “INFRA-P. Realizzazione, rafforzamento e ampliamento infrastrutture di ricerca pubbliche—bando INFRA-P2-TECNOMED-HUB n.378-48”.

Author information

Corresponding author

Correspondence to Luigi Portinale.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Dossena, M., Irwin, C., Piovesan, L., Portinale, L. (2023). A Multi-label Classification Study for the Prediction of Long-COVID Syndrome. In: Basili, R., Lembo, D., Limongelli, C., Orlandini, A. (eds) AIxIA 2023 – Advances in Artificial Intelligence. AIxIA 2023. Lecture Notes in Computer Science, vol 14318. Springer, Cham. https://doi.org/10.1007/978-3-031-47546-7_18

  • DOI: https://doi.org/10.1007/978-3-031-47546-7_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-47545-0

  • Online ISBN: 978-3-031-47546-7

  • eBook Packages: Computer Science, Computer Science (R0)
