Towards Providing Clinical Insights on Long Covid from Twitter Data

Multimodal AI in Healthcare

Abstract

From the outset of the COVID-19 pandemic, social media has provided a platform for sharing and discussing experiences in real time. This rich source of information may also help researchers uncover evolving insights into post-acute sequelae of SARS-CoV-2 (PACS), commonly referred to as Long COVID. To leverage social media data, we propose using entity-extraction methods to provide clinical insights before defining downstream tasks. In this work, we address the gap between state-of-the-art entity recognition models and the extraction of clinically relevant entities, which may help explain and surface relevant insights from Twitter data. We then propose an approach to bridge this gap by using existing configurable tools and datasets to enhance the capabilities of these models. Code for this work is available at: https://github.com/VectorInstitute/ProjectLongCovid-NER.
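
To make the idea of entity extraction from tweets concrete, the following is a minimal illustrative sketch, not the chapter's method or the code in the linked repository. It assumes a tiny hand-written lexicon (`SYMPTOM_LEXICON`, hypothetical) in place of a real clinical vocabulary, and uses plain dictionary matching rather than a trained entity recognition model.

```python
import re
from typing import List, NamedTuple

# Hypothetical mini-lexicon: surface form -> normalized symptom label.
# A real system would draw on a clinical vocabulary, not four entries.
SYMPTOM_LEXICON = {
    "brain fog": "cognitive impairment",
    "fatigue": "fatigue",
    "shortness of breath": "dyspnea",
    "loss of smell": "anosmia",
}


class Entity(NamedTuple):
    text: str   # span exactly as written in the tweet
    label: str  # normalized label from the lexicon
    start: int  # character offset where the span begins
    end: int    # character offset just past the span


def extract_symptoms(tweet: str) -> List[Entity]:
    """Return lexicon matches in the tweet, ordered by position."""
    entities = []
    for surface, label in SYMPTOM_LEXICON.items():
        # Whole-word, case-insensitive match of the surface form.
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, tweet, flags=re.IGNORECASE):
            entities.append(
                Entity(match.group(0), label, match.start(), match.end())
            )
    return sorted(entities, key=lambda e: e.start)


if __name__ == "__main__":
    example = "Month 4 of long covid: brain fog and Fatigue every day."
    for entity in extract_symptoms(example):
        print(entity)
```

Exact matching of this kind misses misspellings, slang, and paraphrased symptoms that are common in tweets, which is one way to picture the gap between generic extraction and clinically relevant entities that the abstract describes.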

Acknowledgements

The authors would like to thank the Vector Institute for making this collaboration possible and for providing academic infrastructure and computing support during all phases of this work. We would also like to thank Antoaneta Vladimirova and Celine Leng from Roche and Esmat Sahak from the University of Toronto for their support throughout this project, as well as Dr. Angela Cheung from the University Health Network for her expertise and guidance.

Author information

Corresponding author

Correspondence to Rohan Bhambhoria.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Bhambhoria, R. et al. (2023). Towards Providing Clinical Insights on Long Covid from Twitter Data. In: Shaban-Nejad, A., Michalowski, M., Bianco, S. (eds) Multimodal AI in Healthcare. Studies in Computational Intelligence, vol 1060. Springer, Cham. https://doi.org/10.1007/978-3-031-14771-5_19
