Translator Data Pre-processing Gram Feature Algorithmic Model (TDGA) for Enhancing Classifier Accuracy in the Healthcare Domain

  • Original Research
SN Computer Science

Abstract

The coronavirus disease COVID-19 has spread rapidly across the world since 2019, and most people now interact with social media on a regular basis. The primary objective of this study is to examine how Twitter users feel about the effect of COVID-19 on social life and to summarize their opinions. The polarity of these feelings is determined by machine-learning classifiers, with relevant tweets identified through popular keywords such as Coronavirus and COVID-19. In addition, Omicron subvariants such as BA.4 and BA.5 are spreading widely across the globe. To protect themselves from these variants, the public needs to know the actual sentiment surrounding current social-life problems. Newly reported topics, themes, and issues, together with their associated sentiments, allow the COVID-19 pandemic to be better understood. This article analyzes a large dataset of tweets conveying information about COVID-19, with particular attention to the credibility of the information shared on Twitter. The approach is evaluated using unigram, bigram, and trigram features, and F1 score, precision, and recall are compared against accuracy for the machine-learning techniques Support Vector Machine (SVM), Naive Bayes (NB), and Logistic Regression (LR). The TDGA (Translator Data pre-processing Gram feature Algorithmic) model performs well on individuals’ assessments of COVID-19 and on benchmark COVID datasets, with a maximum efficiency of 86%.
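The gram features and evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' TDGA implementation: the function names, the whitespace tokenizer, and the toy labels are assumptions made purely for illustration, and the translation, pre-processing, and classifier stages (SVM, NB, LR) are omitted.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous n-gram (as a tuple) in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def gram_features(text, max_n=3):
    """Count unigrams, bigrams, and trigrams in one tweet.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; a real pipeline would clean the text more thoroughly.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1 with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true) if y_true else 0.0


if __name__ == "__main__":
    # Hypothetical tweet and labels, used only to exercise the functions.
    features = gram_features("COVID cases rise again")
    print(features[("covid", "cases")])

    truth = ["pos", "neg", "pos", "neg"]
    preds = ["pos", "pos", "neg", "neg"]
    print(precision_recall_f1(truth, preds, positive="pos"))
    print(accuracy(truth, preds))
```

In a setup like this, the n-gram counts would be turned into feature vectors for each classifier, and the per-class precision, recall, and F1 would be reported alongside accuracy for each gram size.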

Data availability

The dataset used in this article is a benchmark Twitter dataset taken from the IEEE repository.

Author information

Corresponding author

Correspondence to A. Sathya.

Ethics declarations

Conflict of Interest

The authors have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Industrial IoT and Cyber-Physical Systems” guest edited by Arun K Somani, Seeram Ramakrishnan, Anil Chaudhary and Mehul Mahrishi.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sathya, A., Mythili, M.S. Translator Data Pre-processing Gram Feature Algorithmic Model (TDGA) for Enhancing Classifier Accuracy in the Healthcare Domain. SN COMPUT. SCI. 4, 542 (2023). https://doi.org/10.1007/s42979-023-01895-x

  • DOI: https://doi.org/10.1007/s42979-023-01895-x
