Detection of Misinformation About COVID-19 in Brazilian Portuguese WhatsApp Messages

Forte Martins, Antônio Diogo; Cabral, Lucas; Chaves Mourão, Pedro Jorge; Monteiro, José Maria; Machado, Javam

doi:10.1007/978-3-030-80599-9_18

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12801)

Included in the following conference series:

International Conference on Applications of Natural Language to Information Systems

Abstract

During the coronavirus pandemic, the problem of misinformation arose once again, quite intensely, through social networks. In many developing countries such as Brazil, one of the primary sources of misinformation is the messaging application WhatsApp. However, due to WhatsApp’s private messaging nature, there are still few misinformation detection (MID) methods developed specifically for this platform. Additionally, a MID model built for Twitter or Facebook may perform poorly when used to classify WhatsApp messages. In this context, automatic MID for COVID-19 content in Brazilian Portuguese WhatsApp messages becomes a crucial challenge. In this work, we present COVID-19.BR, a data set of WhatsApp messages about coronavirus in Brazilian Portuguese, collected from Brazilian public groups and manually labeled. We also evaluated a series of misinformation classifiers combining different techniques. Our best result achieved an F1 score of 0.778, and the error analysis indicates that errors occur mainly due to the predominance of short texts. When texts with fewer than 50 words are filtered out, the F1 score rises to 0.857.
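
The evaluation idea in the abstract (compute F1, then recompute it after dropping short texts) can be illustrated with a minimal sketch. This is not the authors' code: the function names `f1_score` and `keep_long_texts`, the whitespace-based word count, and the toy messages, labels, and predictions are all assumptions made for illustration, and the toy numbers are unrelated to the 0.778 and 0.857 reported above.

```python
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def keep_long_texts(
    texts: Sequence[str],
    labels: Sequence[int],
    preds: Sequence[int],
    min_words: int = 50,
) -> Tuple[List[str], List[int], List[int]]:
    """Drop examples with fewer than min_words words.

    Assumption: a "word" is a whitespace-separated token; the paper may count
    words differently.
    """
    kept = [
        (text, label, pred)
        for text, label, pred in zip(texts, labels, preds)
        if len(text.split()) >= min_words
    ]
    return (
        [text for text, _, _ in kept],
        [label for _, label, _ in kept],
        [pred for _, _, pred in kept],
    )


# Toy data (hypothetical): label 1 = misinformation, 0 = not misinformation.
texts = ["short rumor", " ".join(["word"] * 60), " ".join(["word"] * 55), "tiny"]
labels = [1, 1, 0, 1]
preds = [0, 1, 0, 0]

f1_all = f1_score(labels, preds)
long_texts, long_labels, long_preds = keep_long_texts(texts, labels, preds)
f1_long = f1_score(long_labels, long_preds)
```

In this toy setup both misclassified messages are short, so `f1_long` is higher than `f1_all`, mirroring the direction of the effect the abstract describes.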

Supported by CAPES and LSBD.

Author information

Authors and Affiliations

Federal University of Ceará, Fortaleza, Ceará, Brazil
Antônio Diogo Forte Martins, Lucas Cabral, José Maria Monteiro & Javam Machado
Universidade Estadual do Ceará, Fortaleza, Ceará, Brazil
Pedro Jorge Chaves Mourão

Corresponding author

Correspondence to Antônio Diogo Forte Martins.

Editor information

Editors and Affiliations

Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais
University of Derby, Derby, UK
Farid Meziane
German Research Center for Artificial Intelligence, Saarbrücken, Germany
Helmut Horacek
University of Hertfordshire, Hatfield, UK
Epaminondas Kapetanios

Cite this paper

Forte Martins, A.D., Cabral, L., Chaves Mourão, P.J., Monteiro, J.M., Machado, J. (2021). Detection of Misinformation About COVID-19 in Brazilian Portuguese WhatsApp Messages. In: Métais, E., Meziane, F., Horacek, H., Kapetanios, E. (eds) Natural Language Processing and Information Systems. NLDB 2021. Lecture Notes in Computer Science, vol 12801. Springer, Cham. https://doi.org/10.1007/978-3-030-80599-9_18

DOI: https://doi.org/10.1007/978-3-030-80599-9_18
Published: 20 June 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80598-2
Online ISBN: 978-3-030-80599-9
eBook Packages: Computer Science, Computer Science (R0)
