A Comparative Study of Machine Learning Algorithms for the Detection of Fake News on the Internet

Authors:
Vinícius Nunes Barbosa
Universidade Federal Rural do Semi-Árido - UFERSA, Brazil

Francisco Mendes Mendes Neto
Universidade Federal Rural do Semi-Árido, Brazil

Sebastiao Alves Filho
Universidade do Estado do Rio Grande do Norte (UERN), Brazil

Lenardo Silva
Universidade Federal Rural do Semi-Árido, Brazil

SBSI '22: Proceedings of the XVIII Brazilian Symposium on Information Systems
May 2022, Article No.: 39, Pages 1–8
https://doi.org/10.1145/3535511.3535550

Published: 30 June 2022

ABSTRACT

Context: The growing spread of fake news on the Internet has significantly degraded the quality and veracity of the information that reaches society. Problem: Malicious use of information can compromise democracy by manipulating public opinion. In addition, few mechanisms exist that classify news and help citizens determine whether a circulating news item is true. This problem has driven new research on classifying and identifying such news. Methodology: This work compares machine learning algorithms as candidate intelligent solutions for detecting fake news written in Portuguese. The dataset used for the analysis contains about 12,000 news items. Pre-processing techniques were applied to analyze patterns in the news, reduce noise, and eliminate null entries. The algorithms compared were Logistic Regression, Stochastic Gradient Descent, Support Vector Machine (SVM), and Multilayer Perceptron. Result: The models generated by all four algorithms achieved accuracy above 90%. To choose the best algorithm, precision, recall, and F-measure were computed for each model. The SVM algorithm performed best, with 96.39% accuracy. Contribution: Beyond the analytical results, this work makes available a database of news in Portuguese and provides a grammatical and structural analysis of the news text, aimed at detecting the patterns that distinguish true news from false news.
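The abstract names accuracy, precision, recall, and F-measure as the criteria for comparing models. The following is a minimal sketch of how those four metrics can be computed for a binary fake/true classifier and used to rank candidates. It is not the authors' code: the label strings, the toy predictions, the model names `model_a` and `model_b`, and the choice to rank by F-measure are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def evaluate(y_true, y_pred, positive="fake"):
    """Compute the four metrics, treating `positive` as the target class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure here is F1: the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return Scores(accuracy, precision, recall, f_measure)


# Hypothetical toy data: 4 fake and 4 true news items (not from the paper).
y_true = ["fake", "fake", "fake", "fake", "true", "true", "true", "true"]

# Hypothetical predictions from two illustrative models.
predictions = {
    "model_a": ["fake", "fake", "fake", "true", "true", "true", "true", "fake"],
    "model_b": ["fake", "fake", "fake", "fake", "true", "true", "true", "fake"],
}

scores = {name: evaluate(y_true, y_pred) for name, y_pred in predictions.items()}

# Assumption: rank models by F-measure; the paper may weigh the metrics differently.
best = max(scores, key=lambda name: scores[name].f_measure)

for name, s in scores.items():
    print(f"{name}: acc={s.accuracy:.3f} prec={s.precision:.3f} "
          f"rec={s.recall:.3f} f1={s.f_measure:.3f}")
print("best by F-measure:", best)
```

Reporting precision and recall alongside accuracy matters when the cost of a false alarm (true news flagged as fake) differs from the cost of a miss (fake news passed as true), since accuracy alone treats both errors identically.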


Index Terms

1. Applied computing
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning

Index terms have been assigned to the content through auto-classification.


Published in
SBSI '22: Proceedings of the XVIII Brazilian Symposium on Information Systems
May 2022
394 pages
ISBN:9781450396981
DOI:10.1145/3535511
Editors:
Rita Cristina G. Berardi,
Alexandre R. Graeml,
Valdemar V. Graciano Neto,
Awdren de Lima Fontão,
Williamson Silva
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
datasets
gaze detection
neural networks
text tagging
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 181 of 557 submissions, 32%