1 Introduction

An unprecedented amount of information is generated every day ([5, 15]). The information available to the public shapes public opinion and the understanding of current events ([18, 27]), so it is of utmost importance, especially in a democracy, that people have access to accurate information. This information, however, comprises facts, misleading statements, and false claims ([12]); hence, a distinction must be made between truthful, factual pieces of information and fabricated ones. Fact-checking ([6, 7, 10]) is key to ensuring transparency and accountability of those in power. Fact-checkers and journalists constantly work to identify check-worthy statements, verify the facts, and correct misinformation before it reaches the public. It is therefore essential to automatically ([17, 25, 26]) distinguish between facts that are check-worthy, facts that do not require verification, and statements that are not factual. If the entire corpus of information is considered for verification, resources are wasted on sentences that do not warrant verification; detecting check-worthy sentences as a first step reduces the volume of information to be verified.

Table 1 Examples of three types of sentences

In this work, our focus lies on check-worthy sentence detection. The goal is to determine whether a sentence is factual and, if it is, whether it is worthy of verification. The manual process is intellectually demanding, time-consuming, and subject to human bias. These challenges have prompted the development of automated fact-detection and fact-checking systems. As depicted in Table 1, sentences fall into three categories. The first category comprises sentences that are not factual. The second comprises sentences that are factual, important, and worthy of further checks. The third comprises sentences that are factual but not important. The primary aim of our work is to detect factual sentences that are truly worthy of verification, referred to as check-worthy sentences. Detecting such sentences reduces the volume of text to be processed by fact-checking systems and fulfills the first step toward automated fact detection and verification.

Previous work on check-worthiness includes [4], who curated the CT-CWC-18 and CT-CWC-19 datasets for the identification and verification of political claims and produced lists of sentences ordered by their worthiness for fact-checking. However, their corpus is limited in volume. In contrast, [1] curated the large ClaimBuster dataset, which comprises 23,533 claims from US presidential debates held from 1960 to 2016. This dataset was previously studied by [12], who explored classical machine learning models for detecting check-worthy claims, with support vector machines (SVMs) giving the best F1-score of 0.81. In this work, we go beyond conventional machine learning algorithms toward deep neural network models to distinguish between the three classes of sentences, namely non-factual sentences (NFS), unimportant factual sentences (UFS), and check-worthy factual sentences (CFS), thereby automating the check-worthy fact detection process.

We propose the G2CW framework, which is based on GloVe embeddings and a gated recurrent unit (GRU) for check-worthy fact detection. GloVe embeddings capture word similarities, and GRUs account for long-term dependencies. Our proposed framework outperforms the previous best F1-score of 0.81 by [13], raising the F1-score to 0.92 while requiring the least training time. Furthermore, we curated a new dataset, referred to as the IndianClaimsFootnote 1 dataset. It comprises 953 claims collected from three sources: Question-Hour debates of the Indian parliament, tweets posted by politicians, and statements by the Prime Minister. Our G2CW framework, trained on the ClaimBuster dataset, gives an F1-score of 0.70 when tested on the IndianClaims dataset. To summarize, our work pushes the state of the art in check-worthiness detection by applying deep neural network models. Our contributions are as follows.

  • We propose the GloVe embedding and GRU-based G2CW framework, which achieves an F1-score of 0.92, outperforming the previous best of 0.81 reported by [12].

  • We curated a new dataset named the IndianClaims dataset comprising 953 sentences taken from Question-Hour debates in the Indian parliament, tweets from Indian politicians, and Prime Minister statements.

  • Our proposed G2CW framework trained on the ClaimBuster dataset achieves an F1-score of 0.70 on the IndianClaims dataset, which is a good starting point for check-worthy sentence detection in the Indian context.

The paper is organized as follows. This introduction discusses the problem statement, motivation, approach, and brief results. The next section reviews related work on various datasets for check-worthy sentence detection. We then present the dataset description section, which details the dataset attributes. The proposed methodology section explains our approach to the check-worthiness problem and describes the G2CW framework. The results section presents the outcomes of the various experiments performed in this work. Finally, in the conclusion and future work section, we summarize the work and outline its future scope.

2 Related work

In this section, we review the works related to fact-checking in general and check-worthiness in particular. We divide the section into two parts: the first subsection discusses works that address fact-checking, and the second focuses on prior works that address check-worthiness detection.

2.1 Fact-checking

This subsection focuses on prior works that support fact-checking mechanisms. [10] proposed a fact-checking system called ClaimBuster, using a three-class classification and ranking algorithm. Using supervised learning algorithms, the system classified sentences into non-factual, unimportant, and check-worthy. The methods used were naive Bayes, random forest, and SVM, evaluated with fourfold cross-validation. In their experiments, SVM achieved the best accuracy with various combinations of extracted features such as parts-of-speech (POS) tags. ClaimBuster obtained 79% precision and 74% recall for check-worthy statements. The authors also concluded that the models were more accurate on non-factual and check-worthy sentences than on unimportant sentences. [22] introduced a fact-checking system called FAKTA, which combines document retrieval from various reliable sources, stance detection of documents with respect to given claims, evidence extraction, and linguistic analysis. The experiments used the Fact Extraction and Verification (FEVER) datasetFootnote 2, where sentences are labeled as refuted (REF), supported (SUP), or not enough information (NEI). The authors concluded that FAKTA can predict the factuality of given claims and provide document-level evidence to support its predictions. [3] proposed a method to generate veracity justifications for a fact-checking system’s predictions. The dataset used was PolitifactFootnote 3, and the models were based on DistilBERT, achieving macro F1-scores of 0.321 and 0.443 on validation and 0.323 and 0.443 on testing. [19] proposed a web-based framework for fact-checking over WhatsAppFootnote 4. The tool monitors public WhatsApp groups discussing political topics in India and Brazil across various content categories. The authors concluded that the tool would help fact-checkers verify information from various media sources. [21] proposed a fact-checking mechanism that generates evidence supporting the factuality of a given claim, drawing on computing methodologies, human-centered computing, and information systems concepts. The predictions made by the platform were correct 58% of the time, and 59% of the returned evidence was relevant. [25] proposed an NLP-based fact-checking system that labels a claim and also provides evidence indicating its degree of truthfulness; the main aim is to analyze the veracity of given claims and reduce the human burden of manual fact-verification. [24] proposed a binary classification model to check the factuality of news and posts on COVID-19. The authors curated and annotated a dataset of 10,700 sentencesFootnote 5, including social media posts and fake news on the pandemic. The models explored were SVM, random forest, decision tree, and gradient boosting, with SVM giving the best result, an F1-score of 93.32%. [2] aimed to fact-check and identify check-worthy claims using supervised models based on neural networks, SVMs, and a combination of the two, together with contextual and discourse features of the input. For fact verification, justifications were generated to assess the factuality of answers in the question-answering threads of the datasetFootnote 6. The resulting accuracy was 0.635 using contextual and discourse features alone.
[9] analyzed the challenges faced by fact-checking systems in India, Bangladesh, and Nepal. Five fact-checking organizations from these countries were interviewed to identify the challenges faced by fact-checkers, and the work also aimed to determine the extent to which social media users engage with fact-checking organizations in these countries. [20] proposed a multi-class classification model that extracts features from answer content in a community question-answering forum to check an answer’s factuality. The authors also curated a datasetFootnote 7 for the experiment. The result was a MAP value of 86.54.

2.2 Check-worthiness of sentence

This subsection focuses on prior works that detect whether a given sentence is check-worthy or not. We list such works in Table 2.

[1] proposed the ClaimBuster dataset, containing coded statements from all US general election presidential debates. The dataset can be embedded in computer systems to identify claims in various social and digital media sources that need to be verified for authenticity. It is accessible on public platformsFootnote 8 along with explanations of the data preparation process, descriptive statistics, potential use cases, and the fairness policies followed while creating it. Another important contribution was labeling the collected and processed data into three classes, namely check-worthy factual sentences (CFS), non-factual sentences (NFS), and unimportant factual sentences (UFS). [23] proposed a model with two primary goals: check-worthiness and factuality. The first goal is to rank all potential claims made during speeches and debates by their check-worthiness; the second is to detect whether a claim is likely true, half-true, or false. The dataset comprises both English and ArabicFootnote 9. The evaluation metric for the first task is mean average precision (MAP) and for the second task mean absolute error (MAE). The models tested were KNN, SVM, random forests, naive Bayes, decision trees, and various neural network models. The results for the English dataset are a MAP score of 0.1332 and an MAE of 0.7050; for the Arabic dataset, a MAP score of 0.899 and an MAE of 0.6579. The authors concluded that the models perform better on Arabic than on English, and that adding annotations from different sources and increasing the corpus could enable multi-task learning. [4] presented the second edition of the CheckThat! Lab at CLEF 2019, which addressed two main tasks for Arabic and English datasetsFootnote 10. The first task, on the English dataset, detects whether claims made during speeches and debates should be prioritized for fact-checking. The second task, on the Arabic dataset, is divided into four sub-tasks: (1) creating a ranked list of web pages with respect to the check-worthiness of the claims, (2) classifying the web pages from the first sub-task by their usefulness in fact-checking the potential claim, (3) extracting useful text passages from the web pages of the second sub-task, and (4) detecting check-worthy claims from the useful passages of the third sub-task. Features used include parts-of-speech (POS) tags, bag-of-words (BOW) representations, sentiment analysis, named entities (NEs), and many more. SVM, naive Bayes, linear regression, decision trees, and various neural network models were explored. The evaluation metrics are the mean average precision (MAP) and normalized discounted cumulative gain (nDCG) scores for ranking, and the F1-score for classification. The MAP score for task 1 is 0.1660, the nDCG score for sub-task 1 is 0.55, and the F1-scores for sub-tasks 2, 3, and 4 are 0.42, 0.37, and 0.34, respectively. [11] proposed an automated fact-checking system and compared its results with the judgments of professional news organizations. The main aim is to identify factual and check-worthy claims, with results compared against CNNFootnote 11 and PolitiFactFootnote 12.
Topic detection is performed, and ClaimBuster gave the highest scores (greater than 0.5) on the topics chosen by organizations such as CNN and PolitiFact for fact-checking. The distribution of check-worthy claims is studied across parties, candidates, and topics. Fully automatic methods for fact-checking still fall short in terms of quality and hence credibility, so final confirmation by humans is still deemed necessary. [12] proposed a model for multi-class classification using support vector machines (SVM), a multinomial naive Bayes classifier (multi-NBC), and a random forest classifier (RFC), and considered three feature combinations: words; words and part-of-speech tags; and words, part-of-speech tags, and entity types. The evaluation metric for classification was the F1-score, and for ranking, precision-at-k (P@k), average precision (AvgP), and normalized discounted cumulative gain (nDCG). Among all the models explored, SVM gave the best classification result with an F1-score of 0.818. [16] proposed a claim/not-claim (CNC) model. Logistic regression and SVM results were considered, and the CNC model performs better than ClaimBuster, with an F1-score of 0.83. The authors note that these F1 results are for binary claim detection, that better results could be produced by classifying according to their annotation schema, and that the corpus could be enhanced by collaborating with other fact-checking organizations, including other languages, and collecting more data from sources such as social media and print outlets. [14] proposed a neural network model with two hidden layers, where the input features contain information about the claim and its context. The first hidden layer has 200 neurons with ReLU activation, the second hidden layer has 50 neurons with ReLU activation, and the output layer uses a sigmoid activation to classify the given input as check-worthy or not. The model gives the best result using neural networks and NLP, with a MAP score of 0.319 on the English dataset and 0.302 on the Arabic dataset. The authors concluded that the dataset is limited in genre and language and should be expanded in the future by considering political debates and speeches in other genres and multiple languages. [13] proposed an automated fact-checking system using machine learning, NLP, and database query techniques. The system monitors live political and general discussions (interviews, speeches, and debates), news, and social media to find check-worthy factual claims. The ClaimBusterFootnote 13 system architecture comprises a claim monitor, claim spotter, claim matcher, claim checker, and fact-check reporter. ClaimBuster gives each sentence a score indicating the usefulness of the claim for fact-checking, providing valuable assistance to fact-checkers by focusing their attention on high-quality sentences without having to sort carefully through many sentences. This research is, however, limited to particular languages and political periods. [8] proposed a deep learning model for ranking sentences by their check-worthiness using both word embeddings and syntactic dependencies. The word embeddings extract semantic features from the sentences, and syntactic dependency parsing captures the role of a word in changing the semantics of other words in the sentence.
The model is trained on a large unlabeled dataset (documents related to the US Presidency ProjectFootnote 14) using weak supervision, and then an RNN model is applied with domain-specific word embeddings and syntactic dependency parsing of each string. The evaluation metric for the ranking was mean average precision (MAP), and the result was a MAP score of 0.278. According to the authors, future work involves studying various symbols and injecting the context of the text’s meaning into the model. [28] proposed a hybrid approach to identify factual claims based on their check-worthiness by combining simple heuristics with supervised ML algorithms and further creating a ranked list of the factual, check-worthy claims. The main aim of the model is to rank the statements and automatically detect whether the claims are worth checking. For the experiment, two supervised learning algorithms, multilayer perceptrons (MLP) and support vector machines (SVM), and an ensemble model that combines SVM and MLP are used. The evaluation metric is mean average precision (MAP), and the result is a MAP score of 0.1332. The authors note that the proposed model had not previously been used for detecting check-worthy sentences.

Table 2 Research Papers Comparison

After studying prior work, we find that prior research on the ClaimBuster dataset has trained models on only part of the dataset. In contrast, our proposed G2CW framework trains on the complete dataset. Previous work also does not experiment with state-of-the-art deep learning models and trains only baseline models (naive Bayes, SVM, random forest) on the dataset. In contrast, our G2CW framework is based on GRUs, which leverage word-level dependencies to detect check-worthy sentences. In addition, we explore various other neural network models.

3 Dataset description and analysis

3.1 Description of datasets

ClaimBuster dataset: We use the ClaimBuster dataset ([1]), which comprises 23,533 sentences, each classified as a check-worthy factual sentence (CFS), non-factual sentence (NFS), or unimportant factual sentence (UFS). The data was developed from statements made by participants in past US presidential election debates. About 101 coders labeled these statements over 26 months in various stages. The data is organized in the following three files: (1) the groundtruth file, which contains only the testing sentences whose labels were agreed upon by three specialists; (2) the crowdsourced file, which comprises expert-labeled sentences; and (3) the allsentences file, which contains both the ground-truth and crowd-sourced sentences but without labels. Table 3 describes the attributes in the groundtruth and crowdsourced files.

Table 3 Attributes of a standard benchmark ClaimBuster dataset ([1])
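For illustration, a minimal sketch of loading the labelled ClaimBuster files with pandas is given below; the file names and the 'Verdict' column (with -1 = NFS, 0 = UFS, 1 = CFS) are assumptions based on the description above and Table 3, not the dataset’s guaranteed layout.

```python
# Minimal sketch: load the labelled ClaimBuster files described above.
# File names and the 'Verdict' column (-1 = NFS, 0 = UFS, 1 = CFS) are
# assumptions based on Table 3, not the dataset's guaranteed layout.
import pandas as pd

groundtruth = pd.read_csv("groundtruth.csv")    # expert-agreed test sentences
crowdsourced = pd.read_csv("crowdsourced.csv")  # remaining labelled sentences

labelled = pd.concat([groundtruth, crowdsourced], ignore_index=True)
print(labelled["Verdict"].value_counts())       # class balance: NFS / UFS / CFS
```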

IndianClaims dataset: We curated a dataset from the Indian political context. India is the largest democracy in the world and follows a parliamentary form of governance, with a parliament of elected representatives. We built an Indian dataset comprising 953 statements collected from three sources: (i) Question-HourFootnote 15—500 sentences from the Lok Sabha dataset, (ii) TweetsFootnote 16—350 sentences from tweets posted by Indian political leaders, and (iii) PM statementsFootnote 17—103 sentences issued by the Prime Minister of India during debates. To keep our Indian dataset consistent with the ClaimBuster dataset, we categorize each statement as a non-factual sentence (NFS), unimportant factual sentence (UFS), or check-worthy factual sentence (CFS). Question-Hour is the dataset of the parliamentary Lok SabhaFootnote 18 that includes the questions asked in the Lok Sabha by members of different parties on different dates and ministry topics, along with their answers. We select tweets posted by Indian political leaders and ministers holding key portfolios: Narendra Modi, Amit Shah, Piyush Goyal, Rahul Gandhi, and Nirmala Sitharaman. PM statements comprise debate remarks given by the Prime Minister of India on different topics, from which we select only a subset of sentences. Table 4 describes the attributes in the dataset.

Table 4 Attributes of the self-curated IndianClaims dataset

3.2 Data analysis

3.2.1 ClaimBuster dataset

The dataset contains the sentences spoken during 33 US presidential debates from 1960 to 2016. Figure 1 shows how the claims vary over the debate years. The X-axis represents the debate years, and the Y-axis the count of sentences in each claim class, namely NFS, CFS, and UFS.

Fig. 1

Distribution of sentences per debate in 1960-2016

This plot shows the number of sentences in each class over the 33 election debates from 1960 to 2016. Non-factual statements occur most often, and among the factual claims, check-worthy sentences are the more common. We also analyzed the distribution of the total count of sentences and the average sentence length per debate day in Figs. 2 and 3, respectively. The count of sentences increased by approximately 60%, while the length of speech decreased by nearly 47%. The two curves move in opposite directions, indicating that over the years 1960-2016 the number of sentences per debate day increased while their length decreased.

Fig. 2

Distribution of total sentences counts per debate each day

Fig. 3

Distribution of average sentence length per debate each day from 1960 to 2016
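The per-debate-day statistics behind Figs. 2 and 3 can be computed with a simple group-by, as sketched below; the column names 'File_id' (one file per debate day) and 'Text' are assumptions, not necessarily the dataset’s actual attribute names.

```python
# Sketch of the per-debate-day analysis behind Figs. 2 and 3.
# Column names 'File_id' and 'Text' are assumptions.
import pandas as pd

def per_debate_stats(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df["length"] = df["Text"].str.split().str.len()  # sentence length in words
    return df.groupby("File_id").agg(
        sentence_count=("Text", "size"),             # Fig. 2: sentences per debate day
        avg_sentence_length=("length", "mean"),      # Fig. 3: average sentence length
    )

# stats = per_debate_stats(labelled)
# stats.plot(subplots=True)
```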

There were a total of 69 participants in the general election debates, affiliated with three groups: Republican, Democrat, and Independent. Of these, 33 were from the Republican Party, 32 from the Democratic Party, and the remaining 4 were Independent candidates.

Fig. 4

Claim Distribution over Speaker Party over the years

Figure 4 shows the distribution of sentence types among speakers from the different political parties. We observe that Democrats had the highest number of check-worthy claims, while Republicans had the highest number of non-factual claims.

Fig. 5

Positive Sentiment Score for CFS Class of Republican and Democrat Speaker Party

Figure 5 shows the frequency of positive sentiment scores for check-worthy factual claims made by Republicans and Democrats, respectively. The positive sentiment score was used to examine how often each party made positive factual claims during speeches and debates. We infer that the frequency curves for the Democratic and Republican parties behave similarly. There were 20 speakers across the 33 debates, and ten speaker designations are represented in the dataset.

Fig. 6

Claim Distribution for Speaker Designation

Figure 6 shows the distribution of claims over the speakers’ designations. It can be inferred that speakers designated Governor gave the most non-factual claims, while those designated President made the most check-worthy factual claims.

Fig. 7

Distribution of claims over Speakers

Figure 7 shows the claim distribution over speakers in the elections. George Bush contributed the highest number of non-factual claims among the speakers, and Barack Obama the highest number of check-worthy factual claims.

3.2.2 IndianClaims dataset

In this section, we analyze the Indian dataset. Recall that we annotated each statement into the same three categories defined earlier for the ClaimBuster dataset, namely NFS (non-factual statements), UFS (unimportant factual statements), and CFS (check-worthy factual statements). Figure 8 shows the count of sentences per class. Non-factual statements have the highest count, and among the factual statements, check-worthy sentences outnumber unimportant ones.

Fig. 8

Distribution of the counts of sentences in the Indian dataset over the types of the classes, namely NFS, CFS, and UFS

Figure 9 shows the word cloud built from all the sentences in the Indian dataset. It consists of the 500 most frequently used words in Indian speeches, debates, and social media tweets.

Fig. 9

Word Cloud Analysis of the sentences recorded

Figure 10 shows the 30 most common words in the dataset, i.e., the words used most frequently in Indian political posts and speeches.

Fig. 10

Top 30 most frequent words used in the Indian Dataset
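The frequency analysis behind Figs. 9 and 10 can be sketched as follows; the 'Text' column name and the stop-word handling are assumptions rather than the exact pipeline used.

```python
# Sketch of the word-frequency analysis for Figs. 9 and 10 (assumed pipeline).
from collections import Counter

from nltk.corpus import stopwords   # requires nltk.download('stopwords')
from wordcloud import WordCloud     # pip install wordcloud

def token_counts(sentences):
    stop = set(stopwords.words("english"))
    tokens = [w.lower() for s in sentences for w in s.split() if w.lower() not in stop]
    return Counter(tokens)

# counts = token_counts(indian_claims["Text"])
# print(counts.most_common(30))                                        # Fig. 10: top 30 words
# cloud = WordCloud(max_words=500).generate_from_frequencies(counts)   # Fig. 9: word cloud
```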

4 Proposed methodology

Our proposed methodology aims to detect the check-worthiness of a sentence. The objective is twofold: first, to determine whether a sentence is factual; second, if it is factual, to determine whether it is worthy of verification. We cast this problem as multi-class classification by classifying a given sentence as a non-factual sentence (NFS), unimportant factual sentence (UFS), or check-worthy factual sentence (CFS). More formally, given a sentence S of words \(w_{1}\), \(w_{2}\), ..., \(w_{n}\), the goal is to learn a function F that determines whether it is check-worthy or not.

$$\begin{aligned} F(S) = F(w_1,w_2,...,w_n) = {\left\{ \begin{array}{ll} -1, \ NFS\\ \ \ 0, \ UFS \\ \ \ 1, \ CFS \end{array}\right. } \end{aligned}$$

Consider a few examples. ‘Hello, Good Morning’ is a greeting and contains no factual information. ‘I woke up to the roar of a lion at my window today’ is factual but not worth verification. However, ‘India reported 32,000 Covid-19 infected cases, which is a 5% drop from December 1st’ is a factual sentence worth verifying.
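The three example sentences above, encoded with the same -1/0/1 scheme used for F(S), look as follows (a toy illustration, not part of any dataset):

```python
# Toy illustration of the label scheme used by F(S); not part of any dataset.
NFS, UFS, CFS = -1, 0, 1

examples = [
    ("Hello, Good Morning", NFS),                                   # not factual
    ("I woke up to the roar of a lion at my window today", UFS),    # factual, not check-worthy
    ("India reported 32,000 Covid-19 infected cases, which is a 5% "
     "drop from December 1st", CFS),                                # factual and check-worthy
]
```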

4.1 Proposed G2CW framework

In this subsection, we explain our proposed approach, referred to as the G2CW framework. It contains two major components, namely GloVe embeddings and a gated recurrent unit (GRU), which together compose a deep learning model for categorizing sentences as check-worthy facts. Figure 11 depicts the proposed G2CW framework, which takes as input a sentence, e.g., ‘he is doing this work today’, containing six words. Each word is fed as the input at a given time step.

Fig. 11

Proposed G2CW framework which takes a sentence as input and outputs whether it is a non-factual sentence (NFS), unimportant factual sentence (UFS), or check-worthy factual sentence (CFS)

We denote the words in a given sentence \(S_i\) as \(w^{1}_{i}\), \(w^{2}_{i}\), \(w^{3}_{i}\), ..., \(w^{n}_{i}\); the remaining notation is described in Table 5. The words of the sentence \(S_i\) are passed to the embedding layer, whose weight matrix is initialized with pre-trained GloVe embeddings, to obtain the vector representations \(e^{1}_{i}\), \(e^{2}_{i}\), \(e^{3}_{i}\), ..., \(e^{n}_{i}\). These word embeddings are fed to a gated recurrent unit (GRU) layer with N units. The GRU is a type of RNN that performs sequence learning by considering the previous state together with the current input word. The output of the last GRU cell is passed to two fully connected layers, denoted \(Dense\_1\) and \(Dense\_2\). Finally, the output layer has 3 units corresponding to the three classes (\(O_{cfs}\), \(O_{nfs}\), and \(O_{ufs}\)) with softmax as the activation function for multi-class classification. Next, we describe each step of the G2CW framework in detail.

Table 5 Summary of Notations used in G2CW framework

GloVe embedding: GloVe generates an embedding for each word in a sentence and exploits the global characteristics of the ClaimBuster corpus through a word co-occurrence matrix.

$$\begin{aligned} J = \sum _{i,j=1}^{V}f \left( X_{ij} \right) \left[ W_{i}^{T} \tilde{W}_{j} + b_{i} + \tilde{b}_{j} - \log \left( X_{ij} \right) \right] ^{2} \end{aligned}$$
(1)

The above Eq. 1 is the cost function minimized to obtain the word vector of each word, where V refers to the vocabulary size of the ClaimBuster dataset. \(X_{ij}\) is the entry of the co-occurrence matrix giving the number of occurrences of word j in the context of word i. \(W_{i}\) and \(\tilde{W}_{j}\) are the word and context word vectors, and \(b_{i}\) and \(\tilde{b}_{j}\) are the corresponding biases. \(f(X_{ij})\) is a weighting function that improves performance by down-weighting very frequent co-occurrences.
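For concreteness, the objective in Eq. (1) can be written out as a small NumPy sketch; the weighting function below is the standard GloVe choice \((X_{ij}/x_{max})^{\alpha }\) capped at 1, which we assume here since the text does not specify \(f\).

```python
# NumPy sketch of the GloVe objective in Eq. (1) for a toy co-occurrence matrix X.
# The weighting function f is assumed to be the standard GloVe choice.
import numpy as np

def glove_loss(X, W, W_tilde, b, b_tilde, x_max=100.0, alpha=0.75):
    V = X.shape[0]                                    # vocabulary size
    loss = 0.0
    for i in range(V):
        for j in range(V):
            if X[i, j] > 0:                           # only observed co-occurrences
                f = min((X[i, j] / x_max) ** alpha, 1.0)
                diff = W[i] @ W_tilde[j] + b[i] + b_tilde[j] - np.log(X[i, j])
                loss += f * diff ** 2
    return loss
```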

Gated recurrent unit: A gated recurrent unit is a specialized recurrent neural network (RNN) architecture that uses a gating mechanism to handle long-term dependencies through selective reading, writing, and forgetting. A GRU has two gates: the update gate and the reset gate. To see why we choose the GRU, consider the statement ‘Barack Obama ruled for the years 2009 - 2017 in the Democrat party, and he had lunch with his wife this evening.’ The phrase ‘and he had lunch with his wife this evening’ is unimportant from a fact-worthiness point of view and can safely be forgotten; the GRU can forget such phrases through its reset gate. The update gate controls how much information is passed on to the future. Equation 2 describes the operation of the update gate.

$$\begin{aligned} Z_{t}= \sigma \left( W_{Z}X_{t} + U_{Z}h_{t-1} \right) \end{aligned}$$
(2)

Here, \(W_{Z}\) and \(U_{Z}\) are weight matrices, \(X_{t}\) is the current input word, \({\sigma }\) is the sigmoid activation function, and \(h_{t-1}\) is the previous hidden state. The reset gate controls how much past information to forget. Equation 3 describes the operation of the reset gate.

$$\begin{aligned} R_{t}= \sigma \left( W_{R}X_{t} + U_{R}h_{t-1} \right) \end{aligned}$$
(3)

Here, \(W_{R}\) and \(U_{R}\) are weight matrices. Similarly, \(X_t\) is the current input word, \(h_{t-1}\) is the previous hidden state, and \({\sigma }\) is the sigmoid activation function.
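A single GRU step can be sketched in NumPy as follows; Eqs. (2) and (3) give the gates, while the candidate-state and output equations follow the standard GRU formulation, which the text does not spell out.

```python
# NumPy sketch of one GRU step; Eqs. (2) and (3) define the gates, and the
# remaining equations follow the standard GRU formulation (an assumption here).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    z_t = sigmoid(Wz @ x_t + Uz @ h_prev)                # update gate, Eq. (2)
    r_t = sigmoid(Wr @ x_t + Ur @ h_prev)                # reset gate, Eq. (3)
    h_cand = np.tanh(Wh @ x_t + Uh @ (r_t * h_prev))     # candidate state
    return (1.0 - z_t) * h_prev + z_t * h_cand           # new hidden state
```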

Dense layer and dropout: We add two dense layers to the proposed G2CW framework, namely \(Dense\_1\) and \(Dense\_2\), followed by dropout. Dense layers are fully connected neural network layers: each neuron receives input from all neurons of the preceding layer. We use softmax as the activation function in the \(Dense\_2\) layer to perform the multi-class classification of the ClaimBuster dataset. Dropout is used to avoid over-fitting the model.
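Putting the components together, a minimal Keras sketch of the G2CW stack is given below; the GRU width (100 units) and the 3-unit softmax output follow the description above, while the size of \(Dense\_1\), the dropout rate, and the vocabulary/embedding dimensions are placeholders (the values actually used are listed in Table 6).

```python
# Minimal Keras sketch of the G2CW stack: GloVe embedding -> GRU -> Dense_1
# -> dropout -> 3-unit softmax output. Sizes marked 'assumed' are placeholders;
# see Table 6 for the hyperparameters actually used.
import numpy as np
from tensorflow.keras import Sequential, initializers, layers

VOCAB_SIZE, EMBED_DIM = 20000, 100                      # assumed values
embedding_matrix = np.zeros((VOCAB_SIZE, EMBED_DIM))    # filled from pre-trained GloVe vectors

model = Sequential([
    layers.Embedding(VOCAB_SIZE, EMBED_DIM,
                     embeddings_initializer=initializers.Constant(embedding_matrix),
                     trainable=False),                  # pre-trained GloVe weights
    layers.GRU(100),                                    # last cell output only
    layers.Dense(64, activation="relu"),                # Dense_1 (size assumed)
    layers.Dropout(0.2),                                # dropout against over-fitting
    layers.Dense(3, activation="softmax"),              # Dense_2 / output: NFS, UFS, CFS
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```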

In Table 6, we describe each layer’s input and hyperparameters in the proposed G2CW framework. We converged on these values after performing hyperparameter tuning.

Table 6 G2CW Description and Parameters

5 Experiment setup & results

5.1 Experiment design

In this section, we explain the design of our experiments. Figure 12 outlines the steps we perform, starting with data exploration and visualization of the ClaimBuster dataset. For text preprocessing, we implement three types of word embeddings, namely GloVe, Word2Vec, and the Keras default embedding. In terms of machine learning models, we perform two sets of experiments. First, we replicate the baseline models, namely SVM, naive Bayes, KNN, random forest, and logistic regression [12]. Second, we implement deep learning approaches based on LSTMs and CNNs. We train all models with different hyperparameter values to arrive at a final model with the best F1-score and the least training time among the best-performing models.

Fig. 12

Workflow of Experiment Design
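As part of the preprocessing, the pre-trained GloVe vectors have to be mapped onto the tokenizer’s vocabulary to build the embedding matrix used by the deep models; a hedged sketch is given below, where the GloVe file name and the 100-dimensional vectors are assumptions.

```python
# Sketch of building the GloVe embedding matrix for the Keras models.
# The GloVe file name and 100-d vectors are assumptions.
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer

def build_embedding_matrix(sentences, glove_path="glove.6B.100d.txt", dim=100):
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(sentences)

    glove = {}
    with open(glove_path, encoding="utf8") as f:
        for line in f:
            word, *vec = line.split()
            glove[word] = np.asarray(vec, dtype="float32")

    matrix = np.zeros((len(tokenizer.word_index) + 1, dim))  # row 0 = padding
    for word, idx in tokenizer.word_index.items():
        if word in glove:
            matrix[idx] = glove[word]                        # OOV words stay zero
    return tokenizer, matrix
```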

5.2 Results for ClaimBuster dataset

5.2.1 Baseline models

In this subsection, we explain the results of our implementations of the baseline models. We summarize the results in Table 7 by presenting average precision, recall, and F1-score values for the ClaimBuster dataset. Feature set W denotes the word features we extract from the sentences spoken in the presidential debates; before feeding them to the models, we preprocess the sentences by removing stop words and stemming. \(W{\_}P\) denotes the word features together with parts-of-speech (POS) tags, where each POS tag is treated as an additional feature. POS tags matter because factual sentences typically contain numbers and other quantitative expressions, so the presence of certain POS tags is a strong cue that a sentence may need checking. It turns out that SVM with the \(W{\_}P\) features gives the best F1-score of 0.86 among all the baseline models explored. The results of the best-performing models are shown in bold in all tables.

Table 7 Results of our implementations of the baseline models on the ClaimBuster dataset using only sentence extracted features (W) and sentence plus POS features (\(W\_P\))
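A hedged sketch of the \(W\_P\) setting is shown below: bag-of-words features are combined with per-sentence POS-tag counts and fed to a linear SVM. It illustrates the setting; the exact feature encoding of [12] may differ.

```python
# Sketch of the W_P baseline: word features plus POS-tag counts with a linear SVM.
# This illustrates the setting; the exact encoding of the original baseline may differ.
from collections import Counter

import nltk  # requires the 'punkt' and POS-tagger data packages
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline, make_union
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import LinearSVC

def pos_counts(sentences):
    # one POS-tag count dictionary per sentence, e.g. {'CD': 2, 'NNP': 1}
    return [Counter(tag for _, tag in nltk.pos_tag(nltk.word_tokenize(s)))
            for s in sentences]

features = make_union(
    CountVectorizer(stop_words="english"),                              # W: word features
    make_pipeline(FunctionTransformer(pos_counts), DictVectorizer()),   # P: POS-tag counts
)
model = make_pipeline(features, LinearSVC())
# model.fit(train_sentences, train_labels); model.predict(test_sentences)
```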

5.2.2 Deep learning models

In this subsection, we describe the deep learning-based models. We experiment with CNN- and LSTM-based models; their brief description is given below.

  • CNN_1: Three convolutional layers of 100 filters each, interleaved with three max-pooling layers, followed by a 3-unit dense output layer with softmax activation.

  • CNN_4: One convolutional layer of 32 filters followed by two max-pooling layers and a 3-unit dense output layer with softmax activation.

  • CNN_5: One convolutional layer of 100 filters followed by one max-pooling layer and a 3-unit dense output layer with softmax activation.

  • CNN_6: One convolutional layer of 32 filters followed by one max-pooling layer and a 3-unit dense output layer with softmax activation.

  • LSTM_1: Two LSTM layers, the first with 50 units and the second with 20 units, followed by a 3-unit dense layer with softmax.

  • LSTM_2: One LSTM layer with 100 units, followed by a 3-unit dense layer with softmax.

Table 8 Deep learning model loss and accuracy on the ClaimBuster dataset

Table 8 shows the loss and accuracy during the validation and testing phases obtained from the various deep learning models on ClaimBuster. Across all models, test accuracy increases when Word2Vec or GloVe embeddings are used compared with the default Keras embeddings. Among the CNN-based models, CNN_4 gives the best accuracy of 92% with Word2Vec, which indicates that a single convolutional layer with 32 filters followed by two max-pooling layers is sufficient. Among the LSTM-based models, LSTM_2 performs best with an accuracy of 91%, outperforming the more complex LSTM_1.

Table 9 Precision, Recall, and F1-score for each class in the ClaimBuster dataset

Table 9 shows the class-wise precision, recall, and F1-score for the NFS, UFS, and CFS classes when the different deep learning models are run on the ClaimBuster dataset. For the NFS class, the maximum precision of 100% is given by the CNN_6 model with the default Keras embedding layer, CNN_1 gives the maximum recall of 56% with both Word2Vec and GloVe embeddings, and the LSTM_1 model, which captures word dependencies, gives the best F1-score of 66% with GloVe embeddings. For the UFS class, the maximum precision of 90% is given by the CNN_4 model with GloVe embeddings, the maximum recall of 92% is obtained by the LSTM_1 model with GloVe embeddings, and the maximum F1-score of 87% is achieved by the CNN_5 and LSTM_1 models with GloVe embeddings. For the CFS class, the maximum precision of 96% is obtained by the LSTM_1 and LSTM_2 models with GloVe embeddings; the CNN_4 and CNN_6 models give a recall of 98% with GloVe embeddings, and CNN_4 also reaches 98% recall with the default Keras embedding layer. Lastly, the LSTM_2 model gives the best F1-score of 96% with GloVe embeddings.

Table 10 Weighted average values of deep learning models on the ClaimBuster dataset

Table 10 shows the weighted average precision, recall, and F1-score of the various deep learning models explored on ClaimBuster. The weighted average is used to compensate for class imbalance, with weights proportional to the number of records in each class. The LSTM_2 model gives the maximum weighted average precision of 92% with GloVe embeddings. Several models reach the best weighted average recall of 91%, namely CNN_1 and CNN_4 with Word2Vec embeddings, and CNN_1, CNN_5, LSTM_1, and LSTM_2 with GloVe embeddings. Similarly, the maximum weighted average F1-score of 91% is achieved by CNN_1 and CNN_4 with Word2Vec embeddings, and by CNN_1, CNN_5, LSTM_1, and LSTM_2 with GloVe embeddings.
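The weighted averages reported here and in the later result tables correspond to per-class scores weighted by class support, as computed, for example, by scikit-learn’s average='weighted' option (a sketch, assuming gold and predicted label arrays are available):

```python
# Sketch: weighted-average precision/recall/F1, with per-class scores
# weighted by class support (scikit-learn's average='weighted').
from sklearn.metrics import precision_recall_fscore_support

def weighted_scores(y_true, y_pred):
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="weighted")
    return p, r, f1
```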

5.2.3 Best-performing models

In addition to the deep learning models discussed so far, in this section we describe the deep learning models that gave us the best results on the ClaimBuster dataset. We refer to them as the best-performing models on the ClaimBuster dataset; they are listed below.

  • CNN_2: Three convolutional layers of 32 filters each, interleaved with three max-pooling layers, followed by a 3-unit dense output layer with softmax activation.

  • CNN_3: Two convolutional layers of 100 filters each, interleaved with two max-pooling layers, followed by a 3-unit dense output layer with softmax activation.

  • LSTM_3: One bidirectional LSTM layer with 100 units, followed by a 3-unit dense layer with softmax.

  • LSTM_4: One LSTM layer with 32 units, followed by a 3-unit dense layer with softmax.

  • LSTM_5: One bidirectional LSTM layer with 32 units, followed by a 3-unit dense layer.

  • GRU: Our proposed G2CW framework, comprising one GRU layer with 100 units followed by a 3-unit dense layer with softmax, as explained earlier in the paper.

Table 11 shows the weighted average precision, recall, and F1-score of the best-performing models, i.e., those that performed better than the other deep learning models. The weighted average is again computed according to the number of records in each class. It turns out that CNN_2 with Word2Vec embeddings, and CNN_3, LSTM_3, LSTM_4, and LSTM_5 with GloVe embeddings, perform the best with 92% precision, 92% recall, and a 92% F1-score.

Table 11 Weighted Average Values of best-performing models on ClaimBuster dataset

All these models, namely CNN_2, CNN_3, LSTM_3, LSTM_4, LSTM_5, and GRU, performed equally well. Therefore, to find the best among them, we compare their training times. Table 12 shows the training time consumed by each of the best-performing models. Among them, the GRU-based G2CW framework takes only 12.5 seconds to train, the least training time of all the models.

Table 12 Time analysis of best-performing models on ClaimBuster dataset

5.3 Results for Indian dataset

This subsection discusses the results obtained by training the models on the ClaimBuster dataset and testing them on the IndianClaims dataset.

5.3.1 Baseline models

We summarize our results in Table 13 by presenting average precision, recall, and F1-score values for the IndianClaims dataset. It turns out that the naive Bayes classifier (NBC) with features as \(W{\_}P\) gives the maximum F1-score of 0.68 among all the baseline models explored.

Table 13 Results of our implementations of the baseline models on the IndianClaims dataset using only sentence extracted features (W) and sentence plus POS features (\(W\_P\))

5.3.2 Deep learning models

We experiment with CNN- and LSTM-based models and summarize the results below.

Table 14 Deep learning model loss and accuracy on IndianClaims dataset

Table 14 shows the loss and accuracy during the validation and testing phases obtained from the various deep learning models on the IndianClaims dataset. Among the CNN-based models, CNN_1 with Word2Vec gives the best accuracy of 72%; it has three convolutional layers with 100 filters, three max-pooling layers, and a 3-unit dense output layer with softmax activation. Among the LSTM-based models, LSTM_4 with GloVe performs best with an accuracy of 67%.

Table 15 Precision, recall, and F1-score for each Class in IndianClaims dataset

Table 15 shows the class-wise precision, recall, and F1-score for the NFS, UFS, and CFS classes when the different deep learning models are tested on the IndianClaims dataset. For the NFS class, the maximum precision of 94% is given by the LSTM_4 model with GloVe as the embedding layer, while CNN_3 gives both the maximum recall of 85% and the best F1-score of 84% with GloVe embeddings. For the UFS class, the maximum precision of 39% is given by the LSTM_4 model with the default Keras embedding layer, and the maximum recall of 37% and the maximum F1-score of 27% are obtained by the LSTM_5 model with GloVe embeddings. For the CFS class, the maximum precision of 85% is obtained by the LSTM_5 model with GloVe embeddings, the LSTM_1 model gives a recall of 99% with GloVe embeddings, and the LSTM_5 model gives the best F1-score of 91% with GloVe embeddings.

Table 16 Weighted average values of deep learning models on IndianClaims dataset

Table 16 shows the weighted average precision, recall, and F1-score of the various deep learning models explored on the IndianClaims dataset. The LSTM_4 model gives the maximum weighted average precision of 73% with GloVe embeddings. CNN_1 and CNN_4 with Word2Vec embeddings and CNN_2 with GloVe embeddings give the best weighted average recall of 71%. Similarly, the maximum weighted average F1-score of 69% is achieved by CNN_3, CNN_4, and LSTM_4 with Word2Vec embeddings, and by CNN_1, CNN_2, CNN_3, CNN_5, and LSTM_4 with GloVe embeddings.

5.3.3 Best-performing models

In addition to the models discussed in the baseline and deep learning subsections, we now describe the model that gave us the best result on the IndianClaims dataset. Recall that our proposed GRU-based G2CW framework comprises one GRU layer with 100 units, followed by a 3-unit dense layer with softmax.

Table 17 Weighted average values of best-performing model on the IndianClaims dataset

Table 17 shows the weighted average precision, recall, and F1-score of the best-performing models, i.e., those that performed better than the deep learning models reported in the previous section when tested on the IndianClaims dataset. The weighted average is computed according to the number of records in each class. It turns out that CNN_6 with Word2Vec embeddings and CNN_6 and LSTM_2 with GloVe embeddings perform the best with a 70% F1-score. Our proposed G2CW framework with GloVe embeddings achieves 72% precision, 69% recall, and a 70% F1-score. Since CNN_6, LSTM_2, and the G2CW framework perform equally well, we compare them in terms of training time to find the best among them.

Table 18 Time analysis of best-performing model on the IndianClaims dataset

Table 18 shows the training time of the best-performing models evaluated on the IndianClaims dataset. Among them, the GRU-based G2CW framework takes only 8.12 seconds to train, the least training time of all the models.

6 Conclusion and future scope

This work proposed a GloVe embedding-based GRU architecture, referred to as the G2CW framework, for detecting check-worthy sentences. Our approach gives an F1-score of 0.92, which outperforms the baseline F1-score of 0.81 of Hassan et al. [13]. In our experiments, several other deep learning architectures performed as well as the G2CW framework; however, the G2CW framework required the least training time. We evaluated the G2CW framework on the standard ClaimBuster dataset curated by Hassan et al. [13] and also performed transfer learning experiments on the self-curated IndianClaims dataset. Our work will be helpful for researchers in the field of fact-checking: the G2CW framework can be used to detect whether a sentence is worthy of fact-checking, so that only check-worthy sentences are passed on to fact-checking mechanisms.