Quora Insincere Questions Classification Using Attention Based Model

  • Conference paper
Data Science and Emerging Technologies (DaSET 2022)

Abstract

Online platforms have evolved into an unparalleled storehouse of information. People use social question-and-answer (Q&A) websites such as Quora, Formspring, Stack Overflow, Twitter, and Beepl to ask questions, clarify doubts, and share ideas and expertise with others. A major issue with such Q&A websites is the growing number of inappropriate and insincere posts from users without a genuine motive: individuals share harmful and toxic content intended to make a statement rather than to look for helpful answers. In natural language processing (NLP), Bidirectional Encoder Representations from Transformers (BERT) has been a game-changer. It has dominated performance benchmarks and spurred researchers to experiment with and produce similar models, which in turn led to lighter language models that maintain efficiency and performance. This study used pre-trained state-of-the-art language models to classify posted questions as sincere or insincere under limited computation. To cope with the high computational cost of NLP, the BERT, XLNet, StructBERT, and DeBERTa models were trained on three samples of the data. The metrics show that, even with limited resources, recent transformer-based models outperform previous studies. Among the four, DeBERTa stands out with the highest balanced accuracy (80%), macro F1-score (0.83), and weighted F1-score (0.96).
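
The three metrics reported above behave differently on imbalanced data, which matters when insincere questions are the minority class. The following is a minimal sketch of how they are commonly defined, assuming the usual conventions (balanced accuracy as the mean of per-class recall, macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean). The toy labels and all function names are hypothetical illustrations and are not taken from the paper or its data.

```python
def per_class_stats(y_true, y_pred):
    """Return recall, F1 and support for every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    stats = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        stats[c] = {"recall": recall, "f1": f1, "support": tp + fn}
    return stats


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean recall over the classes that actually occur in y_true."""
    present = [s for s in per_class_stats(y_true, y_pred).values() if s["support"] > 0]
    return sum(s["recall"] for s in present) / len(present)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    stats = per_class_stats(y_true, y_pred)
    return sum(s["f1"] for s in stats.values()) / len(stats)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with class support (true count) as the weight."""
    stats = per_class_stats(y_true, y_pred)
    total = sum(s["support"] for s in stats.values())
    return sum(s["f1"] * s["support"] for s in stats.values()) / total


# Hypothetical toy labels: 0 = sincere (majority), 1 = insincere (minority).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

print(f"accuracy          = {accuracy(y_true, y_pred):.4f}")           # 0.8000
print(f"balanced accuracy = {balanced_accuracy(y_true, y_pred):.4f}")  # 0.6875
print(f"macro F1          = {macro_f1(y_true, y_pred):.4f}")           # 0.6875
print(f"weighted F1       = {weighted_f1(y_true, y_pred):.4f}")        # 0.8000
```

On this toy example the weighted F1 (0.80) is pulled up by the majority class, while balanced accuracy and macro F1 (both 0.6875) expose the weaker performance on the minority class. This is one plausible reason a weighted F1-score can sit well above the macro F1-score on a skewed dataset.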

Corresponding author

Correspondence to Sulaf Assi.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chakraborty, S. et al. (2023). Quora Insincere Questions Classification Using Attention Based Model. In: Wah, Y.B., Berry, M.W., Mohamed, A., Al-Jumeily, D. (eds) Data Science and Emerging Technologies. DaSET 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 165. Springer, Singapore. https://doi.org/10.1007/978-981-99-0741-0_26
