TextSpamDetector: textual content based deep learning framework for social spam detection using conjoint attention mechanism

Elakkiya, E.; Selvakumar, S.; Leela Velusamy, R.

doi:10.1007/s12652-020-02640-5

Original Research
Published: 09 November 2020

Volume 12, pages 9287–9302, (2021)

Journal of Ambient Intelligence and Humanized Computing

670 Accesses
12 Citations

Abstract

Online Social Networks (OSNs) allow easy membership, which leads to a huge registered population and a voluminous flow of information. These characteristics attract spammers, whose spam may cause annoyance, financial loss, or loss of personal information for users, and also weakens the reputation of social network sites. Most spam detection methods apply machine learning techniques to user-based and content-based features. However, these annotated features are difficult to extract in real time because of the privacy policies of most social network sites. Even for the features that can be extracted, their large size makes manual extraction complex and time-consuming. There is therefore a need for text-level spam detection that does not require extracting such hard-core features. Existing text classification methods based on deep learning or on a single attention mechanism do not perform well, because social network data are sparse, short, and noisy. Moreover, spammers avoid direct spam words and use indirect words to evade spam filtering techniques, which makes social network spam text dynamic and non-stationary. These indirect words carry hidden context that creates an attention drift problem. To avoid attention drift, this deep learning-based text-level spam detection technique (TextSpamDetector) proposes a conjoint attention mechanism together with two attention mechanisms, namely normal attention and context-preserving attention. One attention mechanism finds the important words, while the other keeps attention focused on the target context by referring to a higher-level abstraction of the context vector. The attention mechanisms refer to different context representations of the input text to find informative words from the structural context representation. This structural context representation, which contains both local semantic features and global semantic dependency features, is generated by a CNN and a BiLSTM. The proposed model is compared with existing spam detection techniques on three datasets, and the experimental results show that it performs well in terms of accuracy, F measure, and false-positive rate.
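
The three evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions and assumes that "F measure" means the balanced F1 score and that spam is the positive class; the abstract does not state either point. The class and function names (`Confusion`, `confusion`, `accuracy`, `f_measure`, `false_positive_rate`) and the toy labels are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # spam correctly flagged as spam
    fp: int  # legitimate text wrongly flagged as spam
    tn: int  # legitimate text correctly passed
    fn: int  # spam wrongly passed as legitimate


def confusion(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, tn, fn)


def accuracy(c):
    """Fraction of all texts that were classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


def f_measure(c):
    """Balanced F1: harmonic mean of precision and recall (assumed here)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def false_positive_rate(c):
    """Fraction of legitimate texts that were wrongly flagged as spam."""
    denom = c.fp + c.tn
    return c.fp / denom if denom else 0.0


# Toy labels for illustration only: 1 = spam, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
c = confusion(y_true, y_pred)
print(accuracy(c), f_measure(c), false_positive_rate(c))
```

A low false-positive rate matters for spam filtering because each false positive is a legitimate message that gets hidden from its reader, which is why it is usually reported alongside accuracy and the F measure.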

Author information

Authors and Affiliations

Computer Science and Engineering, National Institute of Technology, Tiruchirappalli, India
E. Elakkiya, S. Selvakumar & R. Leela Velusamy
Indian Institute of Information Technology, Una, Himachal Pradesh, India
S. Selvakumar

Authors

E. Elakkiya
S. Selvakumar
R. Leela Velusamy

Corresponding author

Correspondence to E. Elakkiya.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Elakkiya, E., Selvakumar, S. & Leela Velusamy, R. TextSpamDetector: textual content based deep learning framework for social spam detection using conjoint attention mechanism. J Ambient Intell Human Comput 12, 9287–9302 (2021). https://doi.org/10.1007/s12652-020-02640-5

Received: 08 July 2020
Accepted: 24 October 2020
Published: 09 November 2020
Issue Date: October 2021
DOI: https://doi.org/10.1007/s12652-020-02640-5
