TextSpamDetector: textual content based deep learning framework for social spam detection using conjoint attention mechanism

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Online Social Networks (OSNs) make membership easy, which leads to a huge registered population and a large volume of generated information. These characteristics attract spammers, whose spam may cause annoyance, financial loss, or loss of personal information for users and also weakens the reputation of social network sites. Most spam detection methods apply machine learning techniques to user-based and content-based features. However, these annotated features are difficult to extract in real time because of the privacy policies of most social network sites. Even for the features that can be extracted, their large size makes the manual extraction process complex and time-consuming. There is therefore a need for text-level spam detection that does not require extraction of hard-core features. Existing text classification methods based on deep learning or on a single attention mechanism do not perform well because social network data are sparse, consisting of short and noisy texts. Moreover, spammers avoid direct spam words and use indirect words to evade spam filtering techniques, which makes social network spam texts dynamic and non-stationary. These indirect words contain hidden context that creates an attention drift problem. To avoid attention drift, this deep learning-based text-level spam detection technique (TextSpamDetector) proposes a conjoint attention mechanism together with two attention mechanisms, namely normal attention and context-preserving attention. One attention mechanism helps find the important words, while the other keeps attention focused on the target context by referring to a higher-level abstraction of the context vector. These attention mechanisms refer to different context representations of the input text to find informative words in the structural context representation. This structural context representation, which contains both local semantic features and global semantic dependency features, is generated by a CNN and a BiLSTM. The proposed model is compared with existing spam detection techniques on three datasets, and the experimental results show that it performs well in terms of accuracy, F-measure, and false-positive rate.
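
To make the idea of attention over word-level features concrete, the following is a minimal, generic sketch of dot-product attention pooling in plain Python. It is not the conjoint attention mechanism proposed in the article, whose details are not given in the abstract; the function names `softmax` and `attention_pool`, the context vector, and the toy feature values are all assumptions introduced for illustration. In a real model the per-word feature vectors would come from learned layers such as a CNN or BiLSTM.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Generic dot-product attention pooling (illustrative only).

    hidden_states: one feature vector per word (hypothetical values here).
    context: a context vector of the same dimension (assumed given).
    Returns the attention weights and the weighted sum of the word vectors.
    """
    # Score each word by its dot product with the context vector.
    scores = [sum(h_d * c_d for h_d, c_d in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(context)
    # Weighted sum of word vectors, computed one dimension at a time.
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled


# Toy example: three words with 2-dimensional features (made-up numbers).
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
context = [1.0, 1.0]
weights, pooled = attention_pool(hidden_states, context)
```

In this toy input the second word has the largest dot product with the context vector, so it receives the largest weight and dominates the pooled representation. This is the sense in which an attention mechanism "finds the important words".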

Author information

Corresponding author

Correspondence to E. Elakkiya.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Elakkiya, E., Selvakumar, S. & Leela Velusamy, R. TextSpamDetector: textual content based deep learning framework for social spam detection using conjoint attention mechanism. J Ambient Intell Human Comput 12, 9287–9302 (2021). https://doi.org/10.1007/s12652-020-02640-5

  • DOI: https://doi.org/10.1007/s12652-020-02640-5
