BiGRU attention capsule neural network for Persian text classification

  • Original Research
  • Published in the Journal of Ambient Intelligence and Humanized Computing

Abstract

Text classification is a significant part of the business world. In news classification, detecting the subject of an article is an important problem, since it enables the recognition of news trends and of junk news. Various deep learning algorithms exist for text classification. In this paper, several of these algorithms are implemented and compared for identifying the subject of texts in a Persian news corpus. The best results belong to the BiGRU with attention mechanism and CapsNet (BiGRUACaps) method. The GRU network outperforms LSTM because it has fewer gates and therefore fewer parameters; it controls the flow of information without a separate memory unit and has been shown to perform better when less data is available. Moreover, because news texts contain long sentences, the attention mechanism highlights the important words and alleviates the difficulty of learning from long sequences. The most significant obstacle in classifying Persian texts was the lack of a suitable dataset, so one contribution of this work is scraped data: 20,726 records collected from Persian news websites, which form the best categorized Persian news dataset available. Other challenges addressed in this research were the lack of appropriate pre-trained Persian models, the combination of various neural networks with such models, and the determination of the optimal model for identifying the subject of Persian text. The application of CapsNet to Persian data is also investigated, with promising results. The comparison shows an improvement in the classification performance of Persian texts; the best result, an F-measure of 0.8608, is obtained by the BiGRUACaps combination.
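The abstract names the BiGRUACaps pipeline but gives no internals, so the following is a minimal, hypothetical PyTorch sketch of such an architecture rather than the authors' implementation: an embedding layer feeds a bidirectional GRU, additive attention pools the time steps into a single vector, and one capsule layer produces a vector per class whose length serves as the class score. All layer names and sizes here are assumptions, and dynamic routing between capsule layers is omitted for brevity.

```python
import torch
import torch.nn as nn

def squash(s, dim=-1, eps=1e-8):
    # Capsule non-linearity (Sabour et al. 2017): shrinks short vectors
    # toward zero and long vectors toward unit length.
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

class BiGRUACaps(nn.Module):
    """Hypothetical sketch: BiGRU -> additive attention -> class capsules."""

    def __init__(self, vocab_size, embed_dim=300, hidden=128,
                 num_classes=7, caps_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # A GRU cell has 3 weight blocks to the LSTM's 4, which is the
        # abstract's "fewer gates, fewer parameters" argument.
        self.bigru = nn.GRU(embed_dim, hidden, batch_first=True,
                            bidirectional=True)
        # Additive attention over time steps lets salient words in long
        # news sentences dominate the pooled representation.
        self.att = nn.Linear(2 * hidden, 1)
        # One capsule vector per class; its length is the class score.
        self.caps = nn.Linear(2 * hidden, num_classes * caps_dim)
        self.num_classes, self.caps_dim = num_classes, caps_dim

    def forward(self, tokens):                             # tokens: (B, T)
        h, _ = self.bigru(self.embed(tokens))              # (B, T, 2*hidden)
        a = torch.softmax(self.att(h).squeeze(-1), dim=1)  # (B, T)
        ctx = (a.unsqueeze(-1) * h).sum(dim=1)             # weighted sum
        caps = self.caps(ctx).view(-1, self.num_classes, self.caps_dim)
        return squash(caps).norm(dim=-1)                   # (B, num_classes)
```

Calling BiGRUACaps(vocab_size=50000) on an integer token batch of shape (8, 120) would return an (8, 7) tensor of capsule lengths, trainable with a margin or cross-entropy loss; the 0.8608 F-measure reported above refers to the authors' own model, not to this sketch.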

Author information

Corresponding author

Correspondence to Mohammad Manthouri.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kenarang, A., Farahani, M. & Manthouri, M. BiGRU attention capsule neural network for Persian text classification. J Ambient Intell Human Comput 13, 3923–3933 (2022). https://doi.org/10.1007/s12652-022-03742-y

  • DOI: https://doi.org/10.1007/s12652-022-03742-y
