Abstract
Text classification plays a significant role in the business world. In news classification, detecting the subject of an article is an important task that enables the recognition of news trends and junk news. Several deep learning algorithms are available for text classification. In this paper, specific algorithms are implemented and compared for identifying the subject of texts in a Persian news corpus. The best results are obtained by a BiGRU with an attention mechanism followed by a capsule network (BiGRUACaps). The GRU outperforms the LSTM because it has fewer gates and therefore fewer parameters; it controls the information flow without a separate memory unit and has shown better performance when less training data is available. Moreover, since news texts contain long sentences, the attention mechanism makes the most informative words more prominent and mitigates the difficulty of handling long sequences. The most significant obstacle in classifying Persian texts is the lack of a suitable dataset, so one contribution of this work is scraped data: 20,726 categorized records collected from Persian news websites, forming the best available categorized Persian news dataset. A further challenge addressed in this research is the scarcity of appropriate pre-trained Persian models, the combination of various neural networks with such models, and the determination of the optimal model for identifying the subject of Persian text. The application of CapsNet to Persian data is also investigated and yields promising results. The comparative results show an improvement in the classification performance of Persian texts; the best result, an F-measure of 0.8608, is achieved by the BiGRUACaps combination.
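The abstract describes a pipeline of a bidirectional GRU, an attention layer over its hidden states, and a capsule layer with dynamic routing whose class-capsule lengths act as class scores. Below is a minimal PyTorch sketch of such a BiGRUACaps-style classifier; it illustrates the general architecture only and is not the authors' implementation, and every hyperparameter (embedding size 300, hidden size 128, capsule dimension 16, three routing iterations) is an assumption for the example.

```python
# Minimal sketch of a BiGRU + attention + capsule classifier (assumed settings).
import torch
import torch.nn as nn


def squash(s, dim=-1, eps=1e-8):
    """Capsule squashing non-linearity (Sabour et al., 2017)."""
    norm2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)


class BiGRUAttentionCaps(nn.Module):
    def __init__(self, vocab_size, num_classes, emb_dim=300, hidden=128,
                 caps_dim=16, routing_iters=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.bigru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        # Additive attention over the BiGRU hidden states.
        self.attn = nn.Linear(2 * hidden, 1)
        # Transformation from each attended timestep to one pose vector per class capsule.
        self.W = nn.Parameter(0.01 * torch.randn(num_classes, 2 * hidden, caps_dim))
        self.routing_iters = routing_iters

    def forward(self, token_ids):                        # (B, T)
        h, _ = self.bigru(self.embedding(token_ids))     # (B, T, 2H)
        alpha = torch.softmax(self.attn(h), dim=1)       # (B, T, 1) attention weights
        h = alpha * h                                     # emphasize informative words
        u_hat = torch.einsum('btd,cde->btce', h, self.W)  # predicted poses (B, T, C, D)
        # Dynamic routing-by-agreement between timesteps and class capsules.
        b = torch.zeros(u_hat.shape[:3], device=u_hat.device)    # (B, T, C) logits
        for _ in range(self.routing_iters):
            c = torch.softmax(b, dim=-1).unsqueeze(-1)            # coupling coefficients
            v = squash((c * u_hat).sum(dim=1))                    # class capsules (B, C, D)
            b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)          # agreement update
        return v.norm(dim=-1)                                     # capsule lengths as class scores


if __name__ == "__main__":
    model = BiGRUAttentionCaps(vocab_size=30000, num_classes=7)
    scores = model(torch.randint(1, 30000, (4, 50)))   # batch of 4 padded sequences
    print(scores.shape)                                 # torch.Size([4, 7])
```

In the original CapsNet formulation the network is trained with a margin loss over the capsule lengths; for a quick experiment, a standard cross-entropy over these scores would also serve, though that is a simplification rather than the method reported in the paper.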
Cite this article
Kenarang, A., Farahani, M. & Manthouri, M. BiGRU attention capsule neural network for persian text classification. J Ambient Intell Human Comput 13, 3923–3933 (2022). https://doi.org/10.1007/s12652-022-03742-y