
Deep Recurrent Neural Model for Multi Domain Sentiment Analysis with Attention Mechanism

Published in: Wireless Personal Communications

Abstract

Multi-domain sentiment analysis is challenging because the same word can carry different meanings in different domains. This paper proposes a deep bidirectional Recurrent Neural Network sentiment classification system that employs an attention mechanism for multi-domain classification. The approach derives a domain representation by extracting domain-related features from the text with a bidirectional recurrent network and attention, and feeds this representation, together with the processed text, to the sentiment classifier through shared hidden layers. We experiment with several types of recurrent networks and find that implementing the recurrent network with gated recurrent units allows domain-specific feature extraction and feature sharing for classification to be performed simultaneously and effectively. The domain and sentiment modules were evaluated separately, and the results are encouraging. Using a bidirectional gated recurrent unit network in both modules gives efficient performance: it trains quickly and achieves higher validation accuracy across all domains. The proposed model also compares favourably on other metrics against similar state-of-the-art approaches.

Data Availability

The data may be made available on request, subject to permission.

Code Availability

The code may be made available on request, subject to permission.


Acknowledgements

This research work was funded by Institutional Fund Projects under grant no. (G:1299-611-1440). The authors gratefully acknowledge the technical and financial support of the Ministry of Education and the Deanship of Scientific Research (DSR), King Abdulaziz University (KAU), Jeddah, Saudi Arabia.

Funding

This study was funded by Deanship of Scientific Research (DSR), King Abdulaziz University Jeddah, under Grant No. (G:1299-611-1440).

Author information

Authors and Affiliations

Authors

Contributions

All authors contributed to planning, the literature survey, experimentation, manuscript writing, editing and proofreading.

Corresponding author

Correspondence to Akashdeep Sharma.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

1.1 Appendix 1: Recurrent Networks and Attention Mechanism

Recurrent Neural Networks (RNNs) are supervised deep learning networks in which neurons are connected to each other through time, which allows the network to exhibit temporal behaviour. An RNN contains loops in its hidden layers that hold information from earlier steps in order to predict the value at the current time step. It forms connections between units in directed loops and remembers previous inputs through its internal state, as illustrated in Fig. 3.

Fig. 3 RNN and its illustration

Here, "x" is the input layer, "h" is the hidden layer, and "y" is the output layer. A, B, and C are the network parameters used to improve the output of the model. At any time step t, the hidden state is computed from the current input x(t) together with the state carried over from step t-1, so the output at step t-1 influences the output at step t; this is what allows an RNN to model a sequence over time. In practice, plain RNNs capture only short-term dependencies because of the vanishing and exploding gradient problems.
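To make the recurrence concrete, the following is a minimal NumPy sketch of a vanilla RNN forward pass under the notation above; the function name, dimensions and weight names (W_x, W_h, W_y, standing in for the parameters A, B and C) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def rnn_forward(x_seq, W_x, W_h, W_y, b_h, b_y):
    """Run a vanilla RNN over a sequence.

    x_seq           : (T, input_dim) sequence of input vectors
    W_x, W_h, W_y   : input-to-hidden, hidden-to-hidden, hidden-to-output weights
                      (corresponding to the A, B, C parameters mentioned above)
    """
    hidden_dim = W_h.shape[0]
    h = np.zeros(hidden_dim)              # initial hidden state
    outputs = []
    for x_t in x_seq:
        # the new state depends on the current input and the previous state
        h = np.tanh(W_x @ x_t + W_h @ h + b_h)
        y_t = W_y @ h + b_y               # output at this time step
        outputs.append(y_t)
    return np.stack(outputs), h

# Illustrative dimensions and random weights
T, input_dim, hidden_dim, output_dim = 5, 8, 16, 4
rng = np.random.default_rng(0)
x_seq = rng.normal(size=(T, input_dim))
W_x = 0.1 * rng.normal(size=(hidden_dim, input_dim))
W_h = 0.1 * rng.normal(size=(hidden_dim, hidden_dim))
W_y = 0.1 * rng.normal(size=(output_dim, hidden_dim))
outputs, last_h = rnn_forward(x_seq, W_x, W_h, W_y, np.zeros(hidden_dim), np.zeros(output_dim))
```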

1.2 Long Short-Term Memory Networks (LSTM)

LSTM is an improved variant of the RNN designed specifically to mitigate the vanishing and exploding gradient problems. It has a memory cell running along the top of the unit that carries information efficiently from one time instance to the next, which allows it to remember information from earlier states far better than a plain RNN.

As shown in Fig. 4, the LSTM network is fed the input at the present time instance together with the hidden-layer output from the previous time instance. LSTMs have "cells" in the hidden layers of the network, each with three gates: an input gate, an output gate, and a forget gate. These gates control the flow of information needed to predict the output.
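The gate computations can be sketched as follows. This is a minimal NumPy illustration of a single LSTM step, assuming per-gate weight matrices W, U and biases b stored in dictionaries; the names are hypothetical and not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step with input, forget and output gates.

    W, U, b are dicts keyed by 'i', 'f', 'o', 'g' holding input-to-hidden
    weights, hidden-to-hidden weights and biases for each gate.
    """
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])   # input gate
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])   # forget gate
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])   # output gate
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])   # candidate cell update
    c = f * c_prev + i * g        # memory cell carries information forward in time
    h = o * np.tanh(c)            # hidden state exposed to the next time step
    return h, c
```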

Fig. 4 LSTM and GRU architecture

1.3 Gated Recurrent Unit (GRU)

GRU is a well-accepted variant that uses hidden states to regulate information instead of a separate cell state. A GRU has two gates instead of three: a reset gate and an update gate, where the update gate merges the roles of the LSTM's forget and input gates. As with the gates in an LSTM, the reset and update gates control how much information is retained and which information is retained. The resulting model is simpler than the standard LSTM, and given a set of observations the learned model can provide the corresponding output vector. GRUs have been found to be as effective as LSTMs.
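For comparison with the LSTM step above, here is a minimal NumPy sketch of one GRU step; the weight names and dictionary layout are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU time step with reset and update gates (no separate cell state).

    W, U, b are dicts keyed by 'z' (update), 'r' (reset) and 'h' (candidate).
    """
    z = sigmoid(W['z'] @ x_t + U['z'] @ h_prev + b['z'])              # update gate
    r = sigmoid(W['r'] @ x_t + U['r'] @ h_prev + b['r'])              # reset gate
    h_tilde = np.tanh(W['h'] @ x_t + U['h'] @ (r * h_prev) + b['h'])  # candidate state
    h = (1.0 - z) * h_prev + z * h_tilde   # update gate balances old and new information
    return h
```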

1.4 Attention

Attention was proposed to address a limitation of the encoder-decoder model, which encodes the input sequence into a single fixed-length vector from which the decoder produces the output at each time step. The attention mechanism lets the model search the source sentence for the information relevant to predicting the next word and assign more weight to that information. This is achieved by creating a context vector for each step rather than one fixed context vector. As can be seen in Fig. 5, the encoder works as before, but the decoder's hidden state is computed from a context vector, the previous output and the previous hidden state, with a separate context vector for each target word. Each context vector is a weighted sum of the forward and backward activation states, and the weights (the alphas) denote how much attention each input position receives when generating the output word.
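The context-vector computation described above can be sketched as follows. This minimal NumPy example assumes a simple bilinear scoring function (W_a), which is one of several common choices and not necessarily the one used in the paper.

```python
import numpy as np

def attention_context(decoder_state, encoder_states, W_a):
    """Compute one context vector as an attention-weighted sum of encoder states.

    encoder_states : (T, enc_dim) concatenated forward/backward activations
    decoder_state  : (dec_dim,) previous decoder hidden state
    W_a            : (dec_dim, enc_dim) alignment weights (illustrative bilinear score)
    """
    # Alignment scores between the decoder state and every encoder position
    scores = encoder_states @ (W_a.T @ decoder_state)
    # Softmax turns the scores into the "alphas": how much attention each input gets
    alphas = np.exp(scores - scores.max())
    alphas /= alphas.sum()
    # Context vector: weighted sum of encoder activations for this output step
    context = alphas @ encoder_states
    return context, alphas
```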

Fig. 5 Sample attention mechanism

1.5 Appendix 2: Details of the Parameters Used

d_pred_acc: Accuracy is the ratio of correctly predicted observations to total observations; here it is the accuracy of the model in predicting the domain.

d_pred_loss: The training loss of the domain prediction output; it indicates how inaccurate the model's domain prediction is on a single example.

loss: The overall training loss, which reflects how well the model fits the given data; cross-entropy loss was used to train our models.

s_pred_acc: The accuracy of the model to predict the sentiment.

s_pred_loss: The loss in the model to predict the sentiment.

val_d_pred_acc: The accuracy of the model to predict the domain during validation.

val_d_pred_loss: The loss in the model to predict the domain during validation.

val_loss: The combined loss from both prediction tasks (domain and sentiment) during validation.

val_s_pred_acc: The accuracy of the model to predict the sentiment during validation.

val_s_pred_loss: The loss in the model to predict the sentiment during validation.
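These metric names follow the pattern produced by Keras when a model has two named outputs and is compiled with per-output losses and accuracy metrics. The following is a hypothetical minimal sketch, assuming output layers named d_pred and s_pred and a shared bidirectional GRU encoder; the layer sizes, losses and output names are illustrative assumptions, not the authors' exact architecture or hyper-parameters.

```python
# Hypothetical sketch (TensorFlow/Keras): a two-output model whose training log
# would contain the metric names listed above. Output names "d_pred" and
# "s_pred", layer sizes and losses are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, embed_dim, seq_len, num_domains = 20000, 128, 200, 4

inputs = layers.Input(shape=(seq_len,), dtype="int32")
x = layers.Embedding(vocab_size, embed_dim)(inputs)
shared = layers.Bidirectional(layers.GRU(64))(x)   # shared hidden representation

d_pred = layers.Dense(num_domains, activation="softmax", name="d_pred")(shared)  # domain output
s_pred = layers.Dense(1, activation="sigmoid", name="s_pred")(shared)            # sentiment output

model = tf.keras.Model(inputs, [d_pred, s_pred])
model.compile(
    optimizer="adam",
    loss={"d_pred": "categorical_crossentropy", "s_pred": "binary_crossentropy"},
    metrics={"d_pred": "acc", "s_pred": "acc"},
)
# model.fit(...) would then report loss, d_pred_loss, d_pred_acc, s_pred_loss and
# s_pred_acc per epoch, plus their val_* counterparts when validation data is given.
```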

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Alyoubi, K.H., Sharma, A. Deep Recurrent Neural Model for Multi Domain Sentiment Analysis with Attention Mechanism. Wireless Pers Commun 130, 43–60 (2023). https://doi.org/10.1007/s11277-023-10274-x

