
Deep Recurrent Neural Model for Multi Domain Sentiment Analysis with Attention Mechanism

Published in: Wireless Personal Communications

Abstract

Multi-domain sentiment analysis is challenging because the same word can carry different meanings in different domains. This paper proposes a deep bidirectional Recurrent Neural Network sentiment classification system that employs an attention mechanism for multi-domain classification. The approach derives a domain representation by extracting domain-related features from the text with a bidirectional recurrent network and attention, and feeds this representation, together with the processed text, to the sentiment classifier through shared hidden layers. We experiment with several types of recurrent networks and find that implementing the recurrent network with gated recurrent units allows domain-specific feature extraction and feature sharing for classification to be performed simultaneously and effectively. The domain and sentiment modules were evaluated separately, and the results are encouraging. Using a bidirectional gated recurrent unit network in both modules gives efficient performance: it trains quickly and achieves higher validation accuracy across all domains. The proposed model also compares favourably on other metrics against similar state-of-the-art approaches.

Data Availability

The data may be made available on request, subject to permission.

Code Availability

The code may be made available on request, subject to permission.


Acknowledgements

This research work was funded by Institutional Fund Projects under grant no. (G:1299-611-1440). The authors gratefully acknowledge the technical and financial support of the Ministry of Education and the Deanship of Scientific Research (DSR), King Abdulaziz University (KAU), Jeddah, Saudi Arabia.

Funding

This study was funded by Deanship of Scientific Research (DSR), King Abdulaziz University Jeddah, under Grant No. (G:1299-611-1440).

Author information

Authors and Affiliations

Authors

Contributions

All authors contributed to planning, the literature survey, experimentation, manuscript writing, editing and proofreading.

Corresponding author

Correspondence to Akashdeep Sharma.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

1.1 Appendix 1: Recurrent Networks and Attention Mechanism

Recurrent Neural Networks (RNNs) are supervised deep learning networks in which neurons are connected to each other through time, which allows the network to exhibit temporal behaviour. An RNN contains loops in its hidden layers that hold information from earlier steps in order to predict the value at the current time step. It forms connections between units in directed loops and remembers previous inputs through its internal state, as illustrated in Fig. 3.

Fig. 3 RNN and its illustration

Here, "x" is the input layer, "h" is the hidden layer, and "y" is the output layer. A, B, and C are the network parameters used to improve the output of the model. At any time step t, the hidden state is computed from the current input x(t) together with the state carried over from step t-1, so the output at step t-1 influences the output at step t; this is what allows an RNN to model a sequence over time. In practice, plain RNNs capture only short-term dependencies because of the vanishing and exploding gradient problems.
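To make the recurrence concrete, the following is a minimal NumPy sketch of a vanilla RNN forward pass under the notation above; the function name, dimensions and weight names (W_x, W_h, W_y, standing in for the parameters A, B and C) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def rnn_forward(x_seq, W_x, W_h, W_y, b_h, b_y):
    """Run a vanilla RNN over a sequence.

    x_seq           : (T, input_dim) sequence of input vectors
    W_x, W_h, W_y   : input-to-hidden, hidden-to-hidden, hidden-to-output weights
                      (corresponding to the A, B, C parameters mentioned above)
    """
    hidden_dim = W_h.shape[0]
    h = np.zeros(hidden_dim)              # initial hidden state
    outputs = []
    for x_t in x_seq:
        # the new state depends on the current input and the previous state
        h = np.tanh(W_x @ x_t + W_h @ h + b_h)
        y_t = W_y @ h + b_y               # output at this time step
        outputs.append(y_t)
    return np.stack(outputs), h

# Illustrative dimensions and random weights
T, input_dim, hidden_dim, output_dim = 5, 8, 16, 4
rng = np.random.default_rng(0)
x_seq = rng.normal(size=(T, input_dim))
W_x = 0.1 * rng.normal(size=(hidden_dim, input_dim))
W_h = 0.1 * rng.normal(size=(hidden_dim, hidden_dim))
W_y = 0.1 * rng.normal(size=(output_dim, hidden_dim))
outputs, last_h = rnn_forward(x_seq, W_x, W_h, W_y, np.zeros(hidden_dim), np.zeros(output_dim))
```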

1.2 Long Short-Term Memory Networks (LSTM)

LSTM is an improved variant of the RNN designed specifically to mitigate the vanishing and exploding gradient problems. It has a memory cell running along the top of the unit that carries information efficiently from one time instance to the next, which allows it to remember information from earlier states far better than a plain RNN.

As shown in Fig. 4, the LSTM network is fed the input at the present time instance together with the hidden-layer output from the previous time instance. LSTMs have "cells" in the hidden layers of the network, each with three gates: an input gate, an output gate, and a forget gate. These gates control the flow of information needed to predict the output.
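The gate computations can be sketched as follows. This is a minimal NumPy illustration of a single LSTM step, assuming per-gate weight matrices W, U and biases b stored in dictionaries; the names are hypothetical and not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step with input, forget and output gates.

    W, U, b are dicts keyed by 'i', 'f', 'o', 'g' holding input-to-hidden
    weights, hidden-to-hidden weights and biases for each gate.
    """
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])   # input gate
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])   # forget gate
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])   # output gate
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])   # candidate cell update
    c = f * c_prev + i * g        # memory cell carries information forward in time
    h = o * np.tanh(c)            # hidden state exposed to the next time step
    return h, c
```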

Fig. 4 LSTM and GRU architecture

1.3 Gated Recurrent Unit (GRU)

GRU is a well-accepted variant that uses hidden states to regulate information instead of a separate cell state. A GRU has two gates instead of three: a reset gate and an update gate, where the update gate merges the roles of the LSTM's forget and input gates. As with the gates in an LSTM, the reset and update gates control how much information is retained and which information is retained. The resulting model is simpler than the standard LSTM, and given a set of observations the learned model can provide the corresponding output vector. GRUs have been found to be as effective as LSTMs.
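For comparison with the LSTM step above, here is a minimal NumPy sketch of one GRU step; the weight names and dictionary layout are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU time step with reset and update gates (no separate cell state).

    W, U, b are dicts keyed by 'z' (update), 'r' (reset) and 'h' (candidate).
    """
    z = sigmoid(W['z'] @ x_t + U['z'] @ h_prev + b['z'])              # update gate
    r = sigmoid(W['r'] @ x_t + U['r'] @ h_prev + b['r'])              # reset gate
    h_tilde = np.tanh(W['h'] @ x_t + U['h'] @ (r * h_prev) + b['h'])  # candidate state
    h = (1.0 - z) * h_prev + z * h_tilde   # update gate balances old and new information
    return h
```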

1.4 Attention

Attention was proposed to address a limitation of the encoder-decoder model, which encodes the input sequence into a single fixed-length vector from which the decoder produces the output at each time step. The attention mechanism lets the model search the source sentence for the information relevant to predicting the next word and assign more weight to that information. This is achieved by creating a context vector for each step rather than one fixed context vector. As can be seen in Fig. 5, the encoder works as before, but the decoder's hidden state is computed from a context vector, the previous output and the previous hidden state, with a separate context vector for each target word. Each context vector is a weighted sum of the forward and backward activation states, and the weights (the alphas) denote how much attention each input position receives when generating the output word.
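The context-vector computation described above can be sketched as follows. This minimal NumPy example assumes a simple bilinear scoring function (W_a), which is one of several common choices and not necessarily the one used in the paper.

```python
import numpy as np

def attention_context(decoder_state, encoder_states, W_a):
    """Compute one context vector as an attention-weighted sum of encoder states.

    encoder_states : (T, enc_dim) concatenated forward/backward activations
    decoder_state  : (dec_dim,) previous decoder hidden state
    W_a            : (dec_dim, enc_dim) alignment weights (illustrative bilinear score)
    """
    # Alignment scores between the decoder state and every encoder position
    scores = encoder_states @ (W_a.T @ decoder_state)
    # Softmax turns the scores into the "alphas": how much attention each input gets
    alphas = np.exp(scores - scores.max())
    alphas /= alphas.sum()
    # Context vector: weighted sum of encoder activations for this output step
    context = alphas @ encoder_states
    return context, alphas
```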

Fig. 5 Sample attention mechanism

1.5 Appendix 2: Details of the Parameters Used

d_pred_acc: Accuracy is the ratio of correctly predicted observations to total observations; here it is the accuracy of the model in predicting the domain.

d_pred_loss: The training loss of the domain prediction output; it indicates how inaccurate the model's domain prediction is on a single example.

loss: The overall training loss, which reflects how well the model fits the given data; cross-entropy loss was used to train our models.

s_pred_acc: The accuracy of the model to predict the sentiment.

s_pred_loss: The loss in the model to predict the sentiment.

val_d_pred_acc: The accuracy of the model to predict the domain during validation.

val_d_pred_loss: The loss in the model to predict the domain during validation.

val_loss: The combined loss from both prediction tasks (domain and sentiment) during validation.

val_s_pred_acc: The accuracy of the model to predict the sentiment during validation.

val_s_pred_loss: The loss in the model to predict the sentiment during validation.
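These metric names follow the pattern produced by Keras when a model has two named outputs and is compiled with per-output losses and accuracy metrics. The following is a hypothetical minimal sketch, assuming output layers named d_pred and s_pred and a shared bidirectional GRU encoder; the layer sizes, losses and output names are illustrative assumptions, not the authors' exact architecture or hyper-parameters.

```python
# Hypothetical sketch (TensorFlow/Keras): a two-output model whose training log
# would contain the metric names listed above. Output names "d_pred" and
# "s_pred", layer sizes and losses are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, embed_dim, seq_len, num_domains = 20000, 128, 200, 4

inputs = layers.Input(shape=(seq_len,), dtype="int32")
x = layers.Embedding(vocab_size, embed_dim)(inputs)
shared = layers.Bidirectional(layers.GRU(64))(x)   # shared hidden representation

d_pred = layers.Dense(num_domains, activation="softmax", name="d_pred")(shared)  # domain output
s_pred = layers.Dense(1, activation="sigmoid", name="s_pred")(shared)            # sentiment output

model = tf.keras.Model(inputs, [d_pred, s_pred])
model.compile(
    optimizer="adam",
    loss={"d_pred": "categorical_crossentropy", "s_pred": "binary_crossentropy"},
    metrics={"d_pred": "acc", "s_pred": "acc"},
)
# model.fit(...) would then report loss, d_pred_loss, d_pred_acc, s_pred_loss and
# s_pred_acc per epoch, plus their val_* counterparts when validation data is given.
```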

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Alyoubi, K.H., Sharma, A. Deep Recurrent Neural Model for Multi Domain Sentiment Analysis with Attention Mechanism. Wireless Pers Commun 130, 43–60 (2023). https://doi.org/10.1007/s11277-023-10274-x

