
Dialogue emotion model based on local–global context encoder and commonsense knowledge fusion attention

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Emotion Recognition in Conversation (ERC) aims to predict the emotion conveyed by each utterance in a dialogue. ERC research commonly integrates intra-utterance, local-context, and global-context information to obtain utterance representations. However, complex semantic dependencies exist among these three sources of information, and failing to model them accurately degrades emotion recognition performance. Moreover, to strengthen the semantic dependencies within the context, researchers often model external commonsense knowledge and then introduce it into the model; injecting this knowledge without considering its potential impact, however, can add unexpected noise. To address these issues, we propose a dialogue emotion model based on a local–global context encoder and commonsense knowledge fusion attention. The local–global context encoder integrates intra-utterance, local-context, and global-context information to capture the semantic dependencies among them. To provide more accurate external commonsense information, we present a fusion module that filters commonsense information through multi-head attention. Our method achieves competitive results on four datasets and shows advantages over mainstream models that use commonsense knowledge.
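To make the two components described above concrete, the sketch below shows one plausible way to (a) fuse intra-utterance, local-context, and global-context features and (b) filter commonsense vectors with multi-head attention before injecting them. This is a minimal illustrative sketch, not the authors' implementation: the module names, dimensions, the bidirectional GRU for local context, the self-attention for global context, and the gated injection are all assumptions.

```python
# Hypothetical sketch of a local-global context encoder and a
# commonsense fusion attention module (illustrative assumptions only).
import torch
import torch.nn as nn


class LocalGlobalContextEncoder(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        # Local context: a bidirectional GRU propagates information from
        # neighbouring utterances (assumed stand-in for local modeling).
        self.local_rnn = nn.GRU(d_model, d_model // 2,
                                batch_first=True, bidirectional=True)
        # Global context: self-attention over the whole dialogue.
        self.global_attn = nn.MultiheadAttention(d_model, n_heads,
                                                 batch_first=True)
        # Fuse intra-utterance, local, and global features.
        self.fuse = nn.Linear(3 * d_model, d_model)

    def forward(self, utt: torch.Tensor) -> torch.Tensor:
        # utt: (batch, num_utterances, d_model) utterance vectors.
        local, _ = self.local_rnn(utt)
        glob, _ = self.global_attn(utt, utt, utt)
        return torch.tanh(self.fuse(torch.cat([utt, local, glob], dim=-1)))


class CommonsenseFusionAttention(nn.Module):
    """Filter commonsense vectors by attending from context features,
    then inject them through a learned gate (gate is an assumption)."""

    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, context: torch.Tensor, csk: torch.Tensor) -> torch.Tensor:
        # context: (batch, num_utterances, d_model)
        # csk: commonsense vectors of the same shape (e.g. COMET-style).
        filtered, _ = self.attn(query=context, key=csk, value=csk)
        g = torch.sigmoid(self.gate(torch.cat([context, filtered], dim=-1)))
        return context + g * filtered


if __name__ == "__main__":
    x = torch.randn(2, 10, 768)   # 2 dialogues, 10 utterances each
    c = torch.randn(2, 10, 768)   # matching commonsense vectors
    h = LocalGlobalContextEncoder()(x)
    print(CommonsenseFusionAttention()(h, c).shape)  # torch.Size([2, 10, 768])
```

The gated residual in the fusion module reflects the abstract's goal of limiting noise from commonsense knowledge: only the attended, context-relevant portion of the commonsense vectors is added back to the utterance representation.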


Data availability

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.


Acknowledgements

The authors thank all reviewers for their constructive and helpful reviews.

Funding

This research is funded by the National Natural Science Foundation of China (62372283, 62206163), the Natural Science Foundation of Guangdong Province (2019A1515010943), the Basic and Applied Basic Research of Colleges and Universities in Guangdong Province (Special Projects in Artificial Intelligence) (2019KZDZX1030), the 2020 Li Ka Shing Foundation Cross-Disciplinary Research Grant (2020LKSFG04D), the Science and Technology Major Project of Guangdong Province (STKJ2021005, STKJ202209002, STKJ2023076), and the Opening Project of the Guangdong Province Key Laboratory of Information Security Technology (2020B1212060078).

Author information

Authors and Affiliations

Authors

Contributions

W.Y.: Original Draft. C.L.: Review & Editing. X.H.: Review & Editing. W.Z.: Validation. E.C.: Review & Editing. D.J.: Supervision.

Corresponding authors

Correspondence to Erik Cambria or Dazhi Jiang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yu, W., Li, C., Hu, X. et al. Dialogue emotion model based on local–global context encoder and commonsense knowledge fusion attention. Int. J. Mach. Learn. & Cyber. (2024). https://doi.org/10.1007/s13042-023-02066-3
