Abstract
In this paper, we address the problem of emotion recognition and sentiment analysis. End-to-end deep learning models that exploit several modalities of data for emotion recognition or sentiment analysis have become an active research area. Numerous studies have shown that multimodal transformers can efficiently integrate heterogeneous modalities and improve the accuracy of emotion/sentiment prediction. We therefore propose a new multimodal transformer for sentiment analysis and emotion recognition. Compared to previous work, we integrate a gated residual network (GRN) into the multimodal transformer to better capitalize on the various signal modalities. Our method improves F1 score and accuracy over state-of-the-art results on the CMU-MOSI and IEMOCAP datasets.
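The gated residual network mentioned in the abstract is not specified here; a common formulation is the one from Lim et al.'s temporal fusion transformer, where a nonlinear transformation is gated by a GLU and added back to the input through a residual connection followed by layer normalization. The NumPy sketch below illustrates that formulation only; all parameter names and shapes are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def gated_residual_network(x, p):
    # Two-layer transformation: eta2 = ELU(x W2 + b2), eta1 = eta2 W1 + b1
    eta2 = elu(x @ p["W2"] + p["b2"])
    eta1 = eta2 @ p["W1"] + p["b1"]
    # GLU gate: a sigmoid gate controls how much transformed signal passes
    gate = sigmoid(eta1 @ p["W4"] + p["b4"])
    value = eta1 @ p["W5"] + p["b5"]
    # Residual connection + layer normalization
    return layer_norm(x + gate * value)

# Toy forward pass with illustrative dimensions
d = 8
rng = np.random.default_rng(0)
p = {k: rng.normal(scale=0.1, size=(d, d)) for k in ("W1", "W2", "W4", "W5")}
p.update({k: np.zeros(d) for k in ("b1", "b2", "b4", "b5")})
x = rng.normal(size=(4, d))       # batch of 4 feature vectors
out = gated_residual_network(x, p)
```

Because the gate can close (sigmoid near zero), the block can fall back to an almost-identity mapping of its input, which is the property that motivates using GRNs to weigh the contribution of each signal modality.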
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Hajlaoui, R., Bilodeau, GA., Rockemann, J. (2023). MTGR: Improving Emotion and Sentiment Analysis with Gated Residual Networks. In: Rousseau, JJ., Kapralos, B. (eds) Pattern Recognition, Computer Vision, and Image Processing. ICPR 2022 International Workshops and Challenges. ICPR 2022. Lecture Notes in Computer Science, vol 13643. Springer, Cham. https://doi.org/10.1007/978-3-031-37660-3_11
DOI: https://doi.org/10.1007/978-3-031-37660-3_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37659-7
Online ISBN: 978-3-031-37660-3