
Dynamic adaptive threshold based learning for noisy annotations robust facial expression recognition

Published in: Multimedia Tools and Applications

Abstract

Real-world facial expression recognition (FER) datasets suffer from noisy annotations caused by crowd-sourcing, ambiguity in expressions, annotator subjectivity, and inter-class similarity. Moreover, deep networks have a strong capacity to memorize noisy annotations, leading to corrupted feature embeddings and poor generalization. Recent works address this problem by selecting samples with clean labels based on loss values, using a single fixed threshold for all classes, which is not always reliable. These methods also depend on the noise rate of the data, which is not always available. In this work, we propose a novel FER framework (DNFER) that selects samples with clean labels using a class-specific threshold computed dynamically in each mini-batch. Specifically, DNFER combines supervised training on the selected clean samples with unsupervised consistency training on all samples. Unlike other methods, this threshold is independent of the noise rate and requires no clean data. In addition, to learn effectively from noisy annotated samples, the posterior distributions of each weakly-augmented image and its strongly-augmented counterpart are aligned using an unsupervised consistency loss. We demonstrate the robustness of DNFER on both synthetic and real noisy annotated FER datasets. DNFER also obtains state-of-the-art performance on popular benchmark datasets: 90.41% on RAFDB, 57.77% on SFEW, 89.32% on FERPlus, and 65.22% on AffectNet-7. Our source code is publicly available at https://github.com/1980x/DNFER.
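The select-then-train recipe described in the abstract can be sketched in a few lines. This is a minimal NumPy illustration under our own assumptions: the function names are ours, and we use the mini-batch per-class mean confidence as the dynamic threshold and a KL divergence between the two augmented views' posteriors as the consistency loss; these are illustrative stand-ins, not necessarily the paper's exact statistics.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax over logits."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_class_thresholds(probs, labels, num_classes):
    """Per-class threshold, recomputed each mini-batch (illustrative:
    mean confidence the model assigns to the annotated class, taken
    over the batch samples annotated with that class)."""
    conf = probs[np.arange(len(labels)), labels]
    thr = np.zeros(num_classes)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            thr[c] = conf[mask].mean()
    return thr

def select_clean(probs, labels, thresholds):
    """A sample counts as 'clean' if its confidence on the annotated
    label reaches its own class's dynamic threshold."""
    conf = probs[np.arange(len(labels)), labels]
    return conf >= thresholds[labels]

def consistency_loss(p_weak, p_strong, eps=1e-8):
    """KL(weak || strong), averaged over the batch: aligns the
    strongly-augmented view's posterior with the weak view's."""
    kl = np.sum(p_weak * (np.log(p_weak + eps) - np.log(p_strong + eps)), axis=1)
    return kl.mean()

# Demo on a random mini-batch of 8 faces with 7 expression classes.
rng = np.random.default_rng(0)
probs = softmax(rng.normal(size=(8, 7)))
labels = rng.integers(0, 7, size=8)
thr = dynamic_class_thresholds(probs, labels, num_classes=7)
clean = select_clean(probs, labels, thr)  # boolean mask of "clean" samples
```

In a full training loop, the supervised cross-entropy would be applied only to the `clean` subset, while `consistency_loss` would be applied to every sample; because the threshold is a per-class batch statistic, no noise-rate estimate or clean validation set is needed.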





Acknowledgements

We dedicate this work to Bhagawan Sri Sathya Sai Baba, Divine Founder Chancellor of Sri Sathya Sai Institute of Higher Learning, PrasanthiNilyam, A.P., India.


Corresponding author

Correspondence to Darshan Gera.

Ethics declarations

Competing interests

Authors have no competing interests.

Code availability

Source code is publicly available at https://github.com/1980x/DNFER.


Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gera, D., Raj Kumar, B.V., Badveeti, N.S.K. et al. Dynamic adaptive threshold based learning for noisy annotations robust facial expression recognition. Multimed Tools Appl 83, 49537–49566 (2024). https://doi.org/10.1007/s11042-023-17510-3

