Soft Label Mining and Average Expression Anchoring for Facial Expression Recognition

Ming, Haipeng; Lu, Wenhuan; Zhang, Wei

doi:10.1007/978-3-031-26316-3_43

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13844)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Facial expression recognition (FER) suffers from high interclass similarity and large intraclass variation, which make expressions ambiguous or uncertain and confuse annotators. These factors also hinder the network in learning valuable facial expression features. Many recent studies have identified this uncertainty or ambiguity as one of the key challenges in FER. In this paper, we propose a new method that addresses the issue from two aspects: a soft label mining module that dynamically converts the original hard labels to soft labels during training, and an average facial expression anchoring module that separates unique expression features from similar expression features. The soft label mining module breaks the limits of the categorical model and mitigates the uncertainty or ambiguity, while the average facial expression anchoring module suppresses the high interclass similarity of facial expressions. Our method can be used to train any backbone network for facial expression recognition. Experiments on popular datasets show that our method achieves state-of-the-art results of 92.82% on RAF-DB and 67.91% on SFEW, and a comparable result of 62.26% on AffectNet. The code is available at https://github.com/HaipengMing/SLM-AEA.
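
The abstract does not give the formulas behind the two modules, so the following is only a generic sketch of the two ideas it names, not the authors' implementation. It assumes soft labels are obtained by blending the one-hot label with the network's current prediction, and that anchoring amounts to subtracting a mean feature vector. The function names, the blending weight alpha, and the use of a batch mean as the "average expression" are all hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def mine_soft_labels(hard_labels, logits, num_classes, alpha=0.7):
    """Blend one-hot labels with current predictions (hypothetical rule).

    alpha is an assumed blending weight; with alpha > 0.5 the original
    hard label always keeps the largest share of the soft label.
    """
    one_hot = np.eye(num_classes)[hard_labels]
    return alpha * one_hot + (1.0 - alpha) * softmax(logits)


def anchor_to_average(features):
    """Subtract the batch-mean feature (assumed stand-in for an
    'average expression'), leaving what is specific to each sample."""
    return features - features.mean(axis=0, keepdims=True)


def soft_cross_entropy(soft_labels, logits):
    """Cross-entropy between soft targets and predicted probabilities."""
    log_probs = np.log(softmax(logits) + 1e-12)
    return float(-(soft_labels * log_probs).sum(axis=1).mean())


# Toy batch: 2 samples, 3 expression classes, 4-dimensional features.
hard_labels = np.array([0, 2])
logits = np.array([[2.0, 1.0, 0.1], [0.5, 0.2, 0.3]])
features = np.array([[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0]])

soft_labels = mine_soft_labels(hard_labels, logits, num_classes=3)
anchored = anchor_to_average(features)
loss = soft_cross_entropy(soft_labels, logits)
```

In a training loop, a rule of this kind would be re-applied at each step so that the soft labels follow the network's predictions, which is one way to read "dynamically during training"; the released code at the URL above is the reference for what the paper actually does.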

Acknowledgements

This work was supported by the Tianjin Municipal Science and Technology Program for New Generation of Artificial Intelligence (19Z-XZNGX00030).

Author information

Authors and Affiliations

Tianjin University, Tianjin, China
Haipeng Ming, Wenhuan Lu & Wei Zhang

Corresponding author

Correspondence to Wei Zhang.

Editor information

Editors and Affiliations

University of Wollongong, Wollongong, NSW, Australia
Lei Wang
University of Bonn, Bonn, Germany
Juergen Gall
University of Adelaide, Adelaide, SA, Australia
Tat-Jun Chin
National Institute of Informatics, Tokyo, Japan
Imari Sato
Johns Hopkins University, Baltimore, MD, USA
Rama Chellappa

Cite this paper

Ming, H., Lu, W., Zhang, W. (2023). Soft Label Mining and Average Expression Anchoring for Facial Expression Recognition. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13844. Springer, Cham. https://doi.org/10.1007/978-3-031-26316-3_43

DOI: https://doi.org/10.1007/978-3-031-26316-3_43
Published: 02 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26315-6
Online ISBN: 978-3-031-26316-3
eBook Packages: Computer Science, Computer Science (R0)
