Abstract
Facial expression recognition (FER) suffers from high interclass similarity and large intraclass variation, which make expressions ambiguous or uncertain and confuse annotators. These properties also hinder the network from learning valuable facial expression features. Many recent studies have identified this uncertainty or ambiguity as one of the key challenges in FER. In this paper, we propose a new method that addresses the issue from two aspects: a soft label mining module that dynamically converts the original hard labels to soft labels during training, and an average facial expression anchoring module that separates unique expression features from similar expression features. The soft label mining module breaks the limits of the categorical model and mitigates the uncertainty or ambiguity, while the average facial expression anchoring module suppresses the high interclass similarity of facial expressions. Our method can be used to train any backbone network for facial expression recognition. Experiments on popular datasets show that our method achieves state-of-the-art results of 92.82% on RAF-DB and 67.91% on SFEW, and a comparable result of 62.26% on AffectNet. The code is available at https://github.com/HaipengMing/SLM-AEA.
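To make the idea of converting hard labels to soft labels during training concrete, the following is a minimal sketch and not the paper's soft label mining module, whose exact rule the abstract does not state. It assumes one common scheme: the soft label is a convex mix of the one-hot hard label and the network's current softmax prediction. The function names `mine_soft_label` and `soft_cross_entropy` and the mixing weight `alpha` are hypothetical, introduced only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mine_soft_label(hard_label, logits, alpha=0.3):
    """Blend a one-hot hard label with the current prediction.

    ASSUMPTION: this convex mix is an illustrative stand-in; the paper's
    actual mining rule is not given in the abstract.
    alpha = 0 returns the original hard label unchanged.
    """
    probs = softmax(logits)
    one_hot = [1.0 if k == hard_label else 0.0 for k in range(len(logits))]
    return [(1.0 - alpha) * h + alpha * p for h, p in zip(one_hot, probs)]


def soft_cross_entropy(soft_label, logits):
    """Cross-entropy between a soft target distribution and the prediction."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(soft_label, probs) if t > 0.0)


# Seven basic expression classes; the annotated (hard) class is index 2.
logits = [0.1, 0.2, 2.0, 1.5, 0.0, 0.0, 0.0]
soft = mine_soft_label(hard_label=2, logits=logits, alpha=0.3)
loss = soft_cross_entropy(soft, logits)
```

Because the soft label is recomputed from the current logits at every step, the target shifts as the network's predictions change, which is one way a label can move probability mass toward visually similar classes instead of staying fixed at a single category.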
Acknowledgements
This work was supported by the Tianjin Municipal Science and Technology Program for New Generation of Artificial Intelligence (19Z-XZNGX00030).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ming, H., Lu, W., Zhang, W. (2023). Soft Label Mining and Average Expression Anchoring for Facial Expression Recognition. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13844. Springer, Cham. https://doi.org/10.1007/978-3-031-26316-3_43
DOI: https://doi.org/10.1007/978-3-031-26316-3_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26315-6
Online ISBN: 978-3-031-26316-3
eBook Packages: Computer Science (R0)