
Soft thresholding squeeze-and-excitation network for pose-invariant facial expression recognition

Abstract

Pose-invariant facial expression recognition (FER) is a popular research direction in computer vision, but pose variations usually change facial appearance significantly, making recognition results unstable across viewpoints. In this paper, a novel deep learning module, the soft thresholding squeeze-and-excitation (ST-SE) block, was proposed to extract salient channel-wise features for pose-invariant FER. To better adapt to facial images under different poses, a global average pooling (GAP) operation was adopted to compute the average value of each channel of the feature map. To enhance the representational power of the network, a Squeeze-and-Excitation (SE) block was embedded into the nonlinear transformation layer to filter out redundant feature information. To further suppress redundant features, the absolute value of the GAP output was multiplied by the SE scaling coefficient to obtain a soft threshold suited to the current view. The developed ST-SE block was inserted into ResNet50 to evaluate recognition performance. Extensive experiments were carried out on four datasets with pose variations, i.e., BU-3DFE, Multi-PIE, Pose-RAF-DB and Pose-AffectNet, and the influences of different environments, poses and expression intensities on recognition were analyzed. The experimental results demonstrate the feasibility and effectiveness of our method.
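
Based only on the description above, the following is a minimal PyTorch-style sketch of how such a block could be implemented: GAP over the absolute feature map, an SE-style excitation branch, and soft thresholding with a channel-wise threshold formed from their product. The class name `STSEBlock`, the reduction ratio, and the exact placement of the thresholding are assumptions for illustration, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class STSEBlock(nn.Module):
    """Sketch of an ST-SE block reconstructed from the abstract (assumed design):
    channel-wise GAP of |activations|, an SE excitation branch, and soft
    thresholding with a view-adaptive channel-wise threshold."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)          # global average pooling per channel
        self.excite = nn.Sequential(                # SE branch: squeeze -> excite -> sigmoid gate
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        s = self.gap(x.abs()).view(b, c)            # mean of absolute activations per channel
        alpha = self.excite(s)                      # channel scaling coefficients in (0, 1)
        tau = (s * alpha).view(b, c, 1, 1)          # channel-wise threshold for the current view
        # Soft thresholding: activations with magnitude below tau are shrunk to zero.
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)


# Usage sketch: apply the block to an intermediate ResNet50 feature map.
if __name__ == "__main__":
    block = STSEBlock(channels=256)
    features = torch.randn(2, 256, 14, 14)
    out = block(features)
    print(out.shape)  # torch.Size([2, 256, 14, 14])
```

In the paper the block is inserted into ResNet50; a natural integration would place it on a residual branch so that thresholded features are added back to the identity path, but the exact insertion points are not specified in the abstract.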



Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 31872399) and the Advantage Discipline Construction Project (PAPD, No. 6-2018) of Jiangsu University.

Author information

Corresponding author

Correspondence to Xingqiao Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, C., Liu, X., Chen, C. et al. Soft thresholding squeeze-and-excitation network for pose-invariant facial expression recognition. Vis Comput 39, 2637–2652 (2023). https://doi.org/10.1007/s00371-022-02483-5


