Abstract
Many technical problems remain in convolutional neural networks for facial expression recognition, such as the complexity of convolutional facial feature extraction, the difficulty of accurately recognizing subtle changes in facial expressions, and the low rate of automatic expression recognition. In this paper, a hybrid attention mechanism based on space and channel, the Height Performance Module Implement (HPMI) attention mechanism, is proposed to realize automatic facial expression recognition. This attention mechanism enhances the weight of key features and makes the model focus, during training, on the features that are useful for expression classification. The HPMI module, based on a mixed spatial and channel attention mechanism, is embedded in the VGG-16 network. This effectively alleviates overfitting, strengthens useful information, suppresses useless information, and promotes the flow of key image information through the network model. At the same time, it solves the problem of inconsistency between the input and output dimensions. The proposed method achieves accuracies of 98.97% and 88.44% on the CK+ and RAF-DB expression datasets, respectively. Experimental comparison shows that embedding the HPMI module further enhances the learning of spatial and channel feature weights. Our experiments cover two expression recognition datasets and show an average accuracy improvement of 3.94%.
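The abstract does not give the internal structure of HPMI, but the general idea of a hybrid spatial-and-channel attention block that preserves the input dimensions can be sketched as follows. This is a minimal illustration, not the authors' implementation: the channel gate (global average pooling followed by a sigmoid) and the spatial gate (channel-wise mean followed by a sigmoid) are generic choices assumed here for clarity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    # x: feature map of shape (C, H, W).
    # Global average pooling over the spatial dims gives one weight per channel.
    w = sigmoid(x.mean(axis=(1, 2)))          # shape (C,)
    return x * w[:, None, None]               # reweight each channel

def spatial_attention(x):
    # Average across channels gives a per-pixel saliency map.
    m = sigmoid(x.mean(axis=0))               # shape (H, W)
    return x * m[None, :, :]                  # reweight each spatial location

def hybrid_attention(x):
    # Channel gating followed by spatial gating. The output keeps the
    # input shape, so the block can be dropped between VGG-16 stages
    # without any dimension mismatch.
    return spatial_attention(channel_attention(x))

feat = np.random.rand(64, 14, 14)  # e.g. an intermediate VGG-16 feature map
out = hybrid_attention(feat)
assert out.shape == feat.shape
```

Because both gates are multiplicative with values in (0, 1), the block only rescales activations, which is what lets it emphasize useful features and suppress useless ones while leaving the tensor shape unchanged.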
Availability of data and materials
Some or all data, models, or code generated or used during the study are available from the corresponding author upon request.
Acknowledgements
Special thanks to the following funds for their support: Key Research Project of Natural Science in Universities of Anhui Province (No. KJ2020A0782); Anhui Province Provincial Quality Engineering Grass-roots Teaching and Research Office Demonstration Project (No. 2018jyssf111); Data Science and Big Data Technology University First-Class Undergraduate Program Construction Center (No. 2020ylzyx02); University-level Quality Engineering Demonstration Experiment and Training Center "Big Data Comprehensive Experiment and Training Center" (No. 2020sysxx01); 2020 Anhui Provincial College Student Innovation Plan Project (No. 202012216083).
Author information
Authors and Affiliations
Contributions
Conceptualization, L.Y. and S.H.; methodology, L.Y. and S.H.; software, L.Y. and K.S.; validation, Q.S.; formal analysis, Q.S.; investigation, K.S.; resources, S.H.; data curation, K.S.; writing—original draft preparation, S.H.; writing—review and editing, L.Y.; visualization, L.Y.; supervision, L.Y.; project administration, L.Y.; funding acquisition, L.Y. All authors have read and agreed to the published version of the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Consent for publication
Not applicable.
Ethical Approval
Not applicable.
Consent to Publish
On behalf of all authors, I consent to publish our manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Yao, L., He, S., Su, K. et al. Facial Expression Recognition Based on Spatial and Channel Attention Mechanisms. Wireless Pers Commun 125, 1483–1500 (2022). https://doi.org/10.1007/s11277-022-09616-y