
FAPN: Face Alignment Propagation Network for Face Video Super-Resolution

Conference paper
Computer Vision – ACCV 2022 Workshops (ACCV 2022)

Abstract

Face video super-resolution (FVSR) aims to use consecutive low-resolution (LR) video frames to reconstruct a face and recover facial details while ensuring authenticity. Existing video super-resolution (VSR) methods usually exploit inter-frame information to achieve better super-resolution (SR) performance. However, because of the complex temporal dependence between frames, this information cannot be fully utilized as the number of input frames increases, and erroneous information may even be introduced, degrading performance. In this work, we propose a face alignment propagation network (FAPN) that accumulates facial prior information. We design a neighborhood information coupling (NIC) module based on optical flow estimation and alignment, in which the current frame, the adjacent frames, and the SR result of the previous frame are locally fused. The coupled frames are fed into a unidirectional propagation (UP) structure. Within the UP structure, facial prior information is filtered and accumulated in the face super-resolution cell (FSRC), and a high-dimensional hidden state propagates effective temporal information between frames along the unidirectional structure. Extensive evaluations and comparisons validate the strengths of our approach: FAPN accumulates more facial details while ensuring the authenticity of the face. Experimental results demonstrate that the proposed framework outperforms state-of-the-art methods in PSNR (by up to 0.31 dB), SSIM (by up to 0.15) and face recognition accuracy (by up to 1.99%).
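To make the propagation scheme described above concrete, the following PyTorch-style sketch outlines one way the NIC coupling, the FSRC accumulation and the unidirectional hidden-state propagation could fit together. This is an illustrative assumption rather than the authors' implementation: the module names, channel sizes, flow-warping details and the assumption of precomputed optical flows are ours, not taken from the paper.

# Hedged sketch (not the authors' code): a minimal outline of the propagation
# loop described in the abstract. Channel sizes and flow handling are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def flow_warp(x, flow):
    """Warp x (N,C,H,W) with optical flow (N,2,H,W) via bilinear grid_sample (assumed)."""
    n, _, h, w = x.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(x.device)        # (2,H,W), x-then-y
    coords = base.unsqueeze(0) + flow                                # absolute sampling positions
    gx = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0                    # normalise to [-1, 1]
    gy = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                             # (N,H,W,2)
    return F.grid_sample(x, grid, align_corners=True)

class FAPNSketch(nn.Module):
    def __init__(self, c=64, scale=4):
        super().__init__()
        self.c = c
        self.nic = nn.Conv2d(3 * 3 + c, c, 3, padding=1)             # fuse t-1, t, t+1 + hidden state
        self.fsrc = nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.ReLU(inplace=True),
                                  nn.Conv2d(c, c, 3, padding=1))
        self.to_rgb = nn.Conv2d(c, 3 * scale * scale, 3, padding=1)
        self.up = nn.PixelShuffle(scale)

    def forward(self, lr_frames, flows):
        """lr_frames: list of (N,3,H,W); flows[t]['prev'/'next']: flow of the neighbour toward frame t."""
        n, _, h, w = lr_frames[0].shape
        hidden = lr_frames[0].new_zeros(n, self.c, h, w)             # high-dimensional hidden state
        outputs = []
        for t in range(len(lr_frames)):
            prev_ = lr_frames[max(t - 1, 0)]
            next_ = lr_frames[min(t + 1, len(lr_frames) - 1)]
            # NIC: align the neighbouring frames to the current one, then fuse locally
            # together with the hidden state carrying previously super-resolved information.
            aligned_prev = flow_warp(prev_, flows[t]["prev"])
            aligned_next = flow_warp(next_, flows[t]["next"])
            coupled = torch.cat([aligned_prev, lr_frames[t], aligned_next, hidden], dim=1)
            feat = F.relu(self.nic(coupled))
            # FSRC: filter and accumulate facial prior information into the hidden state.
            hidden = hidden + self.fsrc(feat)
            # Unidirectional propagation: emit the SR frame and carry the hidden state forward.
            outputs.append(self.up(self.to_rgb(hidden)))
        return outputs

In this sketch the optical flows are assumed to be estimated beforehand by an external flow estimator; the paper's NIC module couples frames using its own flow estimation and alignment, which is abstracted away here.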

This work is supported by the National Natural Science Foundation of China (grant number 62275046).



Author information


Correspondence to Yongming Tang.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Bian, S., Li, H., Yu, F., Liu, J., Changjun, S., Tang, Y. (2023). FAPN: Face Alignment Propagation Network for Face Video Super-Resolution. In: Zheng, Y., Keleş, H.Y., Koniusz, P. (eds) Computer Vision – ACCV 2022 Workshops. ACCV 2022. Lecture Notes in Computer Science, vol 13848. Springer, Cham. https://doi.org/10.1007/978-3-031-27066-6_1


  • DOI: https://doi.org/10.1007/978-3-031-27066-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-27065-9

  • Online ISBN: 978-3-031-27066-6

  • eBook Packages: Computer Science, Computer Science (R0)
