FAPN: Face Alignment Propagation Network for Face Video Super-Resolution

Bian, Sige; Li, He; Yu, Feng; Liu, Jiyuan; Changjun, Song; Tang, Yongming

doi:10.1007/978-3-031-27066-6_1

Sige Bian (ORCID: orcid.org/0000-0003-1553-9167),
He Li (ORCID: orcid.org/0000-0002-1540-189X),
Feng Yu,
Jiyuan Liu,
Song Changjun &
Yongming Tang (ORCID: orcid.org/0000-0003-2102-2041)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13848)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Face video super-resolution (FVSR) aims to reconstruct faces and recover facial details from consecutive low-resolution (LR) video frames while preserving authenticity. Existing video super-resolution (VSR) methods usually exploit inter-frame information to improve super-resolution (SR) performance. However, because the temporal dependencies between frames are complex, the information cannot be fully utilized as the number of input frames increases, and erroneous information may even be introduced, which degrades performance. In this work, we propose FAPN, a face alignment propagation network that accumulates facial prior information. We design a neighborhood information coupling (NIC) module based on optical flow estimation and alignment, in which the current frame, the adjacent frames, and the SR result of the previous frame are locally fused. The coupled frames are then sent to a unidirectional propagation (UP) structure. Within the UP structure, facial prior information is filtered and accumulated in the face super-resolution cell (FSRC), and a high-dimensional hidden state is introduced to propagate effective temporal information between frames. Extensive evaluations and comparisons validate the strengths of our approach: FAPN accumulates more facial details while preserving the authenticity of the face. Experimental results demonstrate that the proposed framework achieves better performance than state-of-the-art methods in PSNR (up to 0.31 dB), SSIM (up to 0.15 dB), and face recognition accuracy (up to 1.99%).
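The data flow described in the abstract (couple the current frame with its neighbors and the previous SR result, then pass a hidden state forward in one direction) can be illustrated with a toy sketch. This is not the authors' network: the function names `couple`, `propagate`, and `upsample_nearest`, the fixed blending weights, and the use of plain averaging in place of optical-flow alignment and a learned FSRC are all assumptions made for illustration. The `psnr` helper uses the standard definition of peak signal-to-noise ratio.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Standard PSNR in dB for images with values in [0, peak]."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def upsample_nearest(frame, scale):
    """Nearest-neighbor upsampling; a placeholder for a learned SR cell."""
    return np.repeat(np.repeat(frame, scale, axis=0), scale, axis=1)

def couple(cur_lr, neighbors_lr, prev_sr, scale):
    """Toy stand-in for neighborhood coupling (assumption: no optical flow).

    Averages the current and adjacent LR frames, upsamples the result,
    and blends it with the previous SR output when one exists.
    """
    lr_mean = np.mean([cur_lr, *neighbors_lr], axis=0)
    up = upsample_nearest(lr_mean, scale)
    if prev_sr is None:
        return up
    return 0.5 * up + 0.5 * prev_sr

def propagate(lr_frames, scale=2, momentum=0.5):
    """Unidirectional pass: the hidden state only flows from frame t to t+1."""
    hidden, prev_sr, outputs = None, None, []
    n = len(lr_frames)
    for t in range(n):
        # Clamp indices at the sequence boundaries.
        neighbors = [lr_frames[max(t - 1, 0)], lr_frames[min(t + 1, n - 1)]]
        coupled = couple(lr_frames[t], neighbors, prev_sr, scale)
        # Hypothetical hidden-state update: an exponential moving average.
        if hidden is None:
            hidden = coupled
        else:
            hidden = momentum * hidden + (1.0 - momentum) * coupled
        prev_sr = 0.5 * (coupled + hidden)
        outputs.append(prev_sr)
    return outputs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hr = rng.random((8, 8))
    lr = hr[::2, ::2]              # crude 2x downsampling of one frame
    video = [lr.copy() for _ in range(4)]
    sr = propagate(video, scale=2)
    print(f"PSNR of last frame: {psnr(hr, sr[-1]):.2f} dB")
```

In this sketch each output depends only on the current frame, its immediate neighbors, and state carried forward from earlier frames, which is the property that distinguishes unidirectional propagation from bidirectional schemes.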

This work is supported by the National Natural Science Foundation of China (grant number 62275046).

Author information

Authors and Affiliations

Joint International Research Laboratory of Information Display and Visualization, Southeast University, Nanjing, 210096, China
Sige Bian, He Li, Jiyuan Liu, Song Changjun & Yongming Tang
School of Computing, National University of Singapore, Singapore, Singapore
Feng Yu

Corresponding author

Correspondence to Yongming Tang.

Editor information

Editors and Affiliations

University of Tokyo, Tokyo, Japan
Yinqiang Zheng
Hacettepe University, Ankara, Türkiye
Hacer Yalim Keleş
Data61/CSIRO, Canberra, ACT, Australia
Piotr Koniusz

Cite this paper

Bian, S., Li, H., Yu, F., Liu, J., Changjun, S., Tang, Y. (2023). FAPN: Face Alignment Propagation Network for Face Video Super-Resolution. In: Zheng, Y., Keleş, H.Y., Koniusz, P. (eds) Computer Vision – ACCV 2022 Workshops. ACCV 2022. Lecture Notes in Computer Science, vol 13848. Springer, Cham. https://doi.org/10.1007/978-3-031-27066-6_1

DOI: https://doi.org/10.1007/978-3-031-27066-6_1
Published: 09 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27065-9
Online ISBN: 978-3-031-27066-6
eBook Packages: Computer Science, Computer Science (R0)
