Abstract
In the face reenactment task, identity preservation is challenging owing to leakage of the driving identity and the complexity of the source identity. In this paper, we propose an Identity-Preserving Face Reenactment (IPFR) framework with impressive expression and pose transfer. To address driving-identity leakage, we develop an enhanced domain discriminator that eliminates the undesirable identity in the generated image. To handle the complexity of the source identity, we inject multi-level source identity priors so that the identity domain of the generated image stays close to that of the source. Specifically, we first exploit a 3D geometric prior from a 3D morphable face model (3DMM) to control face shape and reduce occlusion-induced artifacts in the motion-field generation module; second, we use an identity texture prior extracted by a face recognition network to supervise the final stage, pulling the identity domain of the generated image toward that of the source. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art methods in image quality and identity preservation. Ablation studies further validate the effectiveness of the individual components.
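The two identity mechanisms named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the gradient-reversal layer is the standard device for domain-adversarial training, letting a discriminator detect the driving identity while reversed gradients push the generator to suppress it; the identity loss assumes a frozen face-recognition embedder (e.g. ArcFace) whose embeddings of the generated and source images are compared by cosine similarity.

```python
import torch
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates and scales the gradient on the
    way back, so minimising the discriminator loss maximises it for the
    generator (domain-adversarial training)."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


def identity_loss(gen_emb: torch.Tensor, src_emb: torch.Tensor) -> torch.Tensor:
    """1 - cosine similarity between generated and source identity embeddings;
    the embeddings would come from a frozen face recognition network."""
    return (1.0 - F.cosine_similarity(gen_emb, src_emb, dim=1)).mean()


# Gradient reversal: the upstream gradient flips sign.
x = torch.ones(3, requires_grad=True)
GradReverse.apply(x, 1.0).sum().backward()
print(x.grad)  # tensor([-1., -1., -1.])

# The identity loss vanishes when the two embeddings match.
e = F.normalize(torch.randn(2, 512), dim=1)
print(float(identity_loss(e, e)))  # ~ 0.0
```

In the paper's setting, the discriminator would sit behind the reversal layer and classify whether a feature carries the driving identity, while `identity_loss` supervises the final generated frame against the source.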
Acknowledgement
This work is supported by Shenzhen Fundamental Research Program (GXWD20201231165807007-20200806163656003) and National Natural Science Foundation of China (No. 62172021). We thank all reviewers for their valuable comments.
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Zhu, L., Li, G., Chen, Y., Li, T.H. (2024). IPFR: Identity-Preserving Face Reenactment with Enhanced Domain Adversarial Training and Multi-level Identity Priors. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14434. Springer, Singapore. https://doi.org/10.1007/978-981-99-8549-4_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8548-7
Online ISBN: 978-981-99-8549-4