Abstract
Visible and near-infrared (VIS–NIR) heterogeneous face recognition remains a challenging task because of the differences between the spectral components of the two modalities and the scarcity of paired VIS–NIR data. Inspired by the cycle-consistent generative adversarial network (CycleGAN), this paper proposes a facial-feature-embedded CycleGAN that translates between VIS and NIR face images, aiming to make the distributions of translated (fake) images similar to those of real images. To learn the domain-specific features of the NIR and VIS domains while preserving the facial representation common to both, a facial feature extractor (FFE), tailored to extracting effective features from face images, is embedded in the generator of the original CycleGAN. The FFE is implemented with a MobileFaceNet pre-trained on a VIS face database. Domain-invariant feature learning is further enhanced by a newly proposed pixel consistency loss. Additionally, we establish a new WHU VIS–NIR database, covering variations in face rotation and expression, to enrich the insufficient training data. Experiments on the well-known Oulu-CASIA NIR–VIS database and our WHU VIS–NIR database validate the benefit of the proposed FFE-based CycleGAN (FFE-CycleGAN). In particular, we achieve 96.5% accuracy on Oulu-CASIA and 98.9% accuracy on WHU VIS–NIR.
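The abstract combines the standard CycleGAN cycle-consistency term with a pixel consistency loss. As a minimal sketch of how such terms compose, assume the pixel consistency term is an L1 penalty between the translated image and its input (the paper's exact formulation may differ), and let `G` and `F` be toy stand-ins for the VIS-to-NIR and NIR-to-VIS generators; the weight `lambda_pix = 10` is purely illustrative.

```python
import numpy as np

def l1_loss(a, b):
    """Mean absolute error between two image arrays."""
    return float(np.mean(np.abs(a - b)))

def cycle_consistency_loss(G, F, x):
    """L1 distance between x and its round-trip translation F(G(x))."""
    return l1_loss(F(G(x)), x)

def pixel_consistency_loss(G, x):
    """Assumed form of the pixel consistency term: L1 distance between
    the translated image G(x) and the input x, encouraging the translation
    to preserve pixel-level facial structure."""
    return l1_loss(G(x), x)

# Toy invertible "generators" standing in for the two translation networks.
G = lambda img: np.clip(img * 0.9 + 0.05, 0.0, 1.0)    # VIS -> NIR (toy)
F = lambda img: np.clip((img - 0.05) / 0.9, 0.0, 1.0)  # NIR -> VIS (toy)

vis = np.full((4, 4), 0.25)          # toy 4x4 VIS "image"
cyc = cycle_consistency_loss(G, F, vis)   # round trip is exact here -> 0.0
pix = pixel_consistency_loss(G, vis)      # |0.275 - 0.25| = 0.025
total = cyc + 10.0 * pix                  # illustrative lambda_pix = 10
```

In a real training loop these terms would be added to the adversarial losses of both discriminators and minimized jointly over the two generators.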
References
Cao, B., Wang, N., Gao, X., Li, J., & Li, Z. (2019). Multi-margin based decorrelation learning for heterogeneous face recognition. In: Proceedings of the twenty-eighth international joint conference on artificial intelligence, IJCAI-19 (pp. 680–686).
Chen, J., Yi, D., Yang, J., Zhao, G., Li, S.Z., & Pietikäinen, M. (2009). Learning mappings for face synthesis from near infrared to visual light images. In: 2009 IEEE conference on computer vision and pattern recognition (pp. 156–163).
Chen, S., Liu, Y., Gao, X., & Han, Z. (2018). MobileFaceNets: Efficient CNNs for accurate real-time face verification on mobile devices. CoRR abs/1804.07573.
Deng, J., Guo, J., Niannan, X., & Zafeiriou, S. (2019). Arcface: Additive angular margin loss for deep face recognition. In: CVPR (pp. 4690–4699).
Huang, D., Sun, J., & Wang, Y. (2012). The BUAA-VisNir face database instructions. Technical report.
Fu, C., Wu, X., Hu, Y., Huang, H., & He, R. (2022). Dvg-face: Dual variational generation for heterogeneous face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(6), 2938–2952.
Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. In: Proceedings of the 27th international conference on neural information processing systems (pp. 2672–2680).
Guo, Y., Zhang, L., Hu, Y., He, X., & Gao, J. (2016). MS-Celeb-1M: A dataset and benchmark for large-scale face recognition. In: B. Leibe, J. Matas, N. Sebe, M. Welling (Eds.), Computer Vision – ECCV (pp. 87–102).
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR) (pp. 770–778). https://doi.org/10.1109/CVPR.2016.90.
He, R., Li, Y., Wu, X., Song, L., Chai, Z., & Wei, X. (2021). Coupled adversarial learning for semi-supervised heterogeneous face recognition. Pattern Recognition, 110, 107618.
He, R., Wu, X., Sun, Z., & Tan, T. (2017). Learning invariant deep representation for NIR–VIS face recognition. AAAI Conference on Artificial Intelligence, 4, 7.
He, R., Wu, X., Sun, Z., & Tan, T. (2019). Wasserstein CNN: Learning invariant features for NIR–VIS face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(7), 1761–1773.
Huang, G. B., Ramesh, M., Berg, T., & Learned-Miller, E. (2007). Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst.
Huang, X., Lei, Z., Fan, M., Wang, X., & Li, S. Z. (2013). Regularized discriminative spectral regression method for heterogeneous face matching. IEEE Transactions on Image Processing, 22(1), 353–362.
Isola, P., Zhu, J., Zhou, T., & Efros, A. A. (2017). Image-to-image translation with conditional adversarial networks. In: IEEE conference on computer vision and pattern recognition (CVPR) (pp. 5967–5976).
Jo, Y., Yang, S., & Kim, S. J. (2020). Investigating Loss Functions for Extreme Super-Resolution. In: 2020 IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW) (pp. 1705–1712).
Juefei-Xu, F., Pal, D.K., & Savvides, M. (2015). NIR-VIS heterogeneous face recognition via cross-spectral joint dictionary learning and reconstruction. In: 2015 IEEE conference on computer vision and pattern recognition workshops (pp. 141–150).
Keinert, F., Lazzaro, D., & Morigi, S. (2019). A robust group-sparse representation variational method with applications to face recognition. IEEE Transactions on Image Processing, 28(6), 2785–2798.
Klare, B. F., & Jain, A. K. (2013). Heterogeneous face recognition using kernel prototype similarities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(6), 1410–1422.
Lei, Z., & Li, S. Z. (2009). Coupled spectral regression for matching heterogeneous faces. In: 2009 IEEE conference on computer vision and pattern recognition (pp. 1123–1128).
Lezama, J., Qiu, Q., & Sapiro, G. (2017). Not afraid of the dark: NIR–VIS face recognition via cross-spectral hallucination and low-rank embedding. In: 2017 IEEE conference on computer vision and pattern recognition (pp. 6807–6816).
Li, S.Z., Yi, D., Lei, Z., & Liao, S. (2013). The CASIA NIR–VIS 2.0 face database. In: 2013 IEEE conference on computer vision and pattern recognition workshops (pp. 348–353).
Lin, D., & Tang, X. (2006). Inter-modality face recognition. In: Proceedings of the 9th European conference on computer vision - Volume Part IV, ECCV’06 (pp. 13–26). Berlin, Heidelberg: Springer-Verlag.
Liu, X., Song, L., Wu, X., & Tan, T. (2016). Transferring deep representation for NIR-VIS heterogeneous face recognition. In: 2016 international conference on biometrics (ICB) (pp. 1–8).
Park, T., Efros, A. A., Zhang, R., & Zhu, J. Y. (2020). Contrastive learning for unpaired image-to-image translation. In: European conference on computer vision.
Peng, C., Wang, N., Li, J., & Gao, X. (2019). DLFace: Deep local descriptor for cross-modality face recognition. Pattern Recognition, 90, 161–171.
Peng, C., Wang, N., Li, J., & Gao, X. (2019). Re-ranking high-dimensional deep local representation for NIR–VIS face recognition. IEEE Transactions on Image Processing, 28(9), 4553–4565.
Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2015 (pp. 234–241). Springer International Publishing.
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. (2018). MobileNetV2: Inverted residuals and linear bottlenecks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition (pp. 4510–4520).
Schroff, F., Kalenichenko, D., & Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In: IEEE conference on computer vision and pattern recognition (CVPR) (pp. 815–823).
Shao, M., & Fu, Y. (2017). Cross-modality feature learning through generic hierarchical hyperlingual-words. IEEE Transactions on Neural Networks and Learning Systems, 28(2), 451–463.
Song, L., Zhang, M., Wu, X., & He, R. (2018). Adversarial discriminative heterogeneous face recognition. In: AAAI conference on artificial intelligence.
Sun, Y., Liang, D., Wang, X., & Tang, X. (2015). DeepID3: Face recognition with very deep neural networks. CoRR abs/1502.00873.
Wang, H., Zhang, H., Yu, L., Wang, L., & Yang, X. (2020). Facial feature embedded CycleGAN for Vis-Nir translation. In: IEEE international conference on acoustics, speech and signal processing (pp. 1903–1907).
Wang, R., Yang, J., Yi, D., & Li, S. Z. (2009). An analysis-by-synthesis method for heterogeneous face biometrics. In M. Tistarelli & M. S. Nixon (Eds.), Advances in biometrics (pp. 319–326). Springer.
Wu, F., Jing, X. Y., Feng, Y., Ji, Y. M., & Wang, R. (2021). Spectrum-aware discriminative deep feature learning for multi-spectral face recognition. Pattern Recognition, 111, 107632.
Wu, X., He, R., Sun, Z., & Tan, T. (2018). A light CNN for deep face representation with noisy labels. IEEE Transactions on Information Forensics and Security, 13(11), 2884–2896.
Yu, A., Wu, H., Huang, H., Lei, Z., & He, R. (2021). LAMP-HQ: A large-scale multi-pose high-quality database and benchmark for NIR–VIS face recognition. International Journal of Computer Vision, 129.
Yu, Y. F., Dai, D. Q., Ren, C. X., & Huang, K. K. (2017). Discriminative multi-layer illumination-robust feature extraction for face recognition. Pattern Recognition, 67, 201–212.
Zhang, K., Zhang, Z., Li, Z., & Yu, Q. (2016). Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10), 1499–1503.
Zhao, G., Huang, X., Taini, M., Li, S. Z., & Pietikäinen, M. (2011). Facial expression recognition from near-infrared videos. Image and Vision Computing, 29, 607–619.
Zhu, J., Park, T., Isola, P., & Efros, A.A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. In: IEEE International conference on computer vision (ICCV) (pp. 2242–2251).
Zhu, J. Y., Zheng, W. S., Lu, F., & Lai, J. H. (2017). Illumination invariant single face image recognition under heterogeneous lighting condition. Pattern Recognition, 66, 313–327.
Acknowledgements
We thank Dr. Zhao et al. for offering the Oulu-CASIA NIR–VIS face expression database Zhao et al. (2011), which greatly helped us to further train and test the proposed method. We also express our appreciation to those who helped us collect data, select pictures, and ultimately build the WHU VIS–NIR paired face database. In addition, we would like to thank the people in Figs. 5 and 6 for their generous support.
This work was supported by Hubei Provincial Natural Science Foundation of China under Grant 2022CFB084.
Cite this article
Wang, H., Zhang, H., Yu, L. et al. Facial feature embedded CycleGAN for VIS–NIR translation. Multidim Syst Sign Process 34, 423–446 (2023). https://doi.org/10.1007/s11045-023-00871-1