RetouchFormer: Semi-supervised High-Quality Face Retouching Transformer with Prior-Based Selective Self-Attention

Authors

  • Xue Wen, South China University of Technology
  • Lianxin Xie, South China University of Technology
  • Le Jiang, South China University of Technology
  • Tianyi Chen, South China University of Technology
  • Si Wu, South China University of Technology
  • Cheng Liu, Shantou University
  • Hau-San Wong, City University of Hong Kong

DOI:

https://doi.org/10.1609/aaai.v38i6.28404

Keywords:

CV: Computational Photography, Image & Video Synthesis, CV: Applications, CV: Biometrics, Face, Gesture & Pose

Abstract

Face retouching aims to beautify a face image while preserving the image content as much as possible. Removing face imperfections and filling the affected areas with normal skin is a promising yet challenging task. Generic image enhancement methods are hampered by the lack of imperfection localization, which often results in incomplete removal of large-scale blemishes. To address this issue, we propose a transformer-based approach, RetouchFormer, which simultaneously identifies imperfections and synthesizes realistic content in the corresponding regions. Specifically, we learn a latent dictionary to capture clean face priors and predict the imperfection regions via a reconstruction-oriented localization module. Based on this prediction, we realize face retouching by explicitly suppressing imperfections in our selective self-attention computation, so that local content is synthesized from normal skin. In addition, multi-scale feature tokens provide flexibility in handling imperfections at various scales. These design elements bring greater effectiveness and efficiency. In extensive experiments, RetouchFormer outperforms advanced face retouching methods and synthesizes clean face images with high fidelity.
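The core mechanism described above, suppressing predicted imperfection regions inside the self-attention computation so that content is synthesized only from normal-skin tokens, can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, the use of a hard binary mask, and the shared query/key/value projections are simplifying assumptions for illustration.

```python
import numpy as np

def selective_self_attention(tokens, imperfection_mask):
    """Illustrative sketch of mask-biased self-attention (not the paper's code).

    tokens: (N, D) feature tokens.
    imperfection_mask: (N,) array in {0, 1}; 1 marks a token predicted
    to lie in an imperfection region by a localization module.
    Masked tokens are suppressed as attention *keys*, so the output at
    every position is a mixture of normal-skin tokens only.
    """
    q = k = v = tokens  # learned Q/K/V projections omitted for brevity
    d = tokens.shape[-1]
    logits = q @ k.T / np.sqrt(d)
    # Add a large negative bias to columns of imperfection keys,
    # driving their softmax weights to (numerically) zero.
    logits = logits - 1e9 * imperfection_mask[None, :]
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

In this sketch, each output token is a convex combination of the unmasked tokens only, which mirrors the idea that blemish regions are filled in from surrounding clean skin rather than from their own corrupted features.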

Published

2024-03-24

How to Cite

Wen, X., Xie, L., Jiang, L., Chen, T., Wu, S., Liu, C., & Wong, H.-S. (2024). RetouchFormer: Semi-supervised High-Quality Face Retouching Transformer with Prior-Based Selective Self-Attention. Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 5903-5911. https://doi.org/10.1609/aaai.v38i6.28404

Section

AAAI Technical Track on Computer Vision V