Elsevier

Image and Vision Computing

Volume 58, February 2017, Pages 3-12
Statistical non-rigid ICP algorithm and its application to 3D face alignment*

https://doi.org/10.1016/j.imavis.2016.10.007

Highlights

  • A statistical non-rigid ICP method for 3D face alignment is proposed.

  • Local fitting in a dynamic subdivision framework helps capture subtle facial features.

  • 2D point-driven mesh deformation in the pre-processing step helps improve performance.

Abstract

The problem of fitting a 3D facial model to a 3D mesh has received a lot of attention over the past 15–20 years. The majority of the techniques fit a general model consisting of a simple parameterisable surface or a mean 3D facial shape. The drawback of this approach is that it is rather difficult to describe the non-rigid aspect of the face using just a single facial model. One way to capture 3D facial deformations is by means of a statistical 3D model of the face or its parts. This is particularly evident when we want to capture the deformations of the mouth region. Even though statistical models of the face are generally applied for modelling facial intensity, there are few approaches that fit a statistical model of 3D faces. In this paper, in order to capture and describe the non-rigid nature of facial surfaces, we build a part-based statistical model of the 3D facial surface and combine it with non-rigid iterative closest point algorithms. We show that the proposed algorithm largely outperforms state-of-the-art algorithms for 3D face fitting and alignment, especially when it comes to the description of the mouth region.

Introduction

Three-dimensional representation of the face has always been a valuable source for face recognition and facial behaviour analysis. The huge descriptive power and rich shape variation of the 3D human face make it much more informative than the corresponding 2D projection. Due to its complex nature, the structure and topology of a 3D facial scan vary from frame to frame. To obtain a relatively consistent facial representation, it is always preferable to use a 3D deformable facial model. Thus, the building and fitting of the 3D deformable model becomes a key step in retrieving 3D facial information, and not surprisingly, it has received a lot of attention in the past decade [1], [2], [3], [4], [5], [6], [7], [8]. Owing to recent developments in cost-effective depth cameras, such as Kinect 2, Creative Senz3D™ and Intel® RealSense™ Camera (F200), further attention has been drawn to this field of study.

3D face fitting, also referred to as 3D face registration [5], [8], [9] in the literature, aims at aligning two point clouds or meshes so that the face template overlaps the target surface (i.e. the facial scan) as closely as possible. In the scope of 3D deformable model fitting, the face template refers to the 3D deformable face model. The 3D Morphable Model (3DMM) [1] is the most commonly used technique for introducing prior knowledge of the 3D human face. The 3DMM has been widely adopted in various 3D fitting methods [2], [10], [11], [12], since it can estimate an unseen 3D facial shape well by solving for the shape, texture, pose and illumination parameters simultaneously. Fitting of 3D facial models is very important, since they can be used to identify particular facial landmarks or to recognize and define facial deformations that benefit face recognition [3], [13] and facial performance transfer [14]. Furthermore, accurate fitting and modelling of human expressions is able to boost the performance of facial expression recognition [15], [16], [17].
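
For reference, the shape prior that a 3DMM encodes is usually written as a linear model learned by PCA from registered scans. The notation below is our own and only sketches the shape part: N is the number of template vertices, K the number of retained components, and the texture model has the same form.

```latex
\mathbf{s}(\boldsymbol{\alpha}) = \bar{\mathbf{s}} + \sum_{k=1}^{K} \alpha_k \, \mathbf{u}_k ,
\qquad \mathbf{s},\ \bar{\mathbf{s}},\ \mathbf{u}_k \in \mathbb{R}^{3N}
```

Here \bar{\mathbf{s}} is the mean shape (stacked x, y, z vertex coordinates), \mathbf{u}_k are the principal components, and fitting amounts to estimating the coefficients \alpha_k together with pose, so that \mathbf{s}(\boldsymbol{\alpha}) matches the target scan.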

Despite this importance, the majority of existing methodologies use just a simple pre-defined mesh model, parameterised or not, to fit a target mesh [2], [3], [5], [8], [11], [18]. In [2], [11], in order to handle recognition under different pose and lighting conditions, a 3DMM that separates the parameters for shape, head pose and illumination is employed. Inspired by this, Amberg et al. [10] developed an expression-invariant 3D face recognition algorithm in which they fit an identity/expression separated 3DMM to the facial scan and normalize the resulting face by removing the pose and expression components. Unfortunately, these methodologies may fail to properly describe the complex, non-linear and highly deformable structure of the face. Especially when fitting data captured by high-resolution face capturing systems, such as Di3D (Dimensional Imaging [19]), the results are likely to be over-smoothed and thus lose important facial details and micro-expressions.

In this paper, we examine the problem of fitting 3D facial models to high-resolution depth scans. Our key contribution is a new active method for describing and fitting 3D faces, which is achieved by learning a set of local statistical models for facial parts and combining them with the non-rigid Iterative Closest Point (ICP) algorithm [5]. In addition, we propose a dynamic local fitting procedure that makes full use of a dynamic subdivision framework. We adopt the proposed active method in this fitting procedure and show that it manages to accurately model subtle facial features. Additionally, we provide a point-driven mesh deformation procedure in the data pre-processing stage that helps to prevent incorrect facial part fitting. It deforms the 3D template model under the guidance of a state-of-the-art 2D face alignment algorithm [20].

The remainder of the paper is organized as follows. Section 2 gives a brief introduction to existing works related to this paper, with a short discussion of their merits and demerits. Section 3 explains our dynamic subdivision framework as well as the local fitting procedure. Next, in Section 4, we describe the proposed novel Active Non-rigid ICP algorithm. In Section 5, we show the results of our experiments, followed by an in-depth discussion. Lastly, we conclude the paper in Section 6.

Section snippets

Related work

Various 3D face fitting methods have been proposed to address specific problems in different scenarios, such as high-resolution data [3], partial range scans [5], [21], [22] and normal maps [23]. In this section, we give a brief introduction to different kinds of 3D fitting algorithms.

Dynamic subdivision framework

The core idea of our framework is to dynamically fit facial data using a deformable 3D face model, and to provide an accurately fitted surface. In contrast to previous works on 3D surface registration [3], [18], [24] that subsample the data using an annotated template to gain efficiency, our method starts from a sparse level and dynamically propagates to subsequent levels, in which the fittings are performed locally to model regional deformation. We argue that the subsampling step sacrifices

Statistical Non-rigid ICP algorithm

To capture more local variations, we perform local fitting based on the segmented template at each subdivision level. Inspired by the recent success in region-based face modelling [31], we employ a statistical shape model in the non-rigid ICP algorithm (see Section 5 for details of shape model building), and propose to solve for the optimal mesh controlling parameters in an alternating manner. We refer to this method as Dynamic Active Non-rigid ICP (DA-NICP) in this paper.
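
The idea of alternating between correspondence search and a shape-model update can be illustrated with a toy example. The sketch below is not the DA-NICP algorithm: it omits the non-rigid ICP stiffness term, the part-based segmentation and the subdivision levels, and it assumes an orthonormal basis so that the least-squares parameter update reduces to a projection. All function names (`closest_points`, `synthesize`, `fit_subspace`) and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def closest_points(source: Sequence[Point], target: Sequence[Point]) -> List[Point]:
    """Brute-force nearest neighbour: one target point per source point."""
    return [min(target, key=lambda t: math.dist(s, t)) for s in source]


def synthesize(mean: Sequence[float], basis: Sequence[Sequence[float]],
               params: Sequence[float]) -> List[float]:
    """Shape instance mean + sum_k params[k] * basis[k], flattened as x, y, z."""
    shape = list(mean)
    for p, b in zip(params, basis):
        for i, b_i in enumerate(b):
            shape[i] += p * b_i
    return shape


def fit_subspace(mean: Sequence[float], basis: Sequence[Sequence[float]],
                 target: Sequence[Point], iterations: int = 10) -> List[float]:
    """Alternate closest-point matching with a subspace-constrained update.

    Assumes the basis vectors are orthonormal (an assumption of this sketch),
    so the least-squares parameters are the projection of the residual.
    """
    params = [0.0] * len(basis)
    for _ in range(iterations):
        # Step 1: fix the parameters and find correspondences on the target.
        flat = synthesize(mean, basis, params)
        points = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
        matched = closest_points(points, target)
        # Step 2: fix the correspondences and re-estimate the parameters.
        matched_flat = [c for pt in matched for c in pt]
        residual = [c - m for c, m in zip(matched_flat, mean)]
        params = [sum(b_i * r_i for b_i, r_i in zip(b, residual)) for b in basis]
    return params


# Toy data: a two-vertex "template" whose single mode moves both vertices along y.
r = 1.0 / math.sqrt(2.0)
mean = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
basis = [[0.0, r, 0.0, 0.0, r, 0.0]]
target = [(0.0, 0.5, 0.0), (1.0, 0.5, 0.0)]

params = fit_subspace(mean, basis, target)
fitted = synthesize(mean, basis, params)
```

Restricting the update to the span of the basis is what keeps the fitted shape plausible; a plain non-rigid ICP would instead let every vertex move freely, subject only to a smoothness penalty.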

Experiments

Apart from visual comparison between fittings, we conduct three experiments to provide quantitative measures of our proposed Dynamic Active Non-rigid ICP (DA-NICP) algorithm. To demonstrate the advantage of putting a subspace constraint on fitting, we introduce D-NICP, a method similar to DA-NICP that uses dynamic subdivision surfaces and performs local fitting, but with NICP [5] chosen as the only fitting strategy. The third method to compare is the deformable fitting with subdivision on AFM in

Conclusion

We propose a dynamic local fitting procedure that benefits from the dynamic subdivision framework, and show how to adapt the NICP algorithm to our procedure. The proposed fitting procedure is shown to be capable of modelling subtle facial details. More importantly, we present a statistical model for describing faces and combine it with NICP for 3D face alignment. We have shown that the proposed algorithm largely outperforms state-of-the-art 3D facial deformable models, such as the ones that

Acknowledgements

The work of S. Cheng and I. Marras is funded by the EPSRC project EP/J017787/1 (4D-FAB). M. Pantic acknowledges support by the European Community Horizon 2020 [H2020/2014-2020] under grant agreement no. 645094 (SEWA). S. Zafeiriou also acknowledges support from EPSRC project EP/L026813/1 Adaptive Facial Deformable Models for Tracking (ADAManT).

References (52)

  • D. Schneider et al.

    Fast nonrigid mesh registration with a data-driven deformation prior

  • G. Tam et al.

    Registration of 3D point clouds and meshes: a survey from rigid to nonrigid

    IEEE Trans. Vis. Comput. Graph.

    (2013)
  • B. Amberg et al.

    Expression invariant 3D face recognition with a Morphable Model

  • V. Blanz et al.

    Face recognition based on fitting a 3D morphable model

    IEEE Trans. Pattern Anal. Mach. Intell. (T-PAMI)

    (2003)
  • X. Zhu et al.

    Discriminative 3D morphable model fitting.

  • I. Kakadiaris et al.

    Three-dimensional face recognition in the presence of facial expressions: an annotated deformable model approach

    IEEE Trans. Pattern Anal. Mach. Intell. (T-PAMI)

    (2007)
  • V. Blanz et al.

    Reanimating faces in images and video

    Comput. Graphics Forum

    (2003)
  • G. Rajamanoharan et al.

    Static and dynamic 3D facial expression recognition: a comprehensive survey

    Image Vis. Comput. (IVC)

    (2012)
  • G. Rajamanoharan et al.

    Recognition of 3D facial expression dynamics

    Image Vis. Comput. (IVC)

    (2012)
  • I. Marras et al.

    Robust Learning from Normals for 3D Face Recognition

  • I. Kakadiaris et al.

    Multimodal face recognition: combination of geometry with physiological information

  • dI4D - 4D Capture Systems

    (2011)
  • A. Asthana et al.

    Robust Discriminative Response Map Fitting with Constrained Local Models

  • H. Li et al.

    Global Correspondence Optimization for Non-rigid Registration of Depth Scans

  • T.B.A. Brunton et al.

    Multilinear Wavelets: A Statistical Shape Space for Human Faces

    (2014)
  • Z. Wang et al.

    3D Face Template Registration Using Normal Maps

* This paper has been recommended for acceptance by Sinisa Todorovic.