Statistical non-rigid ICP algorithm and its application to 3D face alignment

doi:10.1016/j.imavis.2016.10.007

Image and Vision Computing

Volume 58, February 2017, Pages 3-12

https://doi.org/10.1016/j.imavis.2016.10.007

Highlights

• A statistical non-rigid ICP method for 3D face alignment is proposed.
• Local fitting in a dynamic subdivision framework helps capture subtle facial features.
• 2D point-driven mesh deformation in the pre-processing step helps improve performance.

Abstract

The problem of fitting a 3D facial model to a 3D mesh has received a lot of attention over the past 15–20 years. The majority of techniques fit a general model consisting of a simple parameterisable surface or a mean 3D facial shape. The drawback of this approach is that it is rather difficult to describe the non-rigid aspect of the face using just a single facial model. One way to capture 3D facial deformations is by means of a statistical 3D model of the face or its parts. This is particularly evident when we want to capture the deformations of the mouth region. Even though statistical models of the face are generally applied for modelling facial intensity, there are few approaches that fit a statistical model of 3D faces. In this paper, in order to capture and describe the non-rigid nature of facial surfaces, we build a part-based statistical model of the 3D facial surface and combine it with non-rigid iterative closest point algorithms. We show that the proposed algorithm largely outperforms state-of-the-art algorithms for 3D face fitting and alignment, especially when it comes to the description of the mouth region.

Introduction

Three-dimensional representation of the face has always been a valuable source for face recognition and facial behaviour analysis. The huge descriptive power and rich shape variation of the 3D human face make it much more informative than the corresponding 2D projection. Due to its complex nature, the structure and topology of a 3D facial scan vary from frame to frame. To obtain a relatively consistent facial representation, it is always preferable to use a 3D deformable facial model. Thus, the building and fitting of the 3D deformable model becomes a key step in retrieving 3D facial information and, not surprisingly, it has received a lot of attention in the past decade [1], [2], [3], [4], [5], [6], [7], [8]. Owing to the recent development of cost-effective depth cameras, such as Kinect 2, Creative Senz3D™ and Intel® RealSense™ Camera (F200), further attention has been drawn to this field of study.

3D face fitting, also referred to as 3D face registration [5], [8], [9] in the literature, aims at aligning two sets of point clouds or meshes so that the face template overlaps the target surface (i.e. the facial scan) as closely as possible. In the scope of 3D deformable model fitting, the face template refers to the 3D deformable face model. The 3D Morphable Model (3DMM) [1] is the most commonly used technique for introducing prior knowledge about the 3D human face. The 3DMM has been widely adopted in various 3D fitting methods [2], [10], [11], [12], since it can estimate an unseen 3D facial shape well by solving for the shape, texture, pose and illumination parameters simultaneously. Fitting of 3D facial models is very important, since they can be used to identify particular facial landmarks or to recognize and define facial deformations, which benefits face recognition [3], [13] and facial performance transfer [14]. Furthermore, accurate fitting and modelling of human expressions can boost the performance of facial expression recognition [15], [16], [17].
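As a brief illustration of how a 3DMM encodes prior knowledge, the linear form commonly used in the morphable-model literature can be written as follows. The symbols are assumptions introduced here for illustration and are not notation taken from this paper: \mathbf{s} and \mathbf{t} are the stacked vertex coordinates and per-vertex colours of a face, \bar{\mathbf{s}} and \bar{\mathbf{t}} are their training-set means, \mathbf{u}_i and \mathbf{v}_i are principal components, and \alpha_i, \beta_i are the shape and texture coefficients to be estimated during fitting.

```latex
\mathbf{s} = \bar{\mathbf{s}} + \sum_{i=1}^{m} \alpha_i \, \mathbf{u}_i ,
\qquad
\mathbf{t} = \bar{\mathbf{t}} + \sum_{i=1}^{m} \beta_i \, \mathbf{v}_i
```

Under this form, fitting amounts to searching for the coefficients (together with pose and illumination parameters) that best explain the observed scan or image, so every fitted face stays within the span of the learned components.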

Despite this importance, the majority of existing methodologies use just a simple pre-defined mesh model, parameterised or not, to fit a target mesh [2], [3], [5], [8], [11], [18]. In [2], [11], in order to handle recognition under different pose and lighting conditions, a 3DMM that separates the parameters for shape, head pose and illumination is employed. Inspired by this, Amberg et al. [10] developed an expression-invariant 3D face recognition algorithm in which they fit an identity/expression-separated 3DMM to the facial scan and normalize the resulting face by removing the pose and expression components. Unfortunately, these methodologies may fail to properly describe the complex, non-linear and highly deformable structure of the face. Especially when fitting data captured by a high-resolution face capturing system, such as Di3D (Dimensional Imaging [19]), the results are likely to be over-smoothed and thus lose important facial details and micro-expressions.

In this paper, we examine the problem of fitting a 3D facial model to high-resolution depth scans. Our key contribution is a new active method for describing and fitting 3D faces, which is achieved by learning a set of local statistical models for facial parts and combining them with the non-rigid Iterative Closest Point (ICP) algorithm [5]. In addition, we propose a dynamic local fitting procedure that makes full use of the dynamic subdivision framework. We adopt the proposed active method in this fitting procedure and show that it accurately models subtle facial features. Finally, we provide a point-driven mesh deformation procedure in the data pre-processing stage that helps to prevent incorrect facial part fitting. It deforms the 3D template model under the guidance of a state-of-the-art 2D face alignment algorithm [20].

The remainder of the paper is organized as follows. Section 2 gives a brief introduction to existing works related to this paper, with a short discussion of their merits and demerits. Section 3 explains our dynamic subdivision framework as well as the local fitting procedure. Next, in Section 4, we describe the proposed novel Active Non-rigid ICP algorithm. In Section 5, we show the results of our experiments, followed by an in-depth discussion. Lastly, we conclude the paper in Section 6.

Section snippets

Related work

Various 3D face fitting methods have been proposed to address specific problems in different scenarios, such as the situation with high-resolution data [3], partial range scan [5], [21], [22] and normal maps [23]. In this section, we will give a brief introduction to different kinds of 3D fitting algorithms.

Dynamic subdivision framework

The core idea of our framework is to dynamically fit facial data using a deformable 3D face model, and to provide an accurately fitted surface. In contrast to previous works on 3D surface registration [3], [18], [24] that subsample the data using an annotated template to gain efficiency, our method starts from a sparse level and dynamically propagates to subsequent levels, in which the fittings are performed locally to model regional deformation. We argue that the subsampling step sacrifices
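To make the idea of moving from a sparse level to denser levels concrete, the following is a minimal sketch of one refinement step. It assumes simple midpoint (1-to-4) triangle subdivision; the excerpt does not state which subdivision scheme the framework uses, and the function name `subdivide` is hypothetical.

```python
def subdivide(vertices, faces):
    """Split every triangle into four by inserting edge midpoints.

    Assumption: plain midpoint subdivision, chosen only for illustration;
    the paper's own subdivision scheme is not given in this excerpt.
    """
    vertices = list(vertices)          # copy so the coarse level is preserved
    midpoint_index = {}                # shared edges reuse the same new vertex

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint_index:
            a, b = vertices[i], vertices[j]
            vertices.append(tuple((x + y) / 2.0 for x, y in zip(a, b)))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        # Three corner triangles plus the central one, same winding order.
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, new_faces


# Usage: a unit square made of two triangles sharing one edge.
coarse_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
coarse_faces = [(0, 1, 2), (0, 2, 3)]
fine_vertices, fine_faces = subdivide(coarse_vertices, coarse_faces)
print(len(fine_vertices), len(fine_faces))  # prints "9 8"
```

In a coarse-to-fine scheme of this kind, a fit obtained on the coarse level would initialise the finer level, where each newly inserted vertex can then be adjusted locally.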

Statistical Non-rigid ICP algorithm

To capture more local variations, we perform local fitting based on the segmented template of the subdivision levels. Inspired by the recent success in region-based face modelling [31], we employ a statistical shape model in the non-rigid ICP algorithm (see Section 5 for details of shape model building), and propose to solve for the optimal mesh-controlling parameters in an alternating manner. We refer to this method as Dynamic Active Non-rigid ICP (DA-NICP) in this paper.
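The sketch below illustrates one way an alternating scheme with a statistical shape model could look. It is not the DA-NICP algorithm itself: the excerpt does not give the update equations, so this assumes a simplified loop that alternates between (a) nearest-neighbour correspondences and (b) projecting the matched points onto an orthonormal PCA-style basis with coefficients clamped to a few standard deviations. The NICP stiffness term and per-vertex affine transforms are omitted, and all function names are hypothetical.

```python
import math


def closest_points(template, scan):
    """For each template vertex, pick the nearest scan point (brute force)."""
    return [min(scan, key=lambda q: math.dist(p, q)) for p in template]


def flatten(points):
    return [c for p in points for c in p]


def unflatten(vec):
    return [tuple(vec[i:i + 3]) for i in range(0, len(vec), 3)]


def project_to_model(targets, mean, basis, variances, limit=3.0):
    """Coefficients of the targets in an orthonormal basis, clamped to
    +/- limit standard deviations so the shape stays plausible."""
    diff = [t - m for t, m in zip(flatten(targets), flatten(mean))]
    coeffs = []
    for component, variance in zip(basis, variances):
        b = sum(c * d for c, d in zip(component, diff))
        bound = limit * math.sqrt(variance)
        coeffs.append(max(-bound, min(bound, b)))
    return coeffs


def synthesize(mean, basis, coeffs):
    """Shape instance: mean plus weighted sum of basis components."""
    vec = flatten(mean)
    for component, b in zip(basis, coeffs):
        vec = [v + b * c for v, c in zip(vec, component)]
    return unflatten(vec)


def fit(mean, basis, variances, scan, iterations=10):
    """Alternate correspondence search and shape-parameter update."""
    shape, coeffs = list(mean), [0.0] * len(basis)
    for _ in range(iterations):
        targets = closest_points(shape, scan)                        # step (a)
        coeffs = project_to_model(targets, mean, basis, variances)   # step (b)
        shape = synthesize(mean, basis, coeffs)
    return shape, coeffs


# Usage: a two-vertex "part" whose single mode moves the second vertex along z.
mean = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
basis = [[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]]   # orthonormal, one component
variances = [1.0]
scan = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)]
shape, coeffs = fit(mean, basis, variances, scan)
print(coeffs)  # prints "[0.5]"
```

The clamping step is the part that carries the subspace constraint: a matched point far outside the learned variation cannot drag the part into an implausible shape, which is the intuition behind constraining a non-rigid fit with a statistical model.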

Experiments

Apart from visual comparison between fittings, we conduct three experiments to provide quantitative measures of our proposed Dynamic Active Non-rigid ICP (DA-NICP) algorithm. To demonstrate the advantage of putting a subspace constraint on fitting, we introduce D-NICP, a method similar to DA-NICP that uses dynamic subdivision surfaces and performs local fitting, but with NICP [5] chosen as the only fitting strategy. The third method to compare is the deformable fitting with subdivision on AFM in

Conclusion

We propose a dynamic local fitting procedure that benefits from the dynamic subdivision framework, and show how to adapt the NICP algorithm to our procedure. The proposed fitting procedure is shown to be capable of modelling subtle facial details. More importantly, we present a statistical model for describing faces and combine it with NICP for 3D face alignment. We have shown that the proposed algorithm largely outperforms state-of-the-art 3D facial deformable models, such as the ones that

Acknowledgements

The work of S. Cheng and I. Marras is funded by the EPSRC project EP/J017787/1 (4D-FAB). M. Pantic acknowledges support by the European Community Horizon 2020 [H2020/2014-2020] under grant agreement no. 645094 (SEWA). S. Zafeiriou also acknowledges support from EPSRC project EP/L026813/1 Adaptive Facial Deformable Models for Tracking (ADAManT).

References (52)

T. Fang et al., 3D/4D facial expression analysis: an advanced annotated face model approach, Image Vis. Comput. (IVC) (2012)
T.C.S. Rendall et al., Reduced surface point selection options for efficient mesh deformation using radial basis functions, J. Comput. Phys. (2010)
D. Chetverikov et al., Robust Euclidean alignment of 3D point sets: the trimmed iterative closest point algorithm, Image Vis. Comput. (IVC) (2005)
V. Blanz et al., A Morphable Model for the Synthesis of 3D Faces
P. Paysan et al., A 3D Face Model for Pose and Illumination Invariant Face Recognition
G. Passalis et al., Intraclass retrieval of nonrigid 3D objects: application to face recognition, IEEE Trans. Pattern Anal. Mach. Intell. (T-PAMI) (2007)
Y. Wang et al., 3D Face Recognition in the Presence of Expression: A Guidance-based Constraint Deformation Approach
B. Amberg et al., Optimal Step Nonrigid ICP Algorithms for Surface Registration
B. Allen et al., The space of human body shapes: reconstruction and parameterization from range scans, ACM Trans. Graph. (TOG) (2003)
G. Pan et al., Establishing point correspondence of 3D faces via sparse facial deformable model, IEEE Trans. Image Process. (TIP) (2013)
D. Schneider et al., Fast nonrigid mesh registration with a data-driven deformation prior
G. Tam et al., Registration of 3D point clouds and meshes: a survey from rigid to nonrigid, IEEE Trans. Vis. Comput. Graph. (2013)
B. Amberg et al., Expression invariant 3D face recognition with a Morphable Model
V. Blanz et al., Face recognition based on fitting a 3D morphable model, IEEE Trans. Pattern Anal. Mach. Intell. (T-PAMI) (2003)
X. Zhu et al., Discriminative 3D morphable model fitting
I. Kakadiaris et al., Three-dimensional face recognition in the presence of facial expressions: an annotated deformable model approach, IEEE Trans. Pattern Anal. Mach. Intell. (T-PAMI) (2007)
V. Blanz et al., Reanimating faces in images and video, Comput. Graphics Forum (2003)
G. Rajamanoharan et al., Static and dynamic 3D facial expression recognition: a comprehensive survey, Image Vis. Comput. (IVC) (2012)
G. Rajamanoharan et al., Recognition of 3D facial expression dynamics, Image Vis. Comput. (IVC) (2012)
I. Marras et al., Robust Learning from Normals for 3D Face Recognition
I. Kakadiaris et al., Multimodal face recognition: combination of geometry with physiological information
dI4D - 4D Capture Systems (2011)
A. Asthana et al., Robust Discriminative Response Map Fitting with Constrained Local Models
H. Li et al., Global Correspondence Optimization for Non-rigid Registration of Depth Scans
T.B.A. Brunton et al., Multilinear Wavelets: A Statistical Shape Space for Human Faces (2014)
Z. Wang et al., 3D Face Template Registration Using Normal Maps


This paper has been recommended for acceptance by Sinisa Todorovic.
