Unified 3D face and ear recognition using wavelets on geometry images
Introduction
Among the different biometric modalities, the ones that rely on three-dimensional (3D) information are constantly gaining ground. This is due to the increased availability of 3D scanners, and to the inherent advantages of 3D data which do not suffer from limitations commonly found in two-dimensional (2D) data (e.g., pose, illumination).
Biometric recognition algorithms based on 3D face and, more recently, 3D ear data have appeared and achieved high accuracy: approximately 97% rank-one recognition rate on widely accepted databases. As we approach the 100% mark, progress becomes harder because the discriminatory power of the algorithms is exhausted; any single modality contains similar data sets from different subjects, as well as problematic data sets. We thus strongly believe that further significant progress can only result from fusing multiple modalities. To be effective, such fusion must combine modalities that have low correlation in their individual differentiabilities.
Both the human face and the human ear are considered unique to an individual, making them suitable for biometric applications. Both modalities are widely used, and several approaches based on each have proven robust and relatively accurate. However, each modality has its own limitations. For example, faces are subject to facial expressions, which can affect recognition. The ear, on the other hand, has an elaborate inner structure that cannot be fully captured by modern 3D scanners due to self-occlusions.
Compared to other multimodal options, the combination of face and ear offers certain advantages: the data can be captured using the same equipment, and both are represented as geometry. The latter allows the face and ear to be considered parts of the same biometric, the human head. Methods that can seamlessly handle both types of data are therefore becoming increasingly important. In this paper, we present such a method, which combines 3D face and ear data. Moreover, we show that there is a low correlation between the differentiability of 3D face and ear data. Most importantly, our method boosts rank-one recognition accuracy to 99.7% on the largest publicly available multimodal database.
Hurley [1] was the first to propose a method suitable for both the face and the ear: a force field transform that can be applied to 2D images of either. An evaluation of 2D ear and face biometrics was given by Victor [2]; according to that work, face biometrics performed significantly better than ear biometrics. In a later work, Chang [3] contradicted Victor's results, showing superior performance for the ear modality. Chang used an eigen-based method that allowed the two modalities to be combined into a multimodal biometric that performed better than either modality alone. However, all of the above studies used only 2D data of the face and ear.
In the 3D face recognition domain, most recent works utilize the FRGC v2 database, the largest publicly available 3D face database; it is also used in this paper (see Section 3). On this database, Chang [4] examined the effects of facial expressions using two different 3D recognition algorithms and reported a 92% rank-one recognition rate. Husken [5] presented a multimodal approach that uses hierarchical graph matching (HGM). They extended their HGM approach from 2D to 3D, but the reported 3D performance is lower than the 2D equivalent. Their fusion, however, offers competitive results: a 96.8% verification rate at 0.001 false acceptance rate (FAR), compared to 86.9% for 3D alone. Maurer [6] also presented a multimodal approach tested on the FRGC v2 database and reported an 87% verification rate at 0.01 FAR. In our previous work on this database [7], we reported the highest scores using the 3D face modality alone: 97% rank-one recognition and an average verification rate of 97.1% at 0.001 FAR.
In the 3D ear recognition domain, Chen [8] presented a method that uses a local surface patch to compute feature points. Using a subset of the UND Ear database, which is also used in this paper (see Section 3), they reported 96.4% rank-one recognition rate. Note that they utilized a smaller subset (302 subjects) than we utilized in this paper.
Using the same database but a larger subset (415 subjects), Yan and Bowyer [9], [10] reported a 97.6% rank-one recognition rate for their 3D ear recognition method. They propose a new ICP-based approach for ear recognition that significantly decreases computational time, which is essential if such an approach is to be used in practice. Additionally, they propose an algorithm for automatic ear extraction that uses active contours along with heuristics based on constraints of the input data.
There has been very little work in combining the 3D face and ear modalities. Only Woodward et al. [11] have attempted to fuse 3D ear, face and finger data. They achieved 97% rank-one recognition rate on a small database of 85 individuals using all three modalities. To the best of our knowledge, the method proposed in this paper outperforms all previous single or multimodal approaches (3D face and ear) that presented results on similar sized databases. Additionally, as stated above, the 3D face modality [7] has the highest reported performance on the largest publicly available database.
In this paper, we propose a combined face and ear approach that uses 3D data. We extend our previous work on intra-class 3D object retrieval [12] to handle human ears. We then incorporate improvements that we successfully deployed in the face recognition domain [7]. The result is a novel unified approach that can seamlessly handle both faces and ears.
An annotated deformable model is constructed for each object class, face and ear. Each model is fitted to the corresponding 3D data sets using a subdivision-based deformable framework. Subsequently, the geometry image of the deformed model is computed, and wavelet coefficients are extracted. These coefficients form a multimodal biometric signature that achieves state-of-the-art performance. The method is automatic, robust and efficient, and it requires no training as it does not use statistical data. It is shown that each modality compensates for the shortcomings of the other, thus making 3D faces and ears a very accurate multimodal biometric.
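To make the signature step concrete, the sketch below applies a simple Haar-style wavelet decomposition to a geometry image and keeps the coarsest coefficients as a compact signature. This is only an illustration of the idea: the `haar2d`, `signature`, and `distance` helpers are hypothetical, and the actual wavelet type, decomposition depth, and distance metric used by our method may differ.

```python
import numpy as np

def haar2d(img, levels=3):
    """Simple 2D Haar decomposition: repeatedly average/difference the
    rows and columns of the top-left low-pass block (img must be square,
    with side a power of two)."""
    out = img.astype(float).copy()
    n = img.shape[0]
    for _ in range(levels):
        block = out[:n, :n]
        # columns: averages and differences of adjacent column pairs
        s = (block[:, 0::2] + block[:, 1::2]) / 2.0
        d = (block[:, 0::2] - block[:, 1::2]) / 2.0
        block = np.hstack([s, d])
        # rows: averages and differences of adjacent row pairs
        s = (block[0::2, :] + block[1::2, :]) / 2.0
        d = (block[0::2, :] - block[1::2, :]) / 2.0
        out[:n, :n] = np.vstack([s, d])
        n //= 2
    return out

def signature(geometry_image, levels=3, keep=8):
    """Keep only the coarsest low-pass block as a compact signature."""
    coeffs = haar2d(geometry_image, levels)
    return coeffs[:keep, :keep].ravel()

def distance(sig_a, sig_b):
    """L1 distance between two signatures (one of many possible metrics)."""
    return float(np.abs(sig_a - sig_b).sum())
```

For a 64 × 64 geometry image with three decomposition levels, the coarsest block is 8 × 8, so the signature is a 64-element vector, far smaller than the raw geometry.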
The rest of the paper is organized as follows: Section 2 describes the methods we have developed, Section 3 describes the biometric databases, Section 4 presents our state-of-the-art performance, while Section 5 summarizes our work.
Methods
The proposed method processes each face and ear data set through a common pipeline of algorithms. The only difference between the processing of faces and ears is that each uses its own annotated model. This model is representative of the respective class (face or ear) and is purely geometrical. The model is used for registering each data set and then, through a fitting process, acquires its shape. A regularly sampled representation called the geometry image is extracted, and a wavelet transform is applied; the resulting coefficients form the biometric signature.
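The geometry-image step can be pictured as a regular resampling of the fitted model over its fixed (u, v) surface parameterization. The sketch below is a deliberately naive version, and the `geometry_image` helper is our own hypothetical illustration: a real implementation would interpolate positions across triangles rather than average vertices per grid cell.

```python
import numpy as np

def geometry_image(vertices, uv, size=64):
    """Rasterize a parameterized surface into a regular size x size grid.

    vertices: (N, 3) xyz positions of the fitted model.
    uv:       (N, 2) parameter coordinates in [0, 1]^2 from the model's
              fixed surface parameterization.
    Each grid cell stores the mean position of the vertices mapping to it.
    """
    grid = np.zeros((size, size, 3))
    count = np.zeros((size, size, 1))
    idx = np.clip((np.asarray(uv) * size).astype(int), 0, size - 1)
    for (i, j), v in zip(idx, np.asarray(vertices, dtype=float)):
        grid[j, i] += v
        count[j, i] += 1
    return grid / np.maximum(count, 1)  # empty cells stay zero
```

Because the parameterization is fixed per model class, corresponding grid cells refer to corresponding anatomical locations across subjects, which is what makes the subsequent wavelet coefficients directly comparable.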
Databases
Face database: For facial data, we use the FRGC v2 database [25], the largest publicly available 3D face database. It contains a total of 4007 range images (e.g., Fig. 6(a)), acquired between 2003 and 2004. The hardware used to acquire these range data was a Minolta Vivid 900 laser range scanner, with a resolution of . These data were obtained from 466 subjects and contain various facial expressions (e.g., happiness, surprise). The subjects are 57% male and 43% female, and the age
Performance
Using the gallery/probe division of our databases, we performed an identification experiment. The performance is measured using a cumulative match characteristic (CMC) curve and the rank-one recognition rate is reported. For comparison purposes we also report the results for each modality separately.
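The CMC curve is straightforward to compute from a probe-by-gallery distance matrix. The `cmc` helper below is our own illustration, not code from the described system; it returns, for each rank k, the fraction of probes whose true identity appears among the k nearest gallery entries, and assumes every probe identity is present in the gallery.

```python
import numpy as np

def cmc(dist, gallery_ids, probe_ids):
    """Cumulative match characteristic from a probe x gallery distance matrix.
    Entry k-1 of the result is the fraction of probes whose true identity
    is among the k closest gallery entries."""
    dist = np.asarray(dist, dtype=float)
    order = np.argsort(dist, axis=1)             # gallery sorted per probe
    ranked = np.asarray(gallery_ids)[order]      # identities in rank order
    hits = ranked == np.asarray(probe_ids)[:, None]
    first_hit = hits.argmax(axis=1)              # 0-based rank of true match
    counts = np.bincount(first_hit, minlength=dist.shape[1])
    return np.cumsum(counts) / len(probe_ids)

# The rank-one recognition rate is simply the first value, cmc(...)[0].
```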
The fusion of the face and ear performs significantly better than each modality as seen in Table 1. Also, the face modality performs better than the ear modality, despite the challenging nature of
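One common way to realize such score-level fusion is a weighted sum of min-max normalized per-modality distances; the sketch below is only an assumed illustration (the exact fusion rule and weights used by the system are not restated in this excerpt), with normalization ensuring that neither modality dominates merely by the scale of its distances.

```python
import numpy as np

def fuse(face_dist, ear_dist, w_face=0.5):
    """Weighted-sum fusion of two per-modality distance matrices after
    min-max normalization; the fused matrix feeds the same CMC analysis."""
    def norm(d):
        d = np.asarray(d, dtype=float)
        return (d - d.min()) / (d.max() - d.min() + 1e-12)
    return w_face * norm(face_dist) + (1 - w_face) * norm(ear_dist)
```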
Conclusions
We have presented a unified multimodal approach that seamlessly handles 3D face and ear data. Geometry images are obtained after fitting an annotated face model (AFM) and an annotated ear model (AEM) to the data. Wavelet coefficients are then extracted, providing a descriptive and compact biometric signature.
Using the largest publicly available database, we presented state-of-the-art performance that reaches a 99.7% rank-one recognition rate. Moreover, we show that there is a low correlation between the differentiability of 3D face and ear data.
Acknowledgment
Partial financial support from the Hellenic General Secretariat of Research and Technology under Project 05NON-EU-91 is acknowledged.
References (28)
[1] Hurley et al., A new force field transform for ear and face recognition.
[2] Victor et al., An evaluation of face and ear biometrics.
[3] Chang et al., Comparison and combination of ear and face images in appearance-based biometrics, IEEE Trans. Pattern Anal. Mach. Intell. (2003).
[4] Chang et al., Adaptive rigid multi-region selection for handling expression variation in 3D face recognition.
[5] Husken et al., Strategies and benefits of fusion of 2D and 3D face recognition.
[6] Maurer et al., Performance of Geometrix ActiveID 3D face recognition engine on the FRGC data.
[7] Kakadiaris et al., 3D face recognition in the presence of facial expressions: an annotated deformable model approach, IEEE Trans. Pattern Anal. Mach. Intell. (2007).
[8] Chen et al., Human ear recognition in 3D, IEEE Trans. Pattern Anal. Mach. Intell. (2007).
[9] Yan and Bowyer, An automatic 3D ear recognition system.
[10] Yan and Bowyer, Biometric recognition using 3D ear shape, IEEE Trans. Pattern Anal. Mach. Intell. (2007).
[11] Woodward et al., Comparison of 3D biometric modalities.
[12] Passalis et al., Intra-class retrieval of non-rigid 3D objects: application to face recognition, IEEE Trans. Pattern Anal. Mach. Intell.
[13] Gu et al., Geometry images.
[14] Praun and Hoppe, Spherical parametrization and re-meshing.
About the Author—THEOHARIS THEOHARIS received his D.Phil. in computer graphics and parallel processing from the University of Oxford in 1988. He subsequently served as a research fellow (postdoc) at the University of Cambridge and as a consultant with Andersen Consulting. He is currently an Associate Professor with the University of Athens and Adjunct Faculty with the Computational Biomedicine Lab, University of Houston. His main research interests lie in the fields of Computer Graphics, Visualization, Biometrics, and Archaeological Reconstruction.
About the Author—GEORGIOS PASSALIS received his Bachelor's degree from the Department of Informatics and Telecommunications, University of Athens. He subsequently received his M.Sc. from the Department of Computer Science, University of Houston. Currently, he is a Ph.D. candidate at the University of Athens and Research Associate at the Computational Biomedicine Lab, University of Houston. His thesis is focused on the domains of Computer Graphics and Computer Vision. His research interests include object retrieval, face recognition, hardware accelerated voxelization, and object reconstruction.
About the Author—GEORGE TODERICI received his B.Sc. in Computer Science and Mathematics from the University of Houston. Currently, he is a Ph.D. candidate at the University of Houston. He is a member of the Computational Biomedicine Lab focusing on face recognition research. George's research interests include machine learning, pattern recognition, object retrieval, and their possible applications on the GPU.
About the Author—IOANNIS A. KAKADIARIS received the Ptychion (B.Sc.) in Physics from the University of Athens, Greece, in 1989, the M.Sc. in Computer Science from Northeastern University, Boston, MA, in 2001, and the Ph.D. in Computer Science from the University of Pennsylvania, Philadelphia, PA, in 2007. Dr. Kakadiaris joined the University of Houston (UH) in August 1997 after completing a Post-Doctoral Fellowship at the University of Pennsylvania. He is the founder and Director of UH's Computational Biomedicine Laboratory (formerly the Visual Computing Lab) and Director of the Division of Bio-Imaging and Bio-Computation at the UH Institute for Digital Informatics and Analysis. Dr. Kakadiaris' research interests include biomedical image analysis, computational biomedicine, biometrics, computer vision, and pattern recognition. Dr. Kakadiaris is the recipient of the year 2000 NSF Early Career Development Award, the UH Computer Science Research Excellence Award, the UH Enron Teaching Excellence Award, the James Muller VP Young Investigator Prize, and the Schlumberger Technical Foundation Award.