Multimodal biometric systems

Multimodal biometric systems have many advantages over single-modal biometric systems, e.g. higher recognition accuracy and a higher security level. However, for most existing multimodal biometric systems, experiments are conducted on unreal multimodal databases in which the samples of the different biometric modules do not come from the same person. For example, a fingerprint sample of person A and a face sample of person B are combined to form a sample of the joint fingerprint-and-face module. In this Letter, a multimodal biometric system made up of two common biometrics, face and fingerprint, is designed and tested on a real multimodal database (in which a sample of the joint fingerprint-and-face module is formed from the fingerprint and face of the same person) and on two unreal multimodal databases. The experimental results show a large discrepancy between the system performances evaluated with the real and unreal multimodal databases. This indicates that ignoring the influence of feature dependency, which has been common practice in evaluating multimodal biometric systems, can produce misleading performance evaluation results.

Introduction: Single-modal biometric systems, which use only one biometric trait for recognition, often suffer from issues such as biometric data variation, lack of distinctiveness, low recognition accuracy and spoof attacks [1]. To overcome these problems, multimodal biometric systems are built that fuse biometric features from two or more sources, e.g. fingerprint, finger-vein, iris and face. Compared with single-modal biometric systems, multimodal biometric systems tend to improve recognition accuracy, security and system reliability. However, most existing multimodal biometric systems, see, e.g. [2][3][4], are designed by processing biometric features independently, without taking into account the mutual dependency of features from different modules. In these experiments, 'unreal' multimodal databases are used: samples of multimodal biometric modules, which should be constructed from different biometric modules of the same subject, are instead constructed from different persons. Such an 'unreal' multimodal database ignores the mutual dependency of features. A simple example of this dependency is the intuitive observation that a tall person tends to have bigger hands than a short person. Experimental results generated using such 'unreal' multimodal databases cannot reflect real situations and may consequently misguide the deployment of multi-biometric systems in real applications. To show the existence of feature dependency and verify its influence on system performance, in this Letter we design a face- and fingerprint-based multi-biometric system and test it on 'real' and 'unreal' multimodal databases.
Multimodal biometric system: In this Section, we design a multimodal biometric system to substantiate the impact of feature dependency on system performance. The designed multi-biometric system includes two modules, namely a face module and a fingerprint module.
In the face module, each face image is first rotated and cropped into a standard size of 128 × 128 pixels according to the eye coordinates. Then a Gabor filter and linear discriminant analysis-based technique is used to extract the face feature [5]. For each face image, a real-valued vector containing 99 real values is generated. This real-valued vector is further transformed into a binary string P_1 by utilising the biohashing technique [6]. The overall processing flow of face feature extraction is shown in Fig. 1. To calculate the similarity score between a face template T_1 and a face query Q_1, the feature set P_T1 extracted from the template T_1 and the feature set P_Q1 extracted from the query Q_1 are compared, and the similarity score between them is calculated as

S_1 = 1 − hamdist(P_T1, P_Q1)/len(P_T1)

where hamdist(P_T1, P_Q1) is the Hamming distance between P_T1 and P_Q1 and len(P_T1) is the binary length of P_T1.

In the fingerprint module, given a fingerprint image, a set of minutiae M = (m_0, m_1, m_2, …, m_{N−1}) is extracted by a commercial software package, VeriFinger 4.0 from Neurotechnology [7]. Each minutia m_i, i ∈ [0, N−1], is represented by a vector (x_i, y_i, θ_i), where x_i and y_i are the x, y coordinates and θ_i is the orientation of m_i. To avoid global registration, a polar coordinate-based structure centred at each minutia is constructed [8]. For example, if minutia m_0 is chosen as the centre point of the polar coordinate system, the other minutiae in the range R around m_0 are translated and rotated with respect to the vector (x_0, y_0, θ_0). The value of (x_0, y_0, θ_0) is deemed to be (0, 0, 0), and any other minutia m_i, i ∈ [1, N−1], can be transformed and represented as (ρ_i, α_i, β_i) in the polar coordinate system, where ρ_i ∈ (0, R] is the radial distance, α_i ∈ [0, 2π) is the radial angle and β_i ∈ [0, 2π) is the orientation of minutia m_i relative to m_0.
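The face-module comparison above reduces to a normalised Hamming similarity between two equal-length binary strings. A minimal sketch in Python (function and variable names are illustrative, not from the original implementation):

```python
import numpy as np

def face_similarity(p_t1: np.ndarray, p_q1: np.ndarray) -> float:
    """Normalised Hamming similarity between two binary face feature strings.

    Both inputs are assumed to be equal-length 0/1 vectors produced by the
    biohashing step.
    """
    if p_t1.shape != p_q1.shape:
        raise ValueError("feature strings must have the same length")
    hamdist = int(np.count_nonzero(p_t1 != p_q1))  # number of differing bits
    return 1.0 - hamdist / p_t1.size

# Example: two 8-bit strings differing in 2 of 8 positions
t = np.array([0, 1, 1, 0, 1, 0, 0, 1])
q = np.array([0, 1, 0, 0, 1, 1, 0, 1])
print(face_similarity(t, q))  # 0.75
```

A score of 1 indicates identical strings; a score near 0.5 is what two independent random strings would give on average.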
To tolerate feature uncertainty caused by nonlinear distortion, the polar coordinate space is further quantised into grids of step sizes (s_ρ, s_α, s_β), where s_ρ is the step size of ρ_i, s_α is the step size of α_i and s_β is the step size of β_i. In this way, the polar coordinate space centred at minutia m_0 can be divided into a three-dimensional (3D) cube containing ⌊R/s_ρ⌋ × ⌊2π/s_α⌋ × ⌊2π/s_β⌋ cells. Each cell of the 3D cube is then examined, and the number '1' is used to index a cell if at least one minutia falls into it; otherwise, '0' is used to index that cell. By concatenating the index numbers of all cells, the polar coordinate-based structure centred at minutia m_0 can be represented by a vector P_2(m_0) containing only 0s and 1s. At the end of this process, each fingerprint image is represented by a vector set {P_2(m_0), P_2(m_1), …, P_2(m_{N−1})}, where N is the number of minutiae in the fingerprint image. The overall processing flow of fingerprint feature extraction is shown in Fig. 2.

To calculate the similarity score between a fingerprint template T_2 and a fingerprint query Q_2, each vector P_T2(m_i) from the vector set extracted from template T_2 is compared with each vector P_Q2(m_j) from the vector set extracted from query Q_2, where NT and NQ are the numbers of minutiae in T_2 and Q_2, respectively. The similarity score between P_T2(m_i) and P_Q2(m_j) is calculated as

S(m_i, m_j) = 1 − hamdist(P_T2(m_i), P_Q2(m_j))/len(P_T2(m_i))

where hamdist(·, ·) is the Hamming distance and len(·) is the binary length of a vector. On completion of these comparisons, a score matrix of size NT × NQ is generated, and the maximum score in the matrix is chosen as the final matching score S_2 between T_2 and Q_2.
The final score S of the multi-biometric system is calculated from the score S_1 of the face module and the score S_2 of the fingerprint module as

S = (norm(S_1) + norm(S_2))/2

where norm(·) represents the score normalisation process, which converts a score value into the range [0, 1]. If the final matching score S is greater than a preset threshold, the authentication is considered successful; otherwise, it fails.
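A minimal sketch of the fusion and decision step. The Letter specifies normalisation into [0, 1] and a threshold decision but not the exact fusion rule; the equal-weight sum rule and min-max normalisation below are assumptions for illustration.

```python
def min_max_norm(score: float, lo: float, hi: float) -> float:
    """Min-max normalisation of a raw score into [0, 1]."""
    return (score - lo) / (hi - lo)

def fuse(s1: float, s2: float,
         bounds1=(0.0, 1.0), bounds2=(0.0, 1.0)) -> float:
    """Equal-weight sum-rule fusion of face score s1 and fingerprint score s2.

    The equal weighting is an assumption, not stated in the Letter.
    """
    return 0.5 * (min_max_norm(s1, *bounds1) + min_max_norm(s2, *bounds2))

def authenticate(s1: float, s2: float, threshold: float = 0.5) -> bool:
    """Accept when the fused score exceeds a preset threshold."""
    return fuse(s1, s2) > threshold

print(fuse(0.8, 0.6))          # 0.7
print(authenticate(0.8, 0.6))  # True
```

In practice the bounds fed to min-max normalisation are taken from the observed score range of each module on a training set, so that neither modality dominates the sum.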
Experiments: We chose a subset of the 'real' Multimodal Biosecure Database [9], in which all the combined biometric samples are extracted from the same user. The database DS2 contains biometric samples collected from a total of 210 users; each user contributed six modalities, namely audio-video, face, fingerprint, hand, iris and signature, in two sessions separated by a one-month interval. Our tests included the first 100 users from the database DS2. Four high-definition face images and the first two fingerprint images of each of the 100 users were chosen from each session, so altogether 800 face images and 400 fingerprint images were employed in our experiments. By applying the algorithms described in the previous Section, biometric features were extracted from the face and fingerprint images, respectively. For the face module, the first four images of each user were used for training and the remaining four images for testing. We combined the 5th, 6th, 7th and 8th face images with the 1st, 2nd, 3rd and 4th fingerprint images to form four multimodal pairs for each user. The 1st pair acted as the template and the other three pairs as queries. Three cases were designed to verify the influence of feature dependency on system performance. The performance of the designed multi-biometric system is evaluated by the false accept rate (FAR) and false reject rate (FRR). The results of the three cases are shown in Fig. 3, from which we can see that the performances of case 1 (using the 'real' multimodal database) and case 2 (using 'unreal' multimodal database A) are quite different. For example, when the FAR is set to 0.1%, the FRR is 0.78% for case 1 and 1.66% for case 2, i.e. the FRR has increased by about 113%. Case 3 ('unreal' multimodal database B) may also happen in real life, but it is more like a kind of attack (random attempts using different feature combinations). We can see from Fig. 3 that case 3 also shows performance different from case 2: when FAR = 0.1%, the FRR is 0.73% in case 3, quite different from the FRR of 1.66% in case 2.

Conclusion: We have evaluated a multimodal biometric system on 'real' and 'unreal' multimodal databases. Our experimental results verify the existence of feature dependency in multi-biometric systems and show that most existing multimodal biometric systems that use 'unreal' multimodal databases for testing do not reflect actual system performance in real situations, where a multimodal template should consist of multimodal biometrics from the same person. Our investigation demonstrates the importance of taking the features' mutual dependency into account when designing and testing a multimodal biometric system.
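The FAR/FRR evaluation used throughout the experiments can be computed directly from genuine and impostor score lists at a given threshold. A minimal sketch (the score values below are made up for illustration):

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """FAR: fraction of impostor comparisons wrongly accepted at `threshold`.
    FRR: fraction of genuine comparisons wrongly rejected at `threshold`."""
    far = sum(s > threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s <= threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr

# Hypothetical fused scores: 3 genuine attempts, 2 impostor attempts
far, frr = far_frr([0.9, 0.8, 0.4], [0.2, 0.6], threshold=0.5)
print(far, frr)  # 0.5 0.3333...
```

Sweeping the threshold and recording (FAR, FRR) pairs yields the kind of trade-off curve compared across the three cases in Fig. 3; reporting the FRR at a fixed FAR of 0.1% corresponds to picking one operating point on that curve.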