A deep biometric hash learning framework for three advanced hand-based biometrics

Hand-based biometrics has undergone extensive research in recent decades. Besides the fingerprint, which is commonly used in personal authentication, three other advanced hand-based biometrics are being researched worldwide: the palmprint, palm vein, and dorsal hand vein. However, academics mainly focus on their unimodal or multimodal recognition, and few researchers conduct comprehensive comparisons among them to guide practical applications, that is, to assess which one is the most suitable biometric modality. Inspired by the deep hashing network (DHN) and transfer learning, we propose a deep biometric hash learning (DBHL) framework to uniformly analyse and deal with these three biometrics. An end-to-end network is involved to convert images into binary codes. A pre-trained network is employed for fine-tuning, and the Hamming distance is adopted to measure the similarity between the codes of query and registration images. Through experiments on benchmarks, the equal error rate (EER) and the maximum or minimum distance of genuine or imposter matches are obtained as performance indicators.


| INTRODUCTION
Recently, personal authentication systems based on specific physiological or behavioural characteristics of humans have become prevalent due to their greater convenience and security compared with traditional identification-card- or password-based systems. Among these, hand-based biometrics has attracted a lot of research interest due to its potential advantages in user-friendliness and high precision. Generally, hand-based biometrics identifies users based on unique features such as the palmprint [1], hand geometry [2], palm vein [3], and dorsal hand vein [4].
The hand is a natural part of our body that offers good user-friendliness both culturally and customarily. Furthermore, various features can be extracted from our hands that provide high discriminative power through lines, feature points, vein configuration, and minutiae. In this article, we mainly focus on three advanced hand-based biometrics, that is, palmprint, palm vein, and dorsal hand vein, because of their inherent properties, including a large feature area, contactless imaging, and richness of texture and minutiae. For instance, the principal lines, ridges, singular points, and minutiae points in palm images and the structure of the vein skeleton can all be utilised to construct a stable recognition system.
Regarding the development of palmprint recognition, traditional methods can be divided into several categories, that is, texture-based, orientation-based, and statistics-based methods [5]. For example, Luo et al. [6] proposed a local binary pattern-like descriptor in the local line-geometry space, which can be considered a texture-based approach. In [7], Fei et al. extracted orientation features and proposed a discriminative neighboring direction indicator for palmprint recognition. With the introduction of deep learning, many deep learning-based palmprint recognition algorithms have achieved satisfactory performance [8].
Concerning vein patterns, by near-infrared imaging of the subcutaneous blood vessels and tissues, the vein skeleton can be extracted from the region of interest (ROI). Since most veins are invisible to the naked eye, they are more difficult to replicate than the fingerprint. The vein pattern can be considered unique to each individual and remains stable over a long period [9]. Therefore, it is of great research value in the field of hand-based biometrics. Among different vein patterns, we pay particular attention to palm vein and dorsal hand vein recognition.
With regard to palm vein recognition, most traditional methods are based on the skeleton structure of the subcutaneous veins [10]. Chen et al. [11] used a Gaussian Matched Filter (GMF) for feature extraction and then employed the Iterative Closest Point (ICP) algorithm for matching. Other methods focus on structural features to identify and classify special patterns in palm vein images [12], which rely heavily on manually designed features. More details are described in subsequent sections.
The general process of dorsal hand vein recognition consists of image segmentation, extraction of the blood vessel skeleton, and matching. The Biometric Graph Matching (BGM) algorithm [13] was proposed for matching after skeleton extraction, and an improved method [14] was adopted for application under uncontrolled environments. With deep neural networks (DNNs) employed for identification tasks, Wan et al. [15] trained a Convolutional Neural Network (CNN) to extract features followed by logistic regression, although some operations, including ROI extraction and preprocessing, were still essential. Furthermore, dual-modal biometrics partially based on the dorsal hand vein has been engaged as well [16].
Towards the practical application of the above-mentioned three types of hand-based biometrics, there are still serious issues to be solved, such as image occlusion, rotation, and translation, along with complex backgrounds. Thanks to the tremendous progress in deep learning, many researchers have turned to learning-based methods for hand-based biometrics. End-to-end learning can usually be realised by resorting to DNNs without specific manual design, which is helpful for hand-based biometrics with many hidden regular patterns.
In this article, the deep biometric hash learning (DBHL) framework is proposed to uniformly deal with these three types of hand-based biometrics, that is, palmprint, palm vein, and dorsal hand vein recognition. An end-to-end network structure is adopted to process a given biometric image and output its hashing code. Feature extraction is performed by a CNN, and binary coding is achieved by the sign function. Finally, the Hamming distance between the codes of the query image and the enrolled image is calculated to determine whether they are from the same category for verification. The schematic diagram of the DBHL framework is shown in Figure 1.
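The coding-and-matching step above can be sketched in a few lines of plain Python (function names are illustrative, not from the paper; real codes are 128-bit network outputs rather than the short toy vectors shown here):

```python
def binarise(features):
    """Binarise real-valued network outputs with the sign function."""
    return [1 if f >= 0 else -1 for f in features]

def pack_bits(code):
    """Pack a {-1, +1} code into one integer so codes can be XOR-compared."""
    n = 0
    for bit in code:
        n = (n << 1) | (1 if bit == 1 else 0)
    return n

def hamming_distance(a, b):
    """Number of differing bits between two packed codes: XOR + popcount."""
    return bin(a ^ b).count("1")
```

A query is then accepted as a genuine match when the Hamming distance to the enrolled code falls below a chosen threshold.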
Based on the DBHL framework, extensive experiments on these advanced hand-based biometrics are conducted on several representative datasets, and satisfactory performance is obtained. The palmprint and palm vein databases are the widely used benchmark databases collected by the Hong Kong Polytechnic University [17], that is, the PolyU-Blue and PolyU-NIR databases. The dorsal hand vein database is from the North China University of Technology, referred to as the NCUT database [18]. Under our experimental conditions, the equal error rate (EER) of dorsal hand vein recognition can be as low as 0.196%. More impressively, the EERs of palmprint and palm vein recognition can reach 0%. This shows that our proposed DBHL framework is very effective and helps construct specific hand-based biometric systems easily and conveniently.
Furthermore, for the first time, a direct comparison of these three competitive hand-based biometrics is conducted based on the DBHL framework, which can provide guidance to researchers in applications. Currently, the academic focus is mainly on the unimodal recognition or multimodal fusion recognition of the palmprint, palm vein, and dorsal hand vein [19]. Hence, in order to further guarantee the performance of a recognition system, a comparative analysis of their advantages and disadvantages is strongly needed. When researchers face the problem of selecting modalities in the practical application of hand-based biometrics, such an analysis is very helpful for deciding which of them is the suitable choice. However, in previous studies, due to the different patterns and characteristics of these three biometrics, it was difficult to find a uniform algorithm with which to quantitatively analyse their advantages. For example, BGM, which performs well on the dorsal hand vein, cannot be applied to palmprint recognition. In contrast, the DBHL framework achieves state-of-the-art results on all three modalities at the same time, rather than only on a specific one. Additionally, thanks to the end-to-end deep learning framework, DBHL can be used for a unified comparison without manually extracted features, even though the image quality and imaging methods are quite different. Therefore, a preliminary comparison is conducted using DBHL with the same network parameters, training strategies, and sample sizes. Based on the results, in terms of the pros and cons of these three hand-based biometrics, the palmprint is found to have certain superiority over the other two.
The contributions of this article can be briefly summarised as follows: (1) The DBHL framework is proposed to convert ROIs into hashing codes for three unique hand-based biometrics, that is, palmprint, palm vein, and dorsal hand vein recognition. By calculating the Hamming distance between codes, it is easy to determine whether they are from the same category, which improves the efficiency of feature matching. Experimental results demonstrate its high adaptability and efficiency for biometrics. (2) Based on the DBHL framework, a unified and comprehensive comparison is conducted on these three types of hand-based biometrics for the first time. According to the experimental results, the palmprint is more suitable for high-accuracy authentication, although it is somewhat lacking in anti-counterfeiting ability. The article consists of seven sections. Section 2 offers the related work. Section 3 describes our proposed DBHL. Section 4 presents experiments and results on the three types of hand-based biometrics. Section 5 provides discussion of the results. A preliminary comparison is shown in Section 6. Section 7 concludes this article.

| Palmprint recognition
Palmprint recognition has mainly consisted of machine learning methods and their optimisations [19]. Due to the rich and distinctive orientation information of the palmprint, there are several orientation-based coding methods based on the Gabor filter, such as the competitive code [20], the robust line orientation code (RLOC) method [21], the binary orientation co-occurrence vector (BOCV) [22], the extended BOCV (E-BOCV) [5], and the discriminative and robust competitive code (DRCC) [23]. Fei et al. [24] proposed a novel palmprint recognition algorithm, the discriminant direction binary code (DDBC) method, by performing the convolution of direction-based templates and the palmprint. Gumaei et al. [25] proposed a novel approach using an autoencoder (AE) and a regularised extreme learning machine (RELM) to improve the efficiency of palmprint recognition. Jia et al. [26] proposed a more complete and comprehensive way to represent the images. Fei et al. [27] analysed different direction features based on an exponential and Gaussian fusion model (EGM) for palmprint recognition. Gumaei et al. [28] presented a hybrid feature extraction method, HOG-SGF, using the histogram of oriented gradients (HOG) and a steerable Gaussian filter (SGF) for effective palmprint recognition.
Regarding deep learning-based methods, the deep discriminative representation (DDR) was proposed in Reference [29] to extract discriminative deep features from limited palmprint samples. Zhong et al. [19] proposed hand-based multibiometrics using a deep hashing network (DHN) and BGM and obtained promising results. Chen et al. [30] proposed an effective denoising method for low-resolution palmprint images based on a generative adversarial network (GAN). Shao et al. [31] proposed PalmGAN to perform cross-domain palmprint recognition on several databases. Xie et al. [32] fine-tuned the Visual Geometry Group network to propose a new CNN model for palmprint gender classification. Based on Gabor filters and principal component analysis (PCA), Genovese et al. [33] proposed PalmNet for touchless palmprint recognition. In Reference [34], Shao and Zhong proposed a graph convolutional network for few-shot palmprint recognition.

| Palm vein recognition
Traditional palm vein recognition algorithms mainly used physical patterns, including minutiae points, ridges, and texture, for matching. For instance, winner-take-all (WTA) hashing [35], random sample consensus (RANSAC) [36], and an adaptive 2D Gabor filter [37] were all applied to improve representation performance. The palm vein image used for recognition is not always clear, and irregular shading and saturated areas may appear, leading to prolonged processing time. One exemplary work is given in Reference [38], where Holle et al. proposed a palm vein recognition system using the local line binary pattern (LLBP) method. Ma et al. [37] proposed a novel palm vein recognition approach using an adaptive 2D Gabor filter, consisting of three key steps. Although many optimised algorithms have been proposed, their time complexity grows once the size of the dataset is enlarged, which leads to adverse effects. Ahmad et al. [39] adopted the wave atom transform (WAT) for palm vein recognition with high computational efficiency. Wu et al. [40] extracted discriminative features using the Haar wavelet decomposition for palm vein recognition. Cho et al. [41] extracted the features of the palm vein and palmprint to perform cross-spectral verification and conducted experiments on public multi-spectral palm databases.

| Dorsal hand vein recognition
The general process of dorsal hand vein recognition consists of image segmentation, extraction of the blood vessel skeleton, and matching. Huang et al. [42] proposed a new key-point generation pattern, namely the centroid-based circular key-point grid (CCKG), which efficiently localises a number of points on the dorsal hand. Another typical method is the BGM algorithm mentioned above, which extracts the skeleton of the vein and converts it into a graph for matching [13,14]. Arora et al. [43] extracted some new features based on information set theory, such as the vein effective information, Shannon transform feature, vein energy feature, and composite transform feature. Huang et al. [30] combined both texture features and shape features and proposed a novel shape representation method to improve the distinctiveness of dorsal hand vein images. Based on the non-invasive near-infrared imaging method, Yildiz and Boyraz [44] proposed a dorsal hand vein imaging system for vein visualisation. In Reference [4], Wei and Zhang analysed different ROI extraction algorithms and performed dorsal hand vein identification and verification. Based on neural networks, Wan et al. [15] trained the reference-CaffeNet, AlexNet, and VGG deep CNNs to extract image features, followed by logistic regression for identification. In this article, an end-to-end recognition method is proposed. The images are inputted into the model directly and then encoded as 128-bit binary codes.

F I G U R E 1 Schematic diagram of the deep biometric hash learning (DBHL) framework. The ROIs of the query image and the enrolled image are inputted into the same DNN, which consists of convolutional layers and fully connected layers, to obtain their hashing codes. The Hamming distance is adopted to measure the similarity between the codes. An appropriate threshold can be used to determine the genuine and impostor matches
Through the network, the codes for the same class become similar, while the codes of different classes differ significantly.

| Deep hash coding
In recent years, deep learning has achieved outstanding performance in both theoretical and applied research [45]. In many computer vision tasks, deep learning-based algorithms have achieved performance that traditional methods cannot match. Recently, many effective and efficient deep hashing algorithms have also emerged. Lu et al. [46] proposed a new deep hashing approach for scalable image search, using a DNN to exploit linear and non-linear relationships. Liu et al. [47] proposed a Deep Supervised Hashing (DSH) scheme for fast image retrieval. In Reference [48], Song et al. proposed self-supervised video hashing (SSVH) to capture the temporal nature of videos in an end-to-end hashing fashion. Using unsupervised adversarial learning, Deng et al. [49] proposed Unsupervised ADversarial Hashing (UADH) for image search, in which an encoder generates the hashing code, a generator reconstructs the images, and a discriminator distinguishes the hashing codes and images. Yang et al. [50] proposed supervised semantics-preserving deep hashing (SSDH), which constructs hashing functions as a latent layer in a deep network, and the binary codes are learned by minimising an objective function combining classification error and other desirable hashing code properties. Yang et al. [51] proposed shared predictive deep quantization (SPDQ) to improve the performance of efficient cross-modal similarity search. To improve the effectiveness of hash coding, Song et al. [52] incorporated the advantages of quantization-error-reduction methods into conventional property-preserving hashing methods and proposed quantization-based hashing (QBH). The superior performance of deep hashing for image retrieval has motivated researchers to expand its applications to biometrics.

| DEEP BIOMETRIC HASH LEARNING FRAMEWORK
Inspired by the DHN [53] and transfer learning [54], the DBHL framework is proposed to uniformly deal with the three types of hand-based biometrics. Fine-tuning is a powerful trick in transfer learning, which can improve accuracy and reduce training costs. In [55], Donahue et al. pointed out that the features extracted by a DNN trained on a large and fixed set of object recognition tasks significantly outperformed state-of-the-art methods. In this article, we fine-tune the weights of VGG-16 [54] pre-trained on ImageNet.

| The structure of the deep biometric hash learning framework
The DBHL framework consists of convolutional layers, fully connected layers, and a coding layer. Pre-trained VGG-16 is adopted as the backbone network. Figure 2 shows the network parameters of the adopted DBHL. VGG-16 is scalable and has been successfully applied to many computer vision tasks [54]. Its structure is simple: the entire network uses the same convolution kernel size (3 × 3) and max-pooling size (2 × 2). Small convolution kernels and a deeper network provide regularisation. Given the small number of samples used, VGG-16 makes it easier to obtain good performance and to reduce the degree of overfitting. The ROI is first inputted into the convolutional and fully connected layers to obtain discriminative features. Then, in order to encode each image into a $k$-bit code, the extracted features are binarised by the sign function of Equation (1),

$$b = \operatorname{sgn}(u), \qquad (1)$$

where $u$ is the real-valued output of the last fully connected layer and $b \in \{-1, +1\}^k$ is the hashing code. Finally, the codes are used to calculate their Hamming distance through a simple XOR operation, which presents their similarity and determines the classes.
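As a sanity check on these layer parameters, the spatial size of a 128 × 128 ROI can be traced through the backbone: 3 × 3 convolutions with stride 1 and padding 1 preserve the size, while each 2 × 2 max-pooling halves it. A minimal sketch, assuming the standard VGG-16 stage layout (helper names are illustrative):

```python
def conv2d_size(n, kernel=3, stride=1, padding=1):
    """Output size of a square convolution; 3x3 / stride 1 / pad 1 keeps the size."""
    return (n + 2 * padding - kernel) // stride + 1

def maxpool_size(n, kernel=2, stride=2):
    """Output size of a square max-pooling window; 2x2 / stride 2 halves it."""
    return (n - kernel) // stride + 1

def vgg16_feature_size(n=128):
    """Trace a square input through the five conv stages of standard VGG-16."""
    convs_per_stage = [2, 2, 3, 3, 3]  # standard VGG-16 layout
    for num_convs in convs_per_stage:
        for _ in range(num_convs):
            n = conv2d_size(n)
        n = maxpool_size(n)
    return n
```

For a 128 × 128 input this yields a 4 × 4 feature map (128 → 64 → 32 → 16 → 8 → 4) before the fully connected layers.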

| Optimization goal
The DBHL aims to reduce the distance between the codes of genuine matches and to push apart the codes of imposter matches. Therefore, the distance between the hashing codes can be incorporated into the loss function, denoted as the hashing loss. Given two images $i$ and $j$, their outputs before coding are $u_i$ and $u_j$, and their distance is $D_{ij}$. The hashing loss can be expressed as

$$L_h = \frac{1}{2} C_{ij} D_{ij} + \frac{1}{2}\left(1 - C_{ij}\right) \max\left(m - D_{ij}, 0\right), \qquad (2)$$

where $C_{ij} = 1$ when the two images are a genuine match and $C_{ij} = 0$ otherwise, and $m$ is the margin threshold. $D_{ij}$ is measured by the Euclidean distance during training, $D_{ij} = \|u_i - u_j\|_2$. In addition, since Equation (1) is used to obtain the binary code, it causes an error between the image feature and the final code, which should be as small as possible and is taken into account as the quantization loss. Assuming that the code of image $i$ is represented as $b_i \in \{-1, +1\}^k$, the quantization loss can be expressed as

$$L_q = \|b_i - u_i\|_2^2. \qquad (3)$$

The entire loss over the $N$ training images consists of the hashing loss and the quantization loss with a weight $w$:

$$L = \sum_{i,j} L_h + w \sum_{i=1}^{N} L_q. \qquad (4)$$
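A minimal pure-Python sketch of one common form of these two terms (the exact constants in the paper may differ; the margin default follows the m = 180 setting reported later, and the function names are illustrative):

```python
import math

def hashing_loss(u_i, u_j, genuine, m=180.0):
    """Contrastive hashing loss: pull genuine pairs together and push
    imposter pairs apart until their distance exceeds the margin m."""
    d_ij = math.dist(u_i, u_j)          # Euclidean distance during training
    c_ij = 1.0 if genuine else 0.0
    return 0.5 * c_ij * d_ij + 0.5 * (1.0 - c_ij) * max(m - d_ij, 0.0)

def quantization_loss(u_i):
    """Error between the real-valued output u_i and its binary code sgn(u_i)."""
    return sum((math.copysign(1.0, x) - x) ** 2 for x in u_i)
```

The total objective would then sum the hashing loss over training pairs and add the quantization loss with weight w (set to 0.5 in the experiments below).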

| Database
Palmprint database: The PolyU-Blue [17] database was used for the palmprint experiments; it contains 6000 palmprint images from 55 females and 195 males. The samples were collected in two sessions, with six images collected per session. ROIs of 128 × 128 pixels are extracted from each image. Figure 3 shows two samples.

Palm vein database: The PolyU-NIR [17] database was adopted as the palm vein database. The infrared absorption of veins is stronger than that of the surrounding tissue, so palm images collected under infrared light can be used as the vein pattern. There are also 6000 images in this database, from 250 individuals. During acquisition, each volunteer was asked to provide 12 images of the right and left hands, respectively. Hence, these images can be divided into 500 categories. ROIs of 128 × 128 pixels are also extracted. Typical samples are shown in Figure 4.
Dorsal hand vein database: The NCUT [18] database was adopted, consisting of 102 individuals, each with 20 images of the left and right hands. These are grayscale images of 640 × 480 pixels containing the complete back of the hand. The infrared illumination is relatively weak during image acquisition, so histogram equalisation and filtering are required for preprocessing. In this article, two modes of ROI, that is, circular and square, are adopted. The circular ROI (cROI) is the largest inscribed circle of the extracted hand contour, while the square ROI (sROI) is its largest inscribed square. Some typical samples and ROIs are shown in Figure 5.

F I G U R E 2 Some details of the deep biometric hash learning (DBHL) based on VGG-16. For convolution layers, the parameters of filter size, convolution stride, and padding are listed. For max-pooling layers, the windows and strides are given. For fully connected layers, the dimensions and activation functions are also described

F I G U R E 3 Two region of interest (ROI) samples of the PolyU-Blue palmprint database

| Implementation details
In order to ensure a consistent sample size across experiments, 204 classes are randomly selected from the PolyU-Blue and PolyU-NIR databases. All images are preprocessed in the same manner, including histogram equalisation and filtering. For each class in all three databases, there are 10 images, half of which are used for training the network and the remaining half for testing. The final performance of the different hand-based biometrics is evaluated by the EER, the maximum Hamming distance of genuine matches, and the minimum Hamming distance of imposter matches. During testing, all the images are entered into the trained network for encoding. Then, the Hamming distances between codes from the testing and training sets are calculated. Finally, by comparing the distances, it is determined whether two biometric images belong to the same class. In particular, the 'Conv1', 'Conv2', 'Conv3', and 'Conv4' blocks of the pre-trained VGG-16 are fine-tuned to improve the performance, as shown in Figure 2. The experiments are implemented using the TensorFlow framework. The learning rate is set to 0.0001, and the Adam optimiser and stochastic gradient descent (SGD) are adopted.
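The EER used here is the operating point at which the false acceptance rate (FAR) equals the false rejection rate (FRR). A minimal sketch of how it can be estimated from lists of genuine and imposter distances (an illustrative helper, not the paper's evaluation code):

```python
def eer(genuine_distances, imposter_distances):
    """Equal error rate: sweep the decision threshold over all observed
    distances and return the rate where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine_distances) | set(imposter_distances)):
        far = sum(d <= t for d in imposter_distances) / len(imposter_distances)
        frr = sum(d > t for d in genuine_distances) / len(genuine_distances)
        if best is None or abs(far - frr) < abs(best[0] - best[1]):
            best = (far, frr)
    return (best[0] + best[1]) / 2
```

When the genuine and imposter distance distributions do not overlap, some threshold separates them perfectly and the EER is 0%, which is exactly the situation reported below for the palmprint and palm vein.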

| Experimental results
During the experiments, for every 10,000 steps, the trained models are saved and evaluated with the testing sets. The benefit is that we can observe the trend of the algorithm performance. EERs under the corresponding training steps are obtained in Table 1. The receiver operating characteristic (ROC) curves are shown in Figure 6.
It can be observed that the EERs of palmprint and palm vein recognition reach 0% at 10,000 steps. In contrast, the EER of dorsal hand vein recognition with cROI does not stabilise until 40,000 steps, and with sROI it even has to be trained for 60,000 steps to stabilise; neither can reach 0%. The best result of dorsal hand vein recognition with cROI is an EER of 0.196%, while the smallest EER with sROI equals 0.392%. Meanwhile, from Figure 6, we can also see that the ROC curves of the palmprint and the palm vein both lie above those of the dorsal hand vein. This shows that dorsal hand vein recognition is inherently less discriminative than the other two.
In addition, the maximum Hamming distances of genuine matches and the minimum Hamming distances of imposter matches are obtained and presented in Tables 2 and 3, respectively. It can be observed that the maximum distances of the genuine matches of the palmprint are within 10, while the minimum distances of the imposter matches are about 40. For the palm vein, the maximum distances of the genuine matches are around 20, and the minimum distances of the imposter matches are around 37. This indicates that palmprint recognition has superior discriminative ability to palm vein recognition. In comparison, for dorsal hand vein recognition with the two ROI extraction modes, the maximum distances of the genuine matches are about 60, and the minimum distances between the imposters are more than 20. Therefore, there is an inevitable overlap between the Hamming distances of genuine matches and imposter matches, which again illustrates the shortcomings of dorsal hand vein recognition.

| Comparison with other works
In order to demonstrate the validity and versatility of the DBHL framework for hand-based biometrics, comparisons with state-of-the-art methods are provided in Tables 4 and 5. Since many works treat PolyU-NIR as a palmprint database, we put the palmprint and the palm vein in the same Table 4. In Reference [50], classification loss and desirable hash code properties were combined to optimise the deep network. Inspired by it, the classification loss was also introduced into the DBHL based on VGG-16, ResNet-50 [56], and AlexNet [57]. It can be observed from the results that, for all three biometrics, the proposed DBHL method outperforms the other baselines and achieves state-of-the-art performance.

1) Effect of different code lengths
Biometric images are transferred into binary codes by the DBHL. Theoretically, the longer the code, the more information it contains and the higher the accuracy. In this part, we conduct several experiments to compare the performance of 32-bit, 64-bit, 128-bit, and 256-bit codes on the three advanced hand-based biometrics. The results are shown in Table 6. For the palmprint, the EERs of all four code lengths are 0%. For the palm vein, when the code length is 32, the EER is 0.109%, and all the other EERs are 0%. For dorsal hand vein recognition with the two ROI extraction modes, most of the EERs decrease as the code length increases.

2) Effect of different margin threshold m
In Equation (2), m is a margin threshold used to optimise the distance between image matches. Here, we evaluate different values of m on these databases to find the optimal margin threshold, where the code length is set to 128; the results are shown in Table 7. For the palmprint and the palm vein, the image quality is relatively good and the patterns may be identified more easily, so almost all the EERs for different m are equal to 0%. However, when m is 180, the difference between the maximum distances of genuine matches and the minimum distances of imposter matches is the largest. For dorsal hand vein recognition with the two ROI modes, as m increases, the EERs first decrease and then increase, and the results for m = 180 are also optimal.

1) Necessity of hash coding
In order to verify the impact of the hash coding module, the results of the different fully connected layers before hash coding are compared, as shown in Table 8. There are four fully connected layers, which extract features of 4096, 2048, and 128 dimensions. From the results, for the palmprint and palm vein, all the fully connected layers obtain optimal results with an EER equal to 0%. For the dorsal hand vein, as the network deepens, the performance becomes better. In general, the results of the hashing codes are the best. Furthermore, for the other fully connected layers, the matching time is much longer than that of the hashing codes, which exactly shows the advantage of hash coding.
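The speed advantage comes from the fact that, once codes are packed into machine words, each gallery comparison is a single XOR plus a population count, instead of a floating-point distance over thousands of feature dimensions. A minimal identification sketch under that assumption (names and gallery layout are illustrative):

```python
def identify(query_code, gallery):
    """Return the enrolled label whose packed binary code is nearest to the
    query in Hamming distance; each comparison is one XOR plus a popcount."""
    return min(gallery, key=lambda label: bin(query_code ^ gallery[label]).count("1"))
```

Here `gallery` maps class labels to packed integer codes, so searching even a large enrolment set reduces to cheap integer operations.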

2) Effect of hashing loss and quantization loss
The hashing loss and quantization loss are combined to generate the optimisation objective. Here, experiments are conducted to show their roles. We set different weights, w, for them to calculate EERs, and the results are presented in Table 9. The hashing loss is based on a contrastive loss, which brings the genuine codes close together and pushes the imposter codes far apart. The quantization loss is adopted to reduce the error of the thresholding procedure. Objectively, the hashing loss is the most important part of the optimisation objective and cannot be replaced. Combined with the results above, the quantization loss can make the extracted codes more discriminative. From the results, when the weight of the quantization loss is too large, the distance between the codes cannot be optimised well. However, when its weight is too small, the error caused by thresholding becomes relatively obvious. In this article, we set w = 0.5, for which the performance is optimal.

3) Necessity of fine-tuning
In this article, the weights of VGG-16 pre-trained on ImageNet are fine-tuned to improve the performance. Here, several experiments are conducted to show the necessity of the fine-tuning, and the results are shown in Table 10. From the results, the strategy of fine-tuning can greatly improve the performance, which shows its necessity.

| Performance on unseen samples
In order to evaluate the generalisation ability of our DBHL, experiments on biometrics with unseen samples are conducted, that is, only some categories are used for training. For the NCUT, PolyU-Blue, and PolyU-NIR databases, 1020 images from 102 categories are randomly selected to train the model, and the remaining categories are used for testing. The results are shown in Table 11. The results are not as good as those mentioned above, where all the categories are used for training, but this is not abnormal. It is difficult to extract specific features for untrained categories, which is a tricky and ubiquitous problem. However, palmprint recognition obtains the optimal result, that is, EER = 1.90%, and the performance of the dorsal hand vein is still the worst, which also supports our conclusion about their advantages and disadvantages.

| Performance on other databases
In the above sections, in order to maintain fairness of comparison, databases collected under similar environments were adopted. Here, we conduct experiments on other, more real-life databases. The GPDS dorsal hand vein database was collected by the Digital Signal Processing Group at the University of Las Palmas de Gran Canaria and contains 1030 images from 103 individuals [65]. Tongji is a contactless palmprint database with 12,000 images collected from 600 different palms in two sessions [66]. In order to be consistent with the other experiments, 2040 images belonging to 204 subjects are randomly selected. The IITD palmprint database was collected by a contactless device from 230 individuals [67], with five or six images captured from each. The Xi'an Jiaotong University Unconstrained Palmprint (XJTU-UP) database is an unconstrained database collected with five mobile phones and contains 10 palmprint subdatasets [68]. Here, the SF and SN subdatasets, which have 2000 images from 200 palms, are adopted for the experiments. For the IITD, the first five palmprint images from each palm are employed: three images are used for training and the remaining two for testing. For the other databases, the first five images are selected to train the model and the remaining five to test. The results are shown in Table 12. There are only 1030 images in the GPDS database, and the images are variable and complex, hence its performance is not very good. For Tongji, the DBHL obtains an EER equal to 0%. For the XJTU-UP database, the collection was performed in an unconstrained manner, so it is relatively difficult to identify. In summary, these results also demonstrate the effectiveness of our DBHL.

| Computational complexity
To evaluate the computational complexity, the running times are reported in Table 13. From the results, the traditional methods obtain a lower feature extraction time. The time complexity of the DBHL is relatively high, which is a problem common to deep models. However, it can be observed that the DBHL still meets real-time requirements. In particular, the matching time of the DBHL is superior, which is more important in application.
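The fast matching comes from comparing binary codes by Hamming distance. A minimal sketch (not the authors' implementation) of why this is cheap: with bit-packed codes, matching reduces to an XOR and a table-driven popcount:

```python
import numpy as np

# Sketch of Hamming-distance matching over bit-packed binary codes.
# A 256-entry lookup table gives the number of set bits per byte, so a
# full comparison is one XOR plus a table lookup and sum.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)

def hamming(code_a, code_b):
    """Hamming distance between two bit-packed uint8 code arrays."""
    return int(POPCOUNT[np.bitwise_xor(code_a, code_b)].sum())

# Example: a 128-bit code stored as 16 bytes.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=16, dtype=np.uint8)
d_self = hamming(a, a)   # identical codes -> distance 0
```

Because the comparison involves no floating-point arithmetic, matching a query against a large gallery of registered codes remains fast even as the database grows.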

| Quantitative comparison
To the best of our knowledge, this is the first preliminary comparison of these three types of hand-based biometrics under a single framework; the DBHL serves as a uniform 'ruler' despite the different image qualities and patterns. Firstly, measured by EER, the optimal results of the palmprint and palm vein recognitions reach EER = 0%, while the EER of the dorsal hand vein recognition cannot reach 0%, no matter whether the sROI or the cROI is adopted. In addition, in terms of convergence speed, the palmprint and palm vein recognitions achieve their optimal results after only 10,000 training steps, whereas the EERs of the dorsal hand vein only stabilise after 40,000 to 60,000 steps. The reason may be that the texture of the dorsal hand vein is inherently not rich enough; its infrared imaging method may also cause a loss of detail. Secondly, the maximum distance of genuine matches and the minimum distance of imposter matches are two other important indicators. For the palmprint, the former is smaller and the latter is larger than for the palm vein. This means that palmprint recognition has a bigger gap separating the genuine matches from the imposter matches, which is especially important when the number of samples to be classified increases significantly. In this sense, palmprint recognition is more advantageous than palm vein recognition. For the dorsal hand vein, this kind of gap does not exist at all, and there is an inevitable overlap between the Hamming distances of the genuine and imposter matches. This leads to false acceptances and false rejections, verifying the inadequacies of dorsal hand vein recognition once again.
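The three indicators above can be computed directly from the genuine and imposter distance distributions. The following sketch uses synthetic distances purely for illustration:

```python
import numpy as np

# Sketch: compute EER, the maximum genuine-match distance, and the
# minimum imposter-match distance from two distance samples.
def eer_and_gap(genuine, imposter):
    genuine, imposter = np.asarray(genuine), np.asarray(imposter)
    thresholds = np.unique(np.concatenate([genuine, imposter]))
    best = None
    for t in thresholds:
        far = np.mean(imposter <= t)   # false acceptance rate at t
        frr = np.mean(genuine > t)     # false rejection rate at t
        if best is None or abs(far - frr) < abs(best[1] - best[2]):
            best = (t, far, frr)
    eer = (best[1] + best[2]) / 2      # EER: point where FAR ~ FRR
    return eer, genuine.max(), imposter.min()

gen = [1, 2, 3, 4]        # synthetic genuine-match Hamming distances
imp = [10, 12, 14, 16]    # synthetic imposter-match Hamming distances
eer, g_max, i_min = eer_and_gap(gen, imp)
```

When the two distributions are perfectly separated, as in this toy example, the EER is 0 and the gap `i_min - g_max` is positive; an overlap, as observed for the dorsal hand vein, forces a nonzero EER.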
Thirdly, the features extracted from the convolutional layers are visualised as grey-scale images for comparison. As the convolutional layers deepen, the visually identifiable content gradually diminishes. Therefore, the outputs of layers at the same locations, namely the pool1 and pool2 layers of the pre-trained VGG-16, are selected for visualisation, as shown in Figure 7.
From Figure 7, the texture information of the samples can be roughly visualised in the first few convolutional layers. For the palmprint, there are more obvious texture features, such as the principal lines and the wrinkles. These features can also be found in the palm vein image, which indicates that palm vein recognition also exploits palmprint features. In fact, the palmprint and palm vein images in the PolyU database are collected in a similar manner except for the illumination. Palmprint information can also be captured when palm vein images are collected under infrared lighting, which is one of the reasons for the high recognition rate of the palm vein. For the dorsal hand vein, the infrared acquisition method causes a loss of information, especially texture information. The palmprint and the palm vein are richer in texture than the dorsal hand vein; as a result, their recognition accuracies are relatively higher.
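Rendering an activation map as a grey-scale image is simply a min-max rescaling of one channel's values; a minimal sketch, with a random map standing in for a real pool2 activation:

```python
import numpy as np

# Sketch of how a convolutional feature map is visualised as a grey-scale
# image: min-max normalise one 2-D channel to the [0, 255] range.
# The random input below is illustrative; in the paper the maps come from
# the pool1/pool2 layers of the pre-trained VGG-16.
def to_grayscale(feature_map):
    """Scale a 2-D activation map to uint8 grey levels."""
    fmin, fmax = feature_map.min(), feature_map.max()
    if fmax == fmin:                              # constant map -> mid grey
        return np.full(feature_map.shape, 128, dtype=np.uint8)
    scaled = (feature_map - fmin) / (fmax - fmin)
    return (scaled * 255).astype(np.uint8)

rng = np.random.default_rng(1)
fmap = rng.standard_normal((56, 56))              # e.g. one pool2 channel
img = to_grayscale(fmap)
```

Because the scaling is per-map, strongly textured inputs such as palmprints yield visibly structured images, while low-contrast dorsal hand vein maps look flatter.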

| Qualitative comparison
Besides the preliminary quantitative comparison above, we now discuss the three modalities with intuitive and qualitative comparisons.
First of all, regarding the difficulty of collection, the palmprint is easier to collect than the palm vein and the dorsal hand vein. Veins lie under the skin tissue; therefore, during image acquisition, infrared illumination of a certain intensity must be added in a closed environment, which increases the difficulty. Palmprint recognition is mainly based on visible texture features, so the requirements on the acquisition device are not high; some researchers have even adopted mobile phones to collect and identify palmprints in natural environments [69].
Second, the palmprint has richer textures. Palmprint recognition is based on palm lines and textures [26], while palm vein and dorsal hand vein recognitions are based on the direction and endpoints of the vein pattern [9]. In addition to several clear principal lines, the palmprint also has rich wrinkles and fine ridges, which ensures its high accuracy.
Thirdly, in terms of feature stability, the palm vein and the dorsal hand vein are more stable than the palmprint. Human veins are naturally distributed in subcutaneous tissues, so they are hard to lose or change over time, except through surgery [70]. The palmprint is exposed to the outside world and is often in contact with other objects, so it wears easily and its stability is relatively poor, which is also its biggest drawback.
Finally, in terms of safety, palm vein and dorsal hand vein recognitions are safer. Vein recognition has the advantage of so-called liveness detection: once the blood stops flowing, the vein can no longer be identified. This increases the difficulty and cost of fraud and copying for criminals, thereby enhancing security. Palmprint recognition captures images based on simple optical principles, so it is easy to copy, which is a drawback.
Consequently, in our opinion, from the perspective of simplicity, friendliness, and accuracy in practical applications, the palmprint is more suitable; however, in scenarios where security is extremely demanding, the palm vein or the dorsal hand vein may be a better choice. Furthermore, multimodal biometrics combining the palmprint with other modalities may be a good compromise, offering both high accuracy and high security.

| CONCLUSION
Palmprint, palm vein, and dorsal hand vein recognitions are three advanced hand-based biometrics for personal authentication systems. In this article, for validity and versatility, we propose the DBHL framework to handle these three types of hand-based biometrics uniformly. The DBHL converts biometric images into binary codes in an end-to-end manner. Feature matching can then be performed simply by comparing the Hamming distances between the codes, which ensures high computational efficiency.
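The final step of such a pipeline, quantising the network's real-valued hash-layer outputs into a binary code, can be sketched as a simple sign thresholding; the values and code length below are made up for illustration:

```python
import numpy as np

# Illustrative sketch of the binarisation step in a deep hashing pipeline:
# real-valued hash-layer activations are quantised to a {0, 1} code by sign.
def binarize(hash_outputs):
    """Map real-valued hash activations to a {0, 1} binary code."""
    return (np.asarray(hash_outputs) >= 0).astype(np.uint8)

h = [0.9, -0.3, 0.1, -1.2]   # hypothetical 4-bit hash-layer output
code = binarize(h)            # -> array([1, 0, 1, 0], dtype=uint8)
```

Because the registration templates are stored in this binary form, the gallery is compact and every comparison reduces to a Hamming-distance computation.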
On several widely used benchmark databases, including the PolyU-Blue, PolyU-NIR, and NCUT databases, DBHL outperforms other models and achieves state-of-the-art performance. Palmprint and palm vein recognitions achieve the optimal performance with EER = 0%. For dorsal hand vein recognition, the EER of cROI is 0.196% and the EER of sROI is 0.392%. Besides the EER, the maximum distance of genuine matches and the minimum distance of imposter matches are measured, indicating that the palmprint has superior discriminative ability. In addition, intuitive and qualitative comparisons among the three types of hand-based biometrics are provided. According to the results, in our view, the palmprint is simpler and more convenient for practical application than the palm vein and the dorsal hand vein, while the vein patterns may be more suitable for scenarios with extremely high safety requirements.