Automatic Extraction of Two Regions of Creases from Palmprint Images for Biometric Identification

Palmprint has become one of the biometric modalities that can be used for personal identification. This modality contains critical identification features such as minutiae, ridges, wrinkles, and creases.


Introduction
A biometric identification system automatically identifies an individual based on his or her physical or behavioral characteristics [1,2]. The palmprint is one of the most important physiological modalities, and it can be used for personal identification [3,4]. A palmprint is a photograph of a hand or the impression the hand leaves on a surface [5]. Its region extends from the wrist to the roots of the fingers [5]. The palmprint provides a lot of information that can be used for personal identification [4,6].
Recently, the use of the palmprint for identification has attracted increasing attention from researchers [2,7]. The palmprint has several advantages. Compared with the fingerprint, the palmprint area is larger [2,8] and thus contains more features [1,9]. Palmprint-based biometric systems are low-cost [8] and user-friendly [7], and they provide high accuracy and efficiency [2]. Although many researchers have observed the effectiveness and usefulness of the palmprint in identification, only a few works have reported improvements to palmprint identification for individualization purposes [10].
Various unique features can be extracted from the palmprint, such as crease features, point features, and texture features. Features on the palmprint are stable, as most of them remain unchanged throughout an individual's lifespan [1,7]. Among these features, the crease feature has attracted our attention for biometric recognition. Creases are discontinuities in the epidermal ridge patterns [11] that develop during gestation [12]. They are formed during the embryonic skin development stage [11], and they are permanent and unique [13]. These creases divide the palm into several regions, as shown in Figure 1. The main regions of the palm are the hypothenar, thenar, and interdigital regions.
The crease feature is a stable and important feature that provides rich information for palmprint recognition [6,7]. Identification methods based on creases have shown a promising recognition rate that is comparable with methods based on other features such as the face and fingerprint [6]. Creases have attracted the attention of many researchers because they are the most clearly observable feature even when captured at low resolution (e.g., 100 dpi) [2]. Although crease features are salient features of the palmprint, identification based on creases is still not very common [13,14]. In palmprint examination, creases commonly serve as supporting information to identify or eliminate distorted latent palmprints [13].
Palmar flexion creases are important crease features in palmprint images [6]. They are among the external anatomical landmarks of the hand [15] and represent regions where the skin is more firmly attached to the basal skin structure (i.e., the dermis) [11,15]. Most existing works have focused on the principal lines of the palmprint [6,9,10,16-18]. Unfortunately, these studies did not utilize significant portions of the palmprint. Principal lines are genetically dependent, whereas most of the other creases are not [19]. These nongenetically deterministic features are still very useful, and for this reason, our study focuses on the other creases of the palm.
In our project, two regions of the palmprint's creases, each of size 2 cm × 2 cm, are used as our features. One region is located in the hypothenar region, while the other is in the interdigital region. As the manual extraction of these regions is tedious, time-consuming, inconsistent, and prone to errors, an automatic region of interest (ROI) extraction technique is proposed in this manuscript. An automatic technique not only makes the ROI extraction simple but also standardizes the features and can reduce the problems caused by palmprints that are not correctly aligned. This paper is organized as follows. Section 2 presents our methodology. Then, the experimental results are presented in Section 3. Finally, Section 4 concludes our findings.

Methodology
The palmprint images used in our research are acquired with a general-purpose scanner, a Canon E400 series. The images are scanned at 300 dpi × 300 dpi and saved in JPEG format. They are 24-bit-per-pixel color images with a size of 2488 × 3484 pixels (i.e., around 8.6 megapixels).
Examples of the input images used in this experiment are shown in Figure 2. As shown in this figure, in addition to the palm, there are two rulers in each image, which are used to measure the geometry of the palm. One ruler is always located along the top of the image, while the other is located on either the left or the right side. There are also six markers, or pegs, in each image; their colors vary between images. These markers help each person align their hand during image acquisition.
We asked the volunteers to place their hands on the glass surface of the scanner. Then, to minimize the effects of external light, the hand was covered with a plain black cloth. The lighting comes solely from the internal lighting of the scanner. However, some of the images, such as the one in Figure 2(a), appear brighter than others. This is due to improper positioning of the hand or improper covering with the black cloth, which allows ambient light to penetrate the system and interfere with the images. Figure 3 shows the block diagram of the proposed method. The input of this system is a color image F. The outputs are two ROIs: the ROI on the hypothenar region, ROI_1, and the ROI on the interdigital region, ROI_2. The method has been implemented in C# in Microsoft Visual Studio. As shown in the figure, the proposed method consists of 12 main blocks, which are explained in the following subsections.
2.1. Image Downsampling by Factor 10. Because the resolution of the original input image F is around 8.6 megapixels (i.e., 2488 × 3484 pixels), a long processing time is required to process the whole image. Thus, to reduce the computational burden, image F is downsampled by a factor of 10. The output of this process is a downsampled image f with dimensions of 248 × 348 pixels (i.e., floor(2488/10) × floor(3484/10) pixels), or around 0.086 megapixels. The area of image f is therefore only 1% of the area of the original image F. The downsampling by a factor of 10 is given by

f(x, y, c) = F(10x, 10y, c),  (1)

where x and y are the spatial coordinates and c is the color channel (i.e., red (R), green (G), or blue (B)). Coordinates (x, y) = (0, 0) are located at the top left of the image. The resolution of the image now becomes 30 dpi × 30 dpi.
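As a rough illustration, the decimation in equation (1) could be sketched as follows. This is a Python/NumPy sketch, not the authors' C# implementation; the function name and array layout (height × width × channels) are illustrative assumptions.

```python
import numpy as np

def downsample(F: np.ndarray, factor: int = 10) -> np.ndarray:
    """Plain decimation: f(x, y, c) = F(factor*x, factor*y, c).

    No anti-aliasing filter is applied, matching the simple
    subsampling described in Section 2.1. The output size is
    floor(H/factor) x floor(W/factor).
    """
    h, w = F.shape[0] // factor, F.shape[1] // factor
    return F[:h * factor:factor, :w * factor:factor]

# A synthetic stand-in for the 2488 x 3484 scan (rows x cols x channels).
F = np.zeros((3484, 2488, 3), dtype=np.uint8)
f = downsample(F)
```

Taking every tenth pixel keeps roughly 1% of the pixels, which matches the stated area reduction.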

2.2. Conversion from Color to Grayscale Image. To further reduce the processing time, the color image f is then converted to a grayscale image g. With this conversion, the three-channel image becomes a one-channel image.

Journal of Sensors
Although many methods can be employed for this conversion, in this research, we perform the conversion by keeping only the red color channel (R). This is because we assume that human skin (i.e., the palm) is more dominant in the red channel (R) than in the blue (B) or green (G) channels. In addition, to reduce the unwanted effect of high-intensity values from the rulers in the image, which might deteriorate the performance of the subsequent segmentation process, the regions of the rulers are cropped out. To do so, we inspected image f and found that the rulers occupy only a 25-pixel band of rows from the top and a 25-pixel band of columns from the left. These areas are therefore given intensity 0, which is similar to the background intensity value. Image g is given by

g(x, y) = 0 if x < 25 or y < 25, and g(x, y) = f(x, y, R) otherwise, for 0 ≤ x < W_f,  (2)

where R is the red color channel and W_f is the width of image f (in this case, W_f is 248 pixels).
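The channel selection and ruler cropping in equation (2) could be sketched as follows; this is an illustrative Python/NumPy version that assumes RGB channel order (red is channel 0).

```python
import numpy as np

def to_gray_red(f: np.ndarray, ruler_px: int = 25) -> np.ndarray:
    """Keep only the red channel and zero out the ruler strips.

    The top `ruler_px` rows and left `ruler_px` columns, where the
    rulers lie in the downsampled image, are set to 0, the assumed
    background intensity.
    """
    g = f[:, :, 0].copy()   # channel 0 assumed to be red (RGB order)
    g[:ruler_px, :] = 0     # top ruler band
    g[:, :ruler_px] = 0     # left ruler band
    return g

# Uniform synthetic palm-colored image of the downsampled size.
f = np.full((348, 248, 3), 200, dtype=np.uint8)
g = to_gray_red(f)
```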

2.3. Segmentation of the Hand Region.
In this research, the hand region is identified by simple thresholding. As the hand region is brighter than the background, the hand mask M_1 is defined as

M_1(x, y) = 1 if g(x, y) > T_1, and 0 otherwise,  (3)

where T_1 is the threshold level. The value of T_1 lies between 80 and 120, and T_1 = 100 works in most cases. From equation (3), the value of M_1(x, y) is 1 for the hand region and 0 for the background.
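A minimal sketch of the thresholding in equation (3), in illustrative Python/NumPy:

```python
import numpy as np

def hand_mask(g: np.ndarray, T1: int = 100) -> np.ndarray:
    """Binary hand mask: M1(x, y) = 1 where g(x, y) > T1, else 0."""
    return (g > T1).astype(np.uint8)

# Tiny example: two dark pixels (background) and two bright pixels (hand).
g = np.array([[50, 150],
              [90, 200]], dtype=np.uint8)
M1 = hand_mask(g, T1=100)
```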

2.4. Removal of the Fingers' Region.
The creases that we are interested in are located on the palm. Therefore, the regions of the fingers are deleted from M_1. We assume that the area of the palm is bigger than 50 × 50 pixels in image g. To do so, we use two temporary masks, m_1 and m_2. Inspecting row by row, only horizontal runs of value 1 longer than 50 pixels are copied to m_1. Similarly, inspecting column by column, only vertical runs of value 1 longer than 50 pixels are copied to m_2. The output of this process is M_2, which is defined as

M_2(x, y) = m_1(x, y) × m_2(x, y),  (4)

where T_2 is the first quartile of the values of h(x, y) inside the region defined by M_2(x, y).

2.7. Detection of Palmar Crease. In this research, we assume that the palmar crease is a straight line passing through the palm from the left side to the right side. Therefore, we use a Hough line transformation to find this straight line. The Hough image (i.e., a 2D accumulator) used for this transformation is indexed by the radius r and the angle ϕ. Initially, all bins in the Hough image are empty. The range of r considered in this experiment is the integer values from −300 to 300. Next, for each defined value in M_3 (i.e., M_3(x, y) = 1), the value of r is calculated as

r = x cos ϕ + y sin ϕ,

for ϕ from 0° to 359.5° with a step size of 0.5°. Based on this value, the bin of the Hough image at coordinates (ϕ, r) is incremented by one. After all points with value 1 in M_3 have been transformed, we find the bin with the maximum value, restricting the search to ϕ between 70° and 80° only. This bin gives us the coordinates (ϕ_max, r_max). From this bin, we then track back all points in M_3 that contributed to it. These points are given value 1 in a temporary mask M_4, while all other points are given value 0.
As these points might not form a single straight line, we refine them by using a morphological closing operation with a square structuring element of size 5 × 5 pixels for both the dilation and erosion operations.
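The voting step above can be sketched as follows. This is a simplified Python/NumPy illustration, not the authors' C# code: for brevity it accumulates votes only over the 70° to 80° search window instead of the full angular range, which yields the same (ϕ_max, r_max).

```python
import math
import numpy as np

def detect_crease_line(M3: np.ndarray):
    """Hough transform over phi in [70, 80] degrees, step 0.5.

    Returns (phi_max, r_max) of the strongest straight line through
    the foreground pixels of M3, with r = x*cos(phi) + y*sin(phi)
    and integer radius bins from -300 to 300.
    """
    phis = np.arange(70.0, 80.0 + 0.5, 0.5)
    r_range = np.arange(-300, 301)
    acc = np.zeros((len(phis), len(r_range)), dtype=int)
    ys, xs = np.nonzero(M3)
    for x, y in zip(xs, ys):
        for i, phi in enumerate(phis):
            rad = math.radians(phi)
            r = round(x * math.cos(rad) + y * math.sin(rad))
            if -300 <= r <= 300:
                acc[i, r + 300] += 1     # vote in bin (phi, r)
    i, j = np.unravel_index(np.argmax(acc), acc.shape)
    return phis[i], r_range[j]

# Synthetic mask: 41 points lying on the line x*cos75 + y*sin75 = 50.
M3 = np.zeros((60, 200), dtype=np.uint8)
for yy in range(41):
    xx = round((50 - yy * math.sin(math.radians(75))) / math.cos(math.radians(75)))
    M3[yy, xx] = 1
phi_max, r_max = detect_crease_line(M3)
```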
It is also possible to use the information from ϕ_max to correct the alignment of the hand so that the palmar crease lies horizontally. We have also rotated the original image by 180° so that it is easier to analyze the image by visual inspection. Therefore, the rotation angle θ obtained from this step is defined as

θ = ϕ_max − 90°.  (8)

2.8. Rotation of Palmar Crease. To rotate the image, we consider the center of the image, located at (W_f/2, H_f/2), as the new origin (0, 0), where W_f is the width and H_f is the height of image f. Then, for each coordinate (x, y) on the rotated mask M_5, the corresponding coordinate (x_o, y_o) on mask M_4 is found using

x_o = (x − W_f/2) cos θ + (y − H_f/2) sin θ + W_f/2,
y_o = −(x − W_f/2) sin θ + (y − H_f/2) cos θ + H_f/2.

From here, the rotated mask M_5 is defined as

M_5(x, y) = M_4(x_o, y_o).

Information from M_6 is also used in this stage. Coordinates P_1 = (x_1, y_1) denote the point located on the left side of the line. It is defined as the left-most point in M_5 with M_6(x−1, y−1) = 0, M_6(x−1, y) = 0, and M_6(x−1, y+1) = 0. On the other hand, P_2 = (x_2, y_2) is defined as the right-most point, located more than 50 pixels from P_1, with M_6(x+1, y−1) = 0, M_6(x+1, y) = 0, and M_6(x+1, y+1) = 0.
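The inverse-mapping rotation above can be sketched as follows (illustrative Python/NumPy; the paper's exact sign convention is an assumption here, and nearest-neighbour sampling is used to keep the mask binary).

```python
import math
import numpy as np

def rotate_mask(M4: np.ndarray, theta_deg: float) -> np.ndarray:
    """Rotate a binary mask about the image centre by inverse mapping.

    For every destination pixel (x, y), the source pixel (xo, yo) in
    M4 is found by rotating (x, y) around (W/2, H/2); out-of-range
    sources are left as 0.
    """
    H, W = M4.shape
    cx, cy = W / 2.0, H / 2.0
    t = math.radians(theta_deg)
    M5 = np.zeros_like(M4)
    for y in range(H):
        for x in range(W):
            xo = round((x - cx) * math.cos(t) + (y - cy) * math.sin(t) + cx)
            yo = round(-(x - cx) * math.sin(t) + (y - cy) * math.cos(t) + cy)
            if 0 <= xo < W and 0 <= yo < H:
                M5[y, x] = M4[yo, xo]
    return M5

M4 = np.zeros((20, 20), dtype=np.uint8)
M4[:, 10] = 1                   # vertical line through column 10
M5 = rotate_mask(M4, 90.0)      # a 90 degree rotation makes it horizontal
```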

Point P_1 corresponds to the hypothenar area. Based on this point, an ROI of size 24 × 24 pixels is defined in the coordinate range x_1 ≤ x < x_1 + 24 and y_1 ≤ y < y_1 + 24. Within this region, based on mask M_6, the number of nonpalm pixels is counted. If the number of nonpalm pixels is greater than 48, the procedure is repeated after shifting P_1 one pixel to the right. The location of P_1 where this requirement is fulfilled is denoted as P_a = (x_a, y_a).
Similarly, point P_2 corresponds to the interdigital area. Based on P_2, an ROI of size 24 × 24 pixels is defined in the coordinate range x_2 − 24 < x ≤ x_2 and y_2 − 24 < y ≤ y_2. Within this region, based on mask M_6, the number of nonpalm pixels is counted. If the number of nonpalm pixels is greater than 48, the procedure is repeated after shifting P_2 one pixel to the left. The location of P_2 where this requirement is fulfilled is denoted as P_b = (x_b, y_b).
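The window-shifting search above can be sketched as follows (illustrative Python/NumPy; the function name `anchor_roi` and the failure return value are assumptions, and P_2 would use `step = -1` with a left-extending window).

```python
import numpy as np

def anchor_roi(M6: np.ndarray, p, size: int = 24,
               max_nonpalm: int = 48, step: int = 1):
    """Slide a size x size window from P1 until it contains at most
    `max_nonpalm` non-palm pixels (M6 == 0), one pixel per step."""
    x, y = p
    H, W = M6.shape
    while 0 <= x and x + size <= W:
        window = M6[y:y + size, x:x + size]
        if (window == 0).sum() <= max_nonpalm:
            return x, y            # acceptable window found: this is Pa
        x += step
    return None                    # no acceptable window found

# Synthetic palm mask: palm occupies columns 20..59 only.
M6 = np.zeros((30, 60), dtype=np.uint8)
M6[:, 20:] = 1
Pa = anchor_roi(M6, (0, 0))
```

Starting from x = 0, the first window with at most 48 background pixels is the one whose left edge leaves only two background columns (2 × 24 = 48 pixels) inside it.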

Define Actual Regions.
As the ROI on the mask is of size 24 × 24 pixels, this ROI must be upsampled by a factor of 10 for image F. Therefore, the actual size of each ROI is 240 × 240 pixels. The points P_a and P_b are also multiplied by 10 to find the corresponding points at the original resolution. These new points are denoted as P_A and P_B, respectively.
(x_A, y_A) = (10x_a, 10y_a).  (14)

A region on the hypothenar is defined based on point P_A. This region is in the range x_A ≤ x < x_A + 240 and y_A ≤ y < y_A + 240. For each point in this region, the quantities in equations (15) and (16) are calculated.

Therefore, the ROI on the hypothenar region, ROI_1, is defined from these quantities. Similarly, a region on the interdigital area is defined based on point P_B. This region is in the range x_B − 240 < x ≤ x_B and y_B − 240 < y ≤ y_B. For each point in this region, equations (15) and (16) are calculated, and the ROI on the interdigital region, ROI_2, is defined accordingly. ROI_1 and ROI_2 are then saved as separate files in BMP format for the subsequent identification processes.
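The coordinate ranges above amount to two 240 × 240 crops of the full-resolution image, which could be sketched as follows (illustrative Python/NumPy; the BMP writing step, for example via an image library, is omitted).

```python
import numpy as np

def crop_rois(F: np.ndarray, PA, PB, size: int = 240):
    """Cut the two size x size ROIs out of the full-resolution image F.

    ROI_1 extends right/down from PA = (xA, yA), covering
    xA <= x < xA + size; ROI_2 extends left/up from PB = (xB, yB),
    covering xB - size < x <= xB, mirroring the mask-level windows.
    """
    xA, yA = PA
    xB, yB = PB
    roi1 = F[yA:yA + size, xA:xA + size]
    roi2 = F[yB - size + 1:yB + 1, xB - size + 1:xB + 1]
    return roi1, roi2

# Synthetic full-resolution image with hypothetical anchor points.
F = np.zeros((700, 700, 3), dtype=np.uint8)
roi1, roi2 = crop_rois(F, (10, 20), (300, 400))
```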

Results and Discussion
This section is divided into two subsections. Subsection 3.1 presents qualitative (subjective) evaluations, together with results from some stages of the method. Subsection 3.2 presents quantitative (objective) evaluations.

3.1. Qualitative Evaluation.
After the image is downsampled by a factor of 10, it is converted to a grayscale image by keeping only the red color channel. Figure 4 shows each of the color components of one of the images used in this experiment. As shown in this figure, the hand has the best contrast in the red channel compared with the other two color channels. Therefore, the use of the red channel to produce g is appropriate in this research. The images then undergo thresholding to define the hand-region mask M_1. The results obtained with different values of T_1 in equation (3) are shown in Figure 5. As shown in this figure, the input image in Figure 5(a) requires a high threshold value, T_1 = 120, to correctly separate the hand from the background. This is because, during image acquisition, the scanner lid was not completely closed, which caused higher illumination at the base of the hand. Thus, T_1 values of 80 and 100 failed to segment the hand in this case. Therefore, for this input image, T_1 = 120 is selected.
For the input images shown in Figures 5(b) and 5(c), although all tested threshold values separate the hand region, T_1 = 100 gives the best result. At T_1 = 80, the segmented region is bigger, and there is a potential that some background regions at the hand edges are included in the hand region. At T_1 = 120, the defined region is smaller, and thus some hand regions at the edges might be excluded. Therefore, for these images, T_1 = 100 is selected.
As there are intensity variations among the input images, the user is given the flexibility to choose the threshold level T_1. Three options, 80, 100, and 120, can be set from a graphical user interface, with 100 as the default value. As the M_1 mask is not shown to the user, the threshold value is mostly selected based on which T_1 value gives the best output ROIs.
The fingers' region on mask M_1 is then removed by using equation (4). Figure 6 shows the mask M_2 obtained from this step. The green squares present the detected ROIs. As shown by these figures, both ROIs have been located correctly on all test images. It is worth noting that the method also performs well for hands wearing rings or other ornaments, as shown in Figure 8.

Quantitative Evaluation. The performance of the proposed method is measured using sensitivity, specificity, and accuracy, defined as

sensitivity = TP / P,
specificity = TN / N,
accuracy = (TP + TN) / (P + N),

where TP is the number of true positive pixels, TN is the number of true negative pixels, P is the number of real positive pixels in the image, and N is the number of real negative pixels in the image. These measures are used to evaluate the extracted ROI_1 (hypothenar region), the extracted ROI_2 (interdigital region), and both regions together. The definitions of TP, TN, P, and N for these three cases are given in Table 1. As these equations show, all measures require information from the ground truth. Therefore, in this experiment, we created the ground truth data by manually segmenting the 101 palmprint images using our previously developed ROI segmentation tool [20]. The ROIs from this manual segmentation are considered the ground truth. Table 2 shows the sensitivity, specificity, and accuracy values obtained from the 101 palmprint images. As shown in this table, the proposed method has good performance in terms of specificity and accuracy, with values near 1 for all input images. However, the sensitivity of the method ranges from 0.4389 to 0.9716 for ROI_1, from 0.5086 to 0.9782 for ROI_2, and from 0.5033 to 0.9327 for both ROIs. This indicates that the extraction of ROI_1 is more difficult than that of ROI_2. Figure 10 shows some of the differences between the ROIs detected by the proposed method and their corresponding ground truths. Figure 10(a) presents a case where the extraction of both ROIs is not good.
For this figure, the sensitivity of ROI_1 is 0.4389, the sensitivity of ROI_2 is 0.5678, and the sensitivity for both regions is 0.5033. Figure 10(b) shows a case with a good extraction of ROI_2 (i.e., the interdigital region) but a poor extraction of ROI_1 (i.e., the hypothenar region); here, the sensitivity of ROI_1 is 0.4650, the sensitivity of ROI_2 is 0.9158, and the sensitivity for both regions is 0.6904. Figure 10(c) shows a good extraction of both ROIs, with a sensitivity of 0.8769 for ROI_1, 0.7865 for ROI_2, and 0.8317 for both regions. Figure 10(d) presents a case where the extracted ROIs are almost the same as the ground truth, with a sensitivity of 0.9590 for ROI_1, 0.9064 for ROI_2, and 0.9327 for both regions.

Table 1: Definitions of TP, TN, P, and N for the three evaluation cases (ROI_1: hypothenar region; ROI_2: interdigital region; both: hypothenar and interdigital regions together).

TP: number of pixels that define the region(s) of the case in both the output and the ground truth.
TN: number of pixels that do not define the region(s) of the case in both the output and the ground truth.
P: number of pixels that define the region(s) of the case in the ground truth.
N: number of pixels that do not define the region(s) of the case in the ground truth.
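The pixel-wise sensitivity, specificity, and accuracy used above can be computed from a predicted mask and its ground truth as follows (illustrative Python/NumPy sketch; masks are assumed binary, with 1 marking the ROI).

```python
import numpy as np

def evaluate(mask_out: np.ndarray, mask_gt: np.ndarray):
    """Pixel-wise sensitivity, specificity, and accuracy of a
    predicted ROI mask against a manually segmented ground truth."""
    out = mask_out.astype(bool)
    gt = mask_gt.astype(bool)
    TP = np.logical_and(out, gt).sum()       # ROI in both masks
    TN = np.logical_and(~out, ~gt).sum()     # non-ROI in both masks
    P = gt.sum()                             # ROI pixels in ground truth
    N = (~gt).sum()                          # non-ROI pixels in ground truth
    return TP / P, TN / N, (TP + TN) / (P + N)

# Toy example: ground-truth ROI covers rows 0-4, prediction covers rows 0-3.
gt = np.zeros((10, 10)); gt[:5, :] = 1
out = np.zeros((10, 10)); out[:4, :] = 1
sens, spec, acc = evaluate(out, gt)
```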

Conclusion
This paper presents a new technique to extract two regions from a palm image. The technique is fully automatic. However, as the threshold value T_1 in equation (3) plays an important role in this method, the user is still given the freedom to select this value. In addition, the technique's ability to align the image based on the detected palmar crease makes the extracted data more standardized and makes feature extraction simpler than manual extraction. The extracted features can be fed into a machine learning algorithm for biometric identification.

Data Availability
The image data used to support the findings of this study are available from the corresponding author upon request.

Disclosure
The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of the data; in the writing of the manuscript; and in the decision to publish the results.

Conflicts of Interest
The authors declare no conflict of interest.