Improved Dynamic Threshold Method for Skin Colour Detection Using Multi-Colour Space

This paper presents a skin colour detection method based on an improved dynamic threshold to reduce false skin detection. Current fixed-threshold skin detection fails in certain situations, such as when non-skin objects have skin-like colours, and true skin may also be falsely detected as non-skin. Recent research has introduced high-level skin detection strategies based on online sampling, where offline training is not required. This strategy shows promising performance in classifying images under skin-like and ethnicity variations. However, some of these methods produce high false positives, which reduce the accuracy of skin detection. Therefore, in this study, instead of a single colour space and a fixed threshold, an improved skin detection method based on multi-colour spaces is proposed. Furthermore, the dynamic threshold method is improved by introducing an elastic elliptical mask model for online skin sampling. The experimental results show that employing multi-colour spaces rather than a single colour space reduces the false positives and increases the precision rate.


Introduction
The human body is made up of many parts, and skin is one of them; it is the largest organ in the human body. Skin colour detection has been studied extensively over the years and is frequently used in applications such as security, gaming and Human Computer Interaction (HCI). Applications such as face detection (Kovac et al., 2003), illicit content filtering (Fleck et al., 1996; Lee et al., 2006), facial recognition (Hsu et al., 2002), steganography (Cheddad et al., 2009) and Content-Based Image Retrieval (CBIR) (Mofaddel and Sadek, 2010; Wen et al., 2009) use skin detection as their primary step. The main purpose of skin colour detection is to determine the skin pixels in the image and generate a skin region by discriminating between skin and non-skin pixels. The detected skin region is then examined according to the specific application (Abdullah-Al-Wadud et al., 2009). In the past, numerous skin detection techniques have been introduced and successfully applied for skin tone detection using colour information. Hence, skin colour information has become an important cue for extracting skin pixels in image processing applications. Numerous colour spaces are in use nowadays, such as RGB, YCbCr, HSV, HSI, normalised RGB and CMYK (Gonzales and Woods, 2002). RGB and YCbCr are the most common colour spaces used in skin models (Hsu et al., 2002; Khan et al., 2012; Phung et al., 2002; Vezhnevets et al., 2003; Wong et al., 2003). However, skin colour detection is often affected by image variations such as different illumination, skin-like objects, ethnicity, camera characteristics and complex backgrounds (Kakumanu et al., 2007).
Many skin colour detection techniques have been proposed, and they can be grouped into four: explicit threshold, parametric, non-parametric and dynamic skin modelling (Bianco et al., 2013; Osman et al., 2012; Kakumanu et al., 2007). Normally, these techniques are either pixel-based or region-based. According to the literature, the pixel-based approach is the most widely used since it is computationally cheaper and robust against rotation and partial occlusion. The explicit threshold method is the fastest and simplest form of skin colour modelling: it applies single or multiple thresholds to decide whether a pixel is skin or non-skin. The threshold values are often obtained from empirical training, whereby researchers train on a large image dataset to find optimal boundaries. Since the thresholds depend greatly on the training dataset, it is difficult to achieve better accuracy with this method. Skin-like objects such as leather, wood or sand could be mistakenly classified as skin.
Instead of one colour space, combining multiple colour spaces in a skin colour model also shows promising results. Several researchers have combined multiple colour spaces to build a skin colour model (Abdul Rahim et al., 2006; Samart et al., 2011; Wang and Yuan, 2001; Xiang and Suandi, 2013; Zhu et al., 2012). For instance, Xiang and Suandi (2013) introduced a fusion of multi-colour spaces for skin segmentation using the YCbCr-YUV and RGB-YUV colour spaces. It was found that the RGB-YUV skin colour model is better at handling images with a complex background. Meanwhile, Samart et al. (2011) introduced a new rule for face detection based on an RGB-HSV-YCbCr skin colour model. This showed a significant improvement in skin detection with the multi-colour space model. Unfortunately, their skin colour models were fixed and required offline training. This training needed a large image dataset in order to visualise the skin colour distribution.
This paper is presented in five major sections. Firstly, the related works are discussed, and then the details of face-based adaptation are elaborated. The next part of the paper introduces the implementation details of the proposed method. Following this, the results obtained from the experiments are discussed. Finally, the paper concludes with some recommendations for future work.

Related Works
For the past few years, researchers have shifted to a dynamic or adaptive approach based on face or hand adaptation, as detailed in Table 1 (Bianco et al., 2013; Hsieh et al., 2012; Hwang et al., 2013; Ibrahim et al., 2012; Tan et al., 2012; Taylor and Morris, 2014; Yogarajah et al., 2012). This approach requires no offline training of skin samples and is less complex. The motivation of this approach is that the skin samples are obtained directly from the image using a face or eye detector, which is known as the online skin sampling technique. Adapting the skin colour model to the faces in the image gives many benefits in terms of performance, and no offline training is needed. The skin colour models are generated from the individual faces. Yogarajah et al. (2012) and Ibrahim et al. (2012) proposed dynamic skin detectors using the threshold method. It was reported that there were many 'black spots' in the segmented area, which are false positive pixels (Yogarajah et al., 2012). This dynamic skin detector classified many non-skin pixels as skin, even though the method classified different skin tones better than an explicit static cluster. Therefore, an improved skin colour detection based on the dynamic threshold with a combination of multi-colour spaces is proposed in this study. Initially, four types of face mask were analysed to determine which one generates the fewest non-skin pixels. An elastic elliptical mask model based on eye angles was also introduced.

Face-Based Adaptation
As shown in Table 1, numerous skin detection methods based on face adaptation have been proposed. These face-based adaptations were employed under different approaches and colour spaces. Some researchers used the threshold approach because it is less complex and easier to implement. In addition, different face mask models have been adopted to obtain the skin sample from the detected face. Most researchers implemented rectangular, circular or elliptical shapes for the face mask generation. Only Kawulok (2008) used a trapezium face mask to extract the face skin sample. Notice that the human face can be detected without colour information, which is an advantage of the face-based adaptation approach. Most researchers adopt the Viola and Jones (2004) and Fasel et al. (2005) face detectors as their main pre-processing phase.

Face Skin Mask
Initially, four types of face mask model, presented in Fig. 1, were studied. For our analysis, the Viola-Jones face detector was chosen to locate the human faces in the colour image.
Figure 1 illustrates the skin samples obtained using the different face mask models. Figure 1(a) is the face region at the original dimensions of the detected rectangle. Fig. 1(b) and (c), on the other hand, are reduced-dimension regions, reduced according to the parameters [0.25, 0.2] and [0.36, 0.36] respectively. Reducing the dimensions reasonably reduces the non-skin regions. An empirical analysis was done to measure the percentage of non-skin pixels detected in all the skin samples. The purpose was to select a suitable face mask model for the Region of Interest (ROI) during the skin colour extraction process.
Based on the empirical analysis, reducing the dimensions of the face region greatly reduces the non-skin pixels. As illustrated in Fig. 1, skin is concentrated within an oval region because the human face is oval in shape, so an elliptical mask captures less background. Slightly rotated face views were also taken into consideration, as skin pixels are still highly likely to occur in slightly rotated faces. To handle this, an elastic elliptical mask model based on the eye angle was introduced. By employing this face mask model, the number of skin pixels sampled from the face region can be increased, while non-skin pixels such as background, hair and lips can be reduced significantly. This process is important for generating optimum threshold values to be used in the dynamic skin classification.

The Proposed Method
Figure 2 presents the overall structure of our proposed dynamic skin detector using a combination of multi-colour spaces. The three colour spaces used are RGB, YCbCr and HSV. The conversions between these colour spaces can be found in Vezhnevets et al. (2003). The YCbCr and HSV colour spaces are defined in Equations 1 and 2. The initial experiment showed that combining colour spaces in our skin detector outperforms a single colour space by reducing the false positives and increasing the precision rate.
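As an illustration of the colour space conversion step, the following is a minimal sketch in Python. It assumes the widely used full-range ITU-R BT.601 (JFIF) coefficients for YCbCr and the standard library `colorsys` conversion for HSV; the exact form of Equations 1 and 2 may differ, and the function names are hypothetical.

```python
import colorsys


def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (assumed BT.601/JFIF coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def rgb_to_hsv(r, g, b):
    """Convert 8-bit RGB to HSV, with H, S and V each scaled to [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


def to_multi_colour_space(r, g, b):
    """Return one pixel expressed in all three colour spaces as a dict."""
    y, cb, cr = rgb_to_ycbcr(r, g, b)
    h, s, v = rgb_to_hsv(r, g, b)
    return {"R": r, "G": g, "B": b,
            "Y": y, "Cb": cb, "Cr": cr,
            "H": h, "S": s, "V": v}
```

Representing each pixel as a dictionary of channel values makes it straightforward to pick any combination of channels (for example Cb, Cr, S and V) later on.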

Skin Colour Extraction
In face-based skin colour detection, skin samples are obtained directly from the image. As mentioned before, an elastic elliptical mask model based on the eye angle was introduced. According to our initial findings, the elliptical mask model generates better skin segmentation with fewer false positives, as shown in Table 2. The ellipse is rotated according to the eye angle, as shown in Fig. 4. Parameter 'd' is the distance between the two eyes, while 'a' and 'b' are the major and minor axes of the ellipse. Value 'c' is the degree of rotation.
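The geometry of the elastic elliptical mask can be sketched as follows. This is only an illustration under several assumptions that the text does not state: 'a' = 1.2d and 'b' = 1.1d (the values given in the caption of Fig. 4) are treated as semi-axes, the major axis is taken to be perpendicular to the line joining the eyes, and the ellipse centre is passed in as a hypothetical `centre` parameter.

```python
import math


def eye_geometry(left_eye, right_eye):
    """Return (d, c): the eye distance and the rotation angle in radians."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)


def in_elliptical_mask(point, left_eye, right_eye, centre):
    """True if `point` lies inside the ellipse rotated by the eye angle."""
    d, c = eye_geometry(left_eye, right_eye)
    a, b = 1.2 * d, 1.1 * d  # assumed semi-axes (major, minor)
    x, y = point[0] - centre[0], point[1] - centre[1]
    # Rotate the point into the ellipse frame.
    u = x * math.cos(c) + y * math.sin(c)   # along the eye line (minor axis)
    v = -x * math.sin(c) + y * math.cos(c)  # perpendicular to it (major axis)
    return (u / b) ** 2 + (v / a) ** 2 <= 1.0


def elliptical_mask(width, height, left_eye, right_eye, centre):
    """Binary mask (rows of 0/1) over a width x height pixel grid."""
    return [[1 if in_elliptical_mask((x, y), left_eye, right_eye, centre) else 0
             for x in range(width)]
            for y in range(height)]
```

Because the mask is defined relative to the eye line, a tilted face rotates the ellipse with it instead of pulling background pixels into the sample.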
The aim of this process is to extract as many as possible of the skin pixels that exist in the face region. However, the face region may contain non-skin pixels such as lips, eyebrows and hair that need to be removed. These non-skin pixels can be identified by skin-region smoothing, which uses the Sobel edge detector to filter them out. The pseudo-code for the proposed skin colour detection can be found in Algorithm 1.
The Sobel edge detector was used since it gave a better outcome in detecting the edges in the image than other existing edge detectors. The Sobel edge detector was applied to the region in Fig. 5(b), which is the elastic elliptical mask region, followed by a dilation process to expand the detected edges. The white pixels in Fig. 5(c) mark the possible non-skin pixels that need to be removed. Finally, a smooth skin region with minimal non-skin pixels was generated. It was then converted into the YCbCr and HSV colour spaces, as presented in Fig. 5(e) and 5(f).
However, the smooth skin region may still contain non-skin pixels, since there is no guarantee that all of them were removed by the previous process. Therefore, a histogram analysis using the two-sided 95% confidence interval of a normal distribution N(μ, σ²) was carried out for each of the colour components to determine the accepted region, and the pixels within it were classified as skin. The range covering the maximum distribution of skin pixels in the histogram (within the 95% confidence interval) was taken as the threshold values. Therefore, n rules of threshold values were implemented based on the multi-colour space. In order to obtain the skin locus cluster, the skin colour distributions were represented in the form of histograms. The 95% confidence interval is calculated from the mean and standard deviation of each channel, and gives the threshold value T for that channel (for example, the 'G' colour channel).
Figures 6 and 7 illustrate the 1D colour distributions of the 'R' and 'G' channels obtained from the smooth skin sample of the detected face. The experiments show that the maximum of the 'R' distribution is always greater than the G and B values. Therefore, skin colour in the RGB colour space is modelled such that R is greater than G (R > G) and R is greater than B (R > B).
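The dynamic threshold computation described here can be sketched as follows. This is a minimal illustration that assumes the two-sided 95% interval takes the textbook form μ ± 1.96σ for a normal distribution; the exact formula used in the original equation is not reproduced in this text, and the function names are hypothetical.

```python
import statistics

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def dynamic_threshold(samples):
    """Return (lower, upper) bounds as mean -/+ 1.96 standard deviations."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)  # population standard deviation
    return mu - Z_95 * sigma, mu + Z_95 * sigma


def channel_thresholds(skin_pixels, channels):
    """Compute one (lower, upper) pair per colour channel.

    skin_pixels: list of dicts mapping a channel name to its value,
    taken from the smoothed skin sample of one detected face.
    """
    return {ch: dynamic_threshold([p[ch] for p in skin_pixels])
            for ch in channels}
```

Since the bounds are recomputed for every detected face, each person in an image gets thresholds fitted to their own skin tone rather than a fixed, pre-trained range.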

Dynamic Skin Classification
The purpose of this phase was to combine multiple threshold values from the multi-colour space of RGB, YCbCr and HSV. Three colour spaces were analysed in our proposed skin colour detector. Initially, the proposed single dynamic threshold was employed for skin detection in the individual colour spaces RGB, YCbCr and HSV. Then, several combinations, namely YCbCr-RGB, CbCr-RGB, YCbCr-SV and CbCr-SV, were evaluated. Equation 4 shows one example of combining two colour spaces into a single skin colour model, for CbCr-SV.

Table 2. The effect of using a different face mask under colour space (columns: face mask model; colour space RGB, YCbCr, HSV)

The minimum value is the lower bound, while the maximum value acts as the upper bound of the threshold. Multiple threshold values are calculated during the online skin sampling process. These dynamic threshold values are then used to classify the skin and non-skin pixels of still images by creating a binary image. Value '1' represents a skin pixel, while '0' represents a non-skin pixel.
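The combination of per-channel thresholds into one decision rule can be sketched as follows. This is an assumed form of the CbCr-SV rule: a pixel is labelled skin only when every selected channel lies between its lower and upper bound. The function names and the dictionary-based pixel representation are hypothetical.

```python
def in_range(value, bounds):
    """True if value lies within the inclusive (lower, upper) bounds."""
    lower, upper = bounds
    return lower <= value <= upper


def classify_pixel(pixel, thresholds, channels=("Cb", "Cr", "S", "V")):
    """Return 1 (skin) if all selected channels are within bounds, else 0."""
    return 1 if all(in_range(pixel[ch], thresholds[ch]) for ch in channels) else 0


def classify_image(rows, thresholds, channels=("Cb", "Cr", "S", "V")):
    """Build the binary skin map for an image given as rows of pixel dicts."""
    return [[classify_pixel(p, thresholds, channels) for p in row]
            for row in rows]
```

Passing a different `channels` tuple, such as `("Y", "Cb", "Cr", "S", "V")`, would give the YCbCr-SV combination with the same code.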

Multi-Faces Condition
The proposed skin colour detection method can also handle images containing multiple faces. This is done by repeating the skin detection processes in Fig. 2 until no more faces are detected. Then, the dynamic threshold generated for each face is applied to the image individually. Figure 8 illustrates the four individual results for each of the detected faces and the merging operation used to produce the final result. A better skin segmentation is produced by merging the individual results using the AND Boolean operator. This is because each person possesses a different skin tone, which leads to different threshold values.

Experimental Results
The proposed skin colour detection method was implemented in MATLAB 2014a. The aim of this section is to evaluate the performance of the proposed method under different image conditions, skin tones and skin-like objects, compared to the state-of-the-art works. In our study, only frontal face images with slight face rotation were considered for evaluation. The performance of skin colour detection can be assessed by two methods, i.e., qualitative and quantitative analysis. Qualitative analysis focuses on observing the ability of the proposed skin colour detection to classify skin and non-skin pixels in images. The Pratheepan dataset (Tan et al., 2012) was used for this qualitative analysis. However, the ground truth for the Pratheepan dataset was not available; therefore, the ground truth images were produced by manual selection in Adobe Photoshop CS5. The skin dataset with the ground truth can be accessed at (Chee Seng, 2014). The performance evaluation was based on the descriptions shown in Table 3.
Performance measurement of the proposed skin detection was based on the F-measure, precision, recall, specificity and accuracy. The F-measure is the harmonic mean of precision and recall, calculated by weighting between the two. The formulae for the measurements are shown in Equations 5-9:

$$F\text{-measure} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \quad (5)$$

Where:

$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (6)$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (7)$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \quad (8)$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (9)$$

The qualitative analysis showed that the proposed skin colour detection, by combining multi-colour spaces, provides better performance in classifying the skin pixels, with fewer false positive regions.
Figure 9 shows the results on the Pratheepan dataset (single face) for the methods of Yogarajah et al. (2010) and Tan et al. (2012). The column in Fig. 9(b) is the benchmark or ground truth image. White indicates skin pixels, while black indicates non-skin pixels. The last column, Fig. 9(e), marked with a red box, is the result produced by our proposed method.
For the quantitative analysis, our proposed method was compared across several combinations of colour spaces, where three combinations of colour spaces were analysed against single colour spaces. The results in Table 4 clearly show that false positives can be reduced significantly, from ~19.61% to ~6.99%, by combining multi-colour spaces into a single skin colour model. On the other hand, the highest accuracy, 85.86%, was achieved by the CbCr skin colour model.
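The performance measures can be computed from a predicted binary skin map and its ground truth as in the following sketch. It uses the standard definitions of precision, recall, specificity, accuracy and the F-measure as the harmonic mean of precision and recall; the function names are hypothetical, and returning 0.0 for an empty denominator is a convenience assumption rather than part of the evaluation protocol.

```python
def confusion_counts(predicted, truth):
    """Count TP, TN, FP, FN over two equal-length sequences of 0/1 labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    return tp, tn, fp, fn


def _ratio(numerator, denominator):
    """Safe division: returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(tp, tn, fp, fn):
    """Return the five performance measures as a dict."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "f_measure": _ratio(2 * precision * recall, precision + recall),
    }
```

For example, `evaluate(*confusion_counts(predicted, truth))` scores one flattened skin map against its flattened ground truth.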

Discussion
People of different ethnicities, such as Asian, African and Caucasian, have different skin tones that may fall under different thresholds. Adopting dynamic skin colour detection that uses the detected face as the skin sample is the easiest way to classify skin and non-skin under skin-like and ethnicity image variations. This is because the dynamic threshold values used in the dynamic skin classification are obtained individually from each detected face. According to the literature, current skin colour detection using the dynamic threshold method fails to generate good skin regions. In the proposed skin colour detection, an elastic elliptical mask model based on the eye angle was introduced. Rotating the mask keeps it aligned with the region where skin is most likely located. Non-skin regions such as eyebrows, lips and hair are removed using non-skin filtering. This was done by applying the Sobel edge detector to find the non-skin edges. However, not all the non-skin regions are filtered out. Therefore, a 95% histogram acceptance was applied to the extracted skin colour distribution to obtain the final dynamic threshold values. Finally, skin classification with the dynamic threshold values was applied to the colour image.
In the multiple faces condition, the processes were iterated. In this case, the initial results from the dynamic threshold values of each detected face were merged to generate the final skin region. The processing time of the iteration depends on the number of faces detected in the first place. This process is suitable for an image that contains various skin tones from different people.
Based on the qualitative results obtained, it was shown that our proposed improved skin colour detection generates better skin regions than the state-of-the-art skin colour detection. The proposed multi-colour space is better than a single colour space. The YCbCr-SV skin colour model presents the highest precision and the lowest false positive rate. Few false positive skin regions were detected, as skin-like colours such as hair and background were successfully eliminated. However, any skin-like region whose colour falls within the dynamic threshold values remains detected as skin, which may reduce the accuracy of the skin detection.

Conclusion
Skin colour detection is important in many applications and continues to be actively researched. Hence, there is a great need for a flexible dynamic skin detector that models skin colour in a way that handles image variations. Therefore, in this study, a skin colour detection method based on an improved dynamic threshold using the multi-colour space has been proposed to detect human skin in colour images. An elastic elliptical face mask model based on the eye angle was also introduced. Initial analysis found that the elliptical shape is more suitable because human faces are nearly oval, which reduces the possibility of including non-skin regions. Experimental results showed that false positives were reduced compared to the previous dynamic skin detection methods. In a nutshell, the improved skin colour detection also increased the precision rate compared to the single colour model through the implementation of the multi-colour space model. Our proposed skin detection depends highly on the performance of the face detector. Any undetected face will generate false threshold values, resulting in poor detection of skin regions. As a future improvement, we would like to solve this issue by adding adaptability using a trained multi-colour model for undetected faces in the images. In addition, a new dataset will be built to support this improvement by collecting images under different ethnicities, skin tones and illumination.
With its success in classifying skin pixels with fewer false positive regions, the proposed method has the potential to be applied in skin segmentation for illicit image detection.

Fig. 1. Skin sample using face mask models, (a) Original rectangular, (b) reduced dimension of width 0.25, height 0.2, (c) reduced dimension of width 0.36 and height 0.36 and (d) our elastic elliptical mask

| Author | Method | Face detector | Face mask | Dataset |
| Kawulok (2008) | [...] based joint with statistical RGB for detecting skin in digital image for Polish sign language recognition | Hough transform and SVM | Trapezium | ECU |
| Yogarajah et al. (2012) | Proposed novel learned dynamic threshold using Cheddad et al. (2009) colour space | Fasel face detector | Elliptical | Pratheepan |
| Hsieh et al. (2012) | Proposed an adaptive skin colour classifier to segment dynamic skin regions (face and hand) in real time application using normalized RGB colour space | Viola-Jones face detector | Rectangular | Not mentioned |
| Tan et al. (2012) | Combines smoothed 2-D histogram and Gaussian model using IByRg colour space | Fasel face detector | Elliptical | Pratheepan, ETHZ |
| Ibrahim et al. (2012) | Developed adaptive margins of skin detector in which the skin pixels are collected from the major and minor axes of the bounding rectangle, using YCbCr colour space | Viola-Jones face detector | Rectangular | Pratheepan |
| Bianco et al. (2013) | Proposed two high level skin detection strategies: Adaptive Single Gaussian (AGM) and Colour Gamut Mapping (CGM) | Not mentioned | Rectangular | TDSD |
| Hwang et al. (2013) | Proposed new skin detection algorithm that considers the luminance value in modelling the colour distribution, adaptive chrominance model (YCb, YCr) | Fasel eye detector | Rectangular | Jones and Pratheepan |
| Taylor and Morris (2014) | Employed unimodal Gaussian function in the normalized RG colour space | Viola-Jones face detector | Circle | Not mentioned |

Fig. 2. The flowchart of the proposed dynamic skin detector

Fig. 4. Elastic elliptical mask model according to eye angle. 'c' is the rotation angle, 'd' is the distance between the two eye points, and 'a' and 'b' are respectively the 1.2d major axis and the 1.1d minor axis

Fig. 6. Threshold value of 'R' channel in RGB colour space

Fig. 9. Qualitative comparison using Pratheepan dataset of faces. From left to right: the input image (a), ground truth (b), Yogarajah et al. (2010) method (c), Tan et al. (2012) method (d) and our proposed method (e)

Table 3. Description of the performance evaluation

| Term | Description |
| True Positive (TP) | The skin pixels are correctly identified |
| True Negative (TN) | The non-skin pixels are correctly rejected |
| False Positive (FP) | Incorrectly identified as skin pixels, but actually non-skin pixels |
| False Negative (FN) | Incorrectly rejected as non-skin pixels, but actually skin pixels |

Table 1. Several face-based skin colour detection

Table 2 demonstrates the face masks under the different skin colour spaces, respectively RGB, YCbCr and HSV. It is clear that the reduced-dimension and elliptical mask models perform well in classifying the skin regions. However, the rectangular shape without reduced dimensions mostly fails to generate good skin regions, because it extracts a lot of non-skin regions. Notice that the HSV colour space sometimes generates poorer skin regions than RGB and YCbCr. Therefore, the elliptical mask model was adopted and improved based on the eye angle to overcome this problem.

Table 4
The results in Table 4 also show that YCbCr-SV, CbCr-SV and YCbCr-RGB were better than the single skin colour models. It can be concluded that the YCbCr-SV skin colour model generated the fewest false positives, with reasonable accuracy and high precision.