Abstract

A number of techniques have been proposed earlier for feature extraction using image binarization. The efficiency of these techniques depended on proper threshold selection for the binarization method. In this paper, a new feature extraction technique using image binarization is proposed. The technique binarizes the significant bit planes of an image by selecting local thresholds. The proposed algorithm has been tested on a public dataset and compared with existing, widely used techniques that use binarization for feature extraction. The results indicate that the proposed method outperformed all the compared techniques and showed consistent classification performance.

1. Introduction

Incessant expansion of image datasets in terms of dimension and complexity has escalated the requirement to design techniques for efficient feature extraction. Selection of image features has been the basis for content based image classification as reviewed by Andreopoulos and Tsotsos in [1]. In this work, a new feature extraction technique applying binarization on bit planes using local threshold technique has been proposed. A digital image can be separated into bit planes to understand the importance of each bit in the image as shown by Thepade et al. in [2]. The process was followed by binarization of significant bit planes for feature vector extraction. Binarization process calculated the threshold value to differentiate the object of interest from its background. The novel method has been compared quantitatively with the techniques proposed by Thepade et al. in [2] and by Kekre et al. in [3] and four other widely used image binarization techniques proposed by Niblack [4], Bernsen [5], Sauvola and Pietikäinen [6], and Otsu [7]. Mean square error (MSE) method was followed for classification performance evaluation of the proposed technique with respect to the existing techniques for feature vector extraction.

Various feature extraction methods have used image binarization to separate the object of interest from its background. Threshold selection is essential for this separation. Valizadeh et al. [8], Chang et al. [9], and Gatos et al. [10] have described that threshold selection is affected by a number of factors including ambient illumination, variance of gray levels within the object and the background, and inadequate contrast. Threshold selection has been categorized into three techniques, namely, mean threshold selection, local threshold selection, and global threshold selection. Existing methods of feature extraction using mean threshold selection were adopted by Thepade et al. in [2] and by Kekre et al. in [3]. The first method, feature extraction using even and odd images [2], generated two varieties of images by adding and subtracting the original image and its flipped version, respectively, as shown in

$$I_{\text{even}} = \frac{I + I'}{2}, \qquad I_{\text{odd}} = \frac{I - I'}{2}, \tag{1}$$

where $I$ denotes the original image and $I'$ denotes its flipped version.

Binarization of the even and odd images was done by the mean threshold technique as shown in (2), and feature vectors of size 12 were generated.

In the second method of feature extraction, noteworthy information was extracted from the images by comparing the values of significant bits in each pixel, and the extracted values were binarized following mean threshold selection [2]. Equation (2) shows the process of mean threshold selection:

$$T_x = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} x(i,j), \tag{2}$$

where $x$ is red ($R$), green ($G$), and blue ($B$) for each color component of an image of size $m \times n$.
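Mean threshold selection for a single color component, together with the two means taken on either side of the threshold, can be sketched as follows. This is a minimal illustration rather than the implementation used in [2]; the function name `mean_threshold_features`, the choice of `>=` for the upper group, and the fallback for a flat image are assumptions.

```python
import numpy as np

def mean_threshold_features(channel):
    """Return (upper_mean, lower_mean) of one color component split at its mean."""
    channel = np.asarray(channel, dtype=np.float64)
    threshold = channel.mean()               # mean threshold of the component
    upper = channel[channel >= threshold]    # never empty: the maximum is >= the mean
    lower = channel[channel < threshold]
    upper_mean = upper.mean()
    # Assumption: a perfectly flat component has no "lower" pixels, so reuse the threshold.
    lower_mean = lower.mean() if lower.size else threshold
    return upper_mean, lower_mean

red = np.array([[10, 20], [30, 40]])
upper_mean, lower_mean = mean_threshold_features(red)
```

Applying the function to the red, green, and blue components gives two values per component; the 12-dimensional vectors mentioned in the text arise when a second source image or bit plane image is processed in the same way.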

Feature vectors of dimension 12 were generated by taking the mean of the significant values higher than the threshold and the mean of those lower than the threshold. The third method, feature extraction with multilevel block truncation coding [3], was an iterative process. The initial stage started with binarization using mean thresholding as in (2), and two feature vectors for each color component were generated from the mean of gray values higher than the threshold and lower than the threshold, respectively. Classification performance was tested with the feature vectors generated at this first level. The next step binarized the gray values higher than the first-level mean threshold and those lower than it. This produced four feature vectors for each color component, and the classification performance was evaluated with the feature vectors generated at the second level. The classification results for the first and second levels were then compared. If the result for the second level was better than that for the first level, the feature extraction process was repeated for further levels. The process was continued until the classification performance deteriorated. The multilevel block truncation coding method showed its best performance at level 3, where the feature vector size was 24. The drawback of the mean threshold method is that it determines only a midpoint and does not account for the spread of the data.

Local threshold techniques discussed by Niblack [4], Bernsen [5], and Sauvola and Pietikäinen [6] have used measures of dispersion like standard deviation and variance to calculate the threshold. The local threshold selection techniques proposed by Niblack and by Sauvola and Pietikäinen have both considered the contribution of mean and standard deviation for threshold selection. The methods involved sliding a window over the image to calculate the threshold value pixel-wise. Sauvola’s technique was an improvement over Niblack’s method. Bernsen’s technique of binarization maintained a window size of 31 for threshold selection. Binarization was performed locally by comparing each pixel value to the corresponding threshold. A pixel value greater than the corresponding threshold was represented by 1 and a value lower than the threshold by 0. The threshold for each color component was computed as the mean of the highest and the lowest gray values within the specified local window. The contrast was expressed as the difference between the highest and the lowest gray values. Binarization was done by comparing the contrast value to the contrast threshold. Otsu’s method of binarization was based on selection of a global threshold. The method categorized the pixels in an image into background and foreground pixels by calculating the optimal threshold. The drawback of these techniques was the large feature vector size generated by each. The dimension of the feature vectors depended on the size of the image. Thus, for an image of size $m \times n$, the generated feature vector size was $m \times n$.

Elalami [11] has extracted color and texture based features by 3D color histogram and Gabor filters and addressed the problem of large feature vector size with a genetic algorithm. Hiremath and Pujari [12] have divided the image into nonoverlapping blocks. Local descriptors of color and texture were generated from the color moments and the moments of Gabor filter responses of these blocks. Gradient vector flow fields were used to define the edge images for shape descriptors, followed by the use of invariant moments. Banerjee et al. [13] have chosen visually significant point features from images and identified the set by a fuzzy set theoretic approach. Jalab [14] has combined color layout descriptor and Gabor texture descriptor as a fusion approach for better identification of images. Shen and Wu [15] have extracted feature vectors from images by exploiting color, texture, and spatial structure descriptors. Irtaza et al. [16] have used wavelet packets and eigenvalues of Gabor filters for creating feature vectors from images. Rahimi and Moghaddam [17] have explored the intraclass and interclass features for effective extraction of image signatures.

The proposed technique was compared to seven other state-of-the-art feature extraction techniques for assessment of classification performance. The quantitative evaluation and statistical analyses have revealed the efficiency of the proposed method.

3. Proposed Methodology

The proposed method in this work can be subdivided into two different levels, namely, Bit Plane Slicing and threshold selection as given below.

3.1. Bit Plane Slicing

The individual bits needed to represent a pixel intensity in binary were treated as bit planes. The first level started with segmentation of the images into bit planes, from the least significant bit (LSB) to the most significant bit (MSB). The intensity value of each pixel was denoted by an 8-bit binary vector, each element of which was either 0 or 1, as shown in Figure 1. Thus, each bit plane was represented by a binary matrix, which was further used to generate image slices for the respective bit planes as in Figure 2. Significant bit planes carrying rich information were identified and considered for feature extraction by binarization as in (3), and insignificant bit planes were discarded. Equation (3) is evaluated for $R$, $G$, and $B$, respectively, for each color component.
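Bit plane slicing of an 8-bit color component can be sketched with shift and mask operations. This is an illustrative sketch; the function names `bit_planes` and `significant_planes` are hypothetical, and the default range of planes 5 to 8 follows the numbering used later in the proposed algorithm.

```python
import numpy as np

def bit_planes(channel):
    """Split an 8-bit channel into 8 binary matrices; index 0 is the LSB, index 7 the MSB."""
    channel = np.asarray(channel, dtype=np.uint8)
    return [(channel >> b) & 1 for b in range(8)]

def significant_planes(channel, first=5, last=8):
    """Return bit planes first..last, numbered from 1 (LSB) to 8 (MSB)."""
    return bit_planes(channel)[first - 1:last]

channel = np.array([[200, 15]], dtype=np.uint8)   # 200 = 11001000b, 15 = 00001111b
planes = bit_planes(channel)
upper_four = significant_planes(channel)
```

Weighting each plane by its power of two and summing the results recovers the original channel, which is a convenient check that no information is lost by the slicing itself.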

3.2. Threshold Selection

The test dataset contains scenes from diverse categories with varying illumination across each scene, as observed in Figure 7. Uneven luminance distribution made the images brighter or darker in places, which adversely affected recognition of the object of interest. Thus, choosing a single global or mean threshold for entire scenes of diverse categories was not an effective alternative. Instead, a local threshold technique can handle uneven illumination by selecting the threshold locally. A popular local threshold selection method, Niblack’s method [4], was used for binarization of the significant bit planes of the images from different categories. The method calculated a pixel-wise threshold for each color component of the selected bit planes by sliding a rectangular window of fixed size over the component. The threshold was determined from the local mean $\mu(i,j)$ and standard deviation $\sigma(i,j)$ within the window and was given as $T(i,j) = \mu(i,j) + k\,\sigma(i,j)$. Here, $k$ is a constant with a value between 0 and 1 and was set to 0.6 in the proposed method. The value of $k$ and the size of the sliding window determined the quality of binarization. Figure 3(a) shows the original image from which bit plane 5 was extracted in Figure 3(b). Figure 3(c) shows the effect of binarization on the original image using mean threshold. The image of bit plane 5 was binarized with mean threshold in Figure 3(d). Finally, Niblack’s method of local threshold with $k = 0.6$ and the same window size was used to binarize the image of bit plane 5 in Figure 3(e). The effect of binarization with mean threshold and with Niblack’s local threshold is shown similarly for other bit planes in Figures 4, 5, and 6.
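Niblack’s pixel-wise threshold, the local mean plus a constant times the local standard deviation, can be sketched with a sliding window as follows. This is a minimal sketch: the default window of 3 is purely hypothetical and is not the window size used in the paper, the edge padding at the image border is an assumption, and the function names are illustrative. Only the constant 0.6 is taken from the text.

```python
import numpy as np

def niblack_threshold(channel, window=3, k=0.6):
    """Pixel-wise Niblack threshold: local mean + k * local standard deviation.

    `window` must be odd. The default of 3 is a placeholder, not the paper's value.
    """
    channel = np.asarray(channel, dtype=np.float64)
    pad = window // 2
    # Assumption: replicate border pixels so every pixel has a full window.
    padded = np.pad(channel, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    return windows.mean(axis=(-2, -1)) + k * windows.std(axis=(-2, -1))

def niblack_binarize(channel, window=3, k=0.6):
    """Binary map: 1 where the pixel exceeds its local threshold, else 0."""
    channel = np.asarray(channel, dtype=np.float64)
    return (channel > niblack_threshold(channel, window, k)).astype(np.uint8)

image = np.arange(9, dtype=np.float64).reshape(3, 3)
thresholds = niblack_threshold(image)
binary_map = niblack_binarize(image)
```

Because the threshold follows the local statistics, a region that is uniformly dark or bright is still split around its own mean, which is the property the text relies on for unevenly lit scenes.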

3.3. Proposed Algorithm

Begin
(1) Input an image $I$ with three different color components $R$, $G$, and $B$, respectively, each of size $m \times n$.
(2) Calculate the local threshold value for each pixel in each color component $R$, $G$, and $B$ using Niblack’s method.
(3) Compute the binary image maps for each pixel of the given image for $R$, $G$, and $B$.
(4) Generate the image features of the given image for each color component $R$, $G$, and $B$.
(5) Identify the higher bit planes, from bit plane 5 to bit plane 8, whose value is equal to 1 for each color component.
(6) Compute the local threshold for the identified significant bit planes of each color component using Niblack’s method.
(7) Compute the binary image maps for each pixel of the significant bit planes for $R$, $G$, and $B$, respectively.
(8) Compute the two feature vectors for the bit planes of all three color components.
(9) The four feature vectors of each color component are combined to form twelve feature vectors altogether for each image in the dataset.
End
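The steps of the algorithm can be sketched end to end as below. This is an illustration under stated assumptions, not the authors' implementation: the features of steps (4) and (8) are taken to be the means of the values above and below the local threshold, by analogy with the mean threshold methods described earlier; the significant bit planes are retained by masking the upper four bits; and the window size of 3, the edge padding, and all function names are hypothetical.

```python
import numpy as np

def niblack_threshold(channel, window=3, k=0.6):
    """Pixel-wise threshold: local mean + k * local std (window size is a placeholder)."""
    pad = window // 2
    padded = np.pad(channel, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    return windows.mean(axis=(-2, -1)) + k * windows.std(axis=(-2, -1))

def split_means(values, threshold):
    """Mean of the values above the pixel-wise threshold and mean of the rest."""
    mask = values > threshold                    # binary image map (steps 3 and 7)
    upper = values[mask].mean() if mask.any() else 0.0
    lower = values[~mask].mean() if (~mask).any() else 0.0
    return upper, lower

def extract_features(image, window=3, k=0.6):
    """Return 12 features for an RGB image: 4 per color component."""
    image = np.asarray(image, dtype=np.uint8)
    features = []
    for c in range(3):                           # R, G, B
        channel = image[:, :, c].astype(np.float64)
        # Steps 2 to 4: local threshold, binary map, features of the component.
        features.extend(split_means(channel, niblack_threshold(channel, window, k)))
        # Step 5 (assumption): keep bit planes 5 to 8 by masking the upper four bits.
        significant = (image[:, :, c] & 0xF0).astype(np.float64)
        # Steps 6 to 8: local threshold, binary map, features of the bit planes.
        features.extend(split_means(significant, niblack_threshold(significant, window, k)))
    return np.array(features)                    # step 9: twelve features per image

rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
feature_vector = extract_features(sample)
```

The length of the result does not depend on the image size, in contrast to the local and global threshold techniques discussed earlier, whose feature vectors grow with the image dimensions.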

4. Experimental Process

The proposed technique was evaluated on a subset of the Corel stock photo database known as the Wang dataset, used by Li and Wang in [18]. It is a widely used public dataset. The size and quality of the images were not altered for the experiments. The dataset comprised 9 categories with 100 images in each category. Figure 7 shows a sample of the original database. The classification performance of the proposed and existing feature vector extraction techniques was evaluated with a 10-fold cross-validation scheme. The process is called $n$-fold cross-validation, as given by Sridhar in [19], where $n$ is an integer that was set to 10 in this work. The entire dataset was divided into 10 subsets. One subset was used as the testing set and the remaining 9 subsets were used as training sets. The method was repeated for 10 trials, and the performance of the classifiers was evaluated by combining the 10 results obtained from the 10 folds.
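The 10-fold scheme described above can be sketched as an index generator. This is an illustrative sketch; the function name `k_fold_indices`, the shuffling, and the seed are assumptions, and no stratification by category is applied because the text does not state whether one was used.

```python
import random

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) for each of the n_folds trials."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)            # assumption: random assignment to folds
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test = folds[i]                             # one subset is the testing set
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 9 categories with 100 images each, as stated in the text.
splits = list(k_fold_indices(900, n_folds=10))
```

Each image appears in exactly one testing set, so combining the 10 results covers every image in the dataset once.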

5. Evaluation of Proposed Technique

The proposed technique for feature extraction has been evaluated by the mean square error (MSE) method as in (13). The method considered the similarity measure between two instances as described by Xu et al. in [20] and Kotsiantis in [21]. The nearest neighbor in the instance space was located for classification, and the unknown instance was then assigned the class of the identified nearest neighbor:

$$\mathrm{MSE} = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} \left[ I_1(i,j) - I_2(i,j) \right]^2, \tag{13}$$

where $I_1$ and $I_2$ are the two images used for comparison using the MSE method.
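Nearest-neighbor classification with MSE as the distance measure can be sketched as follows. The sketch applies the measure to feature vectors, which is an assumption about how the comparison is carried out in practice; the function names and the toy data are hypothetical.

```python
def mse(a, b):
    """Mean square error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def classify_nearest(query, train_features, train_labels):
    """Assign the label of the training instance with the smallest MSE to the query."""
    distances = [mse(query, features) for features in train_features]
    best = min(range(len(distances)), key=distances.__getitem__)
    return train_labels[best]

# Toy feature vectors for illustration only; they are not taken from the dataset.
train_features = [[0.0, 0.0], [10.0, 10.0]]
train_labels = ["category_a", "category_b"]
predicted = classify_nearest([1.0, 2.0], train_features, train_labels)
```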

Primarily, two evaluation metrics, namely, misclassification rate (MR) and F1 score, were considered to compare the proposed feature extraction technique with the existing techniques.

5.1. Misclassification Rate (MR)

The error rate of the classifier indicates the proportion of instances that have been wrongly classified, as in

$$\mathrm{MR} = \frac{FP + FN}{TP + TN + FP + FN}, \tag{14}$$

where true positive (TP) is the number of instances classified correctly, true negative (TN) is the number of negative results created for negative instances, false positive (FP) is the number of erroneous positive results for negative instances, and false negative (FN) is the number of erroneous negative results for positive instances.
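The misclassification rate follows directly from the four counts defined above. The sketch below uses made-up counts for one category of a 900-image dataset; they are illustrative and are not results from the paper.

```python
def misclassification_rate(tp, tn, fp, fn):
    """Proportion of wrongly classified instances: (FP + FN) / (TP + TN + FP + FN)."""
    return (fp + fn) / (tp + tn + fp + fn)

# Illustrative counts only: 100 positives and 800 negatives for one category.
mr = misclassification_rate(tp=90, tn=790, fp=10, fn=10)
```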

5.2. F1-Score

Precision and Recall (TP rate) can be combined to produce a metric known as F1 score as in (15). It is the harmonic mean of Precision and Recall. A higher F1 score indicates better classification results:

$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \tag{15}$$

where Precision is the probability that an object is classified correctly as per the actual value and Recall is the probability that a classifier will produce a true positive result.
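The F1 score can be computed from the same counts, using the standard definitions Precision = TP / (TP + FP) and Recall = TP / (TP + FN). The counts in the sketch are illustrative only.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall (assumes tp > 0 to avoid division by zero)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only, not results from the paper.
score = f1_score(tp=8, fp=2, fn=2)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a category scores well only when both Precision and Recall are high.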

The comparison shown in Tables 1 and 2 has been graphically represented in Figures 8 and 9. The images of the category named dinosaur had the minimum misclassification rate and the highest F1 score. On the other hand, the poorest classification performance was exhibited by the category named gothic structure, as shown in Figures 8 and 9. This category had the highest misclassification rate (MR) and the lowest F1 score, which signified poor feature extraction compared to the other categories in the dataset. The confusion matrix for all the categories is given in Table 3.

6. Results and Discussion

The proposed technique has implemented feature extraction by binarization of significant bit planes with Niblack’s local threshold selection method. Binarization of significant bit planes was done in [2] with the mean threshold method. Another existing technique of feature extraction in [3] has also used mean threshold for binarization. A comparative analysis of misclassification rate (MR) and F1 score has shown that the proposed technique outperformed the feature extraction methods of [2, 3], as shown in Table 4 and Figures 10 and 11.

The proposed binarization technique of feature extraction has also been compared with widely used global and local threshold techniques for binarization, namely, Otsu’s method, Niblack’s method, Sauvola’s method, and Bernsen’s method. Table 4 compares the proposed technique with the existing techniques in terms of misclassification rate (MR) and F1 score, and the results in Table 4 and Figures 10 and 11 have revealed the minimum MR and the maximum F1 score, respectively, for the proposed technique among all the techniques compared. A minimum misclassification rate of 0.07 for the test image categories was observed with the proposed method. The existing method of feature extraction by binarization of significant bit planes [2] using mean threshold showed a higher MR of 0.078. The F1 score of the proposed method was the highest among all the techniques compared. The proposed technique was further compared to the existing state-of-the-art techniques with respect to the average time taken to extract features from each image in the Wang database. The comparison is shown in Figure 12. The proposed technique consumed the minimum time for feature extraction compared to the existing techniques.

Feature extraction by binarization with multilevel mean threshold took the maximum time, followed by Sauvola’s local threshold method. These were followed by feature extraction with Bernsen’s local threshold method, Niblack’s local threshold method, and Otsu’s global threshold method, in that order. The two techniques of feature extraction with Bit Plane Slicing using mean threshold and feature extraction by binarization of original + even image with mean threshold consumed less time than the rest of the existing techniques.

The results clearly established better classification performance of the proposed technique with respect to the existing mean threshold based techniques and the traditional global and local threshold techniques adopted for binarization to facilitate feature extraction.

Table 5 has shown the comparison of Precision, Recall, and Accuracy of the proposed technique with respect to the existing methods of binarization for feature extraction.

The graphical comparison shown in Figure 13 indicates that the proposed technique has the highest Precision and Recall values and the maximum Accuracy among all the techniques compared.

7. Analysis of Statistical Comparison

The outputs of the various evaluation metrics were analyzed together to assess how the metrics used for performance measurement relate to one another. Correlation analysis was used to explore the relationship among the metrics, namely, misclassification rate (MR), F1 score, Precision, Recall, and Accuracy, as suggested by Bishara and Hittner [22]. Table 6 shows that F1 score, Precision, Recall, and Accuracy are negatively correlated with misclassification rate (MR). Precision, Recall, and Accuracy are highly correlated with F1 score. Thus, it was inferred that any method must try to minimize the misclassification rate to increase F1 score, Precision, Recall, and Accuracy.
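The negative correlation between misclassification rate and the other metrics can be illustrated with a correlation coefficient computed over per-technique values. The sketch below uses the Pearson coefficient, which is an assumption because the text does not name the coefficient used, and the input values are made up for illustration. It relies on the identity Accuracy = 1 − MR, which follows from the definitions of the two metrics.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Illustrative misclassification rates only, not values from the paper's tables.
mr_values = [0.1, 0.2, 0.4]
accuracy_values = [1 - mr for mr in mr_values]
r = pearson(mr_values, accuracy_values)   # perfectly negative by construction
```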

The proposed feature extraction technique was compared with seven existing techniques of feature extraction for classification performance evaluation. The correlation analysis has established the need to minimize the misclassification rate for better classification. The authors have compared the feature extraction techniques in terms of the false positive and false negative results generated during classification for each of the nine categories considered in Wang’s dataset. A univariate one-tailed test was conducted to compare each of the existing feature extraction techniques with the proposed technique in terms of the (FP, FN) pair, as proposed by Yıldız et al. [23] and Sharma [24]. The two algorithms in each comparison were trained and validated on the training data folds, and the confusion matrices of the two algorithms were calculated as shown in Table 7.

Comparison was done in terms of errors, and the error was calculated for each of the algorithms. This was followed by calculation of the paired differences between the errors. The test assessed whether the differences were generated from a population with zero mean, that is, the null hypothesis $H_0\colon \mu_d = 0$, where $\mu_d$ is the mean of the paired differences. The results of comparing each of the techniques with the proposed technique are shown in Table 8.
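A test of whether paired differences come from a population with zero mean is commonly carried out as a paired t-test, and the statistic can be sketched as follows. Treating the univariate test as a paired t-test is an assumption, the function name is hypothetical, and the error counts are made up for illustration.

```python
import math

def paired_t_statistic(errors_existing, errors_proposed):
    """t statistic for paired differences d = existing - proposed, with len(d) - 1 degrees of freedom."""
    d = [a - b for a, b in zip(errors_existing, errors_proposed)]
    k = len(d)
    mean_d = sum(d) / k
    var_d = sum((x - mean_d) ** 2 for x in d) / (k - 1)   # sample variance of the differences
    return mean_d / math.sqrt(var_d / k)

# Illustrative error counts (FP + FN) only, not values from Table 7.
t = paired_t_statistic([3, 5, 7], [2, 3, 4])
```

For a one-tailed test, a large positive statistic supports the alternative that the existing technique makes more errors than the proposed one; the statistic is compared against the critical value of the t distribution for the stated degrees of freedom.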

The analyses shown in Table 8 have indicated the $p$ values to be significant or highly significant. Therefore, the null hypotheses of equal error rates for the existing algorithms compared to the proposed algorithm were rejected. As such, we can conclude that the error rate of the proposed technique is significantly lower than that of the existing techniques. Hence, the proposed technique has contributed a considerable improvement in classification performance compared to the existing techniques of feature extraction.

8. Conclusion

The paper has presented a new technique of feature extraction with the help of image binarization. The statistical comparison has shown significant improvement in classification performance for the proposed technique compared to the existing feature extraction techniques by binarization with mean threshold method, global threshold method, and local threshold method. The work has the possibility to be extended for a large variety of applications including city surveillance and medical image analysis, where binarization is very important as a preprocessing step for consequent object recognition. The proposed method has demonstrated better average results for five different performance evaluation measures compared to the other widely used feature extraction methods. Thus, the novel technique of feature extraction with significant bit planes using binarization with local threshold has been established as a consistent technique for feature extraction in content based image classification.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.