Improving Component-Based Face Recognition Using Enhanced Viola–Jones Detection and a Weighted Voting Technique

This paper enhances the recognition capabilities of facial component-based techniques using two concepts: improved Viola–Jones component detection and the weighting of facial components. Our method starts with enhanced Viola–Jones face component detection and cropping, so that the facial components are detected and cropped accurately under all pose-change circumstances. The cropped components are represented by histograms of oriented gradients (HOG). The weight of each component is determined using a validation process, and the weights are combined by a simple voting technique. Three public databases were used: the AT&T database, the PUT database, and the AR database. Several improvements are observed using the weighted voting recognition method presented in this paper.


Introduction
Face recognition is an important application of pattern recognition in which a database is used to train a classifier that tries to identify each person in it. A number of studies concerning the face recognition problem were surveyed in [1]. Studies in cognitive science have found that both local and global features can be used for face recognition [2][3][4][5][6][7][8]. There is ample evidence that holistic, configural, and facial component information all play a role in human face perception [2][3][4][5][6][7][8][9][10][11][12][13][14][15]. Additional studies in humans have concluded that some facial components are more important and useful for recognizing faces than others; for example, the upper face is more important than the lower face [13,16]. Researchers have approached face recognition through two methods: component-based and global-based face recognition.

Component-Based Face Recognition.
This method relies on training multiple models, one for each component representing an image. Component-based face recognition has not been researched as intensively as the global-based technique; therefore, existing methods are limited in their approach [17]. Most of them use a raw-pixel representation, which makes them less robust. Several other component-based face recognition methods are discussed in [4,12,16]. The facial components used for recognition in this paper are the eye pair, the nose, and the mouth.
The Viola–Jones object detection framework [18] was used to crop the facial components.

Global-Based Face Recognition.
In contrast to the component-based concept, the global method of face recognition relies on a single array to represent a face. A comparison among the best global-based face recognition techniques, such as eigenfaces, Fisher's discriminant analysis, and kernel PCA, can be found in [19,20]. Global-based face recognition techniques are weak against pose changes; such a technique has to include a face alignment phase or be developed to meet the standards of a component-based recognition technique [21]. In [22], the face was encoded into an indexed collection of radial strings emanating from the nose tip; a partial matching mechanism then effectively eliminated the occluding parts. Facial curves can express the deformation of the region containing the curve and are used for detecting occluded facial areas. In [23], a novel automatic method for facial landmark localization was proposed, relying on geometrical properties of the 3D facial surface and working both on complete faces displaying different emotions and in the presence of occlusions. The remainder of this paper is organized as follows: Section 2 explains the methods we used for component detection and cropping; the HOG features are explained in Section 3; Section 4 presents the results, summarized and compared.

Component Detection and Cropping
The detection functionality is a vital process in our face recognition method. Components help to collect unique data for every person in the database. Two ways of component detection are used: the Viola–Jones object detection framework [18] with geometrical approaches, and landmark detection using face alignment with an ensemble of regression trees [24]. Both facial component detection methods are used to achieve detection of the facial components in all circumstances (changes in illumination and pose). Accurate component cropping leads to better features: the more specific the crop is to the facial component, the less useless information is included in the representation, and therefore only unique data participate in the learning process.

Viola–Jones Object Detection Framework.
The Viola–Jones object detection framework is used to train a model that detects the facial components (eye pair, nose, and mouth) needed for the recognition process. It consists of the following parts, which are explained in detail in [18]: Haar-like features, the integral image, weak and strong classifiers, AdaBoost, and the cascades.
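The integral image listed above is the part most easily illustrated in code. The following minimal Python sketch (our illustration, not the paper's implementation) builds a summed-area table so that any rectangular pixel sum, and hence any Haar-like feature, can be evaluated in constant time:

```python
# Sketch of the integral image, one building block of the Viola-Jones
# framework: it lets any rectangular pixel sum be read in O(1).

def integral_image(img):
    """img: 2D list of pixel values. Returns a summed-area table of the same size."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size w x h."""
    a = ii[y + h - 1][x + w - 1]
    b = ii[y - 1][x + w - 1] if y > 0 else 0
    c = ii[y + h - 1][x - 1] if x > 0 else 0
    d = ii[y - 1][x - 1] if x > 0 and y > 0 else 0
    return a - b + d - c

def haar_two_rect_horizontal(ii, x, y, w, h):
    """A two-rectangle Haar-like feature: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

Only four table lookups are needed per rectangle, which is what makes exhaustive scanning with thousands of Haar-like features feasible in the cascade.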

2.2. Enhancing Viola–Jones with Geometrical Approaches.
Viola–Jones is a robust object detection system. However, trained models may still miss or fail to detect objects. Our recognition method relies on the accurate detection of the three components (the eye pair, the nose, and the mouth), so missed detections cannot be tolerated.
The component-based face recognition system needs the components to be cropped and represented accurately. A missed detection may lead to the representation of useless data (as shown in Figure 1) in the learning process, which yields a lower recognition success rate. The eye pair is the most crucial of the three extracted components: it carries the major unique information about a person's face, and it is also the reference object used in this algorithm to detect and crop the rest of the facial components. An eye pair location prediction model is trained to estimate where the eye pair might be found in a face. Figure 2 demonstrates some cases where the eye pair was not found, along with the detection result after the proposed solution. The nose and mouth object detectors might not find a component because the search area did not include the whole object, or because multiple objects are detected in the search area. If the object is not detected, the search area is expanded gradually until an object is found. Multiple detections occurred in the mouth area and were resolved by picking the object with the maximum y coordinate.
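The two fixes described above can be sketched as follows; this is a hedged illustration in which the `detect` callback, the expansion step, and the box format `(x, y, w, h)` are our assumptions standing in for the trained Viola–Jones component detector:

```python
# Sketch of the two fallbacks: gradual search-area expansion when nothing
# is detected, and lowest-box selection when the mouth is detected twice.

def detect_with_expansion(detect, area, img_h, img_w, step=10, max_tries=5):
    """Grow the search area by `step` pixels per side until `detect`
    returns at least one box, or give up after `max_tries` attempts."""
    x, y, w, h = area
    for _ in range(max_tries):
        boxes = detect((x, y, w, h))
        if boxes:
            return boxes
        # expand the area, clamped to the image borders
        x, y = max(0, x - step), max(0, y - step)
        w, h = min(img_w - x, w + 2 * step), min(img_h - y, h + 2 * step)
    return []

def pick_lowest(boxes):
    """Resolve multiple mouth detections: keep the box with the largest
    y coordinate, i.e., the lowest object in the face."""
    return max(boxes, key=lambda b: b[1])
```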

The Area Selection Process.
The concept of the geometrical approaches is to concentrate the search for the components in the right areas. For example, the nose cannot be above the eye pair; it is located somewhere beneath it. The same concept applies to the mouth: it has to be under the nose and the eye pair. The geometrical approaches aim to narrow the search areas to where the nose and the mouth may occur [25].
The area selection algorithm (Figure 3) consists of the following steps:
(a) The face is the first component to look for.
(b) The eye pair is detected in the cropped face image.
(c) The area under the eye pair within the cropped face image is the search area for the nose.
(d) A specific area below the nose is used to detect the mouth (Figure 3).
In case of multiple mouth detections, the object with the larger y-axis value (the lowest object) is chosen as the mouth component.
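The geometry of steps (c) and (d) can be made concrete with a short sketch. Boxes are `(x, y, w, h)` in the cropped face image's coordinate system; using the full face width for the search bands is our simplifying assumption, not a value from the paper:

```python
# Minimal sketch of the area-selection geometry: each component is
# searched for in the band directly below its reference component.

def nose_search_area(face_w, face_h, eye_pair):
    """Search band for the nose: everything below the eye pair box."""
    ex, ey, ew, eh = eye_pair
    top = ey + eh
    return (0, top, face_w, face_h - top)

def mouth_search_area(face_w, face_h, nose):
    """Search band for the mouth: everything below the nose box."""
    nx, ny, nw, nh = nose
    top = ny + nh
    return (0, top, face_w, face_h - top)
```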
Several problems face the usage of the Viola–Jones object detection framework for component detection. They are as follows:
(1) Failure to detect the eye pair.
(2) Failure to detect the nose.
Figure 4 shows the missed detection problems and the solution of our area selection algorithm.
Figures 5 and 6 show the missed detections and the solution of our area selection algorithm.

Features
Pixel patches extracted from facial images are often too large and cannot help build a robust classifier [24], so they are converted into a vector of features. A feature descriptor is an array of data that describes an image or a part of an image. It provides unique information about the image and can support recognition of the object in that image. In this paper, we have used histogram of oriented gradients (HOG) features [26].

HOG Features.
The histogram of oriented gradients (HOG) is a feature descriptor that uses oriented gradient information [26]. The steps for calculating HOG are as follows:
(i) For each pixel I(x, y), the horizontal and vertical gradient values are obtained by central differences:
G_x(x, y) = I(x + 1, y) − I(x − 1, y),
G_y(x, y) = I(x, y + 1) − I(x, y − 1).
(ii) The gradient magnitude and orientation are computed as G = sqrt(G_x^2 + G_y^2) and θ = arctan(G_y / G_x).
(iii) The histogram is constructed based on the magnitudes accumulated by orientation.
The image is divided into several small spatial regions (cells); for each cell, a local histogram of gradient orientations is calculated by accumulating votes into bins for each orientation. The best performance is achieved when the gradient orientation is quantized into 9 bins over 0–180 degrees. In addition, each vote is weighted by the gradient magnitude, allowing the histogram to take into account the importance of the gradient at a given pixel. Finally, the HOG descriptor is obtained by concatenating all local histograms into a single vector. However, it is necessary to normalize the cell histograms, because the gradient can be affected by illumination variations. Figure 7 shows an example of obtaining the HOG feature vector.
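The steps above can be sketched for a single cell in a few lines of Python. This is a simplified, unnormalized illustration (no block normalization or bin interpolation), not the descriptor used in the paper's experiments:

```python
import math

# Simplified HOG sketch for one cell: central-difference gradients,
# then 9 unsigned orientation bins over 0-180 degrees, with each
# pixel voting its gradient magnitude into its orientation bin.

def hog_cell_histogram(cell, n_bins=9):
    """cell: 2D list of intensities. Returns a list of n_bins accumulated magnitudes."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, h - 1):          # border pixels are skipped for simplicity
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient G_x
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient G_y
            mag = math.hypot(gx, gy)               # gradient magnitude
            ang = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            hist[min(int(ang // bin_width), n_bins - 1)] += mag
    return hist
```

The full descriptor would concatenate such histograms over all cells and normalize them per block, as described above.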

Face Database Setups.
Three databases were studied in this paper. They were picked to test the recognition accuracy against low resolution, missing components, and pose-change circumstances. We used the PUT [27], the AT&T [28], and the AR [29] databases. The HOG features are calculated on a patch basis for each image. A patch is a part of an image cropped out to seek its useful information, for example, the eye pair, the nose, or the mouth.
The HOG features can be calculated for patches with different aspect ratios. To make the best use of these features, we have to maintain a fixed aspect ratio for all patches of a given component within a single database. Ratios of 1:4, 1:1, and 1:2 were chosen for the eye pair, the nose, and the mouth, respectively (Figure 9).
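Enforcing a fixed aspect ratio amounts to resampling every cropped component to a fixed target size before HOG extraction. The sketch below uses nearest-neighbor resampling for simplicity; the concrete pixel sizes are our illustrative choices that merely respect the 1:4, 1:1, and 1:2 height-to-width ratios above, not the paper's values:

```python
# Illustrative fixed (height, width) targets per component, chosen to
# match the 1:4 (eye pair), 1:1 (nose), and 1:2 (mouth) aspect ratios.
TARGET_SIZE = {"eye_pair": (32, 128), "nose": (64, 64), "mouth": (48, 96)}

def resize_nearest(patch, out_h, out_w):
    """patch: 2D list of pixels. Returns an out_h x out_w nearest-neighbor resample."""
    in_h, in_w = len(patch), len(patch[0])
    return [[patch[y * in_h // out_h][x * in_w // out_w]
             for x in range(out_w)]
            for y in range(out_h)]
```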

The Validation Process.
The purpose of this process is to determine which model performs best for a given database and to calculate its priority: the better the score of a particular component, the higher its priority. This technique uses the validation results to assign a weight to each component; the higher the weight assigned to a component, the heavier its impact on the final classification result. The process is demonstrated in Figure 10, and the results for the three databases are shown in the following subsections.
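The weighted voting step itself is simple enough to sketch. In this hedged illustration (our reading of the scheme, with made-up data structures), each component model casts a vote for its predicted identity, weighted by that component's validation accuracy, and the identity with the largest total wins:

```python
from collections import defaultdict

# Weighted voting sketch: per-component predictions are fused using
# per-component weights derived from the validation process.

def weighted_vote(predictions, weights):
    """predictions: {component: predicted label};
    weights: {component: validation-derived weight}.
    Returns the label with the largest accumulated weight."""
    totals = defaultdict(float)
    for component, label in predictions.items():
        totals[label] += weights[component]
    return max(totals, key=totals.get)
```

This makes the behavior described in the paper concrete: a strong component (e.g., the eye pair) can override one weak component, but two agreeing weaker components can still outvote it.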

4.3.1. The PUT Database Recognition Results.
Using our validation process, Table 2 shows the priority of each component for the PUT database. Combining these priorities with the voting technique reached a 100% recognition success rate for k = 5 (Table 3).

4.3.2. The AT&T Database Recognition Results.
Table 4 shows the priority of each component for the AT&T database. The voting recognition success rate reached 96% for k = 5 (Table 5).

4.3.3. The AR Database Recognition Results.
Table 6 shows the priority of each component for the AR database. The voting criteria improved the recognition success rate from 73% to 87% for k = 2 and from 84% to 94% for k = 5 (Table 7).

Summary of Results.
Three public databases were used: the AT&T database with 40 subjects and 400 images, the PUT database with 50 subjects and 1100 images, and the AR database with 50 subjects and 1300 images.
Our method has the following advantages:

Conclusion
Enhancing the recognition capabilities of facial component-based techniques was the objective of this paper. This was done using the concepts of better Viola–Jones component detection and the weighting of facial components. Each component was given a certain weight using a validation process, and a voting technique was used to incorporate all of these weights.
The component-weighted technique provides the opportunity to involve multiple features in the success rate, giving a particular feature's strength the chance to suppress another feature's weakness. The improvement of the weighted voting method is demonstrated for the databases that we have used: the voting technique boosted the recognition success rate.
The boost in the success rate comes from distributing the weight importance among the facial components rather than settling for one major facial component.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Figure 10: The priority voting process.

Figure 7: Example of the HOG feature vector.

Figure 9: Component resizing process for the PUT database.
The PUT database consists of 50 people; each has 22 colored facial images with different poses and different illumination conditions.
The AT&T database consists of images of 40 persons; each person has ten different facial images. The AR database consists of 50 persons; each person has 26 different colored facial images. Table 1 shows the different random training sets (k-folds). For example, for the PUT database with k = 2, we took 11 of the 22 images for training and 11 for testing. Images with a missing component substitute that component with components detected within their learning/testing set, as shown in Figure 8.

Table 1: K-folds and their corresponding testing and learning counts (L = learn images, T = test images).

Table 2: Validation recognition success rates for each component of the PUT database.

Table 3: Recognition success rates improvement using our approach (PUT database).

Table 4: Validation recognition success rates for each component of the AT&T database.

Table 5: Recognition success rates improvement using our approach (AT&T database).

Table 6: Validation recognition success rates for each component of the AR database.

Table 7: Recognition success rates improvement using our approach (AR database).