A Face Recognition Algorithm Based on LBP-EHMM

To address the problem that real-time face recognition is susceptible to illumination changes, this paper proposes a face recognition method that combines Local Binary Patterns (LBP) with the Embedded Hidden Markov Model (EHMM). The method first performs LBP preprocessing on the input face image, then extracts the feature vectors, and finally sends the extracted feature observation vectors to the EHMM for training or recognition. Experiments on multiple face databases show that the proposed algorithm is robust to illumination changes and improves the recognition rate.


Introduction
Face recognition is a hot topic in the fields of pattern recognition and artificial intelligence. Traditional methods mostly rely on template matching built on feature extraction methods such as principal component analysis and linear discriminant analysis. Principal component analysis (PCA) uses the K-L transform to extract the principal components of the face and form the eigenface space for processing [Martínez and Kak (2004); Turk and Pentland (1991)]. Linear discriminant analysis (LDA) projects the original data into a transformed space by maximizing an objective function and separates the data in this new space [Howland and Park (2004)]. Both describe the face image from a global perspective and extract its global features well, but they are easily affected by illumination and position. After years of development, face recognition technology has made considerable progress and achieves a high recognition rate under ideal conditions. However, because of the complexity of face recognition and the many influences of external conditions, several problems remain poorly solved in real-time recognition, such as illumination changes, expression changes, and occlusion, among which illumination changes cause the most serious interference. Compared with the traditional methods, the embedded hidden Markov model (EHMM) can efficiently represent dynamic time-series signals, and the structure of the face is consistent with the state sequence of the Markov chain in the EHMM [Dong (2009)]. Since the feature observation vectors extracted by the discrete cosine transform (DCT) in the EHMM are relatively sensitive to illumination changes, LBP is used as image preprocessing to reduce their influence [Viola and Jones (2001); Wei (2010); Xiao, Qiang, Song et al. (2012)]. The experimental results show that the proposed method reduces the influence of illumination changes while improving the face recognition rate.
Ojala and Pietikainen [Ojala and Pietikainen (1996)] proposed representing the local texture of an image by the relationship between the gray values of local pixels. The basic idea of the LBP operator is to take the gray value of the central pixel as a threshold and compare the gray value of each pixel in its circular neighborhood with this threshold, generating a binary pattern that represents the local texture of the region [Ojala, Pietikainen and Maenpaa (2002); Xia, Yuan, Lv et al. (2019); Xia, Ma, Shen et al. (2018)]. For any LBP operator, its formula is:

$$LBP_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\,2^p, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

In the formula, $g_c$ is the gray value of the center pixel and $g_p$ ($p = 0, \ldots, P-1$) are the gray values of the P pixels on a circle of radius R around the center pixel. For different (P, R) values, there are different LBP operators; Fig. 1 shows three of them.

Figure 1: LBP operators with (P, R) = (8, 1), (8, 2) and (16, 2)

The $LBP_{P,R}$ operator has gray-scale invariance: as long as the illumination does not change the order of the local pixel values, the value obtained by the $LBP_{P,R}$ operator stays the same. Fig. 2 shows the computation of $LBP_{8,1}$. Fig. 3 compares an image before and after LBP processing.
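The thresholding idea behind the operator can be illustrated with a minimal Python sketch of the (P, R) = (8, 1) case. It is not the implementation used in the experiments: the function name `lbp_8_1` is hypothetical, the eight neighbours are taken from the 3×3 square around each pixel instead of being interpolated on a circle, and the bit order of the neighbours is an arbitrary convention.

```python
import numpy as np

def lbp_8_1(image):
    """Basic LBP code (P=8, R=1) for every interior pixel of a 2-D gray image."""
    img = np.asarray(image, dtype=np.int64)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets (dy, dx); neighbour p contributes the bit 2**p.
    # The starting neighbour and the direction are an assumed convention.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for p, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # s(g_p - g_c) = 1 when the neighbour is at least as bright as the center.
        codes += (neighbour >= center).astype(np.int64) << p
    return codes.astype(np.uint8)

patch = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
code = lbp_8_1(patch)                # LBP code of the single interior pixel
brighter = lbp_8_1(2 * patch + 10)   # same patch after a monotonic gray-level change
print(code, np.array_equal(code, brighter))
```

Because the second patch only rescales and shifts the gray levels, the order of the pixel values is unchanged and the two codes coincide, which is the gray-scale invariance discussed above.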

Hidden Markov Model (HMM)
An HMM is a doubly stochastic process that describes time-varying signals by means of probability and statistics. One process is a Markov chain, that is, a Markov process with discrete states and discrete time; this is the basic stochastic process. The other process describes the statistical correspondence between the states and the observed values. A first-order discrete hidden Markov process is generally characterized by four elements: ① N, the number of hidden states; ② M, the total number of distinct observation symbols; ③ A, the state transition probability distribution, or transition matrix; ④ B, the observation probability matrix, also called the emission matrix. In shorthand notation, the HMM can be written in the three-parameter form λ = (A, B, π), where π is the initial state distribution. Samaria first applied the HMM to face recognition [Samaria (1995)] and proposed an HMM face model. It is based on the fact that a frontal face image contains six distinct feature areas: hair, forehead, eyes, nose, mouth and chin. Even if the head is deflected or tilted, their order from top to bottom remains constant [Qiang (2008)]. Therefore, the six feature regions can be abstracted into six states, and the observation sequence is generated by these six states. The states are abstract, have no specific meaning, and can only be estimated from the observation sequence. The state structure diagram is shown in Fig. 3.
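To make the notation λ = (A, B, π) concrete, the following Python sketch evaluates the probability of an observation sequence with the standard forward algorithm. The three-state top-to-bottom model and all of its numbers are invented for illustration only; they are not parameters from this paper.

```python
import numpy as np

def forward_likelihood(A, B, pi, observations):
    """P(O | lambda) for a discrete HMM lambda = (A, B, pi), via the forward algorithm."""
    alpha = pi * B[:, observations[0]]   # alpha_1(i) = pi_i * b_i(o_1)
    for o in observations[1:]:
        alpha = (alpha @ A) * B[:, o]    # alpha_{t+1}(j) = sum_i alpha_t(i) a_ij * b_j(o)
    return float(alpha.sum())

# Illustrative left-to-right model: a state either stays or moves to the next one,
# mirroring the fixed top-to-bottom order of the facial regions.
A = np.array([[0.6, 0.4, 0.0],
              [0.0, 0.7, 0.3],
              [0.0, 0.0, 1.0]])
B = np.array([[0.9, 0.1],    # N = 3 states, M = 2 observation symbols
              [0.2, 0.8],
              [0.5, 0.5]])
pi = np.array([1.0, 0.0, 0.0])

print(forward_likelihood(A, B, pi, [0, 0, 1, 1]))
```

The zero entries below the diagonal of A are what encode the constraint that the regions are always visited in the same order.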

EHMM
The one-dimensional HMM is well suited to one-dimensional signal processing, so it has achieved great breakthroughs in the field of speech. An image, however, is a two-dimensional signal. Although it can be unfolded into one-dimensional data, as in the eigenface method, the unfolded data loses spatial information, and the subsequent computation is heavy and unsuitable for real-time applications. Therefore, in a real-time face recognition system, the standard one-dimensional HMM can hardly meet the requirements, and a two-dimensional model is needed to model the image with high efficiency and high accuracy. The EHMM was first used for face recognition by Nefian and Hayes [Nefian and Hayes (1999)] and achieved good results. The EHMM is based on a one-dimensional HMM and is obtained by nesting a new HMM in each state of that one-dimensional HMM. Generally, a state of the outer HMM is called a superstate, and a state of the inner embedded HMM is called an embedded state. The model can be represented by an extended triplet $\lambda = (\Pi_0, A_0, \Lambda)$. The relevant parameters are briefly introduced as follows:

1) $\Pi_0 = \{\pi_{0,i}\}$, $1 \le i \le N_0$: the initial distribution of the superstates, where $\pi_{0,i}$ is the probability of superstate i and $N_0$ is the number of superstates.
2) $A_0$: the transition probability matrix between superstates.
3) $\Lambda = \{\Lambda^{(1)}, \ldots, \Lambda^{(N_0)}\}$: the set of embedded HMMs, one for each superstate.
4) $\Lambda^{(k)} = \{\Pi_1^{(k)}, A_1^{(k)}, B^{(k)}\}$: the parameters of the kth embedded HMM, comprising its initial state distribution $\Pi_1^{(k)}$, its state transition probability matrix $A_1^{(k)}$, and its observation probability matrix $B^{(k)} = \{b_i^{(k)}(O_{t_0,t_1})\}$, where $O_{t_0,t_1}$ is the observation vector located in row $t_0$ and column $t_1$.

The state structure of the EHMM in the face recognition system is shown in Fig. 4. The six states in the vertical direction (the hair, the eyes, and so on) are superstates, and the groups of HMM states embedded in the horizontal direction are embedded states. The EHMM reflects the two-dimensional structure of the face, and state transitions in the horizontal direction are confined to the inside of a superstate, so the model is simpler than a general two-dimensional HMM and is better suited to describing and recognizing faces.

The EHMM is trained as follows:
1) The DCT parameters are initialized, the face image is sampled, and the DCT of each sampling window is computed to obtain a DCT coefficient matrix.
2) A general EHMM $\lambda = (\Pi_0, A_0, \Lambda)$ is established, the number of model states and the number of possible observations for each state are determined, and the DCT coefficient vectors are input to the EHMM as the observation sequence.
3) The training data is divided evenly among the N states, and the initial parameters of the model are computed.
4) The one-dimensional Viterbi algorithm is used to adjust the segmentation of each input sequence by the embedded HMM of the kth superstate so that the output probability $P(O \mid \Lambda^{(k)})$ is maximized; this probability is then taken as the observation probability of the corresponding superstate in the outer HMM. After all observation sequences have been processed, the Viterbi algorithm is run on the outer HMM so that the output probability of the whole model is maximized.
5) The Baum-Welch algorithm is used to re-estimate the model parameters iteratively until the model converges; the EHMM is then saved and training ends.

When training is over, an EHMM has been built for the face database. The training process is shown in Fig. 5.

Figure 5: EHMM training process
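Step 4 of the training procedure relies on the one-dimensional Viterbi algorithm as its building block. The Python sketch below shows only that building block, in the log domain, for a single left-to-right HMM; the nesting of embedded HMMs inside superstates is not shown. The function name `viterbi` and the array layout (`log_B[t, j]` is the log-probability of observation t under state j) are assumptions made for this illustration.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B):
    """Most likely state path and its log-probability for one observation sequence.

    log_pi: (N,) initial log-probabilities
    log_A:  (N, N) transition log-probabilities, log_A[i, j] for i -> j
    log_B:  (T, N) log-probability of observation t under state j
    """
    T, N = log_B.shape
    delta = log_pi + log_B[0]
    backptr = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A        # scores[i, j]: best path ending in i, then i -> j
        backptr[t] = scores.argmax(axis=0)     # best predecessor of each state j
        delta = scores.max(axis=0) + log_B[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):              # trace the best predecessors backwards
        path.append(int(backptr[t, path[-1]]))
    return path[::-1], float(delta.max())
```

The returned path assigns each observation to a state, which is the segmentation that step 4 adjusts; the returned log-probability is the quantity that would be passed up to the outer HMM as an observation score.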

LBP-EHMM-based face recognition algorithm
Real-time face recognition based on LBP-EHMM can be divided into four stages: face detection, LBP preprocessing, DCT feature extraction, and EHMM training or recognition.

The Adaboost algorithm
First, real-time, accurate face detection is performed on the input image to segment the face region. In each round of training, the Adaboost algorithm updates the sample weights according to whether each sample was classified correctly and to the overall accuracy of the previous classification. The re-weighted data set is passed to the next classifier for training, and the classifiers obtained in all rounds are finally cascaded into a strong decision classifier. The algorithm thus works by changing the data distribution: the final strong classifier is built by cascading different weak classifiers to perform face detection. Using this classifier discards unnecessary training data features and focuses on the key training data, which greatly improves the speed and accuracy of face detection.
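The weight update described above can be sketched in Python as a single round of the standard discrete AdaBoost rule. This is an assumption about the general form of the update, not the exact variant used by the face detector in this paper; the function name `adaboost_round` and the ±1 label encoding are chosen for illustration.

```python
import numpy as np

def adaboost_round(weights, labels, predictions):
    """One discrete AdaBoost round; labels and predictions take values in {-1, +1}.

    Assumes the weighted error lies strictly between 0 and 1.
    """
    miss = predictions != labels
    error = float(np.sum(weights[miss]))           # weighted error of the weak classifier
    alpha = 0.5 * np.log((1.0 - error) / error)    # vote of this weak classifier
    # Misclassified samples are scaled up, correctly classified ones scaled down.
    new_weights = weights * np.exp(-alpha * labels * predictions)
    return new_weights / new_weights.sum(), float(alpha), error

weights = np.full(5, 0.2)                          # uniform initial distribution
labels = np.array([1, 1, -1, -1, 1])
predictions = np.array([1, 1, -1, -1, -1])         # the last sample is misclassified
new_weights, alpha, error = adaboost_round(weights, labels, predictions)
print(new_weights, alpha, error)
```

After the update the misclassified sample carries half of the total weight, so the next weak classifier is pushed to concentrate on it. This is the sense in which the algorithm works by changing the data distribution.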

LBP preprocessing
Next, LBP preprocessing is performed to reduce the influence of illumination changes. When the LBP operator is used to extract facial texture features, the statistical histogram of the patterns is usually used to describe the image information. However, too many pattern types may make the histogram too sparse, so the number of distinct patterns must be reduced effectively, so that the smallest amount of data represents the image information as well as possible.
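One common way to reduce the number of pattern types is the "uniform pattern" mapping of Ojala et al., in which every 8-bit code with at most two circular 0/1 transitions keeps its own histogram bin and all other codes share a single bin. The text does not state which reduction is used here, so the Python sketch below is an assumption offered for illustration; the function names are hypothetical.

```python
import numpy as np

def transitions(code, bits=8):
    """Number of 0/1 changes when the bits of `code` are read as a circular string."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))   # rotate right by one bit
    return bin(code ^ rotated).count("1")

def uniform_lookup(bits=8):
    """Map each LBP code to a bin index; all non-uniform codes share the last bin."""
    table = np.full(2 ** bits, -1, dtype=np.int64)
    next_bin = 0
    for code in range(2 ** bits):
        if transitions(code, bits) <= 2:
            table[code] = next_bin
            next_bin += 1
    table[table == -1] = next_bin
    return table, next_bin + 1

def uniform_histogram(lbp_codes, table, n_bins):
    """Histogram of an LBP-coded image over the reduced set of bins."""
    codes = np.asarray(lbp_codes, dtype=np.int64).ravel()
    return np.bincount(table[codes], minlength=n_bins)

table, n_bins = uniform_lookup()
print(n_bins)   # number of bins after the reduction
```

For P = 8 this shrinks the histogram from 256 bins to 59, which makes it far less sparse for small face regions.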

DCT feature extraction
Next, an appropriate window and moving step are set to perform DCT feature extraction on the face region. The DCT is an orthogonal transform that is widely regarded as one of the best transforms for speech and image signal processing. The two-dimensional DCT of an image separates its high-frequency information from its low-frequency information. After the transform, most of the energy is concentrated in the coefficients in the upper left corner. These coefficients carry the low-frequency information of the face image, which represents its overall features, so selecting them as features achieves dimension reduction. The features extracted by DCT focus on the overall attributes of the face and often lose detail features, which frequently contain important facial information. The LBP method, based on local texture features, extracts the local details of the face better and thus makes up for this deficiency of DCT extraction.
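The windowed extraction can be sketched in Python as follows. The 16×16 window and step of 4 follow the settings reported for the experiments, while keeping the top-left 3×3 block of coefficients (`keep=3`), the 64×64 face size in the usage line, and the function names are assumptions made for this illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def dct_features(face, window=16, step=4, keep=3):
    """Slide a window over the face and keep the low-frequency DCT coefficients.

    Returns an array of shape (rows, cols, keep * keep): one observation
    vector per window position.
    """
    face = np.asarray(face, dtype=np.float64)
    C = dct_matrix(window)
    rows = []
    for y in range(0, face.shape[0] - window + 1, step):
        row = []
        for x in range(0, face.shape[1] - window + 1, step):
            block = face[y:y + window, x:x + window]
            coeffs = C @ block @ C.T                   # separable 2-D DCT of the window
            row.append(coeffs[:keep, :keep].ravel())   # top-left = low frequencies
        rows.append(row)
    return np.array(rows)

features = dct_features(np.random.default_rng(0).random((64, 64)))
print(features.shape)
```

Each row of window positions yields one horizontal observation sequence, and the rows taken from top to bottom form the vertical sequence, which matches the two-level structure of the EHMM.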

EHMM training
Finally, the set of feature observation vectors is sent to the EHMM for modeling, and new face images are identified by the EHMM. Each group of face images yields one EHMM. For a face image to be recognized, a set of feature observation vectors is obtained in the same way; the probability of the observations under each EHMM is then computed, and the model with the maximum probability gives the recognition result [Shen, Yuan, Shen et al. (2018)].

Fig. 6

The self-made face database contains 20 people with 10 images per person, 200 images in total. The first 5 images of each person, 100 images in total under normal lighting, are used for training, and 100 images with illumination processing are used for recognition. Fig. 7 shows part of the self-made face database. The sampling window size is 16×16 and the moving step is 4×4. The experimental results are shown in Tab. 1.

Comparison and analysis of experimental results
Comparing the experiments on the ORL and YALE face databases shows that the recognition rates of the two algorithms are basically the same on each database. Because the illumination of the face images changes little in the ORL database and relatively strongly in the YALE database, the recognition rate on the YALE database is lower. To examine the decline in recognition rate caused by illumination changes, a face database was created in which some of the face images were given illumination processing. The experimental results show that on the self-made face database the recognition rate of LBP-EHMM is clearly higher than that of EHMM. The reason is that some face images in the self-made database are very dark, and the images misidentified by EHMM are basically these dark ones. Since the feature observation vectors extracted by DCT in the EHMM are relatively sensitive to illumination changes, LBP preprocessing is performed before feature extraction to reduce their influence. Taken together, the comparison experiments show that the proposed method is robust to illumination changes while maintaining a high recognition rate.

Conclusion
This paper proposes a real-time face recognition method based on LBP-EHMM. The method first performs face detection on the acquired image, then applies LBP preprocessing to the detected face region to weaken the influence of illumination, then extracts DCT feature coefficients, and finally sends them to the EHMM for training and recognition. The EHMM models dynamic sequences well, while LBP is simple to compute and has gray-scale invariance, which lets it describe local texture more effectively. The proposed method combines the advantages of both. The experimental results show that its recognition performance is better than that of EHMM alone, especially when the image is dark. In practical applications, with the rapid development of modern computer technology, even more complex algorithms run very fast, so the LBP-EHMM method is the more advantageous choice.