Cultural Emperor Penguin Optimizer and Its Application for Face Recognition

Face recognition is an important technology with broad practical application prospects. One of the most popular classifiers for face recognition is the support vector machine (SVM). However, the selection of the penalty parameter and the kernel parameter determines the performance of SVM, and this selection is the major challenge when SVM is used to solve classification problems. In this paper, with a view to obtaining the optimal SVM model for face recognition, a new hybrid intelligent algorithm is proposed for the multiparameter optimization problem of SVM. The algorithm is a fusion of the cultural algorithm (CA) and the emperor penguin optimizer (EPO), namely, the cultural emperor penguin optimizer (CEPO). The key aim of CEPO is to enhance the exploitation capability of EPO with the help of the basic framework of the cultural algorithm. The performance of CEPO is evaluated on six well-known benchmark test functions against eight state-of-the-art algorithms. To verify the performance of CEPO-SVM, particle swarm optimization-based SVM (PSO-SVM), genetic algorithm-based SVM (GA-SVM), CA-SVM, EPO-SVM, moth-flame optimization-based SVM (MFO-SVM), grey wolf optimizer-based SVM (GWO-SVM), cultural firework algorithm-based SVM (CFA-SVM), and emperor penguin and social engineering optimizer-based SVM (EPSEO-SVM) are used in comparison experiments. The experimental results confirm that the parameters optimized by CEPO better improve the classification performance of SVM in terms of accuracy, convergence rate, stability, robustness, and run time.


Introduction
As an important branch of pattern recognition, face recognition has widespread application demands in fields such as intelligent monitoring, virtual reality, medical examination, and human-computer interaction.
The key step in face recognition is classifier design, namely, how to use the extracted features to classify new face images. Recently, many classifiers have been presented, such as the decision tree [1], neural network [2], k-nearest neighbor (KNN) [3], and support vector machine (SVM) [4], among which SVM is the most popular.
Although SVM has the advantages of good generalization ability, tolerance of small training sets, and high noise stability, it sometimes leads to overfitting problems. To address this problem, some researchers have applied the genetic algorithm (GA) [5], particle swarm optimizer (PSO) [6], and simulated annealing (SA) [7] to the parameter optimization of SVM. However, although its search process has the characteristic of exclusivity, the implementation of GA is complex, which makes its convergence speed slow. PSO and SA have been successfully applied to some engineering optimization problems, but their global search ability for parameter optimization is poor. Therefore, it is worthwhile to design a new intelligent algorithm with superior performance for the parameter optimization of SVM.
For complex optimization problems that are difficult to handle effectively with traditional methods, many new inspired algorithms have been proposed that solve problems by simulating natural rules and the social behaviours of biological populations. Recently, algorithms such as the grey wolf optimizer (GWO) [8], moth-flame optimization (MFO) [9], tunicate swarm algorithm (TSA) [10], and dice game optimizer (DGO) [11] have been widely used. In 2018, Dhiman and Kumar proposed a novel bio-inspired algorithm named the emperor penguin optimizer (EPO) [12], which is inspired by the huddling behavior of emperor penguins. However, experimental results showed that the convergence rate of EPO decreases continuously in the later stage, which affects the efficiency of optimization. Afterwards, Baliarsingh et al. proposed a hybrid algorithm combining EPO with the social engineering optimizer (SEO), namely, the emperor penguin and social engineering optimizer (EPSEO) [13], to improve the performance of EPO.
The cultural algorithm (CA) is a bio-inspired intelligent algorithm proposed by Reynolds in 1994 [14] that simulates the cultural evolution of human society. CA is a double-layer evolutionary mechanism: it continuously updates all kinds of knowledge accumulated in the belief space so as to guide the evolution of the population space and accelerate the optimization efficiency of the algorithm. Therefore, it provides a framework into which other evolutionary algorithms can be embedded so that the two cooperate and promote each other. Recently, some researchers have used PSO, the bacterial foraging algorithm (BFA), and the firework algorithm (FA) as population spaces integrated with CA [15][16][17] and verified that the performance of the hybrid algorithms is significantly improved.
Based on the above observations, and given that CA with a single population does not make full use of the knowledge in the belief space while the convergence speed of EPO decreases continuously in the later stage, we propose a hybrid metaheuristic algorithm named the cultural emperor penguin optimizer (CEPO), built on the embedded framework provided by CA. In CEPO, hybridizing two different position-evolution processes is an effective mechanism to enhance the exploitation capability of EPO.
Specifically, CEPO is evaluated on six well-known benchmark test functions against eight state-of-the-art optimization algorithms. Furthermore, CEPO is applied to the multiparameter optimization problem of SVM for face recognition. In the process of feature extraction, feature representation vectors of face images are obtained by the Gabor filter and Principal Component Analysis (PCA), which serve as input to CEPO-SVM. PCA is used to compensate for the increased dimensionality caused by the Gabor wavelet transform. During feature classification, the penalty parameter and kernel parameter of SVM are optimized by CEPO to obtain a training model with high classification accuracy. To prove the performance of CEPO-SVM, eight state-of-the-art optimization algorithm-based SVMs are used in comparison experiments in terms of accuracy, convergence rate, stability, robustness, and run time.
The main contributions of this paper are as follows:
(1) A new hybrid intelligent algorithm named CEPO, inspired by EPO and the framework of CA, is proposed.
(2) CEPO is applied to the automatic learning of the parameters of SVM to solve the classification problem for face recognition.
(3) We demonstrate that the performance of CEPO-SVM is superior in terms of accuracy, convergence rate, stability, robustness, and run time, compared with eight state-of-the-art algorithm-based SVMs.
(4) CEPO, as a multidimensional search algorithm, can not only obtain the optimal penalty parameter and kernel parameter of SVM but can also be employed in other classifiers and other real-number optimization problems.
The rest of the work is organized as follows. Section 2 describes feature extraction of face images, including the Gabor face image representation and dimensionality reduction by Principal Component Analysis. Section 3 formulates the parameter optimization of SVM for classification as an optimization problem. The design of the cultural emperor penguin optimizer for parameter optimization of SVM is depicted in Section 4. Section 5 describes the experimental setup. In Section 6, the experiments and result analysis are presented. Section 7 provides the conclusion and future works.

The Feature Extraction for Face Images
At the preprocessing stage, each image is first converted into grayscale and then adjusted to the same size. Owing to the superior robustness of the Gabor filter to changes in brightness and pose of face images, the Gabor filter is used to capture more useful features from the face images. Furthermore, to compensate for the increased dimensionality caused by the Gabor wavelet transform, PCA is applied for dimensionality reduction. After the preprocessing stage, the face images are divided into training sets and testing sets to obtain feature representation vectors, which serve as the input to SVM. Figure 1 shows the block diagram of face recognition based on Gabor-PCA.

Gabor Face Image Representation.
The two-dimensional Gabor filter can be defined as follows [18, 19]:

φ_{o,r}(z) = (‖W_{o,r}‖² / σ²) exp(−‖W_{o,r}‖² ‖z‖² / (2σ²)) [exp(i W_{o,r} · z) − exp(−σ²/2)],

where z = (x, y) defines the pixel position in the spatial domain, o defines the orientation of the Gabor filter, r defines the scale of the Gabor filter, σ represents the radius of the Gaussian function, which limits the size of the Gabor filter, and W_{o,r} is the wave vector of the filter at orientation o and scale r [18, 19]. In this paper, with five different scales, r ∈ {0, . . . , 4}, and eight orientations, o ∈ {0, . . . , 7}, the Gabor filters exhibit desirable characteristics of spatial locality, spatial frequency, and orientation selectivity. Let Y(z) represent the grey level distribution of an image; the Gabor wavelet representation of the image can then be given by

G_{o,r}(z) = Y(z) ∗ φ_{o,r}(z),

where G_{o,r}(z) defines the convolution result between the image Y(z) and the Gabor filter φ_{o,r}(z) at orientation o and scale r. To improve the computation speed, the convolution can be computed via the FFT [20].
To capture the different spatial localities, spatial frequencies, and orientation selectivities, all these representation results are concatenated to derive an augmented feature vector Q. Following [19], before the concatenation, each G_{o,r}(z) is downsampled by a factor δ at every scale and orientation. Therefore, the augmented Gabor feature vector Q^(δ) can be given by

Q^(δ) = ((G^(δ)_{0,0})^T, (G^(δ)_{0,1})^T, . . . , (G^(δ)_{4,7})^T)^T,

where T denotes the transpose operator, δ is the downsample factor, and G^(δ)_{o,r} is the column vector obtained by concatenating the downsampled G_{o,r}(z).
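As an illustration of this representation, the following sketch builds a 5-scale, 8-orientation Gabor bank and computes downsampled magnitude responses via FFT convolution. The kernel constants (k_max = π/2, σ = 2π, a 31 × 31 window, δ = 4) are common choices for Gabor face features, not values taken from this paper.

```python
import numpy as np

def gabor_kernel(size, scale, orientation, sigma=2 * np.pi, k_max=np.pi / 2):
    """Build one Gabor kernel phi_{o,r}(z) on a size x size grid.
    The constants (k_max, spacing factor sqrt(2)) are common choices,
    not taken from the paper."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # wave vector W_{o,r} at orientation o and scale r
    k = k_max / (np.sqrt(2) ** scale)
    theta = orientation * np.pi / 8
    kx, ky = k * np.cos(theta), k * np.sin(theta)
    norm2 = kx**2 + ky**2
    sq = x**2 + y**2
    envelope = (norm2 / sigma**2) * np.exp(-norm2 * sq / (2 * sigma**2))
    carrier = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma**2 / 2)
    return envelope * carrier

def gabor_features(image, delta=4):
    """Magnitude responses for 5 scales x 8 orientations via FFT convolution,
    downsampled by delta and concatenated into one augmented vector."""
    parts = []
    for r in range(5):
        for o in range(8):
            kern = gabor_kernel(31, r, o)
            # pad the kernel to image size and convolve in the frequency domain
            resp = np.fft.ifft2(np.fft.fft2(image) *
                                np.fft.fft2(kern, s=image.shape))
            parts.append(np.abs(resp)[::delta, ::delta].ravel())
    return np.concatenate(parts)

img = np.random.default_rng(0).random((32, 32))
q = gabor_features(img)
print(q.shape)  # 40 responses, each (32/4)^2 = 64 values -> (2560,)
```

Even for a tiny 32 × 32 image the augmented vector already has 2560 entries, which motivates the PCA reduction described next in the text.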

Principle Component Analysis for Feature Optimization.
The augmented Gabor feature vector introduced in equation (3) has very high dimensionality: Q^(δ) ∈ R^O, where O is the dimensionality of the vector space. Obviously, this high dimensionality seriously affects the computing speed and recognition rate in the process of face recognition. With PCA [21], we can find an orthogonal basis for the feature vector, sort the dimensions in order of importance, and discard the low-significance dimensions. Let Σ_{Q^(δ)} ∈ R^{O×O} be the covariance matrix of the augmented feature vector Q^(δ):

Σ_{Q^(δ)} = E[(Q^(δ) − E(Q^(δ)))(Q^(δ) − E(Q^(δ)))^T],

where E(·) is the expectation operator. The covariance matrix Σ_{Q^(δ)} can be factorized into the following form:

Σ_{Q^(δ)} = Ω Λ Ω^T,

where Ω ∈ R^{O×O} is an orthogonal eigenvector matrix and Λ ∈ R^{O×O} is a diagonal eigenvalue matrix whose diagonal elements are arranged in descending order (λ_1 ≥ λ_2 ≥ · · · ≥ λ_O). An important property of PCA is that it provides the optimal signal reconstruction, in the sense of minimum mean-square error, when a subset of principal components is used to represent the original signal [19]. Following this property, PCA can be applied to dimensionality reduction:

X^(δ) = S^T Q^(δ),

where S = [U_1, U_2, . . . , U_J], J < O, S ∈ R^{O×J}, and U_j is the jth column of Ω. The lower dimensional vector X^(δ) ∈ R^J captures the most expressive features of the original data Q^(δ).
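The eigen-decomposition and projection steps above can be sketched in a few lines of NumPy; the sample sizes and the choice J = 5 here are arbitrary illustrations.

```python
import numpy as np

def pca_reduce(Q, J):
    """PCA on feature vectors Q (rows = samples): eigen-decompose the
    covariance, keep the J leading eigenvectors, and project."""
    mean = Q.mean(axis=0)
    centered = Q - mean
    cov = np.cov(centered, rowvar=False)      # Sigma = E[(Q-EQ)(Q-EQ)^T]
    eigvals, eigvecs = np.linalg.eigh(cov)    # returned in ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort descending
    S = eigvecs[:, order[:J]]                 # O x J basis [U_1, ..., U_J]
    return centered @ S, S                    # X = S^T (Q - EQ), per sample

rng = np.random.default_rng(1)
Q = rng.normal(size=(100, 20))                # 100 samples, O = 20
X, S = pca_reduce(Q, J=5)
print(X.shape, S.shape)  # (100, 5) (20, 5)
```

The descending sort guarantees that the first retained component carries the largest sample variance, which is the property the minimum mean-square-error argument relies on.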

Model of Support Vector Machine.
Support vector machine (SVM) [22] is a binary classifier with strong learning and generalization ability. The key idea of SVM is to find the optimal hyperplane that accurately separates the two classes while keeping the classification margin as large as possible. Figure 4 shows the optimal hyperplane of SVM, where H indicates the hyperplane and the margin represents the gap between class H_1 and class H_2. Suppose the training data V are given by

V = {(a_h, b_h) | h = 1, 2, . . . , n},

where the a_h are the training samples, the b_h are the labels, n is the number of training samples, and each sample satisfies a_h ∈ R^w and b_h ∈ {−1, +1}. For linearly separable data, the hyperplane g(a) = 0 that separates the given data can be written as

g(a) = ω^T a + c = 0,

where ω is a w-dimensional weight vector and c is a scalar; ω and c determine the boundary. The optimal hyperplane minimizes ‖ω‖²/2 subject to

b_h(ω^T a_h + c) ≥ 1, h = 1, 2, . . . , n.

For data that cannot be separated linearly, relaxing (slack) variables ξ_h are introduced to allow optimal generalization and classification by controlling their size. Then, the optimal hyperplane can be obtained from the following optimization problem:

min_{ω, c, ξ} (1/2)‖ω‖² + C Σ_{h=1}^{n} ξ_h, subject to b_h(ω^T a_h + c) ≥ 1 − ξ_h, ξ_h ≥ 0,

where C denotes the penalty parameter. To satisfy the Karush-Kuhn-Tucker (KKT) conditions, Lagrange multipliers α_h are introduced, and the above optimization problem is converted into the dual quadratic optimization problem:

max_α Σ_{h=1}^{n} α_h − (1/2) Σ_{h=1}^{n} Σ_{l=1}^{n} α_h α_l b_h b_l a_h^T a_l,

subject to Σ_{h=1}^{n} α_h b_h = 0 and 0 ≤ α_h ≤ C. Thus, the ultimate classification function can be obtained by solving the dual optimization problem:

f(a) = sgn(Σ_{h=1}^{n} α_h b_h a_h^T a + c),

where sgn represents the symbolic (sign) function. Kernel functions can be used to map the data from the low-dimensional input space into a high-dimensional space by a nonlinear transformation. Therefore, the linear indivisibility of the input can be transformed into a linearly separable problem by this mapping.
The kernel function can be defined as

K(a_h, a_l) = φ(a_h)^T φ(a_l),

where φ(·) denotes the nonlinear mapping. Then, the optimal classification function can be given by

f(a) = sgn(Σ_{h=1}^{n} α_h b_h K(a_h, a) + c).

Several types of kernel functions, such as the Linear Kernel, the Polynomial Kernel, and the Radial Basis Function (RBF), are widely used. However, the RBF has the advantages of realizing nonlinear mapping with few parameters and few numerical difficulties. In this paper, the RBF is selected for face recognition; it can be represented as

K(a_h, a_l) = exp(−γ‖a_h − a_l‖²),

where γ denotes the kernel parameter.
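A minimal vectorized computation of the RBF kernel matrix follows; the example points and the value γ = 0.5 are arbitrary illustrations.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """K(a_h, a_l) = exp(-gamma * ||a_h - a_l||^2), computed pairwise."""
    # squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = (np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clamp tiny negatives

A = np.array([[0.0, 0.0], [1.0, 0.0]])
K = rbf_kernel(A, A, gamma=0.5)
print(K)  # diagonal is 1; off-diagonal entry is exp(-0.5)
```

Note how γ controls the kernel width: larger γ makes K fall off faster with distance, shrinking the effective neighborhood of each support vector.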

Objective Function of SVM.
With a view to obtaining a training model with high classification accuracy, the penalty parameter C, which adjusts the confidence scale, and the kernel parameter γ, which determines the width of the kernel and the range of the data points, can be optimized by an effective intelligent algorithm. The mean square error of the SVM outputs is selected as the objective function of the intelligent algorithm:

F = (1/n) Σ_{h=1}^{n} (b̂_h − b_h)²,

where b̂_h is the output value for the hth sample under the corresponding parameters and b_h is the actual value.
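The objective above can be sketched as follows; `fitness` is a hypothetical helper standing in for the full train-and-evaluate step that would produce the outputs b̂_h for a candidate (C, γ) pair.

```python
import numpy as np

def fitness(predicted, actual):
    """Mean square error between model outputs and true labels, used as the
    objective value for a candidate (C, gamma) pair."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.mean((predicted - actual) ** 2)

# labels in {-1, +1}: one of four predictions is wrong -> (2)^2 / 4 = 1.0
print(fitness([1, -1, 1, 1], [1, -1, -1, 1]))
```

With ±1 labels, each misclassified sample contributes 4 to the sum, so this MSE is monotone in the error rate and is a valid minimization target for the optimizer.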

The Design of Cultural Emperor Penguin Optimizer.
EPO is a novel optimization algorithm presented by Dhiman and Kumar in 2018 [12], inspired by the huddling behavior of emperor penguins. In this paper, a hybrid algorithm named CEPO is proposed to solve real-number optimization problems with the help of the basic framework of the cultural algorithm shown in Figure 5.
The key idea of CEPO is to extract problem-solving knowledge from the huddling behavior of the emperor penguin population and, in return, use that knowledge to guide the evolution of the population.
Suppose CEPO is designed for a general minimization problem. The belief space of the emperor penguin population in the tth generation is defined by s^t and N^t_j, where s^t is the situational knowledge component and N^t_j is the normative knowledge, which represents the value-space information of the jth parameter in the tth generation. N^t_j is denoted ⟨I^t_j, L^t_j, U^t_j⟩, with I^t_j = [l^t_j, u^t_j] the interval of normative knowledge in the jth dimension. The lower boundary l^t_j and the upper boundary u^t_j are initialized according to the value range of the variables given by the problem. L^t_j represents the objective value at the lower boundary l^t_j of the jth parameter, and U^t_j represents the objective value at the upper boundary u^t_j of the jth parameter.
The acceptance function selects the emperor penguins that directly influence the current belief space. In CEPO, the acceptance function selects the top 20% of the current population space as cultural individuals to update the belief space.
Situational knowledge s^t is updated by the update function

s^{t+1} = x^{t+1}_best if f(x^{t+1}_best) < f(s^t), otherwise s^{t+1} = s^t,

where x^{t+1}_best is the optimal position of the emperor penguin population space in the (t+1)th generation. Assume that, for the qth cultural individual, a random variable θ_q in the range [0, 1] is produced. The qth cultural individual affects the lower boundary of the normative knowledge in the jth dimension when θ_q < 0.5 is satisfied; the normative knowledge N^t_j is then updated by

l^{t+1}_j = x^{t+1}_{q,j} and L^{t+1}_j = f(x^{t+1}_q) if x^{t+1}_{q,j} ≤ l^t_j or f(x^{t+1}_q) < L^t_j, otherwise l^{t+1}_j = l^t_j and L^{t+1}_j = L^t_j.

The qth cultural individual affects the upper boundary of the normative knowledge in the jth dimension when θ_q ≥ 0.5 is satisfied:

u^{t+1}_j = x^{t+1}_{q,j} and U^{t+1}_j = f(x^{t+1}_q) if x^{t+1}_{q,j} ≥ u^t_j or f(x^{t+1}_q) < U^t_j, otherwise u^{t+1}_j = u^t_j and U^{t+1}_j = U^t_j.

Situational knowledge and normative knowledge guide the evolution of the emperor penguin population through the influence function. In CEPO, a selection operator β, computed from the current iteration and Max_iteration (the maximum number of iterations), is produced to select one of two ways of influencing the evolution of the emperor penguin population. Assume that, for the ith emperor penguin, a random variable λ_i in the range [0, 1] is produced. The first way, applied when λ_i ≤ β is satisfied, updates the position of the emperor penguin by changing the search step size and direction of the variation with the belief space. The position of the emperor penguin in the jth dimension is updated by

x^{t+1}_{i,j} = x^t_{i,j} + η · size(I^t_j) · N(0, 1),

where N(0, 1) is a random number drawn from the standard normal distribution, size(I^t_j) is the length of the adjustable interval of the jth parameter in the belief space in the tth generation, and η is set in the range [0.01, 0.6] following [14]. The other way, applied when λ_i > β is satisfied, consists of the steps of EPO: huddle boundary generation, computation of the temperature profile around the huddle, distance calculation between emperor penguins, and position update of the emperor penguins.
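The acceptance and belief-space influence steps above can be sketched as follows. The population size, the sphere objective, and η = 0.3 are illustrative assumptions, and the normative interval here is derived directly from the accepted individuals rather than maintained across generations.

```python
import numpy as np

rng = np.random.default_rng(2)

def accept(population, fitnesses, ratio=0.2):
    """Acceptance function: the top 20% of the population (lowest fitness,
    since we minimize) become the cultural individuals."""
    k = max(1, int(len(population) * ratio))
    order = np.argsort(fitnesses)
    return population[order[:k]]

def influence(position, lower, upper, eta=0.3):
    """Belief-space influence: perturb each coordinate by
    eta * size(I_j) * N(0, 1), where size(I_j) = u_j - l_j."""
    size = upper - lower
    return position + eta * size * rng.standard_normal(position.shape)

pop = rng.uniform(-5, 5, size=(10, 2))
fit = np.sum(pop**2, axis=1)                         # sphere objective
elite = accept(pop, fit)                             # 2 cultural individuals
lower, upper = elite.min(axis=0), elite.max(axis=0)  # normative interval
print(elite.shape, influence(pop[0], lower, upper).shape)
```

As the accepted individuals cluster near the optimum, size(I_j) shrinks, so the influence step automatically scales the search from exploration down to fine exploitation.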
The specific steps can be represented as follows. The temperature profile around the huddle is

T′ = T − Max_iteration / (t − Max_iteration), with T = 0 if R > 1 and T = 1 if R < 1,

where T′ represents the temperature profile around the huddle, T is the time for finding the best optimal solution, and R is a random variable in the range [0, 1].
The distance between an emperor penguin and the optimal solution is

D^t_ep = |S_ep(A^t) · x^t_best − B^t · x^t|,

where D^t_ep denotes the distance between the emperor penguin and the optimal solution, x^t_best represents the current optimal solution found in the emperor penguin population space in the tth generation, S_ep represents the social forces of the emperor penguins that drive convergence toward the optimal solution, A^t and B^t are used to avoid collisions between neighboring emperor penguins, and B^t is a random variable in the range [0, 1]. A^t can be computed as follows:

A^t = M × (T′ + P^t_grid(Accuracy)) × Rand( ) − T′, with P^t_grid(Accuracy) = |x^t_best − x^t|,

where M is the movement parameter, which maintains a gap between emperor penguins for collision avoidance, and P^t_grid(Accuracy) defines the absolute difference between emperor penguins. S_ep(A^t) in equation (28) is computed as follows:

S_ep(A^t) = (√(ρ · e^{−t/ε} − e^{−t}))²,

where e represents the base of the natural logarithm, and ε and ρ are two control parameters for better exploration and exploitation, set in the ranges [1.5, 2] and [2, 3], respectively. Ultimately, the position of the emperor penguin is updated as follows:

x^{t+1} = x^t_best − A^t · D^t_ep.

The algorithm procedure of CEPO is sketched in Algorithm 1.
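A single EPO position update can be sketched as below, assuming the standard formulation of [12]; the parameter values (M = 2, ε = 1.75, ρ = 2.5) are illustrative choices within the stated ranges.

```python
import numpy as np

rng = np.random.default_rng(3)

def epo_step(x, x_best, t, max_iter, M=2.0, eps=1.75, rho=2.5):
    """One EPO position update: temperature profile, collision-avoidance
    term A, social force S, and the move toward the current best."""
    R = rng.random()
    T = 0.0 if R > 1.0 else 1.0                # R in [0, 1], so T = 1 here
    T_prime = T - max_iter / (t - max_iter)    # temperature around the huddle
    P_grid = np.abs(x_best - x)                # absolute difference (Accuracy)
    A = M * (T_prime + P_grid) * rng.random() - T_prime
    B = rng.random()
    S = (np.sqrt(rho * np.exp(-t / eps) - np.exp(-t))) ** 2
    D = np.abs(S * x_best - B * x)             # distance to the best solution
    return x_best - A * D                      # updated position

x = np.array([3.0, -2.0])
x_new = epo_step(x, x_best=np.zeros(2), t=1, max_iter=100)
print(x_new.shape)
```

Because A shrinks as P_grid (the gap to the best solution) shrinks, the step size contracts near the optimum, which is the exploitation behavior CEPO further reinforces through the belief space.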

CEPO for Parameter Optimization of SVM.
During the iterative search, each emperor penguin adjusts its position by comparing values of the objective function, and the optimal position x_i is taken as the parameter pair (C, γ) for repeated training of the SVM and the update of each emperor penguin. Figure 6 shows a schematic diagram of the SVM based on CEPO.
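The overall search over (C, γ) can be sketched with a stand-in fitness surface; a real implementation would replace `svm_fitness` with SVM training and validation (e.g., via LIBSVM), and the simple random perturbation loop below is a highly simplified stand-in for the CEPO update rules.

```python
import numpy as np

rng = np.random.default_rng(4)

def svm_fitness(C, gamma):
    """Hypothetical stand-in for training an SVM with (C, gamma) and
    returning the classification MSE. This toy surface has its minimum
    at C = 10, gamma = 0.1."""
    return (np.log10(C) - 1) ** 2 + (np.log10(gamma) + 1) ** 2

# search (C, gamma) in log space with accept-if-better random perturbations
best = np.array([0.0, 0.0])                  # [log10(C), log10(gamma)]
best_f = svm_fitness(10 ** best[0], 10 ** best[1])
for t in range(200):
    cand = best + 0.3 * rng.standard_normal(2)
    f = svm_fitness(10 ** cand[0], 10 ** cand[1])
    if f < best_f:
        best, best_f = cand, f
print(np.round(10 ** best, 2))  # estimate of (C, gamma); should approach (10, 0.1)
```

Searching in log space matters in practice: C and γ typically span several orders of magnitude, so uniform steps in log10 explore the useful range far more evenly than linear steps.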

Input: the population of emperor penguins x_i (i = 1, 2, . . . , m)
Output: the optimal position of emperor penguin
(1) Procedure CEPO
(2) Initialize the size of the population, Max_iteration, and the parameters of EPO and CA
(3) Initialize the situational knowledge and normative knowledge of the belief space
(4) Initialize the population space of EPs
(5) Compute the fitness of each EP
(6) Sort the fitness values of the EPs
(7) Initialize the belief space with acceptance proportion 20% and AdjustCulture
(8) While (t < Max_iteration) do
(9) Calculate T and T′ using equations (9) and (10)
(10) For i ⟵ 1 to m do
(11) λ_i ⟵ Rand()
(12) For j ⟵ 1 to D do
(13) If (λ_i ≤ β), with β computed using equation (7)
(14) Update the position of the EP using equation (8)
(15) Else
(16) Compute A and B using equations (12) and (13)
(17) Compute S_ep(A) using equation (14)
(18) Compute the new position of the EP using equation (15)
(19) Compute the fitness of the current EPs
(20) Update the better position of each EP compared with the previous position
(21) End

Experimental Setup.
The experimental environment is MATLAB 2018a with LIBSVM, on a computer with an Intel Core i5 processor and 16 GB of RAM. Table 2 shows the parameter settings of the proposed CEPO and the eight competitor intelligent algorithms, i.e., MFO, GWO, PSO, GA, CA, EPO, CFA, and EPSEO. The parameter values of these algorithms are set as recommended in their original papers.

Performance of CEPO Comparison Experiment.
Six benchmark test functions are applied to verify the superiority and applicability of CEPO. Table 3 shows the details of the six benchmark test functions, including the single-peak benchmark functions Sphere (F1) and Schwefel 2.22 (F2) and the multipeak benchmark functions Rastrigin (F3), Griewank (F4), Ackley (F5), and Schaffer (F6). For each benchmark test function, each algorithm carries out thirty independent experiments. The mean and the standard deviation of the thirty independent results are shown in Table 4.
As we can see, the statistical results of CEPO on the six benchmark test functions are clearly superior to those of the other eight comparison algorithms. Under the same optimization conditions, both evaluation indexes of CEPO are better than those of the other algorithms by several or even ten orders of magnitude, which proves that CEPO has strong optimization performance. The superior mean index shows that CEPO maintains a high overall optimization level over the thirty independent experiments and implies that CEPO has better single-run optimization accuracy. The superior standard deviation index proves that CEPO has strong optimization stability and robustness. Furthermore, under the same test constraints, CEPO attains high optimization accuracy on both the single-peak functions (F1-F2) and the multipeak functions (F3-F6), which indicates that CEPO has strong local mining performance and good convergence accuracy; in particular, it still maintains a high optimization objective value on the four multipeak functions (F3-F6), which verifies its better global exploration performance and stronger ability to avoid local extrema. If CEPO takes the optimization precision of the target value as the evaluation index, the preset precision threshold can be as tight as 1.0E-12. In particular, for functions F1, F3, and F6, the threshold can even be tightened to 1.0E-25 while still ensuring the success rate of optimization, and this precision effectively meets the error tolerance of engineering problems, which further proves that CEPO has good application potential in engineering optimization problems.

Performance of CEPO-SVM Comparison Experiment.
To verify the superiority of the proposed CEPO-SVM classification model for face recognition, CEPO and the eight competitor intelligent algorithms are used to iteratively optimize the penalty parameter C and the kernel parameter γ of SVM, so as to assess the performance of CEPO. The average recognition rates of thirty independent runs are shown in Table 5. As can be seen clearly from Table 5, no rise in accuracy is seen after the 15th, 24th, 25th, 19th, 20th, 26th, 18th, 24th, and 21st iterations, respectively. For the FERET database, no rise in accuracy is seen after the 16th, 22nd, 24th, 19th, 18th, 20th, 18th, 21st, and 25th iterations using CEPO-SVM, PSO-SVM, GA-SVM, EPO-SVM, CA-SVM, MFO-SVM, GWO-SVM, CFA-SVM, and EPSEO-SVM, respectively. Obviously, CEPO-SVM converges first among the nine models, which verifies that EPO, with the help of the embedded framework provided by CA, achieves a remarkably enhanced convergence rate. Therefore, the parameters optimized by CEPO better improve the classification performance of SVM.

Stability of CEPO-SVM Comparison Experiment.
For all nine models, stability is measured by the mean and standard deviation of the penalty parameter C and the kernel parameter γ. In this experiment, each model runs thirty experiments. Tables 6-9 show the comparison of the mean and standard deviation of the parameters among the nine models on the four face databases. As we can see from Tables 6-9, the mean of the penalty parameter C in CEPO-SVM is smaller than that of the other models on all four face databases, which could reduce the possibility of overlearning in SVM. CEPO-SVM also provides the lowest standard deviation of the penalty parameter C among all nine models on all four databases, which verifies that the stability of CEPO-SVM is better. For all four databases, the mean and standard deviation of the kernel parameter γ are close among the nine models. However, it can still be seen that CEPO-SVM has the minimum standard deviation of the kernel parameter γ. Obviously, CEPO is more stable for optimizing the parameters of SVM for face recognition.

Robustness of CEPO-SVM Comparison Experiment.
To prove the robustness of CEPO-SVM, salt-and-pepper noise is added to each image of the original four databases. The generated noise databases are used for face recognition, as shown in Figure 8. From Figure 8, we can see that, as the noise level increases, the face images become more blurred. Figure 9 shows the detailed recognition rates among the different models on the four databases. As we can see from Figure 9, CEPO-SVM obtains the highest recognition rate at all four noise levels on all four databases in comparison with the other eight models. Even when the face images are seriously polluted by salt-and-pepper noise at a level of up to 15%, the recognition rate of CEPO-SVM can still attain 82.5% on the YALE database, 83.5% on the ORL database, 81.1% on the UMIST database, and 82.1% on the FERET database, far exceeding that of the other eight models. Besides, with the increase of the noise level, the recognition rate of CEPO-SVM on the four databases decreases more slowly than that of the other eight models. Obviously, the results show that CEPO-SVM not only attains good noise-free face recognition accuracy but also achieves good accuracy under noise conditions, which verifies that the robustness of CEPO-SVM is superior to that of the other eight models.

Run Time Analysis.
The average training times of thirty independent runs by the proposed CEPO-SVM and the eight competitor intelligent algorithm-based SVMs are shown in Table 10. As we can see from Table 10, for all four databases, the average training time of the proposed CEPO-SVM is greater than that of the six models based on single algorithms, i.e., MFO-SVM, GWO-SVM, PSO-SVM, GA-SVM, EPO-SVM, and CA-SVM. This may be because hybridizing two algorithms increases the training time. Although CEPO-SVM takes more time than these six single-algorithm models, it can be considered better than them because of its high classification performance. However, for all four databases, the average training time of the proposed CEPO-SVM is less than that of the two models based on hybrid algorithms, i.e., CFA-SVM and EPSEO-SVM. Obviously, CEPO-SVM performs better than these two hybrid-algorithm models.

Limitation Analysis.
Like other hybrid algorithms, although CEPO provides superior performance in face recognition, its large amount of computation means that the proposed CEPO costs more time per iteration than single algorithms. Besides, when CEPO is applied to super-dimensional optimization problems, its performance may be reduced.

Conclusions
In this paper, drawing on the respective advantages of EPO and CA, we propose a hybrid metaheuristic algorithm named CEPO, built on the embedded framework provided by CA, so that the two algorithms promote each other and overcome each other's defects. The proposed algorithm has been evaluated on six benchmark test functions against eight other state-of-the-art algorithms. Furthermore, we apply this hybrid metaheuristic algorithm to the automatic learning of the parameters of SVM to solve the classification problem for face recognition. The proposed classifier model has been tested on four commonly and internationally used face databases, in comparison with models based on eight other state-of-the-art algorithms. The experimental results show that the proposed model has better performance in terms of accuracy, convergence rate, stability, robustness, and run time.
We suggest future works in several directions. First, the proposed model has only been tested on four face databases; other new face databases should be tested to extend the current work. Second, the proposed hybrid algorithm should be applied to other classifiers for face recognition. Lastly, the proposed hybrid algorithm should be applied to other optimization problems, such as digital filter design, cognitive radio spectrum allocation, and image segmentation.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.