Face and Emotion Recognition using Deep Learning Based on Computer Vision Methods

Deep learning is among the disciplines that are growing and developing most rapidly today. Especially since the concept of big data entered our lives, deep learning methods have been used to process such data. This study aims to detect the faces in a picture selected by the user and to perform emotion analysis and gender determination on each detected face with deep learning methods. The Viola-Jones algorithm is used for face detection. The "Mini_Xception" Convolutional Neural Network (CNN) model is used for emotion analysis and gender determination. Prediction rates were measured in 18 different experiments. The highest emotion recognition rate measured was 93.11% and the highest gender recognition rate was 90.75%. The experiments in the study are supported by visual examples.


Introduction
Computer vision is widely used today in face recognition systems and object classification. Face recognition is the automatic identification or verification of people in data obtained from images or videos. Face identification has four basic stages: face detection, normalization, feature extraction and classification. No matter how successful the normalization and classification algorithms are, a system cannot achieve the desired performance if the feature extraction stage is unsuccessful [1]. Recent studies test the combinations commonly used for both model training and evaluation [2]. Using a real-time convolutional neural network architecture, facial recognition, gender classification and emotion classification are performed simultaneously in a single step [3] [4] [5]. Applications that estimate age from face images have been developed, and data sets of face pictures have been created for these applications [6] [7] [8]. The processing of human faces consists of face recognition, face tracking and face creation stages. Parallel implementations of the Viola-Jones face detection algorithm run on the GPU (Graphics Processing Unit) using CUDA and OpenCV. This increases the computation speed of the algorithm, and comparative analyses of serial and parallel implementations are made in order to obtain better computation results [9] [10]. Facial recognition is the first step in various computer vision applications, such as human-computer interaction. In one study, a powerful GPU-based face detection application built on the Viola-Jones face detection algorithm was developed. It moves the computation to the GPU using NVIDIA CUDA (Compute Unified Device Architecture) to improve the computing speed of face detection algorithms [11] [12].

Viola-Jones Algorithm
The Viola-Jones algorithm is a real-time detection algorithm in computer vision. For practical applications, it should process at least two frames per second. Distinguishing faces from non-faces in images is the first step of the detection and recognition process. The steps of the Viola-Jones algorithm are the computation of Haar-like features, the creation of the integral image, classifier training with AdaBoost and the construction of a cascade of classifiers [13] [14].
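The integral image step can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows, and the function names (integral_image, rect_sum, two_rect_feature) are hypothetical helpers chosen for this illustration, not part of the study's implementation. The sign convention of the feature (left half minus right half) is also an arbitrary choice.

```python
def integral_image(image):
    """Return a summed-area table with one extra row and column of zeros.

    table[y][x] holds the sum of all pixels above and to the left of (y, x).
    """
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, top, left, height, width):
    """Sum of the pixels in a rectangle using only four table lookups."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


def two_rect_feature(table, top, left, height, width):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = width // 2
    left_sum = rect_sum(table, top, left, height, half)
    right_sum = rect_sum(table, top, left + half, height, half)
    return left_sum - right_sum


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
table = integral_image(image)
print(rect_sum(table, 0, 0, 3, 4))          # 78, the sum of every pixel
print(two_rect_feature(table, 0, 0, 3, 4))  # -12, i.e. 33 - 45
```

Because every rectangle sum costs four lookups regardless of its size, a large number of Haar-like features can be evaluated per window, which is what makes the cascade fast enough for real-time use.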

Convolutional Neural Networks
The input layer of the CNN structure is the layer in which the data for the model to be trained is entered and converted to the required structure. The convolution layer applies a specialized linear operation. Basically, CNNs are networks that use convolution rather than general matrix multiplication in at least one layer.
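The convolution operation and the effect of padding on the output size can be illustrated with a minimal sketch. The source's own equation is not reproduced in this text, so the form used here is an assumption: the cross-correlation form S(i, j) = sum over m, n of I(i + m, j + n) K(m, n), which most deep learning libraries implement under the name convolution. The helper names conv2d_valid and zero_pad are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding.

    Computes S(i, j) = sum over m, n of I(i + m, j + n) * K(m, n)
    (assumed form; the output is smaller than the input).
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = ih - kh + 1, iw - kw + 1
    output = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            output[i][j] = sum(
                image[i + m][j + n] * kernel[m][n]
                for m in range(kh)
                for n in range(kw)
            )
    return output


def zero_pad(image, pad):
    """Surround the image with `pad` rows and columns of zeros."""
    width = len(image[0]) + 2 * pad
    top = [[0] * width for _ in range(pad)]
    middle = [[0] * pad + list(row) + [0] * pad for row in image]
    bottom = [[0] * width for _ in range(pad)]
    return top + middle + bottom


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

valid = conv2d_valid(image, kernel)              # 2x2 output
same = conv2d_valid(zero_pad(image, 1), kernel)  # 4x4 output, same as input
print(len(valid), len(valid[0]))
print(len(same), len(same[0]))
```

With a 3x3 filter, one pixel of zero padding on each side is enough to keep the output the same size as the input. Larger filters need proportionally more padding.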
In Equation (1), the terms "i" and "j" indicate the position in the new matrix obtained after the convolution process. "S" is the output value, "I" is the input value and "K" is the filter value. The input matrix and the filter matrix need not be the same size. In that case, the output matrix is smaller than the input matrix. If the output matrix must be the same size as the input matrix, padding (adding pixels around the border of the input) is applied [15]. The values produced in the output layer are passed to the system, and the results are then ready for use [3] [15].
The data set contains 20000 face images. To match the input structure used in the study, the pictures are converted to 48x48 pixels in grayscale. The images in the UTKFace data set differ in facial expression, resolution and illumination. The flow chart of the system is shown in Figure 2.
The system is tested with images from the UTKFace data set. Valid results from the experiments on the data sets are given in Table 2 and invalid results in Table 4. The results for the pictures in Table 1, which are given to the application as input data, are explained in detail in Table 2. For example, when picture number one in Table 1 is given as input, the system predicts the emotional state as "happy" with a rate of 75.94%. In the gender analysis, the result is "male" with a prediction rate of 82.48%. When picture number two in Table 3 is given as input, the rate of the predicted emotional state is 39.65%. In the gender analysis, the result is "female" with a prediction rate of 59.44%. The system produces no output for pictures 4, 5 and 6 in Table 3.
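One plausible reading of the reported labels and rates is sketched below. The source does not state how the rates are computed, so this sketch assumes that the rate is the softmax probability of the most likely class, expressed as a percentage. The label list and the function names are hypothetical, and an empty result stands for a picture in which no face was detected.

```python
import math

# Hypothetical label order; the source does not list the classes of its model.
EMOTION_LABELS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(scores, labels):
    """Return the most probable label and its probability as a percentage."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], round(100.0 * probs[best], 2)


def describe_faces(per_face_scores, labels=EMOTION_LABELS):
    """One (label, rate) pair per detected face; an empty list means no output."""
    return [top_prediction(scores, labels) for scores in per_face_scores]


# Made-up scores for a single detected face, for illustration only.
print(describe_faces([[0.1, 0.0, 0.2, 2.5, 0.3, 0.4, 1.0]]))
# No detected face, so nothing is reported.
print(describe_faces([]))
```

Under this reading, a low rate such as 39.65% means that the probability mass is spread across several classes, which is consistent with treating such a prediction as less reliable.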

Results and Discussion
In this study, the faces in a picture selected by the user are found, and emotion analysis and gender determination are performed on each detected face with deep learning methods. The Viola-Jones algorithm is used for face detection, and emotion analysis and gender determination are performed using the "Mini_Xception" CNN model. To classify correctly the pictures that are currently misclassified, similar pictures should be added when the system is trained. Therefore, more distinctive pictures should be added to the data set and used to train the model. The pre-trained XML file used for face detection can be retrained with more data sets to obtain better results. The structure of the "Mini_Xception" CNN model used for emotion analysis and gender determination can be changed, a different model can be used, or the number of training steps can be increased.