Emotion Recognition: the need for a complete analysis of the phenomenon of expression formation

This article shows how complex emotions are, as demonstrated by an analysis of the changes that occur on the face. The authors present the problem of image analysis for the purpose of identifying emotions. In addition, they point out the importance of recording the development of emotions on the human face with high-speed cameras, which allows microexpressions to be detected. The work prepared for this article was based on analyzing the parallax pair correlation coefficients for specific faces. The authors propose dividing the facial image into 8 characteristic segments. With this approach, it was confirmed that the pace of expression at different moments of an emotion, and the maximum change characteristic of a particular emotion, differ for each part of the face.


Introduction
The problem of recognizing emotion based on image analysis is known worldwide. Specialists from different fields have created many algorithms that allow such analyses; examples are described in [1], [2]. Unfortunately, none of these tools is reliable. Standard algorithms distinguish only a few basic emotions, whereas there are dozens of emotional states for which no algorithm for remote recognition exists. To create a reliable tool for remote sensing of emotions, an understanding of how the human organism works is essential. A major problem is that people can consciously control their facial expressions [3]. Therefore, the authors consider it important to observe the whole phenomenon of emotion, from its beginning (the exciting stimulus) to its end, with high accuracy, which translates into a high registration speed. In addition, the changes occurring on the face should be analyzed without generalizing to a single image showing the expression. The presented approach makes it possible to record all facial changes, including very short-lived ones, the so-called microexpressions. People observing the face without additional tools do not notice them, although the brain is able to understand them even without focusing attention. This is because of the time for which a microexpression is visible on the face: usually hundredths of a second. Research on microexpressions has advanced at a significant pace in recent years. Recognized microexpressions are believed to be the key to knowing the true feelings of, for example, people who want to hide their emotions (criminals, terrorists) or people with mental illness (e.g. schizophrenia). Research conducted by scientists from various fields shows promising results in detecting "hidden emotions" [4][5][6]. Deepening knowledge on this subject can help to create a reliable tool for exploring what a person is trying to hide but unconsciously shows on his or her face.

Acquisition and processing of data
The authors developed an experiment based on stimulating emotions (anger or joy, chosen at random). The stimulus that triggered the emotional state was a randomly selected image from the Karolinska Directed Emotional Faces (KDEF) set [7].
Because all face images within a movie must have the same size, the created algorithm cuts, from the original images, images with the same dimensions as the image assumed (Ai) for the first frame. However, the most important aspect was that the face should be centered in every image. This was achieved by detecting, on all images, a Bi image representing only the face; the matrix dimensions of these Bi images were not always the same. Therefore, a dependency was defined that made it possible to cut images satisfying this condition.
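The centered cut described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' Matlab implementation: the function name `crop_centered`, the `(top, left, height, width)` box layout and the clamping rule at the image border are all hypothetical, and images are plain nested lists to keep the sketch dependency-free.

```python
def crop_centered(image, box, size):
    """Cut a window of fixed size centred on a detected face box.

    image: 2-D list of grey levels (rows of pixels).
    box:   (top, left, height, width) of the detected face (assumed layout).
    size:  (rows, cols) of the window, taken from the first frame.
    Assumes the image is at least as large as the requested window.
    """
    rows, cols = size
    top, left, height, width = box
    centre_r = top + height // 2
    centre_c = left + width // 2
    # Clamp the window origin so the cut never leaves the image.
    r0 = max(0, min(centre_r - rows // 2, len(image) - rows))
    c0 = max(0, min(centre_c - cols // 2, len(image[0]) - cols))
    return [row[c0:c0 + cols] for row in image[r0:r0 + rows]]


# Toy 6x6 "frame" whose pixel value encodes its position (row * 10 + column).
frame = [[r * 10 + c for c in range(6)] for r in range(6)]
face = crop_centered(frame, box=(1, 2, 3, 3), size=(4, 4))
```

Every cut has the same dimensions regardless of the size of the detected face box, which is the property the correlation step relies on.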
The next stage was eye and mouth detection (Fig. 4 and Fig. 5). This made it possible to create a framework for dividing the frame into eight segments. In this case, the images of the eyes and mouth had to be the same size as their counterparts in every frame. Therefore, for all Ci images, an identical division based on the C1 image was proposed, with these parts of the face detected using existing Matlab functions.

Results
The calculations produced single-row matrices containing the values of the correlation coefficient r' (for example $R'_{S1} = [0.994\ 0.991\ 0.982\ \ldots\ 0.976]$, where S1 denotes segment no. 1), graphs showing their changes (Fig. 7), and the r'' coefficients for each segment of the test face. The graphs represent the change in the correlation coefficient over successive images. In some cases, growth is noted (Fig. 9). This article does not undertake a thorough analysis of the r'' data. The results are promising and will be used in future research.
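As a rough illustration of how such single-row series could be produced for one segment, the sketch below computes the Pearson correlation coefficient between frames. It rests on two assumptions that the text does not state outright: that r' compares every frame with the first frame of the segment, and that r'' compares adjacent frames (the latter is suggested by the caption of Fig. 9). The function names are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    # Undefined (division by zero) if either image is perfectly uniform.
    return cov / sqrt(var_a * var_b)


def flatten(image):
    """Turn a 2-D list of pixels into one flat list."""
    return [p for row in image for p in row]


def correlation_series(frames):
    """Return (r_first, r_adjacent) for a stack of same-size segment images.

    r_first[k]    correlates frame 0 with frame k + 1 (assumed meaning of r').
    r_adjacent[k] correlates frame k with frame k + 1 (assumed meaning of r'').
    """
    flat = [flatten(f) for f in frames]
    r_first = [pearson(flat[0], f) for f in flat[1:]]
    r_adjacent = [pearson(a, b) for a, b in zip(flat, flat[1:])]
    return r_first, r_adjacent
```

Both series have one entry per pair of images, so a film of N frames yields N - 1 values per segment.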

Detection of eye blinks
In all cases, a sudden increase or decrease in the graph of the correlation coefficient r'' for a given segment represents a rapid change in that area. For example, the spikes on the graphs generated by the algorithm for segment 4 (representing the eyes of the examined person) show a blink of an eye.
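One simple way to flag such sudden changes automatically is to look for large jumps between consecutive values of the series. The sketch below is an assumption for illustration only: the article does not describe an automatic detector, and both the function name `sudden_changes` and the jump size of 0.05 are hypothetical.

```python
def sudden_changes(series, jump=0.05):
    """Return the indices at which the series jumps by more than `jump`.

    `jump` is an illustrative value, not a threshold taken from the article.
    """
    return [
        i
        for i in range(1, len(series))
        if abs(series[i] - series[i - 1]) > jump
    ]


# Toy series for the eye segment: a brief dip, as a blink might produce.
r_eye = [0.99, 0.99, 0.98, 0.80, 0.97, 0.98]
blink_indices = sudden_changes(r_eye)
```

A blink shows up as two flagged indices close together: one for the sudden drop and one for the recovery.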

Comparison of reaching the threshold of decrease in correlation coefficient
Analysing all the graphs for one test person shows that the correlation coefficient starts to decrease at a different point for each segment. It follows that not all parts of the face react to the stimulus at the same time or with the same intensity.
At "first glance", another person's joy is noticeable from the smile that is characteristic of this emotion. The same is true of anger, which we recognize in other people from a specific look. The analyses show, however, that it is not right to focus solely on one part of the face. In some cases, before a smile representing joy appears on the face, changes first occur in other areas. To demonstrate this, the following calculations were made:

Conclusions
The obtained results confirm the assumption that emotional expressions are very complex and require a very detailed and accurate approach to analyses aimed at the remote identification of emotions. Given that not all parts of the face react at the same time or with the same intensity (as has been shown), it is not correct to analyse emotions from individual photos. A single image cannot serve as data for a reliable diagnosis of emotion. To obtain information about a person's emotional state from image analysis, observation must begin before the emotion changes; only then is there a complete data set that allows an accurate and thorough analysis of the expression and tracking of the changes that take place over time.
It is important to use tools that track changes with an accuracy of the order of thousandths of a second in order to get relevant information about people's emotions. This makes it possible to catch microexpressions lasting a few milliseconds that are not controlled by the person being tested, and it is these that reflect true feelings. The collected material gave another insight into the changes occurring on the face. The analysis of a single movie of several thousand frames, lasting a few seconds, was very time consuming. Therefore, when planning accurate analyses of emotional states using high-speed cameras, the time needed to process the material must be taken into account. This is of great importance in planning further research on the material obtained.
Using the KDEF stimuli [7], 70 people were examined (30 cases were analyzed for this article). A high-speed camera was used to register the video material. The recording was made at a rate of 1000 frames per second, in grey scale and in high resolution. Each film contained frames representing the face during four phases: the neutral state, the stimulation of emotion, the expression during the emotion, and the fading of the expression. The resulting videos occupied approximately 6 GB of data on average. High-speed cameras are rarely used in studies of emotion analysis; analyses are usually made on single images or on a stack of images obtained with a standard-speed camera. In [8], attention was drawn to the need for microexpression analysis at a non-standard registration speed. The authors of that work used rapid recording cameras at 100 fps and 200 fps. Study data at a similar recording speed were collected by the authors of [9]. It can be said that more accurate measurements and analyses require a non-standard registration rate. This has also been confirmed by the authors of this publication in [10-12]. The initial process of collecting the research material and preparing it for further analysis is shown in Fig. 1. Each film was divided into single frames in .tiff format, stored at 8-bit depth as an Ai image, where i is the frame number in the film. Fig. 2 shows an example of a film split into individual frames. It was important to normalize the entire research material by stretching the histogram, which makes it possible to obtain data of uniform quality. Images prepared in this way could be subjected to further analysis.
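The histogram stretching mentioned above can be illustrated with a minimal linear contrast stretch for an 8-bit grey-scale frame. This is a sketch under assumptions: the article does not give the exact normalization formula, so the simple min-max rescaling and the function name `stretch_histogram` are illustrative choices only.

```python
def stretch_histogram(image, out_min=0, out_max=255):
    """Linearly rescale grey levels so they span [out_min, out_max]."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # Uniform image: there is no contrast to stretch, so return a copy.
        return [list(row) for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [
        [round(out_min + (pixel - lo) * scale) for pixel in row]
        for row in image
    ]


# Toy 2x2 frame using only the grey levels 50..200.
frame = [[50, 100], [150, 200]]
stretched = stretch_histogram(frame)
```

After stretching, the darkest pixel of every frame maps to 0 and the brightest to 255, so frames recorded under slightly different exposure become comparable.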

Fig. 1. Registration and preparation of data.

The next step was to develop an algorithm; Fig. 3 shows the flow chart. A completely new approach was adopted, which combines known methods: face detection, mouth detection, eye detection [13] and digital image correlation. The initial idea is to divide the face image into specific segments. To achieve this, the image representing only the face (Ci images) must first be cut out of the stack of images representing the face with the background (Ai images). All of these images for the same movie should have the same size, because the correlation coefficient can only be determined for images of the same size (resolution). Treating each image as a matrix, the following condition must be fulfilled within a single movie: $C_1$ and every $C_i$ must have the same number of rows and the same number of columns.

Fig. 2. The result of splitting a film into single frames, for an example film consisting of 7336 frames.

Fig. 3. Description of the procedure used in the algorithm.

Fig. 9. An example of the increase and decrease of the correlation coefficient between adjacent pairs of images for Segment 1 and Segment 3 (emotion: joy).

Taking into account only the obtained graphs, simple relationships can be noticed. The beginning and the end of a change of emotional state are visualized (Fig. 8) by a decrease in the coefficients r' for each segment. The graphs showing the change of the correlation coefficient r'' are different: at the moment of emotion, a drop in the coefficient is not always noticeable, and in some cases growth is noted (Fig. 9).
The threshold value was calculated as

$$\max(R'_{Sn}) - \bigl(\max(R'_{Sn}) - \min(R'_{Sn})\bigr) \cdot 0.25 \qquad (3)$$

from the correlation coefficients stored in the matrix $R'_{Sn}$. In the next step, a command was used to determine the number of pairs of images for which the values were lower than the threshold values. The results are presented in Tables 1, 2 and 3 for example persons. In this way, it is possible to compare how quickly the parts of the face responded to a given stimulus.
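The threshold calculation of equation (3) can be sketched for one segment as follows. Two points are assumptions rather than details confirmed by the text: the fraction is exposed as a parameter (only the value 0.25 appears in the surviving equation), and "the number of pairs of images" is read here as the 1-based index of the first pair whose coefficient falls below the threshold. The function names are hypothetical.

```python
def threshold(r, fraction=0.25):
    """Equation (3): max(r) - (max(r) - min(r)) * fraction."""
    return max(r) - (max(r) - min(r)) * fraction


def first_pair_below(r, fraction=0.25):
    """1-based index of the first pair whose coefficient drops below the
    threshold, or None if no value does (assumed reading of the text)."""
    limit = threshold(r, fraction)
    for index, value in enumerate(r, start=1):
        if value < limit:
            return index
    return None


# Toy correlation series for two segments of the same face.
segments = {
    "S1": [1.00, 0.98, 0.95, 0.80, 0.60, 0.70],
    "S3": [1.00, 0.99, 0.99, 0.97, 0.85, 0.60],
}
onsets = {name: first_pair_below(r) for name, r in segments.items()}
```

Comparing the resulting indices across segments shows which part of the face crossed its threshold first, in the spirit of the comparison reported in Tables 1, 2 and 3.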

Achieving the maximum for the segment
At different moments of an emotion, the rate of change and the maximum change that presents the emotion are different for each part of the face. In addition, the following values were selected for each test person: the pairs of images representing $R'_{Sn}$. This dependency is shown for several people in Table 4. The values in the table confirm that the maximum facial changes do not occur at the same time across the entire face. Therefore, analyzing a single photo in order to recognize emotions is unreasonable, because it does not give full information about the expression.

Table 4. Obtained values $R'_{Sn}$.