Correlation of EEG Images and Speech Signals for Emotion Analysis

Aims: The paper investigates the correlation of EEG images and speech signals for understanding emotions. Study Design: The study focuses on recognition of emotions using EEG images and speech signals.


INTRODUCTION
Brain Computer Interface (BCI) is a subset of Human Computer Interface that involves communication between the user and the system via brain signals. BCIs are intended to enable people to operate electrical devices and applications through thought or psychological activity. Emotion recognition has become a credible research topic in the field of BCI [1], and a broad range of techniques exists for automatic recognition of diverse emotional states. Emotion recognition draws on two foundational conceptualizations of emotion. The first focuses on basic emotions, a finite set of emotional constructs such as anger, fear, sadness and happiness. The second focuses on the arousal-valence model, which characterizes emotions along continuous dimensions. A neurophysiological study aimed to identify two primary aroused emotional states related to positive and negative emotions; eye blinks were used for identification, and the experimental work employed wavelet-transform-based algorithms [2]. Another study [3] concentrates on recognizing inner emotions from EEG signals; two types of emotion-induction experiments were performed using music and sound stimuli from the International Affective Digitized Sounds (IADS) database. A Bayesian Network [4] was used for recognizing multiple emotional states from EEG; subjects were exposed to stimuli such as emotional videos while EEG signals were collected, and Receiver Operating Characteristic analysis was applied to select the input features. Continuously detected emotions were collected using valence from EEG signals while videos were shown, and the correlation between EEG and facial expressions was studied [5]. MRI [6] has also been used for lateralization of brain regions, mostly to study the positive and negative emotions evoked by music; happy and sad are the most dissimilar emotions, and the EEG and MRI results were found to differ across brain activities. A study of EEG and MEG in [7] introduces advances in signal-processing methodologies: frequency patterns are analysed to recognize emotional processes in subjects, whereas the most popular brain-imaging method, MRI, returns a sequence of static images of the brain.
Emotion recognition can also be performed using speech. Emotions allow people to express themselves beyond the verbal domain, and emotion recognition from speech has gained rising attention [8]. Speech recognition, also called Automatic Speech Recognition, is the process of converting raw audio data into a proper sequence of words with the help of different algorithms and techniques; the main objective of such research is to develop techniques, applications and systems that benefit society [9]. The research in [10] focuses on how speech is modulated when the speaker's emotion changes from neutral to another emotional state; it was observed that speech linked with anger and happiness is characterized by longer utterance duration and higher pitch and energy values. A study [11] has also been carried out on extracting features from speech to recognize emotions using classification techniques, namely the Hidden Markov Model (HMM) and Support Vector Machine (SVM). The recognition of different emotions depends on how well the EEG features can be mapped onto the chosen emotion representation; the representation used here is the two-dimensional mapping with arousal and valence axes.
One of the most widely used frameworks for emotion recognition is characterized by two dimensions, arousal and valence, as seen in Fig. 1. Positive emotions are associated with the right part of the brain and negative emotions with the left part [12].

Human Brain
The brain is divided into three major areas: the cortex, the limbic system and the brain stem. The amygdala is the emotional centre of the brain and is important for processing emotions such as happiness, sadness, anger, surprise and disgust. The prefrontal, frontal and temporal lobes regulate emotion and emotionally attuned communication; they are also involved in collecting information, thinking, processing and creating different options for responding.

Fig. 2. Human brain with lobes
Each part of the brain can further be divided into different lobes, as seen above in Fig. 2.

Electroencephalography (EEG)
Electroencephalography (EEG) is a technique that measures the electrical activity of the brain recorded from the scalp. EEG is used for diagnosing epilepsy, sleep disorders, coma and many other conditions. For collecting the data, an RMS 32-channel EEG monitoring system with 19 electrodes is used; the 19 electrodes are placed on the scalp of the subjects according to the international 10-20 system.
As shown in Fig. 3, the standard numbering system places odd-numbered electrodes on the left side of the scalp and even-numbered electrodes on the right side. Electrode locations are determined by dividing the scalp perimeters into 10% and 20% intervals; in this system 21 electrode positions are defined on the surface of the scalp [14]. EEG signals arise from the electrical activity produced by the brain and are classified into four frequency bands: Delta (< 3.5 Hz), Theta (4-7 Hz), Alpha (8-13 Hz) and Beta (14-30 Hz), shown in Fig. 4.
• Delta (δ): below 3.5 Hz. Delta brainwaves are the slowest but loudest brainwaves; they are generated in deepest meditation and dreamless sleep and are also found in infants.
• Theta (θ): 4-7 Hz. Theta corresponds to light sleep or extreme relaxation and is seen in mental states that have proven useful for hypnotherapy.
• Alpha (α): 8-13 Hz. In the alpha state the person is awake but relaxed; alpha activity has also been connected to the ability to recall memories, lessened discomfort and pain, and reductions in stress and anxiety.
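Since the later analysis refers to these bands, the sketch below illustrates how a raw EEG channel could be separated into them with standard band-pass filters. This is an illustration only, not the authors' pipeline (which relied on the RMS software and MATLAB); Python with SciPy is assumed, and the sampling rate, filter order and exact band edges are arbitrary choices.

```python
import numpy as np
from scipy.signal import butter, filtfilt

# Illustrative band edges (Hz) matching the ranges quoted above.
BANDS = {
    "delta": (0.5, 3.5),
    "theta": (4.0, 7.0),
    "alpha": (8.0, 13.0),
    "beta":  (14.0, 30.0),
}

def split_into_bands(eeg_channel, fs=256.0, order=4):
    """Band-pass filter one EEG channel into the four classical bands.

    eeg_channel : 1-D numpy array of raw samples
    fs          : sampling rate in Hz (assumed; depends on the recorder)
    """
    out = {}
    for name, (lo, hi) in BANDS.items():
        b, a = butter(order, [lo, hi], btype="bandpass", fs=fs)
        out[name] = filtfilt(b, a, eeg_channel)  # zero-phase filtering
    return out

if __name__ == "__main__":
    # Synthetic example: 10 s of noise at 256 Hz.
    x = np.random.randn(2560)
    bands = split_into_bands(x)
    print({name: band.shape for name, band in bands.items()})
```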

EEG Images
Apart from EEG signals, EEG images are also provided by the acquisition software; these images are the ones considered in this article for experimentation. The machine provides brain-mapping colour coding as per the international standard. The RMS EEG machine also provides two software packages, "Acquire" and "Analysis".
• Acquire software: This software is used to acquire the recordings of the subject. Features such as patient information, impedance check, start/stop EEG buttons and EEG recording are provided by the software.
• Analysis software: Different tools are provided by this software, such as split screen, single map, tri map, frequency map, frequency spectrum, amplitude progressive, frequency progressive and frequency table. For frequency-domain analysis the software provides 2 s of data, on which every frequency-domain tool can be used. The software also provides a tool called Amplitude Progressive, which produces 12 amplitude maps at consecutive time intervals of 7.8125 ms; this is the tool used for experimentation. For analysing the brain images, a full colour spectrum is provided, as shown in Fig. 5. The spectrum ranges from +60 µV to -60 µV; according to the literature, +60 µV indicates intense higher activity and -60 µV indicates indistinct lesser activity. In all, 16 colour shades are provided in the spectrum, and we have concentrated on the first 4 shades, as shown in Fig. 6, to analyse the emotional activity in the brain [16].

Speech Processing
Speech processing is the process of converting the raw signal into a proper sequence of words or sentences with the help of various techniques. The Computerized Speech Lab (CSL) machine was used for recording and preprocessing of the speech data; for extraction of features from the speech signals, the PRAAT freeware software is used. Parameters such as pitch, energy and intensity play an important role in expressing emotion in speech. We have focused on the following speech parameters to determine the emotions:
• Pitch: the main acoustic correlate of tone and intonation; it gives the highest peak of the wave, from which the emotional state can be recognized.
• Energy: considered the best parameter for emotion recognition.
• Intensity: used to calculate the physical energy and degree of loudness in speech [17].
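In this work these parameters are read off with the Praat and CSL tools; purely as an illustration, the same three quantities could also be computed programmatically, for example through the praat-parselmouth Python interface to Praat. The library choice, the file name and the simple averaging below are assumptions of this sketch, not part of the authors' procedure.

```python
import numpy as np
import parselmouth  # pip install praat-parselmouth

def speech_parameters(wav_path):
    """Return mean pitch (Hz), mean intensity (dB) and overall RMS energy for one utterance."""
    snd = parselmouth.Sound(wav_path)

    # Pitch track: unvoiced frames are reported as 0 Hz and excluded from the mean.
    pitch = snd.to_pitch()
    f0 = pitch.selected_array["frequency"]
    mean_pitch = f0[f0 > 0].mean() if np.any(f0 > 0) else 0.0

    # Intensity contour in dB (simple average of the contour).
    intensity = snd.to_intensity()
    mean_intensity = intensity.values.mean()

    # Overall RMS energy of the waveform.
    samples = snd.values.flatten()
    rms_energy = np.sqrt(np.mean(samples ** 2))

    return mean_pitch, mean_intensity, rms_energy

# Hypothetical usage (file name is illustrative):
# print(speech_parameters("subject01_happy.wav"))
```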
The data is acquired from both EEG and speech simultaneously from the volunteers. The acquired data is refined, and EEG images and speech signals are sorted. The active regions in the EEG images are extracted using thresholding, and Sobel edge detection is also applied to the EEG images to obtain the exact size of the active region. The speech features are extracted using the Praat software, a freeware program for the analysis and reconstruction of speech signals. The statistical correlation coefficient, i.e. Pearson's correlation coefficient, is used to calculate the correlation between the EEG images and speech signals for recognizing emotions; Fig. 7 shows the flow of experimentation.
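The flow of Fig. 7 can be condensed into a short sketch: threshold each EEG map to isolate the active (red) region, apply Sobel edge detection to outline it, take the region size, and correlate the per-sample sizes with the corresponding speech parameter using Pearson's r. The sketch below uses OpenCV, NumPy and SciPy as stand-ins for the MATLAB/SPSS tools actually used; the colour cut-offs, file names and synthetic numbers are illustrative assumptions.

```python
import cv2
import numpy as np
from scipy.stats import pearsonr

def active_region_size(image_path):
    """Pixel count of the intensely active (red-shaded) region of one EEG brain map.

    The channel-difference rule below is an assumed stand-in for selecting the
    first four shades of the RMS colour spectrum (the +60 uV end).
    """
    img = cv2.imread(image_path).astype(int)          # BGR image as integers
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    mask = ((r > 180) & (r - g > 40) & (r - b > 40)).astype(np.uint8) * 255

    # Sobel gradients of the mask trace the boundary of the active region.
    gx = cv2.Sobel(mask, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(mask, cv2.CV_64F, 0, 1, ksize=3)
    edge_map = np.hypot(gx, gy)                        # kept for inspection/plots

    return int(np.count_nonzero(mask)), edge_map

if __name__ == "__main__":
    # Synthetic stand-in data for illustration: 20 region sizes and 20 mean-pitch values.
    rng = np.random.default_rng(0)
    sizes = rng.uniform(500, 3000, size=20)
    pitches = 0.05 * sizes + rng.normal(0.0, 15.0, size=20)   # loosely related values
    r, p = pearsonr(sizes, pitches)
    print(f"Pearson r = {r:.3f}, two-tailed p = {p:.4f}")
    # With real data, `sizes` would come from active_region_size() on the 20 EEG maps
    # and `pitches` from the Praat measurements of the matching utterances.
```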

Emotional Intelligence Inventory (EII) test and acquisition protocol
The subjects were selected from the Department of CS and IT, Dr. B. A. M. University, Aurangabad. The subjects were counselled before the Emotional Intelligence Inventory (EII) test, which is conducted to analyse the emotional quotient of the volunteers. The participants went through the test, which made subject selection easier: a threshold score was set, on the basis of which the subjects were selected. For experimental purposes 10 volunteers, 5 male and 5 female, with sufficient emotional quotient were considered. Consent was also taken from the subjects, and their information is kept confidential.
The volunteers were selected from the age group of 22-26 years. The EEG recordings were acquired using the RMS 32-channel EEG machine, with 19 electrodes placed on the scalp with the help of conductive gel, and the Computerised Speech Laboratory (CSL) machine was used to record the speech signals from the volunteers. The volunteers were asked to talk about a happy and a sad emotional incident from their life, while the data was acquired simultaneously for both EEG and speech for the two emotional states.
To acquire the recordings of a subject, a detailed history of the subject is first taken in the Acquire software (name, gender, age, medical history if any, physical handicap and so on). The impedance is then checked, which shows the voltage of every electrode and whether it is placed properly. Recording then starts with the record button and is stopped with the stop button once all the tasks are completed. The Analysis software opens only files with the .eeg extension saved by the Acquire software; as described above, its Amplitude Progressive tool, which provides 12 amplitude maps at consecutive intervals of 7.8125 ms, is the tool used in this research work. MATLAB R2012a is used as the front-end software; it is a high-level language and interactive environment that includes signal-processing, image-processing and many other toolboxes. For feature extraction from the speech signals Praat is used, a flexible freeware program for the analysis and reconstruction of acoustic speech signals that offers a wide range of standard and non-standard procedures, including spectrographic analysis, articulatory synthesis and neural networks.

Statistical Package for the Social Sciences (SPSS)
The Statistical Package for the Social Sciences (SPSS) was one of the first comprehensive data-analysis software packages. As the software is user-friendly and easy to learn, it is widely used for different statistical operations; it also offers good data management.

Feature extraction using image processing techniques
Image processing involves handling an image as a two-dimensional signal. The techniques applied to the EEG images are: i) Threshold: a non-linear operation that converts a grey-scale image into a binary image, in which one of two levels is assigned to each pixel depending on whether it is below or above the specified threshold value. The red section, which is seen as active, is extracted in this way, as seen in Fig. 8 [18].

Fig. 8. Threshold applied on EEG image
ii) Sobel Edge Detection: the Sobel operator performs a 2-D spatial gradient measurement on an image and thereby emphasizes regions of high spatial frequency that correspond to edges. It is typically used to find the approximate absolute gradient magnitude at each point of an input grey-scale image, as seen in Fig. 9 [19].
The agreement between the EEG image size and the speech parameters is measured with Pearson's correlation coefficient,
r = \frac{\mathrm{Cov}_{XY}}{S_X S_Y},
where \mathrm{Cov}_{XY} is the covariance between X and Y, and S_X and S_Y are the standard deviations of X and Y. Since |\mathrm{Cov}_{XY}| is always smaller than or equal to S_X S_Y, the maximum absolute value of the correlation coefficient is bound to be 1. The sign of Pearson's r follows the sign of \mathrm{Cov}_{XY}: if \mathrm{Cov}_{XY} is negative then r is negative, and if \mathrm{Cov}_{XY} is positive then r is positive.
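As a quick numeric check of this definition (illustrative only, with synthetic data): computing the covariance-over-standard-deviations ratio by hand reproduces NumPy's built-in correlation, and its magnitude never exceeds 1.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)                        # e.g. EEG image sizes
y = 0.7 * x + rng.normal(scale=0.5, size=100)   # e.g. a loosely related speech parameter

cov_xy = np.mean((x - x.mean()) * (y - y.mean()))   # Cov_XY (population form)
r_manual = cov_xy / (x.std() * y.std())             # r = Cov_XY / (S_X * S_Y)
r_numpy = np.corrcoef(x, y)[0, 1]

print(round(r_manual, 6), round(r_numpy, 6), abs(r_manual) <= 1.0)
```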

Pitch
Pitch is an important parameter of voiced speech. Pitch values contain speaker-specific information, and pitch variation carries the intonation associated with the rhythm of speech, speaking manner, emotion and accent. Gender is one of the factors that plays a part in characterizing the vocal tract: roughly, the average pitch for females is about 200 Hz and for males about 110 Hz. Pitch variation is one indicator of emotion in the voice; states such as excitement and stress can be identified from it. Pitch variation is often correlated with loudness in speech, and happiness, fear and many other emotions in the voice are signalled by fluctuations of pitch.

Intensity
Intensity is the correlate of physical energy and the degree of loudness of a speech sound. It is measured from the amplitude fluctuations of a person's voice captured via a microphone. Intensity is expressed as power per unit area (watts per square metre), because it describes how much energy is radiated.

RMS Energy
RMS energy is regarded as the best signal parameter to separate emotion classes; it is used to measure the energy while speaking. Energy also affects the performance of the acoustic model in speech recognition. Voiced frames were determined by calculating the energy contained within certain bandwidths [17].
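A minimal sketch of a framewise RMS-energy computation of the kind described here is shown below (pure NumPy; the 25 ms frame, 10 ms hop and the voicing threshold are assumptions for illustration, not parameters taken from the paper).

```python
import numpy as np

def framewise_rms(signal, fs, frame_ms=25, hop_ms=10):
    """Short-time RMS energy of a speech signal, one value per frame."""
    frame = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame) // hop)
    rms = np.empty(n_frames)
    for i in range(n_frames):
        chunk = signal[i * hop : i * hop + frame]
        rms[i] = np.sqrt(np.mean(chunk ** 2))
    return rms

if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs) / fs                                # 1 s synthetic signal
    sig = np.sin(2 * np.pi * 200.0 * t) * (t > 0.5)       # silence, then a 200 Hz tone
    rms = framewise_rms(sig, fs)
    intensity_db = 20 * np.log10(rms + 1e-12)             # log measure akin to intensity
    voiced = rms > 0.5 * rms.max()                        # crude voiced-frame rule
    print(f"{voiced.sum()} of {len(rms)} frames flagged as voiced")
```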

RESULTS AND DISCUSSION
Expression of emotion is a multimodal activity. Therefore modalities such as facial expression, speech, gestures, tone, force feedback and bio-signals may be supportive for the development of a robust Emotion Recognition System (ERS) [22,23]. This paper attempts to investigate the correlation of EEG brain images and speech signals, which is ultimately useful for an ERS. In this attempt a database of EEG images and speech signals was acquired for 10 subjects. The duration of the recording was 15 min for each of the two (happy and sad) emotional states, and before the activity the subjects were asked to relax.
In total 20 EEG images were selected for the experiment; each of the 20 images represents 45 s of data. Table 2 characterizes the database. Table 3 presents the EEG images with their threshold and edge-detection results for the happy emotional state; the activity is larger in the right part of the brain, with the prefrontal, frontal, right temporal and central regions seen to be more prominent. Table 4 illustrates the EEG images with threshold and edge-detection results for the sad emotional state; the activity is seen in the left part of the brain, and the more dominant active regions are the prefrontal, frontal, left temporal, occipital and parietal regions. The details of images no. 3, 7 and 14 are presented in the paper. Table 5 details the EEG image size for the selected happy EEG images, and Table 6 for the selected sad EEG images.
The speech-signal database was acquired along with the EEG recordings, so the speech recordings were also 15 min long. The signal is divided into 20 speech samples corresponding to the selected EEG images. The parameters considered for extracting emotion from speech are pitch, intensity and RMS energy. Similar results are observed in [24], where it is reported that in the happy state the pitch and intensity are higher than in the sad state. Table 7 gives the speech values for representative pitch, intensity and RMS energy for the selected happy images; it is observed that the pitch value is higher compared to the intensity and RMS energy values.
Table 8 presents the speech values for representative pitch, intensity and RMS energy for the selected sad emotional states.
The comparative correlations of EEG image size with happy pitch and happy intensity, and of EEG image size with sad pitch and sad intensity, are given in Tables 9 and 10 respectively. Correlations marked (*) are significant at the 0.05 level (2-tailed) and those marked (**) at the 0.01 level (2-tailed). The correlation of the EEG images with happy pitch and intensity and with sad pitch and intensity falls in the category of strong positive correlation, as defined in Table 1. This result indicates that speech signals and EEG images can be considered powerful candidate modalities for designing a robust Emotion Recognition System.

CONCLUSION
The study aimed at emotion recognition utilizing EEG images and speech signals, which is a new approach in Brain Computer Interface. The correlation of EEG images and speech signals for recognizing emotions was studied, and the following was observed: 1) In the happy mental state the activity is seen in the right hemisphere, with the prefrontal, frontal, temporal and occipital regions active; in the sad mental state the activity is seen in the left hemisphere, with the prefrontal, frontal, temporal and occipital regions active.
2) The active region sizes are more prominent in the happy than in the sad mental state.
3) The happy pitch is higher than the sad pitch, the happy intensity exceeds the sad intensity, and the happy RMS energy is also higher than the sad RMS energy. 4) The correlation between the EEG images and the speech signals is found to be moderate to strong, and its significance is reflected in p-values in the range .001 to .081 for the happy state and .000 to .069 for the sad state. 5) The significance of the correlation coefficients is about 95% for the said emotional states. 6) The results can be utilized in building an Emotion Recognition System (ERS); this research is also relevant to domains such as forensic science, psychology and many applications of Brain Computer Interface. 7) The present database covers 10 subjects; if the number of subjects is increased, the recognition accuracy can also be expected to increase.

CONSENT AND ETHICAL APPROVAL
The subjects were selected from the Department of Computer Science and Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad. The subjects were counselled before the Emotional Intelligence Inventory (EII) test. Consent was also taken.