EEG-Based Personality Prediction Using Fast Fourier Transform and DeepLSTM Model

In this paper, a deep long short term memory (DeepLSTM) network to classify personality traits using the electroencephalogram (EEG) signals is implemented. For this research, the Myers–Briggs Type Indicator (MBTI) model for predicting personality is used. There are four groups in MBTI, and each group consists of two traits versus each other; i.e., out of these two traits, every individual will have one personality trait in them. We have collected EEG data using a single NeuroSky MindWave Mobile 2 dry electrode unit. For data collection, 40 Hindi and English video clips were included in a standard database. All clips provoke various emotions, and data collection is focused on these emotions, as the clips include targeted, inductive scenes of personality. Fifty participants engaged in this research and willingly agreed to provide brain signals. We compared the performance of our deep learning DeepLSTM model with other state-of-the-art-based machine learning classifiers such as artificial neural network (ANN), K-nearest neighbors (KNN), LibSVM, and hybrid genetic programming (HGP). The analysis shows that, for the 10-fold partitioning method, the DeepLSTM model surpasses the other state-of-the-art models and offers a maximum classification accuracy of 96.94%. The proposed DeepLSTM model was also applied to the publicly available ASCERTAIN EEG dataset and showed an improvement over the state-of-the-art methods.


Introduction
Personality has been described through different theories, but the core of personality is a function of individual behavioral differences and experiences shaped by an individual's development, such as his/her emotions, social relationships, and life experiences [1]. Personality represents the action style of a person in daily life.
There are many theories and personality measurements, but personality trait measurements have gained the widest acceptance in the scientific community and play an irreplaceable role [2].
There are various ways in which personality prediction can be made. Personality can be identified by filling out questionnaires, also known as self-reported personality assessment.
The five-factor personality test [3] and the MBTI personality test [4,5] are examples. Personality prediction can also be made using social media data such as Twitter [6] and Facebook [7], but that is not always accurate because the data can be fake [8][9][10]. Personality prediction using physiological signals has recently received a lot of interest [11]. Physiological signals allow researchers to better understand the participant's reactions during the experiment. Recognizing personality from physiological signals [12][13][14] is reported to be more accurate than recognizing it from digital footprints [15,16] because this approach achieves a higher classification accuracy.
Among the physiological signals, electroencephalogram (EEG) signals have grown in prominence in recent years and have achieved a higher classification accuracy [17,18]. The electrical activity produced by neurons in the brain is recorded using EEG, which has been widely utilised to study functional changes in the brain [19,20]. The frequency of EEG signals varies from 0.5 Hz to 100 Hz and is grouped into five bands: delta, theta, alpha, beta, and gamma, each covering a different frequency range [21,22]. A band of 0.5 Hz to 50 Hz is used in this paper.
The main contributions of this paper are as follows:
(i) A new EEG dataset is created for personality prediction using the NeuroSky MindWave Mobile 2 device
(ii) This study proposes a DeepLSTM model for the prediction of personality traits
The remainder of the paper is structured in the following manner. Section 2 provides background details. Section 3 is devoted to the materials and methods used in this study. Section 4 discusses the proposed personality framework. Section 5 provides the experimental results. Section 6 discusses the comparison of the proposed DeepLSTM model with the other state-of-the-art methods. Section 7 presents the conclusion.

Background
This section explains the FFT used for feature extraction, which is discussed in detail next.

Fast Fourier Transform.
The first step in the successful classification [23,24] of personality traits is to extract important EEG signal features. A popular method for analyzing EEG data is decomposing the signal into various frequency bands, as shown in Figure 1, including delta (0.5 to 4 Hz), theta (4 to 8 Hz), alpha (8 to 12 Hz), beta (12 to 30 Hz), and gamma (30 to 100 Hz). The MindWave uses the onboard ThinkGear ASIC Module (TGAM1) chip, with algorithms that reduce background noise and artifacts. The TGAM1 chip has an algorithm for decomposing the signal with the fast Fourier transform (FFT), and it provides the resulting values to the application program through the device. Each second, data are gathered and processed in the time domain to identify and correct, as far as possible, the artifacts and background noise in the original signal, without relying on NeuroSky's proprietary algorithms. The headset also reports the meditation and attention features measured by NeuroSky's eSense technology.
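The band decomposition described above can be illustrated with a minimal sketch. It is not the TGAM1 algorithm, which is not described in this paper; it only shows how FFT power can be summed per band using NumPy. The function name `band_powers`, the mean removal step, and the synthetic 10 Hz test signal are assumptions made for illustration; the 512 Hz sampling rate and the band edges are taken from the text.

```python
import numpy as np

# Band edges in Hz, as listed in the text above.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 30.0),
    "gamma": (30.0, 100.0),
}


def band_powers(signal, fs=512.0):
    """Return the summed FFT power in each EEG band for a 1-D signal.

    Hypothetical helper for illustration; fs is the sampling rate in Hz.
    """
    signal = np.asarray(signal, dtype=float)
    # Remove the mean so the DC component does not leak into the bands.
    signal = signal - signal.mean()
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    power = np.abs(spectrum) ** 2
    # Each band includes its lower edge and excludes its upper edge.
    return {
        name: float(power[(freqs >= lo) & (freqs < hi)].sum())
        for name, (lo, hi) in BANDS.items()
    }


# Synthetic one-second signal (not real EEG): a pure 10 Hz sine wave.
t = np.arange(512) / 512.0
demo = band_powers(np.sin(2 * np.pi * 10.0 * t))
dominant = max(demo, key=demo.get)
```

For this synthetic input, nearly all of the power falls in the alpha band, since 10 Hz lies between 8 and 12 Hz.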

Materials and Methods
This section gives details about the pool of participants, the device used for experimentation, the datasets used for experimentation, and lastly the experimental procedure.

Pool of Participants.
This study initially involved 55 participants. However, five samples were removed from the final assessment due to hardware errors or inappropriate EEG signal artifacts. Therefore, 50 representative participants aged 18 to 46 years (25 males and 25 females) took part in the study. Forty participants are right-handed and ten are left-handed, each with normal vision. Participants were not allowed to take tobacco or caffeine for 24 hours before the experiment.

Device Description.
The NeuroSky MindWave Mobile 2 device, shown in Figure 2, is used to capture brain signals. This brainwave-reading EEG headset is simple to use and inexpensive. It outputs 12-bit raw brainwaves (3-100 Hz) at a 512 Hz sampling rate and produces EEG power spectra at various frequency and morphology bands. It is paired using a static headset ID. To capture the dataset, the eegId application is used, which has a built-in FFT feature extraction technique, and ten features are extracted.

Proposed EEG Dataset.
In the literature, visual content is regarded as a reliable means of eliciting affect or emotion [25]. We created and consolidated a set of 40 clips from movies and series for this analysis, which served as elicitation materials for the data collected from subjects. The content of these movie clips includes audio and video elements, giving subjects an immersive experience. English-language and Indian-language (Hindi) film samples with a length of about 2 to 4 minutes were chosen for the process. Each clip in the elicitation material includes content that evokes emotions and personality traits, with characters exhibiting a particular personality trait.
All of the chosen movie clips are thought to generate and activate the characteristic emotions of the desired personality trait. Table 1 presents a selection of the stimulus clips used to evoke a particular personality trait for EEG data acquisition. The order and selection of clips were randomized to ensure effectiveness.

Publicly Available Dataset.
This research also uses the publicly available EEG personality dataset known as the ASCERTAIN dataset [26]. The ASCERTAIN dataset uses the big-five factor (BFF) model for personality prediction using EEG signals, which were collected in laboratory settings with a single-channel EEG device. The recorded information includes frontal lobe activity, level of facial activation, eye-blink rate, and strength. It contains EEG recordings of 58 participants, and 36 movie clips were used. These clips are between 51 and 127 s long. All subjects were fluent in English and were regular watchers of Hollywood films. The film clips (nine clips per quadrant) are distributed uniformly over the valence-arousal (VA) space. Different sensors were used to record physiological signals while the clips were being watched. After watching each clip, each participant was asked to rate valence and arousal on a 7-point scale to represent his or her affective experience. Personality along the big-five dimensions was also assessed using a five-dimension questionnaire.

Proposed Personality Prediction Framework Using EEG Signals and DeepLSTM Model
Figure 3 shows the entire framework for personality prediction using EEG signals and the DeepLSTM model. The proposed framework consists of two parts. The first is the data collection for personality prediction, and the second is the DeepLSTM model for classification of personality traits. Both are described next.

Data Collection for Personality Prediction.
Data collection is the first step in the research process. This dataset was obtained using an experimental protocol that is well established and easy to follow. The dataset was created with the support of 50 volunteers (25 men and 25 women) who were actively involved in the data collection process. Since an individual's personality trait cannot be assessed solely by their current mood or state of mind, the data were collected three times over five days [27]. At the start of the data collection process, the participant was relaxed and wore the NeuroSky MindWave Mobile 2 headset. There are four groups in the MBTI personality model, and each group consists of two opposing traits. There are therefore eight traits, and for each trait, one film clip is shown to the participant. During the training phase, the proposed procedure is iterated eight times with each participant. Before each film clip, the participants were given a 20-second starting cue to begin the test, after which they viewed the video clip of a targeted personality trait. Each participant signs a consent form, and their general information such as name, age, and gender is recorded at the initial stage for developing the dataset. After viewing a film clip of one trait, the participants had to fill in a self-evaluation form containing seven questions for that personality trait, each with the options "agree," "neutral," or "disagree." These questions are constructed by targeting the characteristics of the personality traits. They must be answered based on the participants' real feelings at that moment instead of their typical emotions or general attitude, so the answers may differ from person to person. Between clips, a 1-minute neutral clip is shown as a buffer to neutralize the participants' elicited personality traits.
After all of the questions for each of the four grouped personality traits have been answered, the questionnaire (which contains seven questions) is evaluated for each participant's traits. The labeling of the EEG signal depends on the answers given by the participant. The final label is determined by the following procedure. Suppose that the participant has watched the film clip targeting the characteristics of the extraversion trait. After watching the film clip, the participant answers the questions based on the extraversion trait. If the participant selects the "agree" option for a question, we increment the counter of the extraversion trait by one. If the participant chooses the option "disagree," we increment the counter of the introversion trait (the opposing trait of extraversion) by one. If the participant opts for the neutral option, we neither increase nor decrease the counter for any trait. Since there are seven extraversion questions, the EEG signal labeling depends on the participant's answers, and three labeling possibilities exist.
(i) The EEG signal is labeled as extraversion if "agree" is selected more often than "disagree"
(ii) The EEG signal is labeled as introversion if "disagree" is selected more often than "agree"
(iii) The EEG signal is discarded if "agree" and "disagree" are selected equally often
Similarly, for the introversion questions, the counter for the introversion trait is incremented if the participant chooses the "agree" option. If the participant chooses the "disagree" option, the counter for extraversion is incremented. If the participant opts for the neutral option, we neither increase nor decrease the counter of either trait for that question. The labeling of the EEG signal is done by following the above procedure. The remaining personality traits are marked in the same way, and their related EEG signals are labeled. The same experimental procedure is repeated after three days to collect more data and reduce bias.
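The counting rules above can be sketched in a few lines of Python. The function name `label_from_answers` and the string encoding of the answers are hypothetical and chosen only for illustration; the decision rule follows cases (i) to (iii).

```python
def label_from_answers(answers, target_trait, opposite_trait):
    """Label one EEG recording from the seven questionnaire answers.

    answers: iterable of "agree", "neutral", or "disagree".
    Returns the trait label, or None when the recording is discarded.
    """
    # "agree" counts toward the targeted trait, "disagree" toward its
    # opposing trait, and "neutral" changes neither counter.
    agree = sum(1 for a in answers if a == "agree")
    disagree = sum(1 for a in answers if a == "disagree")
    if agree > disagree:
        return target_trait
    if disagree > agree:
        return opposite_trait
    return None  # tie: the EEG signal is discarded


# Illustrative answers (made up) for a clip targeting extraversion.
example = ["agree", "agree", "neutral", "disagree", "agree", "neutral", "agree"]
example_label = label_from_answers(example, "extraversion", "introversion")
```

Here four "agree" answers outweigh one "disagree" answer, so the recording would be labeled as extraversion.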
At the end of each trait's evaluation process, the trait with the maximum counter value is taken as the label. This marking scheme is used as the reference for labeling the EEG signals. The study's testing data were obtained using just four video clips, targeting one personality trait from each group. The experiment is performed using four machine learning algorithms, ANN, KNN, LibSVM, and HGP, along with our proposed DeepLSTM classifier. The survey and review of results for the recorded EEG signal dataset using the described machine learning algorithms will provide valuable material for similar studies of personality types. These findings show that personality inference from EEG signals outperforms state-of-the-art explicit behavioral indicators in classification accuracy.

Proposed DeepLSTM Model.
Various machine learning algorithms are used in the literature for the recognition and description of personality characteristics. In this work, the DeepLSTM model is used for classifying personality traits from EEG signals. Figure 4 shows the architecture of the DeepLSTM cell network used for classifying the personality traits from EEG signals in this analysis. The DeepLSTM network has been implemented in Python 3.6 using Keras 2.0.9 with a TensorFlow 1.4.0 backend.
In the DeepLSTM architecture, there are 3 LSTM layers, with 512 memory units in the first layer, 256 memory units in the second layer, and 128 memory units in the third layer. A dropout layer with a probability value of 0.2 is also used in all proposed architectures. It is placed between existing layers and is applied to the outputs of the previous layer before they are fed to the next layer, as shown in Figure 4. Under dropout, a layer's outputs are randomly subsampled. The generalization capability of the DeepLSTM model is improved by dropout regularization [28]. Furthermore, with a dropout of 0.2, the model trains faster, overfitting is reduced, and the proposed DeepLSTM model performs better in terms of prediction. The "tanh" function is used as an activation function and generates an output of 64 units. "Softmax" is used as the activation function in the last layer, and 4 outputs are generated, representing four personality classes. The key benefit of using softmax as an activation function is that the output probabilities lie between 0 and 1. It returns the probability of each class, with the target class having the highest probability.
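To make the size of this layer stack concrete, the following sketch counts trainable parameters using the standard formulas for an LSTM layer (four gates, each with input weights, recurrent weights, and a bias) and a fully connected layer. Two points are assumptions rather than statements from the paper: the input is taken to have 10 features per time step (matching the ten extracted FFT features), and the 64-unit "tanh" stage is treated as a fully connected layer. All names are hypothetical.

```python
def lstm_params(input_dim, units):
    """Trainable parameters of a standard LSTM layer with biases."""
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Trainable parameters of a fully connected layer with biases."""
    return input_dim * units + units


N_FEATURES = 10  # assumption: ten FFT features per time step

# Layer stack from the text; dropout layers add no trainable parameters.
LAYERS = [
    ("lstm", 512),
    ("lstm", 256),
    ("lstm", 128),
    ("dense_tanh", 64),     # assumption: the 64-unit tanh stage is dense
    ("dense_softmax", 4),
]


def count_parameters(n_features=N_FEATURES, layers=LAYERS):
    """Return per-layer parameter counts and their total."""
    per_layer = []
    width = n_features
    for kind, units in layers:
        if kind == "lstm":
            n = lstm_params(width, units)
        else:
            n = dense_params(width, units)
        per_layer.append((kind, units, n))
        width = units  # the next layer receives this layer's outputs
    return per_layer, sum(n for _, _, n in per_layer)


per_layer, total = count_parameters()
```

Under these assumptions the first LSTM layer alone accounts for about half of the roughly two million parameters, which is one reason dropout between layers is useful for a dataset of this size.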
LSTM cells and dropout layers are used to learn from the EEG signals. Overfitting was minimized by restricting the co-adaptation of units through the dropout layers of our DeepLSTM architectures. For the dense output layer, the loss function of these network architectures is categorical cross-entropy, and the batch size is 40. The adaptive moment estimation (Adam) optimizer is used with a learning rate of 0.001. After loading the dataset, normalization is applied to the input features with the MinMaxScaler function. This function normalizes each feature so that every feature contributes on a comparable scale. It decreases the internal covariate shift, i.e., the change in the distribution of network activations caused by changes in network parameters during training. Normalization therefore enhances training of the proposed network by reducing the internal covariate shift. It also helps the optimization phase by preventing the weights from exploding, keeping them within a bounded range. A side benefit of normalization is that it often regularizes the model to some extent. The proposed DeepLSTM network is initialized with the parameters specified in Table 2 [29]. We test the output of our proposed DeepLSTM model, which classifies EEG signals as an output value into five personality groups, for 500 epochs with a batch size of 40. The DeepLSTM model was evaluated using the proposed EEG dataset as well as the publicly available ASCERTAIN EEG dataset. The proposed architectures and parameters were chosen based on our own experiments with nearby architectures (in terms of layers and nodes). In terms of accuracy, the proposed DeepLSTM architecture outperforms these nearby architectures.
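The min-max normalization step can be illustrated without any library. The sketch below rescales each feature column to the range 0 to 1, which is the default behavior of a min-max scaler; the function name `min_max_scale` and the sample values are hypothetical, and mapping a constant column to 0 is a convention chosen here to avoid division by zero.

```python
def min_max_scale(rows):
    """Rescale each feature (column) of a 2-D list to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column has zero range; map it to 0.0.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Made-up feature rows: two features on very different scales.
features = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = min_max_scale(features)
```

In a train-test setting, the minimum and maximum would normally be computed on the training partition only and then reused for the test partition, so that no information from the test data leaks into training.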

Experimental Results
The results of the DeepLSTM model for classifying the EEG signals, used to check our system's efficacy, are presented next. The computing environment consists of a 3.4 GHz processor with 32 GB of RAM, running Python 3.6 to implement the DeepLSTM cell architecture and the other state-of-the-art methods, i.e., ANN, KNN, LibSVM, and HGP. The parameter values of ANN, KNN, LibSVM, and HGP are the same as in [19,30].
The parameter values taken for the implementation of the DeepLSTM model are given in Table 2.
The dataset is typically split into two distinct sets, i.e., a training set and a test set. A general evaluation of our method for personality trait classification using EEG signals is carried out in this research. We have separated the dataset into different training and testing partitions to allow comparison with the existing literature. The performance assessment is conducted using 50-50, 60-40, 70-30, and 10-fold partition schemes. In the 50-50, 60-40, and 70-30 training-testing partitions, 50%, 60%, and 70% of the data, respectively, are used for training, and 50%, 40%, and 30% of the data, respectively, are used for testing. In the 10-fold cross-validation scheme, the complete dataset is partitioned into ten blocks of approximately equal size; 90% of the dataset, i.e., nine blocks, becomes the training data, and 10% of the dataset, i.e., one block, becomes the testing data. This process is repeated ten times, with a different data block being used for testing each time. The sensitivity, precision, and specificity values of our proposed model for the 50-50, 60-40, 70-30, and 10-fold partition schemes are also calculated.
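The 10-fold scheme described above can be sketched as follows. This is an illustrative, library-free version; the function name `k_fold_indices` is hypothetical, and shuffling of the samples before partitioning is omitted to keep the example deterministic.

```python
def k_fold_indices(n_samples, k=10):
    """Split sample indices into k contiguous blocks of near-equal size.

    Returns a list of (train_indices, test_indices) pairs, one per fold.
    """
    indices = list(range(n_samples))
    # Spread the remainder so that block sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


# Example with a made-up dataset size that is not a multiple of ten.
folds = k_fold_indices(103, k=10)
```

Each sample appears in exactly one test block, and every fold trains on the remaining nine blocks, so the reported accuracy can be summarized over the ten runs.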

DeepLSTM Architecture Evaluation.
This study uses a deep learning algorithm to distinguish personality traits from EEG signals. In practice, the DeepLSTM model outperforms traditional machine learning algorithms because it is capable of remembering long-term dependencies in sequential data, increasing the likelihood of correct predictions in a short period of time [31]. Table 3 presents the classification accuracy of the DeepLSTM personality prediction model on the ASCERTAIN and the proposed EEG datasets. For both datasets, the maximum, average, and minimum classification accuracy of the proposed DeepLSTM model is calculated for the 50-50, 60-40, 70-30, and 10-fold cross-validation partition schemes. The maximum classification accuracy of the proposed DeepLSTM model for the 50-50, 60-40, 70-30, and 10-fold cross-validation partition schemes on the ASCERTAIN EEG dataset is 82.48%, 88.14%, 92.86%, and 95.32%, respectively. The maximum classification accuracy of the proposed DeepLSTM model for the 50-50, 60-40, 70-30, and 10-fold cross-validation partition schemes on the proposed EEG dataset is 84.56%, 91.52%, 94.82%, and 96.94%, respectively. From the results, it can be seen that the DeepLSTM model performs well on both the ASCERTAIN and the proposed EEG datasets, and that its classification accuracy is higher on our proposed EEG dataset than on the ASCERTAIN dataset.

Discussion
This section discusses how the proposed deep learning-based DeepLSTM model performs compared to conventional machine learning algorithms.
A comparison with standard conventional classification algorithms is carried out using the same collection of features as used in the DeepLSTM-based methodology to show the advantages of incorporating deep learning into the classification of personality traits. KNN, ANN, LibSVM, and HGP are the other state-of-the-art approaches used for comparison. The classification accuracy comparison of personality traits is given in Table 4. It contains the maximum, average, and minimum accuracy for the 50-50, 60-40, 70-30, and 10-fold partition schemes.
The parameters and settings for these classifiers have all been applied in the same way to ensure that the findings and comparisons offered are unambiguous and consistent. The proposed deep learning approach achieves better results than traditional machine learning algorithms. As per the results, the DeepLSTM classifier improved classification accuracy substantially. Besides the rise in classification accuracy, the DeepLSTM classifier also retains a specificity greater than 92.86% on the ASCERTAIN dataset and 93.84% on the proposed EEG dataset, resulting in very low false prediction rates. The sensitivity of the DeepLSTM model is 94.72% on the ASCERTAIN dataset and 95.86% on the proposed EEG dataset, which is higher than that of the other state-of-the-art methods and shows that the DeepLSTM model correctly classifies the minority class samples. The precision of the DeepLSTM model is 93.48% on the ASCERTAIN dataset and 94.44% on the proposed EEG dataset, which is higher than that of the other state-of-the-art methods. The F1 score of the DeepLSTM model is 93.68% on the ASCERTAIN dataset and 94.96% on the proposed EEG dataset, which is higher than that of the other state-of-the-art methods. Table 5 shows the sensitivity, precision, specificity, and F1 score values of DeepLSTM for the 50-50, 60-40, 70-30, and 10-fold data partitioning schemes. Table 6 shows the statistical significance of the differences in results, assessed by the two-tailed Mann-Whitney test [32].
The Mann-Whitney test is used to compute the p value for the differences in classification accuracy. The outcomes do not differ significantly if the p value is greater than 0.05, and the difference is highly significant if the p value is less than 0.001. It is evident from the results in Table 6 that the solution provided by our proposed DeepLSTM model is statistically different from those of ANN, KNN, LibSVM, and HGP for the 50-50, 60-40, 70-30, and 10-fold data partitioning schemes. When the p values of these classifiers are compared against DeepLSTM, there is a significant difference in outcomes. The evaluation results suggest that the proposed DeepLSTM-based deep learning model for classifying personality traits provides accurate classification results.
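For reference, the sensitivity, specificity, precision, and F1 score discussed in this section follow from the counts of a confusion matrix. The sketch below uses the standard definitions for a single class treated as positive against all others. How the per-class values are aggregated across personality classes is not stated here, so that step is left out; the function name `binary_metrics` and the counts are made up for illustration.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard metrics from one-vs-rest confusion counts.

    tp, fp, tn, fn: true positives, false positives, true negatives,
    and false negatives for the class treated as positive.
    """
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Made-up counts, not results from this study.
metrics = binary_metrics(tp=9, fp=1, tn=7, fn=3)
```

With these illustrative counts, sensitivity is 0.75, specificity is 0.875, and precision is 0.9; a high specificity corresponds to few false positives, which is the sense in which the text links specificity to low false prediction rates.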

Conclusion
In this study, we propose an EEG signal-based personality prediction system using a DeepLSTM-based deep learning model.
A new EEG dataset was also created using 40 film clips in Hindi and English. The proposed DeepLSTM model was also applied to the publicly available EEG dataset known as ASCERTAIN. Multiple experiments have been carried out to validate our results and to compare our DeepLSTM model with existing methods. Fifty participants were involved and watched movie clips targeting eight different personality traits. This method uses the NeuroSky MindWave Mobile 2 to capture brain signals. Better sensitivity, precision, and specificity results indicate that our approach outperforms the methods in the current literature. The classification accuracy of the proposed DeepLSTM model on our proposed EEG dataset is 96.94% for the 10-fold partition scheme, which exceeds the 95.32% classification accuracy of the DeepLSTM model on the ASCERTAIN dataset.
We are currently using a single-channel device, and in the future, we will extend it to multichannel devices.

Data Availability
The data are available on request from the corresponding author.

Conflicts of Interest
The authors declare that they have no conflicts of interest.