Application of a brain-computer interface for person authentication using EEG responses to photo stimuli

In this paper, we present a personal authentication system that identifies individuals from the distinctive electroencephalogram (EEG) signal features evoked by self-face and non-self-face photos. To achieve stable performance, the position of the self-face photograph in the serial presentation of visual stimuli (first-occurrence versus non-first-occurrence position) is taken into account. Additionally, a Fisher linear classification method and the event-related potential technique are adapted for feature analysis, yielding remarkably better outcomes than those obtained by most existing methods. Results show that EEG-based authentication of individuals via a brain-computer interface can be considered a suitable approach to biometric authentication.


Introduction
Accurate personal authentication is often required in many fields of information security and includes several biometric technologies that rely on recognition by fingerprint, face, iris, and voiceprint [1][2][3][4]. Electroencephalogram (EEG) signals from the brain are processed to generate novel biometric features, which can be used for biometric authentication. Among the internal biometric traits, brainwave signals have emerged to become prominent features. Mukherjee et al. [5] have described a new EEG-based system for robust online biomedical content authentication and designed a web-based intelligent EEG signal authentication and tamper detection system. Similarly, Pham et al. [6] have presented a new method based on brainwave signals using the advantage of rich information for personal authentication in multi-level security systems. Al-Hudhud et al. [7] have introduced a multimodal biometric system which overcomes the weaknesses of biometric systems using biometric verification techniques for operating devices.
A visual stimulus is a common experimental stimulus for EEG signal collection. Numerous research projects adopting visual stimuli have been successfully used in the diagnosis of diseases. For example, Coburn et al. [8] demonstrated use of the P2 component of visual stimulation to diagnose Alzheimer disease dementia. Kim et al. [9] studied attention deficit hyperactivity disorder (ADHD) and employed color visual stimulation to help adolescents who suffer from ADHD obtain visual function and color perception. Other researchers have also made progress in studies of Parkinson's disease [10] and diabetes through early visual pathways [11].
In the field of biometric authentication, visually stimulated EEG signals are collected for analysis of identifying features. For example, Yeom et al. [12] used the faces of human participants to extract differences between EEG features. In the study of visually evoked EEG signals, the occurrence sequence, probability, and repetitions of an experimental target stimulus are important factors affecting EEG features. A primary reason is that the occurrence sequence and probability of the target stimulus in visually evoked experiments can affect a participant's expectations and responses across experimental repetitions. Especially in long-term repeated experiments, this can both familiarize participants with the experimental environment and cause visual fatigue.
Following a previous study [13], new elements were added to reorganize the experimental procedure, as different procedures have different effects on the experimental results. Furthermore, the effects of relevant parameters in visual stimulation experiments were specifically evaluated, resulting in the redesign of two aspects of the experiments: 'random sequence' and 'aesthetic fatigue'. The term 'random sequence' refers to presenting the self-face photograph at different positions in the stimulus sequence, while 'aesthetic fatigue' refers to a subject's visual concentration on a photograph after multiple rounds of testing and verification during visual stimulus exposure.

Experimental model design
EEG data were collected from ten subjects (6 male, 4 female). Choice of sample size was determined by subject availability. During the experiment subjects sat on a soft chair without armrests in a quiet shielded room facing a computer screen and performed operations according to the experimental requirements. The Academic Ethics Committee of Jiangxi University of Technology approved the experimental study.
The stimulation program displayed different images on a computer screen. Five different pictures were displayed to each subject during an experiment (Fig. 1). Each image was randomly displayed on the screen for 1000 ms, followed by a black screen for 250 ms, to give a total duration of 1250 ms. The five pictures were presented a total of 370 times in each experiment. Each image was shown the same number of times. Of the five pictures displayed, one was an image of the subject, the others were background pictures. Each image included the head of a subject above the shoulder.
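As a rough illustration of the presentation timing described above, the following Python sketch builds a shuffled equal-probability schedule. The function name `build_schedule`, the image labels, and the fixed random seed are assumptions introduced only for illustration; the stimulation program actually used is not specified here.

```python
import random

STIMULUS_MS = 1000      # image display time stated in the text
BLANK_MS = 250          # black screen shown after each image
TOTAL_TRIALS = 370      # presentations per experiment
IMAGE_LABELS = ["self", "other_1", "other_2", "other_3", "other_4"]  # hypothetical labels


def build_schedule(labels, total_trials, seed=0):
    """Return a shuffled list in which every label appears equally often."""
    if total_trials % len(labels) != 0:
        raise ValueError("total_trials must be divisible by the number of images")
    repeats = total_trials // len(labels)
    schedule = [label for label in labels for _ in range(repeats)]
    random.Random(seed).shuffle(schedule)  # seeded only to make the sketch repeatable
    return schedule


schedule = build_schedule(IMAGE_LABELS, TOTAL_TRIALS)
trial_ms = STIMULUS_MS + BLANK_MS
total_seconds = len(schedule) * trial_ms / 1000
print(schedule.count("self"), trial_ms, total_seconds)
```

With these numbers each image appears 74 times, and one run lasts 370 × 1.25 s = 462.5 s, not counting any breaks.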
In another authentication study similar to that of Yeom et al. [12], the temporal and spatial parameters and the number of subjects were the same. Subsequently, in this study, three types of rule were employed for each experimental arrangement: (1) the five images showed familiar people of the same gender, (2) two different genders were included among the familiar images, or (3) a stranger's image was included among the five images. Subjects were asked to count the number of occurrences of their own image and of the other persons' images in the three sets of pictures. The possibility of authentication according to EEG features was then analytically determined [13]. The presentation sequences and aesthetic fatigue conditions that were then introduced are described below.
(1) Equal probability: each of the five images is presented with equal probability, so each image appears 74 times. Two rules are associated with the designated image. Under the first rule, the designated image is shown in the first place, so the subject sees the designated picture first. Under the second rule, the designated picture is shown at a place other than the first, so the subject first sees another person's picture. Throughout the experiment, each subject counted the number of occurrences of his or her own picture, from which a counting error was obtained. The experimental data were retained only when the difference between the subject-counted number and the actual number was less than five.
(2) Small probability: the designated picture appears 36 times, while the other pictures appear 83 times. This experiment is likewise carried out in two different orders. In the first order, the designated picture is presented first. In the second order, the subject sees the designated picture after another person's picture. As before, each subject counted the number of occurrences of his or her own picture, from which a counting error was obtained, and the experimental data were retained only when the difference between the subject-counted number and the actual number was less than five.
(3) Long-term experiment: To test the effect of subject fatigue, we selected 10 subjects (6 males and 4 females) and used the equal probability mode with the designated picture placed first. The experiment was performed once every three days over a total of 30 days. The experimental design is shown in Fig. 1.
A 40-channel Neuroscan amplifier with Scan 4.3 software was used to collect EEG signals. The reference electrode was placed on the right mastoid. The maximum sampling rate of our acquisition equipment is 1000 Hz, so we set the sampling frequency to 1000 Hz, which is not the default frequency of the device. We used a 200 Hz low-pass filter, a 0.05 Hz high-pass filter, and a 50 Hz notch filter.
Fig. 2. Comparison of EEG signal features before and after filtering. Curve 1 is the original signal before filtering and curve 2 is the signal after filtering. The X axis represents one experimental cycle, from 150 ms before to 1100 ms after stimulus onset, and the Y axis is the voltage difference of the brain signal. The electrode F8 is one of the electrode labels assigned by the international 10-20 system. Negative voltage is used to highlight signal peaks and troughs.
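The 50 Hz notch step in the acquisition settings above can be illustrated with a small pure-Python sketch. The filter design shown here (a second-order "biquad" notch with quality factor 30) is an assumption chosen for illustration, as are the function names and the synthetic sine inputs; the design applied by the acquisition software is not stated.

```python
import math


def notch_coefficients(notch_hz, sample_hz, q=30.0):
    """Second-order notch coefficients, normalized so that a[0] == 1."""
    w0 = 2.0 * math.pi * notch_hz / sample_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def apply_biquad(samples, b, a):
    """Run the second-order difference equation over a sequence of samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


SAMPLE_HZ = 1000  # sampling rate stated in the text
b, a = notch_coefficients(50.0, SAMPLE_HZ)
t = [n / SAMPLE_HZ for n in range(2000)]
mains = [math.sin(2 * math.pi * 50 * ti) for ti in t]  # synthetic 50 Hz interference
slow = [math.sin(2 * math.pi * 10 * ti) for ti in t]   # synthetic 10 Hz component

# Compare amplitudes over the last 200 samples, after the filter has settled.
mains_residual = max(abs(v) for v in apply_biquad(mains, b, a)[-200:])
slow_amplitude = max(abs(v) for v in apply_biquad(slow, b, a)[-200:])
print(mains_residual < 0.01, slow_amplitude > 0.95)
```

In this sketch the 50 Hz component is strongly attenuated once the filter settles, while a 10 Hz component inside the 2 Hz to 45 Hz band of interest passes almost unchanged.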

Frequency filtering technique
We investigated different frequency components of the EEG signals to show how the EEG features change under different modes. Before calculation, the EEG signals were filtered before being recorded. The frequency of the collected EEG signals ranges from 0.05 Hz to 200 Hz, while the frequency of the studied EEG signal is concentrated between 2 Hz and 45 Hz. The frequency filtering technique described by Deller et al. [14] was used; after filtering, the EEG signal features are strengthened, as shown in Fig. 2. The figure compares EEG signal features before and after filtering, taking the time at which the stimulation appears as 0 ms and selecting the EEG signal data from 150 ms before to 1100 ms after the stimulation appeared.
The negative part of the time axis does not mean that time is negative; rather, we recorded signal data for a period in advance to prepare for recording the event data. We marked the time point at which the stimulus photo appeared as the time-starting point, labeled "0". The 10% or 20% of signal data before the time-starting point, together with the signal data after the time-starting point, constituted a complete data sample. To maintain visual integrity in the EEG study, the time axis is expressed relative to the time-starting point of the event and therefore includes "negative" times.
Fig. 3. Feature selection for Fisher distances. The subject number represents the 30 electrode labels, the sample number is the time, and F represents the Fisher distance. The four rectangular block areas mark the clear differences in Fisher distance at the same electrode when subjects view their self and non-self photos. The Fisher distances for each subject looking at his/her own pictures are marked in light gray and those for others' pictures in dark gray.
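The relation between sample indices and the stimulus-relative time axis can be sketched as follows. The function name `extract_epoch` and the placeholder recording are hypothetical; only the 1000 Hz sampling rate and the window from 150 ms before to 1100 ms after onset come from the text.

```python
SAMPLE_HZ = 1000   # sampling rate stated in the text
PRE_MS = 150       # data kept before stimulus onset
POST_MS = 1100     # data kept after stimulus onset


def extract_epoch(signal, onset_index, sample_hz=SAMPLE_HZ,
                  pre_ms=PRE_MS, post_ms=POST_MS):
    """Cut one epoch around a stimulus onset and return (times_ms, samples)."""
    pre = pre_ms * sample_hz // 1000
    post = post_ms * sample_hz // 1000
    start, stop = onset_index - pre, onset_index + post
    if start < 0 or stop > len(signal):
        raise ValueError("epoch window extends beyond the recording")
    # Times are measured relative to onset, so pre-stimulus samples are negative.
    times_ms = [(i - onset_index) * 1000 / sample_hz for i in range(start, stop)]
    return times_ms, signal[start:stop]


recording = [float(i) for i in range(5000)]  # placeholder samples, not real EEG
times_ms, epoch = extract_epoch(recording, onset_index=2000)
print(len(epoch), times_ms[0], times_ms[PRE_MS], times_ms[-1])
```

At 1000 Hz the epoch holds 1250 samples, and the sample at index 150 corresponds to time 0, the moment the photo appears.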

Fisher linear discriminant method
According to the experimental design, the samples for each subject were divided into two categories: (1) stimulation by self photos and (2) stimulation by non-self photos. At the beginning of this study, we did not know whether the EEG data could be separated linearly under a Gaussian distribution, so we assumed that the EEG data are linear and normally distributed. Therefore, the two-class Fisher linear discriminant method, one of the commonly used linear analysis methods, was used to classify features. For the sample space of "stimulation by self photos", $R_S$ (with $m$ samples), and the sample space of "stimulation by non-self photos", $R_O$ (with $n$ samples), the samples in $R_S$ are expressed as $X_i = (x_1, x_2, \ldots, x_n)$ and those in $R_O$ as $Y_i = (y_1, y_2, \ldots, y_n)$. Their Fisher distance is calculated from the following quantities: $\mu_{x_i}$ and $\mu_{y_i}$ are the mean values of $x_i$ and $y_i$, respectively, $\mu$ is the mean value of all samples on the corresponding component, and $n_i$ is the total number of samples in the corresponding class.
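As a hedged illustration of how a Fisher distance can be computed from class means, the overall mean, and class sizes, the sketch below assumes a common one-dimensional form: between-class scatter divided by within-class scatter, $F = \sum_c n_c (\mu_c - \mu)^2 / \sum_c \sum_k (v_k - \mu_c)^2$. This form and the function name `fisher_distance` are assumptions; the expression used in this study may differ.

```python
from statistics import fmean


def fisher_distance(self_values, other_values):
    """Between-class scatter over within-class scatter for one component.

    Assumed form, not necessarily the expression used in the paper.
    """
    classes = [list(self_values), list(other_values)]
    mu = fmean(classes[0] + classes[1])      # mean of all samples
    between = 0.0
    within = 0.0
    for values in classes:
        mu_c = fmean(values)                 # class mean
        between += len(values) * (mu_c - mu) ** 2
        within += sum((v - mu_c) ** 2 for v in values)
    if within == 0.0:
        raise ValueError("within-class scatter is zero")
    return between / within


# Toy amplitudes at one electrode and time point (not measured data).
print(fisher_distance([1.0, 2.0, 3.0], [5.0, 6.0, 7.0]))
```

A larger value means the self and non-self responses are better separated at that electrode and time point; identical class means give a distance of zero.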

Feature selection
We applied the Fisher linear discriminant method to measure the feature distance between different EEG signals, analyzing the EEG by selecting the features with the largest Fisher distance. The distribution of the distance at every time point was calculated for all 30 electrodes to support Fisher classification decisions. Feature selection is based on time points: several time points with obvious differences are selected as the features of a subject. Fig. 3 shows the Fisher distances for each subject looking at his/her own pictures (light gray) and others' pictures (dark gray); the three dimensions of the data are time (sample number), electrode (subject number), and distance value (F). The black areas in Fig. 3 mark significant differences between the two EEG signals.
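Selecting the time points with the largest Fisher distance can be sketched as a simple ranking over an electrode-by-time map. The function name `select_features`, the value of `k`, and the toy map are assumptions for illustration only; the number of time points actually selected per subject is not stated.

```python
def select_features(fisher_map, k):
    """Return the k (electrode, sample) index pairs with the largest distance."""
    scored = [
        (value, electrode, sample)
        for electrode, row in enumerate(fisher_map)
        for sample, value in enumerate(row)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(electrode, sample) for _, electrode, sample in scored[:k]]


# Toy map: 3 electrodes x 4 time samples (the study uses 30 electrodes).
toy_map = [
    [0.1, 0.9, 0.2, 0.0],
    [0.3, 0.4, 1.5, 0.1],
    [0.0, 0.2, 0.7, 0.6],
]
print(select_features(toy_map, 3))
```

The returned index pairs identify where self and non-self responses differ most, which is the role the marked regions play in Fig. 3.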

Authentication accuracy
Following the usual description of authentication outcomes, the results for test samples are categorized into four classes: true sample classified as true (TT), true sample classified as false (TF), false sample classified as true (FT), and false sample classified as false (FF). If the number of samples is N, we define the correct ratio as TR = TT/N and the error ratio as FR = (TF + FT)/N. For the 10 subjects, we chose 500 pictures of subjects (true samples) and 200 pictures of other persons (false samples). As indicated in Fig. 4, the classification accuracy under small probability is clearly better than that under equal probability, and for a given probability mode, good classification performance is obtained when the photo of the subject is shown in the first position.
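The two ratios defined above can be computed directly from the four outcome counts. The function name `authentication_ratios` and the counts in the example are hypothetical and chosen only to make the arithmetic easy to follow; they are not results from this study.

```python
def authentication_ratios(tt, tf, ft, ff):
    """Return (TR, FR) with TR = TT/N and FR = (TF + FT)/N, as defined in the text."""
    n = tt + tf + ft + ff
    if n == 0:
        raise ValueError("at least one test sample is required")
    tr = tt / n
    fr = (tf + ft) / n
    return tr, fr


# Hypothetical counts for 20 test samples (not data from the experiment).
tr, fr = authentication_ratios(tt=8, tf=2, ft=1, ff=9)
print(tr, fr)
```

Under this definition, correctly rejected samples (FF) count toward N but not toward TR, so TR and FR need not sum to one.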
To better study the feature changes caused by aesthetic fatigue, we required that in the long-term tests each subject look at the pictures at least 10 times in the interval between two tests. The experimental mode for the long-term tests was the equal probability mode with the designated picture in the first place. Because the full experimental cycle is long, only five subjects completed the experimental tasks. According to the experiment time, we selected six time points over six weeks at equal intervals. Tracking tests were made for these 5 subjects, and the authentication results are shown in Fig. 5. The brain activity of the subjects shows an initial increase followed by a decrease, and the classification accuracy shows a significant decreasing trend. This result demonstrates that authentication accuracy does not necessarily increase along with increasing brain activity.

Visual stimulation components analysis
EEG signals reveal meaningful time-domain features, namely the peak and trough times of the superimposed signal, once the original brain electrical signals are superimposed. The original EEG signals from the different experimental modes were superimposed to display the superimposed event-related potential (ERP) components, as indicated in Fig. 6 and Fig. 7. The experiments in this study include three mode dimensions: equal/unequal probability, first/non-first place, and three backgrounds. For example, if the background pictures show familiar persons of the same gender, the designated picture is placed in the first place, and each picture occurs with equal probability, the code is 1aA. Under the equal probability condition, the comparison of EEG for the designated picture at or not at the first place is shown in Fig. 6. Fig. 6a shows the subject's EEG components when observing pictures placed at and not at the first place when the background pictures are familiar persons of the same gender. Fig. 6b shows the EEG components under the same condition as in Fig. 6a, except that the background pictures include two persons of different genders, while Fig. 6c shows the situation in which the background pictures include strangers. Fig. 6 thus shows the difference between the two situations in which designated pictures are at and not at the first place under the equal probability condition.
Fig. 6. Equal probability comparison under three modes. The brain topographies demonstrate different brain areas under three modes: (a) the mode with background pictures of familiar persons of the same gender; (b) the mode with background pictures including two persons of the opposite sex; (c) the mode with background pictures including a stranger. These three modes are described in detail in Fig. 1.
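The superposition step can be illustrated with a short sketch, assuming that "superimposing" means the conventional point-by-point averaging of time-locked epochs. The function name `average_epochs` and the toy epochs are hypothetical.

```python
def average_epochs(epochs):
    """Point-by-point mean of equally long epochs (the superimposed waveform)."""
    if not epochs:
        raise ValueError("at least one epoch is required")
    length = len(epochs[0])
    if any(len(epoch) != length for epoch in epochs):
        raise ValueError("all epochs must have the same length")
    return [sum(column) / len(epochs) for column in zip(*epochs)]


# Three toy epochs of three samples each (not real EEG).
toy_epochs = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [2.0, 2.0, 2.0],
]
print(average_epochs(toy_epochs))
```

Activity that is not time-locked to the stimulus tends to cancel in the average, which is why peaks and troughs become easier to read in the superimposed waveform.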
For all three kinds of background pictures (familiar persons of the same gender, familiar persons of different genders, and strangers), the subject response is significantly stronger when the designated picture is at the first place than in other situations. This is shown by the wave shape from 150 ms to 400 ms in Fig. 6. The difference also appears when the designated picture occurs with a small probability: the wave shape from 150 ms to 400 ms again differs between the situations in which the designated picture is at and not at the first place, and the subject response to designated pictures at the first place is stronger than to those at other places. Comparing the equal probability and unequal probability cases, the difference between subject responses to designated pictures at and not at the first place is larger under the equal probability condition than under the unequal probability condition.
Under the small probability condition, the time-domain EEG features when the designated picture is placed at or not at the first place are shown in Fig. 7. Fig. 7a shows the subject's EEG components when observing pictures placed at and not at the first place (background pictures are familiar persons of the same gender). Fig. 7b shows the EEG components under the same condition as in Fig. 7a, but the background pictures include two persons of different genders. Fig. 7c shows the situation in which background pictures include strangers.
Fig. 7. Small probability comparison under three modes. The corresponding brain topographies have the same meaning as for the three modes described in Fig. 6, but these modes were implemented under small probability.
The absolute peak-to-valley differences (PVD) of amplitude from 150 ms to 400 ms for the 10 subjects are shown in Table 1 (unit: microvolt), where the average equal probability PVD of the three modes is 6.14, while the average small probability PVD is 0.79. The average equal probability PVD is about 7.8 times the average small probability PVD, which shows that the effect of object position is greater under the equal probability mode than under the small probability mode. Visually stimulated features arise for two main reasons: (i) properties of the stimuli themselves, including the familiarity of the photos and the level of interest in them, which are the basic features for identification; and (ii) the P300 evoked by the expectation of the target, a target-induced psychological feature. The feature under the small probability mode is more significant than that under the equal probability mode, and the data presented in Table 1 support this conclusion. The results also show that the effects on features did not depend on the experiment modes. However, under the equal probability mode, the presence of unfamiliar persons in the series of pictures had a significant effect on features. Interestingly, the PVDs for males under equal probability are (7.11, 6.75, 4.80) while those for females are (6.37, 6.74, 4.92), and the PVDs for males under small probability are (0.80, 0.76, 0.72) while those for females are (0.73, 0.83, 0.93). Under equal probability, the PVDs for males in modes 1 and 2 are higher than those for females, but the opposite holds in mode 3, which implies that females are more curious about strangers in the pictures.
Under small probability, the gap between females and males shrinks, but the PVDs for females are higher than those for males in modes 2 and 3, which implies that females are more interested than males in novelty under small probabilities. Furthermore, we used five time multipliers to observe the effects of the long-term experiment on the subjects. The PVDs from 150 ms to 400 ms for the 10 subjects are listed in Table 2 (mean and standard deviation over the 10 subjects). In the header of Table 2, 1-5 correspond to the five time multipliers used as the computation standard for each subject, and the values are the absolute peak-to-valley differences (PVD). As the time multiplier increases, the PVD of each subject grows and reaches a peak when the time multiplier is 3, most likely because the attention of the subjects increases over the first several experiments. As the time multiplier increases further, however, the PVD declines clearly. The results show that the experimental duration has a marked influence on EEG data characteristics, because the subjects felt bored and averse after performing the experimental tasks for a long time. Therefore, when the visual stimulus model is applied to person authentication, the likely negative effects of the experimental period on visually stimulated EEG signals need to be considered.
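The peak-to-valley difference used throughout this section can be sketched as the gap between the largest and smallest amplitude inside the 150 ms to 400 ms window. The function name `peak_to_valley` and the toy waveform are hypothetical; the two averages in the final line are the values reported in the text.

```python
def peak_to_valley(times_ms, waveform, start_ms=150, stop_ms=400):
    """Difference between the maximum and minimum amplitude inside the window."""
    window = [v for t, v in zip(times_ms, waveform) if start_ms <= t <= stop_ms]
    if not window:
        raise ValueError("no samples fall inside the analysis window")
    return max(window) - min(window)


# Toy averaged waveform in microvolts, sampled every 100 ms (not measured data).
times_ms = [0, 100, 200, 300, 400, 500]
waveform = [0.0, 9.0, 1.5, -2.0, 3.0, -8.0]
pvd = peak_to_valley(times_ms, waveform)

# Ratio of the reported average PVDs: equal probability versus small probability.
ratio = 6.14 / 0.79
print(pvd, round(ratio, 1))
```

Only the samples at 200, 300, and 400 ms fall inside the window in this toy case, so the large excursions outside the window do not affect the result.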

Discussion
The main stimulus patterns compared in our study are visual stimuli, motor imagery, and the resting state, as used by Yeom et al. [12], Marcel et al. [15], and Miyamoto et al. [16], respectively.
Yeom et al. [12] used visual evoked potentials as the input signal source and obtained an average accuracy of 86.1% for biometric authentication, better than the 80.7% based on motor imagery in Marcel et al. [15] and the 83.9% based on the resting state in Miyamoto et al. [16]; their FR of 13.9% [12] is also lower than the 24.3% [15] and 21% [16] reported in those studies. Our results show that the classification accuracy of the visual stimuli method is 82.3%, slightly lower than that reported by Yeom et al. [12]; however, this study gives the lowest FR (11.2%). Therefore, this methodology should be suitable for biometric authentication based on visual evoked potentials.
We carried out EEG-based identity recognition with picture stimuli as the object of study in order to assess the accuracy of the results across experimental repetitions. Using the absolute peak-to-valley difference (PVD) as a feature parameter under three different experimental criteria, we conclude that responses to the designated picture vary with its position, in terms of the absolute peak-to-valley difference from 150 ms to 400 ms. The same conclusion holds for both the small probability and equal probability cases. After multiple tests on the subjects, visual fatigue appears despite the short-term enhancement of the visual effect, and the difference between subjects' responses to the designated pictures and to other pictures becomes smaller. For authentication, both the presentation sequence and aesthetic fatigue affect recognition results, which further indicates that taking them into account improves the stability of EEG signals based on visual stimulation.