Evaluation of EEG-based Complementary Features for Assessment of Visual Discomfort based on Stable Depth Perception

The investigation aimed at evaluating EEG activity during stereoscopic perception of images with different levels of visual comfort. Different levels of disparity and the number of details in stereoscopic views in some cases make it difficult to quickly find the focus point for comfortable depth perception. During our investigation, we found a tendency for differences in single sensor-based EEG signal activity at specific frequencies. A dataset of EEG signal records from 19 control subjects was collected and used for further evaluation. To support reproducible research, this dataset of EEG activity with associated subjective scores was made publicly available. During the experimental investigation, we found differences in EEG signal activity at different levels of visual comfort. In addition, the dynamics of EEG signal activity correlated with the moment of depth perception indication registered by the control subjects. The results of our investigation show that the ratio of alpha to high beta activity estimated from a single EEG sensor placed over the frontal lobe can serve as a complementary feature for the automatic detection of visually uncomfortable stereoscopic views.


Introduction
Comfortable stereoscopic perception continues to be an essential area of research. The growing interest in virtual reality content and the increasing market for head-mounted displays (HMDs) still raise issues of balancing depth perception and comfortable viewing. Stereoscopic views stimulate binocular cues, one of several available human visual depth cues, which become conflicting cues when stereoscopic displays are used [1]. Depth perception by binocular cues is based on matching image features from one retina with corresponding features from the other retina. The matching process was analyzed by Cumming and Parker [2]. They investigated the reaction of disparity-selective neurons to anticorrelated random dot stereograms and found that these neurons do not unambiguously signal stereoscopic depth. It is known that our eyes can tolerate small amounts of retinal defocus; the tolerated range is known as the depth of focus. Such reactions should be taken into account when selecting stereoscopic setups before rendering.
The rendering of stereoscopic images may use different setups, which directly influence the comfort of visual perception and the ability to focus on the object of interest in the image. If it were possible to estimate the comfort level in real time during individual stereoscopic perception, various virtual camera separation parameters could be adjusted [3] to avoid visually uncomfortable scenes in the rendered video stream. Depending on the scene, the virtual camera separation used during the rendering of stereoscopic views causes different image disparities, resulting in too much or too little perceived depth on a target display. Jones et al. [4] proposed a way of controlling perceived depth in stereoscopic images, based on an analysis of the distortions introduced by different camera parameters, that is applicable to HMDs.
In this paper, we analyze an EEG signal captured from a single sensor placed on the frontal lobe as a candidate feature for detecting visually uncomfortable scenes in stereoscopic views. The detection of such scenes is essential for adjusting disparity according to the content of the image and for predicting and reducing eye fatigue (eye strain) caused by using stereoscopic displays for extended periods.
The analysis of EEG activity during stereoscopic perception has been addressed in a wide range of previously published research work [5], [6]. Fischmeister et al. extended the list of previous research work on non-natural images with an investigation of depth cues from natural stereoscopic images [7]. Fazlyyyakhmatov et al. [8] analyzed the power of cortical activity during cognitive tasks. Various researchers have used EEG-based measurements to estimate different levels of visual fatigue using 2D and 3D displays [9], [10]. The analysis of event-related potentials (ERPs) has not shown noticeable differences after watching 3D movies [11], [12], [13]. Therefore, EEG band activity measurements were selected for this study.
In our investigation, we used the results of experimental tests with 16 control subjects to evaluate the differences in EEG activity while looking at pictures with five levels of visual comfort. In a visually comfortable stereoscopic view, it is easy to quickly find and focus on the point in the image at which depth perception is comfortable. Such images contain no object on which focusing is hard to complete. However, in some stereoscopic images, focusing on the object is challenging and takes more time. During the experimental investigation, control subjects were asked not only to rate images from 1 to 5 according to visual comfort but also to mark the moment at which focusing on the object was successful. With this additional marker, we extended the IVY LAB database [14] with annotated single-sensor EEG data and were able to analyze user behavior during stereoscopic perception.

Materials and Methods
During the investigation presented in this paper, we aimed to identify the dynamics in single-sensor EEG signal activity before the user focuses on the point in the image at which depth perception occurs, and the EEG activity after that moment. We analyzed EEG signal activity as changes in the signal spectrum energy at different frequency bands. The duration from the moment a new image was presented to the user to the moment of focus and successful depth perception varied from user to user and also depended on the visual comfort level of the image. The collected data was made publicly available¹. It is worth noting that no other publicly available dataset contains EEG activity associated with visual comfort for stereoscopic images.

Group of Volunteers
A group of 19 subjects (16 males and 3 females) participated in our experiment as volunteers. Subjects received no rewards or compensation for their participation. Their age varied between 19 and 37 years, with an average of 22.6 (standard deviation of 5.1). All volunteers were informed about the procedure, goals, and the subjective assessment phase of the experiment. Participants were also instructed to press the spacebar key as soon as they had perceived the depth of the shown stereoscopic 3D content. We registered the time of each subject's input as a depth perception indication (DPI). Furthermore, all subjects signed consent forms and orally expressed that they were ready to begin the experiment. No subjects were rejected after screening based on [15]. However, three subjects were discarded due to a faulty reference connection (one subject) and measurement errors that led to substantially decreased sampling rates (two subjects). Consequently, we used the records of 16 subjects for this study.

Experimental Setup
Subjects were seated in front of a 17-inch screen on which stereoscopic images were shown. The sitting distance from the screen was approximately 70-80 cm. To simulate real-life conditions, head and body motions were not restricted; subjects were asked to sit freely and comfortably; and subjects were encouraged to wear corrective lenses. Additionally, each participant, if desired, could select a musical background. Stereoscopic visual stimuli were shown on a 1280 × 1024 resolution screen with a 60 Hz refresh rate. The stereoscopic 3D effect was produced by using anaglyph red and blue image encoding. Thus, to achieve the stereoscopic 3D effect, the participants wore red-blue filter glasses. Anaglyph technology is traditionally considered more prone to crosstalk [16]. However, a study from 2013 claims that crosstalk is lower on passive displays than on active displays [17]. Yet another study found no major difference between active and passive stereo [18]. Using an anaglyph system, we were able to accumulate more factors causing visual discomfort and visual fatigue. More factors increase the possibility of detecting changes in the analyzed EEG signal spectrum.

Dataset
We used stimulus images from the IVY LAB stereoscopic 3D image database [14], which had been used previously in other research work [19], [20]. This dataset contains 120 stereoscopic images with urban, nature, and indoor objects, including humans and non-living entities, as shown in Fig. 1. The image resolution was 1920 × 1080 pixels, with the magnitude of crossed disparity ranging from 0.11 to 5.07 degrees.

EEG Signal Acquisition Procedure
Before the experiment, all subjects took color blindness and stereoblindness tests, namely, the Ishihara Color Vision and Fine Stereopsis tests [21], respectively. The experiment consisted of 120 trials. The order in which the images were displayed to the subjects during the experiment was randomized. We divided each trial into two phases. During the first phase, a stereoscopic stimulus was displayed to the subject until 5 seconds had passed after the DPI input. Directly after the first phase, an evaluation screen was shown to the subjects for about 5 seconds. During this time, subjects were asked to grade their level of visual comfort. The subjective assessment was carried out using the single-stimulus adjectival categorical judgment method of ITU-R BT.500-13. Visual comfort was graded on a five-grade scale from 1 (extremely uncomfortable) to 5 (very comfortable). We informed the subjects that they could stop to rest or quit the experiment at any time. Also, two resting periods of at least 30 seconds, after 40 and 80 images, were mandatory. The total duration of the experiment was approximately 40 minutes. The experimental procedure, anaglyph stereo rendering, timing, and keyboard input controls were implemented using Psychtoolbox software tools [22].
¹ DOI: 10.13140/RG.2.2.27145.75366

EEG Signal Pre-Processing
EEG signals were captured using a consumer-oriented Neurosky Mindwave headset with a single electrode placed at the frontal lobe. This device uses dry sensor technology and is worn like a normal audio headset. Maskeliunas et al. [23] previously analyzed the ability to use consumer-grade EEG units for control tasks and named some problems that should be taken into account before using them for such tasks. In our investigation, we analyzed EEG activity as a complementary feature for image comfort level classification. Therefore, the accuracy of the headset was acceptable for our purposes.
EEG is commonly described in terms of its frequency bands. The varying amplitude and frequency of the wave represent various brain states [24], which depend on external stimulation and internal mental states [16]. The most common classification uses EEG waveform frequencies (e.g., alpha, beta, theta, and delta). The signal captured from the headset had a sampling rate of 300 Hz and was additionally filtered using a 4th-order digital Butterworth band-pass filter with cutoff frequencies of 0.01 Hz and 40 Hz. To remove ocular artifacts from the EEG signal, we used a wavelet ICA (wICA)-based method [25]. We made two minor modifications to this method by reducing the threshold multiplier to 0.3 and selecting a fastICA algorithm to extract the ICA components.
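The band-pass step described above can be sketched in Python with SciPy. This is only an illustration under stated assumptions, not the processing code used in the study: the function names are hypothetical, zero-phase filtering is our choice, and we assume that "4th-order" refers to the overall band-pass order (SciPy doubles the design order for band-pass filters, so `n=2` is passed).

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 300.0  # sampling rate in Hz, as stated in the text


def bandpass_eeg(x, fs=FS, low=0.01, high=40.0, n=2):
    """Band-pass filter a 1-D EEG trace (hypothetical helper).

    Assumption: n=2 gives a 4th-order band-pass in SciPy. Zero-phase
    filtering (sosfiltfilt) is a choice made for this sketch only and
    effectively applies the filter twice.
    """
    # Second-order sections stay numerically stable for the very low 0.01 Hz cutoff.
    sos = butter(n, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


def tone_amplitude(x, fs, freq):
    """Amplitude of the spectral component closest to `freq` (Hz)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    k = np.argmin(np.abs(freqs - freq))
    return 2.0 * np.abs(spectrum[k]) / len(x)


# Synthetic test signal: a 10 Hz tone (pass band) plus a 60 Hz tone (stop band).
t = np.arange(3000) / FS
raw = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 60 * t)
filtered = bandpass_eeg(raw)
```

On the synthetic signal, the 10 Hz component should pass almost unchanged while the 60 Hz component is strongly attenuated.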
For the time-frequency analysis of the single-sensor EEG signal, we used a multitaper method based on Slepian sequences as tapers. We used the implementation of this method by Oostenveld et al. [26] for our research. We analyzed frequency components in the range of 4 Hz to 30 Hz with an analysis time frame of 1 s duration. In addition, we applied 4 Hz spectral smoothing through multi-tapering.
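A multitaper spectral estimate for a single 1 s frame can be sketched as follows. The study used the implementation cited in [26]; the code below is a separate, simplified illustration using SciPy's Slepian (DPSS) windows. We assume here that the 4 Hz smoothing is a half-bandwidth, which for a 1 s frame gives a time-half-bandwidth product of 4 and 7 tapers; the function name and scaling are our own.

```python
import numpy as np
from scipy.signal.windows import dpss


def multitaper_psd(x, fs, half_bandwidth_hz=4.0):
    """Multitaper power spectrum of one analysis frame (hypothetical helper).

    Assumption: `half_bandwidth_hz` is the spectral smoothing W, so the
    time-half-bandwidth product is NW = T * W and K = 2 * NW - 1 tapers are used.
    """
    n = len(x)
    nw = (n / fs) * half_bandwidth_hz
    k = max(int(2 * nw) - 1, 1)
    tapers = dpss(n, nw, Kmax=k)            # shape (k, n), unit-energy tapers
    x = x - np.mean(x)                      # remove the DC offset
    spectra = np.fft.rfft(tapers * x, axis=1)
    psd = np.mean(np.abs(spectra) ** 2, axis=0) / fs  # average over tapers
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, psd


# Synthetic 1 s frame at 300 Hz containing a 10 Hz (alpha range) tone.
fs = 300.0
t = np.arange(int(fs)) / fs
x = np.sin(2 * np.pi * 10.0 * t)
freqs, psd = multitaper_psd(x, fs)
band = (freqs >= 4.0) & (freqs <= 30.0)     # analysis range used in the text
peak_hz = freqs[band][np.argmax(psd[band])]
```

Because multitapering smooths the spectrum, the peak of a pure tone is spread over roughly ±4 Hz around its true frequency rather than appearing as a single sharp line.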

Analysis of Results
During the experimental investigation, each subject was free to rate the stereoscopic images according to their personal experience. The requirement of identifying the DPI by pressing a key was an additional stimulus to concentrate on each image evaluation and provided additional time to make a decision. The subjective assessment results of visual comfort showed that individual scores of 5 images varied from "very uncomfortable" (VUn) to "uncomfortable" (Un). The visual comfort of 21 images varied from "uncomfortable" to "mildly uncomfortable" (MdUn). A further 58 images had variations between "mildly uncomfortable" and "comfortable" (Co), and 36 images had visual comfort assessment variations between "comfortable" and "very comfortable" (VCo). Compared to the experiment of Jung et al. [19], our experiment had two main differences in experimental conditions, namely a different screen size and a different technology for achieving the stereoscopic depth effect. The mean difference between the subjective assessment results was 0.061 (with a standard deviation of 0.42). The differences in subjective image evaluation between our experiment and that of Jung et al. [19] are shown in Fig. 2.
We analyzed the results of our experimental investigation from the viewpoint of EEG signal-based features to be used for automatic comfort level prediction. Therefore, we compared the time from the start of image presentation to the time the DPI was indicated by the subject, and we compared the statistical similarity of the EEG signal at different frequencies between different visual comfort levels.

Comparison of pre-DPI Time
In our investigation, we used the term pre-DPI time to indicate the duration between the moment a new stereoscopic image appears on the screen and the moment the user presses the spacebar to indicate that depth perception was achieved.
The histograms shown in Fig. 3 illustrate pre-DPI time statistics for images with different mean opinion scores. The number of samples (number of images) in each subjective assessment group varies from 296 to 522. Therefore, we applied normalization according to the probability density function of the histogram. The resolution of the bins in the histogram is 0.5 s.
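The density normalization makes groups of different sizes comparable, because each histogram then integrates to one regardless of the number of samples. A minimal sketch with NumPy is given below; the function name and the pre-DPI times are made up for illustration and are not taken from the dataset.

```python
import numpy as np


def pre_dpi_density(times_s, bin_width=0.5):
    """Histogram of pre-DPI times normalized as a probability density.

    Hypothetical helper: bins start at 0 s and are `bin_width` seconds wide.
    """
    times_s = np.asarray(times_s, dtype=float)
    n_bins = max(int(np.ceil(times_s.max() / bin_width)), 1)
    edges = np.arange(n_bins + 1) * bin_width
    density, edges = np.histogram(times_s, bins=edges, density=True)
    return density, edges


# Made-up pre-DPI times (seconds) for one comfort group.
example_times = [3.2, 4.9, 5.1, 5.4, 6.0, 7.3]
density, edges = pre_dpi_density(example_times)
area = float(np.sum(density * np.diff(edges)))  # integrates to 1 by construction
```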
The mean value of pre-DPI time given in Tab. 1 shows that 5-6 seconds were spent on average by the subjects to perceive the depth of the image and indicate it using the keyboard.The most visually comfortable images, those classified into the "very comfortable" group, required the shortest time for a decision with the smallest standard deviation.

EEG Activity at Different Comfort Levels
The selection of a pre-DPI time for the analysis of spectral components is important for EEG signal-based feature estimation. To ensure that the EEG spectral components carry statistically separable data, we compared EEG activity between images grouped into five comfort levels at different time frames: 0.5, 1, 3, 4, 5, 6, 7, and 10 seconds.
The selection of the frequency band for EEG activity analysis plays a vital role in the comfort level prediction capabilities. Since frequency bands (alpha, beta, theta) have a tendency to contradict each other, Cheng et al. [27], in their study, not only used EEG power analysis in specific frequency bands but also used different combinations to estimate relative power between different bands (e.g., theta/alpha, beta/alpha, etc.). Also, Zou et al. [28] evaluated six types of band ratios during a repetitive random dot stereogram-based task. Their results showed significant differences in all investigated ratios; however, alpha activity was found to be the most promising indicator in their experiment.
In our study we have used the following frequency band separation: theta (θ) 4-8 Hz, alpha (α) 8-13 Hz, low beta (β_l) 13-17 Hz, medium beta (β_m) 17-21 Hz, and high beta (β_h) 21-30 Hz [29]. In addition to common band ratios,
Fig. 5. Average EEG power of all subjects using 5-second pre-DPI window size; panel (c) shows the alpha and high beta power ratio. The abscissa shows visual comfort scores from 1 ("very uncomfortable") to 5 ("very comfortable"). The error bar shows the standard error of means.
Fig. 6. Average EEG power of all subjects using 6-second pre-DPI window size; panel (c) shows the alpha and high beta power ratio. The abscissa shows visual comfort scores from 1 ("very uncomfortable") to 5 ("very comfortable"). The error bar shows the standard error of means.
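The band separation listed for this study can be turned into band powers and an alpha to high beta ratio as sketched below. The band edges come from the text; everything else is an assumption made for illustration: the names are hypothetical, the band power is taken as the mean spectral value inside the band, and the spectrum is synthetic.

```python
import numpy as np

# Band edges in Hz, as listed in the text.
BANDS = {
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta_l": (13.0, 17.0),
    "beta_m": (17.0, 21.0),
    "beta_h": (21.0, 30.0),
}


def band_power(freqs, psd, band):
    """Mean spectral power inside a named band (lower edge inclusive)."""
    lo, hi = BANDS[band]
    mask = (freqs >= lo) & (freqs < hi)
    return float(np.mean(psd[mask]))


def alpha_high_beta_ratio(freqs, psd):
    """Ratio of alpha power to high beta power."""
    return band_power(freqs, psd, "alpha") / band_power(freqs, psd, "beta_h")


# Synthetic flat spectrum with doubled power in the alpha band.
freqs = np.arange(0.0, 41.0, 1.0)
psd = np.ones_like(freqs)
psd[(freqs >= 8.0) & (freqs < 13.0)] = 2.0
ratio = alpha_high_beta_ratio(freqs, psd)
```

Using the mean rather than the sum keeps the ratio independent of the different widths of the two bands; whether the study normalized this way is not stated in the text.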
Table 2 presents the ANOVA results for the five groups of images with different visual comfort levels across brain activity bands, ratios, and pre-DPI times. θ, α, β_l, θ/β, (θ + α)/β, α/β, and α/β_l showed no statistically significant differences at any of the investigated pre-DPI times. Also, no significant differences were found at the 10-second pre-DPI time in any of the investigated frequency bands and ratios. The lowest p-values (p < 0.001) were found in the high beta frequency band and in the alpha/high beta ratio at the 5- and 6-second pre-DPI times. These results are in line with the mean durations of pre-DPI time shown in Tab. 1.
In this paper, we used one-way analysis of variance (ANOVA) to determine whether there are any statistically significant differences between the means of the five image visual comfort groups. When the ANOVA results showed significant effects, we applied multiple comparison tests using the Tukey-Kramer method (significance level α = 0.05).
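The one-way ANOVA step can be illustrated with SciPy as below. The data are synthetic and only stand in for a per-image feature (for example, a band power ratio) grouped by the five comfort levels; the group means, spread, and sizes are invented for the sketch and do not reproduce the results of the study.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)

# Invented group means for five comfort levels; one group is clearly shifted.
group_means = [2.0, 1.2, 1.1, 1.0, 1.0]
groups = [rng.normal(loc=m, scale=0.5, size=60) for m in group_means]

# One-way ANOVA: tests whether all group means are equal.
f_stat, p_value = f_oneway(*groups)
significant = p_value < 0.05  # significance level used in the text
```

Only when `significant` is true would a pairwise follow-up be run; recent SciPy releases provide `scipy.stats.tukey_hsd` for that purpose.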
Results of the post-hoc Tukey's HSD tests at the 4-, 5-, and 6-second pre-DPI times are given in Tabs. 3, 4, and 5. The Tukey-Kramer post-hoc tests revealed significant differences between subjective assessment value pairs. At the 4-second pre-DPI time, significant differences between VUn and all other groups were observed in the α/β_h ratio. Also, significant differences were found in θ/α oscillatory activities between the VUn and VCo (p = 0.045) and the Un and VCo (p = 0.002) pairs. The test results showed that the β_h activity at the 4-second pre-DPI time for VUn was significantly different from VCo and Co. The post-hoc test revealed no significant differences between other visual comfort pairs at the 4-second pre-DPI time. The 5-second pre-DPI time post-hoc analysis indicated significant differences between the VUn and Co, VUn and VCo, Un and VCo, and MdUn and VCo pairs in the β_h frequency band, and also between Un and VCo in the θ/α ratio. Moreover, significant effects with p-values less than 0.001 were found between VUn and VCo for the α/β_h ratio. Between VUn and Un, VUn and MdUn, Un and VCo, and MdUn and VCo, significant differences (p < 0.05) were observed in the α/β_h ratio. However, there were no significant differences between other visual comfort pairs in the oscillatory activities investigated.
The post-hoc results for the 6-second pre-DPI time are shown in Tab. 5. There were significant effects of visual comfort between the VUn and Un (p = 0.022), MdUn (p = 0.016), Co (p = 0.001), and VCo (p < 0.001) pairs. Furthermore, using the 6-second pre-DPI time, a significant difference was found between the VUn and VCo (p = 0.016) pair in the β_m frequency band. In the β_h frequency band, significant differences were found between the VUn and Co (p = 0.017), VUn and VCo (p = 0.001), Un and VCo (p = 0.021), and MdUn and VCo (p = 0.008) pairs. No other subjective assessment group pairs showed significant differences in the investigated frequency bands and ratios.
The effects of visual comfort on oscillatory activities at the 4-, 5-, and 6-second pre-DPI times are shown in Figs. 4, 5, and 6. Relative alpha power (see Figs. 4a, 5a, and 6a) of the subjects decreased with higher visual comfort levels, while relative high beta power (see Figs. 4b, 5b, and 6b) increased with higher visual comfort levels. Therefore, the alpha to high beta ratios (see Figs. 4c, 5c, and 6c) decreased with higher comfort levels.

Discussion
Alpha oscillatory activity showed no significant differences at any of the investigated pre-DPI time durations. However, the best results were found using the α/β_h ratio. Figures 4a, 5a, and 6a illustrate that the relative mean power of α activity is reduced when visual comfort increases. The reduction in α activity did not show significant changes suitable for classification or prediction of the visual comfort level. However, the activity of β_h tended to increase for higher levels of visual comfort.
After estimating the different ratios of EEG sub-band activities (see Tab. 2), we noticed strongly significant differences for the α/β_h ratio. This was expected, taking into account the previously published results of Chen et al. [10], who used the full beta frequency range for the ratio estimation. However, in our study, we noticed that the α ratio for visual comfort level classification should be estimated with respect to the β_h frequency range, and that taking the full beta frequency range decreases the significance of the differences (see Tab. 2). In Figs. 4c, 5c, and 6c we can clearly see that the α/β_h ratio significantly separates images with the lowest level of visual comfort. Subjects noted after the experiment that for some of these images depth perception was not achieved at all. This shows that, using the α/β_h ratio as a feature, it is possible to recognize stereoscopic views in a video sequence for which depth perception is hard to achieve and to perform automatic correction and reduction of disparities in the stereoscopic view.
Developers of VR headsets are implementing EEG and ECG biosensors in their newest devices [31], [32]. Compared to the standard subjective assessment methods, calibration of HMDs based on physiological data is more comfortable and swifter for the user. The main issues of objective calibration of HMDs are accuracy, reliability, and comfort.
Our proposed technique can contribute towards solving these issues. Using DPI time and EEG data, it is possible to detect the individual sensation of visual discomfort and calibrate an HMD individually for each user without spending a large amount of time.

Conclusions
Using a pre-DPI time of 4 seconds, it is possible to separate the 2nd and 5th visual comfort levels. Using a pre-DPI time of 5 seconds, it is possible to find significant differences between the 5th visual comfort level and the 2nd or 3rd.
The experimental investigation, performed on original recordings from 16 subjects, showed that the α/β_h ratio, which uses the narrow high beta sub-band, is a better choice for visual comfort classification than the ratio of alpha to the full range of beta waves. The α/β ratio did not show significant differences in our experimental EEG data.
While the paper does not propose a complete solution for visual comfort level recognition from a single-sensor EEG signal acquired using a consumer-grade device, it shows the possibility of detecting visually uncomfortable views by monitoring the α/β_h ratio as a feature. Additionally, the EEG data with associated subjective scores collected for this research was made publicly available.

Fig. 1. Right eye stimuli of the IVY LAB stereoscopic 3D image database.

Fig. 2. Comparison of the subjective assessment results for visual comfort of 120 stereoscopic images. Our subjective assessment results are shown using yellow bars with 95% confidence intervals. The teal dots represent mean opinion scores obtained from the Jung et al. experiment. The mean difference of opinion scores between our results and those of the Jung et al. experiment was 0.061 ± 0.42.

Fig. 3. Time required to establish stable depth perception (DPI). Histograms of 5 subjective assessment scores of visual comfort are shown. The abscissa represents DPI time with 0.5 s bin resolution.

Fig. 4. Average EEG power of all subjects using 4-second pre-DPI window size. The abscissa shows visual comfort scores from 1 ("very uncomfortable") to 5 ("very comfortable"). The error bar shows the standard error of means.
Tab. 2. One-way ANOVA results of the visual comfort scores for different oscillatory activities and their ratios against the investigated pre-DPI window sizes. The p-values less than 0.05 are highlighted.
was born in 1980. He is a professor in the Department of Electronic Systems at VGTU. He received his Bachelor diploma of Electronics Engineering in 2002, MSc. of Electronics Engineering in 2004 and doctor of Electrical and Electronics Engineering degree in 2008 at VGTU. In 2012 he received an Associate Professor title and in 2017 a Professor title at VGTU. His main research interests include real-time image and signal processing and intelligent systems. He has published more than 50 papers in reviewed journals and conference proceedings, a monograph in the field of intelligent electronic systems, and two textbooks. A. Serackis has successfully finished 9 research and development projects, is currently supervising 4 Ph.D. students, is a senior member of the IEEE, and is a Chair of the IEEE Lithuania Section Signal Processing Society/Computational Intelligence Society/Communication Society joint chapter.

Andrius KATKEVIČIUS was born in 1984. He received his Bachelor, Master and Ph.D. degrees in Electrical and Electronic Engineering at Vilnius Gediminas Technical University in 2007, 2009, and 2013, respectively. He is an Associate Professor at the Department of Electronic Systems of Vilnius Gediminas Technical University. His primary research interests include electromagnetic field theory, super-high frequency technologies and microwave devices, signal processing, multimedia and embedded systems. Mr. Katkevičius is a member of the IEEE and an active member of the IEEE Microwave Theory and Techniques.

Darius PLONIS was born in 1984. He received his Bachelor, Master and Ph.D. degrees in Electrical and Electronic Engineering at Vilnius Gediminas Technical University in 2008, 2010, and 2014, respectively. He is an Associate Professor at the Department of Electronic Systems of Vilnius Gediminas Technical University. His primary research interests include electromagnetic field theory, super-high frequency technologies and microwave devices, signal processing, multimedia and embedded systems. He is a member of the IEEE, an active member of IEEE Microwave Theory and Techniques, and currently serves as Secretary of IEEE Lithuania Section.