Cognitive workload modulation through degraded visual stimuli: a single-trial EEG study

Objective. Our experiments explored the effect of visual stimuli degradation on cognitive workload. Approach. We investigated the subjective assessment, event-related potentials (ERPs) as well as electroencephalogram (EEG) as measures of cognitive workload. Main results. These experiments confirm that degradation of visual stimuli increases cognitive workload as assessed by subjective NASA task load index and confirmed by the observed P300 amplitude attenuation. Furthermore, the single-trial multi-level classification using features extracted from ERPs and EEG is found to be promising. Specifically, the adopted single-trial oscillatory EEG/ERP detection method achieved an average accuracy of 85% for discriminating 4 workload levels. Additionally, we found from the spatial patterns obtained from EEG signals that the frontal parts carry information that can be used for differentiating workload levels. Significance. Our results show that visual stimuli can modulate cognitive workload, and the modulation can be measured by the single trial EEG/ERP detection method.


Introduction
Perception of a visual stimulus depends on several factors including the strength of the stimulus, top-down attention, and the a priori knowledge level of the subject [1,2]. Strong stimuli alone may not be sufficient for full perception, while weak stimuli may be fully perceived if given sufficient attention. In addition, perception can be greatly facilitated and enhanced once the source of the stimulus is identified or when its name or category is known in advance. Consequently, a weaker neural response is recorded for an anticipated stimulus than for an unexpected one [3]. It was also reported that the subject's expectation lowers the threshold of perception and reduces the latency of the neuronal signature [4]. Overall, bottom-up perception may dominate when the stimulus is very clear, while top-down processes may be expected to exert more influence when viewing conditions deteriorate due to a lack of clarity or viewing time.
The influence of noise on perception, attention and vigilance has been investigated by several research groups since the 1950s [5]. The focus, however, was primarily on the effects of audio noise on cognitive performance, with emphasis on dimensions that include noise volume, whether the noise is continuous or intermittent, and whether it is stationary white noise or nonstationary music. Results varied across publications, from improvement to no change to decrement.
Hancock, for example, found that low volume continuous white noise reduced performance on cognitive vigilance tasks [6], while Blackwell and Belt found that neither high nor low volume white noise affected performance on low-demand sensory tasks [7]. However, Davies and colleagues found that low volume varied noise such as music improved performance on vigilance tasks [8]. External visual noise was also investigated by Lu and Dosher, who studied how increasing attention can decrease the effect of external and internal noise [9].
The impact of visual stimulus degradation on cognitive workload is highly important in many real-world and industrial applications. Visual perception and cognition are critical, and how the visual degradation (noise), if added into the visual input, affects the cognitive workload, has only begun to be explored [10]. Hence, the prime objective of this research is to address the impact of visual stimuli degradation on cognitive workload. In general, cognitive workload can be perceived as the effort required by a subject to perform a task and constrained by the subject's capability. The performing subject will be either overloaded or under-loaded when the task demand approaches extreme conditions. Both circumstances are accompanied by the cognitive inefficiency. Therefore, in order to optimize a performer's efficiency, it is important to identify the workload level so that novel strategies can be explored to enhance cognitive performance.
The workload levels are usually modulated by arithmetic tasks [11,12], or by the multi-attribute task battery (MATB), which includes tasks designed to reproduce the realistic work effort of air traffic controllers [13,14]. The workload imposed by these tasks consists of many complicated cognitive processes and is top-down modulated. Whether and how a simple change in a sensory input, such as vision, modulates the cognitive processes needs to be studied. One attempt is the work by Haapalainen et al [15], which investigated an individual cognitive process, visual cognition. They focused on visual perception factors including flexibility of closure, speed of closure and perceptual speed. Another interesting aspect of vision is degradation. It is reported that the response time in a letter discrimination task could be increased by both dynamic noise and static noise [16], which implies that the noise influences the perceptual encoding of stimulus features and the evidence accumulation of decision making. In addition, O'Reilly et al (2012) suggested that recurrent processing would enhance visual recognition by strengthening the bottom-up visual inputs of degraded images [17]. Since adding noise can influence both top-down and bottom-up processing, visual degradation may serve as a feasible and different approach to workload modulation.
There are four accepted discrete methods used for the assessment of cognitive workload: analytical methods, subjective methods, performance methods and psycho-physiological methods [18]. Compared to the other approaches, psycho-physiological methods are passive and usually interfere little with the primary task. In addition, the physiological signals being measured pose only a second-order inference about the processing which occurs in the central nervous system [19], and are therefore considered to be more objective and reliable. The electroencephalogram (EEG) is one of the biomarkers suited for workload assessment. EEG directly manifests the dynamics of brain activity at high temporal resolution and may be a more promising tool for instantaneous or real-time workload monitoring.
Although there is solid evidence demonstrating that EEG signals, after being decoded, can objectively convey the workload levels, it is also suggested that EEG is vulnerable to task designs [20], adopted cognitive strategies [21], and even the individual's emotions [5]. In other words, EEG patterns may be task-specific as well as individual-specific. The most widely used EEG patterns are oscillatory band powers. For instance, in a visual n-back task conducted in [21], the θ-band spectral power was found to increase in the frontal midline with increasing task difficulty, while the α-band spectral power was attenuated. Similarly, a decrease in the α-band power with the workload was reported in another n-back task [22]. However, no significant difference in the θ-band was observed [22]. Moreover, in an EEG/fMRI study, the θ, β and γ bands demonstrated positive load effects, while the lower alpha band showed negative effects [23].
In addition to the oscillatory EEG, event-related potentials (ERPs) are another signature used for real-world applications [24]. Therefore, efforts have also been devoted to studying the linkage between ERPs and workload. The typical way to assess workload using ERPs is to conduct a secondary two-stimulus oddball task in parallel with the primary complex one [25,26]. However, this secondary task is not optimal as it could divert mental resources from the primary task. To overcome this defect, Allison and Polich introduced a single-stimulus oddball paradigm into the primary gaming task, and studied the impact of workload on the averaged ERPs [27]. However, the work lacked single-trial workload monitoring capability. This gap was filled by [22], which showed for the first time that ERPs, as a supplement to oscillatory EEG, can serve as one of the feature sources for binary workload classification. It is noteworthy, however, that there was no feature extraction in [22], which could lead to an over-fitting problem in a high-density EEG measurement scenario.
This study performs single-trial classification of multiple workload levels. There are two unique aspects in this study: (1) in the context of high-density EEG measurement, an effective single-trial method for not only oscillatory EEG but also ERPs, namely the bilinear common spatial pattern (BCSP) [28], is implemented to extract discriminative features; (2) unlike previous research, which manipulated the workload level through arithmetic tasks or MATB, this study demonstrates that visual degradation can readily modulate the workload level.

Experiment system
The experiments were conducted following the approval of the National University of Singapore Institutional Review Board. Sixteen healthy subjects participated in the experiments. All participants had normal or corrected-to-normal vision, were not on medication, and had no history of neurological or cardiovascular disease, hypertension, or psychiatric disorders. After signing the consent form, each participant went through color blindness and dominant eye tests. Ahead of the experiments, they also completed the Epworth sleepiness scale test. In addition, at the beginning and the end of each experiment, each participant filled in the short stress state questionnaire. After completing each workload level session, the participant was asked to complete the NASA task load index (NASA TLX) questionnaire [29]. The NASA TLX assumes six possible sources or demands of workload: mental demand, physical demand, temporal demand, performance, effort, and frustration. Each possible source was rated by the subject on a scale of 0-20 after each workload level session. The subject was also asked to give a weight to each of the six possible sources, such that a weight of 0 is allocated to the least relevant workload source and a weight of 5 to the most relevant one. Each source rating is then multiplied by its weight, and the products are summed across all six sources. The resulting tally is divided by 3 to produce the final workload score between 0 and 100.
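The weighted scoring described above can be illustrated with a short sketch. The function name, the dictionary layout and the example numbers below are hypothetical. The check that the six weights sum to 15 is an assumption: it is what makes a tally of 0-20 ratings divided by 3 span exactly 0 to 100.

```python
# The six NASA TLX workload sources named in the text.
SOURCES = ("mental demand", "physical demand", "temporal demand",
           "performance", "effort", "frustration")


def nasa_tlx_score(ratings, weights):
    """Weighted NASA TLX score on a 0-100 scale.

    ratings: dict mapping each source to a rating on the 0-20 scale.
    weights: dict mapping each source to an integer weight from 0 to 5.
    Assumption: the six weights sum to 15, so that dividing the
    weighted tally by 3 gives a score between 0 and 100.
    """
    for source in SOURCES:
        if not 0 <= ratings[source] <= 20:
            raise ValueError(f"rating for {source} must be in 0-20")
        if not 0 <= weights[source] <= 5:
            raise ValueError(f"weight for {source} must be in 0-5")
    if sum(weights[source] for source in SOURCES) != 15:
        raise ValueError("weights are assumed to sum to 15")
    tally = sum(ratings[source] * weights[source] for source in SOURCES)
    return tally / 3.0


# Hypothetical ratings and weights for one subject after one session.
example_ratings = {"mental demand": 15, "physical demand": 3,
                   "temporal demand": 10, "performance": 8,
                   "effort": 12, "frustration": 6}
example_weights = {"mental demand": 5, "physical demand": 0,
                   "temporal demand": 3, "performance": 2,
                   "effort": 4, "frustration": 1}
example_score = nasa_tlx_score(example_ratings, example_weights)
```

With these made-up inputs the weighted tally is 175, so the score is 175/3, a little above 58.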

Paradigm
The experiment was carried out in a quiet room with controlled level of luminance. The visual stimuli were presented to the subject on a 24′′ image stimulus monitor. After signing the consent form, the subject was seated such that the distance between the eyes and the monitor was approximately 57 cm corresponding to a visual angle of 40 × 30°. EEG data were collected from each subject while performing the experimental task. Each workload level session lasted approximately 10 min. The experiment lasted for approximately 90 min including subject preparation, trials and breaks. Resting breaks were given between different workload level sessions. If the subject fell asleep, the experiment was terminated and a new appointment was scheduled.
Each experiment comprised four sessions, one per workload level. In each session, the subject was presented with sequences of visual stimuli (see figure 1). Each sequence comprised a fixation cross (500 ms), Digit1 (300 ms), Digit2 (300 ms) and an image (300 ms, 256 × 256 pixels). Digit1 and Digit2 were randomly chosen from 1 to 9. The image was randomly drawn from an image pool, and was categorized as either a human face image or an object image (not a human face image). The fixation crosses, digits and images were displayed at the center of the monitor. After the image disappeared, the subject was required to respond correctly and as fast as possible by pressing the 'Q' key for every target trial and the 'P' key for every non-target trial. The target trial and non-target trial were defined separately for each level. A maximum window of 3000 ms was allowed for the subject's response before the next sequence started. Before commencing the experiment, each subject went through a practice session of about 8 min to become acquainted with all the workload levels. Although the different workload levels were explained in ascending order, each subject was exposed to these levels in a random order. This was essential to eliminate any adaptation or anticipation. However, it is necessary to point out that since the images were not masked, the stimulus processing, which is subject to the individual's perceptual memory, may not be discrete and localized in time. The 4 workload levels are described as follows:
(1) Level 1: the subject was asked to consider only a sequence containing a human face image as a target trial. A total of 210 different stimulus images were used, of which only 30 were human face images.
(2) Level 2: the subject was asked to consider a sequence as a target trial if and only if the digits preceding the face image were either both even or both odd. The other combinations were treated as non-target trials.
(3) Level 3: the definition of the target trial was the same as in Level 2, but the signal to noise ratio (SNR) of all displayed images was reduced by degrading the images and adding salt and pepper noise to them. The SNR was approximately 0 dB. The SNR of the digits remained unchanged. It is hypothesized that visually degraded stimuli would add an additional amount of cognitive workload.
(4) Level 4: in comparison to Level 3, the SNR of all displayed images was further reduced to −5 dB.
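As an illustration of how an image might be degraded to a target SNR such as 0 dB or −5 dB, the following is a minimal sketch and not the stimulus-generation procedure used in the study. It assumes 8-bit grayscale images stored as lists of rows, covers only the salt and pepper step, and assumes that SNR is defined as 10·log10 of the mean squared clean pixel value over the mean squared difference between the noisy and clean images. All function names are hypothetical.

```python
import math
import random


def add_salt_and_pepper(image, density, rng):
    """Return a copy of `image` in which each pixel is replaced, with
    probability `density`, by 255 (salt) or 0 (pepper)."""
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            # Two draws per pixel keep the random stream aligned across
            # densities, so more density only ever corrupts more pixels.
            corrupt = rng.random() < density
            salt = rng.random() < 0.5
            new_row.append((255 if salt else 0) if corrupt else pixel)
        noisy.append(new_row)
    return noisy


def snr_db(clean, noisy):
    """SNR in dB under the assumed definition given in the lead-in."""
    signal = 0.0
    noise = 0.0
    for clean_row, noisy_row in zip(clean, noisy):
        for c, n in zip(clean_row, noisy_row):
            signal += c * c
            noise += (n - c) ** 2
    if noise == 0:
        return math.inf
    return 10.0 * math.log10(signal / noise)


def density_for_target_snr(image, target_db, seed=0, steps=20):
    """Bisect on the noise density until the measured SNR meets the target.

    If the target cannot be reached for this image, the result tends to 1.
    """
    low, high = 0.0, 1.0
    for _ in range(steps):
        mid = (low + high) / 2
        noisy = add_salt_and_pepper(image, mid, random.Random(seed))
        if snr_db(image, noisy) > target_db:
            low = mid   # still too clean: more noise is needed
        else:
            high = mid  # at or below the target: less noise is needed
    return (low + high) / 2
```

Whether a given target is reachable depends on the image content under this definition: a bright image may not reach −5 dB even when every pixel is corrupted.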

Acquisition and preprocessing
For every subject, 62-channel EEG signals, 1-channel vertical electrooculogram (EOG) signals and 1-channel electrocardiogram were collected with sampling frequency of 512 Hz, using an ANT amplifier (ANT B.V., Enschede, Netherlands). EEG signals were referenced to linked ears and grounded to the forehead. The filtering was conducted using an EEGLAB function 'pop_eegfiltnew' [30], with a passband from 0.  second-order blind identification based method using EOG signals as reference was utilized for EOG artifact removal [31,32].

Feature extraction
Two feature extraction methods were applied, i.e. the ordinary power spectral density (PSD) and the BCSP method [28].
Since the ERPs are of interest in this work, both the PSD features and BCSP features were extracted from epoch segments starting from the onset of Image to 500 ms after the onset of the image.
(1) PSD features: the spectral powers were calculated and averaged in the frequency bands α (8-12 Hz), θ (4-7 Hz) and δ (0.5-3 Hz) [33], and were then concatenated across the 62 channels into a feature vector.
(2) BCSP features: BCSP differs from CSP [34], whose objective function is to extremize (either maximize or minimize) the power ratio between two conditions, in that BCSP has two separate objective functions, (1) and (2), both of which maximize a power ratio [28]. In these objective functions, W and V are the spatial and temporal projections, respectively, X_c is the EEG data of condition c, which includes both target epochs and non-target epochs as in [22], and det(•) is the determinant operator. For both objective functions, iterative learning can be used to obtain the bilinear projections W and V [28]. Specifically, V is first assumed to be known and is initialized as a full-rank square matrix. Taking (1) as an example, the spatial projection W can be obtained by solving the eigenvalue problem (3). Using an identity that holds for any matrix Q, (1) can be equivalently expressed as (4). Given the newly obtained W from (3), the temporal projection V can then be updated by solving another eigenvalue problem, (5). By iteratively updating W and V using (3) and (5), the ratio value of the objective function converges. It is necessary to point out that usually only a few projections corresponding to the highest eigenvalues are retained during the iteration. The detailed reasons can be found in [34]. In this study, the first two spatial projections and the first four temporal projections obtained from each objective function were kept during the iteration and were finally used for feature extraction. Consequently, a total of four features (two from each objective function) could be obtained using BCSP.
Note that this study poses a multi-level classification problem, as four workload levels were measured, whereas BCSP was originally designed for binary classification. To resolve this mismatch, BCSP was applied multiple times by reducing the multi-level classification to several binary classifications. BCSP feature extraction was therefore performed six times, once for each pair-wise combination of the four conditions, resulting in 12 objective functions. Twelve sets of bilinear projections were derived from these objective functions, and 24 features were extracted from every epoch.
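The PSD band-power features of item (1) can be sketched as follows. This is an illustrative example only: the text does not specify the spectral estimator, so a plain periodogram computed with a direct DFT is assumed here, and the function names are hypothetical. A 500 ms epoch at 512 Hz has 256 samples and hence a 2 Hz frequency resolution, which leaves at least one frequency bin in each band.

```python
import cmath
import math

# Band edges in Hz, as listed in the text.
BANDS_HZ = {"alpha": (8.0, 12.0), "theta": (4.0, 7.0), "delta": (0.5, 3.0)}


def power_spectrum(signal, fs):
    """Periodogram of a real signal via a direct DFT.

    Returns (frequencies, power); the power scale is arbitrary
    (|DFT|^2 / N), which is an assumption of this sketch.
    """
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def band_powers(signal, fs):
    """Average spectral power of one channel in each band."""
    freqs, power = power_spectrum(signal, fs)
    features = {}
    for name, (low, high) in BANDS_HZ.items():
        in_band = [p for f, p in zip(freqs, power) if low <= f <= high]
        if not in_band:
            raise ValueError(f"epoch too short to resolve the {name} band")
        features[name] = sum(in_band) / len(in_band)
    return features


def psd_feature_vector(epoch, fs):
    """Concatenate the band powers of all channels into one feature vector.

    epoch: list of channels, each a list of samples.
    """
    vector = []
    for channel in epoch:
        powers = band_powers(channel, fs)
        vector.extend(powers[name] for name in BANDS_HZ)
    return vector
```

For a 62-channel epoch this yields 62 × 3 = 186 features, one per channel and band.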

Classification
There were four workload levels, requiring multi-level classification. The strategy used for multi-level learning in this study was one-against-one. As with the BCSP feature extraction, six classifiers were generated, one for each pair-wise combination of the four conditions. Majority voting determined the prediction. In the case of a tied vote, the final prediction depended on the probabilistic outputs of the classifiers. Since there were no dedicated training and testing sessions for each subject, training data (80% of the whole data) of each condition were randomly selected to extract the BCSP filters and train the classifiers for each subject. The rest of the data (20% of the whole data) were then used for classification evaluation. This training and testing process was performed five times and the averaged classification results are reported. The classifier adopted in this study was the linear probabilistic LIBSVM [4], which is robust and less vulnerable to the over-fitting problem, given that the training sample size was small. The parameters of the linear probabilistic LIBSVM were tuned by grid search with four-fold cross-validation on the training data.
Figure 2 shows the workload of all subjects as measured using the subjective NASA TLX test. The ANOVA results show that the four workload levels used in this experiment are indeed different (df = 3, F = 9.53, p-value = 9.9 × 10⁻⁵). With this p-value, the null hypothesis that the subjective workload is the same across the four levels is rejected, and the change in workload level with task difficulty is accepted.
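The one-against-one voting scheme described under Classification can be sketched as follows. The text states only that ties were resolved using the probabilistic outputs of the classifiers, so the rule used here (summing, for each tied level, the probabilities assigned to it by the classifiers it took part in) is an assumption, and all names are hypothetical.

```python
from itertools import combinations

LEVELS = (1, 2, 3, 4)
# Six pair-wise problems: (1,2), (1,3), (1,4), (2,3), (2,4), (3,4).
PAIRS = list(combinations(LEVELS, 2))
# Two objective functions per pair and two features per objective function.
N_BCSP_FEATURES = len(PAIRS) * 2 * 2


def one_against_one_predict(pair_probabilities):
    """Combine six binary classifier outputs into one level prediction.

    pair_probabilities: dict mapping each pair (a, b) to the probability
    that the epoch belongs to level a rather than level b.
    """
    votes = {level: 0 for level in LEVELS}
    prob_sum = {level: 0.0 for level in LEVELS}
    for a, b in PAIRS:
        p_a = pair_probabilities[(a, b)]
        votes[a if p_a >= 0.5 else b] += 1
        prob_sum[a] += p_a
        prob_sum[b] += 1.0 - p_a
    best = max(votes.values())
    tied = [level for level in LEVELS if votes[level] == best]
    if len(tied) == 1:
        return tied[0]
    # Assumed tie-break: the tied level with the largest summed probability.
    return max(tied, key=lambda level: prob_sum[level])
```

In a full pipeline, each entry of `pair_probabilities` would come from the probabilistic SVM trained on the corresponding pair of levels.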

Multi-level classification results
The multi-level classification results using PSD features and BCSP features for the four workload levels are presented in tables 1 and 2, respectively. As can be seen in table 1, the average classification accuracy of each level is about 70%, which is above chance. For a few subjects, the accuracies are over 80% and even as high as 90%. These results indicate that PSD features can generally be used for multi-level workload classification in this study. Compared to table 1, the overall performance in table 2 is much better. The average single-trial classification accuracies are over 80% for all four levels, and several subjects achieve over 90% accuracy. Such high single-trial classification accuracies demonstrate that the adopted BCSP method and classification method effectively addressed the multi-level workload estimation problem. In order to verify whether EOG-related activities contribute to the classification, the measured EOG signal was also used for BCSP feature extraction and LIBSVM classification. The classification results are presented in table 3. It can be seen that the classification results in tables 2 and 3 are very different for the majority of subjects, which may indicate that the contributions from EOG-related activities to the EEG/ERP classification are limited.

Event-related potentials
The experiment design is a variant of the oddball paradigm, in which a target typically elicits a positive voltage deflection peaking around 300 ms after the target is presented, i.e. the P300. Figure 3 depicts the target ERPs as well as the non-target ERPs at the midline electrode locations (Fz, Cz, Pz and Oz), averaged across all subjects.
As can be seen in figure 3, there are prominent P300 waves in the target ERPs, especially at the central (Cz) and parietal (Pz) locations. Generally, there is strong evidence that the amplitude of P300 decreases as the workload level increases. These observations are in accordance with the previous findings that the P300 component is smaller in a difficult task [35][36][37]. Moreover, although the P300 of Level 3 and Level 4 is weaker than that of the two lower workload levels, a positive peak in the target ERP at around 200 ms (P2) at the parietal (Pz) and occipital (Oz) locations is stronger. Compared to the more difficult levels, Level 1 and Level 2 elicited a stronger positive peak at around 100 ms (P1) at the occipital location. In the non-target ERPs in figure 3, P300 is absent. However, some characteristic ERPs observed in the target ERPs are also seen here. For instance, a P2 component can be observed, and it increases with workload level. There also exists a 'P1-like' ERP, which is stronger at the frontal and central locations for Level 3 and Level 4, and stronger at the occipital location for Level 1 and Level 2.

Spatial patterns
One of the advantages of the CSP family (including BCSP) is that these methods can extract the characteristic common spatial patterns from the spatial projection matrix [28,34]. From the viewpoint of blind source separation, the extracted common spatial patterns are time-invariant EEG source distribution vectors [38]. In the scenario of a binary classification, some of the common spatial patterns represent the EEG source distribution of one brain state which is weak or absent in the other brain state, and vice versa. The common spatial patterns are the rows of the inverse of the spatial projection matrix, W⁻¹, and are typically sorted according to the discriminating capability of the spatial projections. The patterns drawn in figure 4 correspond to the most discriminative spatial projection of each of the 12 pair-wise classification conditions. In figure 4, Level A/Level B means that the pattern is obtained by maximizing (resembling) Level A while minimizing (differing from) Level B. It is necessary to point out that the averaging across subjects in figure 4 was performed using the absolute values of each subject's common spatial patterns, because the patterns only indicate which regions are more active, regardless of their polarity.
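The reason for averaging absolute values can be seen in a small sketch. A spatial pattern obtained from an eigenvalue problem is defined only up to sign, so two subjects with the same underlying topography may have patterns of opposite polarity, and a plain average would cancel them. The numbers and function names below are made up for illustration.

```python
def average_patterns(patterns):
    """Plain channel-wise mean across subjects (polarity can cancel)."""
    n_subjects = len(patterns)
    n_channels = len(patterns[0])
    return [sum(p[ch] for p in patterns) / n_subjects
            for ch in range(n_channels)]


def average_absolute_patterns(patterns):
    """Channel-wise mean of absolute values, as used for figure 4."""
    n_subjects = len(patterns)
    n_channels = len(patterns[0])
    return [sum(abs(p[ch]) for p in patterns) / n_subjects
            for ch in range(n_channels)]


# Two hypothetical subjects, three channels, same topography, opposite sign.
subject_patterns = [[2.0, -0.5, 0.25],
                    [-2.0, 0.5, -0.25]]
plain_mean = average_patterns(subject_patterns)
absolute_mean = average_absolute_patterns(subject_patterns)
```

Here the plain mean is zero at every channel, while the mean of absolute values preserves the fact that the first channel is the most active in both subjects.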
It can be seen in figure 4 that the frontal and central parts are characteristic regions that are informative for differentiating workload levels. In particular, the frontal region is active in all the common spatial patterns, suggesting that the frontal region can serve as a general indicator of workload level. The parietal and occipital regions are less pronounced than the frontal region. However, a strong presence can also be seen in these regions in some patterns, such as Level 4/Level 1.
Figure 4. The averaged discriminative common spatial patterns obtained using BCSP. Each common spatial pattern is a topography extracted by BCSP that characterizes (separates) one workload level from another workload level. The colorbar range is (0 3).
Figure 5. Two examples of discriminative common temporal patterns obtained using BCSP. Each common temporal pattern is a signal pattern extracted by BCSP that characterizes (separates) one workload level from another workload level. X-axis is time (ms).

Temporal patterns
BCSP provides the characteristic common temporal patterns from the temporal projection matrix. In this study, common temporal patterns can be perceived, from a mathematical viewpoint, as the EEG waveform templates that differentiate one workload level from another. For every subject and every Level A/Level B pair, four common temporal patterns are obtained. However, these common temporal patterns vary from one subject to another, and from one level to another. Generally there are two types of common temporal patterns. As shown in figure 5, one is an ERP while the other is an oscillatory signal. Both the ERP and the oscillatory signal play an important role in this workload discrimination task.

Discussion
Cognitive response to workload is a heavily investigated topic, given its importance in real-world industrial, transportation and military applications. In these applications, while visual perception and cognition are critical, how visual degradation, if added to the visual input, affects the cognitive workload is not yet deeply explored. The subjective ratings of workload show a progressive increase in task difficulty across the four workload levels, as demonstrated above.
Degradation is found to result in increased workload, and this change in workload level can be accurately classified by the BCSP method, as evidenced by the high multi-level classification accuracy shown in table 2. The BCSP method is based on the conventional CSP method. The CSP method has been found to be very successful in motor imagery applications by targeting rhythmic event-related de-synchronization and event-related synchronization signals [39]. Being a variant of CSP, BCSP inherits the capability of spatially discriminating oscillatory signals. In addition, with the help of its temporal projections, BCSP is also extended to extract ERPs. Therefore, BCSP is able to derive features from both oscillatory signals and ERPs with the aim of optimally separating two distinct conditions. The ability of the BCSP method to extract both oscillatory signals and ERPs is evidenced by the common temporal patterns in figure 5. Using the spatial projections and temporal projections together, the oscillatory signals and ERPs can be naturally combined, rather than handled separately, in order to study the multi-level workload classification problem. Mathematically, BCSP is effective and useful as it makes no assumptions about the type of features to be extracted. In addition, it has achieved high discrimination results. In terms of physical meaning, characteristic signals and topographies are visualized and analyzed in figures 4 and 5. However, the BCSP method has a potential disadvantage. Although the characteristic spatial patterns and temporal patterns are available, the inherent relationship between spatial patterns and temporal patterns is unknown. In other words, it is difficult to directly tell the exact location in the spatial patterns where a specific temporal pattern occurs. According to the literature [21][22][23], the characteristic oscillatory signals emerge in the frontal region, which is consistent with the common spatial patterns in figure 4.
As can be seen in figure 3, the ERPs differ across workload levels. In general, P2 seems to be an ERP that correlates with workload levels. That is, the P2 component increases with the workload level. It has been reported that P2 has a broad distribution on the scalp, including frontal, central, parietal and occipital regions [29,40], which is similar to the observation in this study. It is also noteworthy that the visual P2 is associated with high-order perceptual processing [27,41]. Many studies link P2 to the cognitive matching between sensory inputs and stored memory [13,42]. Therefore, the incremental cognitive demand in the workload might lead to the changes in the P2 component. Moreover, the workload level influences the amplitude of P300. There is general agreement that P300 represents the cognitive recognition process [43]. This cognitive recognition process can be affected by the task difficulty. The workload level in this study increased in pace with the task demand. Consequently, higher workload levels resulted in attenuated amplitudes of P300.
It is necessary to point out that the frontal region was consistently active for the majority of the common spatial patterns in figure 4. It is possible that this region identified by the BCSP method is associated with the activities of oscillatory EEG signals as reported in the previous studies. This observation could be particularly useful for multi-level classification, as it shows the possibility of constructing a more generalized feature space than ones customized for each pair of workload levels. With such a priori knowledge, the BCSP method can be selectively applied to the frontal regions only, in turn yielding a feature space that accommodates all levels.

Conclusion
This study explored the feasibility of using visual stimuli degradation to modulate the cognitive workload level, and showed that the workload changes due to visual degradation can be tracked in a single-trial classification manner by both the PSD method and the oscillatory EEG/ERP detection method, i.e. BCSP. The single-trial multi-level classification accuracies achieved using BCSP are over 83%.