Analysis of modulations of mental fatigue on intra-individual variability from single-trial event related potentials

Background: Intra-individual variability (IIV), a measure of the variance within an individual’s performance, has been proposed as a metric of neural functionality derived from brain responses. However, how mental fatigue modulates IIV remains unclear. Consequently, developing robust mental fatigue detection methods at the single-trial level is challenging. New methods: Based on an EEG dataset recorded during a long-duration flanker task, the modulations of mental fatigue on IIV were explored in terms of response time (RT) and trial-to-trial latency variations of event-related potentials (ERPs). Specifically, latency variations were quantified using residue iteration decomposition (RIDE) to reconstruct latency-corrected ERPs. We compared reconstructed ERPs with raw ERPs by means of temporal principal component analysis (PCA). Furthermore, a single-trial classification pipeline was developed to detect changes in mental fatigue levels. Results: We found increased IIV in the RT metric in the fatigue state compared to the alert state. The same sequence of ERPs (N1, P2, N2, P3a, P3b, and slow wave, or SW) was separated from both raw and reconstructed ERPs using PCA, whereas the variances explained by the separated ERPs differed between raw and reconstructed ERPs owing to IIV. In particular, a stronger N2 was detected in the fatigue state than in the alert state after RIDE. The single-trial fatigue detection pipeline yielded an acceptable accuracy of 73.3%. Comparison with existing methods: IIV has been linked to aging and brain disorders, and as an extension, our findings demonstrate that IIV is an efficient indicator of mental fatigue. Conclusions: This study reveals significant modulations of mental fatigue on IIV at the behavioral and neural levels and establishes a robust mental fatigue detection pipeline.


Introduction
Prolonged working hours and cognitively demanding tasks are common experiences in modern life. However, human cognitive resources are limited, and mental fatigue increases when we engage in work and tasks for a long period of time. Mental fatigue generally leads to deteriorated behavioral performance, reduced motivation, and failure to sustain attention (J. Liu, Zhu, et al., 2020). This is referred to as the time-on-task effect (Gao et al., 2022) and/or vigilance decrement (Reteig et al., 2019). Mental fatigue has been reported as a main factor in traffic accidents and poor work efficiency (S. Liu et al., 2023). To alleviate such consequences, efforts have been made to reveal the underlying mechanisms of mental fatigue (J. Liu et al., 2024) and to monitor its level (Chen et al., 2023). The modulatory effects of mental fatigue have been examined for numerous cognitive functions, such as response inhibition (Guo et al., 2018; Kato et al., 2009), visual selective attention (Faber et al., 2012), sustained attention (Boksem et al., 2005; J. Liu et al., 2023), and top-down cognitive control (Kok, 2022; Lorist, 2008). Nevertheless, the majority of cognitive and neurophysiological studies have assessed differences between alert and fatigue states averaged across trials, and have thus overlooked the neural underpinnings of intra-individual variability (IIV). In particular, the changes in IIV induced by mental fatigue have been poorly studied.
IIV, or response variability, reflects dynamic, transient, and within-subject changes in behavioral performance and brain functions (Wei et al., 2021). Together with inter-subject variability and inter-group variability, it is one of three empirical sources of intrinsic variation in cognitive functions (Braver, 2012). Here, we aim to gain more insight into the mechanisms that are central to mental fatigue and to intra-individual response variability. Growing evidence suggests that IIV is not a random phenomenon, but a result of different neurological processes (Fjell & Walhovd, 2007; Leue et al., 2013; Mirajkar & Waring, 2023). Furthermore, response variability is generally discussed in terms of several attributes, such as magnitude, latency, intensity, or quality (Fiske & Rice, 1955; Joly-Burra et al., 2018).
IIV in behavioral performance (e.g., increased fluctuations in reaction time) has been shown to be a common component of brain disorders and aging-related cognitive decline (MacDonald et al., 2006). Similarly, IIV in brain responses has been suggested as an effective indicator of neural functionality and neurophysiological characteristics of the brain (Ouyang et al., 2017).
High-temporal-resolution electroencephalography (EEG), especially event-related potentials (ERPs), is a well-known approach to characterizing the neural dynamics of the brain during cognitive processes (G. Zhang & Luck, 2023). Response variability in single-trial ERPs has been used to depict within-subject variations at the neural level (Leue et al., 2013). Specifically, IIV is generally found in late ERP components such as the N2 and the late positive component (LPC) (Barry et al., 2020; Polich, 2020). The LPC has been shown to consist of three subcomponents, namely P3a, P3b, and slow wave (SW). Leue et al. (2013) have shown that response variability in N2 amplitude incorporates systematic variance derived from a cognitive control task. Furthermore, intra-individual P3a amplitude variation has been found to be positively associated with age and negatively related to fluid intelligence and cortical thickness (Fjell & Walhovd, 2007; Joly-Burra et al., 2018). Recent IIV studies go beyond ERP amplitude and also seek to exploit the ERP latency variability associated with cognitive functions and mental abilities (Ouyang et al., 2017). The latency of the P3 component, an efficient measure of mental chronometry, has been reported to be closely related to the corresponding reaction time (Duncan et al., 2009). Although response variability from ERPs has been used in brain research, retrieving objective information from single-trial ERPs, especially estimating component latency, has proven challenging owing to the low signal-to-noise ratio (SNR) and the overlapping spectra of noise and signal (Da Pelo et al., 2018).

Journal Pre-proof
Several single-trial ERP latency estimation methods have been proposed in the literature. They can roughly be divided into four categories: filtering and peak-picking, template matching, maximum likelihood estimation, and decomposition methods (Ouyang et al., 2017). Among these methods, temporal filtering is a typical approach, which assumes that an ERP component occupies a specific frequency band. For example, a low-pass filter at 3-5 Hz has been applied to estimate single-trial P3 latency by removing distracting high-frequency peaks (Jaškowski & Verleger, 2000). Temporal filtering is limited by the fact that EEG noise is also present in the low-frequency range. The basic assumption of template matching is that the ERP component has a specified waveform morphology. In previous template matching studies (Alvarado-Gonzalez et al., 2016; Woody, 1967), single-trial P3 latency was characterized using an iterative procedure. Still, EEG noise that mimics the template morphology cannot be excluded. The maximum likelihood methods were developed on the hypothesis that the Fourier coefficients of EEG noise are normally distributed across trials, so that the likelihood of the noise is maximized when its properties are best approximated by a Gaussian (Jaškowski & Verleger, 1999). A major restriction of this method is that it is strongly affected by EEG noise, which can even lead to convergence problems. Decomposition methods mainly depend on a variety of definitions of ERP components, such as topographic, temporal, or statistical properties. Principal component analysis (PCA) is one of the most widely used decomposition methods; it is generally efficient for separating average-based ERP components but does not consider latency variations (Dien, 2012; G. Zhang et al., 2020). Taking into consideration the advantages of the above-mentioned methods, Ouyang et al. (2011, 2015a) proposed the residue iteration decomposition (RIDE) method to assess trial-to-trial latency variations from single-trial ERPs. RIDE integrates ERP decomposition based on latency variation with single-trial latency estimation using template matching, low-pass filtering, and likelihood methods.

In addition to revealing how mental fatigue modulates intra-individual response variability, it is important to develop robust methods to detect and assess mental fatigue (Lin et al., 2022). With the development of brain-computer interfaces (BCI) (Blankertz et al., 2011; Lotte et al., 2007, 2018), decoding mental states from single-trial ERPs has become an important branch of modern neuroscience. Although great efforts have been made in single-trial analyses, achieving good performance remains challenging owing to trial-to-trial variability and low single-trial SNR.
In the present study, based on an EEG dataset recorded during a long-duration flanker task, we explored how mental fatigue affects IIV in behavioral performance and single-trial ERPs. Within-subject variations of response time (RT) have been used to measure a subject's inconsistency in behavioral performance (Adleman et al., 2016). Trial-to-trial ERP latency variability was estimated by comparing latency-corrected ERPs with raw ERPs. Specifically, latency-corrected ERPs were reconstructed from single trials using RIDE. To cope with challenges in ERP analysis such as the mixture of latent underlying components, temporal PCA was used to separate ERP components from both raw and reconstructed ERPs. We then compared the temporal PCA results to explore the influence of mental fatigue on latency variations. In our previous study (J. Liu, Zhang, Zhu, Ristaniemi, et al., 2020), a single-trial analysis pipeline integrating discrete wavelet packet transformation (DWPT) and multilinear principal component analysis (MPCA) was used to detect and localize heart diseases from electrocardiography (ECG). Here, we extended this previously established single-trial analysis pipeline to monitor changes in mental fatigue. Altogether, this study provides new insights into the application of RIDE for investigating the modulations of mental fatigue on ERP latency variability and proposes a feasible analysis pipeline for detecting mental fatigue from single-trial ERPs.

Materials and Methods
We used our previously recorded EEG dataset and briefly summarize the experimental setup and recording sessions here. Details of the participants, experimental task, procedures, and acquisition can be found in an earlier study (J. Liu, Zhang, Zhu, Liu, et al., 2020).

Participants
Twenty right-handed university students (12 females, mean age = 21.9, SD = 2.4, range 18-28 years) participated in the experiment. They all had normal or corrected-to-normal vision, regular sleep patterns, and no history of prescription medications. This study was approved by the Ethical Committee of the Liaoning Normal University and was conducted in accordance with the tenets of the Declaration of Helsinki. All participants were informed about the contents of the experiment and gave their written informed consent.

Stimuli and task
Participants were asked to perform a modified Eriksen flanker task (Eriksen & Eriksen, 1974), in which a five-letter string consisting of the letters M and N was used. Congruent (e.g., MMMMM) and incongruent (e.g., NNMNN) stimuli were presented in random order in 60% and 40% of the trials, respectively. The participants were instructed to respond to the central letter M or N on the keyboard. Each trial lasted a total of 3 seconds, starting from a fixation cross in the middle of the screen. After 1000 ms, a stimulus was presented for 200 ms and then a response was required within a maximum of 600 ms. Following the response, an interval of 200 ms was provided for error response awareness, and then the final feedback on the response (e.g., 'Correct') was shown for 1000-1500 ms.

Procedures
Participants completed a practice session to become familiar with the flanker task. In addition, they were asked to abstain from coffee, tea, and alcohol for 24 hours before the experiment. On the experiment day, participants were instructed to hand over their mobile phones and watches to remove any indication of time during the experiment. Thereafter, the participants performed the task for 140 minutes without a break in a sound-attenuated and electrically shielded room. The experiment included seven 20-minute blocks, each consisting of 400 trials (2800 trials in total). Furthermore, we provided a monetary reward in blocks 2 and 6 to study the interaction between mental fatigue and reward (J. Liu, Zhang, Zhu, Liu, et al., 2020).
In the present work, we analyzed the behavioral and EEG data from block 1 (alert state) and block 5 (fatigue state) in order to explore the modulations of mental fatigue on intra-subject variability. Evidence that blocks 1 and 5 corresponded to the alert and fatigue states, respectively, was obtained by analyzing behavioral performance in these two blocks and is reported in the Results.

EEG acquisition and preprocessing
During the stimulus presentation, continuous EEG was recorded using a 64-channel EEG system (ANT Neuro, Hengelo, The Netherlands) at a sampling frequency of 500 Hz. The impedance of each electrode was kept below 10 kΩ, and the EEG signals were referenced online to the CPz electrode.
The EEG data were processed offline using MATLAB (The MathWorks, R2022a). First, a notch filter at 50 Hz was applied to the EEG signals, followed by a high-pass filter at 0.5 Hz and a low-pass filter at 30 Hz. Subsequently, noisy EEG channels were identified by visual inspection and replaced using signals from surrounding channels with the spherical spline interpolation method (Perrin et al., 1989). Next, the direct current (DC) offset was removed from the EEG signals. Further, a wavelet threshold method was applied to the EEG to remove large spikes and drifts (C. Zhang et al., 2018). Using independent component analysis (ICA) (Himberg & Hyvärinen, 2003), artifact components, including those related to ocular and muscle movements, were removed. Thereafter, the EEG signals were re-referenced offline to the averaged mastoid electrodes (M1 and M2). Finally, the EEG was segmented into epochs from 500 ms pre-stimulus to 1000 ms after stimulus onset.
Data analysis

Behavioral performance
For the RT metric, incorrect trials and trials with RT < 100 ms or > 600 ms were excluded. First, response accuracy and mean RT in blocks 1 and 5 were computed to verify that these two blocks corresponded to the hypothesized alert and fatigue states. Next, within-subject variability of behavioral performance was examined using RT: the standard deviation of RT (Jensen, 1992) across experimental trials was calculated separately for the alert and fatigue states.
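The behavioral IIV measure described above can be illustrated with a minimal Python sketch. The function name `rt_variability` and the toy RT values are hypothetical and serve only to show the exclusion rule (correct trials, 100 ms ≤ RT ≤ 600 ms) followed by the within-subject standard deviation; the original analysis was carried out in MATLAB.

```python
from statistics import mean, stdev

def rt_variability(rts_ms, correct, lo=100.0, hi=600.0):
    """Return (mean RT, SD of RT) over correct trials with lo <= RT <= hi."""
    kept = [rt for rt, ok in zip(rts_ms, correct) if ok and lo <= rt <= hi]
    return mean(kept), stdev(kept)  # sample SD across the remaining trials

# Hypothetical single-subject data (ms); not values from the study.
alert_rt = [420, 435, 410, 90, 450, 428]
alert_ok = [True, True, True, True, False, True]
fatigue_rt = [380, 520, 455, 610, 330, 590]
fatigue_ok = [True, True, True, True, True, True]

alert_mean, alert_sd = rt_variability(alert_rt, alert_ok)
fatigue_mean, fatigue_sd = rt_variability(fatigue_rt, fatigue_ok)
print(f"alert:   mean={alert_mean:.1f} ms, SD={alert_sd:.1f} ms")
print(f"fatigue: mean={fatigue_mean:.1f} ms, SD={fatigue_sd:.1f} ms")
```

In this toy example the fatigue block has the larger SD, which is the pattern the IIV measure is designed to capture.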

Temporal PCA to ERPs
Trials were re-segmented into epochs lasting 800 ms after stimulus onset with a pre-stimulus baseline of 200 ms. Correct trials with amplitudes under 100 µV were used to calculate grand mean ERPs. Consequently, the number of remaining trials differed between the alert and fatigue states, which could bias comparisons between the two conditions. To exclude this bias, an equalization method was applied to each subject by randomly and repeatedly selecting, from each condition, a subset of trials equal in size to the minimum trial number across conditions (mean = 235 trials, SD = 73). The equalization procedure was repeated 1000 times, and the grand mean ERPs from all 1000 repetitions were averaged to generate the final ERPs.
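The trial-equalization step can be sketched as follows. This is a simplified illustration under stated assumptions: each trial is reduced to a single hypothetical amplitude value rather than a full epoch, and the function name `equalized_mean` is not from the original study.

```python
import random
from statistics import mean

def equalized_mean(trials_by_condition, n_repeats=1000, seed=0):
    """Average each condition over repeated random subsets of equal size."""
    rng = random.Random(seed)
    # Subset size is the smallest trial count across conditions.
    n_min = min(len(trials) for trials in trials_by_condition.values())
    result = {}
    for condition, trials in trials_by_condition.items():
        # Draw n_min trials without replacement, n_repeats times.
        subset_means = [mean(rng.sample(trials, n_min)) for _ in range(n_repeats)]
        result[condition] = mean(subset_means)
    return n_min, result

# Hypothetical per-trial amplitudes (µV) for one subject.
data = {
    "alert": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "fatigue": [2.0, 4.0, 6.0],
}
n_min, erp_mean = equalized_mean(data)
print(n_min, erp_mean)
```

Both conditions then contribute the same number of trials to every repetition, so the comparison is not driven by differences in trial count.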
The stimulus-locked ERPs, with temporal and spatial information from the two conditions and all participants, were arranged into a matrix with 2480 cases (20 participants × 62 channels × 2 conditions) and 500 variables (time points). Temporal PCA was applied to the ERP matrix, and the factor loadings were estimated from the covariance matrix because this yields an electrophysiologically meaningful interpretation (Kayser & Tenke, 2003). The oblique Promax factor rotation was used to attenuate the influence of volume conduction on the EEG data (Dien, 2010; Dien et al., 2007). The PCA components explaining more than 1% of the variance were displayed in order of latency.
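A minimal NumPy sketch of covariance-based temporal PCA on a cases × time-points matrix is given below. It is an illustration only: the data are synthetic (two hypothetical, non-overlapping components), the function name `temporal_pca` is an assumption, and the Promax rotation used in the study is deliberately omitted, so the components here are unrotated.

```python
import numpy as np

def temporal_pca(erp, var_threshold=0.01):
    """Unrotated temporal PCA of an (n_cases, n_timepoints) ERP matrix."""
    centered = erp - erp.mean(axis=0)
    cov = np.cov(centered, rowvar=False)          # time x time covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # sort components by variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()
    keep = ratio > var_threshold                  # components explaining > 1%
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])  # covariance loadings
    scores = centered @ eigvecs[:, keep]          # one score per case and component
    return loadings, scores, ratio[keep]

# Synthetic example: two Gaussian-shaped components with random amplitudes.
rng = np.random.default_rng(0)
n_cases, n_time = 200, 50
t = np.arange(n_time)
bump_a = np.exp(-0.5 * ((t - 15) / 3.0) ** 2)
bump_b = np.exp(-0.5 * ((t - 35) / 3.0) ** 2)
amp_a = rng.normal(0.0, 1.0, size=(n_cases, 1))
amp_b = rng.normal(0.0, 0.5, size=(n_cases, 1))
noise = rng.normal(0.0, 0.01, size=(n_cases, n_time))
erp = amp_a * bump_a + amp_b * bump_b + noise

loadings, scores, ratio = temporal_pca(erp)
peak_latencies = np.abs(loadings).argmax(axis=0)  # peak sample of each loading
print(loadings.shape, ratio.round(3), peak_latencies)
```

Sorting the retained components by the peak latency of their loadings mirrors the "in order of latency" presentation described above.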

RIDE to reconstruct ERPs
To address the intra-individual trial-to-trial latency variability, we employed an updated version of RIDE (Ouyang et al., 2015a, 2015b) to reconstruct raw ERPs. The updated version of the RIDE framework uses L1-norm minimization to cope with serious distortion problems (Ouyang et al., 2015b). RIDE is built on the following ERP model:

$$\mathrm{ERP}(t) = \sum_{n} C_n(t) * \rho_n(t)$$

where $C_n(t)$ is the $n$-th ERP component, $*$ represents the convolution operation, and $\rho_n$ denotes the probability density function of the single-trial latency of that component. Only if $\rho_n$ is a delta function and the components are located at their most probable latency can the latency-corrected ERP ($\mathrm{ERP}_c$) be realized as follows:

$$\mathrm{ERP}_c(t) = \sum_{n} C_n(t)$$

The early ERP components, such as $C_1(t)$ and $C_2(t)$, are less affected by latency variability, whereas the later components, $C_n(t)$, show evident latency variability, as shown in Figure 1A.
In line with the general assumption that the ERP is composed of three component clusters associated with stimulus-triggered processes, central processing, and motor-related responses (Luck, 2005), RIDE generally decomposes single-trial ERPs into three component clusters: component cluster S locked to stimulus onset, component cluster C without an explicit time marker, and component cluster R locked to responses. Still, the RIDE algorithm can be used in different schemes, such as S + C, according to the cognitive processes involved in a particular study (Ouyang et al., 2017). The RIDE algorithm was implemented with an inner loop representing a decomposition module using a time marker and an outer loop representing a latency estimation module with a self-optimized iteration for latency estimation (see the flow chart in Figure 1B). The inner loop is terminated when the difference in latency for component cluster C between two successive iterations is smaller ($< 10^{-3}$) than that for the two initial iterations. In particular, the R component cluster is obtained by leveraging the RT metric in RIDE processing, which enables a connection between behavior and brain responses.
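The role of convolution and of the latency distribution in the ERP model can be illustrated with a small, self-contained sketch. It is not the RIDE algorithm: the component waveform and the single-trial delays are hypothetical, the delays are assumed to be known rather than estimated, and circular shifts are used for simplicity. The sketch shows that averaging latency-jittered trials equals convolving the component with the empirical latency distribution, and that realigning the trials recovers the unsmeared component.

```python
def shift(wave, tau):
    """Circularly delay a waveform by tau samples."""
    n = len(wave)
    return [wave[(t - tau) % n] for t in range(n)]

def average(trials):
    """Point-by-point average across trials."""
    return [sum(column) / len(trials) for column in zip(*trials)]

def circular_convolve(wave, density):
    """Circular convolution of a waveform with a latency distribution."""
    n = len(wave)
    return [sum(wave[(t - k) % n] * density[k] for k in range(n)) for t in range(n)]

component = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]   # hypothetical late component
latencies = [0, 1, 1, 2, 3, 1, 2, 0]         # hypothetical single-trial delays (samples)

# Single trials: the same component, delayed differently on every trial.
trials = [shift(component, tau) for tau in latencies]
raw_erp = average(trials)

# Empirical latency distribution and the smeared ERP predicted by the model.
n = len(component)
rho = [latencies.count(k) / len(latencies) for k in range(n)]
smeared = circular_convolve(component, rho)

# Latency correction: undo each trial's delay before averaging.
corrected = average([shift(trial, -tau) for trial, tau in zip(trials, latencies)])

print("raw ERP peak:      ", max(raw_erp))
print("corrected ERP peak:", max(corrected))
```

The raw average has a lower and broader peak than the latency-corrected average, which is the smearing effect that latency correction is intended to remove from late components.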

Single-trial classification analysis
We explored the effects of mental fatigue on IIV by applying feature extraction and classification methods to single trials. A flowchart of the single-trial classification analysis is shown in Figure 2. The unequal trial numbers between the two conditions were taken into account when classifying alert versus fatigue states for each subject, as in the temporal PCA of ERPs. An equalization procedure was conducted by randomly selecting, for each subject, the minimum number of trials from the two conditions, and this procedure was repeated 100 times. Classification accuracy, defined as the mean of sensitivity and specificity (Myrden & Chau, 2017), was computed by averaging the accuracies from the 100 repetitions (trial-balanced data).
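The accuracy measure defined above, the mean of sensitivity and specificity, can be written out in a few lines. The labels and predictions below are hypothetical, and treating "fatigue" as the positive class is an assumption made for illustration.

```python
def balanced_accuracy(y_true, y_pred, positive="fatigue"):
    """Mean of sensitivity and specificity for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)   # fraction of fatigue trials detected
    specificity = tn / (tn + fp)   # fraction of alert trials detected
    return (sensitivity + specificity) / 2

# Hypothetical labels: 4 fatigue trials and 6 alert trials.
y_true = ["fatigue"] * 4 + ["alert"] * 6
y_pred = ["fatigue", "fatigue", "fatigue", "alert",
          "alert", "alert", "alert", "fatigue", "fatigue", "fatigue"]

accuracy = balanced_accuracy(y_true, y_pred)
print(accuracy)
```

Because each class contributes equally, this measure is not inflated when one state has more trials than the other.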
To increase the SNR, we used single-trial data from six channel-clusters rather than from single channels. The six channel-clusters were chosen based on the topographic activations obtained from the temporal PCA, namely cluster 1 (AF3, AF4, F1, F2, Fz), cluster 2 (F1, F2, Fz, FC1, FC2, FCz), cluster 3 (FC1, FC2, FCz, C1, C2, Cz), cluster 4 (C1, C2, Cz, CP1, CP2, CPz), cluster 5 (Fp1, Fp2, AF3, AF4, F1, F2, Fz), and cluster 6 (FC1, FC2, FCz, C1, C2, Cz, CP1, CP2, CPz). For feature extraction, single-trial data from the selected channel-clusters in the time window of -500 to 1000 ms relative to stimulus onset (chosen to limit the contamination of the results by edge artifacts (Cohen & Cavanagh, 2011)) were entered into the DWPT, which provides a more precise frequency resolution than the discrete wavelet transform (DWT) (Rajpoot et al., 2003). Based on empirical and practical knowledge (C. Zhang et al., 2018), we chose the "db6" mother wavelet and seven layers of decomposition, resulting in a resolution of around 2 Hz for each DWPT coefficient. As the frequency bands of interest were concentrated below 30 Hz, a total of 15 DWPT coefficients (corresponding to frequency bands 0.5-30 Hz) were reconstructed at the seventh layer. We re-segmented the 15 reconstructed DWPT waveforms into 0 to 800 ms after stimulus onset and extracted mean values from fixed time windows corresponding to different ERP components. A total of 6 mean values were extracted for each channel-cluster and DWPT waveform according to the six ERP components derived from temporal PCA (illustrated in the Results). A feature tensor containing 6 channel-clusters × 15 frequency bins × 6 temporal values × trials was constructed and subjected to MPCA for dimensionality reduction (J. Liu, Zhang, Zhu, Ristaniemi, et al., 2020; Lu et al., 2008). By manipulating the percentage of energy retained by MPCA, we determined the low-dimensional and representative features for classification. We then estimated the classification accuracy by performing 100 runs of 5-fold cross-validation with random permutations.
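The "around 2 Hz" resolution and the size of the feature tensor follow from simple arithmetic, sketched below. The calculation assumes that a full wavelet packet tree of depth 7 splits the 0 to Nyquist range into 2^7 equal-width bands and that the 15 retained bands are the lowest ones in frequency order; the helper name `dwpt_bands` is hypothetical.

```python
def dwpt_bands(fs_hz, levels, n_bands):
    """Nominal band edges (Hz) of the lowest n_bands wavelet packet nodes."""
    n_nodes = 2 ** levels                 # terminal nodes at the deepest level
    width = (fs_hz / 2) / n_nodes         # equal split of 0..Nyquist
    return width, [(k * width, (k + 1) * width) for k in range(n_bands)]

width, bands = dwpt_bands(fs_hz=500, levels=7, n_bands=15)
print(f"band width: {width:.3f} Hz")
print(f"upper edge of band 15: {bands[-1][1]:.2f} Hz")

# Features per trial before MPCA: channel-clusters x frequency bins x time windows.
n_features = 6 * 15 * 6
print("features per trial:", n_features)
```

With 540 raw features per trial, dimensionality reduction before classification is needed to keep the number of features small relative to the number of trials.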
Four different binary classifiers were used for mental fatigue detection: support vector machines (SVM) with a linear kernel function (Muller et al., 2001), linear discriminant analysis (LDA) (Martinez & Kak, 2001), Gaussian naive Bayes (NB) (Rish, 2001), and random forest (Kleinberg, 2000). The averaged values from the time windows of the six components were extracted from the reconstructed wavelet packet coefficients. We only considered the 1st to 15th subbands, covering the frequency bands 1-30 Hz. The tensor features consisting of temporal values, channel-clusters, frequency, and samples (trials) were reduced by MPCA to obtain representative features. The reduced features were subjected to SVM classifiers with linear and MLP functions.

Statistical results of these PCs with significant effects were corrected for multiple comparisons using the false discovery rate (FDR) (Benjamini & Yekutieli, 2001, 2005). All two-sided p values or corrected p values below 0.05 were considered significant.
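As an illustration of FDR control, the sketch below implements the Benjamini-Yekutieli step-up procedure for p values under arbitrary dependence. That this specific variant was the one applied is an assumption based on the citation; the text does not state it. The p values are hypothetical.

```python
def benjamini_yekutieli(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    c_m = sum(1.0 / j for j in range(1, m + 1))   # harmonic correction factor
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose ordered p value falls under its threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / (m * c_m):
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected

# Hypothetical p values, e.g. one per principal component.
p_values = [0.001, 0.008, 0.039, 0.041, 0.27, 0.60]
significant = benjamini_yekutieli(p_values)
print(significant)
```

The harmonic factor makes this procedure more conservative than the Benjamini-Hochberg procedure, which would use the same thresholds without dividing by `c_m`.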

Temporal PCA outcomes
Figure 4A illustrates raw ERPs averaged over midline sites (Fz, FC1, FCz, FC2, C1, Cz, C2, CP1, CPz, CP2, Pz) in the alert and fatigue states. The raw ERPs can be identified as a series of ERP components in order of latency, including N1, P2, N2, and the late positive component (LPC). Since latency variability generally affects late ERPs, the effect of mental fatigue on the LPC was considered. Based on visual inspection, and consistent with the literature describing the LPC as a typical broad positivity between 400 and 800 ms after stimulus onset (Friedman & Johnson, 2000), the LPC averaged in the time window of 420-800 ms, marked by the grey rectangle in Figure 4A, was used for statistical analysis. A paired t-test on LPC amplitude showed no difference between the alert and fatigue conditions (t(19) = 0.88, p = 0.39, g = 0.19). Although the LPC consists of multiple subcomponents, conventional ERP analysis is limited by the mixture of underlying ERP components. It is therefore difficult to decode the modulations of mental fatigue on separate LPC subcomponents from raw ERPs.
To separate overlapping ERPs, we performed temporal PCA and then explored the effects of mental fatigue on separated ERP components.
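The paired comparison between alert and fatigue amplitudes can be sketched as follows. The amplitude values are hypothetical, only the t statistic and its degrees of freedom are computed (the p value and the effect size g are left out because the text does not specify how g was derived), and the helper name `paired_t` is not from the original study.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(condition_a, condition_b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))  # mean difference / its SE
    return t_value, n - 1

# Hypothetical mean amplitudes (µV) for four participants.
alert = [5.1, 4.8, 6.0, 5.5]
fatigue = [4.9, 4.9, 5.2, 5.0]

t_value, df = paired_t(alert, fatigue)
print(f"t({df}) = {t_value:.2f}")
```

With 20 participants the same computation gives 19 degrees of freedom, matching the t(19) values reported in the Results.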

RIDE + PCA outcomes
Trial-to-trial latency variability has been shown to influence ERPs, especially late ERP components (Ouyang et al., 2017). Here, RIDE was applied to correct intra-subject trial-to-trial latency variations. Figure 5A illustrates the RIDE-corrected results, including the stimulus-locked component cluster S, the component cluster C without explicit time-locking, and the response-locked component cluster R, as well as the reconstructed ERP. The reconstructed ERPs from all electrodes, participants, and conditions were entered into the temporal PCA to derive the components of the latency-corrected ERP. Similarly, six components in the temporal order of N1, P2, N2, P3a, P3b, and SW were identified (Figure 5B), explaining 95.00% of the total variance. Nevertheless, the variances explained by these six components changed after using RIDE, especially for P3a, P3b, and N2, which accounted for 51.83%, 23.89%, and 10.17% of the variance, respectively.
The temporal fluctuations and brain topographic patterns of these six components were almost the same as those extracted directly from temporal PCA, indicating the stability of the existing ERP components. Thus, the grand mean amplitudes of the RIDE + PCA outcomes were obtained from the same electrodes and corresponding temporal windows as for the temporal PCA outcomes.
A significant difference between the alert and fatigue states was detected for the N2 component (t(19) = 3.84, p < 0.01, g = 0.83), the P3a component (t(19) = 3.24, p < 0.01, g = 0.70), and the SW (t(19) = 3.44, p < 0.01, g = 0.74), indicating stronger activation in N2 and SW, as well as weaker activation in P3a, in the fatigue state compared to the alert state. Statistical analyses of P3b did not reveal a significant difference (t(19) = 0.72, p = 0.48, g = 0.16) between the fatigue and alert states, consistent with the PCA outcomes. Taken together, after accounting for the intra-subject trial-to-trial variability using RIDE, the variances explained by the separated components changed greatly, although the same six ERPs were extracted by the temporal PCA. The modulations of fatigue on the LPC subcomponents were the same as in the PCA outcomes, whereas a significant effect of fatigue on N2 was found only in the RIDE + PCA outcomes.

almost equally distributed in all channel-clusters and time windows. These features were fed into classifiers, following the rule that the ratio of samples to features should be between five and ten (Lotte et al., 2007, 2018).
The SVM and LDA classifiers generally performed better than NB and random forest in the binary classification of fatigue versus alert trials, and the random forest classifier achieved the lowest performance.

Discussion
The aim of this study was to investigate how mental fatigue modulates IIV and to establish a robust analysis system to assess mental fatigue. Based on an EEG dataset collected during a prolonged flanker task, the modulations of mental fatigue on intra-individual trial-to-trial variability at the behavioral and neural levels were examined. Regarding behavioral performance, we found larger variability in RT when subjects were in the fatigue state relative to the alert state. In terms of electrophysiological indicators, before considering within-subject latency variations, a cascade of ERPs in the latency order of N1, P2, N2, P3a, P3b, and SW was derived from temporal PCA. To quantify single-trial latency variability, we employed RIDE to reconstruct latency-corrected ERPs and then applied temporal PCA. The same cascade of ERP components was derived from the RIDE + PCA outcomes; nevertheless, the variances explained by these principal components, specifically for late ERPs, changed markedly compared to the PCA results. In addition, significant differences in P3a and SW between the alert and fatigue conditions were detected in both the PCA and RIDE + PCA outcomes. A stronger N2 magnitude in the fatigue state than in the alert state was observed only in the RIDE + PCA outcomes.
In the case of trial-wise variations, we introduced a robust single-trial machine learning analysis pipeline and achieved an acceptable alert versus fatigue classification accuracy.
The modulations of mental fatigue on IIV in behavioral performance were explored by comparing the standard deviations of RT across experimental trials between the alert and fatigue conditions. Our study showed larger standard deviations of RT in the fatigue state compared to the alert state, indicating that an increased level of fatigue leads to larger trial-to-trial fluctuations in behavioral responses. In fact, previous research has placed increased focus on the importance of IIV, which can confer perspective information on cognitive functionality above mean performance (MacDonald et al., 2006; Myerson et al., 2007). As such, within-subject variations in RT have been examined extensively in neuroscience studies, which have shown the following associations: decreasing IIV in RT through childhood (Williams et al., 2005) and increasing RT variability with increasing age in adulthood (Mirajkar & Waring, 2023; Myerson et al., 2007); more variable responding in attention-deficit hyperactivity disorder (ADHD) (Castellanos & Tannock, 2002; Johnson et al., 2007); and larger performance variations after traumatic brain injury (Stuss et al., 1994). As an extension, the present study supports the idea that increased intra-individual performance variability is associated with mental fatigue elicited by engagement in a long-duration cognitive task. Response variability has been considered an external manifestation of underlying alterations in the brain (MacDonald et al., 2006), though few integrative studies have linked behavior to brain responses in IIV.
The modulations of mental fatigue on IIV in brain responses were investigated by comparing the ERP components derived from temporal PCA before and after considering trial-to-trial latency variability. Without consideration of IIV, we applied PCA to grand mean ERPs and separated a sequence of ERP components in the temporal order of N1, P2, N2, P3a, P3b, and SW, explaining 2.38%, 12.01%, 1.46%, 18.16%, 58.12%, and 3.42% of the variance, respectively. Further, a decreased P3a and an increased SW were detected in the fatigue state relative to the alert state. The latency variability within an individual was addressed using RIDE, which has been developed and validated as a powerful method for analyzing and reconstructing latency-variable ERPs from single trials (Ouyang et al., 2011, 2015a, 2015b, 2017). We then subjected the reconstructed ERPs to PCA and derived the same cascade of ERP components: N1, P2, N2, P3a, P3b, and SW, explaining 1.94%, 2.77%, 10.17%, 51.83%, 23.89%, and 4.41% of the variance, respectively. Compared to the PCA outcomes alone, the explained variances changed most for the P3a and P3b components, followed by the P2 and N2 components. Moreover, a similar effect of mental fatigue on P3a and SW was detected, while a stronger N2 in the fatigue state was uncovered only in the RIDE + PCA outcomes. The N2 component has been documented as an effective neural signature of conflict monitoring (Borja-Cacho & Matthews, 2008). Furthermore, P3a and P3b have been linked to attentional mechanisms and sequential working memory (Barry et al., 2020; Polich & Criado, 2006), and SW has been described as an indicator of further processing of attended stimuli (Squires et al., 1975; Teixeira-Santos et al., 2020) or of conceptual operations (Strüber & Polich, 2002). Interpreting the modulations of mental fatigue in light of the roles attributed to these ERP components, we speculate that mental fatigue impairs conflict processing, attention, and the further processing of attended information, consistent with the deteriorated attention and cognitive control capability under mental fatigue reported in previous studies (Breckel et al., 2011; Faber et al., 2012; J. Liu, Zhu, et al., 2020; Möckel et al., 2015). These results indicate that IIV can lead to mixing and smearing effects on relatively late ERPs (e.g., N2, P3a, and P3b), consistent with previous studies (Fjell & Walhovd, 2007; Leue et al., 2013; Ouyang et al., 2017). Our results also suggest that changes in fatigue level result in different fluctuations in trial-to-trial latency variability.

We further classified single-trial ERPs into alert versus fatigue states using our proposed analysis pipeline. An averaged subject-specific classification accuracy of 73.3% ± 4.8% was obtained using a linear SVM in this study. In an earlier passive EEG-BCI study (Myrden & Chau, 2017), eleven participants performed mental arithmetic, anagram, and grid-recall tasks to induce mental fatigue and other emotions. Myrden and co-authors achieved an accuracy of 74.8% ± 9.1% for participant-dependent single-trial mental fatigue detection using a shrinkage linear discriminant analysis (LDA) binary classifier. The present study demonstrated the feasibility and robustness of the analysis pipeline, first proposed for the detection of myocardial infarction from ECG (J. Liu, Zhang, Zhu, Ristaniemi, et al., 2020), for EEG-based mental fatigue detection, even though there was strong trial-to-trial variability during prolonged task engagement. Still, the single-trial analysis pipeline in this study is limited to subject-specific classification. Further studies are needed to develop robust across-subject single-trial fatigue detection methods. Further studies should also explore the modulations of mental fatigue on inter-individual variability.

Conclusion
We introduced the RIDE algorithm to reconstruct latency-variant ERPs and compared the reconstructed ERPs with raw ERPs by means of temporal PCA in order to detect the modulations of mental fatigue on specific cognitive processes and trial-to-trial latency variability. We also proposed a single-trial classification pipeline for monitoring changes in fatigue state. These methods allowed us to quantify the effects of mental fatigue on IIV and to detect mental fatigue from single-trial ERPs. Specifically, we explored the effects of mental fatigue on intra-individual trial-to-trial variability during a long-duration flanker task using behavioral performance and ERPs from single trials. Within-subject variation in RT increased with increasing fatigue level. Using temporal PCA, a total of six ERP components (N1, P2, N2, P3a, P3b, and SW) were extracted from both raw and reconstructed ERPs, and the explained variances changed significantly after accounting for trial-to-trial latency variability using RIDE. P3a and SW were affected by mental fatigue. In particular, after accounting for latency variability, a significant difference in N2 was detected when subjects shifted from the alert to the fatigue state. We further examined the possibility of classifying alert versus fatigue states at the single-trial level. Using the proposed single-trial analysis pipeline, we obtained an acceptable classification accuracy for alert and fatigue trials. In summary, these exploratory findings provide evidence for the modulations of mental fatigue on IIV and extend the roles of IIV, previously related to aging and brain injury, to normal fluctuations of fatigue level induced by prolonged task engagement. Our results further indicate that, despite trial-to-trial variations and low SNR, it is feasible to establish a robust machine learning system for single-trial analysis during a cognitive task.
component cluster S locked to stimulus onset, component cluster C without an explicit time marker, and component cluster R locked to responses. Still, the RIDE algorithm can be used in different schemes, such as S + C, according to cognitive processes involved in a particular

Figure 2. The pipeline of single-trial binary classification. An equalization procedure was performed before DWPT. The averaged values from the time windows of the six components were extracted from the reconstructed wavelet packet coefficients. Only subbands 1-15 were considered, covering the 1-30 Hz frequency range. The tensor features, consisting of temporal values, channel clusters, frequency, and samples (trials), were reduced by MPCA to obtain representative features. The reduced features were subjected to SVM classifiers with linear and MLP functions.
Statistical analyses were conducted in MATLAB (The MathWorks, R2022a) and IBM SPSS Statistics version 29.0. Behavioral measures (e.g., standard deviations of RT) and the components separated by temporal PCA with and without RIDE were used as inputs for paired-samples t-tests to assess mental fatigue effects. Principal components (PCs) of the RIDE + PCA and PCA results were obtained by projecting the factor scores and loadings back onto the original temporal space. Grand mean values of the PCs obtained in specific temporal windows from activated EEG channels were subjected to paired t-tests. The effect size of the t-test results was reported using the Hedges' g correction based on the sample standard deviation of the mean difference.
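The paired comparison and effect size described above can be illustrated with a minimal sketch. It assumes that the standardizer is the sample standard deviation of the paired differences and that the small-sample correction is the common approximation J = 1 - 3 / (4 df - 1); the function name and the numerical values are hypothetical and are not taken from the dataset.

```python
import math
import statistics


def paired_t_and_hedges_g(alert, fatigue):
    """Return (t statistic, degrees of freedom, Hedges' g) for paired samples.

    Assumption: Cohen's d is the mean difference divided by the sample SD of
    the paired differences, then scaled by J = 1 - 3 / (4 * df - 1).
    """
    if len(alert) != len(fatigue):
        raise ValueError("paired samples must have equal length")
    n = len(alert)
    if n < 2:
        raise ValueError("need at least two pairs")
    diffs = [f - a for a, f in zip(alert, fatigue)]
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 in the denominator
    if sd_diff == 0.0:
        raise ValueError("differences have zero variance")
    df = n - 1
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    cohen_d = mean_diff / sd_diff
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample bias correction
    return t_stat, df, cohen_d * correction


# Hypothetical per-subject SDs of RT (ms) in the alert and fatigue blocks.
alert_sd_rt = [52.0, 61.0, 48.0, 70.0, 55.0, 63.0]
fatigue_sd_rt = [60.0, 66.0, 59.0, 75.0, 58.0, 71.0]

t_stat, df, g = paired_t_and_hedges_g(alert_sd_rt, fatigue_sd_rt)
print(f"t({df}) = {t_stat:.3f}, Hedges' g = {g:.3f}")
```

With this definition, t equals Cohen's d multiplied by the square root of the number of pairs, so the effect size and the test statistic carry the same information up to the sample size and the correction factor.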

Figure 3A and 3B display the response accuracy and mean RT in block 1 and block 5, namely

Figure 4. (A) Raw ERPs with LPC components marked with a grey rectangle for statistical analysis. Temporal PCA outcomes include (B) six selected factor loadings, and (C) related information and factor scores visualized as activated topographies in the alert and fatigue conditions. Significant differences are marked by black lines and ** p < 0.01.

Figure 5. RIDE results and RIDE + PCA outcomes. (A) The raw ERP was decomposed into the S, C, and R components, which were then used to reconstruct the ERP. The reconstructed ERP was subjected to temporal PCA, yielding (B) factor loadings, related information, and factor scores in the alert and fatigue conditions. Variance values in bold show large differences compared to the PCA outcomes. Significant differences are marked by black lines and ** p < 0.01.

Figure 6. Single-trial classification outcomes. (A) The changes of classification accuracy between alert and fatigue trials with different energy percentages using the SVM-Linear classifier for one subject. (B) Projected matrices from MPCA in the spatial, spectral, and temporal dimensions. (C) Classification results of four classifiers in a binary classification of alert versus fatigue conditions. S represents Subjects 1-20.

$x_i(t)$ represents the EEG data of the $i$th trial at time point $t$ relative to the stimulus onset, $c_k(t)$ is the waveform of the $k$th ERP component, $\tau_{k,i}$ denotes the latency of the $k$th component for trial $i$, and $\varepsilon$ denotes the noise. The latency of individual ERP components is not exactly the same across single trials; namely, $\tau$ is supposed to vary independently and is modulated by different experimental conditions. The latency variability of each component can be represented by a probability density function $\rho(\tau_k)$ of the latency distribution. Therefore, the average ERP is the convolution of the individual ERP components with the corresponding distribution of latencies across trials, represented as:

$$\mathrm{ERP}(t) = c_1(t) * \rho(\tau_1) + c_2(t) * \rho(\tau_2) + \cdots + c_K(t) * \rho(\tau_K)$$
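The convolution relation stated above can be checked numerically with a small sketch. It assumes a single noise-free Gaussian-shaped component and integer latency shifts drawn from a normal distribution; all function names, the component shape, and the jitter parameters are hypothetical choices for illustration and are not part of the RIDE toolbox.

```python
import math
import random


def gaussian_component(n_samples, peak, width):
    """Hypothetical ERP component: a Gaussian bump centred at `peak`."""
    return [math.exp(-0.5 * ((t - peak) / width) ** 2) for t in range(n_samples)]


def shift(signal, lag):
    """Delay `signal` by `lag` samples (negative = earlier), zero-padding edges."""
    n = len(signal)
    return [signal[t - lag] if 0 <= t - lag < n else 0.0 for t in range(n)]


def average_of_jittered_trials(component, lags):
    """Average single trials in which the component is shifted by each lag."""
    n = len(component)
    total = [0.0] * n
    for lag in lags:
        for t, value in enumerate(shift(component, lag)):
            total[t] += value
    return [value / len(lags) for value in total]


def convolve_with_latency_distribution(component, lags):
    """Convolve the component with the empirical latency distribution."""
    n = len(component)
    counts = {}
    for lag in lags:
        counts[lag] = counts.get(lag, 0) + 1
    out = [0.0] * n
    for lag, count in counts.items():
        weight = count / len(lags)  # empirical probability of this latency
        for t in range(n):
            if 0 <= t - lag < n:
                out[t] += weight * component[t - lag]
    return out


random.seed(0)
component = gaussian_component(n_samples=200, peak=100, width=8)
lags = [round(random.gauss(0, 12)) for _ in range(500)]  # trial-wise jitter

avg = average_of_jittered_trials(component, lags)
conv = convolve_with_latency_distribution(component, lags)
max_gap = max(abs(a - b) for a, b in zip(avg, conv))

print(f"max |average - convolution| = {max_gap:.2e}")
print(f"single-trial peak = {max(component):.3f}, averaged peak = {max(avg):.3f}")
```

In this toy setting the trial average and the convolution agree up to floating-point error, and the averaged peak is lower than the single-trial peak, which is the smearing effect that latency correction is intended to undo.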