Evaluation of Different Visual Feedback Methods for Brain–Computer Interfaces (BCI) Based on Code-Modulated Visual Evoked Potentials (cVEP)

Brain–computer interfaces (BCIs) enable direct communication between the brain and external devices using electroencephalography (EEG) signals. BCIs based on code-modulated visual evoked potentials (cVEPs) rely on visual stimuli, so appropriate visual feedback on the interface is crucial for an effective BCI system. Many previous studies have demonstrated that implementing visual feedback can improve the information transfer rate (ITR) and reduce fatigue. This research compares a dynamic interface, where target boxes change their sizes based on detection certainty, with a threshold bar interface in a three-step cVEP speller. We found that both interfaces perform well, with slight variations in accuracy, ITR, and output characters per minute (OCM). Notably, some participants showed significant performance improvements with the dynamic interface and found it less distracting than the threshold bars. These results suggest that while average performance metrics are similar, the dynamic interface can provide significant benefits for certain users. This study underscores the potential for personalized interface choices to enhance BCI user experience and performance. By improving user-friendliness and performance and by reducing distraction, dynamic visual feedback could optimize BCI technology for a broader range of users.


Introduction
A brain-computer interface (BCI) is a technology designed for real-time communication and control, creating a direct link between the human brain and external devices [1]. A BCI system translates the brain signals from a user into a desired output, enabling computer-based communication or the control of external devices. One of the most common techniques for evaluating brain waves is the electroencephalogram (EEG) [2]. Advances in EEG technology have enabled higher temporal and spatial precision in recording brain activities. This has provided researchers with detailed insights into the brain's electrical activity and facilitated the development of advanced BCI applications [3].
BCI spellers are designed to provide an interface for typing using brain activity [4]. Spellers can be categorized based on their method for letter selection. Some systems display all letters (targets) on the screen at once, while others allow the user to select letters in multiple steps, with fewer targets on the screen at a time. In this study, a three-step speller is used, which only requires four independent visual stimuli. It has been shown to provide a reliable and easy-to-use experience for the vast majority of users compared to spellers with fewer steps and more targets [5,6].
EEG activities collected from the scalp can be used to derive visual evoked potentials (VEPs) or, more generally, evoked electrophysiological potentials. VEPs provide important diagnostic data on the functional integrity of the visual system. These electrical signals are produced in response to visual stimuli and can be used to interpret the user's intent, converting it into control signals for a computer or other device [7,8].
A variation in the standard VEP, known as code-modulated visual evoked potentials (cVEPs), employs a pseudorandom code to create various visual stimuli and has gained increased popularity in recent years [9–11]. A cVEP is induced when someone responds to such stimuli, and this can be utilized to control a BCI system in a comparatively quick fashion [12,13].
cVEPs can be put to great use in BCI spellers, where the user is provided with a set of flickering targets (letters or collections of letters), each connected to a unique binary code pattern, the so-called m-sequence, that controls whether the stimulus is displayed or not during the actual frame. The user's brain signals are evaluated in real time using prerecorded target-specific EEG templates from a training session for categorization [1,6,14].
When assessing the effectiveness of BCI spellers, the speed and accuracy with which the user types words, as well as subjective fatigue assessments, are usually considered the most important metrics. The information transfer rate (ITR), measured in bits per minute (bits/min), is typically employed in the evaluation of BCI performance. It is the speed at which information is transferred, defining the main property of every information channel. ITR is influenced by several factors such as classification speed, accuracy, and the number of targets.
Performance improvements can be achieved through advancements in algorithms. For instance, our prior research [15] enhanced the performance of a cVEP BCI system using novel target identification methods, such as dynamic sliding windows and stimulus synchronization. Other routes to increasing performance include making the interface and the selection process more user-friendly and less fatiguing. Variations in visual feedback for selection have shown promise in steady-state visual evoked potential (SSVEP)-based solutions [16], motor imagery [17], and other modalities [18]. In this context, visual feedback refers to providing real-time, on-screen feedback about the selection process for the user. This could be any kind of indicator, such as a threshold bar [19], that shows the progress of target selection to the user.
We hypothesize that visual feedback that maintains user focus on the flickering box will result in stronger, more detectable evoked potentials in a cVEP-based speller, as it distracts the user less from the stimulus. For cVEP-based systems, a novel approach to achieve this would be dynamically changing the target and stimulus size on-screen based on the certainty of detecting given cVEPs, providing real-time feedback to the user, while keeping their focus on the stimulus. The potential performance and usability gain is particularly significant because cVEPs can be more challenging to detect and induce more eye fatigue compared to other VEPs, such as the commonly used SSVEPs.
This study aims to evaluate the effectiveness of such visual feedback in cVEP-based systems by implementing a dynamic interface. The target box will increase in size when the user is trying to select it, making it easier to select and providing valuable feedback during the selection progress, as was previously proposed for SSVEP-based BCIs in [16]. Conversely, the other targets that the user is not focusing on will appear smaller. This dynamic interface may be a superior kind of visual feedback solution compared to other methods, which might be more distracting. By allowing the user to select targets more easily, this approach could enhance performance and reduce fatigue as well.
To evaluate the effectiveness of the dynamic interface, we compared it with the more commonly used threshold bar interface element [19–22]. Participants tried both versions of a three-step cVEP-based BCI speller, allowing us to compare their performance. We also gathered feedback on their subjective experiences and preferences, which helped us assess whether and in what sense the dynamic interface could be superior in cVEP-based BCI systems. Furthermore, we collected information regarding participant fatigue before and after the experiment using a standardized questionnaire. The study can also reinforce the reliability of this type of three-step cVEP speller and its claim of achieving a nearly 100% literacy rate [6]. Additionally, by evaluating the visual feedback solutions, this research seeks to enhance these types of BCI spellers in terms of both performance and user-friendliness, with the overarching goal of making BCI technology accessible and effective for everyone.
In this section, we transitioned from a broad description of brain-computer interfaces to a detailed explanation of the specific BCI system used in this study, emphasizing the significance of visual feedback. In the following section, we describe the employed system in technical detail, organized along the user experience timeline.

Participants
A total of 48 participants (30 females, 17 males, and 1 non-binary) took part in this study. The average age of the participants was 23.9 years (SD = 3.62). All participants provided written consent in adherence to the Declaration of Helsinki, and the study received approval from the ethical committee of the medical faculty at the University of Duisburg-Essen. Participants could withdraw from the experiment at any time without providing any reasons. The collected data were stored anonymously for analysis purposes, ensuring the confidentiality of the participants. Each participant received EUR 20 for their participation in the study.

Experimental Protocol
The experiment was conducted in the BCI-Lab of Rhine-Waal University of Applied Sciences (HSRW). First, participants received an information sheet detailing the nature of the experiment. After providing their personal information and written consent, they completed the pre-questionnaire (Table 1), which included questions about their experience with BCI systems, vision prescriptions, and level of tiredness. Subsequently, the electrode cap was applied, and the participants were further briefed on the procedure and operation of the speller.
[Table 1 (excerpt). Questionnaire items include: In your opinion, how long can the system be used without breaks? Do you think the BCI is a reliable control method? (Yes/No/Maybe)]
Following these explanations, participants engaged in a preliminary practice phase to familiarize themselves with the speller. During this phase, they chose a five-letter word (such as their name) and practiced selecting letters with the speller. The threshold, gaze shift, and time window settings were calibrated as necessary during this initial free-spelling practice phase. Following the free-spelling practice phase, the participants were split into two groups based on their subject number, which was assigned in order of attendance, to counter the potential effects of becoming better with experience during the session.
Odd-numbered participants started with the dynamic interface, while even-numbered participants started with the threshold bar interface. During the spelling session, participants were first instructed to spell the word "BCI". After this, subjects were again split based on their subject number. Odd-numbered subjects proceeded with spelling "PROGRAM" and then "HAVE_FUN". Even-numbered subjects did this in the opposite order, spelling "HAVE_FUN" first, then "PROGRAM". After this first round of three words (plus the free-spelling practice), the subjects switched to the other version and spelled the same three words again (in the order described above).
These words for the spelling phase were also selected to maintain a balance in class representation (not including the target with the "UNDO" function). When the spelling phase is completed without mistakes, the first target needs to be selected 20 times, the second one 16 times, and the third one 18 times ("HAVE_FUN" and "PROGRAM" are perfectly balanced).
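The class balance stated above can be checked with a short sketch. It assumes that the 27 characters are laid out alphabetically with the underscore last, so that the first step offers the groups A-I, J-R, and S-Z plus underscore; the constant `ALPHABET` and the helper names `selection_path` and `target_counts` are hypothetical and not taken from the speller software.

```python
# Assumed layout (hypothetical): 27 characters in alphabetical order, underscore last.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ_"


def selection_path(char):
    """Return the three box indices (0-2) selected to spell one character."""
    i = ALPHABET.index(char)
    # Step 1: group of nine, step 2: group of three, step 3: single character.
    return (i // 9, (i % 9) // 3, i % 3)


def target_counts(words):
    """Count how often each of the three letter boxes is selected (no errors)."""
    counts = [0, 0, 0]
    for word in words:
        for char in word:
            for box in selection_path(char):
                counts[box] += 1
    return counts


print(selection_path("A"))                            # (0, 0, 0)
print(target_counts(["PROGRAM"]))                     # [7, 7, 7]
print(target_counts(["HAVE_FUN"]))                    # [8, 8, 8]
print(target_counts(["BCI", "PROGRAM", "HAVE_FUN"]))  # [20, 16, 18]
```

Under this assumed layout, the counts reproduce the 20, 16, and 18 selections given in the text, and both longer words are balanced on their own.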
Spelling phases concluded automatically upon correct word spelling. On average, each subject's spelling session (spelling time only) lasted 12 to 15 minutes. The resulting accuracy, ITR, and OCM values were recorded for all completed tasks. After successfully completing the spelling session, participants filled out the post-questionnaire (Table 1), which included questions about their impressions, opinions, and experiences with the BCI system and the two interfaces.
Finally, participants had the opportunity to clean their hair from the conductive gel and received documentation confirming their eligibility for compensation.

Hardware
Due to the high number of participants, the study utilized three Dell Precision Desktops equipped with NVIDIA RTX 3070 graphics cards, running Microsoft Windows 10 (21H2) Education on Intel i9-10900K processors (3.70 GHz). For presenting the stimuli, modern Asus ROG Swift PG258Q displays (Full-HD, 240 Hz maximal vertical refresh rate) were used.
EEG data were collected using g.USBamp amplifiers (g.tec medical engineering GmbH, Schiedlberg, Austria), employing all 16 signal channels. Electrodes were placed according to the international 10-20 system at the following positions: P7, P3, Pz, P4, P8, PO7, PO3, POz, PO4, PO8, O1, Oz, O2, O9, Iz, and O10. The reference electrode was positioned at Cz, and the ground electrode was placed at AFz. During the preparation stage, regular abrasive electrolytic electrode gel was applied between the electrodes and the scalp to reduce impedances to less than 5 kΩ.

GUI
The graphical user interface (GUI) presents four selection options, as illustrated in Figure 1.
The GUI spelling software was organized as a three-step speller, following its successful utilization in previous research [6,23]. It includes 26 letters and one underscore character (used instead of a space), divided into three boxes. For example, to spell the letter "A," the user first selects the group "A-I" in the first step. Then, they select the group "A-C" in the second step, and finally, the individual letters "A," "B," and "C" are presented, allowing the user to select the desired letter "A" in the third and final step.

Training
During the recording phase, four stimuli were observed sequentially from 1 to 4 by the participants, as illustrated in Figure 1 (left). The recording was grouped into six blocks of training, i.e., n_b = 6. Within each block, every stimulus was focused on once, resulting in a total of 6 × 4 = 24 trials. Each trial lasted for 2.1 seconds, during which the code pattern was displayed for two cycles.
A visual cue, represented by a green frame, indicated the specific box towards which participants were required to direct their gaze. Following each trial (gazing on a target), the subsequent target the user needed to focus on was highlighted, and the flickering paused for one second. After completing a block (all four targets), the software transitioned to the next block of training, with a one-second pause, until a total of 6 × 4 = 24 trials were accomplished. Once the training phase was completed, the spelling tasks began.

Spelling Phase
In the spelling exercise, four boxes were displayed on the screen. As indicated in Figure 1 (right), the first three boxes from left to right contain letters, and the fourth is the "UNDO" function, which is used as either a backspace or delete button (when going back is not an option, the last spelled letter is deleted). In copy-spelling mode, the box turns green when the correct box is selected, and red when the wrong one is selected. The participants spelled the words as outlined in the experiment protocol (see Section 2.2). The underscore stands in for the space character. Errors are corrected using the UNDO feature of the spelling interface.

Visual Feedback
The visual feedback element in this study was either the threshold bar or the dynamic interface. These elements are shown at the top of Figure 2. At the bottom of the figure, two half GUI screenshots are presented side by side, taken at the start of the spelling phase. These screenshots showcase the initial state of the visual feedback elements.
At the top of Figure 2, the dynamic size change in the target box is illustrated. The size depends on how close the certainty value (∆C; its calculation is explained in Section 2.9) is to the threshold (β). When the certainty reaches the chosen threshold value (e.g., 0.15), the classification is complete, and the respective box is selected. The size is kept at its minimum (75% of the original size) when the certainty is below 10% of the set threshold value and at its maximum (125%) when it is above 75% of the set threshold value. In other words, the target box changes its size dynamically between 75% and 125% of the original size while the certainty is between 10% and 75% of the set threshold value.
Figure 2 also showcases the threshold bar visual feedback element, which we used as a reference to evaluate the performance of the dynamic version. This visual feedback was developed and utilized in previous cVEP systems by Volosyak et al. (e.g., [6]). A bluish bar grows in size from 0% (left) to 100% (right) as the certainty (∆C) approaches the set threshold value.
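The size rule described above can be illustrated with a minimal sketch. The text fixes only the two plateaus (75% and 125%) and the certainty range between them; the linear interpolation in between, the function name `box_scale`, and its parameter names are assumptions made for illustration.

```python
def box_scale(certainty, threshold, lower=0.10, upper=0.75,
              min_scale=0.75, max_scale=1.25):
    """Map the certainty (delta C) to a relative box size.

    Below `lower` * threshold the box stays at `min_scale`, above
    `upper` * threshold it stays at `max_scale`. The linear ramp in
    between is an assumption; the source only states the two limits.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    ratio = certainty / threshold
    if ratio <= lower:
        return min_scale
    if ratio >= upper:
        return max_scale
    fraction = (ratio - lower) / (upper - lower)
    return min_scale + fraction * (max_scale - min_scale)


beta = 0.15  # threshold value mentioned in the text
print(box_scale(0.0, beta))   # 0.75
print(box_scale(0.15, beta))  # 1.25
```

With this mapping, a box returns to its original size (scale 1.0) when the certainty is at 42.5% of the threshold, halfway between the two limits.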

Stimulus Presentation
The spelling interface (utilizing the system most recently used in [20]) incorporates four distinct stimulus options, arranged as a 1 × 4 matrix of boxes, meaning that the number of targets (K) was four. The boxes measure 282 × 282 pixels by default in this experimental setup (see Figure 1 (left)). The cVEP stimulus code, c_1, was a 63-bit m-sequence [24], where "0" represents "black" and "1" represents "white," i.e., full contrast. The remaining stimuli, c_k (for k = 2, ..., K, where K = 4), were generated by circularly left (or right) shifting c_1 by 4, 8, or 16 bits.
The duration of a stimulus cycle in seconds is obtained by dividing the code length by the rate at which the code bits are presented. The monitor refresh rate r was 240 Hz, and the stimulus changed in accordance with the bit sequence only on every fourth frame, which corresponds to an effective presentation rate of 60 bits per second and a cycle duration of 63/60 = 1.05 s.
Spatial filters were developed for classification using the information gathered during the recording phase. Canonical correlation analysis (CCA) [25] was used on the training trials for this purpose. CCA is a statistical method used to analyze the relationship between two multi-dimensional variables by finding linear combinations that maximize their correlation.
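The stimulus construction and timing can be sketched as follows. The text does not state which generator polynomial produced the 63-bit m-sequence; the sketch assumes the primitive polynomial x^6 + x + 1 and an arbitrary non-zero start state, so the resulting bit pattern is illustrative only. The function names are hypothetical.

```python
def m_sequence(length=63):
    """Generate a 63-bit m-sequence with a 6-bit linear feedback shift register.

    Assumption: recurrence a[n+6] = a[n+1] XOR a[n], i.e. the primitive
    polynomial x^6 + x + 1. The source does not name the polynomial.
    """
    state = [1, 0, 0, 0, 0, 0]  # any non-zero start state works
    bits = []
    for _ in range(length):
        bits.append(state[0])
        feedback = state[0] ^ state[1]
        state = state[1:] + [feedback]
    return bits


def shift_left(code, k):
    """Circularly shift a code to the left by k bits."""
    k %= len(code)
    return code[k:] + code[:k]


code = m_sequence()
# One code per target: the base code plus shifts of 4, 8, and 16 bits.
codes = [shift_left(code, k) for k in (0, 4, 8, 16)]

refresh_rate_hz = 240
frames_per_bit = 4  # the stimulus changes only on every fourth frame
frames = [bit for bit in code for _ in range(frames_per_bit)]
cycle_seconds = len(frames) / refresh_rate_hz

print(len(code), sum(code))  # 63 32
print(cycle_seconds)         # 1.05
```

Holding each bit for four frames at 240 Hz is equivalent to presenting the code at 60 bits per second, which gives the 1.05 s cycle stated in the text.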

Classification
Following the methodology outlined in [6], CCA can be applied to two multi-dimensional variables $X \in \mathbb{R}^{p \times s}$ and $Y \in \mathbb{R}^{q \times s}$ to analyze their relationship. CCA searches for the weights $w_X \in \mathbb{R}^{p}$ and $w_Y \in \mathbb{R}^{q}$ that maximize the correlation, $\rho$, between $x = X^{T} w_X$ and $y = Y^{T} w_Y$ by solving:

$$\max_{w_X, w_Y} \rho(x, y) = \frac{w_X^{T} X Y^{T} w_Y}{\sqrt{w_X^{T} X X^{T} w_X \; w_Y^{T} Y Y^{T} w_Y}}$$

Here, the CCA weights, used as a spatial filter in the online spelling, were constructed as follows. Each training trial was stored as an $m \times n$ matrix, where $m$ denotes the number of signal channels (here $m = 16$) and $n$ denotes the number of samples (here, three 1.05 s stimulus cycles with a 600 Hz sampling frequency, i.e., $n = 1.05 \cdot 600 \cdot 3 = 1890$). In total, given $n_b = 6$ training blocks, 24 such trials $T_i^j \in \mathbb{R}^{m \times n}$, $i = 1, \ldots, K$, $j = 1, \ldots, n_b$, were recorded.
For each target, individual templates $X_i \in \mathbb{R}^{m \times n}$ and filters $w_i$ were determined ($i = 1, \ldots, K$). For the generation of the spatial filters, two matrices were constructed. The online classification was performed whenever a new data block was added to a data buffer $Y \in \mathbb{R}^{m \times n_y}$, whose number of samples $n_y$ increased dynamically.
For target identification, the data buffer $Y$ was compared to the reference signals $R_i \in \mathbb{R}^{m \times n_y}$, $i = 1, \ldots, K$, which were constructed as sub-matrices of the corresponding templates $X_i$.
Correlations $\lambda_k$ between the reference signals and the data buffer were calculated as follows:

$$\lambda_k = \rho\left(Y^{T} w_k, \; R_k^{T} w_k\right), \quad k = 1, \ldots, K$$

The classifier output $C$ was then determined as follows:

$$C = \arg\max_{k = 1, \ldots, K} \lambda_k$$

A sliding window mechanism was implemented for online spelling. BCI outputs were only performed if a threshold criterion was met. The EEG amplifier transferred data in blocks of 30 samples (every 0.05 s, as the sampling rate was set to 600 Hz). For the sliding window mechanism, it was required that the number of samples per block be a divisor of the number of samples per stimulus cycle.
The data buffer Y was updated dynamically with each new data block, incrementing n_y by 30 samples as long as n_y < n. The certainty, ∆C, was defined as the distance between the highest and the second-highest correlation. It needed to surpass a threshold value, β, which was set to 0.15. For some participants, β was adjusted during the familiarization run to avoid misclassification.
BCI outputs were generated only if ∆C > β. When this condition was met, the data buffer Y was cleared, followed by a gaze-shifting phase (usually two seconds), during which data collection and visual stimulation were paused, allowing the users to shift their gaze to another target.
The minimum window length and the gaze-shifting period limited the highest achievable information transfer rate of the BCI speller. Individual adjustments to the time window length were made to optimize performance, resulting in varying maximum possible ITRs among participants.
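The threshold criterion and the growing data buffer can be sketched as below. The sketch starts from already computed per-target correlations and does not implement the CCA filtering itself; the function names `classify` and `run_sliding_window`, the zero-based target indices, and the example correlation values are hypothetical.

```python
def classify(correlations, beta=0.15):
    """Return (best target index, certainty, selected?) for one buffer state.

    The certainty is the distance between the highest and the
    second-highest correlation; a selection requires certainty > beta.
    """
    ranked = sorted(range(len(correlations)),
                    key=lambda k: correlations[k], reverse=True)
    best, second = ranked[0], ranked[1]
    certainty = correlations[best] - correlations[second]
    return best, certainty, certainty > beta


def run_sliding_window(correlation_stream, beta=0.15, block=30, n_max=1890):
    """Grow the buffer by one block per update until the criterion is met.

    `correlation_stream` yields one list of correlations per new data block
    (illustrative input; in the real system these come from the EEG buffer).
    Returns (selected target or None, buffer length in samples).
    """
    n_y = 0
    for correlations in correlation_stream:
        n_y = min(n_y + block, n_max)  # buffer grows only while n_y < n
        target, certainty, selected = classify(correlations, beta)
        if selected:
            return target, n_y
    return None, n_y


# Hypothetical correlations for three consecutive 30-sample blocks.
stream = [[0.20, 0.15, 0.10, 0.10],
          [0.30, 0.20, 0.10, 0.10],
          [0.45, 0.20, 0.10, 0.10]]
print(run_sliding_window(stream))  # (0, 90)
```

In this example the certainty exceeds 0.15 only after the third block, so the first target is selected once the buffer holds 90 samples (0.15 s at 600 Hz).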

Measures of the BCI Performance
The BCI system's performance was evaluated using three commonly used measures: accuracy (Acc.), the information transfer rate (ITR), and the output characters per minute (OCM).
Accuracy: The accuracy was calculated by dividing the number of correct selections, including the corrections the user had to make during speller execution, by the total number of classified commands (each classification step necessary for word completion was counted as a single command). The resulting accuracy value was displayed as a percentage on the speller interface.
OCM: The output characters per minute (OCM) measures typing speed by dividing the total number of output characters by the time taken to type them. OCM accounts for the error correction time, as the participants require additional time for corrections if mistakes are made.
ITR: The information transfer rate (ITR) was calculated in bits per minute (bits/min) using the following formula:

$$B = \log_2 N + P \log_2 P + (1 - P) \log_2\left(\frac{1 - P}{N - 1}\right)$$

where B = information transferred in bits; N = number of targets (for this study it is equal to 4); P = classification accuracy.
To obtain the ITR in bits/min, B is divided by the average classification time in minutes. For more information and tools to calculate ITR, visit our webpage: https://bci-lab.hochschule-rhein-waal.de/en/itr.html (accessed on 23 June 2024).
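As a worked illustration of these measures, the following sketch assumes the widely used formula B = log2 N + P log2 P + (1 − P) log2((1 − P)/(N − 1)) for the bits per selection and converts it to bits/min by dividing by the average time per selection in minutes. The function names and the example selection time are hypothetical.

```python
import math


def bits_per_selection(n_targets, accuracy):
    """Information per selection, B, for N targets and accuracy P (0..1)."""
    bits = math.log2(n_targets)
    if accuracy > 0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1:
        bits += (1 - accuracy) * math.log2((1 - accuracy) / (n_targets - 1))
    return bits


def itr_bits_per_min(n_targets, accuracy, seconds_per_selection):
    """ITR in bits/min: B divided by the selection time in minutes."""
    return bits_per_selection(n_targets, accuracy) * 60.0 / seconds_per_selection


def ocm(n_characters, total_seconds):
    """Output characters per minute, including time spent on corrections."""
    return n_characters * 60.0 / total_seconds


print(bits_per_selection(4, 1.0))     # 2.0
print(itr_bits_per_min(4, 1.0, 2.0))  # 60.0
print(round(bits_per_selection(4, 0.9557), 2))
```

With four targets, a perfect selection carries 2 bits, so an error-free user needing a hypothetical 2 s per selection would reach 60 bits/min. At chance level (P = 0.25) the formula yields 0 bits.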

Questionnaire
A questionnaire was designed to collect participant feedback, with sections dedicated to both pre-experiment and post-experiment questions. These sections were intended to be completed before and after the experiment, respectively, focusing on assessing general user experience as well as feedback regarding the two interfaces used. For further information, refer to Table 1, which outlines these pre- and post-experiment questions.

Results
All statistical analyses were performed using JAMOVI software (The jamovi project, version 2.3, 2022) and Microsoft Excel (version 2021, build 2406). The spelling exercise was successfully completed by every subject except Subject 27, who reported "too much dizziness" and thus did not finish the last spelling task. On average, participants achieved an accuracy of 95.57% with a standard deviation (SD) of 6.12, and an ITR of 53.55 bits/min (SD: 16.25), resulting in an OCM of 9.24 (SD: 2.56).
The results corresponding to each subject are presented in Table 2. The table shows the average accuracy, ITR, and OCM values for both interfaces combined and for each interface separately.
Results per task are presented in Table 3, where the performance averages (time, accuracy, ITR, OCM) are given study-wide and per interface for each spelling task.
The average and per-task performance was very similar between the interfaces. With the threshold bar interface, participants reached an average of 95.71% accuracy with an SD of 5.91, an average ITR of 54.58 with an SD of 15.49, and an average OCM of 9.41 with an SD of 2.47.
With the dynamic interface, participants reached an average of 95.81% accuracy with an SD of 6.32, an average ITR of 52.34 with an SD of 16.91, and an average OCM of 9.04 with an SD of 2.64.

Evaluation of the Questionnaires
Among the participants, 32 had never used a BCI system before, while 13 had prior experience (with three missing values). A total of 26 participants did not require vision correction, while 18 wore their vision correction, and 4 did not use their vision correction during the experiment. On average, participants slept 7.14 hours (SD = 1.08) the night before the experiment.
To assess changes in tiredness levels due to the BCI speller usage, the participants rated their tiredness before and after the experiment on a scale from 1 (not at all) to 6 (very much). The participants reported similar levels of tiredness before the experiment (M = 2.44, SD = 1.07) and after the experiment (M = 2.50, SD = 1.13). A Wilcoxon signed-rank test revealed that there was no statistically significant difference in reported levels of tiredness (W = 117, p = 0.249). Following the experiment, participants rated how disturbed they were by the flickering and how easy it was to concentrate on the boxes, again on a scale from 1 (not at all) to 6 (very much). The average ratings were M = 2.60 (SD = 1.27) for disturbance and M = 4.13 (SD = 1.75) for concentration ease. Regarding the reliability of the BCI as a control method, 32 participants answered "Yes", indicating that they consider the BCI to be a reliable control method, 15 answered "Maybe", and only 1 participant selected "No". When asked if they could use the system daily, 21 participants chose "Maybe", 19 selected "Yes", and 5 chose "No" (with three missing values). Furthermore, on average, the participants indicated that the system can be used for 1.46 h without any breaks. Most participants (44) indicated they would repeat the experiment, while only two answered "Maybe" (one missing value).
To evaluate preference differences between the speller interfaces, the participants were asked to what extent they perceived either version as distracting, once more on a scale from 1 (not at all) to 6 (very much). The average rating for the threshold bar interface was M = 2.29 (SD = 1.50), while for the dynamic interface it was M = 2.33 (SD = 1.19). A Wilcoxon signed-rank test indicated no statistically significant difference between the two interfaces in terms of perceived distraction (W = 317, p = 0.798). Finally, the participants indicated their preferred speller interface, with 28 preferring the threshold bar interface and 20 preferring the dynamic interface. A binomial test showed that these proportions did not significantly differ from an equal preference assumption. A visual representation of the results concerning the speller interfaces is provided in Figure 3.
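The binomial test on the interface preference can be reproduced in outline with an exact two-sided test against an equal-preference probability of 0.5. The analysis in the study was run in jamovi; the function below is a hypothetical stand-alone sketch, not the software's implementation.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k successes in n trials under p = 0.5.

    For p = 0.5 the distribution is symmetric, so the two-sided p-value
    is twice the probability of the more extreme tail (capped at 1).
    """
    tail_start = max(k, n - k)
    upper_tail = sum(comb(n, i) for i in range(tail_start, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# 28 of 48 participants preferred the threshold bar interface.
p_value = binomial_two_sided_p(28, 48)
print(p_value > 0.05)  # True
```

The p-value is well above 0.05, in line with the reported absence of a significant deviation from equal preference.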

Discussion
An average accuracy of 95.57% and an ITR of 53.55 bits/min place this three-step speller among the best-performing systems within this category of spellers [20,26]. Additionally, the lowest average accuracy recorded was 87.17%, which is well above the commonly established 70% accuracy mark for BCI literacy [6]. These findings emphasize that modern cVEP-based BCIs are capable of successful operation across a diverse population.
The primary goal of this study was to compare the performance of cVEP-based BCIs using two different interface types. On average, both interfaces performed well, with subjects using the threshold bar interface achieving slightly better ITR and OCM, while the dynamic interface yielded slightly higher accuracy. However, these differences were not statistically significant.

Statistical Analysis and Validation
Prior to conducting further analysis, the normality of the data was assessed using the Shapiro-Wilk test, confirming that the average ITR for the threshold bar interface (p = 0.6373), average OCM for the threshold bar interface (p = 0.4250), average ITR for the dynamic interface (p = 0.2567), and average OCM for the dynamic interface (p = 0.1728) were normally distributed.
Subsequently, a paired t-test was conducted to compare the average ITR and OCM values between the two interfaces. The results indicated that there was no significant difference in average ITR (t(47) = 0.188, p = 0.188) and average OCM (t(47) = 0.173, p = 0.173) between the threshold bar and dynamic interfaces.
The following factors (for the questions see Table 1) were also tested and did not have a significant influence on the comparative results of the interfaces: participants' age, whether the interface was used in the first or second round, and past BCI experience. The performance metric averages stayed very similar, no matter the grouping.

Top Quartile Performance Analysis
Even the smallest difference disappears when we examine the top quartile of performances (top 25%, top 36 spelling tasks). Table 4 shows that the average performances of the top quartile are nearly identical, regardless of the interface used. This suggests that neither interface significantly contributed to achieving outstanding performance overall. Consequently, we need to explore potential subject-wise differences.
To further examine the potential differences between the two interfaces, we compared the average values and calculated the performance difference between the interfaces for each subject by subtracting the threshold bar's averages from the dynamic interface averages. The average ITR difference was −2.16 with an SD of 11.18, favoring the threshold bar interface, and the average OCM difference was −0.35 with an SD of 1.77, also slightly in favor of the threshold bar interface. This was expected based on the average performance scores.
However, to identify potential outliers who had drastically better performance with one interface than the other, we calculated the z-scores of these differences. This helped identify subjects who had a significantly (p = 0.05) better performance with one of the interfaces. Beforehand, we used the Shapiro-Wilk test to check the normality of the differences: the p-values for the ITR (p = 0.2441) and OCM (p = 0.2162) differences were greater than 0.05, so we fail to reject the null hypothesis that the data come from a normal distribution. Similarly, a Kolmogorov-Smirnov test gave no indication that the ITR (p = 0.6664) and OCM (p = 0.5083) differences deviate from a normal distribution.
Figure 4 presents the z-scores for every subject, illustrating both the average ITR and average OCM differences. As shown in the figure, there are three subjects for whom the 1.95 (p = 0.05) mark is reached. Subject 2 had an average OCM difference z-score of 1.99, Subject 18 had an ITR difference z-score of 2.36 and an OCM difference z-score of 2.11, and Subject 43 had an ITR difference z-score of 2.07. Subject 2 experienced a 3.17 increase in OCM. For Subject 18, there was a 24.33 increase in ITR and a 3.39 increase in OCM with the dynamic interface. Subject 43 saw a 21.07 increase in ITR. In all of these instances, the subject performed significantly better with the dynamic interface. This leads to the important conclusion that while on average there is no significant difference between the interfaces, 3 out of 48 participants (6.25%) performed significantly better with the dynamic one. This is a crucial finding, especially considering the overarching goal of making BCI available and usable for everyone.
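The reported z-scores can be approximately reproduced from the summary statistics given above, assuming each z-score is the subject's difference (dynamic minus threshold bar) standardized by the mean and SD of all per-subject differences. The helper `z_score` is a hypothetical name.

```python
def z_score(value, mean, sd):
    """Standardize a per-subject difference against the group mean and SD."""
    return (value - mean) / sd


# Reported mean and SD of the differences (dynamic minus threshold bar).
ITR_MEAN, ITR_SD = -2.16, 11.18
OCM_MEAN, OCM_SD = -0.35, 1.77

print(round(z_score(3.17, OCM_MEAN, OCM_SD), 2))   # Subject 2, OCM: 1.99
print(round(z_score(3.39, OCM_MEAN, OCM_SD), 2))   # Subject 18, OCM: 2.11
print(round(z_score(24.33, ITR_MEAN, ITR_SD), 2))  # Subject 18, ITR: 2.37
print(round(z_score(21.07, ITR_MEAN, ITR_SD), 2))  # Subject 43, ITR: 2.08
```

The OCM values match the reported z-scores exactly, and the ITR values agree with the reported 2.36 and 2.07 to within 0.01, which is consistent with rounding of the published mean and SD.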

Subject-Wise Perceived Level of Distraction Comparison
On average, the perceived level of distraction did not differ significantly between the two interfaces (Figure 3). However, fewer participants (3 compared to 7) rated the dynamic interface as very/highly distracting (5 and 6 on the Likert scale).
Examining the differences reveals that there were cases where subjects found one interface significantly more distracting. Figure 5 shows the differences between the two interfaces for each subject. Calculated based on the 6-point Likert scale responses, the difference is obtained by subtracting the threshold bar interface distraction score from the dynamic interface distraction score. The largest difference visible in the figure, 4 points, coincides exactly with the upper and lower bounds obtained when the standard interquartile range (IQR) method is applied with a multiplier of 1.5 to detect outliers. Subjects 12, 40, 41, and 43 reached this 4-point difference. In three cases, the dynamic interface was heavily preferred in terms of perceived distraction, while in one case (Subject 40), the threshold bar interface was deemed a lot less distracting. Importantly, the authors want to highlight a potential error on the part of Subject 40 when answering the related questions. Subject 40 may have mixed up the two interfaces, since, contrary to their answer, the subject performed better (+15.27 ITR and +2 OCM) and achieved the highest accuracy difference (+14%) in favor of the dynamic interface. Either way, these findings further strengthen our conclusion that having a different interface was significantly beneficial for some participants. In this instance, three participants (6.25%) strongly preferred the dynamic interface, while one participant (2.08%) strongly preferred the threshold bar interface.

Conclusion on the Dynamic Interface
We can conclude that while the two interfaces generally achieve a similar performance, there are notable exceptions.
Importantly, there were strong outlier cases where a subject either preferred one interface significantly more in terms of its distraction level or performed significantly better with one interface. In line with our hypothesis, these cases almost exclusively favored the dynamic interface, as evidenced by Figures 4 and 5.
Therefore, the conclusion of the study is that the employed dynamic interface solution can serve as a significant improvement by ensuring that there are fewer extreme outliers in terms of performance and by reducing user distraction, without affecting overall average performance.
If implemented as an optional choice, where users are provided with a phase to select the optimal interface for themselves, it has the potential to significantly enhance user experience and performance. This is evidenced by Table 5, which compares the average performance of the current mixed interface setup with the envisioned best-suiting interface setup. These metrics project the expected performance during further use if the optimal interface is chosen following an objective performance assessment. This also calls for a shorter, more automated interface selection period.
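The kind of projection summarized in Table 5 can be illustrated with a short sketch. It rests on assumptions that are not confirmed by the text: the "mixed" baseline is taken as the mean over both interfaces for each subject, the "best-suiting" value is the higher of a subject's two per-interface averages, and the ITR values below are invented for illustration.

```python
from statistics import mean


def projected_gain(dynamic, threshold):
    """Compare the mixed-setup average with the best-interface average.

    Returns (mixed_average, best_average, percentage_gain).
    Assumption: each list holds one per-subject average for a metric
    such as ITR, in the same subject order.
    """
    pairs = list(zip(dynamic, threshold))
    mixed = mean((d + t) / 2 for d, t in pairs)
    best = mean(max(d, t) for d, t in pairs)
    gain_percent = 100 * (best - mixed) / mixed
    return mixed, best, gain_percent


# Hypothetical per-subject ITR averages for two subjects.
mixed_avg, best_avg, gain = projected_gain([60, 40], [40, 40])
```

Because the per-subject maximum can never fall below the per-subject mean, the projected gain in such a comparison is always non-negative; its size depends on how much individual subjects differ between the two interfaces.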

Future Research Directions
Some participants explicitly mentioned during the experiment that they found the threshold bars more "rewarding" or "fun." This could be a potential area for improvement, possibly through incorporating colors or smoother transitions between sizes, allowing for continuous transitions between different threshold levels.
Future research may focus on experienced users, as even the smallest performance difference disappeared when we looked at the best performers (see Section 4.2). Additionally, users reported the dynamic interface to be less distracting (see Figure 5). These observations support conducting longer sessions to find out whether extended use favors the dynamic interface more. Collecting user feedback on perceived levels of eye fatigue after each session, possibly on a larger scale, could provide more detailed insights into differences in eye strain.
Another improvement could be to implement the dynamic interface during training to accommodate potential differences in EEG patterns caused by constant size changes.
Implementing the dynamic interface as an optional feature could be a straightforward improvement for cVEP BCIs. However, this increases the time needed for the system to be ready for use. A potential future direction is to find the quickest way to introduce the subjects to both interfaces and to let them reliably choose their preferred one.

Figure 1. (Left) The GUI screen representing the start of the training phase, where the 1st target is marked with a green frame. This mark changes to the consecutive targets during the training phase. (Right) The GUI speller screen during the BCI spelling phase with the threshold bar interface. Letter selection required three steps: to type, e.g., the letter B, the participant first selects the 'A-I' box (1st target), followed by the 'A-C' box (2nd target), and finally the B box.

Figure 2. Screenshots of both types of visual feedback in the starting state and during spelling: the dynamic interface on the left and the threshold bar interface on the right. (Top left, going from the outside to the inside) All limits of the box: the maximum size (125%), which appears when the certainty is over 75% of the threshold, the normal (100%) outline of the box, and the minimum size, which is shown when the certainty is below 10% of the threshold. (Top right) The threshold bar interface, with bars changing in size depending on the certainty and reaching 100% when the threshold is reached. (Bottom left) A cut-out of the GUI screenshot with the dynamic interface, showing the starting state before the spelling task begins. (Bottom right) A cut-out of the GUI screenshot of the threshold bar interface, showing the starting state before the spelling task begins.

Figure 3. Participants' questionnaire responses regarding the two speller interfaces. (a) Percentage distribution of the selected answers to the question "To what extent did you find the interface distracting?". Ratings ranged from 1 (not at all) to 6 (very much). (b) Pie chart of the participants' selected interface preference.

Figure 4. Illustration of z-scores calculated for the performance difference averages. Positive values mean that the dynamic interface performed better, while negative values indicate that the threshold bar interface performed better.

Figure 5. Illustration of the perceived level of distraction. The difference is calculated based on the 6-point Likert scale responses, as the dynamic interface distraction score minus the threshold bar interface distraction score. Orange values refer to the ratings of the dynamic interface, while blue ones refer to the threshold bar interface.

Table 1. Used questionnaires with all the questions and answer options.

Table 2. The average accuracy, ITR, and OCM values, both overall and separately for each of the interfaces. Additionally, the table shows the interface with which each user achieved a higher average ITR, as well as their subjective preference.

Table 3. The time, accuracy, ITR, and OCM averages are displayed both study-wide and per interface for each spelling task.

Table 4. Comparison of top quartile (top 25%) spelling task performances for dynamic and threshold bar interfaces in terms of accuracy, information transfer rate, and output characters per minute.

Table 5. Comparison of performance metrics between the study-wide average and the best-performing interface. The table shows potential gains in performance if the best interface had been chosen, highlighting the differences and percentage gains.