Exploring the intra-individual reliability of tDCS: A registered report

Transcranial direct current stimulation (tDCS), a form of non-invasive brain stimulation, has become an important tool for the study of in-vivo brain function due to its modulatory effects. Over the past two decades, interest in the influence of tDCS on behaviour has increased markedly, resulting in a large body of literature spanning multiple domains. However, the effect of tDCS on human performance often varies, bringing into question the reliability of this approach. While reviews and meta-analyses highlight the contributions of methodological inconsistencies and individual differences, no published studies have directly tested the intra-individual reliability of tDCS effects on behaviour. Here, we conducted a large-scale, double-blinded, sham-controlled registered report to assess the reliability of two single-session low-dose tDCS montages, previously found to impact response selection and motor learning operations, across two separate time periods. Our planned analysis found no evidence that either protocol was effective or reliable. Post-hoc exploratory analyses found evidence that tDCS influenced motor learning, but not response selection learning. In addition, the reliability of motor learning performance across trials was shown to be disrupted by tDCS. These findings are amongst the first to shed light specifically on the intra-individual reliability of tDCS effects on behaviour and provide valuable information to the field.


Introduction
Non-invasive brain stimulation is a powerful neuromodulatory technique that has been extensively used in research and clinical settings to assess and improve performance on various perceptual, cognitive, and motor tasks (Agarwal et al., 2013; Martin et al., 2013; Benninger et al., 2010; Filmer, Varghese, Hawkins, Mattingley, & Dux, 2017b; Loo et al., 2012; Nitsche et al., 2003; Reis et al., 2009). In addition, it can be employed to investigate brain function and behaviour (Marshall & Binder, 2013; Nitsche et al., 2006). In recent years, the use of a particular form of non-invasive brain stimulation, transcranial direct current stimulation (tDCS), has surged (Filmer, Mattingley, & Dux, 2020b). This is likely due to several factors, including its ease of use and low cost, which make it an attractive method in both pure and applied settings. tDCS involves a weak electrical current (typically .5–3 mA) being applied to the cortex via electrodes placed on the scalp. While still a matter of debate, tDCS is generally thought to induce changes in neural excitability, an effect hypothesised to occur via the modulation of neuronal membrane potentials (Bindman, Lippold & Redfearn, 1964; Purpura & McMurtry, 1965; Reinhart, Cosman, Fukuda, & Woodman, 2017). This modulatory effect was traditionally considered to be polarity-dependent, with anodal stimulation increasing cortical excitability and cathodal stimulation decreasing cortical excitability of targeted areas (Nitsche et al., 2007; Nitsche & Paulus, 2000; Pellicciari, Brignani, & Miniussi, 2013). However, it is now clear that tDCS effects depend on complex interactions between multiple factors, such as stimulation intensity, duration, polarity, and task-induced neural state (Stagg et al., 2011; Batsikadze, Moliadze, Paulus, Kuo, & Nitsche, 2013; Krause & Cohen Kadosh, 2014; Mosayebi Samani, Agboada, Jamil, Kuo, & Nitsche, 2019). For example, cathodal tDCS can increase cortical excitability at certain stimulation intensities (Batsikadze et al., 2013), and its effect on measures of long-term depression and long-term potentiation can vary as a direct function of stimulation current and duration (Mosayebi Samani et al., 2019). Further, increasing evidence shows that the effects of anodal and cathodal tDCS vary substantially across task modalities that involve learning (Filmer et al., 2020b).
Such complexities in the variables that interact to influence tDCS outcomes, together with the use of questionable research practices in some studies (Héroux, Loo, Taylor, & Gandevia, 2017), have led to inconsistent findings in the field. This has led some to question the efficacy of the approach (Filmer, Dux, & Mattingley, 2014; Filmer et al., 2020b). In particular, a key issue concerns the intra-subject reliability of tDCS effects on behaviour, i.e., the extent to which tDCS elicits similar effects on behaviour within individuals across different time points. Understanding the reliability of tDCS-related effects on task performance is crucial to validate the large body of experimental tDCS work that has already been undertaken in the area of cognitive neuroscience. Although many meta-analyses have examined the reliability of tDCS effects across different studies (Hashemirad, Zoghi, Fitzgerald, & Jaberzadeh, 2016; Hill, Fitzgerald, & Hoy, 2016; Hoy et al., 2013; Jamil et al., 2017), to the best of our knowledge, no study has explicitly tested intra-subject reliability of tDCS effects on behaviour. Here, we assess the intra-subject reliability of tDCS by focussing on two (previously replicated) learning paradigms that occur at different levels of information processing (Filmer, Mattingley, & Dux, 2013a; Filmer, Ehrhardt, Shaw, Mattingley, & Dux, 2019b; Kantak, Mummidisetty, & Stinear, 2012; Nitsche et al., 2003).

tDCS in motor learning and cognitive performance
Over the past two decades, many studies have explored the effect of both anodal and cathodal tDCS on learning and training outcomes (for recent reviews, see Dedoncker et al., 2016a, 2016b; Buch et al., 2017; Reinhart et al., 2017; Filmer et al., 2020b). In the first demonstration of tDCS influencing learning, anodal tDCS applied to the primary motor cortex (M1) was shown to improve the learning of movement sequences, as measured on the Serial Reaction Time Task (SRTT; Nitsche et al., 2003). The SRTT is a measure of incidental motor learning and typically features four stimulus-response pairings presented serially across a large number of trials, with reduced response times to repeated (target) sequences compared to random sequences (Nissen & Bullemer, 1987). In their study, Nitsche et al. (2003) found that 1 mA of anodal tDCS applied to the M1 for 15 min during the SRTT resulted in faster reaction times during repeated sequences compared to controls. This finding has since been replicated (e.g., Cuypers et al., 2013; Kantak et al., 2012; Karok & Witney, 2013; Vines, Cerruti, & Schlaug, 2008) and has inspired related work investigating effects of M1 tDCS on other forms of motor learning (for a review, see Buch et al., 2017).

The present study
Establishing the reliability of key findings is crucial to the field of cognitive neuroscience. Indeed, at a time when many fields of investigation grapple with a 'replication crisis', and multiple large-scale, multi-lab replication efforts are underway (Klein et al., 2014; Klein et al., 2018), the topic of reliability has never been more relevant to all branches of science. Here we conducted a pre-registered, double-blind and sham-controlled crossover study to assess the intra-individual reliability of two commonly used single-session tDCS protocols that influence learning effects. We first hypothesised that the previously published effects would both be replicated: (H1) cathodal tDCS to the DLPFC would disrupt training-related performance gains in the response-selection task (RST; Filmer et al., 2013a, 2019b) and (H2) anodal tDCS to M1 would improve performance in the serial reaction time task (Kantak et al., 2012). Second, based on neurophysiological (López-Alonso et al., 2015; Jamil et al., 2017) and multi-session studies (Filmer, Lyons, Mattingley, & Dux, 2017a; Filmer et al., 2017b) we hypothesised that these effects of tDCS would be moderately reliable (p > .60, r > .30) within individuals across time for both the response-selection task (H3) and serial reaction time task (H4). That is to say, for an individual, the difference in their task performance under sham and tDCS conditions would be moderately consistent across time periods.

Cortex 173 (2024) 61–79

Transparency statement
We report how we determined our sample size, all data exclusions, all inclusion/exclusion criteria, whether inclusion/ exclusion criteria were established prior to data analysis, all manipulations, and all measures in the study.

Participants
Right-handed participants, aged between 18 and 35 years (mean age 23.2, SD = 3.3, 14 males) were recruited for the study, with 92 and 86 complete datasets being collected for the RST and SRTT respectively (see Exclusions). Participants were recruited through the student population and community pool at The University of Queensland and paid AUD $20 per hour of participation. Handedness was confirmed using the Edinburgh Handedness Inventory (Oldfield, 1971). A tDCS Safety Screening Questionnaire (see Appendix A) was employed to screen for tDCS contraindications. Specifically, individuals with psychiatric or neurological condition(s), current psychoactive medication use, significant alcohol or drug use, or history of head injuries or concussions were excluded from participating in the study. In addition, as extensive musical training prevents tDCS improvements on the SRTT (Furuya, Klaus, Nitsche, Paulus, & Altenmüller, 2014), individuals with greater than 12 years of musical training were excluded. To ensure capacity to complete the response selection task, individuals with deficient colour vision or hearing impairment were also excluded at pre-screening. Because unreported vision or hearing problems would show up as poor performance in the practice phase, only participants with accuracy of >60% were asked to continue in the study. Experimenters and testing procedures were kept as consistent as possible throughout the study, such that every participant received the same brief from the experimenters at each stage of testing (see Appendix B). The University of Queensland Human Research Ethics Committee provided approval for this study.

Bayesian sampling
As per the Bayesian sampling approach, participants were recruited until strong evidence for the alternative or null hypothesis (i.e., a Bayes factor of BF10 ≥ 6 or BF01 ≥ 6) was achieved for the critical hypothesis test (H3), or when 100 complete datasets were collected, whichever was sooner. The stopping rule was first checked once an initial sample of 50 subjects had been achieved, and then after every subsequent 5 subjects. ) with a small-moderate effect (d = .4), stopping boundary of BF10 ≥ 6, and a sample size of n = 100 found evidence for the alternative hypothesis (i.e., an effect of tDCS). We used a small-moderate effect size (d = .4) for these simulations as it represents the minimum effect of practical significance and is below the effect sizes of previous studies (Filmer et al., 2013a; Kantak et al., 2012). However, even though the sample size approached 100, with 98 datasets being completed for the RST prior to exclusions, we were unable to reach the full 100 participants within the planned testing period of 18 months, even after replacing excluded participants. While the sample stopped short of the planned 100 datasets, results of post-hoc BFDA simulations show the impact on sensitivity to have been negligible, with 77% of simulations for the critical test (H3) terminating at the null boundary (BF01 = 6) for N = 80 compared to 80% for N = 100.
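The stopping rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `should_stop` and its return labels are hypothetical, the Bayes factor is assumed to be computed elsewhere (e.g., in JASP or R), and only the thresholds (BF ≥ 6 in either direction, first check at 50, then every 5, cap at 100) come from the text.

```python
def should_stop(n_complete, bf10, threshold=6.0, n_min=50, step=5, n_max=100):
    """Hypothetical helper: decide whether recruitment stops.

    n_complete : number of complete datasets collected so far
    bf10       : Bayes factor for H1 over H0 on the critical test (must be > 0)
    Returns (stop, reason).
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive")
    # Hard cap: stop once the maximum planned sample is reached.
    if n_complete >= n_max:
        return True, "max_n"
    # The rule is only evaluated at n_min and at every `step` datasets after it.
    at_checkpoint = n_complete >= n_min and (n_complete - n_min) % step == 0
    if not at_checkpoint:
        return False, "not_a_checkpoint"
    if bf10 >= threshold:
        return True, "evidence_for_h1"
    # BF01 is the reciprocal of BF10.
    if 1.0 / bf10 >= threshold:
        return True, "evidence_for_h0"
    return False, "continue"
```

For example, `should_stop(55, 0.1)` stops for the null because BF01 = 10, whereas `should_stop(52, 7.0)` does not stop because 52 is not a scheduled check.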

Exclusions
Of the 107 participants recruited, a total of 21 participants had their data excluded for one or both tasks following the commencement of their first session. Five participants were excluded based on scheduling or attendance issues, 4 were found to have been concurrently participating in another brain-stimulation study at the University, 3 withdrew without reason, 2 withdrew citing headaches following sessions, 2 were unable to follow verbal or written instructions, 2 demonstrated performance levels that met our post-study exclusion criteria, 1 had their data corrupted by a technical issue, 1 disclosed >12 years musical training after their session had commenced and 1 was excluded after reporting light headedness during stimulation. The three participants who reported adverse effects were encouraged to seek medical attention, with no further adverse effects reported at follow-up. Following exclusions we had a total of 92 complete datasets for the response selection task and 86 datasets for the motor learning task.

Tasks
Response selection task (Filmer et al., 2013a). Each session of this task required participants to discriminate between stimuli as quickly and as accurately as possible. Stimuli were mapped to eight individual keys on a standard QWERTY keyboard, with the task separated into two levels of response load (2 response alternatives vs 6 response alternatives) that were interleaved throughout a session in a counterbalanced manner. This task used two response key mappings, with a participant using only one of these versions for all four sessions. 1A), symbols (#, %, @, ~, ^, *, +, |) or sounds (eight tones, the same as those used by Dux et al., 2006).
The allocation of stimulus sets to experimental session was counterbalanced across participants and stimulation type. Participants completed this task a total of four times, across two weeks. In the first week (T1) participants completed a sham and active tDCS session separated by a minimum of 24 h. In the subsequent week (T2), participants again completed a sham and active tDCS session with stimulation type counterbalanced (see Appendix D). During sessions, participants sat approximately 70 cm from an LCD monitor with a refresh rate of 100 Hz and responded using a standard Macintosh keyboard. A session began with an introduction of the stimuli and response keys, followed by three practice blocks of 30 trials for each response load during which accuracy feedback was provided (Fig. 1). Once participants had familiarized themselves with the task, they completed 540 trials across three phases (180 trials each phase, 90 of each response load) with instructions to respond as rapidly and accurately as possible. Response time was measured as the time between stimulus onset and keypress, during a response window of 1800 msec following stimulus presentation. Stimulation was administered immediately after the first phase, with participants instructed to keep their eyes open and refrain from fidgeting during stimulation. The second phase of training began within 1 min of stimulation cessation. The third phase took place 20 min after the end of stimulation, with participants instructed to sit quietly for the 10 min period between the second and third phases of training. As with Filmer et al. (2013a), response times from the pre-stimulation baseline were compared to those of the two post-stimulation phases in order to ascertain a measure of performance improvement, with the greatest difference in performance expected between the baseline and 20 min post-tDCS phases.

Serial reaction time task (Nissen & Bullemer, 1987). We used a modified SRTT (Kantak et al., 2012) to train the non-dominant hand on four stimulus-response pairings (A S D F; V B N M; E R T Y; H J K L), with each finger corresponding to one response key (e.g., A: little, S: ring, D: middle, F: index). Consistent with previous studies (Kantak et al., 2012; Karok & Witney, 2013), training the non-dominant hand should increase the magnitude of performance improvements. In each trial of this computer-based task, a row of four boxes was presented on screen, three grey and one red target stimulus, with participants required to respond as rapidly as possible to the stimulus by pressing the appropriate key on the keyboard (see Fig. 2). In each experimental session, participants were trained on a repeating ten-trial target sequence (e.g., A-D-S-F-S-A-S-D-A-F), with a new target sequence used each session in an order which was counterbalanced across participants. Following a procedure similar to Kantak et al. (2012), each experimental session began with 120 trials at baseline (6 target sequences and 6 random sequences), followed by 600 training trials (60 target sequences) while receiving tDCS. Five minutes after the cessation of tDCS, participants repeated the 120 trial baseline test. Response times were measured as the time between stimulus onset and keypress. Response times to the trained and random sequences at baseline and end of acquisition were compared to ascertain a measure of performance improvement. All sessions were completed using the same equipment and specifications as the previous task. Sessions were also counterbalanced across two weeks, with a minimum 24 h between sessions (see Appendix D).

Stimulation protocols
A NeuroConn DC Plus stimulator (NeuroCare, Germany) was used to deliver stimulation via two rubber electrodes encased in saline-soaked sponges. The International 10–20 EEG system was referenced for electrode placement. For the response selection task, electrode placement and stimulation parameters replicated the cathodal condition from Filmer et al. (2013a), with the 25 cm² (5 × 5 cm) cathodal electrode placed 1 cm posterior to F3 (left DLPFC) and the anodal electrode placed over the contralateral supraorbital ridge (Fp2) in order to deliver 9 min of tDCS (including a 30 sec ramp up/down period) at an intensity of .7 mA. We chose to only replicate the cathodal condition for this task due to finding it slightly more consistent than the anodal condition during previous iterations (Filmer et al., 2013a, 2019b).
Electrode placement for the SRTT replicated Kantak et al. (2012), with an 8 cm² (2 × 4 cm) anodal electrode placed on the M1 contralateral to the hand being trained (C4), and a 48 cm² (6 × 8 cm) cathodal electrode placed on the forehead over the ipsilateral orbit (Fp1), with tDCS delivered for 15 min (including a 30 sec ramp up/down period) at an intensity of 1 mA. Sham stimulation for all tasks involved identical electrode placement to the respective active condition; however, stimulation ceased after 1.25 min.
The blinding protocol of the NeuroConn tDCS device allows for the programming of stimulation duration, current and frequency. Both active and sham stimulation can be activated by manually inputting a unique numerical code each session, with codes supplied by an experimenter uninvolved in testing. The display of the device was similar between conditions, with imitation parameters being displayed during sham stimulation. In the event that technical issues affected the duration or intensity of stimulation, or the electrode impedance increased above 15 kΩ, that session was aborted and its data excluded. Aborted sessions were rescheduled where practical.

Overview
The purpose of this study was to assess whether the effects seen in previous research (Filmer et al., 2013a; Kantak et al., 2012; Karok & Witney, 2013; Nitsche et al., 2003)   the tasks were completed. Response times (RT) for correct responses comprised the principal data analysed in both tasks. Analyses were conducted using JASP software (JASP Team, 2023; Version 0.17.3) and the BFpack package in R (Mulder et al., 2019). Bayesian analyses used default Cauchy priors centred on 0 with a scale (width) of .707. Although it is normally reasonable to base prior distributions on the posterior distributions of studies one aims to replicate, we chose default priors for the following reasons. First, a recent systematic review found that studies without pre-registration, such as those we are replicating, tend to report larger effect sizes compared to those with pre-registration, a discrepancy thought to be due to publication bias (Schäfer & Schwarz, 2019). Basing priors on studies with inflated effect sizes results in larger than necessary distances between the prior distributions, which means that in some cases small-medium effects will provide more evidence towards the null than the alternative (Etz & Vandekerckhove, 2016). This issue is further compounded by a number of previous SRTT-tDCS studies not explicitly reporting effect sizes, making it difficult to ascertain average effect sizes across studies. Second, the primary focus of this study was to examine the intra-individual reliability of tDCS effects, regardless of the specific effect size, and thus for these purposes default Cauchy prior distributions are appropriate.

Post-study exclusion criteria
Following completion of the study, individuals who scored 2.5 SDs above or below the mean for response time or accuracy, did not follow task instructions, or did not complete all four sessions of a task had their data excluded from the analysis process. In the instance where an individual's data from only one task was compromised, the data for the second task was still included, provided it did not meet the above criteria. Only 2 exclusions were related to performance identified at the completion of the study.
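The 2.5 SD performance criterion can be illustrated with a short Python sketch. The function name `flag_outliers`, the dictionary layout, and the use of the sample standard deviation across participants are assumptions made for illustration; the text specifies only the 2.5 SD cut-off applied to response time or accuracy.

```python
from statistics import mean, stdev


def flag_outliers(scores, n_sd=2.5):
    """Hypothetical helper: return participant IDs whose score lies more
    than n_sd sample standard deviations above or below the group mean.

    scores : dict mapping participant ID -> mean RT (or accuracy)
    """
    values = list(scores.values())
    group_mean = mean(values)
    group_sd = stdev(values)  # sample SD (n - 1 denominator); an assumption
    return sorted(
        pid for pid, value in scores.items()
        if abs(value - group_mean) > n_sd * group_sd
    )
```

The same function would be applied once to mean response times and once to accuracy, with a participant excluded if flagged on either measure.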

Effects of tDCS on response-selection task performance
We predicted from previous research that participants would show disrupted performance in the response-selection task for the high load condition when stimulated with cathodal tDCS compared to sham tDCS (Filmer et al., 2013a, 2019b). To test this hypothesis (H1), Bayesian paired-sample t-tests were run on sham and cathodal tDCS RT-change scores for the high load condition (pre-stimulation mean RT − 20 min post-stimulation mean RT) for both T1 and T2 separately. A small-to-medium effect (d .2) of tDCS, consistent with previous studies (Filmer et al., 2013a, 2019c), was expected. NHST analogues were also run, given this approach is still common in the literature.
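As a rough illustration of the quantities entering this test, the sketch below computes an RT-change score per session and the paired-samples t statistic (the NHST analogue) across participants. The function names are hypothetical, and the Bayes factor itself is not computed here; in the study that step was carried out in JASP.

```python
from math import sqrt
from statistics import mean, stdev


def rt_change(pre_rts, post_rts):
    """RT-change score: pre-stimulation mean RT minus post-stimulation mean RT.
    Positive values indicate faster responses after stimulation."""
    return mean(pre_rts) - mean(post_rts)


def paired_t(sham_scores, active_scores):
    """Paired-samples t statistic on per-participant change scores.

    Returns (t, degrees of freedom, Cohen's d_z), where d_z is the mean
    difference divided by the SD of the differences.
    """
    diffs = [a - s for s, a in zip(sham_scores, active_scores)]
    n = len(diffs)
    diff_mean = mean(diffs)
    diff_sd = stdev(diffs)
    t = diff_mean / (diff_sd / sqrt(n))
    return t, n - 1, diff_mean / diff_sd
```

Each participant contributes one sham and one active change score per time period, so the test would be run once for T1 and once for T2.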

Effects of tDCS on SRTT performance
Our analysis of SRTT performance followed the method of Kantak et al. (2012). For each session, mean RT was calculated from 60-trial blocks of target and random sequences at baseline and the end of acquisition. Training trials were binned into blocks of 60, with mean RT calculated for each block. Implicit sequence-specific performance was measured as the difference in mean RT between target and random sequences at baseline and end of acquisition. We then compared sequence-specific performance between conditions (sham vs anodal tDCS) for time period 1 and 2 using Bayesian paired-sample t-tests. Our hypothesis (H2) predicted that anodal tDCS to the contralateral M1 would result in significantly better sequence-specific performance at end of acquisition than sham tDCS, as has been observed in several replications previously (Cuypers et al., 2013; Kantak et al., 2012; Karok & Witney, 2013; Nitsche et al., 2003; Vines et al., 2008). Again, NHST analogues were also run with a critical threshold of p < .05 for significance.
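The block binning and sequence-specific measure described above could be sketched as follows. Function names are hypothetical, and the sign convention (random minus target, so that larger values mean more sequence learning) is an assumption; the text states only that the measure is the difference in mean RT between target and random sequences.

```python
from statistics import mean


def block_means(rts, block_size=60):
    """Bin a list of trial RTs into consecutive blocks and return the mean
    RT of each block (the last block may be shorter if the length is not a
    multiple of block_size)."""
    return [
        mean(rts[start:start + block_size])
        for start in range(0, len(rts), block_size)
    ]


def sequence_specific(target_rts, random_rts):
    """Sequence-specific performance: mean RT on random sequences minus
    mean RT on target sequences (assumed sign convention)."""
    return mean(random_rts) - mean(target_rts)
```

With 600 training trials, `block_means` yields the ten training blocks that enter the exploratory Block analysis.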

Reliability
The intra-individual reliability of tDCS effects across time in both tasks (cathodal stimulation disrupting response selection task performance, anodal stimulation enhancing SRTT performance) was assessed using intra-class correlation analysis, ICC(2,1), a routinely employed technique to assess test-retest reliability. ICC(2,1), used as an indicator of agreement amongst test sessions, is calculated by comparing the variance across different sessions for the same subject with the variance across all sessions and all subjects, and is expressed as; where MS_B is the variance between sessions, MS_W is the variance within sessions and MS_E is variance due to error. The between-condition difference scores from each time period (i.e., T1 active tDCS − T1 sham tDCS) were used for this correlational analysis, with both Bayesian and NHST bivariate correlations also calculated using the same values. Based on previous neurophysiological studies of tDCS reliability (López-Alonso et al., 2014; Jamil et al., 2017) as well as multi-session studies (Filmer et al., 2017a, 2017b), we predicted that tDCS effects in both tasks should reach at least moderately reliable thresholds (p > .60, r > .30) within individuals (H3, H4).
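For readers who want to see how such a coefficient is obtained, the sketch below computes ICC(2,1) from a subjects × sessions table using the widely used Shrout and Fleiss (1979) two-way random-effects, absolute-agreement, single-measure formulation. This is an assumption about the variant intended: the mean-square names in the code (rows, columns, error) follow that formulation and may not map one-to-one onto the MS_B, MS_W and MS_E notation used in the text. The function name `icc_2_1` is hypothetical.

```python
def icc_2_1(table):
    """ICC(2,1) for a list of rows, one row per subject and one column per
    session (here, e.g., the T1 and T2 active-minus-sham difference scores)."""
    n = len(table)      # number of subjects
    k = len(table[0])   # number of sessions
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                 # between-sessions mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

In the present design each row would hold one participant's two difference scores, so k = 2.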

Control analysis
The order of stimulation conditions, tasks, and stimulus sets was counterbalanced within the study design to control for order effects (see Appendix D), thus we did not include any pre-planned control analysis. Stimulation sessions were separated by a minimum of 24 h.

Results
The following analyses employed the standard interpretation of the resultant Bayes factors (van Doorn, Aust, Haaf, Stefan, & Wagenmakers, 2021), where those between 1 and 3 were considered to be weak, those between 3 and 10 moderate, and those greater than 10 strong evidence for the test hypothesis (BF10) or null hypothesis (BF01). NHST results are reported for communication purposes but were not used to justify conclusions.
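These interpretation bands can be written as a small lookup, shown here as a Python sketch. The function name `evidence_label` is hypothetical, and the handling of the boundary values (3 and 10 falling into the lower band) and of Bayes factors below 1 is an assumption, since the text gives only the three ranges.

```python
def evidence_label(bf):
    """Map a Bayes factor (BF10 or BF01, whichever hypothesis it favours)
    to the verbal label used in the text."""
    if bf <= 0:
        raise ValueError("A Bayes factor must be positive")
    if bf < 1:
        # The evidence favours the other hypothesis; take the reciprocal first.
        return "invert"
    if bf <= 3:
        return "weak"
    if bf <= 10:
        return "moderate"
    return "strong"
```

For instance, a BF01 of 3.509 would be labelled moderate evidence for the null and a BF01 of 20.7 strong evidence.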

Effects of tDCS on response-selection task performance
In contrast to our predictions, .7 mA cathodal tDCS to the DLPFC, in the high response load condition, did not disrupt training-related performance gains in the response-selection task at either time point (see Fig. 3). Bayesian paired-sample t-tests comparing sham and cathodal tDCS RT-change scores (pre-stimulation mean RT − 20 min post-stimulation mean RT) revealed strong evidence for no effect of tDCS in disrupting learning at both Time 1 (BF0− = 20.7, p = .91) and Time 2 (BF0− = 19.4, p = .93; see Fig. 4).

Effects of tDCS on SRTT performance
As was the case for the RST paradigm, and contrary to our predictions, 1 mA anodal tDCS to the contralateral M1 did not improve sequence performance at either time point (see Fig. 5). Indeed, Bayesian paired-sample t-tests comparing changes in sequence-specific performance from baseline to test between conditions revealed strong evidence for no effect of tDCS on improving performance at both Time 1 (BF0+ = 24.05, p = .98) and Time 2 (BF0+ = 20.05, p = .93; see Fig. 6).

Reliability
The impact of tDCS on both tasks was not reliable. Specifically, ICC(2,1) analysis of difference scores (active − sham) across time periods revealed little to no reliability of an effect of tDCS in the RST (ICC(2,1) = .14), with results of Bayesian Pearson correlations showing moderate evidence against a correlation of difference scores across time periods (r = .13, BF01 = 3.509, p = .21; see Fig. 7A). The same held for the SRTT: ICC(2,1) analysis of difference scores across time periods showed little to no reliability of an effect of tDCS (ICC(2,1) < .10), and Bayesian Pearson correlations showed moderate evidence against a correlation of difference scores across time periods (r = .09, BF01 = 5.52, p = .43; see Fig. 7B). These results are perhaps not surprising given the evidence for the null found for the overall effect of tDCS in both tasks.

Overview
Given the failure to observe our predicted effects, we conducted exploratory analyses to investigate potential explanations for these findings. Firstly, we assessed whether differing stimulation and task orders may have influenced intra-individual performance. We broke the data down by time (1 and 2), and by sham versus active stimulation, as both tasks involved learning and stimulation components which may have changed over time. We also examined the training data, in line with the previous literature (Biabani, Farrell, Zoghi, Egan, & Jaberzadeh, 2018). These analyses used Bayesian Repeated Measures (RM) ANOVAs and correlations (NHST is also presented as above).
For the response selection task, we ran Condition (Active, Sham) and Time (Time 1, Time 2) RM ANOVAs on RST change scores (Pre-Immediate Post tDCS, Pre-20 min Delay) with both the order of Active and Sham sessions and the order of task stimulus entered as between-subjects factors. Similar 2 × 2 RM ANOVAs, Condition (Active, Sham) and Time (T1, T2), were run on SRTT Sequence Specific Performance scores (Trained RT/Random RT) at Baseline and Test, with stimulation order entered as a between-subjects factor. Further, whilst our preregistered SRTT analyses examined pre-to-post stimulation effects, this overlooks potential tDCS effects on the training block. Indeed, this is potentially important as a large number of tDCS SRTT studies have shown effects during the training block (Ballard, Eakin, Maldonado, & Bernard, 2021; Debarnot et al., 2019; Greeley & Seidler, 2019; Hsu, Shereen, Cohen, & Parra, 2023). We additionally analysed the training block data, via Condition (Active, Sham) × Time (Time 1, Time 2) × Block (1…10) RM ANOVAs on reaction times for correct keypresses, with stimulation order again as a between-subjects factor. Lastly, we examined our reliability data in more detail. Specifically, we examined within-session reliability and correlations across conditions, i.e., sham and active as opposed to difference scores.

Effects of tDCS on SRTT performance
There was a strong effect of tDCS on SRTT performance in the training trials.). The absence of stimulation effects for Time 2 might have resulted from large performance improvements that were evident from Time 1 to Time 2 (main effect of Time, BFincl > 1000, BF10,U = 1.623E+12; F(1,82) = 16.31, p < .001), which might have masked possible effects of tDCS on sequence learning in Time 2 (i.e., the second week). We discuss the potential impact of this on reliability below. The effect of stimulation at Time 1 was also dependent on the order of stimulation received across the two sessions, with strong evidence for a Condition × Stimulation Order interaction (BFincl > 1000; F(1,84) = 15.37, p < .001). RM ANOVAs on Baseline and Test data, with factors being Phase (Baseline, Test) and Condition (Active, Sham), also revealed a Condition × Stimulation Order interaction (BFincl > 1000; F(1,84) = 21.92, p < .001). Follow-up t-tests demonstrated that participants who received Active then Sham tDCS at Time 1 had much lower RT at Test in the second session relative to the first (BF10 = 128.21; t(42) = 4.08, p < .001; Fig. 8). Active tDCS appeared to prevent this practice effect amongst those who received Sham then Active tDCS (BF10 = .243; t(42) = .91, p = .37), with strong evidence for tDCS disrupting learning in the second session when performance at Test was normalised to Baseline (BF10 = 22.47; t(42) = 3.43, p < .001).

Discussion
A number of studies have demonstrated that tDCS may influence cognition and motor learning (Antal et al., 2022; Filmer et al., 2014; Filmer, Mattingley, Marois, & Dux, 2013b; Fregni et al., 2005; Nitsche et al., 2003; Reinhart et al., 2017); however, few, if any, have tested whether these effects are repeatable in the same individual. Given the previous criticism regarding tDCS reproducibility (Horvath, Carter, & Forte, 2016; Horvath, Forte, & Carter, 2015; Horvath, Vogrin, Carter, Cook, & Forte, 2016), proper assessments of tDCS reliability are needed. In the present study, we sought to determine the intra-individual reliability of two previously replicated tDCS paradigms. The first paradigm was the Serial Reaction Time Task, in which previous studies have found improved/impaired performance in response to anodal tDCS to the M1 (Kantak et al., 2012; Karok & Witney, 2013; King, Rumpf, Heise, et al., 2020; Nitsche et al., 2003; Puri, Hinder, Krüger, & Summers, 2021). The second paradigm was the Response Selection Task, in which we have previously shown that anodal and cathodal tDCS to the left prefrontal cortex disrupt learning performance (Filmer, Ehrhardt, Bollmann, Mattingley, & Dux, 2019a; Filmer et al., 2019b, 2013a, 2013b). We specifically chose these two tasks as they not only allowed for the examination of tDCS reliability in both the motor and cognitive domains, but also for a comparison of both online and offline montages.
To evaluate the intra-individual reliability of tDCS effects, we had participants attend four training sessions across two weeks for each task during which they underwent active or sham tDCS in a counter-balanced manner. The present work directly replicates the protocols of previous studies in terms of stimulation parameters and task; however, there exist methodological differences arising from the nature of the study requiring repeated sessions: for example, the RST paradigm needed an additional session and stimuli set. Whilst our data demonstrated learning effects within both tasks, our planned analysis found poor reliability of tDCS effects on these two paradigms, suggesting individuals did not respond consistently to the tDCS manipulations. However, our exploratory reliability analyses revealed anecdotal evidence for tDCS enhancing reliability of RST performance, and strong evidence for tDCS disrupting reliability of SRTT performance. Additionally, our exploratory analyses revealed strong evidence for a negative effect of tDCS on SRTT performance, and anecdotal evidence for a positive effect of tDCS on RST performance, with both these observed effects being in the opposite direction to our hypotheses. We discuss here what may have contributed to these findings, and what they might mean for the field. In short, our results provide a complex picture of the influence of tDCS across paradigms and its reliability. While we did find evidence that tDCS influenced motor learning, this was influenced by practice effects that were prominent despite the use of different sequences for each session. In addition, stimulus set composition appears to have played an important role in the RST paradigm, with this change removing an effect previously replicated. This lack of overall effects may have limited the extent to which we could observe correlations across sessions.

tDCS disrupted motor sequence performance
We observed that 1 mA anodal tDCS to the motor cortex disrupted, rather than enhanced, both online and test performance in the SRTT, which is contrary to previous studies (Kantak et al., 2012; Karok & Witney, 2013; Nitsche et al., 2003). However, more recent research, published after our report was registered, demonstrates that anodal tDCS to the M1 may disrupt motor learning (King, Rumpf, Heise, et al., 2020; Puri et al., 2021). This fits with suggestions that variable results, including null effects, are not uncommon in the tDCS motor learning literature (Cuypers et al., 2013; Hashemirad et al., 2016; Summers, Kang, & Cauraugh, 2016; Buch et al., 2017; Guimarães et al., 2022). Although a pattern of poorer performance in Active tDCS sessions was observed at both time periods, it was found to be meaningful only at T1, that is, during the first two sessions participants completed. Further separation of the sample at T1 by stimulation order showed that tDCS disrupted performance in the first session and prevented practice effects in the second session, relative to sham counterparts (Fig. 8).
The tDCS protocol used for motor learning in the present study is based on that of Kantak et al. (2012), whose work resulted in one of the largest effects of anodal M1 tDCS observed 24 h post-stimulation (Hashemirad et al., 2016). This protocol uses an electrode montage consisting of a small (8 cm²) active electrode and a large (48 cm²) reference electrode to deliver 1 mA, with a resultant charge density (current (A) × time (s) / electrode size (m²)) of 1125 C/m², much larger than the typical density of 257–386 C/m² seen in other studies of this type (e.g., Karok & Witney, 2013; Nitsche et al., 2003). This is an important point, as recent research has demonstrated dosage–duration interactions that challenge the polarity-dependent expectations of tDCS (Batsikadze et al., 2013; Hassanzahraee, Nitsche, Zoghi, & Jaberzadeh, 2020; Mosayebi Samani et al., 2019). For example, Hassanzahraee et al. (2020) showed that, when applied in excess of 26 min, 1 mA and 1.5 mA anodal tDCS to the M1 induce inhibitory, rather than excitatory, activity in the target cortical regions. While the present work stimulated the M1 for only 15 min, the resultant charge density (1125 C/m²) was much greater than that of the inhibitory protocols of Hassanzahraee et al. (2020) (446–669 C/m²); it is therefore plausible that higher charge densities to the M1 caused perturbation rather than enhancement of online motor performance.
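The charge density figure quoted above follows directly from the stated formula. The short Python sketch below reproduces the arithmetic for the present protocol (1 mA, 15 min, 8 cm² active electrode). The function name `charge_density` is our own, and the 35 cm² comparison configuration is an illustrative assumption rather than a parameter reported in the text.

```python
def charge_density(current_ma: float, duration_min: float, electrode_cm2: float) -> float:
    """Return charge density in C/m^2: current (A) * time (s) / electrode area (m^2)."""
    current_a = current_ma / 1000.0        # mA -> A
    duration_s = duration_min * 60.0       # min -> s
    area_m2 = electrode_cm2 / 10_000.0     # cm^2 -> m^2
    return current_a * duration_s / area_m2


# Present protocol as described in the text: 1 mA for 15 min through an 8 cm^2 electrode.
present = charge_density(1.0, 15, 8)       # approximately 1125 C/m^2

# Assumed comparison (hypothetical, not taken from the text):
# the same current and duration through a 35 cm^2 electrode.
comparison = charge_density(1.0, 15, 35)   # approximately 257 C/m^2

print(f"present protocol: {present:.0f} C/m^2")
print(f"assumed 35 cm^2 comparison: {comparison:.0f} C/m^2")
```

Under these assumptions, shrinking the active electrode from 35 cm² to 8 cm² raises the charge density more than fourfold at an identical current and duration, which is the contrast the paragraph above relies on.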
Lastly, unlike previous studies, we did not localise the hand region of the M1 using TMS, but rather placed the active electrode over the C4 location using the 10–20 EEG system, as this point has often been reported as corresponding with the hand region (Kantak et al., 2012; Karok & Witney, 2013). However, given that some research suggests that the optimal scalp location corresponding to the hand area may in fact be more dorsal to C3/C4, at the C1/C2 and C1h/C2h positions (Kim, Wright, Rhee, & Kim, 2023; Silva, Silva, Lira-Bandeira, Costa-Ribeiro, & Araújo-Neto, 2021; Sparing, Buelte, Meister, Paus, & Fink, 2008), it is possible that larger electrodes (35 cm²/25 cm²) that cover all of these regions, as in the majority of previous studies (see Buch et al., 2017), may be necessary for online effects of tDCS to be observed.

Cathodal tDCS did not affect response-selection
We did not observe any effect on response selection following 0.7 mA tDCS to the left PFC at either time point. There exist many potential reasons for this lack of effect, including the increased inter-individual variability inherent to large tDCS studies such as the present one (Etz & Vandekerckhove, 2016; Minarik et al., 2016). For example, previous work has shown that multiple neurobiological and anatomical factors can influence the efficacy of tDCS both in training and transfer (Filmer et al., 2020a; Filmer et al., 2019a, 2019b; Kabakov, Muller, Pascual-Leone, Jensen, & Rotenberg, 2012; King, Rumpf, Verbaanderd, et al., 2020; Opitz, Paulus, Will, Antunes, & Thielscher, 2015; Shahid, Wen, & Ahfock, 2013; Stagg, Bachtiar, & Johansen-Berg, 2011). Other research shows that tDCS efficacy can also interact with participants' baseline ability, especially in healthy samples (Assecondi et al., 2021; Brunyé et al., 2014; Schmicker et al., 2021). In contrast to our previous work (e.g., Filmer et al., 2019b), participants completed four sessions instead of three, necessitating the use of an additional stimulus set. In addition, sessions were run closer together in time and more trials were completed overall. As this was not a direct replication of the earlier work by Filmer et al. (2019b), which also employed a large sample, future work will be required to assess the boundary conditions of the influence of anodal tDCS on response-selection learning.

tDCS effects were not reliable within individuals
Turning to the question at the heart of the present work, namely whether tDCS elicits reliable effects on behaviour within an individual, several factors may explain why participants did not reliably respond to the tDCS interventions. First, in the present work, participants attended the lab twice a week for four weeks. Within each week the median period between sessions was 2 days, and the median period between the second and third sessions was 6 days. This timeframe was designed to provide the most reasonable and convenient scheduling for both participants and experimenters, and thereby reduce the rate of attrition and its associated bias (Flick, 1988). However, while 2–3 days between sessions is not unprecedented in tDCS studies reporting significant results (see Biabani et al., 2018; Dedoncker et al., 2016a, 2016b), the previous tDCS studies we replicate here tested subjects closer to once every week, allowing for at least 5–8 days between each session (Filmer et al., 2019b; Kantak et al., 2012). It is possible that the reduced period between sessions in the present work led to carryover effects, be they from training or stimulation, that have complicated the results. Previous studies have shown how learning effects can heavily influence results (Jacoby & Lavidor, 2018). We see possible evidence for this in the SRTT data, where baseline RTs for the Trained sequence were lower at T2, suggesting participants had become more adept at detecting the implicit sequence following the two previous sessions at T1. Conversely, in the RST data, whilst we saw no effect of time on baseline task performance, perhaps highlighting a strength of using differing task stimuli across sessions, we also observed varied levels of performance across the four stimulus modalities, which may have confounded the results.
Further, with regard to the SRTT, previous research demonstrates that competition between implicit and explicit motor learning mechanisms may influence performance (Kantak et al., 2012), and that gaining explicit awareness of the sequence during an implicit SRTT can influence tDCS efficacy (Greeley & Seidler, 2019). Due to the repeated nature of the present study, we did not assess participants' awareness of the sequence post-session, as this would have informed them of the presence of sequences in subsequent sessions. It is nevertheless possible that by T2 participants had come to understand the sequential nature of the task and responded accordingly, thus influencing the effect of tDCS at this time period. Future work may address this issue by informing participants of the presence of a sequence prior to sessions, thus negating issues arising from differing levels of awareness amongst the sample.
Finally, perhaps the most parsimonious account is that, due to practice, task design, or both, the tDCS effects were not large enough for reliability to be observed here. Indeed, as is the case with any individual difference measure, if the system is not stressed sufficiently or a process is not assessed validly, it is unlikely that a reliable correlation will be observed in a test-retest setting.
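To make the test-retest logic concrete, the sketch below computes each participant's Active minus Sham difference score at T1 and T2 and correlates the two sets of scores. It is only an illustration: it uses a plain Pearson correlation rather than the analyses of the registered report, the function names are our own, and the numbers are invented placeholder values, not data from this study.

```python
from math import sqrt


def difference_scores(active, sham):
    """Per-participant Active minus Sham scores (lists must be in the same participant order)."""
    return [a - s for a, s in zip(active, sham)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Invented RT change scores (ms) for five hypothetical participants.
t1_active, t1_sham = [30.0, 12.0, 45.0, 20.0, 5.0], [25.0, 20.0, 30.0, 22.0, 10.0]
t2_active, t2_sham = [28.0, 15.0, 40.0, 26.0, 9.0], [26.0, 18.0, 36.0, 20.0, 10.0]

d1 = difference_scores(t1_active, t1_sham)  # tDCS effect per participant at T1
d2 = difference_scores(t2_active, t2_sham)  # tDCS effect per participant at T2
r = pearson_r(d1, d2)                       # test-retest correlation of the tDCS effect
print(f"test-retest r = {r:.2f}")
```

Because each difference score combines the measurement noise of two sessions, a small true stimulation effect leaves little stable between-participant variance for such a correlation to detect.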

Conclusion
Over the past two decades, tDCS has garnered much interest as a means to influence motor and cognitive functioning. However, despite its popularity, it is often criticised for heterogeneous results, especially with regard to behaviour.
While much research has investigated potential methodological and inter-individual factors, the present work is among the first to examine intra-individual reliability of tDCS.
This study investigated the intra-individual reliability of two tDCS protocols administered within the same large cohort of individuals across two time points. In a large sample of individuals (>80), we failed to replicate the previously reported main effects of both protocols at either time point. Instead, tDCS was observed to disrupt, rather than enhance, motor learning, with no effect observed on response selection. Our planned analysis found no to low reliability for tDCS effects in both protocols, although exploratory analysis revealed anecdotal evidence for tDCS improving consistency of response-selection performance. Possible methodological and individual contributors were discussed, along with areas for future research to address. These findings underscore the necessity for further in-depth investigations into the intra-individual reliability of tDCS effects, and of effects in cognitive neuroscience more generally. To maximise the utility of tDCS as a tool for investigating and influencing brain function, it is imperative that future work focusses on delivering a comprehensive understanding of the consistency and reliability of tDCS outcomes across and within individual participants.
Response selection task – subsequent sessions
"Thank you for continued participation in our study. Today you will again complete three repetitions of the Response Selection Task. As before it is important that you attempt to complete this task as quickly and accurately as possible.
As with previous sessions following the first task repetition you will receive stimulation to your dorsolateral prefrontal cortex. You may feel a slight tingling or itching sensation during stimulation, please advise me if these sensations become too uncomfortable at any stage. Again, today's session should last a total of 70 minutes."

Serial reaction time task – first session
"Thank you for participating in our study today. This study is investigating the effect of transcranial direct current stimulation on cognitive and motor task performance.
The task you are about to complete is known as the Serial Reaction Time Task. It involves a series of stimuli appearing on the screen which you will then need to select the correct response to by using the appropriate keyboard key. More specific instructions will be provided in the task. It is important that you attempt to complete this task as quickly and accurately as possible. Today is the first of four sessions in which you will complete this task. This session will begin with a short practice period before repeating the task three times, with short breaks in between for you to stretch and take a break. The first repetition is used to get a baseline of your performance and will last approximately 2.5 mins. The second repetition will be the main part of the task, and will last approximately 15 mins. During this second repetition you will receive stimulation to your primary motor cortex. You may feel a slight tingling or itching sensation during stimulation, please advise me if these sensations become too uncomfortable at any stage. The third and final repetition is used to gauge your improvement will again be a short period of 2.5 minutes, completed without any stimulation. Today's session should last approximately 30 minutes."

Serial reaction time task – subsequent sessions
"Thank you for continued participation in our study.
Today you will again complete three repetitions of the Serial Reaction Time Task. As before it is important that you attempt to complete this task as quickly and accurately as possible.
As with previous sessions following the first short repetition you will receive stimulation to your primary motor cortex while you complete the main 15 min task repetition. You may feel a slight tingling or itching sensation during stimulation, please advise me if these sensations become too uncomfortable at any stage. You'll then complete another short repetition at the end. Again, today's session should last a total of 30 minutes."

Cortex 173 (2024) 61–79

Fig. 1 – Response Selection Task. (A) An example of a single trial where the participant is presented with one of eight possible target stimuli from one of four stimulus sets and is required to respond with the appropriate key as quickly and as accurately as possible. Error feedback is only given during practice trials. Abstract shape stimulus set shown bottom right. (B) The low-load and high-load response mappings for version 1 and version 2, which are equally distributed throughout a phase. (C) Session schedule. Each of the three task phases consists of 180 trials (90 of each response load). (D) Illustration of full schedule of sessions showing counterbalance of stimuli and stimulation condition. Sessions will be separated by a minimum of 24 h within time periods. Adapted from Filmer et al. (2019b).

Fig. 2 – Serial reaction time task. (A) Five consecutive trials shown. Participants are presented with a stimulus and instructed to respond as quickly as possible by pressing the corresponding key. Stimulus remains on screen until response is made. (B) Schedule of session, with 15 min anodal tDCS applied during training. (C) Illustration of full schedule of sessions showing counterbalance of stimuli and stimulation condition. Sessions within a time period will be separated by a minimum of 24 h, and the order of stimulation and stimuli counterbalanced across sessions and time periods.

Fig. 3 – Response times across the three phases of the response selection task sessions. Error bars indicate 95% confidence intervals.

Fig. 5 – Serial reaction time task response times across session. TB/RB: Trained/Random Baseline. TT/RT: Trained/Random Test. Error bars indicate 95% confidence intervals.

Fig. 6 – Mean difference in response time across the serial reaction time task sessions. Mean response time pre-stimulation is compared with response times at test. Error bars indicate 95% confidence intervals.

Fig. 7 – (A) Linear regression depicting relationship between Active–Sham difference scores for each period of RST. (B) Linear regression depicting relationship between Active–Sham difference scores for each period of SRTT. Shaded area indicates 95% confidence intervals. (C) Individual participants' Active versus Sham change scores plotted for T1 and T2 of the Response Selection Task. (D) Individual participants' Active versus Sham change scores plotted for T1 and T2 of the Serial Reaction Time Task.

Fig. 9 – (A) Linear regression depicting relationship between Sham RT change scores (Baseline – Test) for each time period of SRTT. (B) Linear regression depicting relationship between Active RT change scores (Baseline – Test) for each time period of SRTT. Shaded area indicates 95% confidence intervals.

Fig. 8 – Training performance at Time Point 1 separated by Condition and Stimulation Order (Active–Sham/Sham–Active). TB/RB: Trained/Random Baseline. TT/RT: Trained/Random Test. Error bars indicate 95% confidence intervals.
RST: response selection task; SRTT: serial reaction time task; tDCS: transcranial direct current stimulation.
a All sessions separated by a minimum of 24 h.

Table C1 – Study design table.
BFDA for r = .3 estimates final N < 100. ICC.Sample.Size for p = .4, p0 = .0 estimates that N > 80 provides 1 − β = .90 at α = .02.

Table D1 – Counterbalanced schedules.