Limits to tDCS effects in language: Failures to modulate word production in healthy participants with frontal or temporal tDCS

Transcranial direct current stimulation (tDCS) is a method of non-invasive brain stimulation widely used to modulate cognitive functions. Recent studies, however, suggest that effects are unreliable, small and often non-significant, at least when stimulation is applied in a single session to healthy individuals. We examined the effects of frontal and temporal lobe anodal tDCS on naming and reading tasks and considered possible interactions with linguistic activation and selection mechanisms as well as possible interactions with item difficulty and participant individual variability. Across four separate experiments (N, Exp 1A = 18; 1B = 20; 1C = 18; 2 = 17), we failed to find any difference between real and sham stimulation. Moreover, we found no evidence of significant effects limited to particular conditions (i.e., those requiring suppression of semantic interference), to a subset of participants or to longer RTs. Our findings sound a cautionary note on using tDCS as a means to modulate cognitive performance. Consistent effects of tDCS may be difficult to demonstrate in healthy participants in reading and naming tasks, and may be limited to cases of pathological neurophysiology and/or to the use of learning paradigms.


Introduction
Transcranial direct current stimulation (tDCS) is a popular technique for modifying cognition using a weak electric current. Over the past decade, thousands of articles have reported beneficial effects, especially in language tasks, in participants with healthy (Prehn & Flöel, 2015) and pathological brains (for aphasia, see de Aguiar, Paolazzi, & Miceli, 2015; for dyslexia, see Heth & Lavidor, 2015). Based on early research on the motor cortex, cortical excitability can be modulated via shifts in resting membrane potentials, resulting in hypopolarization/excitation versus hyperpolarization/inhibition depending on the polarity of stimulation (i.e., anodal versus cathodal). However, cognitive effects are far more complex and unpredictable (Horvath, Forte, & Carter, 2015a). This is in part because tDCS effects interact with ongoing cortical activity (see Silvanto, Muggleton, & Walsh, 2008), as indicated by the general effectiveness of tDCS in patient samples (for review, see Cappon, Jahanshahi, Bisiacchi, Turner, & Paul, 2016; de Aguiar et al., 2015). It may therefore be that tDCS can modulate cognition in pathological brains where excitability or processing capacity is unusually low or dysfunctional, but not in healthy brains where neuronal excitability is operating at optimal levels. If true, this will limit the applicability of tDCS. We aimed to gather further evidence on this question by focusing on the effects of single-session, anodal tDCS in normal participants coupled with picture naming and reading tasks, and by considering the moderating influence of cortical excitability resulting from individual differences and task demands. The reliability of tDCS in cognitive tasks has been questioned in recent reviews. Horvath et al. (2015a) found no evidence of any cognitive effects across eighty studies on healthy participants using single sessions of tDCS.
In a companion review, Horvath, Forte, and Carter (2015b) also showed no neurophysiological effects of tDCS beyond the modulation of motor evoked potential (MEP) amplitudes. Meta-analyses focusing on working memory/short-term memory effects in healthy samples reported significant but small effects of anodal tDCS (e.g., Brunoni & Vanderhasselt, 2014; Hill, Fitzgerald, & Hoy, 2015). For example, Dedoncker, Brunoni, Baeken, and Vanderhasselt (2016) found a significant but unimpressive reduction in response times following single sessions of anodal (or excitatory) tDCS applied to the left dorsolateral prefrontal cortex in healthy volunteers (effect size: −.10). However, a recent and arguably more comprehensive review by Mancuso, Ilieva, Hamilton, and Farah (2016) focusing on the effects of anodal tDCS in healthy participants revealed that effects became non-significant after correction for publication bias. This is important given the notorious "file-drawer" tendency to favor publishing studies reporting significant results.
Only one published review has examined effects of tDCS on language tasks in healthy participants, and it has not included naming tasks. Price, McAdams, Grossman, and Hamilton (2015) examined effects in verbal fluency (N = 6) and word learning (N = 2) and found a small anodal tDCS improvement in accuracy scores when all studies were pooled together, but also when analyses were limited to the four studies using offline stimulation (i.e., applied prior to task performance) or the three studies measuring offline effects in verbal fluency. Here as well, however, effects were small (<~.05), and depended largely on two studies with abnormally large effects (~.8; Flöel, Rösser, Michka, Knecht, & Breitenstein, 2008; ~1.2; Cattaneo, Pisoni, & Papagno, 2011). What is worse, the effect in one of these studies (i.e., Cattaneo et al., 2011) has not been replicated since (see Penolazzi, Pastore, & Mondini, 2013; Vannorsdall et al., 2016; but see Cattaneo et al., 2016 for response). Another review by Jacobson, Koslowsky, and Lavidor (2012) showed no cathodal-induced decrements for language studies (0 out of 5 studies), but significant anodal-induced improvements (7 out of 8 studies). This review, however, included both patient and control samples. Moreover, since the aim was comparing cathodal and anodal stimulation, for each study, only the most significant effect for either cathodal or anodal stimulation was included across conditions, a zero effect size was assigned to null outcomes, and any effect that contradicted an anodal-excitation/cathodal-inhibition outcome was excluded. In actuality, across the four studies investigating language production in healthy participants, only 3 out of 26 effects were significant.
Variation in tDCS outcomes may be due to methodological differences across studies, especially in terms of the parameters of the applied current (for further discussion, see Antal, Keeser, Priori, Padberg, & Nitsche, 2015; Horvath, Carter, & Forte, 2016; Nitsche, Bikson, & Bestmann, 2015), but also to interaction with ongoing cortical activity (see Miniussi, Harris, & Ruzzoli, 2013). Picture naming could be an important task to assess these interactions. Naming involves both the need for cortical excitation to allow retrieval of target representations and the need to curtail excitation of related words that may otherwise reach 'activation threshold' and be produced in error (for a similar argument, see Miniussi et al., 2013). Depending on the task, one can have a relatively greater need of activation/excitation versus selection/control. Therefore, instead of looking at an overall effect of tDCS, one can assess whether the increased excitability offered by tDCS is overall positive versus negative depending on the lexical mechanisms (activation vs selection) primarily required by the task. A crucial feature of our investigation will be to look at these potential differences.
The interplay of lexical activation and selection in word retrieval is well demonstrated with paradigms where the presence of semantically related words increases the need for mechanisms of selection and results in longer time/less accuracy in retrieving the target word. This so-called semantic interference effect is demonstrated when: a) naming pictures in the presence of semantically related versus unrelated words (picture-word interference; Abdel Rahman & Melinger, 2007; Belke & Stielow, 2013; Levelt, Roelofs, & Meyer, 1999; Mahon, Costa, Peterson, Vargas, & Caramazza, 2007), b) repeatedly naming sets of semantically related versus unrelated words (cyclic blocked picture naming; Belke, 2013; Belke & Stielow, 2013; Oppenheim, Dell, & Schwartz, 2010; Schnur, Schwartz, Brecher, & Hodgson, 2006), c) comparing naming of exemplars early in a sequence of related pictures, when interference is low, with naming of exemplars later in the sequence, when interference has built up (continuous naming paradigm; Belke, 2013; Belke & Stielow, 2013; Howard, Nickels, Coltheart, & Cole-Virtue, 2006). Effects in picture naming are sometimes compared with effects in reading with the expectation that difficulties with lexical-semantic selection will affect picture naming, but not reading, where targets are retrieved from an orthographic rather than a semantic specification (see Belke, 2008, 2013). One can put forward different hypotheses on how tDCS could modulate effects of semantic interference. One may assume that anodal tDCS, which increases excitability, will improve performance when retrieving words in neutral conditions, but will have more mixed effects when retrieving words in the face of competitors. In this context, effects can even be negative, because it is harder to select among highly activated competitors (i.e., interference effects will increase). Furthermore, these contrasting effects may depend on the site of stimulation.
It has been suggested that negative effects of anodal tDCS are more likely when applied to temporal areas, which are involved in lexical activation and retrieval (e.g., Indefrey & Levelt, 2004; Piai, Roelofs, Jensen, Schoffelen, & Bonnefond, 2014), while positive effects may be more likely when anodal tDCS is applied to the frontal lobe, which is involved in boosting mechanisms of control and selection (e.g., Hirshorn & Thompson-Schill, 2006; Novick, Trueswell, & Thompson-Schill, 2010; Scott & Wilshire, 2011). Note, however, that this further hypothesis depends on two controversial assumptions: 1. that effects of tDCS can be focal enough to target specifically one of two adjacent cortical areas (but see Datta et al., 2009); 2. that top-down frontal mechanisms contribute to lexical selection in addition to mechanisms of lateral inhibition intrinsic to the lexical module (see Hamilton & Martin, 2005 for a discussion). Pisoni, Papagno, and Cattaneo (2012) tested effects of tDCS on semantic interference using a cyclic blocked picture naming paradigm. As predicted, they found increased interference following stimulation of the temporal lobes, but decreased interference following anodal tDCS of the frontal lobe. Meinzer, Yetim, McMahon, and de Zubicaray (2016) and Wirth et al. (2011) also found decreased interference during frontal tDCS with the same paradigm. However, Meinzer et al. (2016) did not replicate the expected increased interference following temporal stimulation and Henseler, Mädebach, Kotz, and Jescheniak (2014) found no significant effect of either frontal or temporal stimulation with a picture-word interference paradigm. These findings, together with more general reviewed findings, point to the limited efficacy of single session tDCS to modulate cognition in healthy participants.
In our experimental study, we want to try to replicate these findings, but also explore reasons for variability by considering how tDCS effects may interact with individual differences in cortical excitability.
Participants are likely to differ in baseline levels of cortical excitability for a variety of factors (for extensive reviews, see Krause et al., 2013; Li, Uehara, & Hanakawa, 2015). If cognitive performance depends on an optimum level, with worse performance associated with either too low or too high excitability, then some individuals may show improvement after anodal tDCS, whilst others may show no effect or even worse performance depending on baseline levels. Individual variability in response to both TMS (Silvanto et al., 2008) and tDCS (López-Alonso et al., 2014; Wiethoff, Hamada, & Rothwell, 2014) has been demonstrated in the motor domain. López-Alonso et al. (2014), for example, reported that following tDCS more than half of participants showed no increase in TMS-elicited MEPs, but actually a slight decrease. There are also indications that tDCS effects may depend on baseline level of performance (Hsu, Tseng, Liang, Cheng, & Juan, 2014; Tseng et al., 2012). For example, Tseng et al. (2012) showed that anodal tDCS induced improvements in visual short-term memory and associated increases in event-related potentials (ERPs), but that both of these changes were limited to participants with initially poor performance. These individual sources of variability may compound task-mediated variability in producing variable tDCS outcomes.
In our experimental investigation, we will use naming and reading tasks to assess effects of tDCS both overall and, more specifically, on interference effects. We will use 'best practice' anodal stimulation protocols. With cyclic blocked picture naming, we will target frontal areas; with continuous naming, we will contrast stimulation of frontal and temporal areas. Frontal stimulation may be particularly helpful to reduce interference effects, boosting selection mechanisms which control the activation of potential competitors. Temporal stimulation, instead, may increase the activation of competing items, leading to even stronger interference.
In addition, we will consider the possibility of individual variation. Individuals with high baseline levels of excitability may be more likely to exceed an optimal level of activation, especially in naming conditions where a sequence of competitors increases overall activation levels. To evaluate potential effects of tDCS which may have a different sign (positive or negative) in different individuals, we will consider absolute (independent of sign) inter-session differences in an experimental group, where one session is carried out with real stimulation and one with sham stimulation. We will, then, compare these differences with absolute inter-session differences in a control group, where both sessions are carried out in neutral, no stimulation conditions. If tDCS has any effect, differences in the experimental group, due to tDCS, should be larger than differences in the control group, due to random variability between sessions.
Finally, we will also look at effects of tDCS depending on item variability. We will carry out so-called Vincentized analyses where the RTs of each participant are separated into different bins according to their relative speed (very slow, slow, fast, very fast; for a similar method, see Henseler et al., 2014) and then assess the effects of tDCS for each bin. RTs in the 'very slow' category may be particularly susceptible to modulation by tDCS (see also Ross, McCoy, Wolk, Coslett, & Olson, 2010).
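A minimal sketch of the binning step is given here for illustration, assuming four equal-sized bins formed from each participant's rank-ordered RTs. The function name `vincentize`, the bin boundaries and the RT values are hypothetical and are not taken from the analysis scripts used in the study.

```python
from statistics import mean

def vincentize(rts, n_bins=4):
    """Rank-order one participant's RTs and return the mean RT of each
    of n_bins equal-sized bins, fastest bin first (assumed binning rule)."""
    if len(rts) < n_bins:
        raise ValueError("need at least one RT per bin")
    ordered = sorted(rts)
    n = len(ordered)
    bin_means = []
    for b in range(n_bins):
        lo = b * n // n_bins          # first index of this bin
        hi = (b + 1) * n // n_bins    # one past the last index of this bin
        bin_means.append(mean(ordered[lo:hi]))
    return bin_means

# Hypothetical RTs (msec) for one participant in each condition
sham = [612, 655, 701, 720, 745, 790, 830, 905]
real = [600, 660, 690, 730, 750, 780, 845, 880]

labels = ["very fast", "fast", "slow", "very slow"]
sham_bins = vincentize(sham)
real_bins = vincentize(real)

# Real minus sham difference per bin; negative values mean faster naming under real tDCS
effects = {label: r - s for label, s, r in zip(labels, sham_bins, real_bins)}
```

In this sketch the per-bin differences would then be averaged across participants, so that an effect confined to the 'very slow' bin can be tested separately from the other bins.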

Method
2.1. Experiment 1: continuous picture naming and reading
Experiment 1 assessed effects of tDCS on picture naming by applying anodal tDCS to frontal (Experiment 1A and 1B) or temporal areas (Experiment 1C). Following Pisoni et al.'s (2012) logic, we expected frontal anodal tDCS to facilitate naming by boosting the ability to select the target word amongst competitors, but temporal stimulation to have possible negative consequences by increasing competition among related items. Differently from Pisoni et al. (2012), however, we used a continuous naming task where participants are presented with sequences of semantically related pictures, but are generally not aware of relationships between pictures because items belonging to the same semantic category are intermixed with distractors. This makes the disruptive effect of competitors less susceptible to strategic control. A reliable increase of RTs for every new item belonging to the same category in a sequence has been shown across studies (with increases of as much as 30 msec for every additional picture; e.g., Belke, 2013; Belke & Stielow, 2013; Howard et al., 2006).
We paired picture naming tasks with corresponding reading tasks to see whether interference effects were specific to the semantic domain and to test more general facilitation effects in word production. If tDCS selectively modulates interference effects in picture naming, with no interference effects in reading, this will show that there are specific effects of tDCS on lexical-semantic control.
2.1.1. Experiment 1A
2.1.1.1. TASKS. Participants carried out word reading and picture naming tasks, with picture names corresponding to the words used in reading. Stimuli were presented one by one on a computer screen, and participants named stimuli as fast and as accurately as possible. In both tasks, the experimental pictures/words belonged to sets of semantically related items, with related items being separated by a variable number of unrelated items. We measured general speed and accuracy of performance, but also accumulation of semantic interference effects across sets of related pictures.
2.1.1.2. DESIGN. Each participant carried out both tasks in each of two testing sessions, scheduled one week apart and involving parallel versions of the same tasks. In the experimental group, sham stimulation was applied in one session and real stimulation in the other. In the control group, no stimulation was applied in either session. Reading was always done first in order to prime and, therefore, facilitate retrieval of picture names. The order of real and sham stimulation sessions, and which particular version of the task was paired with each session, was counterbalanced across participants. Reading lasted for 5–6 min and picture naming for 9–10 min. Stimulation covered all testing times. It started at the beginning of the reading task, and was applied continuously with no gap when the task was changed.
2.1.1.3. STIMULI. 165 colored pictures (720 × 540 pixel dimensions) were taken from a variety of sources, and the same number of corresponding words made up the stimuli. 120 stimuli were experimental and 45 were "fillers". Experimental stimuli were drawn from 24 semantic categories, with 5 members to each category (for a listing see Appendix A). Presentation of stimuli followed Howard et al. (2006): the first and last five items were filler items; pictures from the same category were presented in a sequence that separated category members by 2, 4, 6, or 8 items composed of fillers or pictures from other categories; each of the 24 categories used a different sequence of lags. The parallel versions of the tasks included the same categories, but different items. To make sure that positional effects were not confounded with other variables, items in different positions were carefully matched for typical age of acquisition (Kuperman, Stadthagen-Gonzalez, & Brysbaert, 2012), frequency (based on CELEX Database; Baayen, Piepenbrock, & Gulikers, 1995), word length and name agreement¹. These variables were also matched across the two versions of the task (Appendix B).
2.1.1.4. TASK PROCEDURE. Participants were verbally instructed to read or name the stimuli as fast and as accurately as possible, and to use sub-ordinate nouns (e.g., correct responses to water-lily could be "water-lily" or "lily" but not "flower"). A practice task familiarized participants with the voice key.
Each naming/reading trial started with the presentation of a fixation cross for 1000 msec followed by a blank screen for 250 msec. Stimuli were then presented centered, for 2500 msec or until the participant made a response. A blank screen followed for 500 msec before the next trial started. Stimuli were presented using E-Prime 2 Software and a Dell Laptop computer screen (screen size: 15.6″). Words were presented in 24-point Arial typeface. Vocal responses were recorded using a Sony ICDPX333.CE7 voice recorder. The voice key was a serial response box (Refresher Detector System, Psychology Software Tools, Inc.). The microphone was a Sony ECM-MS957.
2.1.1.5. TDCS. tDCS was administered using a battery-driven NeuroConn DC-Stimulator via a pair of saline-soaked sponges. Stimulation was administered using a double-blind procedure, whereby both the experimenter and the participant were unaware of the type of stimulation administered in a given session. For sham stimulation, an intermittent current of 110 μA was delivered for a period of 3 msec every 550 msec. This produces the perceptual sensations of real stimulation without modulating underlying brain areas (Palm et al., 2013). For real stimulation, a constant current of 1 mA was administered for 15 min with a ramp up and ramp down of 30 sec to reduce discomfort and perceptual differences with sham stimulation. The active electrode (9 cm²; current density = .11 mA/cm²) was placed over the left inferior frontal gyrus (LIFG) whilst the reference electrode (35 cm²) was placed over the contralateral supraorbital area. The LIFG was located by measuring 2 cm from the corner of the eye towards the preauricular point of the left ear then 3 cm upwards perpendicular from this measurement, which corresponds to F7 using the electroencephalogram (EEG) 10/20 position system (Devlin & Watkins, 2007). At the end of each session, participants completed a feedback questionnaire (see Fertonani, Rosini, Cotelli, Maria, & Miniussi, 2010) to assess the effectiveness of stimulation blinding.
2.1.1.6. PARTICIPANTS. Fifty undergraduate students from Aston University participated for course credits or financial reimbursement, and were assigned to the experimental or control group in a semi-random fashion. Two participants in each of the experimental and control groups failed to attend the second session due to other commitments. This left eighteen participants (10 female; 21 ± 2.76) in the experimental group and twenty-eight participants (17 female; 23 ± 2.52) in the control group. All participants were right-handed and native English speakers. We excluded volunteers with language impairments, history of migraine, headaches (frequent or severe), skin disorders (e.g., eczema), any adverse experience to previous tDCS, any history of epilepsy or stroke, head/metal implants, any neurological disorders, and any volunteers who had participated in a tDCS or TMS study in the 6 months prior to the current study.

Experiment 1B
As shown later, Experiment 1A returned no evidence of tDCS effects. Therefore, we changed the stimulation protocol to increase the chances of positive effects as detailed below. In all other methodological aspects, Experiment 1B was the same as Experiment 1A.
¹ Fifteen undergraduate students were shown the 165 pictures and were asked to name each picture. The experiment was self-paced. Name agreement was measured in terms of the number of different names given to each picture. For example, low name agreement would mean relatively more alternatives, and vice versa.
Cortex 86 (2017) 64–82
2.1.2.1. STIMULI. In Experiment 1A, the order of stimuli was the same for each participant. In Experiment 1B, we created 24 different stimuli orders for each of the two matched versions of the naming (and reading) task, with a different sequence of lags for the different semantic categories, but most importantly with a different set of items in the five positions. Each participant was administered one of these 24 versions (for a similar procedure, see Howard et al., 2006). This was to ensure better counterbalancing of items across positions.
2.1.2.2. PROCEDURE. The order of reading and naming tasks was counterbalanced across participants instead of reading always coming first.
2.1.2.3. TDCS. We increased the intensity of the current from 1 mA to 1.5 mA, and increased the size of the active electrode from 9 to 25 cm². These changes were made to reduce current density (i.e., .06 mA/cm² instead of .11 mA/cm²); larger electrodes may make the current more uniform and increase cortical excitation (Miranda, Lomarev, & Hallett, 2006). Stimulation duration was increased by 10 min (total stimulation duration now 25 min), with a 5 min delay added between the onset of stimulation and the experimental tasks (during which participants read the instructions again from the computer screen) to ensure tDCS effects were fully engaged at task initiation (see Nitsche & Paulus, 2000; Nitsche et al., 2008; Price et al., 2015). We also added 5 min at the end to ensure that both tasks were covered by stimulation. Two participants in Experiment 1A had completed naming slightly after stimulation offset (these participants were, in any case, excluded from analysis because they failed to show up to the second session).
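The current densities quoted for the two protocols follow from dividing the applied current by the area of the active electrode. The short sketch below reproduces that arithmetic; the function name `current_density` is purely illustrative.

```python
def current_density(current_ma, electrode_area_cm2):
    """Current density (mA/cm^2) under an electrode of the given area."""
    return current_ma / electrode_area_cm2

# Experiment 1A: 1 mA through a 9 cm^2 active electrode
density_1a = current_density(1.0, 9.0)

# Experiment 1B: 1.5 mA through a 25 cm^2 active electrode
density_1b = current_density(1.5, 25.0)

print(f"Exp 1A: {density_1a:.2f} mA/cm^2")
print(f"Exp 1B: {density_1b:.2f} mA/cm^2")
```

Although the total current was raised by half, the larger electrode area means that the density at the scalp is roughly halved.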
2.1.2.4. PARTICIPANTS. Thirty-nine undergraduate students from Aston University participated for course credits or financial reimbursement. Data from four participants in the experimental group were lost due to a technical problem. Thus, the final experimental group included twenty participants (12 female; 21 ± 2.92) and the control group twenty-five participants (13 female; 21 ± 3.73).

Experiment 1C
In Experiment 1C, we assessed whether contrasting effects of tDCS would be found with temporal lobe stimulation. In all methodological details, bar those reported below, Experiment 1C was the same as Experiment 1B.
2.1.3.1. TDCS. The active electrode (25 cm²) was placed over the left mid-posterior temporal lobe area (pMTG) whilst the reference (35 cm²) was placed over the contralateral cheek. The pMTG was determined to be at the halfway point between T3 and T5 using the 10–20 International EEG system. We used the contralateral cheek for the reference electrode as it was speculated that by doing so we can avoid current flow through frontal areas, thereby avoiding the difficulty in localizing possible behavioral effects.

2.2. Experiment 2: cyclic blocked picture naming
In Experiment 2, we tested the effects of tDCS on cyclic blocked picture naming. This paradigm has been extensively studied (for a review, see Belke & Stielow, 2013), and positive effects of tDCS have been reported (Meinzer et al., 2016; Pisoni et al., 2012; Wirth et al., 2011). In this paradigm, participants are asked to repeatedly name sets of pictures that are either semantically related or unrelated. There is, initially, a marked facilitation, with reaction times falling in cycle 2 relative to cycle 1, due to practice. The facilitation continues in subsequent cycles, but the magnitude of this facilitation is reduced for sets of semantically related pictures, due to increased interference amongst competitors which counters facilitation effects. Even more than the previous continuous naming task, this task taps into the ability to select between a set of highly activated lexical representations, because the same small set of pictures is presented repeatedly over a number of cycles. Consistent with this view, imaging evidence shows increased prefrontal activity, presumably linked to the effort for selection, during cyclic blocked picture naming (Schnur et al., 2006), and improvement during anodal tDCS stimulation is associated with increased activity in frontal areas (Wirth et al., 2011).

Task
Participants named sets of six pictures as fast and as accurately as possible, with pictures presented one at a time and each set presented four times in a row (four cycles). We measured general naming speed and accuracy, and semantic interference as it builds up across repeated cycles.

Design
Participants carried out two testing sessions in different stimulation conditions (real or sham), one week apart, with parallel sets of materials. The order of real and sham stimulation, and the task version coupled with each type of stimulation, were counterbalanced across participants. The task lasted for roughly 20 min. Stimulation began five minutes before participants initiated the task and lasted the entirety of the task. During the 5 min delay, participants read task instructions via a computer screen.

Stimuli
72 black and white line drawings were taken from the Snodgrass and Vanderwart (1980) set. Pictures were grouped into 12 sets of six pictures: half the sets included semantically related pictures, the other half included semantically unrelated pictures created by selecting one member from related sets (see Appendix C for a listing). Pictures were presented in 4 cycles in different quasi-random orders (i.e., each picture occupied a different ordinal position across the 4 cycles, and the last item of a cycle and the first of the following cycle were never the same). The related/unrelated blocks were also alternated in a quasi-random order to ensure that no more than two blocks of the same type were shown consecutively. The order of stimulus presentation was the same for all participants. The two versions of the tasks included different semantic categories and different items. Items in the two versions were carefully matched for age of acquisition, frequency (based on CELEX Database; Baayen et al., 1995), word length and name agreement (based on H statistic from Snodgrass & Vanderwart, 1980; see Appendix D).

Procedure
Participants were given the same instructions as in Experiment 1. Additionally, they were familiarized with the pictures before beginning the experiment. They were first presented with each picture with its name written below, and then with the pictures on their own and asked to name them. An accuracy score of 90% or more was needed to progress to the main experiment.
In the main experiment, each naming block began with a "Get Ready …" message for 4000 msec, followed by a blank screen for 1000 msec and then a fixation cross for 1000 msec. The picture was then presented and remained on the screen until the participant gave his or her naming response. The end of each block of pictures was followed by a blank screen for 1000 msec, and by an "End of block …" message which requested the participant to "Press any button" to start the next block. Stimuli were presented using E-Prime 2 Software. Vocal responses were recorded using a TASCAM DR-680 digital voice recorder with a Rode NTG 2 Condenser Shotgun Microphone. Vocal response times were measured using a Cedrus SV-1 voice key.

tDCS
The stimulation protocol matched Experiment 1B in every way except that stimulation was administered using a battery-driven Eldith DC-Stimulator (functionally equivalent to the NeuroConn DC-Stimulator).

Participants
Thirty-two undergraduate students from University of Birmingham participated for course credits or for financial reimbursement. A technical error meant that data from three participants in the experimental group had to be excluded, leaving seventeen participants (12 female; 21 ± 2.40) in the experimental group and thirteen participants (7 female; 22 ± 1.76) in the control group.

Ethical approval
Our experimental investigation was approved by The Ministry of Defense Research Ethics Committee, by the Aston Research Ethics Committee and by the University of Birmingham Ethics Committee. All participants gave written informed consent prior to any testing session.

Scoring
Response accuracy was scored after each testing session. Only near-synonyms (e.g., "Hoover" instead of "vacuum") were allowed as correct; any other response was scored as incorrect. Incorrect responses were excluded from RT analysis, as were RTs below 250 msec and RTs more than 2.5 standard deviations above the participant mean. For picture naming, we analyzed percentage error rates and RTs. Error rates were not analyzed for word reading and cyclic blocked naming tasks because they were very low (<5% and <7%, respectively).
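The exclusion rules above can be sketched as follows. This is an illustration only: it assumes that the participant mean and standard deviation are computed over correct responses after the 250 msec floor has been applied, and that only the upper tail is trimmed. The function name `trim_rts` and the trial data are hypothetical.

```python
from statistics import mean, stdev

def trim_rts(trials, floor_ms=250, sd_cutoff=2.5):
    """Return the RTs retained for analysis for one participant.

    trials: list of (rt_ms, correct) tuples.
    Assumed order of operations: drop errors, drop RTs below the floor,
    then drop RTs above mean + sd_cutoff * SD of the remaining RTs.
    """
    correct = [rt for rt, ok in trials if ok]
    plausible = [rt for rt in correct if rt >= floor_ms]
    upper = mean(plausible) + sd_cutoff * stdev(plausible)
    return [rt for rt in plausible if rt <= upper]

# Hypothetical trials: one error (900), one anticipation (180), one very slow response (2400)
trials = [
    (650, True), (660, True), (180, True), (670, True), (680, True),
    (690, True), (900, False), (700, True), (710, True), (720, True),
    (730, True), (740, True), (750, True), (2400, True),
]

kept = trim_rts(trials)
```

With these made-up values the error trial, the 180 msec response and the 2400 msec response are removed, leaving eleven RTs for analysis.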

Data re-sampling
In the experimental groups, the order of stimulation (i.e., Sham vs Real) and the set of stimuli (i.e., A vs B) were counterbalanced. So, in the first session, half of the participants received sham whilst the other half received real stimulation, and half of the participants that received either type of stimulation saw stimuli set A whilst the other half saw set B. In the control group, where stimulation was not applied, half of the participants saw set A in the first session and B in the second, and vice versa for the other half. To make results from the control group comparable with results from the experimental group, we resampled control data to create two pseudo datasets for sessions 1 and 2, so-called pseudo-sham and pseudo-real, so that the order of presentation (session 1 vs 2) and stimulus set (A vs B) were also counterbalanced across these two datasets.
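One way such a relabeling could be implemented is sketched below. The exact resampling procedure is not specified here, so the alternating assignment rule, the function name `assign_pseudo_conditions` and the participant identifiers are all assumptions made for illustration.

```python
def assign_pseudo_conditions(participants):
    """Label each control participant's two sessions as pseudo-sham or pseudo-real.

    participants: list of (participant_id, first_session_set), set is 'A' or 'B'.
    Assumed rule: within each first-session-set subgroup, alternate which
    session is called pseudo-sham, so that session order and stimulus set
    are both balanced across the two pseudo conditions.
    """
    labels = {}
    seen = {"A": 0, "B": 0}
    for pid, first_set in participants:
        second_set = "B" if first_set == "A" else "A"
        sham_session = 1 if seen[first_set] % 2 == 0 else 2
        seen[first_set] += 1
        labels[pid] = {
            "pseudo_sham_session": sham_session,
            "pseudo_real_session": 3 - sham_session,
            "pseudo_sham_set": first_set if sham_session == 1 else second_set,
        }
    return labels

# Hypothetical control group: four participants saw set A first, four saw set B first
participants = [("c1", "A"), ("c2", "A"), ("c3", "A"), ("c4", "A"),
                ("c5", "B"), ("c6", "B"), ("c7", "B"), ("c8", "B")]
labels = assign_pseudo_conditions(participants)

sham_first = sum(1 for v in labels.values() if v["pseudo_sham_session"] == 1)
sham_set_a = sum(1 for v in labels.values() if v["pseudo_sham_set"] == "A")
```

In this example, pseudo-sham corresponds to session 1 for half of the participants and to stimulus set A for half of the participants, mirroring the counterbalancing of the experimental group.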

Data analysis
Data were analyzed with repeated factor ANOVAs (analyses of variance) to assess the effect of condition in the experimental (Real tDCS vs Sham) and control (Pseudo-Real vs Pseudo-Sham) groups separately. In addition, we ran mixed factor ANOVAs, which combined data from both groups and treated group as a between-participants factor. This provided a more rigorous test. If tDCS were to have an effect, we expected an interaction between condition and participant group, because the experimental group would show a significantly larger effect of condition than the control group, where stimulation was not applied. For these analyses, we report only the condition by group interactions, since the main effect of condition is irrelevant.
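For the simplest case of two groups and two conditions, the condition by group interaction is equivalent to an independent-samples t-test on per-participant difference scores (real minus sham), with F(1, N − 2) = t². The sketch below illustrates this equivalence under the assumption of equal variances; it is not the analysis code used in the study, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def condition_by_group_interaction(exp_real, exp_sham, ctl_real, ctl_sham):
    """Return (t, F, df) for the group x condition interaction in a 2 x 2 design.

    Each argument is a list of per-participant means (e.g., RT in msec);
    real/sham lists are paired by participant within each group.
    """
    d_exp = [r - s for r, s in zip(exp_real, exp_sham)]
    d_ctl = [r - s for r, s in zip(ctl_real, ctl_sham)]
    n1, n2 = len(d_exp), len(d_ctl)
    df = n1 + n2 - 2
    # Pooled variance of the difference scores (equal-variance assumption).
    pooled = ((n1 - 1) * variance(d_exp) + (n2 - 1) * variance(d_ctl)) / df
    t = (mean(d_exp) - mean(d_ctl)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, t * t, df
```

A large F here would indicate that the real versus sham difference in the experimental group exceeds the pseudo-real versus pseudo-sham difference in the control group.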

tDCS feedback questionnaire
Participants tolerated stimulation well. None reported adverse effects nor withdrew from the study because of stimulation.

Overall effects of tDCS
Effects of stimulation across tasks, experiments and participant groups are shown in Fig. 1. We carried out individual one- [...] These results show no systematic effects of tDCS. There were some significant differences between the experimental and control groups. The experimental group was faster in naming, but slower in reading, than the control group. It is possible that stimulation (both real and sham) modulates the overall level of performance, but more detailed interpretations are difficult.

Interaction with cortical loci of stimulation
To test for a possible interaction between stimulation site and tDCS, we conducted a mixed factor ANOVA for the experimental group only, with Site (Temporal vs Frontal) as a between-participants factor and Condition (Real vs Sham) as a within-participants factor. We report here only Experiments 1B and 1C, which used exactly the same paradigm.

Direction-neutral effects of stimulation
Here, we considered tDCS effects when allowing for possible opposite outcomes across participants. We found that both participant groups were equally likely to improve or worsen performance relative to sham (or pseudo-sham), with both [...] We also compared absolute differences between conditions in the experimental and control groups via a series of Mann–Whitney U tests (as values were non-normally distributed). Results are shown in Fig. 2. Overall, for picture naming RTs, the difference between conditions was smaller in the experimental group than in the control group (M ± SE: 56 ± 6 vs 64 ± 7 msec). This was the opposite of what was expected. It could be that stimulation (both real and sham) reduces variability by increasing arousal and/or motivation. It should be noted, however, that this effect was not seen in naming errors (5 ± .4 vs 5 ± 1%) or reading RTs (37 ± 5 vs 36 ± 5 msec).
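The direction-neutral comparison can be sketched as follows: take the absolute real minus sham difference for each participant, then compare the two groups with a Mann–Whitney U statistic. This minimal illustration computes only the U statistics (no p-value); the function names are hypothetical and the code is not taken from the study.

```python
def abs_condition_diffs(real, sham):
    """Absolute per-participant difference between two paired conditions."""
    return [abs(r - s) for r, s in zip(real, sham)]


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples."""
    ranks = rank_with_ties(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y
```

Here `x` would hold the absolute differences for the experimental group and `y` those for the control group; the smaller of the two U values is the one usually referred to tables or to a normal approximation.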

3.3. Effects of tDCS on semantic interference

Cumulative interference
Performance across ordinal positions within sets of related items is shown in Fig. 3. Across participant groups, tasks and conditions, our behavioral manipulation worked well. Picture naming shows a steady increase in latencies across positions; errors show either an increasing trend or no effect. Reading shows no systematic effect of position. Crucially, however, there are no detectable effects of tDCS; that is, the increase in RTs with ordinal position was equivalent with or without tDCS. Numerically, performance was faster with real tDCS than with sham in reading in Experiment 1A (with a slight increase across positions similar to picture naming), but this difference is not significant (see below) and is the opposite of what was seen in Experiment 1B. We carried out separate repeated factor ANOVAs for each task, experiment and participant group, with Ordinal Positions (1–5) and

Interference by relatedness and cycle
Results for Experiment 2 are shown in Fig. 4. As expected, semantic relatedness interacted with cycle to modulate performance. For unrelated picture sets, participants became progressively faster with every repetition (or cycle), whilst, for related sets, naming latencies flattened after initial facilitation between the first and the second cycle. This pattern was produced by both the experimental and control groups, and replicates what is typically found with this paradigm (Belke, 2013; Belke & Stielow, 2013). We carried out a mixed factor ANOVA, with Group as a between-participants factor and Relatedness, Cycle and Condition (Real vs Sham for the experimental group; Pseudo-Real vs Pseudo-Sham for the control group) as within-participants factors. There was a main effect of Relatedness, because related sets were slower than unrelated sets [F(1,28) = 14.

Aggregated interference
Here, we considered whether tDCS effects are detectable when interference effects are aggregated across conditions. For Experiments 1A–C, we considered the difference in RTs between items in positions 4–5 and items in positions 1–2. For Experiment 2, we considered the difference between related and unrelated sets at cycle 4 (where the difference should be positive, with related sets being slower) and at cycle 1 (where the difference should be negative, with related sets being faster). Aggregated interference effects across experiments, groups and conditions are presented in Fig. 5. tDCS clearly had no consistent effect. In the experimental group, interference was larger with tDCS in Experiments 1A and 2, but the opposite was found in Experiments 1B and 1C. We carried out separate one-way ANOVAs for each experiment and participant group, with aggregate interference as the dependent measure and Condition as a within-participants factor. The results showed no significant main effect of Condition (Experimental group: F < 3.30, p > .09, ηp² < .17; Control group: F < 1.04, p > .32, ηp² < .04). We also carried out a mixed factor ANOVA with Group as a between-participants factor and Condition as a within-participants factor. Crucially, there was no Group × Condition
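The two aggregate interference measures defined in this section can be written out as a short sketch. The dictionary-based data layout and the function names are assumptions made for illustration; only the arithmetic follows the definitions in the text.

```python
from statistics import mean


def ordinal_interference(rt_by_position):
    """Experiments 1A-C: mean RT at positions 4-5 minus mean RT at positions 1-2.

    rt_by_position maps ordinal position (1..5) to a mean RT in msec.
    """
    late = mean([rt_by_position[4], rt_by_position[5]])
    early = mean([rt_by_position[1], rt_by_position[2]])
    return late - early


def cyclic_interference(related, unrelated):
    """Experiment 2: (related - unrelated at cycle 4) minus the same at cycle 1.

    related and unrelated map cycle number to a mean RT in msec.
    """
    return (related[4] - unrelated[4]) - (related[1] - unrelated[1])
```

Each function would be applied once per participant and condition (real, sham, pseudo-real or pseudo-sham), yielding the single score per cell that enters the ANOVAs.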

Interaction with cortical loci of stimulation
Given the possibility that tDCS could reduce the semantic interference effect with frontal stimulation but increase it with temporal stimulation, we carried out a mixed factor ANOVA with aggregate interference as the dependent measure, Site (Frontal stimulation, Exp 1B vs Temporal stimulation, Exp 1C) as a between-participants factor and Condition as a within-participants factor. Again, there was no main effect of Condition

Direction-neutral effects of stimulation
Here, we compared absolute differences in interference across stimulation conditions in the experimental and control groups. Results are shown in Fig. 6. Mann–Whitney U tests showed that interference effects changed more across conditions in the experimental than in the control group in Experiment 2, but not in any other experiment, and effects were numerically in opposite directions in Experiments 1B and 1C.

Effect of stimulation by magnitude of interference
To assess whether tDCS effects depended on the level of semantic interference, we grouped experimental participants into those who showed higher versus lower levels of semantic interference. We collapsed picture-naming data across all experiments and conducted a median split on the size of semantic interference across both the tDCS and sham conditions. Fig. 7
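A median split of this kind can be sketched as below. The input is assumed to be one interference score per participant, averaged over the tDCS and sham conditions; the handling of scores exactly at the median (assigned here to the lower group) is an assumption, as the text does not say how ties were treated.

```python
from statistics import median


def median_split(interference_by_participant):
    """Split participants into higher- and lower-interference groups.

    interference_by_participant maps participant id to an interference
    score in msec. Scores equal to the median go to the lower group.
    """
    cutoff = median(interference_by_participant.values())
    higher = sorted(p for p, v in interference_by_participant.items() if v > cutoff)
    lower = sorted(p for p, v in interference_by_participant.items() if v <= cutoff)
    return higher, lower
```

The effect of Condition can then be examined separately within each subgroup.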

Effects of stimulation by item difficulty
We assessed whether tDCS effects were limited to items that recruited greater cognitive resources by running a so-called Vincentisation analysis. For each task (reading and picture naming), we ranked each participant's RTs within each ordinal position (Experiment 1) or cycle (Experiment 2), and then placed the RTs into four bins according to speed (e.g., very slow, slow, fast, very fast), each containing 25% of the data. This was done separately for each condition (i.e., Real and Sham; Pseudo-Real and Pseudo-Sham). Results in Fig. 8 show that conditions in the experimental and control groups did not systematically differ depending on speed bin. We carried out separate mixed factor ANOVAs for each experiment, with Group (Experimental vs Control) as a between-participants factor and Speed Bin (1, 2, 3, 4) and Condition (Sham vs Real for the experimental group; Pseudo-Sham vs Pseudo-Real for the control group) as within-participants factors. Effects of speed bin are expected and not of interest. Crucially, there was no significant Speed Bin × Group × Condition interaction for picture naming RTs
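The binning step of a Vincentisation analysis can be sketched as follows: sort one participant's RTs for a given cell (condition by ordinal position or cycle), cut them into four equal-sized speed bins and average within each bin. The function name and the fastest-first ordering of the bins are assumptions made for illustration.

```python
from statistics import mean


def vincentise(rts, n_bins=4):
    """Return the mean RT of each of n_bins equal-sized speed bins.

    rts: one participant's RTs (msec) for one condition and position/cycle.
    Bins are ordered from fastest to slowest; when len(rts) is not a
    multiple of n_bins, bin sizes differ by at most one.
    """
    if len(rts) < n_bins:
        raise ValueError("need at least one RT per bin")
    ordered = sorted(rts)
    n = len(ordered)
    bin_means = []
    for b in range(n_bins):
        start = b * n // n_bins
        stop = (b + 1) * n // n_bins
        bin_means.append(mean(ordered[start:stop]))
    return bin_means
```

The resulting bin means, computed separately for each condition, are the values that would enter the Speed Bin × Group × Condition analysis.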

Fig. 5 – Semantic interference effect averaged across conditions. For Experiment 1, interference was measured as the difference between the last two and the first two ordinal positions; for Experiment 2, interference was measured as the difference between related and unrelated blocks at cycle 4 versus cycle 1; i.e., (related − unrelated at cycle 4) minus (related − unrelated at cycle 1).

General discussion
In the Introduction, we outlined how recent reviews have reported effects of tDCS to be small, inconsistent and not significant when averaged across studies (e.g., Horvath et al., 2015a). Our experimental investigation aimed to provide further evidence on whether tDCS can modulate language processing in normal healthy participants. We carried out four studies with different groups of participants, which employed tasks typically used to probe lexical access and word production (namely, picture naming and word reading) and used stimulation protocols typical of studies reporting positive effects (e.g., 1–1.5 mA of anodal stimulation to frontal and temporal areas for 15–25 min during task performance). We made particular efforts to assess whether potential null effects could be masked by variability in the net outcome of tDCS depending on individual baseline levels of cortical excitability and task requirements. We maximized our chances of demonstrating a possible reversal of the advantages generally predicted for language tasks with anodal tDCS of left-hemisphere areas by: 1. Considering task conditions affording a high level of competition from semantically related items, that is, comparing tDCS effects on sets of related versus unrelated items; 2. Considering individual variability in the net outcome of tDCS, that is, assessing whether, with the same task, some participants may show significant facilitation and others significant worsening of performance; 3. Contrasting activation of different areas, with the hypothesis that frontal stimulation may boost selection mechanisms, thus reducing interference, while temporal activation may boost lexical activation, thus increasing interference; 4. Considering preferential effects for participants who demonstrated high semantic interference; 5. Considering possible enhanced/reduced effects of tDCS on difficult-to-name items. Despite our best efforts, we found no evidence of performance modulation due to tDCS.
Our results contribute to growing doubts surrounding the reliability of tDCS applied within one stimulation session as a tool to modulate cognition in populations of neurologically intact participants. The effects of tDCS on semantic interference are particularly representative. With temporal stimulation, one study found reduced interference (Meinzer et al., 2016), one found enhanced interference (Pisoni et al., 2012) and two found no effect (our own and Henseler et al., 2014). With frontal stimulation, three studies found reduced interference (Meinzer et al., 2016; Pisoni et al., 2012; Wirth et al., 2011), but two others found no effect with the same paradigm (our own study) or with a different paradigm (Henseler et al., 2014). Why these differences? A close consideration of the tDCS paradigms employed by these studies does not reveal any clear difference that may be responsible for the different outcomes. The three studies that found a reduction of interference effects after frontal stimulation used parameters in the range covered by our experiments. Like us, they stimulated the left inferior frontal gyrus; placed the reference electrode on the contralateral supraorbital area; used a current density in a similar range (.029, .057 and .080 mA/cm²; ours .11–.06 mA/cm²); used a reference electrode of similar size (35–100 cm²; ours 35 cm²) and an active electrode of similar size (25–35 cm²; ours 9–25 cm²); and administered the current for a similar duration (20–25 min; ours 15–25 min). Of course, one may always argue that we did not use the right combination of parameters. However, the lack of empirical evidence, together with the lack of any appropriate mechanistic model that can provide specific predictions, means that we are in the dark when searching for the right parameter combination (for a discussion, see de Berker, Bikson, & Bestmann, 2013; Horvath et al., 2016).
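Current density is the stimulation current divided by the area of the active electrode. The short sketch below works through that arithmetic. The pairing of currents and electrode sizes in the usage note (1 mA with 9 cm², 1.5 mA with 25 cm²) is an assumption chosen because it reproduces the reported range of .11–.06 mA/cm²; the text itself does not state which current was used with which electrode.

```python
def current_density(current_ma, electrode_area_cm2):
    """Current density in mA/cm^2 for a given current and electrode area."""
    if electrode_area_cm2 <= 0:
        raise ValueError("electrode area must be positive")
    return current_ma / electrode_area_cm2
```

Under the assumed pairings, `current_density(1.0, 9.0)` rounds to .11 mA/cm² and `current_density(1.5, 25.0)` gives .06 mA/cm².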
Another possible explanation for our null effects is, of course, lack of power. Our total samples of 56 and 73 participants for reading and naming, respectively, gave us good power to detect medium (.5) or strong (.8) effects of tDCS (1−β > .96) for both. However, the power to detect a small effect of tDCS (effect size = .25, α = .05) was limited even with a within-participants design like ours (1−β = .45 and .56 for reading and naming). To prove or disprove a small effect of tDCS with strong statistical power would have required a sample of 128 participants (effect size = .25, 1−β = .8, α = .05). This is inconsistent with standards in the field. Most published studies report samples of between 10 and 25 participants (see Horvath et al., 2015a; Price et al., 2015; Tremblay et al., 2014). One may want to encourage studies with many more participants, but the fact remains that if effects of tDCS are so small, tDCS is not a tool fit for purpose in the way it is currently employed for modulation of normal cognition. Meta-analyses are of course one way to tackle the issue of small sample sizes. In a review of studies assessing effects of tDCS in reading and picture naming, we pooled studies using a protocol similar to that of the present study (i.e., left anodal tDCS applied to the frontal/temporal lobes) and included the present study. This gave a total sample size of roughly 200 participants. Even with this sample size, we found no evidence of a tDCS effect (see Westwood & Romani, in preparation).
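The order of magnitude of these power figures can be checked with a rough calculation. The sketch below assumes a two-tailed within-participants comparison, a standardized effect size defined on the difference scores, and a normal approximation to the t distribution; the text does not state which procedure or software was used, so the results are close to, but not identical with, the reported values.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def approx_power(effect_size, n, alpha=0.05):
    """Approximate two-tailed power for a within-participants comparison."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n)
    # Probability of exceeding either critical value under the alternative.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def approx_required_n(effect_size, power=0.8, alpha=0.05):
    """Approximate sample size needed to reach the requested power."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(((z_crit + z_power) / effect_size) ** 2)
```

For an effect size of .25, `approx_power` gives values a little under .5 for n = 56 and a little over .5 for n = 73, and `approx_required_n` gives a sample size in the mid-120s; a t-based calculation yields slightly larger sample sizes than this approximation.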
It is possible that future studies will elucidate conditions where single-session tDCS is efficacious even in healthy participants. It is also possible, however, that cortical excitability in healthy brains is already close enough to an optimal level that it cannot be bettered and/or that homeostatic mechanisms come into play to reduce excessive levels of activation, thus nullifying any effect of tDCS (Krause & Cohen Kadosh, 2014). Instead, effects of tDCS may only be reliable in neurologically damaged participants, where targeted regions may have a pathologically reduced level of excitability (for a review, see Silvanto et al., 2008). A recent review of the literature on post-stroke aphasia, covering twelve studies (de Aguiar et al., 2015), indicated a general benefit of tDCS across language tasks and types of therapy with varied stimulation protocols. The results showing improvements in picture naming are particularly relevant here (see Fiori et al., 2011; Flöel et al., 2011; Kang, Kim, Sohn, Cohen, & Paik, 2011; Lee, Cheon, Yoon, Chang, & Kim, 2013; Marangolo et al., 2013; Saidmanesh, Pouretemad, Amini, Nilipor, & Ekhtiari, 2012; but see also Monti et al., 2008).
Alternatively, positive results may be dependent on the dose of stimulation (see Meinzer et al., 2014). Positive results with aphasic participants are obtained when tDCS is administered in conjunction with naming once or twice a week for a number of weeks (sessions ranging from 5 to 10). It is possible, therefore, that the key to positive effects of tDCS is not whether the treated population is healthy or impaired, but the stimulation dose and/or repeated application across a number of sessions. It is also possible that positive effects are more likely in tasks that require novel cognitive operations, which are less established in the brain, such as during the acquisition of new processes or representations. Novel operations may be easier to manipulate than operations that are already well established, such as naming common items (for a similar argument, see Jacobson et al., 2012). It has been shown that tDCS can modify synaptic plasticity by modulating levels of glutamate, GABA, and other neurotransmitters (e.g., dopamine, serotonin, acetylcholine; for extensive reviews, see Medeiros et al., 2012; Stagg & Nitsche, 2011). This may permit modulation of learning. Indeed, a number of studies have shown enhanced learning following repeated stimulation even in normal participants (Cohen Kadosh, Soskic, Iuculano, Kanai, & Walsh, 2010; Dockery, Hueckel-Weng, Birbaumer, & Plewnia, 2009; Meinzer et al., 2014; Reis et al., 2009). Flöel et al. (2008) reported enhanced novel word learning even after a single stimulation session, although the effect vanished after one week.

Conclusions
The bias to publish significant results, combined with a lack of appetite for replication (see Open Science Collaboration, 2015; Vannorsdall et al., 2016), may have given the research community a false sense of tDCS effectiveness. Our results suggest that the unreliability of tDCS results should be taken as a starting point and as a challenge that needs addressing, rather than assuming a level of reliability that is not there. Across a variety of conditions and analyses, we found no evidence that online tDCS could modulate word retrieval in healthy participants. We performed analyses which considered possible causes of variability, but found no significant results. Further studies should expand on these analyses. Further studies should also assess whether positive effects can be obtained even in healthy participants when stimulation is carried out across different sessions and/or when it involves learning of novel words rather than the modulation of a consolidated vocabulary as in the present study. More generally, our results suggest that the efficacy of tDCS to modulate normal cognition needs to be carefully re-evaluated.

Funding
The work is funded by the Ministry of Defence and by the Defence Science and Technology Laboratory.