Alertness fluctuations when performing a task modulate cortical evoked responses to transcranial magnetic stimulation

Transcranial magnetic stimulation (TMS) has been widely used in human cognitive neuroscience to examine the causal role of distinct cortical areas in perceptual, cognitive and motor functions. However, it is widely acknowledged that the effects of focal cortical stimulation can vary substantially between participants and even from trial to trial within individuals. Recent work from resting state functional magnetic resonance imaging (fMRI) studies has suggested that spontaneous fluctuations in alertness over a testing session can modulate the neural dynamics of cortical processing, even when participants remain awake and responsive to the task at hand. Here we investigated the extent to which spontaneous fluctuations in alertness during wake-to-sleep transition can account for the variability in neurophysiological responses to TMS. We combined single-pulse TMS with neural recording via electroencephalography (EEG) to quantify changes in motor and cortical reactivity with fluctuating levels of alertness defined objectively on the basis of ongoing brain activity. We observed rapid, non-linear changes in TMS-evoked responses with decreasing levels of alertness, even while participants remained responsive in the behavioural task. Specifically, we found that the amplitude of motor evoked potentials peaked during periods of EEG flattening, whereas TMS-evoked potentials increased and remained stable during EEG flattening and the subsequent occurrence of theta ripples that indicate the onset of NREM stage 1 sleep. Our findings suggest a rapid and complex reorganization of active neural networks in response to spontaneous fluctuations of alertness over relatively short periods of behavioural testing during wake-to-sleep transition.

Transcranial magnetic stimulation (TMS) has been widely used in human cognitive neuroscience to examine the causal role of distinct cortical areas in perceptual, cognitive and motor functions. However, it is widely acknowledged that the effects of focal cortical stimulation can vary substantially between participants and even from trial to trial within individuals. Recent work from resting state functional magnetic resonance imaging (fMRI) studies has suggested that spontaneous fluctuations in alertness over a testing session can modulate the neural dynamics of cortical processing, even when participants remain awake and responsive to the task at hand. Here we investigated the extent to which spontaneous fluctuations in alertness during wake-to-sleep transition can account for the variability in neurophysiological responses to TMS. We combined single-pulse TMS with neural recording via electroencephalography (EEG) to quantify changes in motor and cortical reactivity with fluctuating levels of alertness defined objectively on the basis of ongoing brain activity. We observed rapid, non-linear changes in TMS-evoked responses with decreasing levels of alertness, even while participants remained responsive in the behavioural task. Specifically, we found that the amplitude of motor evoked potentials peaked during periods of EEG flattening, whereas TMS-evoked potentials increased and remained stable during EEG flattening and the subsequent occurrence of theta ripples that indicate the onset of NREM stage 1 sleep. Our findings suggest a rapid and complex reorganization of active neural networks in response to spontaneous fluctuations of alertness over relatively short periods of behavioural testing during wake-to-sleep transition.

Introduction
Transcranial magnetic stimulation (TMS) is widely used for probing human brain function in health and disease ( Dugué and Van-Rullen, 2017 ;Valero-Cabré et al., 2017 ;Ziemann, 2017 ). A number of neurophysiological indices of cortical TMS perturbation have been used to contrast experimental conditions of interest, including motor evoked potentials (MEPs) recorded from peripheral muscles ( Barker et al., 1985 ;Bestmann and Krakauer, 2015 ) and TMS-evoked potentials (TEPs) which are thought to reflect the reactivity of underlying cortical circuits ( Chung et al., 2015 ;Ilmoniemi et al., 1997 ). These and other out-neural activity are influenced by spontaneous fluctuations in the level of alertness that may occur during a single experimental session.
Recent research has suggested that human participants can show widely varying levels of alertness throughout a testing session. For instance, Tagliazucchi and Laufs (2014) found that 30% of participants drifted into a drowsy state (N1 sleep) during resting-state functional magnetic imaging (fMRI) protocols after only three minutes. These periods of early N1 sleep during passive resting-state scans were accompanied by increased signal variance in sensory and motor cortices and altered cortico-cortical functional connectivity ( Tagliazucchi and Laufs, 2014 ). Likewise, using an active decision-making task, De Gee et al. (2017) demonstrated that brainstem-controlled inter-trial fluctuations in phasic arousal are accompanied by changes in the involvement of prefrontal and parietal cortices in choice encoding. Further evidence for the contribution of fluctuating levels of alertness might come from studies of MEP amplitudes, which tend to be highly variable from trial to trial ( Ellaway et al., 1998 ;Maeda et al., 2002 ). A significant portion of this variance is related to EEG oscillatory activity in a pre-TMS time window Hussain et al., 2018 ;Mäki and Ilmoniemi, 2010 ;Madsen et al., 2019 ;Ogata et al., 2019 ;Sauseng et al., 2009 ;Thies et al., 2018 ;Zarkowski et al., 2006 ;Zrenner et al., 2018 ). In particular, trials with higher pre-stimulation alpha power tend to be associated with lower MEP amplitude ( Ogata et al., 2019 ;Sauseng et al., 2009 ;Zarkowski et al., 2006 ), although null findings ( Iscan et al., 2016 ) or a positive rather than negative correlation ( Thies et al., 2018 ;Bergmann et al., 2019 ) have also been reported. The association between pre-stimulus alpha power and MEP amplitude is typically interpreted in terms of spontaneous fluctuation of regional sensorimotor mu-alpha rhythms Hussain et al., 2018 ;Thies et al., 2018 ;Zrenner et al., 2018 ). Unfortunately, previous TMS investigations have not measured or controlled for changes in alertness in their participants, so it remains unknown whether fluctuations in alertness are systematically associated with changes in TMS-evoked neural activity.
Here we combined single-pulse TMS with concurrent EEG recording and a simple behavioural task to quantify changes in motor and cortical reactivity with fluctuating levels of alertness during wake-to-sleep transition, defined objectively on the basis of ongoing brain activity. We had four goals: (1) to estimate the latency and stability of fluctuations in alertness over the course of an active, single-pulse TMS session; (2) to test whether fluctuations in alertness modulate the occurrence and amplitude of MEPs; (3) to determine whether the amplitude of TEP responses within the first 50 ms after a TMS pulse changes across different levels of alertness; and (4) to assess whether inter-trial variance of MEP and TEP amplitudes is altered with decreases in alertness.

Participants
Twenty participants (7 male; mean age 23.7 years: age range 21-33 years) took part in the study. All participants were screened for contraindications to TMS ( Rossi et al., 2009 ), which included having no history of hearing impairment or injury, and no neurological or psychiatric disorders. All participants were right handed, as assessed using the Edinburgh Handedness Scale ( Oldfield, 1971 ). The mean handedness index was 0.79 (SD = 0.19; range 0.3-1). Potential participants were also screened with the Epworth Sleepiness Scale (ESS) ( Johns, 1991 ). The mean ESS score was 9.4 (SD = 4.3), which indicates that most of the participants had a slight to moderate chance of dozing off in a situation of prolonged inactivity. Notably, the average ESS score between 9 and 11 is typical for student samples ( Kaur and Singh, 2017 ;Rodrigues et al., 2002 ;Yang et al., 2003 ;Zailinawati et al., 2009 ), which has been related to the pressures of studying and hectic lifestyles rather than clinical sleep problems ( Hershner and Chervin, 2014 ).
The experimental protocol was approved by the Medical Research Ethics Committee of The University of Queensland (UQ), and the study was carried out in accordance with the Declaration of Helsinki. All participants gave informed, signed consent. Participants were recruited through an electronic volunteer database managed by UQ's School of Psychology. They received $30 for taking part in the study. There were no adverse reactions to TMS.

Electromyography (EMG)
Surface EMG was recorded from the first dorsal interosseous (FDI) of the left and right hands using disposable 24 mm Ag-AgCl electrodes (Kendall H124SG by Covidien; MA, USA) (only left-hand EMG shown in Fig. 1 (B)). The electrodes were placed in a belly-tendon montage with the reference over the proximal phalanx of the index finger and a common ground on the left elbow. Raw EMG signals were amplified ( × 1000) and filtered (20-2000 Hz; 50 Hz notch filter) using a Digitimer NeuroLog system (Digitimer; Hertfordshire, UK). The data were digitised at 5000 Hz using a Power 1401 and Signal (v5) software (Cambridge Electronic Design; Cambridge, UK), and stored for offline analysis on a PC. Throughout the experiment EMG activity was monitored on-line using a digital oscilloscope with a high gain. Participants were prompted to relax if any unwanted muscle activity was observed.

Transcranial magnetic stimulation (TMS)
TMS was applied to the right primary motor cortex using a single monophasic pulse generated by a Magstim 200 2 stimulator and a 70 mm figure-of-eight coil (#9925-00; The Magstim Company; Carmarthenshire, UK). The site for stimulation was the point on the scalp over the motor cortex that elicited the largest and most consistent amplitude MEPs from the left FDI. This stimulation 'hotspot' was found by placing the TMS coil tangentially on the scalp with the handle pointing posteriorly and laterally at ~45°to the sagittal plane, which induced a posterior-to-anterior current in the cortex. Stimulation commenced at an intensity that was assumed to be slightly suprathreshold for most individuals. Once the hotspot had been identified it was marked using an infrared neuro-navigation system (Visor 2 by ANT Neuro; Enschede, The Netherlands). A small piece of foam ~5 mm thick was then placed under the centre of the TMS coil so that it was not in physical contact with any EEG electrodes. The hotspot was re-marked and the location and orientation of the TMS coil were maintained throughout the testing session with the aid of the neuro-navigation system. Accuracy of coil position and handle orientation were kept within 5 mm and 5 degrees, respectively, but were typically within 3 mm and 3 degrees, as indicated in the Visor 2 panels. Resting motor threshold was determined using the relative frequency method with a criterion of ≥ 50 μV (peak-to-peak) MEP amplitude in at least five out of ten consecutive trials Rossini et al., 1994 ;Samii et al., 1996 ). A two-down, one-up staircase was used, starting at a suprathreshold intensity. Mean motor threshold for the group was 53.1% (Range 34-74%) of maximal TMS output intensity. TMS was controlled manually during the localization of motor cortex and the estimation of motor threshold. During the main experiment, TMS was controlled via Matlab functions from the Rapid 2 toolbox ( Abrahamyan et al., 2011 ).

Electroencephalography (EEG)
Continuous EEG data were acquired using a 64 channel BrainAmp MR Plus amplifier, TMS BrainCap and Brain Vision Recorder (v1) software (Brain Products; Gilching, Germany). A high chloride abrasive electrolyte gel was used (Abralyt HiCl by Easycap; Herrsching, Germany), and electrode placement corresponded with the International 10-10 system. Data were sampled at 5 kHz with a bandpass filter of  Two EEG windows preceding single-pulse TMS were used to assess  alertness: a 4 s window was used for manual scoring of Hori stages (5 Alertness Levels), and a 2 s window was used for automatic calculation of the theta (3-7 Hz) to alpha (8-12 Hz) spectral power ratio. Following each TMS pulse delivered over the right motor cortex, motor evoked potentials (MEPs) were recorded from the first dorsal interosseous (FDI) muscle of the left hand, and TMS-evoked potentials (TEPs) were recorded using high density EEG to characterize cortical reactivity. (B) Schematic of experimental set-up, showing EMG, EEG, TMS and response mouse in situ. (C) Brief definitions and EEG examples of 9 Hori stages of sleep onset, progressing from relaxed wakefulness (Hori Stage 1) to NREM Stage 2 sleep (Hori Stage 9) (modified with permission from Ogilvie (2001) ). In the current study, Alertness Levels 1-5 (marked in green) correspond to Hori Stages 1-5. (D) Percentage of trials obtained within each Alertness Level, shown separately for the 20 participants. Datasets are sorted from the most alert participants (lower rows) to the drowsiest participants (upper rows). There were very few epochs of Alertness Level 6 or above. (E) Representative dataset for one participant, showing good agreement between the two EEG measures of alertness across the whole testing session. The upper subplot indicates / ratio; the lower subplot shows fluctuations of Alertness Levels on the Hori scale. (F) Cross-validation of EEG measures of alertness: intra-individual correlations between / power and the full range of Alertness Levels (Hori Stages 1-8) across single session trials. Bars represent intraindividual Spearman's rank order correlation coefficients for the 20 participants, sorted from the most to the least positive coefficients.
DC-1000 Hz and resolution of 0.5 μV ( ± 16.384 mV). Recordings were referenced online to the left mastoid, and electrode impedance was typically kept below 5 k Ω.

Experimental procedure
Participants were seated in a comfortable reclining chair that included head and leg support (see Fig. 1 (B)). After placing the EMG and EEG electrodes, participants had their eyes blindfolded and the lights in the lab were dimmed. They were instructed to relax for a few minutes while estimation of individual resting motor threshold was performed. Participants' hands were comfortably supported with pillows, and they wore earplugs throughout the experiment in order to reduce auditory stimulation. After threshold estimation the combined TMS-EEG experiment was carried out, and participants were reminded to stay relaxed and keep their eyes closed. They were also instructed to pay attention covertly to their left hand and to respond by clicking one of the two keys on a mouse held in their right hand if they felt a tactile sensation (left key), such as a twitch or a touch, or had no sensations (right key) in their left hand at the time of each TMS pulse (see Fig. 1 (A)). In the present study, key presses were used to determine responsiveness. Participants were explicitly instructed that they were permitted to fall asleep should they wish to. If no responses were registered after 3-5 consecutive trials (i.e., failure to press a mouse button within 6 seconds of a TMS pulse), participants were gently awakened verbally and reminded to continue the task.
During stimulation, nine TMS intensities centred on the individual resting motor threshold were used ( − 20%, − 15%, − 10%, − 5%, 0%, + 5%, + 10%, + 15%, + 20%). Given that TMS stimulator output intensity is measured in whole numbers from 1 to 100, the calculated percentage from threshold intensity was rounded. This yielded slightly different sized steps from − 20% to + 20% for some individuals, and thus the actual TMS output intensity values were used when fitting sigmoidal functions at the single participant level. An a priori rationale for this design related to a secondary purpose of the project, which was to plot psychophysical functions of kinaesthetic awareness of hand movement, the results of which will be reported in a separate paper. For each individual, we aimed to deliver 520 trials of single pulse TMS, with an average inter-pulse interval of 9.5 s and a uniformly distributed random jitter of ± 1000 ms. Thus, the inter-pulse interval lasted between 8.5 and 10.5 s. We incorporated a relatively long inter-pulse interval to facilitate the natural development of drowsiness, and to allow sufficient time for a return of tonic EMG activity to its baseline level. As our aim was to obtain the maximum number of evoked responses (MEPs and TEPs) around the TMS threshold intensity, the following number of trials was delivered at each TMS intensity: 40 trials (7.7% of a total) at each of the − 20%, − 15%, − 10%, + 10%, + 15% and + 20% intensities; 80 trials (15.4% of a total) at each of the − 5% and + 5% intensities; 120 trials (23.1% of a total) at 0% (i.e., at the individual resting motor threshold). These proportions of trials enabled us to maximize neural data around the motor threshold, while at the same time provided sufficient data to analyse sigmoidal fit as a function of TMS intensity (see Fig. 2 (A)). Trial order was randomized throughout the experiment. TMS pulses were delivered in 8 blocks of 65 trials. In addition to the Rapid 2 toolbox ( Abrahamyan et al., 2011 ), the experiment was controlled using Matlab functions from Psychtoolbox-3 ( Kleiner et al., 2007 ). Occasional technical difficulties meant that we could not complete all trials for every participant, or that we added an extra block of trials, yielding an average of 517.5 trials per participant (SD = 35.84, Range = 379-575).
One experimenter held the TMS coil, with the aid of the neuronavigation system (Visor 2), and the other monitored ongoing EEG; these individuals switched their places after each block. An extended rest was provided after 4 blocks to allow participants a break from the task, to change the heated TMS coil, and to reduce the impedance of any EEG electrodes if required. Data collection lasted approximately 90 minutes. In an effort to reduce the potential impact of any circadian fluctuation in cortical excitability ( Sale et al., 2007 ), all testing sessions commenced at 1.00 pm (a time at which participants were more likely to feel drowsy after having had their lunch).

Motor evoked potential (MEP) analyses
Peak-to-peak amplitudes of MEPs evoked by TMS pulses delivered over the right motor cortex were calculated for each trial within a 20-45 ms time window using Signal (v5) software (Cambridge Electronic Design; Cambridge, UK). Trials containing phasic muscle activity in the left FDI channel within 100 ms prior to a TMS pulse being delivered were discarded from the analyses.
We characterized modulations of MEPs as a function of alertness levels by fitting a sigmoid function to the proportion of trials that evoked MEPs (constrained from 0 to 1 on the y axis) across the 9 TMS intensities ( − 20%, − 15%, − 10%, − 5%, 0%, + 5%, + 10%, + 15%, + 20), and then comparing threshold and slope measures in EEG-defined awake and drowsy trials separately for each participant (for definition of awake and drowsy trials, see Section 2.9 ). A 50 V cut-off threshold in peak-to-peak amplitude Rossini et al., 1994 ;Samii et al., 1996 ) was used to define the presence of an MEP. A sigmoid function was fitted to each individual participant's data using the following formula: Level, the red line depicts the group-level mean of peak-to-peak amplitude. The pink shaded region represents 1 standard deviation (SD), and the blue shaded region represents the 95% confidence interval of the mean.
where F is the MEP ratio (the proportion of trials with supra-threshold MEPs), x is the TMS intensity, μ is the threshold value (the TMS intensity at the inflection point), and s is inversely proportional to the slope at the threshold. The actual slope of the fitted sigmoid was calculated by fitting a straight line between a point 0.1 above the inflection point and a point 0.1 below it.
Importantly, in addition to spontaneous fluctuations in alertness, the spectral power of EEG pre-stimulus oscillations can reflect attentional sampling and/or sensory gating ( Capotosto et al., 2009 ;Romei et al., 2008 ;van Dijk et al., 2008 ). We expected that an alertness-related effect would be spatially and temporally widespread and consistent, so we repeated the analysis of MEP threshold and slope by splitting the data between awake and drowsy trials separately for each EEG electrode, and in 20 equally sized pre-TMS time bins of 100 ms duration, from − 2000 ms to 0 ms relative to the TMS pulse.
Furthermore, to assess dynamics of MEP peak-to-peak amplitude across Alertness Levels 1-5 ( Hori et al., 1994 ; see Section 2.9 ), responsive trials with MEP amplitude at least twice as high as the range between minimum and maximum values in the − 100 to 0 ms baseline window were averaged separately for each Alertness Level and each participant. In an effort to control for MEP variance as a function of TMS intensity, only three TMS intensities ( − 5%, 0%, + 5%) around each individual's motor threshold were included in the group-level analysis of MEP amplitude changes across Alertness Levels 1-5.

EEG pre-processing and analysis: TMS-EEG reactivity
EEG data pre-processing was carried out using EEGlab toolbox for Matlab ( Delorme and Makeig, 2004 ), with two separate pre-processing pipelines developed for the analysis of EEG reactivity to TMS and EEG spectral power before TMS pulses. Analysis of EEG reactivity to TMS pulses in the first 50 ms time window requires a perfect alignment of TMS markers in the EEG recording with the onset of the actual TMS coil discharge. Given that there was some delay and jittering between a TMS marker sent to the EEG system and the coil discharge itself (M = 9.6 ms, SD = 1.7 ms; see Fig. A.1, left side), EEG markers indicating TMS intensity were automatically adjusted to the time point of the actual TMS pulse. For this, raw EEG data were segmented ± 200 ms around each TMS marker, and global field power (GFP) was calculated as a standard deviation of voltage across all electrodes, resulting in a single time waveform for each TMS marker. Each obtained waveform was baseline corrected to the − 200 ms to − 50 ms time window, and each time sample was transformed to its absolute value. The remaining time window of − 49 ms to + 200 ms was scanned, searching for the first time point at which a GFP value exceeded the maximal baseline GFP value by a factor of five, which indicated the onset of a TMS artefact. The TMS marker was then reallocated to this point in the continuous EEG recording (see The EEG data were processed following an ICA-based approach of TMS-EEG artefact cleaning ( Rogasch et al., 2014 ). First, EEG data were segmented from − 1000 ms to + 1000 ms around the onset of the TMS artefact. Data were manually inspected and epochs containing excessive artefacts as well as epochs corresponding to noisy MEP recordings or pre-trial EEG segments used to assess alertness, were deleted (M = 55.6 epochs, SD = 30.37, Range = 22-148). Next, the segments were baseline corrected to the mean of the interval from − 500 ms to − 100 ms time window. A straight line was then fitted to the data from − 2 ms to 15 ms, thus deleting the initial TMS-EEG artefact, and the epochs were down-sampled to 1000 Hz. The most deviating EEG channels were then detected with the 'spectopo' function and the first round of independent component analysis (ICA) was performed excluding noisy channels (M = 2.5 channels, SD = 1.24, Range = 1-6). After deleting on average 2.2 distinctive, early high-amplitude components (SD = 1.47, Range = 0-6) representing the exponential decay artefact (which probably results from a combination of the amplifier's step response, the induction of currents in the electrode leads, polarization of the electrodeelectrolyte-skin interface, and cranial muscle response-related electrode movement ( Herring et al., 2015 ;Rogasch et al., 2014Rogasch et al., , 2017 and is sometimes spread across many channels) a small number of additional noisy channels was identified (M = 0.8 channels, SD = 1.33, Range = 0-5). EEG data were then filtered (1-80 Hz) and epoched from − 400 ms to + 600 ms around the onset of the TMS marker. The second round of ICA was carried out without the noisy channels identified in the two previous rounds (M = 3.3 channels, SD = 1.9, Range = 1-9). Independent components reflecting the remaining TMS-EEG decay artefact, eye movements, auditory evoked potentials, 50 Hz line noise, and other sources of noise were deleted (M = 20.75, SD = 4.12, Range = 15-29), after which bad channels were recalculated using spherical spline interpolation. The EEG segments were again baseline corrected ( − 100 ms to − 3 ms), and manually inspected to delete a few remaining epochs that still contained a residual TMS artefact (M = 2.7 epochs, SD = 2.76, Range = 0-9). After all data cleaning steps, on average 58.2 trials (11.2%) were discarded per single participant during EMG and EEG pre-processing (SD = 30.82, Range = 23-153), leaving on average 459.2 trials per participant (SD = 36.44; Range = 347-508 trials) available for the subsequent TEP and MEP analyses.
To account for between-trial variance, the raw EEG signal from each individual trial was transformed to z-scores using the mean and standard deviation of the baseline period ( − 100 to − 3 ms). Trials were then split into different levels of alertness. To assess changes in EEG reactivity to TMS perturbation as a function of alertness, the four electrodes immediately beneath the TMS coil (electrodes FC2, FC4, C2, C4) were chosen to contribute their voltage values to a region of interest (ROI) (see Fig. 3 (A)), and these were then averaged across all TMS intensities within each participant. The group-level waveform was then plotted, revealing an early TEP peak at 31 ms post-TMS pulse. The data were then split between Alertness Levels and the mean amplitude ( ± 5 ms) around the peak (26-36 ms) was calculated for each participant and each level of alertness (see Section 2.9 ).
ERP dynamics were additionally studied using data-driven spatiotemporal clustering analyses similar to what we have described previously ( Chennu et al., 2013 ). Awake and drowsy trials were compared in the time windows of interest (15-100 ms) by averaging single-subject data and running group level clustering. Using modified functions from the FieldTrip toolbox ( Maris and Oostenveld, 2007 ;Oostenveld et al., 2011 ), we compared corresponding spatio-temporal points, i.e. each electrode and each time sample, between awake and drowsy conditions, with a paired samples t-test. Although this step was parametric, Field-Trip uses a nonparametric clustering method ( Bullmore et al., 1999 ) to address the multiple comparisons problem. The t values of adjacent spatio-temporal points whose p values were less than 0.05 were clustered together by summating their t values, and the largest such cluster was retained. A minimum of two neighbouring electrodes had to pass this threshold to form a cluster, with the neighbourhood defined as other electrodes within a 4 cm radius. This whole procedure -that is, calculation of t values at each spatio-temporal point followed by clustering of adjacent t values -was repeated 1000 times, with recombination and randomized resampling before each repetition. This Monte Carlo method generated a nonparametric estimate of the p value representing the statistical significance of the originally identified cluster. The cluster-level t value was calculated as the sum of the individual t values at the points within the cluster.
Control analyses. We conducted a number of control analyses to rule out the possibility that our specific approach to dealing with a variety of experimental artefacts unduly influenced the results. First, to determine whether transformation of the TEP data to z-scores might have influenced the results, all group level analyses were repeated in the voltage domain. Second, to rule out that removal of several auditory components following ICA might have influenced the results (M = 1.95, SD = 1.50, Range = 0-5), the group level analyses were repeated with all auditory components retained. Third, to determine whether a different distribution of TMS intensities across awake and drowsy trials could have confounded the TEP results (see Table A.1), a number of trials at each TMS intensity was compared between / -defined awake and drowsy conditions. In case of a mismatch trials were randomly drawn from a condition with the larger number to match another condition with the smaller number, and this was repeated separately for each participant. Once trial numbers were perfectly matched by TMS intensity (A) Time course of electroencephalography (EEG) potentials averaged over 4 EEG electrodes beneath the TMS coil in the / -defined awake (red) and drowsy (blue) trials. Green shaded area highlights the cortical reactivity time window (26-36 ms). Only behaviourally responsive trials are included in the analysis shown in this and other subplots. 0 ms corresponds to the time of the TMS pulse. Red and blue shading depicts standard error of the mean (SEM). (B) Topographical distribution of the early TEP mean peak at 26-36 ms post-TMS pulse in the / -defined awake (upper left) and drowsy (upper right) states. Black dots indicate locations of three EEG electrodes with the maximal amplitude in the map. Non-parametric z map (below) reveals region reliably different between awake and drowsy states. (C) 0-100 ms data-driven spatio-temporal clustering of EEG potentials post-TMS pulse between / -defined awake (red) and drowsy (blue) states. TEP amplitude was significantly higher in drowsy trials in a 5-67 ms time window (cluster peak: 27 ms, t = 4884.47, p = 0.004). The green horizontal line depicts the time window of significant difference. The electrode with the largest difference between awake and drowsy states is marked as a green dot in the topographic voltage map, and its waveforms are plotted below. The black contours within the map show the electrodes with statistically reliable differences (cluster). The topographic voltage map is at the peak difference between awake and drowsy states. In addition to the P30 response, N45 and P60 TEP components are visible in this plot ( Rogasch et al., 2014 ;Premoli et al., 2014 ). (D) Individual-level dynamics of TEP cortical reactivity peak amplitude across Alertness Levels 1-5 (TEP amplitude averaged over 26-36 ms across 4 electrodes beneath the TMS coil). Normalized amplitude is shown relative to Alertness Level 1 (green dashed line). Black lines represent participants with higher TEP amplitude at Alertness Level 5 relative to Alertness Level 1 ( N = 15); grey lines represent participants with lower TEP amplitude at Alertness Level 5 relative to Alertness Level 1 ( N = 5). (E) Group-level dynamics of TEP waveforms across Alertness Levels 1-5 (TEPs averaged over 4 electrodes beneath the TMS coil). Horizontal green dashed line delineates TEP cortical reactivity peak at 31 ms post-TMS at Alertness Level 1.
between the states of alertness, the key contrast of TEP mean amplitude (26-36 ms) averaged across 4 electrodes beneath the TMS coil was repeated between / -defined awake and drowsy conditions (see Section 2.9 ).
Finally, to determine whether variance in MEP amplitude could have confounded the TEP amplitude difference between awake and drowsy trials (e.g. through a sensory afferent signal re-entering the cortex), MEP peak-to-peak amplitude was statistically compared between / -defined awake and drowsy conditions with an independent samples t test, separately for each participant. When a contrast yielded p < 0.5, one trial with the largest amplitude was deleted from a condition with the larger mean amplitude and one trial with smallest amplitude was deleted from a condition with the smaller mean amplitude. Then, the awake and drowsy conditions were compared again, and the entire procedure was repeated until the obtained p value was ≥ 0.5. While initially MEP peak-to-peak amplitude was significantly higher in the drowsy condition for five par-ticipants and higher in the awake condition for one participant, no differences remained after awake and drowsy trials were matched by MEP amplitude (t(19): Mean = − 0.12, Range = − 0.65 to 0.60, p: Mean = 0.67, Range = 0.51-0.99), and the key contrast of TEP mean amplitude (26-36 ms) averaged across the 4 electrodes beneath the TMS coil, as well as across all electrodes, was repeated between / -defined awake and drowsy conditions.

EEG pre-processing and analysis: pre-TMS spectral power
To calculate EEG spectral power before TMS, the recordings were downsampled to 250 Hz, and then epoched in − 4000 ms to − 12 ms time segments preceding each TMS pulse. The noisiest epochs were manually deleted, and the most deviant EEG channels were detected with the 'spectopo' function, before running the independent component analysis (ICA) for further removal of artefacts such as eye blinks and saccades, heartbeats, and muscle noise (deleted components: M = 21.45, SD = 3.71, Range = 15-28). ICA was carried out on clean channels only, whereas the noisy channels were recalculated by spherical spline interpolation of surrounding channels after deleting ICA components with artefacts. Data were again manually inspected and several remaining noisy epochs were deleted.
The spectral power of EEG oscillations over the 4 s time interval immediately preceding each TMS pulse was computed using a Hilbert transform, set from 1.5 Hz to 48.5 Hz in steps of 1 Hz, using a finite impulse response (FIR) filter implemented in the EEGlab toolbox ( Delorme and Makeig, 2004 ). Given that estimation of spectral power of slow oscillations can be difficult close to the edges of EEG segments ( Cohen, 2014 ), and we were particularly interested in the spectral power just before each TMS pulse, a dummy copy of each EEG epoch was created by flipping the beginning and end of each pre-TMS epoch along the time axis, except of the last time point ( − 12 ms). The resulting "mirror image " data were then concatenated with the original pre-TMS data; that is, the time axis of the obtained 7.976 s EEG epochs extended from − 4000 ms to − 16 ms (original) to − 12 ms (original) and then back from − 16 ms to − 4000 ms (mirror). In this manner, an abrupt discontinuity was avoided in the time window just before the TMS pulse, thus enabling a more stable estimate of spectral power. After Hilbert transformation, the "mirror " part of the EEG epoch was deleted, retaining the original pre-TMS window from − 4000 ms to − 12 ms. To reduce data size, EEG recordings were down-sampled to 250 Hz before running the Hilbert transform.

EEG measures of alertness
Two complementary EEG measures were used to assess participants' level of alertness before each TMS pulse: (1) the Hori scoring system of sleep onset EEG ( Hori et al., 1994 ), and (2) a ratio between EEG spectral power of pre-TMS theta and alpha oscillations, which we refer to here as the ' / ' measure of alertness ( Bareham et al., 2014 ;Noreika et al., 2020 ).
The Hori system relies on visual scoring of 4 s segments of continuous EEG data ( Hori et al., 1994 ). It consists of 9 stages reflecting a gradual progression from wakefulness to sleep, from Hori Stage 1 which refers to alpha-dominated relaxed wakefulness, to Hori Stage 9 which is defined by the occurrence of complete spindles coinciding with classic Stage 2 NREM sleep (see Fig. 1 (C)). The Hori system has been used to map dynamic wake-sleep changes in ERPs ( Nittono et al., 1999 ), EEG spectral power ( Tanaka et al., 1997 ), reaction times, and the rate of subjective reports of being asleep ( Hori et al., 1994 ). In the present study, Hori stages were visually assessed by an experienced sleep researcher (VN) who was blind to participants' responsiveness and the TMS intensity on any trial. Given that the Hori system was developed using low-density EEG data, only 19 EEG channels of the standard 20-10 system were used for scoring purposes (Fp1, Fp2, F7, F3, Fz, F4, F8, C3, Cz, C4, T7, T8, P7, P3, Pz, P4, P8, O1, O2). EEG recordings were low pass filtered (20 Hz). Previous research has found that participants are typically unresponsive in Hori Stages 6 and above ( Ogilvie, 2001 ), so our MEP and TEP analyses were restricted to Hori Stages 1-5, which we refer to here as Alertness Levels 1-5 .
Hori Stages 1 to 4 are marked by decreasing activity in the alpha range, and Hori Stages 4 to 8 are characterized by an increase in activity in the theta range ( Hori et al., 1994 ). Thus, progression of drowsiness can be quantified by a ratio of the spectral power of the alpha and theta EEG frequency bands. Specifically, here drowsiness was quantified as a period of time with an increased / ratio of spectral power ( Bareham et al., 2014 ). To apply this measure, theta (4.5-7.5 Hz) and alpha (8.5-11.5 Hz) power was first averaged in time from − 2000 ms to − 12 ms, and the / ratio was then calculated for each trial and electrode. Next, the / ratio was averaged across all electrodes, resulting in a single "alertness " value per trial. Finally, trials were split into the most strongly "awake " (45%) and most strongly "drowsy " (45%) trials, excluding the 10% of trials that were intermediate between the two extremes. For the MEP and TEP analyses, unresponsive trials were deleted before carrying out the / -split between awake and drowsy trials. However, all trials were used when the / -split was carried out in order to compare the number of unresponsive trials between awake and drowsy states.

Convergence of the EEG measures of alertness
All participants completed the experimental task and reached the expected Alertness Level 5 or higher, marked by the occurrence and dominance of theta waves. At a group level, a comparable proportion of awake and drowsy trials were obtained as per the criteria defined above (Alertness Levels 1-2: M = 45.17%, SD = 19.92; Alertness Levels 4-5: M = 35.68%, SD = 16.44) (see Fig. 1 (D)).
Thus, even though the Hori system provides absolute electrophysiological signatures of the depth of drowsiness, the / ratio was used to identify equal proportions of awake and drowsy trials within each participant. Given that the / measure is relative, there was a risk of mislabelling trials for some participants, as it would make a split between "awake " and "drowsy " trials even if all of them happened to be Alertness Level 1. Thus, to verify the use of / data splits, we compared these two measures at an individual level and at the level of the group as a whole. First, we carried out correlation analyses between the two measures of alertness within each participant. Second, we compared correlation coefficients against zero to assess the consistency of association between the Hori and the / measures. At an individual level, Hori stages 1-8 and / scores were positively and significantly correlated for all 20 participants (individual rho ranged from 0.66 to 0.9). Group analysis confirmed a very strong association between these two electrophysiological measures of alertness (one sample t test: t (19) = 51.99, p < 0.000005), confirming that the / ratio was well suited to assessing the level of alertness in the sample here (see Fig. 1 (F)). A similar convergence between Hori and / scores was observed when trials of Hori Stages 6-8 (M = 2.23% of data, SD = 2.74, Range = 0-10.92) were excluded from correlations (individual rho ranged from 0.64 to 0.9; one sample t test: t (19) = 50.75, p < 0.000005).

Statistical analysis
Paired samples t tests were used to compare neural summary measures between / -defined awake and drowsy states. Pooled variance was used to calculate Cohen's d, with 0.2 indicating a small effect size, 0.5 a medium effect size, and 0.8 a large effect size ( Cohen, 1988 ). For a similar comparison of summary measures across Alertness Levels 1-5, a one-way repeated measures ANOVA was carried out with linear as well as non-linear contrasts. Huynh-Feldt correction was used when Mauchly's test indicated violation of the assumption of sphericity. Partial 2 was calculated as an effect size in ANOVA tests, with 0.01 indicating a small effect size, 0.06 a medium effect size, and 0.14 a large effect size ( Cohen, 1988 ). A Shapiro-Wilk's test was used to assess normality of the distribution before running parametric tests. Squareroot or log10 transforms were used to normalize skewed data. When transformations failed, non-parametric statistical tests were used, such as Wilcoxon's signed-ranks test instead of a paired samples t test, and Page's L trend test instead of a one-way repeated measures ANOVA for linear contrasts across Alertness Levels 1-5. Bonferroni-Holm multiplicity correction ( Holm, 1979 ) of p values was carried out to account for the planned comparisons between baseline Alertness Level 1 and the other four Levels of Alertness. Non-parametric Spearman's rank order correlation tests were used to assess for an association between single-trial MEP and TEP responses across trials, separately for each participant, followed up by a one sample t test of rho values. Statistical analyses were carried out using Matlab and IBM SPSS (v25) software packages. Bayes Factors were approximated from the t value and sample size using JASP software ( https://jasp-stats.org/ ).

Fluctuation of alertness during single TMS sessions
On average, TMS sessions lasted for 92.5 min (SD = 7, Range = 73.5-104.3 min) including time spent switching TMS coils and allowing breaks for participants. During this period, all 20 participants reached Alertness Level 5 or higher (up to Level 8), reflecting deep drowsiness with a dominance of EEG theta ripples (see Fig. 1 (C)). Notably, it took only 9.44 min on average for participants to reach Alertness Level 5 (SD = 8.95, Range = 2.25-33.35 min), indicating a rapid decrease of alertness despite the fact that eyes-closed participants were receiving TMS pulses and generating task-specific motor responses.
All participants ceased responding at some point during the testing session, after which they were either aroused spontaneously due to TMS, or they were prompted by an experimenter after 3 consecutive unresponsive trials. On average, 13.1% of trials were categorised as "unresponsive " (SD = 9.7, Range = 2.83-40.6), suggesting a notable impact of drowsiness on task performance. As expected, there were more unresponsive trials in the / -defined drowsy trials ( ( Holm, 1979 ).
Alertness Levels and unresponsive trials tended to be spread across the testing session, i.e., participants tended to "oscillate " between awake and drowsy states (see Fig. 1 (E) and Fig. A.2). Consequently, there was no systematic increase or decrease in Alertness Level within a session at the group level. Four participants showed a significant but weak positive correlation between Alertness Level and trial number (mean rho = 0.21), 6 participants showed a significant negative association (mean rho = − 0.27), and the remaining 10 participants showed no significant correlation between Alertness Level and trial number (mean rho = -0.008) (see Fig. A.2). These results suggest that a given participant's level of alertness cannot be assumed to decrease continuously over a testing session. Only concurrent EEG measures can definitively determine a participant's moment-to-moment level of alertness.

Fluctuating levels of alertness modulate MEPs
We first assessed corticospinal excitability as a function of alertness and TMS intensity. To this end we calculated the proportion of trials with MEP peak-to-peak amplitude above 50 V for each of the 9 TMS intensities, separately for the / -defined awake and drowsy trials. A sigmoid function was then fitted across alertness conditions for each participant. The slope of the MEP sigmoid was slightly but significantly shallower in drowsy compared with awake trials (Wilcoxon signed-rank test: z -score = 2.02, p = 0.044, r = 0.32), suggesting mildly increased noise and instability in corticospinal processing (see Fig. 2 (A); individual participant results are shown in Fig. A.3). At the group level, the MEP slope difference between awake and drowsy trials was mainly driven by more the frequent occurrence of MEPs at subthreshold intensities (see Fig. 2 (A)). At the individual level, some participants also showed an increase or decrease in MEPs at suprathreshold intensities (see Fig.  A.3). Contrary to the slope findings, the MEP sigmoid threshold did not differ between awake and drowsy trials ( t (19) = 1.31, p = 0.21, d = 0.13, Bayes Factor in favour of the null = 2.04).
We considered whether the observed difference in MEP slopes was specifically related to alertness, as the amplitude of pre-stimulus alpha oscillations has also been implicated in fluctuations in attention and sensory gating. For instance, it is possible that EEG markers of alertness could be confounded by a rapid co-linear fluctuation in attention. However, in these cases, EEG alpha effects are typically evident only within a relatively short pre-stimulus time period of a few hundred milliseconds ( Romei et al., 2008 ), and are restricted to sensory or frontoparietal regions ( Capotosto et al., 2009 ;van Dijk et al., 2008 ). Contrary to this, the difference observed here in MEP sigmoid slope as a function of EEG / power was temporally and spatially widespread (Fig. A.4), consistent with slow and widely distributed changes in alertness. Interestingly, the strength of association between theta/alpha ratio and MEP slope seemed to fluctuate at a 1 Hz frequency, with a significant peak observed 600-100 ms before TMS, and another peak at 1600-1200 ms before TMS (see Fig. A.4(A)). At the other pre-TMS time bins, the association was not significant but showed a trend in the same direction, i.e. a shallower MEP slope for deeper levels of drowsiness. Thus, we conclude that single-trial EEG / power indeed reflected an instantaneous level of alertness rather than spontaneous fluctuations of attention linked to the phase of alpha oscillations.
We next compared MEP peak-to-peak amplitudes between Alertness Level 1, reflecting relaxed wakefulness, and Alertness Levels 2-5, reflecting increasing levels of drowsiness. As shown in Fig. 2 (B), there was a significant increase in MEP amplitude between Alertness Levels 1 and 4 ( t (19) = 3.5, p = 0.0096, d = 0.64; Bonferroni-Holm-corrected ( Holm, 1979 )), as well as an intermediate stepped increase across Alertness Levels 2 ( d = 0.15; n.s.) and 3 ( d = 0.23; n.s.) and a subsequent decrease at Alertness Level 5 ( d = 0.27; n.s.). A linear trend of increasing MEP amplitude was observed across Alertness Levels 1-4 ( F (1,19) = 11.55, p = 0.003, partial 2 = 0.38), but this was no longer significant when Level 5 was also included ( F (1,19) = 2.11, p = 0.165, partial 2 = 0.1). These findings indicate a reliably non-linear reorganization of corticospinal excitability at a time when drowsy participants are still conscious and responsive. The most noticeable change in dynamics occurred with the disappearance of alpha waves at Alertness Level 4, at a point where there was EEG flattening just before the first occurrence of EEG theta-range ripples, despite the fact that participants were still responding behaviourally in the task. These observations suggest a much earlier modulation of corticospinal excitability in the initial moments of drowsiness than has been reported previously in studies of MEP changes with sleep deprivation or during NREM sleep ( De Gennaro et al., 2007 ;Grosse et al., 2002 ;Manganotti et al., 2004 ).

Fluctuating alertness modulates TMS-evoked potential (TEP) amplitude
We next assessed post-TMS cortical reactivity measured as TMSevoked potentials (TEPs) within the first 40 ms after each pulse. Early TEP amplitude is known to increase in response to homeostatic sleep pressure ( Huber et al., 2013 ) and during NREM sleep ( Massimini et al., 2005 ), likely reflecting a combination of synaptic strengthening, changes in neuromodulation, and impaired inhibition ( Huber et al., 2013 ). We hypothesized that, as with MEP amplitude, TEPs should be affected by the level of alertness in drowsy but responsive participants.
Comparing TEP amplitudes between / -defined awake and drowsy trials revealed a significant increase in cortical reactivity in drowsy trials at the ROI electrodes in a time window from 26 to 36 ms after the TMS pulse ( t (19) = 4.02, p = 0.00074, d = 0.49) (see Fig. 3 (A)). This pattern was evident in 18/20 participants (see Fig. A.5). Essentially the same increase in TEP amplitude was observed when / -defined awake and drowsy trials were matched across TMS intensity conditions (see Fig. A.6) and MEP peak-to-peak amplitude (see Fig. A.7).
While TEP peaked over the right motor region, directly beneath the TMS coil, in both states of alertness, the peak difference between awake and drowsy states was fronto-central (see Fig. 3 (B)). Consistent peak time and location were identified using spatio-temporal clustering of TEP differences between awake and drowsy trials (see Fig. 3 (C)). While the observed difference between awake and drowsy trials spread fronto-centrally (see Fig. 3 (B)-(C)), its occurrence is unlikely to reflect an auditory ERP as the TEP peak latency (27 ms) occurred well before the known onset of the auditory P50 potential ( Pratt et al., 2008 ). Furthermore, we removed identifiable auditory components during ICA cleaning of the EEG signal, as outlined in the Materials and Methods ( Section 2.7 ).
We further compared TEP amplitudes at the ROI site of stimulation across Alertness Levels 1-5. As hypothesized, TEP amplitude increased as participants became drowsier (Page's L trend test: L = 1007, 2 = 22.9, p = 0.0000017) ( Fig 3 (D)-(E)). Planned comparisons between Alertness Level 1, which was treated as a baseline condition, and each subsequent Level revealed a significant increase in TEP amplitude between Alertness Levels 1 and 3 ( t (19) = 4.54, p = 0.00088, d = 0.51), 1 and 4 ( t (19) = 4.38, p = 0.00099, d = 0.68), and 1 and 5 ( t (19) = 3.43, p = 0.0056, d = 0.6), but not between Levels 1 and 2 ( t (19) = 1.54, p = 0.14, d = 0.12), with all contrasts Bonferroni-Holm-corrected for multiple comparisons ( Holm, 1979 ). These findings provide the first direct evidence for an inverse association between cortical reactivity and alertness, suggesting that sleep-related changes in neural activity may intrude early in the transition between wakefulness and sleep, while participants are still able to respond in an ongoing behavioural task. Strikingly, the TEP effects emerged at a relatively early Alertness Level 3, before the appearance of drowsiness ripples or early slow waves ( Hori et al., 1994 ).
Essentially the same increase in TEP amplitudes was evident in the / -defined drowsy condition and across Alertness Levels 1-5 when all the aforementioned TEP analyses were repeated in the voltage domain (see Fig. A.8). Likewise, we observed the same effect when all the TEP analyses were repeated with auditory independent components retained in the EEG data, both in the voltage domain (see Fig. A.9) and using zscores (see Fig. A.10).

Single-trial MEP and TEP variability across different levels of alertness
Having examined group-level changes in MEP and TEP amplitudes with spontaneous fluctuations in alertness, we next carried out single-trial analyses of TMS-evoked response variability, separately for each Alertness Level. Response variability was quantified as the intraindividual standard deviation (intraSD) of the TMS-evoked response amplitude, calculated separately for each participant and Alertness Level. To reduce the impact of uneven trial numbers across the five levels of Alertness, trials were randomly sampled for each participant to match an Alertness level with the smallest number of trials. Given that MEP amplitudes are strongly skewed, intraSD was calculated over log-transformed For each Alertness Level, the red line depicts mean amplitude. Pink shading represents 1 standard deviation (SD), which was very small and is thus difficult to discern in the figure. Blue shading represents the 95% confidence intervals for the mean. Insets on the right indicate locations of hand and scalp electrodes and time windows used to detect peak amplitude values.
MEP amplitudes, and the obtained intraSD values were subjected to a group-level analysis of trend across the five levels of Alertness.

Relationship between single-trial MEP and TEP amplitudes across levels of alertness
In a final step, we asked whether MEP and TEP amplitudes were correlated at a participant level, in line with previous studies ( Fecchio et al., 2017 ;Petrichella et al. 2017 ), and whether any such association was modulated by Alertness Levels. We first carried out intra-individual Spearman's rank order correlations between MEP and TEP amplitudes across all TMS intensities, separately for each participant and Alertness Level. We then tested the group-level distribution of the obtained correlation coefficients by running one sample t tests, separately for each Alertness Level (see Fig A.11). While uncorrected probability values pointed to a significant association at each Alertness Level, only Level 3 showed a significant correlation between MEP and TEP amplitudes after Bonferroni-Holm correction of p values ( Holm, 1979 ): Level 1 ( t (19) = 2.67, p UNCORRECTED = 0.015, p CORRECTED = n.s.), Level 2 ( t (19) = 2.57, p UNCORRECTED = 0.019, p CORRECTED = n.s.), Level 3 ( t (19) = 4.25, p UNCORRECTED = 0.00043, p CORRECTED = 0.0021), Level 4 ( t (19) = 2.48, p UNCORRECTED = 0.023, p CORRECTED = n.s.), and Level 5 ( t (19) = 2.14, p UNCORRECTED = 0.046, p CORRECTED = n.s.). There was no consistent change in the association between MEP and TEP amplitudes across Alertness Levels 1-5 (linear contrast ANOVA: F (1,19) = 1.50, p = 0.24, partial 2 = 0.073).

Discussion
Most studies that use TMS to investigate perceptual, cognitive and motor function in human participants do not consider the possibility that fluctuating levels of alertness across a single daytime testing session might lead to measurable changes in the associated patterns of brain activity. Here we used single-pulse TMS delivered over the right motor cortex while simultaneously measuring MEPs and TEPs across different, objectively defined levels of alertness while eyes-closed participants engaged in a simple tactile perception task. Participants exhibited fluctuating levels of alertness across the testing session, as indexed by continuous EEG recordings, but continued to respond behaviourally even in relatively deep states of drowsiness corresponding to the onset of NREM Stage 1 sleep (Alertness Level 5). Strikingly, both motor evoked responses and TMS-evoked cortical reactivity were altered across different levels of alertness. Specifically, we found that MEP amplitudes peaked during EEG flattening (Alertness Level 4), whereas TEP cortical reactivity increased earlier and remained stable across Alertness Levels 4 and 5. Our findings highlight that a proportion of inter-trial variability in neurophysiological responses to TMS, in particular TEP amplitude, can potentially be attributed to spontaneous fluctuations in alertness during wake-to-sleep transition.
In most recent MEP studies, a significant association between pre-TMS alpha oscillations and MEP amplitude was linked to the sensorimo-tor mu-rhythm, which shows spatially local effects confined to one or two pre-TMS cycles of mu Hussain et al., 2018 ;Madsen et al., 2019 ;Ogata et al., 2019 ;Thies et al., 2018 ;Zrenner et al., 2018 ). For instance, a recent study by Ogata et al. (2019) found a significant positive association between pre-TMS alpha and MEP amplitude in an eyes-open condition, but not in an eyes-closed condition. While this could be interpreted in terms of an alertness difference between eyes-open and eyes-closed conditions, the reported alpha effect was typically restricted to the − 250 ms to 0 ms time window and the motor cortex stimulation site, pointing to the sensorimotor mu-rhythm. Contrary to this, our pre-TMS EEG effects were spread both in the prestimulation time and electrode space (see Fig. A.4), indicating that slow fluctuations in alertness rather than sensorimotor mu-rhythm was driving the observed MEP changes. Thus, our study extends previous research by demonstrating that changing levels of alertness could be an important factor in brain state-modulation of corticospinal excitability. A related effect was reported by Zarkowski et al. (2006) , who demonstrated that MEP amplitude is negatively correlated with pre-TMS alpha power (10-13 Hz) and positively correlated with pre-TMS gamma power (30-60 Hz), with an alpha/gamma ratio being the strongest predictor of MEP amplitude. While a theta/alpha ratio can index the level of alertness in eyes-closed experiments, such as in the present study, the alertness-indexing frequencies are shifted upward in eyesopen paradigms ( Eoh et al., 2005 ;Kaida et al., 2006 ;Zhao et al., 2012 ). It is therefore likely that the results of Zarkowski et al. (2006) , similarly to our study, were influenced by changing levels of alertness in the participant sample.
Linear regression analysis revealed that unique variance of singletrial MEP peak-to-peak amplitude explained by EEG / ratio was on average just 2%. However, this could be due to the non-linear association between MEP amplitude and alertness, i.e. the increase of MEP amplitude at Alertness Level 4. Our finding of non-linear changes in MEP amplitude with decreasing levels of alertness might also explain previous contradictory findings regarding sleep deprivation effects on MEP amplitude. While several studies have reported an increase in corticospinal motor threshold following sleep deprivation ( Manganotti et al., 2001 ;De Gennaro et al., 2007 ), other studies have failed to find any such effect ( Civardi et al., 2001 ;Manganotti et al., 2006 ). Arguably, due to individual differences in instantaneous drowsiness levels, and potentially different times of day and durations of testing, the dominant level of alertness varied between these studies, confounding their comparison. For instance, datasets with a relatively high proportion of trials obtained during Alertness Level 4 would likely indicate higher MEP amplitude compared with other datasets. Unfortunately, a fine-grained measurement of alertness is seldom undertaken in MEP studies, even when EEG is recorded, e.g., "sleepiness " or NREM Stage 1 sleep are usually treated as a uniform state ( Manganotti et al., 2004 ), even though a more detailed analysis can reveal at least 4 micro-states within N1 sleep ( Hori et al., 1994 ; see Fig. 1 (C)).
Regarding the modulation of TEPs with sleepiness, Huber et al. (2013) observed an increase in TEP amplitude as a function of prolonged wakefulness as well as following sleep deprivation. Contrary to our results, however, they found no association between TEP amplitude and short-lasting episodes of drowsiness. This discrepancy could be attributable to the fact that Huber et al. (2013) followed a behavioural definition of drowsiness (specifically, performance in a visuomotor tracking task), kept behavioural sessions much shorter (2-3 min) and instructed their participants to keep their eyes open. Furthermore, while the sessions in Huber et al. (2013) were likely too short for drowsiness to develop following the baseline night, it is feasible that TEP amplitude increase following sleep deprivation was at ceiling, and participants' instantaneous level of alertness could not modulate it any further. Contrary to this, we used fine-grained EEG measures of alertness that could be quantified independently of fluctuations in behaviour, our testing sessions were longer, and participants were instructed to close their eyes, which facilitated spontaneous fluctuations in alertness. In another recent study, TEP amplitude was found to depend on the interaction between sleep pressure and phase of the circadian cycle rather than sleep homeostasis alone ( Ly et al., 2016 ). Furthermore, the same study found that an increase in TEP amplitude was associated with an increase in EEG theta power across 29 h of sustained wakefulness. Unfortunately, the authors did not report whether such an association held over a shorter period of time (e.g., 45-90 min), as would be the duration of experiments in typical TMS studies. Our study thus complements and extends these previous reports by demonstrating a much more rapid increase of TEP amplitude in response to spontaneous fluctuations of alertness.
The TEP amplitude increase we observed during wake-to-sleep transition is reminiscent of the TEP amplitude increase previously found for reduced levels of alertness, such as during NREM sleep and anaesthesia ( Ferrarelli et al., 2010 ;Massimini et al., 2005Massimini et al., , 2012Sarasso et al., 2015 ;Tononi and Massimini, 2008 ). Arguably, our findings reflect the earliest stages of TEP modulation as a function of alertness, with slow wave (0.5-2 Hz) dominated brain states placed at the other side of an alertness continuum. Sleep and anaesthesia-related increases in early TEP amplitude likely reflect facilitation of a stereotypical, local mode of processing ( Ferrarelli et al., 2010 ;Massimini et al., 2005Massimini et al., , 2012Sarasso et al., 2015 ;Tononi and Massimini, 2008 ). We show that such a shift toward local processing starts developing while participants are still conscious of the experimental setup and able to respond behaviourally, i.e. well before they reach unresponsive sleep.
In the present study, we showed that MEP and TEP amplitudes are positively associated at a single trial level. While such a relationship has been reported previously ( Fecchio et al., 2017 ;Petrichella et al. 2017 ), here we extended earlier studies by showing that the MEP and TEP association is non-linearly dependent on alertness: specifically, a reliable association was observed at Alertness Level 3 but not at Levels 1-2 or 4-5. We also observed a gradual increase in inter-trial variability of TEP amplitude across decreasing levels of alertness, whereas variability in MEP amplitude did not show a consistent change as a function of alertness. These observations suggest that the relationship between alertness levels and TMS-evoked neural responses is non-monotonic and can depend on a particular neurophysiological index as well as the level of alertness. If possible, alertness should be controlled at a single-trial level when experiments involve hundreds of trials delivered over a prolonged testing session, especially when participants keep their eyes closed. Even though standard statistical measures of central tendency can reduce the effect of alertness when carrying out within-participant contrasts, more trials are required to offset alertness-related variability, and having a reliable measure of participants' level of alertness could reduce the length of such experiments. Furthermore, measures of central tendency cannot eliminate alertness confounds from between-participant or longitudinal comparisons. For instance, patients or older individuals might have consistently lower or higher levels of alertness compared with healthy controls.
While the change in TEP amplitude across different levels of alertness can be interpreted in terms of cortical reactivity/excitability, an alternative explanation is that TEP increases were triggered by a phasereset of pre-TMS oscillations ( Herring et al., 2015 ;Kawasaki et al., 2014 ;Thut et al., 2011 ). It is conceivable that a phase-reset of relatively slow theta oscillations in drowsy trials, as opposed to the faster alpha oscillations in awake trials, could underlie the increase in TEP amplitude we observed. On the other hand, even though the phase of spontaneous muoscillations predicts TEP amplitude, single-pulse TMS does not seem to reset the phase of mu-oscillations ( Desideri et al., 2019 ). Clearly, more research is needed to investigate the potential role of phase-resetting in the generation of TEPs ( Pellicciari et al., 2017 ), and in this context we argue that the transition from wake to sleep provides an ecologically valid model to study it.
Several strategies exist for dealing with changing levels of alertness during behavioural testing. If alertness decreases throughout a session, an additional factor of trial or block number could be added to a statis-tical model as an alertness regressor or covariate. However, we found that an initial decrease in alertness did not persist throughout the testing session, and participants tended to "oscillate " between awake and drowsy periods (see Fig. A.2), which precludes any straightforward inference of decreasing alertness over the course of a single testing session. Alternatively, reaction times (RTs) could be used as a behavioural index of alertness in active TMS experiments, as RTs typically lengthen with decreases in vigilance ( Schmidt et al., 2009 ), increases in drowsiness ( Ogilvie and Wilkinson, 1984 ), and following sleep deprivation ( Ratcliff and Van Dongen, 2011 ). However, such a strategy would not be possible in passive paradigms in which participants are not required to respond ( Gordon et al., 2018 ;Massimini et al., 2005 ). Furthermore, as both trial counts and reaction times are relative measures, participants who maintain high alertness throughout a session could be falsely labelled as drowsy in a proportion of trials. Arguably, concurrent EEG recording should be the gold standard for assessment of single-trial alertness, as it provides quantifiable and reliable signatures of instantaneous brain-states, including alpha-and theta-derived Hori stages of sleep onset ( Hori et al., 1994 ). As an alternative to the tedious manual scoring of Hori stages (Alertness Levels), an automated EEG method based on wakefulness and sleep grapho-elements is available for the detection of drowsiness from EEG data ( Jagannathan et al., 2018 ). While methods that weight the dominance of EEG theta and alpha oscillations are suitable for eyes-closed paradigms ( Hori et al., 1994 ;Jagannathan et al., 2018 ), such as resting state or phosphene studies ( Bonnard et al., 2016 ;De Graaf et al., 2017 ), the power of higher EEG frequencies should be considered when assessing alertness during active eyes-open experiments ( Eoh et al., 2005 ;Kaida et al., 2006 ;Zhao et al., 2012 ). Finally, when EEG measurements are not available or feasible, TMS experiments could be carried out in short blocks of just a few minutes each (e.g., 3-5 min) and inter-block intervals could be used to assess instantaneous subjective sleepiness, for example by asking participants to undertake the 9-graded Karolinska Sleepiness Scale ( Åkerstedt and Gillberg, 1990 ;Kaida et al., 2006 ). Future studies should compare other methods for capturing the contribution of fluctuating levels of alertness to data variance in TMS studies.
Here we used an eyes-closed, lights-dimmed behavioural paradigm with a long inter-pulse interval and an explicit instruction that participants could fall asleep should they feel the urge to do so, in order to promote sufficient episodes of drowsiness for statistical analysis. As such, our paradigm likely facilitated transitions from wake to sleep and induced much larger fluctuations in alertness than standard TMS-EEG-EMG setups. One implication is that researchers should exercise caution in applying our conclusions to other testing scenarios. To extend the generalisability of the current findings, future studies should employ eyesopen paradigms with more rapid stimulation to investigate the effects of alertness fluctuations on TMS-induced changes in cortical reactivity. It will also be important to investigate whether neurophysiological effects of decreasing alertness are limited to individuals who are likely to fall asleep in a situation of prolonged inactivity, as was the case for most of our participants.
While our participants used earplugs to minimise TMS-related auditory evoked potentials, and we removed auditory components and restricted TEP analyses to the time window < 50 ms, traditionally known to be free from auditory contamination ( Nikouline et al., 1999 ;Tiitinen et al., 1999 ), several recent TMS studies have highlighted the impact of sensory co-stimulation confounds, especially in the somatosensory domain, even at the earliest time windows Conde et al., 2019 ). Given the growing concern about inadvertent sensory co-stimulation in TMS experiments Siebner et al., 2019 ), the addition of a realistic sham condition would be desirable in future TEP studies, preferably with separate auditory and somatosensory conditions. As a final point, we observed that the strength of association between theta/alpha ratio and MEP slope fluctuates at a 1 Hz frequency (see Fig. A.4 A). Given the lack of previous literature on the association between theta/alpha ratio and MEP, we cannot be certain whether it is a genuine phenomenon or a statistical coincidence; hence, it should be replicated before drawing any firm conclusions.
To conclude, our findings challenge the widely held assumption that the cortex is maintained in a more or less "steady state " when participants undertake experimental investigations of perceptual, cognitive or motor function. Our findings demonstrate that spontaneously occurring fluctuations in alertness differentially modulate cortical reactivity over relatively short durations, even when participants are tested during day time hours when they would normally be awake and performing typical activities of daily living. Our study highlights the importance of controlling for spontaneous fluctuations in alertness at a single trial level in non-invasive brain stimulation studies.

Declaration of Competing Interest
The authors declare no competing interests.