Bridging the gap: TMS-EEG from lab to clinic

The combination of transcranial magnetic stimulation (TMS) and electroencephalography (EEG) has reached technological maturity and has been an object of significant scientific interest for over two decades. In parallel, accumulating evidence highlights the potential of TMS-EEG as a useful tool in the field of clinical neurosciences. Nevertheless, its clinical utility has not yet been established, partly because technical and methodological limitations have created a gap between an evolving scientific tool and standard clinical practice. Here we review some of the identified gaps that still prevent TMS-EEG from moving from science laboratories to clinical practice. The principal and partly overlapping gaps include: 1) complex and laborious application, 2) difficulty in obtaining high-quality signals, 3) suboptimal accuracy and reliability, and 4) insufficient understanding of the neurobiological substrate of the responses. All four aspects need to be satisfactorily addressed for the method to become clinically applicable and enter the diagnostic and therapeutic arena. In the current review, we identify steps that might be taken to address these issues and discuss promising recent studies providing tools to help bridge the gaps.


Introduction
The combination of transcranial magnetic stimulation (TMS) and electroencephalography (EEG) has been around for over two decades (Ilmoniemi et al., 1997;Izumi et al., 1997;Rossini et al., 1991;Virtanen et al., 1999). It is the first non-invasive method allowing the direct probing of neuronal excitability and connectivity of the human brain by providing the exact time and site of cortical activation by focused TMS and measuring the ensuing responses in the time domain via EEG. In the early days of TMS-EEG, the issues related to measurement of cortical reactions to TMS involved the major artifacts (Ilmoniemi et al., 1997;Izumi et al., 1997;Rossini et al., 1991) and safety of use, e.g. due to temperature rise in the electrodes (Roth et al., 1992;Virtanen et al., 1999). Over time, suitable electrodes and amplifiers have provided the technical means to conduct TMS-EEG experiments on a larger scale. More recently, technical efforts have focused on identifying and interpreting proper TMS-induced EEG responses using software signal processing and analysis (Bertazzoli et al., 2021;Mutanen et al., 2018;Rogasch et al., 2017;Wu et al., 2018). While the number of TMS-EEG studies in the neuroscientific field is on the rise, the method is still mostly a scientific tool with little clinical relevance and lacks standardization of protocols (Fried et al., 2021).
As the major technical challenges have been overcome, numerous clinical feasibility studies and studies identifying and proposing novel biomarkers based on TMS-EEG are being conducted, strengthening the evidence for clinical utility. However, TMS-EEG has not yet made its way into clinical use despite the compelling evidence provided in the scientific literature. In this review, we weigh the clinical evidence supporting potential clinical applications of TMS-EEG and identify the gaps that prevent TMS-EEG, currently a neuroscientific tool, from evolving into an established clinical tool.

Clinical prospects of TMS-EEG
The potential clinical applications of TMS-EEG have been comprehensively and critically assessed in a recent review article (Tremblay et al., 2019). The key promising areas identified include, inter alia, disorders of consciousness, psychiatric diseases, epilepsy and neurodegenerative disorders (e.g. Alzheimer's and Parkinson's disease).
Disorders of consciousness as a result of brain lesions (e.g. acute strokes or severe traumatic brain injuries) include unresponsive wakefulness syndrome (UWS) or vegetative state and minimally conscious state (MCS) (Tremblay et al., 2019). From a neurophysiological perspective, these disorders are ascribed to disruption of local and global cortico-cortical information processing which in turn degrades the connectivity and complexity of the overall brain network. TMS-EEG is a particularly well-suited method for investigating these abnormal brain states because it does not rely on the active participation of the patient or the integrity of sensori-motor and language networks. In addition, it allows the exploration of causal interactions between interconnected cortical areas (that is, the probing of effective brain connectivity) in a precise spatio-temporal manner (Luppi et al., 2021;Sarasso et al., 2014).
In a series of clinical studies, a TMS-EEG based measure called the perturbational complexity index (PCI, Casali et al., 2013), which estimates the amount of information in the integrated EEG responses to perturbational magnetic stimuli, was highly successful in detecting MCS patients (94.7% sensitivity) and stratifying UWS patients into prognostically distinct categories (Casarotto et al., 2016). The PCI classification in a cohort of 24 patients with disorders of consciousness showed a high level (91.6%) of agreement in diagnosis with 18F-fluorodeoxyglucose positron emission tomography (FDG-PET), a validated tool, further indicating the clinical usefulness of the method, especially if the two methods were to be combined (Bodart et al., 2017). There is currently compelling evidence that TMS-EEG is an excellent tool for probing the functional state of thalamocortical circuits and represents a highly promising diagnostic and prognostic biomarker in patients with various unconscious states, as was systematically reviewed by Arai et al. (2021).
A wide spectrum of psychiatric diseases has been extensively investigated with TMS-EEG, including schizophrenia, mood disorders, substance use and attention deficit hyperactivity disorders. In schizophrenia, TMS-EEG studies provided pathophysiological insight by demonstrating altered cortical inhibition as well as abnormal generation and modulation of gamma oscillations (Kim et al., 2020;Tremblay et al., 2019). In addition, a recent systematic review concluded that TMS-EEG can be applied to develop objective diagnostics and prognostics for schizophrenia, as well as to improve therapeutic strategies. On the other hand, it should be pointed out that studies directly linking TMS-EEG responses to clinical ratings are still relatively few, and strong evidence of robust biomarkers is lacking (di Hou et al., 2021). In mood disorders, TMS-EEG elucidated the neurophysiological effects of antidepressant treatments, including rTMS, and may help in applying personalized approaches by identifying and selecting those patients who are likely to respond to a particular type of treatment (Dhami et al., 2021). In substance use and attention deficit hyperactivity disorders, TMS-EEG disclosed abnormal cortical reactivity and connectivity (Cao et al., 2021). All in all, numerous studies in the diverse groups of psychiatric diseases support the clinical utility of TMS-EEG as a means for unravelling pathophysiological mechanisms and monitoring the effects of therapeutic interventions as well as for personalizing treatments and predicting their outcome (Tremblay et al., 2019). Noda concluded that neurophysiological indicators obtained with TMS-EEG may be able to predict the response of neuropsychiatric patients to a specific type of treatment in the near future (Noda, 2020); however, the clinical feasibility has not yet been estimated.
Overall, in psychiatric disorders, the outcomes of TMS-EEG studies have been well reviewed, but studies replicating findings in larger populations of patients are largely missing. As a result, a confident estimation of clinical feasibility cannot be made. Most of the previous studies in this research area, while demonstrating overlap in their findings, have primarily aimed at providing new scientific insights into neuropsychiatric disorders instead of evaluating clinical feasibility. Therefore, replicating the most prominent TMS-EEG outcomes and demonstrating the clinical feasibility of the method remains a great challenge for the near future.
In epilepsy, paradoxically, there is currently no reliable diagnostic and prognostic biomarker (Engel, 2008). Scalp EEG is the principal diagnostic test but is characterized by suboptimal sensitivity due, in part, to the subjective, operator-dependent interpretation of the results. In addition, EEG cannot reliably prognosticate which pharmacological intervention will be most effective and well-tolerated or who will suffer from recurrent seizures following antiseizure medication withdrawal. In recent years, TMS-EEG emerged as a potential novel biomarker in epilepsy, promising to increase the diagnostic and prognostic yield of EEG (Kimiskidis, 2016). There are theoretical advantages of TMS-EEG compared to conventional EEG, including the objective, operator-independent analysis of TMS-EEG data as well as the fact that it is a prime example of an active paradigm for exploring brain function. Accumulating evidence suggests that active paradigms reveal features of brain states, for instance features of transition to the epileptic state, that are not readily apparent using nonprovocative recordings of brain activity.
A limited number of studies provided evidence that TMS may provoke abnormal EEG responses, including epileptiform discharges (EDs), in patients with epilepsy, which may be used for diagnostic and prognostic purposes. In focal epilepsy, TMS-EEG reliably identified the epileptogenic zone (Valentin et al., 2008) and disclosed the existence of cortical hyperexcitability in gray matter heterotopias (Shafi et al., 2015). In genetic generalized epilepsy, TMS-EEG offered insight into the pathophysiological mechanisms of epilepsy and provided sensitive and specific biomarkers for the diagnosis of epilepsy and the prediction of response to treatment with antiseizure medications (Kimiskidis et al., 2017). An additional application of TMS-EEG relates to the monitoring of the acute abortive effects of magnetic stimuli on epileptiform discharges (Rotenberg et al., 2009). All these data emphasize the significant potential of TMS-EEG as a clinical tool in epilepsy.
Alzheimer's disease (AD) is the most common dementia, and major efforts have been invested to explore the potential of TMS-EEG as a clinically applicable means to predict conversion from mild cognitive impairment (MCI) to AD (Ferreri et al., 2021;Tremblay et al., 2019). Early results on small populations of MCI and AD patients versus healthy controls demonstrated a direct relation between the TMS-evoked potential (TEP) P30 component amplitude and cognitive performance (Casarotto et al., 2011;Julkunen et al., 2011;Julkunen et al., 2008a), while more recent results have demonstrated an inverse relation both in the M1 and DLPFC (Bagattini et al., 2019;Ferreri et al., 2016). The discrepancy may relate to medication effects. A recent 6-year prospective study by Ferreri et al. (2021) investigated the conversion from amnestic MCI to AD by means of TMS-EEG and concluded that a parameter discriminating patients converting from MCI to AD was the so-called stability of the dipolar activity, which reflects time-specific alterations in global TMS-induced activity. Thus far, TMS-EEG studies on AD have been conducted in a relatively small number of patients and evidence is still scarce. Accordingly, the use of TMS-EEG in AD diagnostics is still far from clinical application, but the clinical prospects are promising.
In Parkinson's disease (PD), preliminary results on transient normalization of dysfunctional beta oscillatory activity in the sensorimotor area have been recently reported with TMS-EEG (Formaggio et al., 2021). From a diagnostic point of view, Maidan et al. (2021) found that a combination of TMS-EEG measures was able to identify PD patients with high accuracy (area-under-curve in receiver operating characteristic, ROC, was 0.898) indicating clinical potential in the early detection of PD. Overall, the TMS-EEG applications for PD diagnostics are still far from clinical applications.
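To make the area-under-curve figure quoted above easier to interpret, the following minimal sketch computes the ROC AUC as the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control (the Mann-Whitney formulation). The function name `roc_auc` and the scores are hypothetical illustrations and are not taken from Maidan et al. (2021) or any other cited study.

```python
def roc_auc(scores_patients, scores_controls):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the fraction of (patient, control) pairs in which the patient
    has the higher score; ties count as one half.
    """
    wins = 0.0
    for p in scores_patients:
        for c in scores_controls:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(scores_patients) * len(scores_controls))


# Hypothetical classifier outputs (arbitrary units), NOT data from any study.
patients = [0.9, 0.8, 0.7, 0.4]
controls = [0.6, 0.3, 0.2, 0.1]
auc = roc_auc(patients, controls)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.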
Internal ongoing brain states influence the way evoked activity convolves with the stimulus, and the effects on both cortico-cortical and cortico-spinal connectivity can be extremely different depending on the instantaneous internal state at the moment of stimulation. Therefore, different directions of plasticity modulation (long-term potentiation or long-term depression) may be determined by the instantaneous brain state at the time of rTMS administration (Baur et al., 2020). Real-time EEG state-dependent stimulation has been administered at the troughs of the rolandic mu-alpha oscillation over the lesioned sensorimotor cortex in stroke patients, and at the same phase of the alpha oscillation generated by the left dorsolateral prefrontal cortex in depressed patients, with promising results (Zrenner et al., 2020). One objective of effective EEG state-dependent TMS in therapeutic set-ups is the reduction of interindividual variability in the responses to stimulation. TMS-EEG trials have so far mostly used standard ("one-size-fits-all") protocols. This has led to quite variable responses to TMS treatments. To be optimally effective, real-time, EEG state-dependent TMS protocols should be adjusted individually, taking into account the patient's ongoing brain-state fluctuations as well as the individual brain-network profile at both the anatomical and functional level (Baur et al., 2020;Gordon et al., 2021). Overall, there appears to be great potential in individualized, state-dependent TMS-EEG at least in stroke, depression, Parkinson's disease (McNamara et al., 2020) and possibly in epilepsy. However, the evidence needs further support and replication prior to clinical application.
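The idea of timing stimulation to the troughs of an ongoing oscillation can be illustrated offline with a deliberately simplified sketch. It assumes a noise-free, already band-limited 10 Hz signal and marks troughs as local minima; the sampling rate, frequency and function name `find_troughs` are illustrative assumptions. Real-time systems such as those in the cited studies must instead predict the phase ahead of time from noisy data, which this sketch does not attempt.

```python
import math


def find_troughs(signal, sampling_rate_hz):
    """Return the times (in seconds) of local minima of `signal`.

    A sample counts as a trough when both neighbours are strictly larger.
    """
    times = []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] > signal[i] and signal[i + 1] > signal[i]:
            times.append(i / sampling_rate_hz)
    return times


# Simulated, noise-free oscillation (hypothetical values, not recorded EEG).
FS = 1000   # sampling rate in Hz (assumed)
F_MU = 10   # oscillation frequency in Hz (assumed)
mu = [math.cos(2 * math.pi * F_MU * n / FS) for n in range(FS)]  # 1 s of data

# Candidate trigger times: one per oscillation cycle, at the negative peak.
trough_times = find_troughs(mu, FS)
print(trough_times)
```

In this idealized case one candidate trigger is found per cycle, spaced by the oscillation period.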

Gaps identified at present between laboratory research and clinical application of TMS-EEG
At present, we have identified the following challenges in transferring the use of TMS-EEG from lab to clinics (in "chronological" order of practical application): 1) simplicity of application, 2) obtaining high-quality signals, 3) high accuracy and reliability, and 4) understanding of the neurobiological substrate of the responses (Fig. 1). These challenges are not independent from each other; simplicity of application is influenced by the requirement of proper and meticulous preparation to reach sufficient signal quality. This in turn affects the reproducibility of the resulting responses and complicates the interpretation of responses. While the lack of standardized routines has been identified as a common denominator, standardization within each gap has its own characteristics: 1) standardized application implies a particular setup of TMS-EEG, e.g. type of electrodes, number of electrodes or how to set up the wires, 2) standardization in signal quality refers, for instance, to setting a minimum sampling frequency, or the use of a certain type of amplifier, or standard signal cleaning and analysis procedures, 3) standardization in accuracy and reliability could mean setting criteria for calculating and reporting these parameters, which are essential for planning clinical trials, and 4) standardization in understanding of the responses could mean firmly establishing the biological underpinnings and the clinical relevance of the responses as well as setting specific criteria for the presentation and parametrization of the responses.
The methods of clinical interpretation of TEPs have not yet reached the majority of specialists in clinical neurophysiology or neurology, and therefore the number of clinical TMS-EEG experts is low. Expertise in TMS-EEG is not expected from the clinical staff and is considered an advanced skill (Fried et al., 2021). Nevertheless, the expertise is expected to increase if clinical feasibility is demonstrated and clinical applications become standardized and widely available. The above challenges regarding clinical applications mostly refer to the diagnostic use of TMS-EEG, while EEG can also be used to identify brain oscillations to synchronize with TMS (Schaworonkow et al., 2019;Stefanou et al., 2018;Zrenner et al., 2020). Previously, Tremblay et al. (2019) reviewed the clinical utility of TMS-EEG and identified clinically relevant challenges that are related to the control and design of clinical trials, such as influence of medications, coil positioning, stimulation intensity (SI), and control conditions, as well as lack of standardized signal processing. The challenges we have identified are the gaps that need to be filled in order to successfully translate TMS-EEG from laboratory research applications to wide-use clinical applications.

Simplicity of application
For clinical pipelines, the key issue is the possibility to preprocess offline all the information that does not need to be computed at the bedside. Of course, a trade-off needs to be found between the accuracy of the measurements and the simplicity of the recording pipeline. The procedure of TMS-EEG application includes multiple stages, as reviewed by Farzan et al. (2016). In a recent training guideline, setting up a concurrent TMS and EEG recording was not required from the clinicians or technicians, indicating that there is no need for such skills within the clinical staff yet (Fried et al., 2021). It has been well recognized in the neuroscientific setting that TMS-EEG recordings generally require specialized equipment (Ilmoniemi and Kicic, 2010;Varone et al., 2021;Veniero et al., 2009;Virtanen et al., 1999), and experience shows that time and effort also need to be invested in preparation prior to the experiments. Multichannel EEG recordings require thorough preparation even before an experiment begins, so that all impedances at the electrode-skin interface are low and similar across channels to avoid potentials arising between leads outside the head and coupling with the measured EEG (Ilmoniemi and Kicic, 2010;Julkunen et al., 2008b). In addition, controlled and shielded conditions are required to avoid noise and other external signals coupling with the meaningful EEG signal. To minimize the coupling of the TMS artifact, the EEG leads need to be positioned optimally, in a radial way (Sekiguchi et al., 2011;Zipser et al., 2018).
In addition, because the loud click of the coil causes auditory evoked potentials in the EEG signal, additional control measures may be required to study TEPs specifically, for example the use of ear plugs or masking noise played via headphones (Fuggetta et al., 2005;Julkunen et al., 2008a;Paus et al., 2001;ter Braack et al., 2015) and foam placed between the coil and the electrodes to suppress bone conduction (Massimini et al., 2005;ter Braack et al., 2015).
TMS-EEG combines two very different technologies, and they need to work in synchrony to enable an adequate measurement of TEPs, i.e. TMS needs to trigger EEG or vice versa to provide proper timing for TEP recording and for averaging individual responses. The EEG technology is intended to record a very low-amplitude signal as free from surrounding signals as possible, while TMS is intended to produce a short and high-amplitude electromagnetic pulse that can activate the brain. While the principles and core aims of EEG and TMS are not in conflict, the high-amplitude electromagnetic pulse produced during TMS interferes greatly with the EEG signal being recorded, causing artifacts both directly, by coupling to the EEG leads and electrodes, and indirectly, by causing scalp or facial muscle activation and reactive eye blinks (Rogasch et al., 2014). The consideration of TMS in EEG recording has resulted in the development of suitable technology, with a new generation of amplifiers that can cope with the extremely large electric signal generated by TMS in the electrodes (Ilmoniemi and Kicic, 2010;Varone et al., 2021). Thus, the special technology affects the simplicity of application, as a common EEG device in routine clinical use is insufficient for recording TEPs, while devices meant for recording evoked potentials in clinics often lack a sufficient number of inputs/channels.
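The role of the trigger in TEP averaging can be sketched as follows: each TMS trigger marks a sample in the continuous EEG, a fixed window around every trigger is cut out, and the windows are averaged sample by sample. This is a minimal single-channel illustration under simplifying assumptions (no artifact handling, no baseline correction); the function name `average_epochs` and the toy recording are hypothetical.

```python
def average_epochs(signal, trigger_samples, pre, post):
    """Average segments of `signal` time-locked to each trigger.

    pre, post: number of samples kept before and after the trigger sample.
    Epochs that would run past either end of the recording are skipped.
    """
    epochs = []
    for t in trigger_samples:
        start, stop = t - pre, t + post
        if start >= 0 and stop <= len(signal):
            epochs.append(signal[start:stop])
    if not epochs:
        raise ValueError("no complete epochs found")
    n = len(epochs)
    return [sum(e[i] for e in epochs) / n for i in range(pre + post)]


# Toy single-channel recording (hypothetical values, arbitrary units).
signal = [0.0] * 40
triggers = [5, 15, 25, 35]
for k, t in enumerate(triggers):
    signal[t + 2] += 1.0                          # response locked to the pulse
    signal[t + 1] += 0.5 if k % 2 == 0 else -0.5  # fluctuation differing per trial

tep = average_epochs(signal, triggers, pre=2, post=4)
print(tep)
```

The time-locked deflection survives averaging, whereas activity that varies in sign from trial to trial cancels out, which is why accurate trigger timing matters.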
Multichannel recordings (for instance, comprising 50-100 channels) enable source estimation of the evoked neural activation and connectivity analysis (Bagattini et al., 2019;Bortoletto et al., 2015;Fuggetta et al., 2005;Ilmoniemi and Kicic, 2010;Ilmoniemi et al., 1997;Julkunen et al., 2008a;Massimini et al., 2005;Massimini et al., 2009;Rogasch et al., 2017). However, this is not always implementable in a clinical environment due to time and procedure constraints. Multichannel recordings and source analysis also aid in signal processing when large muscle artifacts are present (Haufe et al., 2014;Mutanen et al., 2016). On the other hand, depending on the aim of the TMS-EEG application, recording settings with very few channels may suffice, e.g., when utilizing TMS-EEG for brain-state dependent stimulation (Baur et al., 2020;Hussain et al., 2019;Stefanou et al., 2018;Torrecillos et al., 2020;Zrenner et al., 2018). When source estimation via multichannel EEG is to be implemented, some of the prerequisites can be processed offline in advance and then employed during the measurement. For instance, if an anatomical MR scan is available, segmentation of the cortical sheets representing white matter and gray matter boundaries can be performed offline and used as a source space for real-time online source localization of signals. More simply, the MR scan can be used for informed neuronavigation, which is likely more decisive for functional accuracy in patients than in healthy subjects. For neuronavigation purposes and targeting TMS, brain atlases or standardized coordinates can be utilized with MRI to aid identification of anatomical and functional areas (Hui et al., 2020;Reijonen et al., 2021). On the other hand, if an individual MRI is not available, individualized template MRIs could be an alternative (Fleischmann et al., 2020).
The signal processing of TEPs currently requires major effort, with multiple semi-automated steps along with several analyzer-dependent decisions (Rogasch et al., 2017), which take time. For artifact rejection, a laborious part of the processing, an automated tool has been proposed; however, the differences in resulting TEPs with the automated tool appear greater than with the available manual/semiautomatic tools (Bertazzoli et al., 2021). The interpretation of the resulting TEP responses requires skills and experience even after the processing steps, as the TEPs are greatly dependent on the stimulus location (Casarotto et al., 2010;Lioumis et al., 2009).
An issue which has recently raised discussion in the scientific community regards the opportunity of "sham" sessions in TMS-EEG experiments to control for somatosensory components due to the stimulus, which superimpose on the genuine cortical response to TMS. A sham approach can simulate the sensory components of the stimulus without generating an induced current in the cortex. The signal that remains after subtracting the "sham" response from the TMS-EEG response can be considered true TMS-evoked activity. Despite complications due to doubling the sessions for patients, "sham" control would enhance the reliability of TMS-EEG read-outs.
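The sham subtraction described above amounts to a sample-by-sample difference between the averaged real and sham responses. The sketch below assumes that sensory and cortical contributions add linearly and that both averages share the same time axis; these are simplifying assumptions, and the function name `subtract_sham` and the values are hypothetical.

```python
def subtract_sham(real_tep, sham_tep):
    """Return the sample-wise difference real minus sham.

    Both inputs are averaged responses on the same time axis
    (same length, same latency per index).
    """
    if len(real_tep) != len(sham_tep):
        raise ValueError("real and sham responses must have equal length")
    return [r - s for r, s in zip(real_tep, sham_tep)]


# Hypothetical averaged responses in arbitrary units, NOT measured data.
real = [0.0, 2.0, -3.0, 1.5]   # real TMS: cortical plus sensory components
sham = [0.0, 0.5, -2.0, 1.5]   # sham: sensory components only
cortical_estimate = subtract_sham(real, sham)
print(cortical_estimate)
```

In this toy case the last sample is fully explained by the sham response, so the subtraction attributes none of it to direct cortical activation.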

Difficulty of obtaining high-quality signal
For a long time, since the first recordings with TMS-EEG, the challenge has been to produce TMS-induced cortical responses of sufficient quality. The challenge arises from the very nature of the combined modalities, that is, multichannel EEG recording that is prone to external electromagnetic field disturbances inducing noise and artifacts in the recorded responses, and TMS, an electromagnetic source, that is used to stimulate the brain. TMS induces a high-amplitude stimulation artifact in the response recorded in the EEG and may saturate the recording or otherwise cover the meaningful recorded EEG signal (Julkunen et al., 2008b;Rogasch et al., 2017;Tomasevic et al., 2017;Veniero et al., 2009;Virtanen et al., 1999). At the same time, TMS induces other physiological artifacts, like a local muscle artifact (Mutanen et al., 2013;Mäki and Ilmoniemi, 2011) due to the TMS causing contraction of small muscles under the scalp and producing high-amplitude muscle activity in the mixture of the measured response. Another typical, indirect reaction to TMS, not elicited via cortical stimulation, is the blink reflex, which also causes an artifact in the measured response (Bertazzoli et al., 2021;Rogasch et al., 2014). These types of artifacts influence the interpretation of the TEPs and typically require processing of the signal (Rogasch et al., 2017;Rogasch et al., 2014). Artifacts may impair the analysis of the data and source localization (Hernandez-Pavon et al., 2012). In addition, due to background activity of the brain, the TEPs measured from the scalp are of low amplitude and require averaging of multiple responses, typically > 100 good-quality responses, to reach a sufficient signal quality and enable interpretation of TEPs (Kerwin et al., 2018;Mancuso et al., 2021).
However, the actual number of trials required depends, in addition to the stimulus parameters, on many aspects affecting the signal-to-noise ratio (SNR), such as the co-operation of the subjects, control of conditions and thoroughness of preparation.
Technology has advanced greatly, especially regarding the reduction of the effect of the TMS-induced electromagnetic artifact, as the amplifier technology in TMS-EEG applications has moved from AC amplifiers to more wide-band DC amplifiers, which more effectively prevent amplifier input saturation after TMS and hence enable faster recovery from the artifact and consequently improved signal quality for low-latency cortical responses following TMS (Tomasevic et al., 2017;Veniero et al., 2009;Virtanen et al., 1999). In addition, an increase in sampling frequency has helped the processing of the recorded signal. In TMS-EEG responses, signal processing is also crucial in removing everything other than the meaningful TEP signal (Rogasch et al., 2014), and openly available tools (Atluri et al., 2016;Bertazzoli et al., 2021;Mutanen et al., 2018;Wu et al., 2018) have been provided for quite effective removal of artifacts, but they are not yet clinically available or approved, and require some technical expertise when applied. Artifact removal has been reviewed in (Rogasch et al., 2014;Tremblay et al., 2019). The preparation of the recording requires expertise in order to minimize the resulting artifacts and to reach a sufficient SNR with a low number of trials (Fig. 2). The number of trials needed for a good-quality signal depends on the SI used, as the amplitude of the TEP components depends on the SI (Casarotto et al., 2010;Komssi et al., 2004;Komssi et al., 2007;Kähkönen et al., 2005;Saari et al., 2018), and greater amplitude peaks emphasize the signal in the SNR. On the other hand, the TMS-induced artifacts are also dependent on the SI (Litvak et al., 2007;Mutanen et al., 2013), so a higher SI also increases the possibility of artifacts in the signal. The number of trials required to record a TEP makes the recording procedure lengthy, limiting the clinical potential when multiple recordings are required.
However, it is difficult to fully estimate the number of required trials in advance without real-time analysis of the TEPs. For instance, in Fig. 2, it is evident that the SNR does not increase greatly after about 55 trials in that specific recording. As the SNR depends on the amplitude of the TEPs, the level of SNR may depend on the location of the recording electrodes and the target of TMS.
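As a point of reference for the trial counts discussed above, the idealized effect of averaging can be written down directly. If the noise were stationary, zero-mean and uncorrelated across trials (an assumption that real recordings such as the one in Fig. 2 need not satisfy), averaging N trials would reduce noise power by a factor of N, i.e. raise the SNR by 10·log10(N) dB. The function names and the decibel values below are hypothetical illustrations only.

```python
import math


def snr_db_after_averaging(single_trial_snr_db, n_trials):
    """Idealized SNR (dB) of an average of n_trials responses.

    Assumes identical signal in every trial and uncorrelated,
    stationary, zero-mean noise.
    """
    return single_trial_snr_db + 10.0 * math.log10(n_trials)


def trials_needed(single_trial_snr_db, target_snr_db):
    """Smallest trial count reaching target_snr_db under the same assumptions."""
    gain_db = target_snr_db - single_trial_snr_db
    return max(1, math.ceil(10.0 ** (gain_db / 10.0)))


# Hypothetical numbers: single-trial SNR of -10 dB, target of +10 dB.
n_required = trials_needed(-10.0, 10.0)
snr_reached = snr_db_after_averaging(-10.0, n_required)
print(n_required, snr_reached)
```

A measured SNR curve that flattens earlier than this idealized relation predicts suggests that residual artifacts or trial-to-trial variability, rather than random noise alone, limit the signal quality.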
Unfortunately, the clinical context puts tighter constraints on the measurement and related set-up than a neuroscientific lab. In some cases, the number of trials is determined by the tolerance of the patient and not by the minimum number of trials to obtain sufficient SNR, hence maintaining patient comfort at the cost of signal quality. While technical inconveniences and sub-optimal number of channels and trials may be significant constraints, several procedures can be optimized through skilled personnel and new approaches to TMS-EEG. In this regard, Parmigiani et al. (2019) have recently proposed a new approach for monitoring online TEPs at single trial through a graphical user interface while measuring TMS-EEG data which could represent an important evolution of the experimental pipelines and contribute to increase EEG SNR even in the presence of a limited number of pulses.

Poor accuracy and reliability
Accuracy stems from sensitivity and specificity. Quite often, sensitivity and specificity are not studied in basic research in terms of clinical accuracy; hence the vast majority of the evidence provided in the literature does not provide the means to evaluate accuracy, and no systematic reviews evaluating the clinical performance of TMS-EEG with diagnostic accuracy as an index exist at present. While accuracy can be increased with tailored pipelines and extended measurements (when possible), the very nature of TEPs has been challenged in recent times (Conde et al., 2019). However, growing evidence is accumulating in favor of a genuine brain response in TEPs (Ozdemir et al., 2021b;Rocchi et al., 2021), which can be disentangled from peripherally elicited afferent activity. Moreover, TEPs are reproducible within individuals (Ahn and Fröhlich, 2021;Lioumis et al., 2009;Ozdemir et al., 2020). As mentioned earlier, preprocessing pipelines may affect data cleaning in different ways, resulting in slightly different TEP outcomes potentially affecting the test-retest reliability of the processing (Bertazzoli et al., 2021).
Fig. 2. TMS targeting the primary motor area of the right hand in a healthy person at a stimulus intensity of 120% of resting motor threshold. A) The TEP is shown at the stimulation site (gray areas mark the locations where signal power was calculated for SNR; the orange area shows the 95% confidence interval from trial to trial). B) The SNR related to the TEP in A) is shown as a function of the number of averaged trials, with TEPs averaged from different numbers of trials displayed in the smaller plot. C) Topographical maps are shown at different latencies after the TMS (maps were scaled from minimum to maximum to maximize the information about the spread of the evolving TEP as latency increases). A manual protocol using the TESA pipeline was used in processing the 60-channel EEG recorded with a NeurOne amplifier at 5 kHz (Bittium Ltd, Kuopio, Finland).
Test-retest reliability of several non-invasive neurophysiological biomarkers represents a common issue which has been systematically assessed only in recent times (Brooks et al., 2021). One key issue affecting TMS-EEG in particular is the ICA cleaning of the data. In several labs, two rounds of ICA are used, the first aimed at eliminating one or two components directly related to the TMS pulse, which, being predominant, tend to obscure other kinds of artifact (Rogasch et al., 2014). The second round aims to eliminate further physiological and technical artifacts. This is a delicate step, since it is analyzer-dependent and makes it difficult to develop a standardized practice for objective clinical evaluation. However, in recent years an automated approach to ICA component removal specifically for TMS-EEG data (ARTIST) has been developed, which could reduce the inter-rater component in the variability of responses. The adoption of such a pipeline could constitute a standard for the preprocessing of TEPs and EEG induced oscillations. Moreover, some cleaning methods based on source localization have been developed (Haufe et al., 2014;Mutanen et al., 2016) and could improve the standardization of TMS-EEG outcomes.
Previous evidence also suggests that the later peaks of the N100-P200 complex are the most reproducible (Bertazzoli et al., 2021;Biabani et al., 2019;Kerwin et al., 2018;Lioumis et al., 2009), but this may actually be due to widespread sensory activation (Bertazzoli et al., 2021;Biabani et al., 2019;Conde et al., 2019;ter Braack et al., 2015;Tiitinen et al., 1999). Casarotto et al. (2010) investigated the repeatability of the TMS-EEG responses by divergence and demonstrated that the TEPs are repeatable from session to session provided that e.g. coil configuration, SI and stimulation site remain similar.
In a clinical trial, Voineskos et al. (2019) evaluated the potential of TEP components to identify patients with major depressive disorder (MDD) and observed that the N45 component of the TEPs evoked at the DLPFC optimally distinguished persons with MDD, at 76.6% accuracy, with the MDD group demonstrating larger components. Kimiskidis et al. (2017), in a phase II study, evaluated the diagnostic accuracy of TMS-EEG in identifying patients with generalized epilepsy as well as in classifying patients responsive to antiepileptic drugs (AEDs), and demonstrated accuracies of 84% and 76%, respectively. Sun et al. (2016) evaluated suicidal ideation following magnetic seizure therapy using the N100 component of recorded TEPs and the long-interval cortical inhibition (LICI) response at the DLPFC, and found that the accuracy in identifying patients with remission of suicidal ideation was 89%. Zipser et al. (2018) found that TEPs and interhemispheric connectivity in patients with early relapsing-remitting multiple sclerosis do not differ significantly from those of controls, apart from a late TEP component, which could potentially represent a biomarker of the disease in its early stages. Hence, certain TEP components have been identified that demonstrate clinical potential.
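To make explicit how a single accuracy figure such as those quoted above relates to sensitivity and specificity, the following minimal sketch computes the indices from the four cells of a two-by-two classification table. The counts and the function name `diagnostic_indices` are hypothetical and are not taken from any of the cited studies.

```python
def diagnostic_indices(tp, fn, tn, fp):
    """Diagnostic indices from a 2x2 table.

    tp/fn: patients classified correctly/incorrectly,
    tn/fp: controls classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),      # fraction of patients detected
        "specificity": tn / (tn + fp),      # fraction of controls correctly rejected
        "prevalence": (tp + fn) / total,    # fraction of patients in the sample
        "accuracy": (tp + tn) / total,      # fraction of all subjects classified correctly
    }

# Hypothetical counts: 50 patients and 50 controls.
idx = diagnostic_indices(tp=40, fn=10, tn=35, fp=15)

# Accuracy is the prevalence-weighted mean of sensitivity and specificity.
weighted = (idx["sensitivity"] * idx["prevalence"]
            + idx["specificity"] * (1 - idx["prevalence"]))
print(idx, weighted)
```

Because accuracy weights sensitivity and specificity by the proportion of patients in the sample, an accuracy obtained in a balanced research cohort does not transfer directly to a clinical population with a different prevalence; reporting sensitivity and specificity separately avoids this dependence.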
The reliability of TEPs has been studied to some extent. For instance, Kerwin et al. found that the reliability of the peaks at 100 and 200 ms is strongest in the left parietal, centroparietal and central regions, and that the earlier peaks N40 and P60 are found within a stimulus trial, but not reliably on different days (Kerwin et al., 2018). Farzan et al. assessed the test-retest reliability of LICI in TMS-EEG at M1 and the DLPFC and reported high reliability and internal consistency at both sites (Farzan et al., 2010).
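Test-retest reliability of a TEP measure is often summarized with an intraclass correlation coefficient (ICC). The sketch below computes the two-way random-effects, absolute-agreement, single-measure form, ICC(2,1), from a subjects-by-sessions matrix. It is offered only as an illustration of one common index, not as the statistic used in the studies cited above; the amplitudes and the function name `icc_2_1` are hypothetical.

```python
import numpy as np

def icc_2_1(x):
    """ICC(2,1) for an array of shape (n_subjects, n_sessions)."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)    # per-subject means
    col_means = x.mean(axis=0)    # per-session means

    # Mean squares of the two-way ANOVA without replication.
    ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)
    resid = x - row_means[:, None] - col_means[None, :] + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical N100 amplitudes (µV) of five subjects in two sessions.
amplitudes = np.array([
    [-4.1, -3.8],
    [-6.0, -6.4],
    [-2.9, -3.3],
    [-5.2, -4.7],
    [-7.1, -6.6],
])
print(f"ICC(2,1) = {icc_2_1(amplitudes):.2f}")
```

The absolute-agreement form penalizes systematic shifts between sessions as well as random variation, which matters when a fixed clinical cut-off is to be applied to the measure.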

Insufficient understanding of the neurobiological substrate of the responses
The interpretation of TMS-EEG responses lacks standardization. Tremblay et al. (2019) reviewed the evidence regarding the sources of TEPs, yet full consensus has not been reached, and the scattered evidence limits a full understanding of the responses. When the stimulation target is M1, the TEPs consist of a reliable and reproducible sequence of positive and negative deflections at approximately 30 (P30), 45 (N45), 60 (P60), 100 (N100) and 180 (P180) milliseconds after the TMS pulse (Fig. 2). Most of the insights collected so far stem from TEP studies on M1. While Mäki and Ilmoniemi (2010) found a positive correlation between the amplitude of the N15-P30 peaks and motor evoked potentials (MEPs), later TEPs are considered measures of mere cortico-cortical effective connectivity and, as such, complementary and unrelated to MEPs. In fact, the early deflections following M1 stimulation are considered to reflect excitatory neurotransmission, as they correlate with MEP amplitudes, but also with SI, coil configuration and neuromodulation with paired-pulse TMS (Bonato et al., 2006;Cash et al., 2017;Ferreri et al., 2011;Komssi et al., 2004;Leodori et al., 2019;Mäki and Ilmoniemi, 2010;Rawji et al., 2021;Rogasch et al., 2013;Saari et al., 2018). Ozdemir et al. (2021b) concluded that TEPs and their causal propagation patterns are highly reproducible at the group level, while at the individual level TEPs differ greatly from the group TEPs and are heterogeneous across subjects. TEPs also appear different when different cortical locations are stimulated (Lioumis et al., 2009;Ozdemir et al., 2021b;Rosanova et al., 2009;Salo et al., 2018), which further complicates their interpretation.
The pharmacological modulation of TEPs has provided some evidence for understanding the cortical processes underlying such evoked potentials (Darmani et al., 2016;Premoli et al., 2014a) and the neuromodulation phenomena observed in the EEG following TMS (Ozdemir et al., 2021a), but the lack of replication of most results has made it difficult to assess the reliability of the findings. Recent studies emphasize the role of neurotransmission in the generation of TEPs, supported by findings related to pharmacological modulation. A number of studies employing pharmaco-TMS-EEG in healthy subjects showed that N45 is related to GABA-A receptor-mediated inhibition: benzodiazepines and zolpidem increased the N45 TEP amplitude (Premoli et al., 2014a;Premoli et al., 2018). Conversely, the experimental compound S44819, a specific antagonist at the alpha-5 subtype of the GABA-A receptor, reduced the N45 TEP (Darmani et al., 2016). N45 has also been shown to be increased by dextromethorphan, although this non-competitive NMDA receptor antagonist has not demonstrated effects on the motor threshold (MT) or MEP amplitude (Wankerl et al., 2010;Ziemann et al., 1998). The topography of the N45 increase due to dextromethorphan was found to be similar to that of benzodiazepines, suggesting that the N45 amplitude reflects the excitation-inhibition balance of postsynaptic potentials evoked by TMS (Darmani et al., 2016;Premoli et al., 2014a;Premoli et al., 2018).
Unlike N45, N100 seems to be unaffected by dextromethorphan, whereas benzodiazepines may or may not decrease it in the non-stimulated hemisphere (Premoli et al., 2014a;Premoli et al., 2018;Voineskos et al., 2019). Taken together, these data suggest a role for N100 in the frontal cortex of the non-stimulated hemisphere as an index of the propagation of cortical activity controlled by GABAergic neurotransmission. The N100 has been considered to be modulated by GABA-B receptors (Premoli et al., 2014a;Roos et al., 2021) as well as by the balance between the concentrations of GABA and glutamate (Du et al., 2018). As the N100 is evoked in various cortical areas (Casarotto et al., 2010;Kerwin et al., 2018;Lioumis et al., 2009), it has been proposed as a marker of cortical excitability in general, with its topographical maximum on the side of the stimulation (Bonato et al., 2006;Löfberg et al., 2013;Roos et al., 2021).
Some TEP components may reflect contamination by auditory stimulation and somatosensory effects due to TMS. Therefore, disentangling true cortical responses to TMS from those due to the concomitant sensory response is essential. TMS evokes early (5-30 ms after the stimulus) cortical potentials located roughly at the stimulation site, which, at subthreshold SI, are considered uncontaminated by indirect sensory processing if the TMS click is properly masked (Russo et al., 2021). The N100 may likewise be considered free of the auditory component if adequate noise masking is provided. As a general rule of thumb, genuine brain responses to TMS are mostly localized in one hemisphere. For instance, N100 is bilateral in the absence of noise masking, but when masking is active this component is smaller and lateralized to one region, whereas indirect physiological responses due to auditory or somatosensory input are bilateral and symmetrical (Löfberg et al., 2013). Gosseries et al. (2015) investigated in a case series the presence of TEPs in severe cortical lesions and found that TEPs were absent when the lesions were stimulated with TMS. Gordon et al. (2018) compared TEPs between sham and true stimulation conditions; while the sham stimulation appeared to generally trigger TEPs, the responses were of much lower amplitude than those induced with real TMS. This result was partly supported by Conde et al. (2019), whose findings nevertheless emphasized the requirement for multisensory control stimulation to distinguish the true cortical responses to TMS.
EEG responses to TMS, or TEPs, in patients are not well characterized in comparison with those of healthy controls. Some studies performed in stroke have characterized the anomalous response of patients to M1 stimulation (Sarasso et al., 2020;Tscherpel et al., 2020). However, for most neuropsychiatric disorders the stimulation target is not the motor area, and responses from non-motor areas are currently being characterized in healthy subjects but are not clearly established as biomarkers of disease. Part of this is due to the fact that the majority of neuropsychiatric patients receive medication to alleviate symptoms. Therefore, one-to-one comparisons with healthy controls are problematic, since certain differences could be drug-related rather than anomalous responses resulting from the underlying disease. Data from patients off medication may be difficult to collect for ethical reasons. While this is a general problem not confined to TMS-EEG measurements, we believe in the necessity of a shared database of EEG responses in neuropsychiatric disorders involving brain regions beyond the motor cortex. Notwithstanding the difficulties in collecting such datasets, the importance of non-motor TEPs as a direct read-out of anomalous cortical activity cannot be overstated. In fact, by comparing TEPs between healthy controls and patients, relevant conclusions could be drawn about the underlying cortical mechanisms of pathological neural circuits.
Beyond TEPs, valuable complementary information can be obtained by decomposing, in the time-frequency domain, responses that are not time-locked to the stimulus (TMS-induced oscillations) (Premoli et al., 2017;Saari et al., 2018). TEPs and TMS-induced oscillations can also be employed as read-outs of changes due to pharmacological treatments in clinical populations if the same TMS-EEG measurement can be repeated on patients off and on medication (Kaskie and Ferrarelli, 2018).
In recent years, the application of paired-pulse TMS and short-latency afferent inhibition (SAI) has significantly enhanced our understanding of the neurobiological substrate of TEPs. For instance, the application of long-interval intracortical inhibition (LICI) with paired-pulse TMS has indicated a widespread inhibition in the TEP components (de Goede et al., 2020;Fitzgerald et al., 2009;Premoli et al., 2018;Premoli et al., 2014b;Rogasch et al., 2015). Previously, de Goede et al. (2020) and Opie et al. (2017) found that, independent of the interstimulus interval of the paired pulses used for inducing LICI (i.e. 100-300 ms), the later components of the TEP (N100 and P180) were dampened because of LICI, whereas the early components were maintained. LICI is commonly linked with GABAergic interneuron activity, and with the activation of GABA-B receptors during the second stimulus (test pulse) in particular (McDonnell et al., 2006). In one study, short-interval intracortical inhibition (SICI) induced with paired-pulse TMS was also found to affect the later components of the TEPs similarly to LICI. Accordingly, although SICI is associated with GABA-A rather than GABA-B receptors (Di Lazzaro et al., 2007), the effects of LICI and SICI on TEPs appear similar, as both dampen the later TEP components. Noda et al. (2017) found that SICI dampens the N45, while intracortical facilitation (ICF) induced by paired-pulse TMS modulates N45 in an age-dependent manner. Cash et al. (2017) found that SICI also significantly reduced the amplitude of the P60 component of the TEP, while ICF increased it, and ascribed these findings to GABA-A receptor-mediated inhibition and glutamatergic excitatory transmission, respectively. These modulatory effects were evident in both M1 and DLPFC. Noda et al. (2016) studied TEPs within the same two regions via modulation with SAI and found that the N100 amplitude was affected by SAI at both locations, with additional effects in DLPFC on component P60 and in M1 on components N45 and P180. In addition, Ferreri et al. found that P60 in M1 was modulated by SAI (Ferreri et al., 2012). SAI has been shown to be mediated by cholinergic and GABA-A receptors (Di Lazzaro et al., 2007).

Impact and significance of the gaps
The gaps identified and discussed in this paper are interconnected to a certain extent, but exert differential effects on the clinical utility of TMS-EEG. Amongst them, the challenges in application and signal quality affect all TMS-EEG applications with clinical prospects. The accuracy and reliability of the findings are critically dependent on the other identified gaps. For instance, signal quality has obvious effects on the noise included in the TEPs and thereby sets technical limits to the potential accuracy and reliability of the TEPs in diagnostic and therapeutic applications. In addition, the intra- and inter-individual variation of TEPs likewise limits accuracy and reliability, and hence is important in developing clinically useful (i.e. accurate and reliable) TEP-based biomarkers. Finally, understanding the complex interrelationship between TEP components and the overall function and connectivity of the brain in health and disease is a fundamental, yet difficult to achieve, prerequisite for the identification of potential clinical applications. Therefore, while the first two of the four gaps are practical and can be advanced almost independently of the clinical application, the last two require extensive neuroscientific research to unveil the "meaning" of the TEPs and materialize the true potential and impact of TEP-based biomarkers.

Future challenges for bridging the gap between laboratory and clinics
While technical maturity has almost been reached with TMS-EEG, clinical feasibility studies are still few and, in the vast majority of cases, recruit a limited number of subjects. It is clear that large, randomized controlled trials providing standardization of protocols, as well as systematic reviews with meta-analyses, are required to provide high-level evidence supporting the introduction of TMS-EEG into everyday clinical practice. There are still unmet needs regarding the interpretation of responses and the recording of reliable TEPs at the individual level. Once these important methodological issues have been addressed, diagnostic accuracy indices, such as sensitivity, specificity and ROC curves, could be evaluated for different clinical applications of TMS-EEG, providing the critical information needed for TMS-EEG to move from laboratories to clinics. In addition, clinical staff need to be trained properly once standardized protocols have been established and clinically feasible applications demonstrated with a sufficient level of evidence.
Once these crucial prerequisites are met, TMS-EEG might be ready to enter the clinical arena for more effective, patient-tailored clinical diagnostics, prognostics and interventions.
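As an illustration of the ROC analysis mentioned above, the area under the ROC curve (AUC) of a candidate biomarker can be computed without choosing a threshold, as the probability that a randomly selected patient has a higher biomarker value than a randomly selected control. The sketch below is a minimal example under that definition; the biomarker values and the function name `roc_auc` are hypothetical, and it assumes that larger values indicate disease.

```python
import numpy as np

def roc_auc(patient_values, control_values):
    """AUC as the fraction of patient-control pairs in which the patient value
    is larger, counting ties as one half (the Mann-Whitney formulation)."""
    pos = np.asarray(patient_values, dtype=float)[:, None]
    neg = np.asarray(control_values, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

# Hypothetical TEP component amplitudes (µV) for patients and healthy controls.
patients = [5.1, 6.3, 4.8, 7.0, 5.9]
controls = [4.2, 5.0, 3.9, 5.5, 4.4]
print(f"AUC = {roc_auc(patients, controls):.2f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups; sensitivity and specificity at a given clinical cut-off then describe a single point on the same curve.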

CRediT authorship contribution statement
All authors contributed equally to conceptualization, visualization and writing the original draft as well as reviewing & editing.

Conflicts of interest
PJ has an unrelated patent with Nexstim Plc, a manufacturer of navigated TMS systems. VK and PB have no conflicts of interest to declare.