Inter-individual differences in urge-tic associations in Tourette syndrome

Premonitory urges are a cardinal feature of Tourette syndrome (GTS) and are commonly viewed as a driving force of tics. However, inter-individual differences in experimentally measured urges, tics and urge-tic associations, as well as possible relations to clinical characteristics and to the abnormal perception-action processing recently demonstrated in these patients, have not been investigated in detail. Here, we analyze the temporal associations between urges and tics in 21 adult patients with GTS, including inter-individual differences and the relation of such associations to clinical measures and experimentally tested perception-action coupling. At the group level, our results confirm known positive associations between subjective urges and tics, with increased tic frequency and tic intensity during periods of elevated urge. Inter-individual differences in the associations between urges and tics were, however, substantial. While most participants (57-66 % depending on the specific measure) showed positive associations as expected, several participants did not, and two even had negative associations, with tic occurrence and intensity being reduced at times of increased urges. Subjective urge levels and tic occurrence correlated with corresponding clinical scores, providing converging evidence. Measures of the strength of urge-tic associations correlated neither with clinical measures nor with the strength of perception-action coupling. Taken together, urge-tic associations in GTS are complex and heterogeneous, casting doubt on the notion that tics are primarily driven by urges.


Introduction
Gilles de la Tourette syndrome (GTS), a common, childhood-onset neuropsychiatric disorder, can manifest with a wide range of symptoms. It is defined by the presence of multiple motor and vocal tics with onset before the age of 18 and a duration of at least one year (American Psychiatric Association, 2013). The vast majority of GTS patients experience premonitory urges to tic prior to the actual motor or vocal tic (Brandt, Beck, Sajin, Baaske, et al., 2016; Houghton, Capriotti, Conelea, & Woods, 2014; Niccolai et al., 2019). Characteristically, premonitory urges develop several years after the onset of tic symptoms (Banaschewski, Woerner, & Rothenberger, 2003) and can be alleviated or increased via tic execution or suppression, respectively (Müller-Vahl, 2010). In terms of their sensory quality, premonitory urges are commonly described as unpleasant pressure-like or tingling sensations, occurring either at a body region where a tic is about to occur (Houghton et al., 2014) or as a generalized inner tension (Müller-Vahl, 2010). Patients with GTS often report that premonitory urges are relieved by the execution of tics (Delorme et al., 2016; Ganos et al., 2012). This, along with the unpleasantness of premonitory urges, has led to the conceptualization of premonitory urges as a (primary) factor driving tic activity in a negative reinforcement cycle (Capriotti, Brandt, Turkel, Lee, & Woods, 2014; McGuire et al., 2015; Piacentini et al., 2010; Woods et al., 2008). If preceding urges were crucial for the occurrence of tics, then successful tic suppression should clearly be related to premonitory urges. However, research has not found conclusive evidence for such a connection between premonitory urges and the ability to suppress tic activity (Banaschewski et al., 2003; Ganos et al., 2012; Müller-Vahl, 2010; Sambrani, Jakubovski, & Müller-Vahl, 2016).
Although the relation between urges and tics thus does not appear to be straightforward, there is good evidence that the presence and strength of urges can profoundly influence the quality of life and psychological well-being of affected patients (Crossley & Cavanna, 2013; Eddy & Cavanna, 2014; Kano et al., 2015). Given their clinical relevance but unclear underlying mechanisms, further studies of urges and their relation to tics are important.
In the past, premonitory urges and their association with tics have mostly been investigated based on (unstructured) self-reports and questionnaires (Woods, Piacentini, Himle, & Chang, 2005). An experimental study (Himle, Woods, Conelea, Bauer, & Rice, 2007) investigated the effect of instructed tic suppression on tics and reported urge in alternating 5-min intervals. To allow a more detailed quantification of urges and time-resolved analyses of urge-tic relations, the urge monitor was developed (Brandt, Beck, Sajin, Baaske, et al., 2016). This computer-based assessment tool continuously records subjective urge intensity (see methods section 2.3 and Fig. 1 for details). The original study of Brandt, Beck, Sajin, Baaske and colleagues (2016) using the real-time urge monitor examined the temporal structure of premonitory urges and simultaneously recorded tic activity. Analyzed at the group level, the data indicated a significant positive correlation between self-reported premonitory urge intensities and recorded tic activity. As both clinical data based on the Premonitory Urge to Tic Scale (PUTS) (Woods et al., 2005) and the urge-tic analysis conducted by Brandt, Beck, Sajin, Baaske, et al. (2016) suggest strong variance in premonitory urge intensity and urge-tic relation, a more detailed and individualized analysis of inter-individual differences in urge-tic association is important. This is particularly relevant against the background that urges and tics, and their relation have recently been conceptualized in the framework of a hierarchical neural model of perception and behavior under the rubric of "predictive processing", or "Bayesian brain" accounts (Rae, Critchley, & Seth, 2019), where perception and action result from Bayesian processes, in which incoming (bottom-up) sensory signals are combined with prior (top-down) predictions to formulate the brain's expectations of the causes of the signal, i.e., the "posterior". 
Thus, the brain continuously engages in the minimization of mismatches between sensory signals and prior expectations, i.e., sensory "prediction errors," by updating perceptual priors and performing actions to change sensory signals. It has been proposed that in patients with GTS abnormally increased precision of predictions within somatosensory regions of the putamen are automatically released and projected to S1 and the insula, where prediction errors need to be resolved because such sensations were not predicted by insula activity. Spontaneous sensations generated in the putamen are thus processed as unexpected bodily feelings corresponding to urges triggering mitigating actions (tics) to adapt for such sensations, probably via signals from the insula to the supplementary motor area (SMA), from where tics are generated through basal ganglia-thalamocortical loops (Rae et al., 2019). If so, then tics as mitigating actions should be highly correlated with (to be mitigated) sensations. If, on the other hand, correlations between urges and tics are weak or variable, this could be taken as an argument against a Bayesian account of tics and urges.
Fig. 1 - Participants continuously indicate their currently experienced premonitory urge on a vertical scale (0-100) by means of a computer mouse. A webcam mounted above the screen records participants' upper-body movements, for offline video-based tic ratings.
Cortex 143 (2021) 80-91
Thus, the first aim of the present study was to re-examine in depth the relation between premonitory urges and tics both at a group and an individual level, taking into account not only the presence but also the intensity of tics in relation to urges, as well as the relation between urge monitor data and clinical measures in an independent cohort of GTS patients. We expected high inter-individual variability of urge-tic associations and correlations with clinical urge and tic measures.
The second aim was to investigate the relation between urge-tic associations and perception-action coupling. The fact that GTS patients experience premonitory urges in connection with tics and also appear to be hypersensitive to specific sensory stimuli (Buse, Beste, Herrmann, & Roessner, 2016) suggests that the integration of sensory information and motor activation is altered in them (Beste & Münchau, 2018). In fact, the urge-tic interplay led to the suggestion that GTS is not a movement disorder in its strict sense, but may rather be conceptualized as a disorder of perception-action integration (Beste & Münchau, 2018). We recently proposed that tics could potentially be viewed as tightly connected event files with premonitory urges as the perception (stimulus) and tic execution as the action component (Beste & Münchau, 2018). If so, then urge-tic relations measured with the urge monitor should correlate with markers of increased perception-action binding. Recently, using a cognitive theoretical framework of perception-action integration, namely the Theory of Event Coding (TEC) (Hommel, Müsseler, Aschersleben, & Prinz, 2001), it was shown that adult GTS patients indeed have abnormally strong automatic associations/bindings between perceptions and actions (Brandt, Patalay, Bäumer, Brass, & Münchau, 2016; Kleimaker et al., 2020). Moreover, increased stimulus-response binding in GTS patients correlated positively with tic frequency (Kleimaker et al., 2020), underscoring its relevance for tic pathophysiology. If abnormally increased perception-action binding is an integral part of, or an underlying mechanism for, urge-tic relations, there should be a clear positive correlation between these two measures. If no such correlation can be found, urge-tic relations probably represent a facet of GTS that is relatively independent of automatic perception-action binding.
Both possible outcomes are of high relevance for the understanding of urge-tic relations as a major facet of GTS.

2. Methods and materials
All these participants had also participated in an established perception-action integration paradigm in the same experimental session. Results from that paradigm were reported previously (Kleimaker et al., 2020) and are used here for correlation analyses. Three additional participants were not included in the current analysis due to technical problems during urge data acquisition (one) and continuous ticcing (two), which would have undermined analysis of urge-tic associations.
The inclusion criterion was a GTS diagnosis confirmed by an experienced neurologist (A.M.) according to DSM-V criteria (American Psychiatric Association, 2013). Patients with clinically relevant psychiatric or neurological conditions (apart from the common co-morbidities OCD and ADHD) and IQ below 80 were excluded from participation. These inclusion and exclusion criteria were determined before data acquisition. Exclusion criteria concerning data quality (see above) were determined during preliminary data inspection but prior to data analysis. The sample size was determined in advance based on a power analysis for the perception-action integration paradigm (Kleimaker et al., 2020) but not for the present study, the sample size of which (N = 21) is appropriate for detecting moderate to strong correlations (r ≥ .58, significance threshold .05, statistical power .8).
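The sensitivity figure quoted above (r ≥ .58 for N = 21) can be checked approximately. The sketch below assumes a two-sided test and the Fisher z approximation, in which atanh(r) has standard error 1/sqrt(N - 3). The text does not state how the value was obtained, so both the method and the function name `minimal_detectable_r` are illustrative assumptions.

```python
import math
from statistics import NormalDist


def minimal_detectable_r(n, alpha=0.05, power=0.80):
    """Smallest |r| detectable with the requested power in a two-sided test.

    Assumption: Fisher z approximation, i.e. atanh(r) is roughly normal
    with standard error 1 / sqrt(n - 3).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.tanh((z_alpha + z_power) / math.sqrt(n - 3))


r_min = minimal_detectable_r(21)
print(f"minimal detectable r for N = 21: {r_min:.2f}")
```

Under these assumptions the result rounds to .58, in line with the value given in the text.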
The study was approved by the local ethics committee and participants gave their written informed consent prior to participation.

Clinical assessment
Individual clinical data are provided in [...]. Additionally, the modified RUSH video protocol was used (Goetz, Pappert, Louis, Raman, & Leurgans, 1999) (11.5 ± 2.31, range 4-15 [of 0-20]). This protocol is especially suitable to capture the current "ticcing state" (e.g., tic intensity and frequency) of participants, taking the symptom variation of tic disorders into account. Participants are placed in front of a video camera in a quiet room and instructed to sit in a still and relaxed way and not to talk. The 10-min video recording consists of two parts with "full view" and "upper body view" (5 min each) which are further subdivided into two segments with examiner present or absent (2.5 min each). Tics are scored for the two segments with no examiner in the room (5 min in total), with respect to number of body areas involved with tics, motor tic intensity, phonic tic intensity, frequency of motor tics, and frequency of phonic tics. In addition, the total number of tics was counted (33.9 ± 20.2 tics/min, range: 4.5-76 tics/min). All videos were independently scored by two clinicians experienced in the assessment of GTS patients and then reviewed jointly. When scores differed, relevant segments of the videos were reviewed and discussed to determine a RUSH consensus score. If tic counts differed by more than 15 %, the count was repeated independently after this review process, which resulted in tic counts differing by less than 15 % in all cases. The tic frequency (tics/min) was computed by dividing the average of both tic counts by the duration of the rated video (5 min).
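The tic frequency computation described above can be sketched as follows. The counts are hypothetical, and the text does not say against which reference the 15 % discrepancy was measured; the sketch assumes the mean of the two counts. The function name `tic_frequency` is illustrative only.

```python
def tic_frequency(count_a, count_b, minutes=5.0, max_rel_diff=0.15):
    """Return (tics per minute, recount flag) for two raters' tic counts.

    Assumption: the discrepancy is expressed relative to the mean of the
    two counts; the source only states a 15 % criterion.
    """
    mean_count = (count_a + count_b) / 2
    rel_diff = abs(count_a - count_b) / mean_count if mean_count else 0.0
    return mean_count / minutes, rel_diff > max_rel_diff


# Hypothetical counts from two raters for the 5-min rated video
freq, recount = tic_frequency(100, 110)
print(freq, recount)
```

A `True` flag would correspond to the case in which the count was repeated independently after joint review.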

2.3. Real-time urge and tic assessment
The urge monitor was implemented in Python 2.7, using the PsychoPy toolbox version 1.83.04 (Peirce et al., 2019). Participants were comfortably seated in front of a computer screen (Iiyama B2780HSU-B1, 1920 × 1080 pixels) and asked to continuously indicate their currently experienced urge intensity by controlling the vertical position of a cursor on the screen using the computer mouse. In addition to the slider indicating the current urge, the recent urge history (3 sec) was shown as a continuous curve (see Fig. 1). The urge monitor display, spanning about 6° of visual angle horizontally and vertically, was updated at a frequency of 30 Hz. Urge data were recorded at 10 Hz for subsequent analysis. Participants received standardized instructions, indicating possible manifestations of premonitory sensations or urges (e.g., "similar to the feeling preceding sneezing"; "general tension"; "anything that might signal a tic") and defining the scale (0 = "no feeling of urge"; 100 = "strongest urge possible/imaginable"). Participants were instructed not to try to suppress tic expression, as in the "free ticcing" condition of a previous study (Brandt, Beck, Sajin, Baaske, et al., 2016). Subsequently, participants were briefly familiarized with the task (30 sec practice run) and then performed the actual task (300 sec). The initial slider position was set to 50 %; data recording (for 300 sec) started 5 sec after task onset (and in addition the first 10 sec of recorded data were discarded, see below) to give participants time to indicate their current urge level. While participants performed the urge monitor task, their motor and vocal expressions were recorded by a webcam (Logitech c930e, frame rate 30 Hz, 1920 × 1080 pixels) mounted at the top of the screen, viewing the upper body including the hands on the table.
Urge measurements and the video recordings were synchronized by auditory signals played by the experimental software at the beginning and end of the recording. The times of these signals in the video recording were determined by visual inspection of sound waves (Audacity, version 2.2.3, https://audacityteam.org/), showing excellent agreement with the duration of the urge recording (deviation below 20 msec in all recordings).

Video-based tic ratings
Vocal and motor tics were rated from the recorded videos using DataVyu version 1.3.7 (DataVyu Team, 2014) by two independent raters (L.S. and A.B.), personally trained and supervised by A.M. The urge recordings were divided into 300 segments with a duration of 1 sec each. For each of these segments, the presence/absence and the number of motor and vocal tics (multiple tics could occur in 1 sec and tics could continue in subsequent intervals) were determined, as well as the overall "tic intensity". Motor tic intensities were rated on a scale from 1 ("very mild, could be normal movement") to 8 ("very severe, apparently painful or potentially dangerous"), vocal tics on a scale from 1 ("very mild; could be a spontaneous physiological sound") to 7 ("severe; multiple coprolalic words, noisy sound"). More details on the rating scales are provided in Table S2 (Supplementary Materials).
For the present analysis, motor and vocal tic intensities were combined into a single intensity score for each 1-sec interval by taking the maximum of the two scores. Inter-rater reliability was assessed using Pearson's correlation for presence/absence of a tic as well as tic intensity. As an additional quality check, the tic frequency (tics/min) from these ratings was compared to the tic frequency determined in the RUSH video protocol as described above. For further analysis, tic presence/absence data from the two raters were combined by logical disjunction (OR; tic present if scored by at least one rater) and tic intensities were averaged for each 1-sec interval. Moreover, a continuous time series representing the "instantaneous tic intensity" was computed as a central moving average (window size 11 sec) of rated tic intensities per 1-sec interval (Fig. 2).
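The combination of the two raters' data described above can be sketched as follows. This is a minimal illustration, not the authors' R script. It assumes that intervals without a tic are coded with intensity 0 and that the moving-average window is truncated at the ends of the series; neither detail is given in the text, and all function names are hypothetical.

```python
def combined_intensity(motor, vocal):
    """Per-interval intensity: maximum of motor and vocal tic intensity."""
    return [max(m, v) for m, v in zip(motor, vocal)]


def combine_presence(rater_a, rater_b):
    """Tic present in an interval if scored by at least one rater."""
    return [bool(a) or bool(b) for a, b in zip(rater_a, rater_b)]


def average_intensity(rater_a, rater_b):
    """Mean of the two raters' intensity scores per interval."""
    return [(a + b) / 2 for a, b in zip(rater_a, rater_b)]


def centered_moving_average(values, window=11):
    """Central moving average; assumption: windows are truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical ratings for five 1-sec intervals (0 = no tic, assumed coding)
rater_a = [0, 2, 3, 0, 0]
rater_b = [0, 0, 3, 1, 0]
print(combine_presence(rater_a, rater_b))
print(centered_moving_average(average_intensity(rater_a, rater_b), window=3))
```

The edge truncation matters little in practice here, because the first and last 10 sec of each series are discarded before analysis.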

2.5. Urge-tic analysis
All statistical analyses were performed in version 3.6.2 of R (R Core Team, 2018). Prior to all analyses, the first and last 10 sec of each urge and tic time series were removed to ensure participants had time to reach their current urge level and to avoid boundary effects in the computation of the instantaneous ticcing intensity. For each participant, we computed descriptive statistics of subjective urge intensities (mean, SD), the number of tics, the proportion of 1-sec intervals with any tic ("percent ticcing") as well as the average tic intensity (only for 1-sec intervals with any tic). These measures were also used to assess potential correlations with clinical and experimental measures. Individual time series (urge, tic intensity) were z-standardized (to mean 0, variance 1) per participant prior to the following analyses. The association between urge intensity and tics was analyzed using three statistical approaches: logistic regression (predicting tic from urge), linear regression (urge intensity at tics), and correlation (between urge intensity and instantaneous ticcing intensity). These analyses were performed at the group level to describe the overall relationship across participants using random effects models (glmer and lmer from R package lme4) (Bates, Mächler, Bolker, & Walker, 2014) with maximal random effects structure (Barr, Levy, Scheepers, & Tily, 2013). Some of the random effects models did not converge or resulted in singular model fits, even when reducing the random effects structure. This problem was not present when fitting the original, non-standardized data, and for z-standardized data it could be resolved by constraining the analysis window to 50-250 sec. Both approaches resulted in qualitatively equivalent results; as z-standardization makes the data more comparable across participants, we decided to report this analysis, with a reduced time window (50-250 sec).
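The preprocessing steps named above (trimming and per-participant z-standardization) can be sketched in a few lines. The sketch assumes that the 10 Hz urge samples are averaged into 1-sec bins to match the tic ratings and that the sample standard deviation (n - 1) is used; neither choice is stated in the text, and the helper names are hypothetical.

```python
from statistics import mean, stdev


def bin_means(samples, per_bin=10):
    """Average consecutive samples into bins (assumed: 10 Hz -> 1-sec bins)."""
    return [mean(samples[i:i + per_bin])
            for i in range(0, len(samples) - per_bin + 1, per_bin)]


def trim(series, seconds=10):
    """Drop the first and last `seconds` entries of a 1-sec resolution series."""
    return series[seconds:len(series) - seconds]


def zscore(series):
    """Standardize to mean 0 and SD 1 (sample SD; fails for a constant series)."""
    m, s = mean(series), stdev(series)
    return [(x - m) / s for x in series]


# Hypothetical 300-sec urge recording sampled at 10 Hz
urge_10hz = [float(i % 50) for i in range(3000)]
urge_z = zscore(trim(bin_means(urge_10hz)))
print(len(urge_z))
```

Restricting the analysis to the reduced window would then be a further slice on the 1-sec series, for example `series[50:250]` if indices count seconds from recording onset; whether the authors standardized before or after windowing is not specified.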
To characterize inter-individual differences, corresponding analyses were also performed separately for each participant, using logistic and linear regression (glm, lm in R) and correlation analyses. Coefficients from the logistic and linear regression as well as the correlation analysis were estimated to quantify individual urge-tic relationships (along with their statistical significance) and used to assess potential correlations with clinical and experimental measures.
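The three per-participant measures can be illustrated with a small, dependency-free sketch. It is not the authors' analysis, which used glm, lm and correlation tests in R. The logistic slope is fitted here by Newton-Raphson, the linear regression of urge on a binary tic indicator reduces to a difference of means, and significance tests are omitted. All function names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def urge_difference_at_tics(urge_z, tic):
    """Slope of urge regressed on a 0/1 tic indicator (= difference of means)."""
    with_tic = [u for u, t in zip(urge_z, tic) if t]
    without = [u for u, t in zip(urge_z, tic) if not t]
    return sum(with_tic) / len(with_tic) - sum(without) / len(without)


def logistic_slope(urge_z, tic, iterations=25):
    """Slope b of logit P(tic) = b0 + b * urge, fitted by Newton-Raphson.

    Assumes the data are not perfectly separable by urge level.
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for u, t in zip(urge_z, tic):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * u)))
            w = p * (1.0 - p)
            g0 += t - p            # gradient, intercept
            g1 += (t - p) * u      # gradient, slope
            h00 += w               # entries of the information matrix
            h01 += w * u
            h11 += w * u * u
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b1


# Hypothetical z-standardized urge values and tic indicators (1 = tic present)
urge_z = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, -1.0, 1.0, 0.0]
tic = [0, 0, 1, 0, 1, 1, 1, 1, 0, 0]
b = logistic_slope(urge_z, tic)
print(b, math.exp(b), urge_difference_at_tics(urge_z, tic))
```

A positive slope (odds ratio above 1) corresponds to the expected pattern of more ticcing at higher urge; a negative slope corresponds to the reversed pattern seen in two participants.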

Stimulus-response binding
Stimulus-response binding was investigated with the "event file paradigm" (Hommel, 1998), based on the Theory of Event Coding (Hommel et al., 2001) as described in detail previously (Kleimaker et al., 2020; Takacs, Mückschel, Roessner, & Beste, 2020). This paradigm (see Fig. 3) assesses the effect of experimentally induced, transient stimulus-response associations on subsequent stimulus-based responses, which can either be compatible or incompatible with the previously induced association. In short, participants are asked to execute a previously cued motor action (left/right button press) as soon as (after 1000 msec) a stimulus (S1) (500 msec) appears on the screen. This is hypothesized to create a transient association (binding) between the response (R1) and the features orientation (horizontal/vertical), color (red/green), and location (top/bottom) of S1. As R1 is defined by the cue, not S1, this association can be independently manipulated. In contrast, the second response (R2) to a second stimulus (S2) (after 1000 msec, lasting 500 msec) with the same feature dimensions as S1 is defined by one of the feature dimensions of S2 (orientation). Binding between one of the features (e.g., color) of S1 and R1 is said to be compatible with the S2-R2 pairing if either both stimulus feature and response agree, or both disagree. In contrast, an incompatible pairing, i.e., the same feature requiring a different response, or vice versa, is hypothesized to require cognitively demanding processes of unbinding and rebinding stimulus and response features. The strength of binding is quantified by behavioral performance costs (in accuracy and RT), comparing binding-incompatible (same response and all stimulus features different, or different response, all stimulus features repeated) to binding-compatible conditions (response and all stimulus features repeated; or all different). The whole task comprised 384 trials divided into three blocks of 128 trials.
The inter-trial interval was jittered between 1500 and 2000 msec during which a fixation cross was displayed.
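The binding cost defined above can be made concrete with a short sketch. The trial representation (dictionary keys, hypothetical reaction times) is invented for illustration, and computing RT on correct trials only is an assumption rather than something stated in the text. Trials with partial feature repetition fall outside the comparison described and are ignored here.

```python
from statistics import mean


def classify(response_repeated, n_features_repeated, n_features=3):
    """Label an S2-R2 trial relative to the preceding S1-R1 pairing."""
    all_repeated = n_features_repeated == n_features
    none_repeated = n_features_repeated == 0
    if (response_repeated and all_repeated) or (not response_repeated and none_repeated):
        return "compatible"
    if (response_repeated and none_repeated) or (not response_repeated and all_repeated):
        return "incompatible"
    return None  # partial repetition: not part of this comparison


def binding_costs(trials):
    """Performance costs of incompatible relative to compatible trials."""
    groups = {"compatible": [], "incompatible": []}
    for t in trials:
        label = classify(t["response_repeated"], t["n_features_repeated"])
        if label is not None:
            groups[label].append(t)

    def rt(ts):  # assumption: RT from correct trials only
        return mean(t["rt"] for t in ts if t["correct"])

    def acc(ts):
        return mean(1.0 if t["correct"] else 0.0 for t in ts)

    comp, inc = groups["compatible"], groups["incompatible"]
    return {"rt_cost": rt(inc) - rt(comp), "accuracy_cost": acc(comp) - acc(inc)}


# Hypothetical trials (RT in msec)
trials = [
    {"response_repeated": True, "n_features_repeated": 3, "rt": 400, "correct": True},
    {"response_repeated": False, "n_features_repeated": 0, "rt": 420, "correct": True},
    {"response_repeated": True, "n_features_repeated": 0, "rt": 480, "correct": True},
    {"response_repeated": False, "n_features_repeated": 3, "rt": 500, "correct": False},
]
print(binding_costs(trials))
```

Larger positive costs indicate stronger binding, i.e., a greater penalty when a previously formed stimulus-response association has to be undone.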

Correlation between urge-tic measures and clinical and experimental covariates
Fig. 3 - Event file paradigm used to assess stimulus-response binding. The response R1 to S1 is specified by the initial Cue (left-/rightward arrowhead), allowing independent manipulation of the stimulus-response association induced by S1/R1 (dashed oval). The response R2 to S2 is defined by the orientation of S2 (horizontal = left, vertical = right), an association that can be compatible or incompatible (as in the present case, where the same response is associated with two different shapes, colors and locations) with the association between S1/R1.
Clinical scores were tested for correlation (Pearson) with three kinds of measures derived from the urge monitor: 1) tic-related measures (tic frequency [tics/min], percent ticcing, mean tic intensity); 2) urge-related measures (mean and SD of reported urge time series); 3) urge-tic association (logistic regression/linear regression/correlation coefficient). In addition, we tested the prediction that a stronger urge-tic relation is associated with stronger stimulus-response binding (binding incompatibility effects on accuracy and reaction time) in the event file task. For each clinical score and measure of binding strength, statistical analyses were adjusted for the number of urge-monitor measures of each type (ticcing: 3, urge: 2, urge-tic: 3) using Bonferroni-Holm correction (Holm, 1979). In addition to classical p-values, Bayes factors (Morey et al., 2015) were used to examine the evidence for the alternative versus the null hypothesis. Following established criteria (Kass & Raftery, 1995), Bayes factors above 3.2 were interpreted as "substantial" evidence, those above 10 as "strong" and those above 100 as "decisive" evidence in favor of the alternative hypothesis.
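The Bonferroni-Holm step-down adjustment and the Bayes factor categories named above can be sketched as follows. The p-values are hypothetical. The label for Bayes factors below 3.2 is not defined in this section, so "anecdotal" is an assumption borrowed from the wording used later in the results.

```python
def holm_adjust(p_values):
    """Bonferroni-Holm adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # the smallest p is multiplied by m, the next by m - 1, and so on
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[i] = running_max
    return adjusted


def evidence_label(bf10):
    """Evidence category for the alternative hypothesis, thresholds as in the text."""
    if bf10 > 100:
        return "decisive"
    if bf10 > 10:
        return "strong"
    if bf10 > 3.2:
        return "substantial"
    return "anecdotal"  # assumed label for values below 3.2


# Hypothetical p-values for the three tic-related measures of one clinical score
print(holm_adjust([0.01, 0.04, 0.03]))
print(evidence_label(15.192))
```

With three tests per family, the smallest p-value is multiplied by 3, the next by 2 and the largest by 1, with each adjusted value forced to be at least as large as the preceding one.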

Transparency and openness statement
We report how we determined our sample size, all data exclusions, all inclusion/exclusion criteria, whether inclusion/exclusion criteria were established prior to data analysis, all manipulations, and all measures in the study. No part of study procedures or analyses was pre-registered prior to the research being conducted. Study data, analysis script (in R) and experimental paradigm (Python 2.7/PsychoPy 1.83.04) are available from a public data repository (https://osf.io/k9z5c/). The conditions of our ethics approval do not permit sharing of the raw video data supporting this study with any individual outside the author team under any circumstances. Legal copyright restrictions prevent public archiving of the various clinical tests described in section 2.2, which can be obtained from the copyright holders in the cited references.

Descriptive summary of urge and tic data
Inter-rater reliability for presence/absence of a tic (median r: .84, range: .67-.94) as well as for tic intensity (mean r: .84, range: .56-.93) was high. According to the video ratings, participants on average had 22.4 tics/min (range: 5.2-50.9), tics were noted in 47.6 % of 1-sec segments (range: 12.7 %-76.7 %) and mean tic intensity was 2.7 (range: 1.72-4.07 [of the total scale 1-8]). Participants' subjective urge ratings during the urge monitor showed substantial inter-individual variability. Mean urge rating (on a scale from 0 to 100) across participants was 11.3 (range: .1 to 48.7) with a mean variation (SD) of 11.1 (range: .76-29.0). The maximal urge per participant was on average 56.3 (range: 6.0-100.0). More than half of the participants (12 of 21) used at least 40 % of the available urge scale (0-100 %) and one third (7 of 21) used more than 80 % of the scale. Detailed individual urge and tic data from the urge monitor are provided in the Supplementary Materials (Table S3 and individual urge-tic time series in section 6). Clinical tic evaluation from the RUSH protocol and detailed tic ratings from the urge monitor cannot be directly compared due to differences in protocol (e.g., camera view, experimenter present/absent), task (no explicit task in the RUSH protocol versus continuous urge ratings during the urge monitor) and tic ratings (tic counting for the RUSH protocol versus rating of tics per body part for the urge monitor). Still, tic counts per minute from the RUSH protocol (mean 33.9 tics/min, range: 4.5-76 tics/min; see Table S1) showed a statistically significant, positive correlation with the tics/min from the urge monitor data (r = .67, p = .001, BF10 = 32.18; see figure in Supplementary Materials, section 5.1).

Urge-tic analysis
Each of the analyses across participants revealed a significant positive association between self-reported urge intensity and tic occurrence/intensity. Logistic regression analysis, predicting ticcing from z-standardized urge intensity (also included as random slope), revealed a positive effect (b = .50, OR = exp(b) = 1.64) of urge intensity on tic occurrence (χ²(1) = 10.94, p < .001). Linear regression, predicting (z-standardized) urge from ticcing (also included in the random effect structure), indicated that urge was elevated (by about .26 SD) at times of ticcing (b = .28, χ²(1) = 9.41, p = .002). Finally, a random effects regression (with urge intensity as random slope) showed a positive effect of urge intensity on instantaneous ticcing intensity (b = .31, χ²(1) = 15.54, p < .001). The analogous group-level analyses with non-z-standardized measures (and across the full time interval) yielded qualitatively equivalent results, i.e., strong positive associations between urge and ticcing for all three analyses. The per-participant analyses revealed substantial heterogeneity in urge-tic associations between participants, though a majority (depending on the analysis: 57-66 % out of 21 participants) showed a positive association between urges and tics, in line with the group-level analyses reported above. For the logistic regression analysis, predicting ticcing from urge, the measure of urge-tic association ranged between b = -.99 (OR = .37) and b = 2.14 (OR = 8.52). For the linear regression, comparing (z-transformed) urges at times of ticcing versus non-ticcing, the regression coefficient b ranged between -.62 and 1.12. For the correlation between urges and instantaneous ticcing intensity, correlation coefficients ranged between -.36 and .83.
To illustrate the range of inter-individual differences, data from three patients are shown in Fig. 4: one participant with a clearly positive urge-tic relationship, one with no clear association and one with a negative association. More information on individual data is provided in Supplementary Materials (Table S4 and individual time series in section 6).
The analysis of urge-tic associations depends on variation over time both in ticcing behavior and subjective urge. Moreover, frequent ticcing may impair urge reporting by disrupting attention and cognition. In exploratory analyses, neither the amount of ticcing ("percent ticcing") nor the use of the urge scale (urge range, i.e., maximum minus minimum) showed a significant correlation with any of the three measures of urge-tic association across participants. This suggests that the urge-tic associations reported here are not confounded with inter-individual differences in the amount of ticcing or use/interpretation of the urge scale.

Correlation with clinical scores
The RUSH total score showed a significant correlation with each of the tic measures obtained from the urge monitor video ratings, i.e., tics per minute (r = .54, p_adj = .024, BF10 = 4.540), percent ticcing (r = .63, p_adj = .007, BF10 = 15.192), mean tic intensity (r = .52, p_adj = .024, BF10 = 3.604), the former two indicating substantial and strong evidence, respectively, for a positive association. In contrast, no correlations were found for any of the other clinical measures (YGTSS, PUTS, GTS QoL, YBOCS, CAARS; p_adj > .2, BF10 < 1.5 in all cases). The total PUTS score showed a correlation with both urge-related measures from the urge monitor, mean urge (r = .66, p_adj = .002, BF10 = 29.52) and variation (SD) of urge (r = .54, p_adj = .012, BF10 = 4.42), the former providing strong evidence for a positive association. Additional exploratory analyses showed analogous but substantially weaker correlations of PUTS subscores (e.g., intensity vs quality, Brandt et al., 2016) with mean or SD of urge during the urge monitor. In contrast, none of the other clinical measures showed a correlation with the urge-related measures from the urge monitor (RUSH, YGTSS, GTS QoL, YBOCS, CAARS; p_adj > .1, BF10 < 1.6 in all cases).
Two of the measures of urge-tic association showed trends for a positive association with YBOCS (logistic regression: r = .45, p_adj = .077, BF10 = 2.013; linear regression: r = .50, p_adj = .067, BF10 = 2.916), with Bayes factors indicating anecdotal evidence in favor of a correlation. For all other clinical measures, these correlations were clearly not significant (RUSH, YGTSS, PUTS, GTS QoL, CAARS; p > .2, BF10 < 1.2 in all cases). Correlation (scatter) plots for the associations reported above are provided in the Supplementary Materials (section 5).

3.4. Correlation of urge-tic association with stimulus-response binding
None of the binding scores (accuracy, response time) showed a significant correlation with the measures of urge-tic association from the urge monitor (p_adj > .4, .4 < BF10 < .9 in all cases), see Fig. 5.

Discussion
The present study provides a detailed analysis of premonitory urges, tics and their temporal association in adult patients with GTS. Extending previous research, we investigated inter-individual differences in premonitory urges, tics and urge-tic associations and explored their relation to clinical measures and stimulus-response binding tested experimentally. At the group level, our analyses confirm previous findings of a positive association between urges and tics (Brandt, Beck, Sajin, Baaske, et al., 2016), that is, stronger urges were associated with a greater likelihood of ticcing, urges were elevated at times of ticcing, and tic intensity correlated with urge. Variability in the data, however, was high. As expected, urge intensity and tic occurrence from the urge monitor showed convergent validity with corresponding clinical scores measuring urges (PUTS) and tics (RUSH), respectively. In contrast, no significant correlations were found between measures of urge-tic association and the strength of stimulus-response binding.

Inter-individual differences in urge-tic associations and clinical correlates
Analysis of urge-tic associations at the level of individual participants not only revealed differences in associational strength but also in the direction of the urge-tic association. As expected, based on previous research (Brandt, Beck, Sajin, Baaske, et al., 2016), most participants showed a significant positive association between urges and tics, varying between strong and moderate to small correlations. These results are consistent with the reported clinical phenomenology of premonitory urges showing an increase-decrease pattern around tic execution (e.g., Cohen, Leckman, & Bloch, 2013; Müller-Vahl, 2010).
However, in contrast to this predominant pattern, about a third of the participants did not show significant urge-tic associations and, surprisingly, two participants even showed pronounced negative correlations, one of whom (Fig. 4, lower panel) had a pattern with highest urge intensities not during but between time intervals with intense ticcing. This latter pattern might be explained by alternating periods of tic suppression (with elevated urge) and "free ticcing" (with lower urge), despite the instruction not to suppress. Unfortunately, we cannot further investigate this possibility based on our current data as participants were not interviewed in detail about potential strategies or other observations after completing the urge monitor (though all participants were invited to informally describe any observations, and no one described such an alternation pattern).
The heterogeneity of urge-tic associations observed in this study is at odds with the notion of a straightforward urge-tic relation in GTS with tics being a consequence of an increasing urge to tic. More specifically, these findings are difficult to reconcile with the recently proposed hypothesis, conceptualized as a Bayesian account of tics and urges, that tics represent mitigating actions generated in the SMA to adapt to unexpected, automatically released putaminal sensory signals that are processed and relayed in the insula (Rae et al., 2019). Instead, the data suggest more complex, idiosyncratic and variable urge-tic associations in GTS patients.
This reasoning is further supported by previously reported evidence of two distinct neurophysiological systems modulating tic generation and generation of premonitory urges (Ganos et al., 2012). Importantly, some GTS patients do not have urges at all, or experience them only occasionally. This is the case in about 10 % of adult patients with GTS (Brandt, Beck, Sajin, Baaske, et al., 2016). Also, urges are reported only in about 25 % of 8-10-year-old children with GTS but about 60 % of 15-19-year-old adolescents with GTS (Banaschewski et al., 2003), though the exact percentage of children and adolescents with GTS experiencing urges might have been underestimated in that study. For instance, using a more sensitive questionnaire (the PUTS), Woods et al. (2005) reported premonitory phenomena in 98 % of a sample of 8-16-year-old GTS patients. This notwithstanding, the probability of the occurrence of premonitory urges increases with age (Kwak, Dat Vuong, & Jankovic, 2003; Leckman, Bloch, Scahill, & King, 2006). Leckman et al. (2006) reported that the average age of children becoming aware of premonitory urges is about 10 years (Leckman, Walker, & Cohen, 1993). Therefore, it could be argued that urges develop in parallel to tics or might represent an adaptation to having tics rather than representing a tic-driving phenomenon. The prevailing and somewhat distorting perception of urges as a driving force might also be related to the fact that GTS patients have been shown to have an increased sense of agency (Delorme et al., 2016), which might contribute to patients reporting tics to be a consequence of premonitory urges rather than being more or less erratic events.
Fig. 5 - Correlation between measures of urge-tic association (coefficients from logistic regression, linear regression, and correlation analysis, respectively) and binding scores from the event file paradigm (based on accuracy and response time). Shaded areas represent 95 % confidence intervals.
As expected, there was a positive relation between tic frequency determined based on the RUSH video protocol and tic frequency during the urge monitor recording, confirming good reliability and objectivity of the tic ratings. The fact that participants showed higher tic frequencies during the RUSH protocol is very likely explained by the differences in settings. During the urge monitor, participants were asked to continuously indicate premonitory urges whereas participants had no active task during the RUSH protocol (Cohen et al., 2013; Ludolph, Roessner, Münchau, & Müller-Vahl, 2012). Concentrating on reporting urges may thus have diverted attention away from tics, which has been shown to decrease tic frequency (Brandt, Lynn, Obst, Brass, & Münchau, 2015; Misirlisoy et al., 2015). Continuously focusing on the urge to tic may also have influenced tic expression in other ways, e.g., by facilitating tic suppression (despite the instruction to tic "freely").
Consistent with prior findings, correlation analysis also showed a positive correlation between reported urge intensity in the urge monitor and PUTS total scores (Brandt et al., 2016; Brandt, Beck, Sajin, Baaske, et al., 2016), documenting substantial convergent validity across the different urge measurements. In contrast, measures of urge-tic association from the urge monitor, based on precisely synchronized urge and tic measurements, did not correlate with the PUTS, which is based on participants' general and subjective assessment of urge intensity and urge-tic associational strength. This supports the assumption that subjective reporting of premonitory urges in relation to one's own tics (PUTS) might overemphasize the role of urges as a prerequisite for tics, contributing to a misconception of urges as a driving force for tics.
There were also no significant correlations between the results from the urge monitor analysis and other tic-related clinical assessment tools (RUSH, YGTSS, GTS QoL). However, two of the measures of urge-tic association showed anecdotal evidence for correlation with a clinical measure of OCD, namely the YBOCS. This is in line with results of other studies suggesting a relation between premonitory urges in GTS patients and co-morbid OCD (Openneer et al., 2019;Rajagopal & Cavanna, 2014;Reese et al., 2014).

4.2. Lack of association between urge-tic relationship and event file binding
In the study by Kleimaker et al. (2020), stimulus-response binding in a visuo-motor event file coding paradigm was found to be increased in GTS patients compared to healthy controls, indicating that the interrelation between perception and action, as conceptualized by TEC (Hommel et al., 2001), might be suitable for the understanding of GTS pathophysiology. In the context of TEC, premonitory urges and tics could be viewed as perceptual and motor events bound together in an event file, with the prediction that GTS patients with stronger stimulus-response binding in an experimental context also exhibit stronger urge-tic associations. However, we did not find evidence for a strong correlation between urge-tic associations and the strength of event file binding. Therefore, our study does not support the notion of urge-tic associations directly representing abnormal event files according to TEC (a possible lack of statistical power is discussed below).
One might argue that the lack of such an association is explained by the fact that the study of Kleimaker et al. (2020) assessed visuo-motor rather than somatosensory-motor binding. As premonitory urges often have a somatosensory quality, urge-tic associations may be more akin to somatosensory-motor binding. Although this cannot be completely refuted, (visuo-motor) perception-action binding in the study of Kleimaker et al. (2020) was correlated with tic frequency in patients. A higher tic frequency was associated with stronger binding in GTS patients. Tic frequency at a given time is probably the most suitable marker reflecting the 'ticcing state' of the brain at that time. The fact that such a measure was related to the strength of binding at that time suggests that increased perception-action binding, even when tested in the visual domain, is a core feature of GTS.
Notably, perception-action binding as tested in the study of Kleimaker et al. (2020) is an automatic process (Hommel, 1998). Urge-tic relations, in contrast, are, at least partially, conscious processes that can be modulated intentionally, for instance by yielding to the urge to tic, or not. This is to say that (automatic) perception-action binding and urge-tic associations probably represent profoundly different processes. Although intuitively one might assume that urge-tic relations reflect perception-action binding, the data presented here and evidence derived from other studies (Ganos et al., 2012; Kleimaker et al., 2020) indicate that this view is too simplistic and misleading.
Our data do not allow us to further disentangle the nature of urges and urge-tic relations, which are likely also influenced by experience, environmental influences, comorbidity and other factors. Given that urges seem to develop with age at a later stage than tics (Banaschewski et al., 2003), they might, as pointed out above, represent markers of becoming aware of previously unnoticed extra-movements (tics) that are (partially) involuntary (Ganos et al., 2012) rather than being a driving force of tics. Thus, they might represent an adaptive process directed at perceiving sensory processes in relation to tic activity (Ganos et al., 2012). Tic-directed interoception could create increased awareness for sensory signals preceding movements that generally go unnoticed by individuals without tics (Ganos et al., 2012). For instance, another study utilizing the urge monitor asked healthy participants and GTS patients whether their attention to urges (urges to blink in healthy controls, urges to tic in GTS patients) changed during the task. Results showed that GTS patients reported significantly fewer changes in attention than healthy control subjects, indicating that GTS patients might generally pay more attention to somatosensory signals (Brandt, Beck, Sajin, Baaske, et al., 2016). The concept of premonitory urges as an adaptive phenomenon to tics is supported by the results of our urge-tic relation analysis showing inconsistent associations between premonitory urges and tics in some patients (negative or no correlation).

Limitations and outlook
The number of participants in the present study presents a limitation, especially with respect to the correlational analyses. A sample size of 21 is appropriate for detecting moderate to strong correlations (r ≥ .58) by usual standards (significance threshold .05, statistical power .8). Weaker or multi-factorial associations might have been revealed with a larger sample size. However, our data were (evidently) sufficient to demonstrate heterogeneity of urge-tic associations, providing positive evidence against a simplistic view of ticcing being a direct consequence of an increasing urge to tic. The ambiguity of tics within a spectrum of physiological movements (Paszek et al., 2010) results in a certain degree of uncertainty with respect to tic ratings. Moreover, the fact that participants were asked to report urge intensities in general while tics of the lower half of the body were not part of the rating for the urge monitor might have resulted in lower urge-tic association scores (Brandt, Beck, Sajin, Baaske, et al., 2016). Still, analysis of the agreement of the two independent raters indicates good reliability and objectivity. Also, there was good agreement with tic rating during the RUSH protocol conducted by two experienced neurologists.
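The sample-size statement above (n = 21, significance threshold .05, power .8, detectable correlation of about .58) can be checked approximately with the Fisher z-transformation. The following is a minimal sketch under that assumption; the paper does not say which power calculation was used, and the function name `min_detectable_r` is hypothetical.

```python
import math
from statistics import NormalDist


def min_detectable_r(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest correlation detectable with the given two-sided alpha and power.

    Uses the Fisher z approximation, in which atanh(r) is roughly normal
    with standard error 1 / sqrt(n - 3). This is an assumed method, not
    necessarily the one used by the authors.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    z_r = (z_alpha + z_power) / math.sqrt(n - 3)
    return math.tanh(z_r)                        # back-transform from z to r


r_min = min_detectable_r(21)
print(f"n = 21: minimum detectable r is about {r_min:.2f}")
```

With n = 21 this approximation gives a value close to .58, in line with the threshold stated above; larger samples lower the threshold, which is why weaker associations may have gone undetected here.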
An apparent limitation of the urge monitor task is that participants are instructed to focus on their premonitory urges. It has previously been shown that attention to tics during unsuppressed ticcing increased tic frequency, and therefore similar effects regarding the urge monitor assessment cannot be ruled out (Misirlisoy et al., 2015). Continuously reporting one's urge could also reduce tic expression, either by diverting attention away from tics (Misirlisoy et al., 2015) or by facilitating tic suppression.
The fine-grained analysis of temporal urge-tic associations depends on continuous urge ratings, so the abovementioned potential confounds are inherent to the method. However, their influence could be addressed by methodological variations. For instance, an early experimental study (Himle et al., 2007) collected urge ratings every 30 sec rather than continuously. While strongly reducing the temporal resolution and possibly the accuracy of urge ratings available for the urge-tic analysis, this may allow observing more representative/unperturbed ticcing behavior by limiting the continuous influence of attentional and cognitive demands. Also, the relation between urges and other physiological measures (e.g., skin conductance, pupillometry) should be explored. If a strong association is found, these physiological measures could serve as a proxy for urges without explicit urge ratings by participants. Additionally, semi-automatized tic detection methods (supported by, e.g., video-based face tracking and motion capture) could facilitate the acquisition and analysis of larger data sets to increase statistical power and generalizability of results.

Conclusion
Using an established self-report tool, the urge monitor, to capture premonitory urge intensities and determine urge-tic associations in GTS, we confirmed significant positive correlations between premonitory urges and tics, and between the urge monitor measures and the corresponding clinical urge and tic measures. However, inter-individual differences in urge-tic associations were large, with some participants showing no significant or even a negative association between urge and tics. Our findings corroborate the idiosyncratic and complex relationship between urges and tics and are incompatible with a notion of tics solely being a consequence of an increasing urge.

Declaration of competing interest
There are no conflicts of interest.