Impaired context-sensitive adjustment of behaviour in Parkinson’s disease patients tested on and off medication: An fMRI study

The brain's sensitivity to and accentuation of unpredicted over predicted sensory signals plays a fundamental role in learning. According to recent theoretical models of the predictive coding framework, dopamine is responsible for balancing the interplay between bottom-up input and top-down predictions by controlling the precision of surprise signals that guide learning. Using functional MRI, we investigated whether patients with Parkinson's disease (PD) show impaired learning from prediction errors requiring either adaptation or stabilisation of current predictions. Moreover, we were interested in whether deficits in learning over a specific time scale would be accompanied by altered surprise responses in dopamine-related brain structures. To this end, twenty-one PD patients tested on and off dopaminergic medication and twenty-one healthy controls performed a digit prediction paradigm. During the task, violations of sequence-based predictions either signalled the need to update or to stabilise the current prediction and, thus, to react to them or ignore them, respectively. To investigate contextual adaptation to prediction errors, the probability (or its inverse, surprise) of the violations fluctuated across the experiment. When the probability of prediction errors over a specific time scale increased, healthy controls but not PD patients off medication became more flexible, i.e., error rates at violations requiring a motor response decreased in controls but increased in patients. On the neural level, this learning deficit in patients was accompanied by reduced signalling in the substantia nigra and the caudate nucleus. In contrast, differences between the groups regarding the probabilistic modulation of behaviour and neural responses were much less pronounced at prediction errors requiring only stabilisation but no adaptation. 
Interestingly, dopaminergic medication could neither improve learning from prediction errors nor restore the physiological, neurotypical pattern. Our findings point to a pivotal role of dysfunctions of the substantia nigra and caudate nucleus in deficits in learning from flexibility-demanding prediction errors in PD. Moreover, the data indicate that dopaminergic medication has only limited effects on learning in PD.


Introduction
To behave adaptively, we need to adjust our expectations to persistent environmental changes while sustaining the pursuit of our action goals despite temporary distractions. Environmental changes that do not match our expectations, i.e., prediction errors, are known to cause phasic dopamine signalling in the midbrain. Thereby, they trigger bottom-up processing guiding the adjustment of predictions and the initiation of behaviour (Schultz and Dickinson, 2000;Redgrave and Gurney, 2006;Murty et al., 2011;D'Ardenne et al., 2012). This adjustment pertains to learning since future surprise can be minimised and behavioural implications of prediction errors become more predictable (Fiser et al., 2010;Friston et al., 2014).
The influence of predictions on behaviour is suggested to be regulated by tonic dopamine action, determining the relative weight or precision of bottom-up prediction errors for top-down predictions (Friston et al., 2012). By regulating phasic dopamine release, tonic dopamine has been found to modulate the surprise-driven learning rate, bias action selection, and set transition thresholds between flexible and stable states, favouring either bottom-up sensory input or top-down predictions (Beeler et al., 2010;Beeler et al., 2012;Humphries et al., 2012;Yu et al., 2013;Barter et al., 2015).
During learning, predictions are adapted to a particular context, based on the probabilistic structure of the past (Behrens et al., 2007). Here, the cholinergic and the noradrenergic systems are ascribed a role in computations of uncertainty (Yu and Dayan, 2005;Marshall et al., 2016). However, depending on whether a current context consists of prediction violations that either signal the need for adapting to lasting changes of the environment (flexibility-demanding prediction errors, hereafter) or are caused by temporary chance occurrences of uncommon events (stability-demanding prediction errors), the adoption of either a more flexible or stable state is required, respectively. The interplay of flexibility and stability in response selection is suggested to be balanced by the levels of dopamine in frontostriatal circuits (Williams and Castner, 2006;Cools and D'Esposito, 2011;Doll et al., 2011). Specifically, the striatum assumes the role of a gate for relevant versus irrelevant input into working memory (Badre, 2012;Keeler et al., 2014). High or low energy barriers of attractor networks within the prefrontal cortex are suggested to facilitate either shielding or updating of working memory representations (Durstewitz et al., 2000;Durstewitz and Seamans, 2008). These barriers are probably adjusted contingent on a prioritisation of perceptual input utilising relevance (Summerfield and Egner, 2009;Rauss et al., 2011).
Accordingly, previous studies found that patients with Parkinson's disease (PD), a disorder associated with loss of dopamine neurons in the substantia nigra, show difficulties in the selection and inhibition of motor responses (Wylie et al., 2009;Wylie et al., 2010) as well as in cognitive set-shifting, i.e., shifting or switching between particular stimulus-response links (Cools et al., 2001;Monchi et al., 2004). Moreover, Galea et al. (2012) showed that during action reprogramming requiring a switch from an expected to an unexpected response, PD patients show increased reaction times to unexpected events in contexts of predictable compared to unpredictable environments.
In the present study, we tested the idea that a dopamine deficit impairs response selection by impeding probabilistic inference over either flexible or stable states. To this end, we examined learning from different levels of probabilities of flexibility-demanding versus stability-demanding prediction errors in healthy controls and patients with akinetic-rigid PD on and off dopaminergic therapy during a digit prediction task. We used fMRI to assess whether activity in key dopaminergic regions varies as a function of learning from prediction errors, with altered signalling in PD patients. During the task, participants were required to indicate the occurrence of digit rule switches, as behaviourally relevant violations leading to an update of the predictive rule (revealing flexibility), and to ignore short interruptions, referred to as drifts hereafter, as behaviourally irrelevant prediction errors provoking a shielding of the predictive rule (revealing stability). Importantly, both the absolute frequency of switch and drift occurrences (prediction errors: predicted digits) as well as the relative proportion of switch and drift occurrences (switches: drifts) changed over time. Varying probability and predictability of these events were quantified as decay-dependent information-theoretic quantities, i.e., surprise and entropy, respectively (Harrison et al., 2011; see Methods for further details). Our hypotheses were focused on the effects of decay-dependent surprise, with the latter modelled as a regressor for analysing both behavioural performance as well as BOLD time series to assess learning from prediction errors on a trial-by-trial basis. Decay-dependent entropy was additionally modelled as a nuisance regressor.
First, behavioural data enabled us to test whether probabilistic inference to derive responses to different types of prediction errors differs between the groups. We expected that PD patients off medication would have difficulty learning from prediction errors, i.e., adopting flexible and stable states depending on the switch and drift probability, respectively (Friston et al., 2012) (Hypothesis 1, H1). On the neural level, we were particularly interested in the contribution of brain regions that are known to be rich in dopamine, i.e., the substantia nigra and the caudate nucleus. More specifically, we hypothesised that substantia nigra activity would be positively modulated by both switch and drift surprise in controls but not in PD patients off medication, reflecting phasic dopamine release independent of the events' identity (Redgrave and Gurney, 2006;Hölig and Berti, 2010) (Hypothesis 2, H2). In contrast, we hypothesised an increased activity of the caudate nucleus as a function of switch but not drift probability in controls compared with PD patients due to this region's role in adaptive motor responses to prediction errors modulated by tonic dopamine (Galea et al., 2012;Marshall et al., 2016) (Hypothesis 3, H3). Finally, provided that dopamine is indeed substantially involved in these effects, learning from prediction errors should improve with dopaminergic medication in PD patients, also reflected in a restored surprise-dependent modulation of neural activation within the respective regions in patients on compared with patients off medication (Hypothesis 4, H4).

Participants
Our sample was the same as reported in Trempler et al. (2018). 21 patients (6 females, mean age = 58.81 years, SD = 9.89, range = [40, 72]) meeting the United Kingdom Parkinson's Disease (UKPD) Society Brain Bank Criteria for idiopathic Parkinson's disease (Hughes et al., 1992) were recruited via the neurologic outpatient clinic of the University Hospital of Cologne, Germany. Hoehn and Yahr ratings ranged between I and III under regular medication (Hoehn and Yahr, 1967). During the screening session, the severity of symptoms was further defined according to the motor score of the Unified Parkinson's Disease Rating Scale (UPDRS) (Fahn and Elton, 1987). Based on the judgment of an experienced movement disorder specialist, only patients of the akinetic-rigid subtype were selected. This way, a clinically homogeneous group could be ensured and potential movement artefacts could be minimised. Moreover, all participants scored between 19 and 30 points in the Parkinson Neuropsychometric Dementia Assessment (PANDA; 18-30 points = "age adequate cognitive performance") (Kalbe et al., 2008) and lower than 19 points in the Beck depression inventory-II (BDI-II; cut-off for depression: ≥ 20 points) (Hautzinger et al., 2006). The screening included a training session to ensure that patients would be able to perform the task under their regular dopaminergic medication.
Patients were tested twice, i.e., once with their regular medication ("ON"-state) and once without medication ("OFF"-state; after overnight withdrawal of dopaminergic medication, corresponding to at least 10 h after the last dose). Session order (OFF-ON and ON-OFF) was counterbalanced across the participants. Withdrawal affected motor performance as seen in a significant difference in UPDRS-score between patients ON (M = 19.62, SD = 7.48) and OFF (M = 27.14, SD = 9.46), t(20) = 10.61, p < 0.001. A group of 21 healthy participants (6 females, mean age = 60.05 years, SD = 10.05, range = [36, 74]) matched to the patients regarding age and gender served as control subjects. Healthy controls did not receive any medication. They performed the training, the experiment, and all additional assessments on one day. No participant had undergone neurosurgical treatment for the disease or had a history of other neurological or psychiatric diseases.
The study was performed following the Declaration of Helsinki and had been approved by the ethics committee of the Medical Faculty of the University Hospital Cologne, Germany. Each participant submitted a signed informed consent notification and received reimbursement for participation plus travel expenses afterwards.

Task
During the task, a digit sequence was visually presented at the centre of a computer screen, in either ascending (1-2-3-4) or descending (4-3-2-1) order (Fig. 1). To enable participants to predict forthcoming input, the sequence was repeated constantly, and digits succeeded one another for 1 s, separated by an inter-stimulus interval of 100 ms. Directional changes from ascending to descending digit sequences or vice versa (switches, hereafter) occurred at pseudorandom ordinal positions within the initial sequence. Subjects were asked to signal these events via button press (switch detection). Besides, single digits were omitted occasionally at variable positions without a temporal gap (drifts, hereafter), and participants were instructed to ignore these omissions (drift rejection). Switches and drifts never appeared at the same time, i.e., their occurrence was always unambiguous. During a motor control task, which was implemented to assess the individual mean reaction time, one digit of the sequence repeated continuously but maximally eight times until the participant pressed the response button. A total of 25 of these motor control trials were randomly interleaved across the experiment. A 6 s presentation of a fixation cross distributed across the experiment in 1.33% of the trials (n = 20) served as the baseline (i.e., rest trial).
To assess adaptation to different environments, the task was binned into 12 blocks that either had a high or low probability of switches, either paired with a high or low probability of drifts. Each block consisted of an average number of 125 trials in a full-factorial 2 (probability: high vs. low) × 2 (event: switch vs. drift) design. Transitions between block types resulting from this factor combination were balanced across the session. Probabilities were based on a pilot study, which assessed the performance of 12 PD patients during a staircase procedure of the task with different switch and drift frequencies. As a result, the maximum event frequency in unmixed blocks, in which switches and drifts occurred with the same frequency, was set to 16% (i.e., 8% per event type) and minimum event frequency was set to 8% (i.e., 4% per event type). In mixed (i.e., high-switch and low-drift or vice versa) blocks, the maximum frequency was set to 12%, whereas the minimum frequency was left at 4%. In this way, the difficulty level regarding the overall probability of events was kept constant across the experiment (except for unmixed low-frequency blocks). Stimulus presentation was pseudorandomised using the stochastic universal sampling method (Baker, 1987), which ensured a balanced distribution of switches and drifts across the blocks. Mean separation of the events was 6.24 (SD = 5.40).
The training consisted of ten blocks of 80 trials each and a probability of 16% for switch or drift occurrence. To enable participants to get accustomed to the task, presentation speed started at 1400 ms per digit and adapted block-wise with a decrease of 50 ms provided that the participant correctly reacted to 75% of the events. Besides, patients performed a short training before the scanner session with three blocks of 80 trials at the main experiment's digit presentation speed of 1 s. The randomisation was programmed using MATLAB R2012b (The MathWorks Inc., Natick, MA, USA) and stimuli were presented using Presentation 13.1 (Neurobehavioral Systems, San Francisco, CA, USA).
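The block-wise speed adaptation during training can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the criterion is read as "at least 75% correct", the duration never increases again, and no lower bound is applied; none of these details is specified in the text, and the function name is hypothetical.

```python
def training_durations(block_accuracies, start_ms=1400, step_ms=50, criterion=0.75):
    """Return the digit duration (ms) used in each training block.

    Assumption of this sketch: after a block with accuracy >= criterion,
    the next block is presented step_ms faster; otherwise the duration
    stays unchanged.
    """
    durations = []
    current = start_ms
    for accuracy in block_accuracies:
        durations.append(current)
        if accuracy >= criterion:
            current -= step_ms
    return durations
```

For example, a participant who reaches the criterion in the first and third block but not in the second would see the duration drop once after block one and once after block three.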

Probability model
In a Bayesian cognitive model, an observer's predictions of forthcoming sensory input are represented as probability distributions based on previous sensory input and prior knowledge. Due to the present task design, in which critical events were per se rather improbable and, thus, unpredicted as opposed to frequent standard digits, it is not plausible to assume an ideal Bayesian observer, which would base estimation of probabilities of these events on all previous events. Instead, we supposed that the observer would not be able to remember distant events. Consequently, we made use of a time-dependent decay model derived from Harrison et al. (2011) in which distant events are weighted less than recent ones. According to their formula, the expectations of an observer's model to observe different types of events at trial N are based on the weighted counts α of each type of event k (switch, drift, or standard digit) in the preceding trials T = {1, 2, …, N − 1}, which are exponentially weighted by half-life τ:

$$\alpha_k(N) = \alpha_k(0) + \sum_{t=1}^{N-1} e^{-(N-1-t)/\tau}\, \delta(x_t = k)$$

In this formula, δ(x_t = k) is equal to 1 if the event at time t corresponds to event type k and 0 otherwise. Further, it is assumed that trial N − 1 has the weight 1 (i.e., is fully weighted), while the weights of more distant trials decrease at an exponential rate. Rest trials were neglected during the counting of the events. The weighted counts were computed based on the assumption that τ = 125, according to the mean block length.
The probability of a particular event k occurring at trial N can then be calculated as (Bernardo and Smith, 2009):

$$p(x_N = k) = \frac{\alpha_k(N)}{\sum_{j} \alpha_j(N)}$$

In words, this probability is characterized by the weighted counts of event k relative to the sum of the counts of all possible events (switch, drift, and standard digit). The prior counts α_k(0) before observing the first trial were set to 1/3 for all events representing an uninformative prior (Jeffreys, 1946). Moreover, like previous studies, we used information theoretic indices, i.e., surprise and entropy, to quantify the amount of information provided by the current stimulus that could predict response accuracy and neural responses (e.g., Strange et al., 2005;Bestmann et al., 2008;Mars et al., 2008). The surprise I_k(N) of an event, i.e., its improbability, is given by the negative logarithm of the probability:

$$I_k(N) = -\log p(x_N = k)$$

Conversely, entropy measures the average surprise of all possible events and quantifies the expected information of events regarding their predictability:

$$H(N) = -\sum_{k} p(x_N = k)\, \log p(x_N = k)$$

Fig. 1. Schematic diagram of the task. Stimuli of a simple 4-digit sequence continuously followed each other with a duration of 1 s and an inter-stimulus interval of 100 ms. Subjects had to indicate changes from ascending to descending sequences (and vice versa) (switch), as displayed in the left row, via a button press. Moreover, they had to ignore the omission of a single digit (drift), as displayed in the middle row. During a motor control task, depicted on the right, one digit repeated continuously until the participant pressed the response button. As depicted in the top left diagram, the probabilities of switches and drifts varied block-wise across the experiment in a 2 × 2 design.

The varying extent to which each stimulus was locally unexpected, i.e., its respective surprise value (Fig. 2), was used to explain error rates and fMRI BOLD response amplitudes at switches and drifts. In doing so, we aimed to assess whether learning differs between the two event types, and between healthy control participants and PD patients, regarding behavioural adaptation and its corresponding brain activity.
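The decay model described in this section can be illustrated numerically. The Python sketch below is not the authors' implementation (their analysis code is hosted on OSF); it assumes that a trial lying d trials before trial N − 1 receives the weight exp(−d/τ), that the prior count of 1/3 is added to each weighted count, and that the natural logarithm is used. All function names are hypothetical.

```python
import math

EVENT_TYPES = ("switch", "drift", "standard")


def decay_weighted_counts(events, tau=125.0, prior=1.0 / 3.0):
    """Weighted counts alpha_k(N) for the trial that follows `events`.

    `events` lists the event types of trials 1..N-1 with rest trials
    already removed. Assumption of this sketch: the most recent trial has
    weight 1 and a trial d steps further back has weight exp(-d / tau).
    """
    counts = {k: prior for k in EVENT_TYPES}
    n_prev = len(events)
    for t, label in enumerate(events, start=1):
        counts[label] += math.exp(-(n_prev - t) / tau)
    return counts


def event_probabilities(counts):
    """Normalise the weighted counts to probabilities."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def surprise(probs, k):
    """Surprise of event type k (natural logarithm assumed)."""
    return -math.log(probs[k])


def entropy(probs):
    """Average surprise over all event types."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)
```

Before any trial has been observed, the three event types are equally probable under the uninformative prior, so surprise and entropy both equal log 3. As standard digits accumulate, the surprise of switches and drifts rises, and a recent switch lowers switch surprise more than a distant one.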
Finally, we also explored the predictive ability of models based on counts α_k(N) weighted by other half-lives (see supplementary material at https://osf.io/n5ugp/). We used approximate leave-one-out cross-validation (Vehtari et al., 2017) to compare the predictive ability of the different models. As detailed in the supplementary material, models based on a shorter τ were generally better at predicting response accuracy of both healthy controls and patients by switch and drift surprise; however, results based on a more extended specification of τ ≥ 125 were more sensitive to differences between controls and PD patients. Data and code of these analyses are available on OSF at https://osf.io/n5ugp/.

fMRI data acquisition
Whole-brain imaging data were collected on a 3 T Siemens Magnetom Prisma MR tomograph using a TRTX-head coil. To minimise head motion, the head was tightly fixated with cushions. Functional images were acquired using a gradient T2*-weighted single-shot echo-planar imaging (EPI) sequence sensitive to blood oxygenation level dependent (BOLD) contrast (64 × 64 data acquisition matrix, 192 mm field of view, 90° flip angle, TR = 2000 ms, TE = 30 ms). Each volume consisted of thirty adjacent axial slices with a slice thickness of 4 mm and a gap of 1 mm. Images were acquired in ascending order along the AC-PC plane to provide a whole-brain coverage. Structural data were acquired for each participant using a standard Siemens 3D T1-weighted MPRAGE sequence for a detailed reconstruction of anatomy with isotropic voxels (1 × 1 × 1 mm) in a 256 mm field of view (256 × 256 matrix, 192 slices, TR = 2130, TE = 2.28). Stimuli were projected on a screen positioned behind the subject's head and were presented in the centre of the field of vision by a video-projector. Subjects viewed the screen via a 45° mirror, which was fixated on the top of the head coil and adjusted for each subject to provide a good view of the entire screen.

Behavioural data analysis
We assessed task performance by accurate detection of switches (hits), and correct non-responses to drifts (correct rejections), or, correspondingly, switch misses and false alarms at drifts. The motor control task was used to determine the 90%-quantile of each participant's reaction times. This quantile served as an individual time window, in which button presses in response to switches and drifts were acknowledged as hits and false alarms, respectively. Using Bayesian logistic multilevel models in R (R Core Team, 2018) via the brms package using Stan (Bürkner, 2017;Carpenter et al., 2017), dichotomous erroneous responses, i.e., switch misses and false alarms at drifts, were predicted by decay-dependent information-theoretic indices (i.e., switch surprise and drift surprise as well as entropy dependent on half-life τ) in interaction with event type (i.e., switch and drift) and group. As regards the latter, two separate models were estimated according to our hypotheses, with differences between controls and PD patients OFF interpreted as effects of the disease (H1) and differences between PD patients ON and OFF interpreted as effects of dopamine medication (H4). No comparisons between controls and patients ON were carried out to avoid a confounding of disease and medication effects that might, for instance, relate to medication side effects. Finally, session was added as a factor to make sure that differences between controls and PD patients were not driven by retest-effects in PD patients on their second visit. For a summary of model parameters, including interaction terms, we report regression coefficients and 95% credible intervals (CIs; i.e., Bayesian confidence intervals). This means that there is a 95% probability that the respective parameter falls within this interval, given the evidence provided by the data (note that it would indicate statistical significance on a 5% level if the interval does not contain zero).
For the factors group, session and event type, we used effect coding with −1 for healthy controls and 1 for patients OFF, −1 for patients ON and 1 for patients OFF, −1 for the first and 1 for the second session, and −1 for drifts and 1 for switches. Weak or non-informative default priors of the brms package were used (Bürkner, 2017).
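The individual response window described above can be sketched as follows. This is a minimal illustration rather than the authors' scoring code: it assumes that reaction times are measured in seconds from event onset, that the 90% quantile is obtained by linear interpolation, and that a button press later than the window counts as no response. The function names are hypothetical.

```python
import statistics


def response_window(control_rts):
    """90% quantile of the reaction times from the motor control task."""
    # quantiles(..., n=10) returns the nine deciles; index 8 is the 90% cut.
    return statistics.quantiles(control_rts, n=10, method="inclusive")[8]


def classify_trial(event_type, rt, window):
    """Score one switch or drift trial.

    `rt` is None when no button was pressed. Assumption of this sketch:
    presses later than `window` are treated like missing responses.
    """
    responded = rt is not None and rt <= window
    if event_type == "switch":
        return "hit" if responded else "miss"
    if event_type == "drift":
        return "false_alarm" if responded else "correct_rejection"
    raise ValueError(f"unexpected event type: {event_type}")
```

The resulting misses and false alarms are the dichotomous error outcomes that enter the logistic multilevel models.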

fMRI data preprocessing
Brain image preprocessing and basic statistical analyses were conducted using SPM12 (Wellcome Department of Imaging Neuroscience, London, UK; see: http://www.fil.ion.ucl.ac.uk/spm/software/spm12/). Functional images were slice-timed to the middle slice to correct for differences in slice acquisition time. To correct for three-dimensional motion, individual functional MR (EPI) images were realigned to the mean image. Motion correction estimates were inspected visually as those participants who exceeded a maximum of 3 mm head movements between two scans in the x, y, and z dimensions would have been excluded from further analyses. The anatomical scan was co-registered (rigid body transformation) to the mean functional image. Each subject's co-registered anatomical scan was segmented into native space tissue components. A group-specific template was created using DARTEL with default settings in SPM12. Functional images were then normalised to the MNI space by affine transformations using invertible and smooth deformations (flow fields) for each participant's native space to the template derived from the previous step through the DARTEL tool.
Smoothing was also applied during DARTEL warping with a Gaussian kernel of 8 mm³ full width at half-maximum.
To reduce effects of physiological noise (e.g., due to potential increased disease-specific motion or pulsatile artefacts in the midbrain), we performed a denoising procedure on the EPI data using the default settings of the CONN toolbox in MATLAB (Whitfield-Gabrieli and Nieto-Castanon, 2012), which implements the anatomical component-based noise correction method (aCompCor). Denoising included regressing out the first five principal components associated with white matter and cerebrospinal fluid as well as the motion parameters and their temporal derivatives from the BOLD signal. Finally, a 128 s temporal high-pass filter was applied to the data to remove low-frequency noise.

fMRI design specification
Fig. 2. Illustration of the decay-dependent information-theoretic measures surprise, for switches and drifts, and entropy varying throughout the experiment of one example participant. Surprise and entropy depended on half-life according to the mean block length of 125 trials (based on a formula derived from Harrison et al. (2011)). Surprise values were used to predict the participant's performance and BOLD activity at switches and drifts. Dashed vertical lines reflect boundaries between the different blocks.

The statistical analysis was based on a least-squares estimation using the general linear model (GLM) for serially autocorrelated observations (Worsley and Friston, 1995). The GLM included four regressors coding for onsets and durations of the specific event types, i.e., standard digits (std), switches, drifts, and motor control trials, which were then convolved with the canonical hemodynamic response function (HRF) and regressed against the observed fMRI data. Moreover, to model variability in the BOLD amplitude as a function of decay-dependent surprise and entropy, as outlined in detail above (cf. Probability model), two parametric modulators each were added to the switch and the drift regressor, i.e., one for decay-dependent surprise and one for decay-dependent entropy. We mean-centred each of these modulators before entering the GLM. Whenever two trials were separated by less than 2 s (i.e., less than one TR), only the first one was included in the GLM, whereas the second was not modelled and treated as part of the implicit baseline. Likewise, resting periods were not modelled and served as an implicit baseline (Pernet, 2014). The subject-specific six rigid-body transformations obtained from residual motion correction were included as covariates of no interest.
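The two preparation steps for the parametric modulators, dropping trials that follow another modelled trial too closely and mean-centring, can be sketched in Python. This is an illustration only, not the SPM12 pipeline: it assumes that the 2 s rule is applied relative to the previously kept trial and that centring is done after exclusion, neither of which is stated explicitly. The function name is hypothetical.

```python
def prepare_modulator(onsets, values, min_gap=2.0):
    """Drop trials closer than `min_gap` seconds to the previously kept
    trial, then mean-centre the remaining modulator values.

    Returns (kept_onsets, centred_values).
    """
    kept_onsets, kept_values = [], []
    for onset, value in sorted(zip(onsets, values)):
        if kept_onsets and onset - kept_onsets[-1] < min_gap:
            continue  # second trial of a close pair: left unmodelled
        kept_onsets.append(onset)
        kept_values.append(value)
    if not kept_values:
        return [], []
    mean = sum(kept_values) / len(kept_values)
    return kept_onsets, [v - mean for v in kept_values]
```

Mean-centring leaves the modulator with a zero mean, so the unmodulated event regressor continues to capture the average response to the event type.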
To ensure that neural activity during switches and drifts generally exceeded activation during standard digits, we conjoined one-sample t-tests of the contrasts switch > std and drift > std of healthy controls and PD patients OFF. To this end, the corresponding beta images per participant were submitted to a second-level two-way analysis of variance to perform a conjunction analysis testing against the conjunction null hypothesis (p(peak-level FWE) < 0.05, Nichols et al., 2005). Moreover, individual statistical maps for variations of BOLD amplitudes with surprise at switches and drifts were generated for each participant. We performed region of interest (ROI) analyses to test for BOLD activation differences between the groups in the substantia nigra and the caudate nucleus. The substantia nigra ROIs were derived from the probabilistic atlas of the basal ganglia (ATAG; Keuken et al., 2014). Anatomically defined ROIs of the left and right caudate nucleus were derived from the automated anatomical labelling (AAL) atlas and created using the SPM Wake Forest University (WFU) Pickatlas toolbox (http://www.fmri.wfubmc.edu/cms/software, version 2.3) (Maldjian et al., 2003). For the ROI analyses, we extracted the beta scores of switch and drift surprise and corresponding standard errors per voxel and subject. Using Bayesian linear multilevel models in R (R Core Team, 2018) via brms and Stan with default priors (Bürkner, 2017;Carpenter et al., 2017), we predicted these beta scores by event type, group, session, and ROI (left and right caudate nucleus and substantia nigra) while accounting for dependencies between responses belonging to the same voxel or subject. Furthermore, varying degrees of uncertainty in the beta scores (i.e., varying standard errors) were accounted for by using an inverse-variance weighting scheme with more precisely estimated scores receiving higher weight (Cooper et al., 2009; https://osf.io/n5ugp/).
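The principle behind inverse-variance weighting can be shown with a small Python example. It computes a simple fixed-effect weighted mean and is meant only to illustrate why precisely estimated beta scores count more; the multilevel models fitted with brms are considerably richer, and the function name here is hypothetical.

```python
def inverse_variance_mean(betas, standard_errors):
    """Weighted mean of beta scores with weights 1 / SE^2.

    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = (1.0 / total) ** 0.5
    return pooled, pooled_se
```

With two beta scores of 1.0 and 3.0, equal standard errors give a pooled estimate of 2.0, whereas doubling the standard error of the second score pulls the estimate towards the first, more precise one.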
We tested hypothesised differences in the beta values in the selected ROIs between healthy controls and PD patients OFF, and between PD patients OFF and ON. The α-level was set to 5%, with a 90% CI for directional hypotheses. Further, in addition to regression coefficients and CIs, we report the posterior probability (pp) that the differences in neural activation between the groups lie in the expected direction.

Behavioural results
Bayesian logistic multilevel models on surprise and entropy estimates that depended on decay half-life τ (cf. Probability model) were used to predict behavioural errors, that is, switch misses and false alarms at drifts of healthy controls versus PD patients off medication, and PD patients on versus off medication. We found that the modulation of error rate by decay-dependent surprise and entropy depended on event type and group. Although there was no main effect of switch and drift surprise across the groups, the results are in accordance with H1 by revealing an interaction effect of SWITCH SURPRISE X GROUP X EVENT TYPE, as well as an interaction effect of DRIFT SURPRISE X GROUP X EVENT TYPE: In controls, higher probability, i.e., lower surprise of switches led to more switch hits, whereas in PD patients OFF the rate of switch hits increased as a function of switch surprise. In contrast, while drift surprise did not modulate the error rate of healthy controls at drifts, the false alarm rate of PD patients OFF increased as a function of drift surprise (Fig. 3). Moreover, we observed that increasing entropy affected the error rate at switches more than at drifts in healthy controls only. Independently of their medication, PD patients revealed an interaction effect of DRIFT SURPRISE X EVENT TYPE, i.e., increasing drift surprise was accompanied by a higher rate of false alarms at drifts but also a lower rate of switch misses in both PD patients on and off medication. However, there were no differences between the medication states. Regression coefficients and corresponding 95% CIs for each predictor variable of the two models are given in Table 1 for controls and PD patients off medication and in Table 2 for PD patients on and off medication.

fMRI results
Main effects of switches and drifts. To first identify the network associated with prediction error processing in general, we ran a conjunction analysis on the statistical maps of the two groups and the two tasks, i.e., of healthy controls and PD patients OFF during switch and drift processing [controls (switch > std) ∩ controls (drift > std) ∩ patients OFF (switch > std) ∩ patients OFF (drift > std)]. This analysis revealed higher activations in a network comprising, amongst others, the inferior parietal cortex as well as the inferior frontal gyrus extending into the anterior insula during switches and drifts compared to standard digits (Table 3) (Fig. 4).
Parametric effects of decay-dependent surprise. Bayesian linear multilevel models were employed to test our hypotheses (H2-H4) that decay-dependent surprise at switches and drifts modulated the BOLD response in defined ROIs, i.e., the substantia nigra and the caudate nucleus, differently in controls and PD patients OFF and ON. Thus, corresponding to the reported error rate effects, we addressed the hypothesised differences between healthy controls and PD patients OFF, and between PD patients ON and OFF.
Regarding the caudate nucleus, medication did not restore the neural activation within this region. Instead, the right caudate nucleus covaried positively with switch surprise in patients on compared to patients off medication, L: b ON-OFF = 16.44, 90%-CI = [-42.49, 75.72] (Fig. 5, bottom-right panel).

Discussion
The present fMRI study aimed at gaining insight into the potential involvement of dopamine in learning from prediction errors requiring either flexible updating or stabilisation of current predictions. We addressed this issue by comparing the effects of prediction error probability based on a specific decay half-life on performance and activity in key dopaminergic regions of healthy control participants and PD patients on and off dopaminergic medication during a digit prediction task. We found that in healthy participants but not in PD patients, increasing decay-dependent probability, that is, lower surprise of flexibility-demanding violations of a predictable digit sequence (i.e., switches), was accompanied by better switch detection (H1). Contrary to our hypothesis (H4), there were no performance differences between PD patients on and off medication. On the neural level, we observed a deficient modulation of decay-dependent surprise of these violations by the right substantia nigra (H2) and the caudate nucleus (H3) in PD patients off medication, but no changes of surprise responses under medication (H4). In the following, we will discuss our behavioural findings on impaired probability-dependent responding in PD patients, and will then go into more detail regarding our fMRI results.
Error rates of healthy controls and PD patients at switches were differentially affected by changes in the time-dependent probability of these event occurrences. Healthy participants detected more switches when switch probability within a certain time frame became particularly high. In contrast, PD patients, no matter whether on or off medication, reacted less flexibly to switches when these became more probable and thus did not adapt to high probability conditions. This finding indicates that PD patients are impaired in anticipating flexibility-demanding environmental changes due to difficulties in probabilistic learning. Thereby, this result supports recent studies reporting deficits in implicit contextual learning in PD patients (Perugini et al., 2016; Perugini and Basso, 2017) and extends previous results implicating dopamine in flexible responses to sensory prediction errors by suggesting a corresponding impairment in PD (Iglesias et al., 2013). However, since medication did not restore this deficit, further transmitter systems may be involved in learning to adjust one's behaviour to changing contextual demands (see below). We found some evidence that PD patients compared with healthy controls are more susceptible to respond to rare, that is, highly surprising stability-demanding prediction errors requiring motor inhibition. Notably, previous research showed enhanced instead of lower distractor resistance in PD patients (Cools et al., 2009). However, this alleged advantage in PD patients possibly results from inflexibility rather than active stabilisation (Uitvlugt et al., 2016). In the present study, the patients' inflexibility (and alleged stability) in reacting to sequential violations in general had a particular impact upon high-probability conditions, whereas PD patients became more prone to deliver responses when stimuli became more surprising, no matter whether these stimuli required a motor response or not. Accordingly, the patients' somewhat arbitrary responses to rare unpredicted events reflected deficient response selection (Humphries et al., 2006). Together, our behavioural findings thus indicate that PD patients have deficits in distinguishing between and learning from different types of unexpected events requiring either stabilisation or updating of predictions.
On the neural level, right substantia nigra activity showed a positive correlation with switch surprise in controls but not in patients off medication. This reflects, in turn, a decrease of neural activity in the course of more frequent event occurrence, which has been regarded as a sign of learning (Turk-Browne et al., 2010; Schiffer et al., 2012). Paralleling our behavioural findings, learning from flexibility-demanding prediction errors in high-probability conditions thus appears to be accompanied by a relative activation decrease within the substantia nigra in healthy subjects, whereas there was no modulation of activity within this region in PD patients. Although this finding is in accordance with a broad literature reporting prediction error coding and learning in the midbrain dopaminergic nuclei (e.g., Schultz and Dickinson, 2000; Redgrave and Gurney, 2006; D'Ardenne et al., 2012), recent fMRI and PET studies report that the substantia nigra specifically encodes belief updates (i.e., Bayesian surprise) but not sensory (information-theoretic) surprise, that is, pure unexpectedness (Nour et al., 2018; Schwartenbeck et al., 2016). Accordingly, our finding can be explained by the behavioural gain that results from an adaptation to local environmental challenges by learning dynamically changing probabilities of unexpected events over a specific time scale. Moreover, contrary to our hypothesis, substantia nigra activity was not modulated by drift surprise in either group, suggesting that decay-dependent surprise signals within this region are specific to flexibility-demanding violations and thus indeed process information content rather than unexpectedness per se.
In healthy controls but not in PD patients off medication, BOLD amplitudes in the caudate nucleus increased with a higher probability of switch occurrence. The involvement of the caudate nucleus in predictive processing of flexibility-demanding events extends findings from previous fMRI studies and computational models that highlight striatal signalling in delivering gating input to frontal areas to allow flexible updating of cortical representations, possibly modulated by dopamine (e.g., O'Reilly and Frank, 2006; Stelzel et al., 2013). Notably, our results reveal evidence for a caudate activation increase as a function of drift surprise in healthy participants (instead of a decrease as was the case for switch surprise). This could indicate that environments in which stability-demanding events become more probable rather lead to a decrease in striatal signalling but, concurrently, increased striatal firing rates when events are highly unexpected. Although this is speculative, the direction of the reported correlations could be accounted for by the relationship between phasic and tonic dopamine release (Grace, 1991). It is suggested that dopaminergic neurons do not only respond to unpredicted events per se but also encode their precision by tonic dopamine release over longer time scales (Fiorillo et al., 2008; Friston et al., 2012). Therefore, the reciprocal relationship between anticipation and surprise with respective increased neural responses to predicted and unpredicted events might reflect tonic and phasic dopamine signals, respectively (Schmitz et al., 2003; O'Reilly and Frank, 2006; Yu et al., 2013).
Table 3. Maxima of activation from the conjunction analysis of the contrast images of switch > std and drift > std of healthy controls and PD patients off medication at p < 0.05 peak-level FWE-corrected. Labels are reported according to the AAL atlas. Entries in italics indicate sub-peak regions that are more than 8 mm apart within a cluster. MNI, Montreal Neurological Institute.
Fig. 4. fMRI activation at p < 0.05, peak-level FWE-corrected threshold for the conjunction analysis identifying the brain regions that were more active during switches and drifts relative to standard digits in both healthy controls and PD patients off medication.
Crucially, the dopaminergic medication did not restore learning from prediction errors, suggesting that dopamine supply alone cannot explain the reported effects. Previous studies reported heterogeneous findings regarding the impact of dopaminergic therapy on learning. Studies demonstrate effects of medication on locomotor (Roemmich et al., 2014) and sensorimotor (Wolpe et al., 2018) adaptive learning as well as on reinforcement learning, but with mixed results on whether medication has beneficial and/or detrimental effects (e.g., Frank et al., 2004; Bódi et al., 2009; Argyelan et al., 2018; McCoy et al., 2019; see Meder et al., 2019, for a recent review). Similarly, with regard to uncertainty learning, i.e., when decisions are based on the integration of prior and current sensory information, some studies report an improvement due to dopaminergic medication (Wolpe et al., 2015; Vilares and Kording, 2017; Tomassini et al., 2019), while others did not find differences between PD patients on and off medication (Perugini and Basso, 2017; see Perugini et al., 2018).
It has been suggested that dopamine deficiency reduces the modulation of performance rather than learning itself (e.g., Beeler et al., 2010; Smittenaar et al., 2012). Accordingly, a recent fMRI study found that activation of the dorsal striatum associated with the decision event in a stimulus-response learning task increased on dopaminergic medication, whereas signals of the ventral striatum related to learning during a feedback event were depressed by medication (Hiebert et al., 2019). Thus, the particular role of dopamine in learning from prediction errors might consist in the modulation of correct response selection by exploiting already learned uncertainty representations (Beeler et al., 2010; Marshall et al., 2016). In contrast, signals of environmental uncertainty might rather be encoded by other neurotransmitters such as noradrenaline or acetylcholine (Dayan, 2002, 2005; Marshall et al., 2016). For example, the hippocampus has been associated with contextual learning by extracting statistical information to create a representation of the environmental volatility (Schapiro et al., 2014; Kluger and Schubotz, 2017). Hippocampal dysfunction in PD has been associated with cholinergic loss resulting in deficits in learning and memory, i.e., a progression towards dementia (Hall et al., 2014). Moreover, previous work highlights the role of noradrenaline depletion due to neuron loss within the locus coeruleus in PD, accompanied by cognitive inflexibility (Delaville et al., 2011; Vazey and Aston-Jones, 2012) and impaired inhibitory control (Borchert et al., 2016; Rae et al., 2016). Using fMRI, Ye et al. (2015) showed that PD patients' improvement in inhibitory control on atomoxetine, a noradrenaline reuptake inhibitor, was associated with increased functional and structural frontostriatal connectivity. Moreover, the authors showed that these beneficial effects on task performance could also be predicted by the patients' levodopa equivalent daily dose.
These findings suggest that dopamine release interacts with other neurotransmitter systems and possibly is affected by and influences uncertainty representations by means of the sensory inputs' goal relevance (Picciotto et al., 2012; Mizumori and Tryon, 2015; Aly and Turk-Browne, 2018). In the present study, we therefore assume that dopaminergic drugs do not enhance learning signals because these are rather provided by other neuromodulators. As a result, predictive strategies cannot be adapted to increasing demands on flexibility, resulting in suboptimal performance. However, to establish the contribution of dopamine to impaired learning in PD, future studies with direct neural recordings should measure dopaminergic neuron responses to prediction errors of varying precision, for example in patients undergoing deep brain stimulation surgery.
Although not part of the hypotheses, it should be noted that increased time-dependent drift probability also improved flexible responding to switches; that is, both increased half-life weighted switch and drift occurrence led to better detection of switches in healthy subjects. Thus, contextual learning seems to rely on teaching signals provided by all types of violations but ultimately only impacts upon the motor response to flexibility-demanding events. This finding supports previous accounts, according to which dopamine sensitises behaviour to higher-level precision estimates by highlighting surprising input in predictable contexts (Bestmann et al., 2014;Marshall et al., 2016). Because each violation elicited changes in both tracked measures, i.e., in switch as well as drift surprise, future studies could also investigate the effect of switch and drift surprise (and their interaction) on BOLD response at drifts and switches, respectively. Accordingly, we acknowledge that both switches and drifts are relevant for updating the expectation of long-run future event occurrence and are thus not qualitatively different in every respect. However, although the same process likely achieves learning of drift and switch probabilities, our findings suggest that the effects of this learning on appropriate response selection to the two event types differ. Further research is warranted to disentangle the specific processes related to uncertainty learning on the one hand and response selection on the other hand.
Finally, it is important to acknowledge the dependency of the present results on the time scale (i.e., half-life τ) we used to fit the probability model. Our exploratory results suggest that different temporal scales differentially impact on behavioural and neural responses (see supplementary material at https://osf.io/n5ugp/) so that the present results on differences between the groups in integrating past events must be considered against the background of the half-life we selected according to our experimental manipulation. It would have exceeded the scope of the present study to estimate individual half-lives, that is, to investigate individual differences in time scales over which sensory information is actually accumulated, but we consider this to be an important question for future research. In line with that, it has been suggested that the hierarchical organisation of the cortex is determined by specific time scales over which information is aggregated (Kiebel et al., 2008; Harrison et al., 2011). Moreover, subjects differ concerning the number of samples they use when coding probabilities (Trempler et al., 2017) and the time they spend within one representational state (Vidaurre et al., 2017). Further studies could investigate whether PD patients also represent probabilities of upcoming flexibility- and stability-demanding sensory input but on a different (probably shorter) time scale, as our additional analyses suggest. Moreover, it is reasonable to assume that the accumulated evidence for an event to occur probably consists of an interplay of its absolute probability and the time elapsed since its last occurrence. Previous studies provided evidence for the role of dopamine in gathering information from the passage of time (Pasquereau and Turner, 2014; Tomassini et al., 2016; Tomassini et al., 2019).
Thus, future studies should elaborate on the dopaminergic modulation of learning from either flexibility- or stability-demanding prediction errors by taking temporal dynamics of prediction formation into account. In sum, our study provides evidence that altered decay-dependent surprise-driven learning signals in the substantia nigra and the caudate nucleus, though unaffected by dopaminergic therapy, contribute to a deficient adaptation of behaviour in response to flexibility-demanding surprising events in PD. These findings provide novel insight into the specificity of dopamine in exploiting learning and corresponding deficits in PD.

Declaration of competing interest
The authors declare that they have no conflict of interest.