7T functional MRI finds no evidence for distinct functional subregions in the subthalamic nucleus during a speeded decision-making task

The subthalamic nucleus (STN) is a small, subcortical brain structure. It is a target for deep brain stimulation, an invasive treatment that reduces motor symptoms of Parkinson's disease. Side effects of DBS are commonly explained using the tripartite model of STN organization, which proposes three functionally distinct subregions in the STN specialized in cognitive, limbic, and motor processing. However, evidence for the tripartite model exclusively comes from anatomical studies and functional studies using clinical patients. Here, we provide the first experimental tests of the tripartite model in healthy volunteers using ultra-high field 7 Tesla (T) functional magnetic resonance imaging (fMRI). Thirty-four participants performed a random-dot motion decision-making task with a difficulty manipulation and a choice payoff manipulation aimed to differentially affect cognitive and limbic networks. Moreover, participants responded with their left and right index finger, differentially affecting motor networks. We analysed BOLD signal in three subregions of the STN along the dorsolateral-ventromedial axis, identified using manually delineated high resolution anatomical images and based on a previously published atlas. Using these paradigms, all segments responded equally to the experimental manipulations, and the tasks did not provide evidence for the tripartite model.

The so-called tripartite model of the STN was originally based on tracing studies in non-human primates (e.g., Haynes & Haber, 2013;Joel & Weiner, 1997;Parent & Hazrati, 1995) and is used to explain these side effects, by proposing that the STN is subdivided in three parts with different functional roles (Temel, Blokland, Steinbusch, & Visser-Vandewalle, 2005). These subdivisions are hypothesized to be connected to three cortical networks, which can be characterized as a 'limbic' network, a 'cognitive' network, and a 'motor' network. Consequently, the proposed subdivision of the STN also consists of limbic, cognitive, and motor parts, which can be found in the ventromedial, central, and dorsolateral parts of the STN. Assuming this tripartite division, misplacements of the electrode in the limbic or cognitive parts of the STN, rather than the targeted motor part, are thought to be the origin of non-motor side effects of DBS of the STN (Karachi et al., 2009;Temel, Blokland, et al., 2005).
Although empirical studies provide indirect evidence that the tripartite model is plausible in the human brain (Greenhouse, Gould, Houser, & Aron, 2013;Lambert et al., 2012;Mallet et al., 2007), a systematic review of the neuroanatomical literature on the STN in the non-human primate shows little consistency in the number of subdivisions and their topological organization (Keuken et al., 2012). Moreover, to the best of our knowledge, there is no study that shows direct evidence in a healthy human population that different subparts of the STN are involved in different functions. Currently, the evidence for putative functional subdivisions in the STN comes nearly exclusively from neuroanatomical data in non-human primates (Haynes & Haber, 2013;Joel & Weiner, 1997;Parent & Hazrati, 1995), rather than from functional data acquired during the execution of tasks that require limbic, cognitive or motor processing. Having said that, Greenhouse et al. (2011), showed that stimulating more dorsal versus more ventral DBS contacts led to reduced response times in PD patients, whereas more ventral stimulation increased reports of positive emotion. More ventral stimulation also improved task switching performance in PD patients, unlike dorsal stimulation (Greenhouse et al., 2013). Patients with obsessive-compulsive disorder (OCD) have received DBS stimulation in the STN as well. Stimulation in these patients seems to be more effective in more ventrally located electrodes (Mallet et al., 2007;Mulders et al., 2016) and increased activity in ventral ein contrast to more dorsally located electrodese has been related to OCD severity, consistent with a "limbic" STN region (Burbaud et al., 2013;Piallat et al., 2011;Welter et al., 2011). Some anatomical tracing (Nambu, Takada, Inase, & Tokuno, 1996) and electrophysiological recording/ stimulation work (Nambu et al., 2000) in non-human primates actually shows very widespread projections from (pre)motor cortex across the entire main axis of the STN, providing some functional-anatomical data that speaks against a tripartite model. Moreover, recent studies doing in-vivo tract tracing using diffusion-weighted MRI have shown that hyperdirect limbic cortical projections are very scarce (Temiz, S ebille, Francois, Bardinet, & Karachi, 2020) and those projections that do get close to the STN might actually terminate in more medial regions such as the tegmental area (Coenen et al., 2022).
Here, we test the tripartite hypotheses by imaging the STN during a perceptual decision making task known as the random-dot motion task (RDM; Ball & Sekuler, 1982;K. H. Britten, Shadlen, Newsome, & Movshon, 1993;K. Britten, Shadlen, Newsome, & Movshon, 1992;Pilly & Seitz, 2009). In the RDM task, the participants are presented with a cloud of randomly moving dots of which a subset moves coherently to the left or the right. The participant has to decide what the dominant direction of motion is by making left or right button presses (Fig. 1A). To modulate the hypothesized cortico-basal ganglia networks, two manipulations were added to the task based on earlier literature (for review, Mulder, Van Maanen, & Forstmann, 2014).
The second manipulation altered cognitive processing, by changing the difficulty of the stimulus discrimination task, which has been shown to change the rate of evidence accumulation during decision making. On half of the trials, the coherence of the dots on the screen was relatively high (easy trials), whereas on the other trials the coherence was relatively low (hard trials). This manipulation has been shown to modulate activity in 'cognitive' frontal cortical areas such as the dorsolateral prefrontal cortex (Heekeren, Marrett, Bandettini, & Ungerleider, 2004;Kaiser, Lennert, & Lutzenberger, 2007), as well as in the insula (Binder, Liebenthal, Possing, Medler, & Ward, 2004;Ho, Brown, & Serences, 2009;Keuken, Mü ller-Axt, et al., 2014;Thielscher & Pessoa, 2007). Furthermore, in earlier work using electrophysiological recordings and UHF-fMRI, the STN has been shown to reflect a 'conflict' or 'normalization' signal (Bogacz & Gurney, 2007;Frank et al., 2015;Keuken et al., 2015) that should be inversely proportional to the similarity of two choice options.
While performing the task, participants were scanned using ultra-high field (UHF) 7 Tesla (T) functional MRI. Unlike fMRI at lower fields, UHF fMRI can potentially resolve fine activation patterns within the STN, because of its increased spatial resolution and signal-to-noise ratios (De Hollander, Keuken, van der Zwaag, Forstmann, & Trampel, 2017;Mileti c et al., 2020; van der Zwaag, Sch€ afer, Marques, Turner, & Trampel, 2016). To maximize anatomical specificity of our functional measurements, we used manually delineated, anatomically defined masks of the STN, based on high resolution structural images (Keuken, Bazin, et al., 2014). These STN masks were subdivided in three parts of equal volume along their dorsolateraleventromedial axis (see Fig. 2) using an automated procedure (see Methods for details and Discussion for additional considerations regarding this operationalization). In Supplementary Materials, we also used an atlas-based tripartite subdivision that was based on diffusion MRI (Accolla et al., 2014). To maximize the sensitivity of the functional imaging data, an optimized fMRI protocol was used that maximizes blood-oxygenation level dependent (BOLD) sensitivity in the STN Mileti c et al., 2020).
If the STN has functional subdivisions that are both involved in perceptual decision-making as well as differentially involved to the three main cognitive factors that were manipulated (response biases, task difficulty, and lateralized motor processing), the BOLD-response in these three Fig. 1 e A. Example trial of the experimental paradigm. Each trial started with a fixation cross, followed by the potential payoff cue, which was either an arrow pointing leftwards or rightwards (biased trials) or an X (neutral trials). After the potential payoff cue, the fixation cross was shown again, followed by the random-dot motion stimulus, during which the participants made their response by means of a button press. After 1.5 s, feedback was shown to present the reward earned, which depended on the accuracy of the response and the congruency of the potential payoff cue. The trial ended with a fixation cross. B. Behavioral data. Error bars are within-subject standard error (Cousineau, 2005). Labels on the x-axis are congruent (C), neutral (N), or incongruent (I) potential payoff cue conditions. C. Model comparison. The bar plot shows, per subject, the difference between BPIC-values for DDM1 (starting point shift) and DDM2 (drift rate shift), to compare the strength of evidence for one over the other strategy. Colors indicate winning model per participant; the overall winning model was DDM3 (lowest summed BPIC) with both drift rate and starting point varying by potential payoff cue. D. Size of parameter changes due to the experimental manipulations. v ¼ drift rate, z ¼ starting point. E. Quantile probability plot of model fit (model 2). Crosses are data (D), circles are model (M), colors indicate the difficulty (left two panels, easy (E) and hard (H)) or congruency of the potential payoff cue (congruent (C), neutral (N), incongruent (I)). Grey dots are individual predictions using a random sample from the posteriors, the colored circles are the mean of these posteriors predictive data points. The model overestimates the skewness of the RT distributions, which combined with the slow errors in the data, is suggestive of an urgency signal. The model furthermore overestimates accuracy after incongruent potential payoff cues. c o r t e x 1 5 5 ( 2 0 2 2 ) 1 6 2 e1 8 8 segments should be presumably differentially modulated as well. Specifically, (a) the ventromedial 'limbic' segment would be most sensitive to the response bias cues, (b) the central 'cognitive' segment would be most sensitive to the difficulty manipulation, and (c) the dorsolateral 'motor' segment would be most sensitive to the response hand.
To test these hypotheses, we first contrasted the BOLD responses during the main task conditions. Specifically, we tested for differences between BOLD responses between payoff and neutral cues (limbic), between easy and hard stimuli (cognitive), and between the two response hands (motor). Furthermore, we tested whether these contrasts had different sizes in the three STN subdivisions. Subsequently, we adopted a model-based cognitive neuroscience approach, capitalizing on individual differences. This approach potentially increases sensitivity and specificity to differential activation patterns (De Hollander, Forstmann, & Brown, 2016;Forstmann, De Hollander, Van Maanen, Alkemade, & Keuken, 2017;Forstmann, Wagenmakers, Eichele, Brown, & Serences, 2011;B. M. Turner, Forstmann, Love, Palmeri, & Van Maanen, 2017;B. M. Turner, Palestro, Mileti c, & Forstmann, 2019). Concretely, we fit the diffusion decision model (Forstmann, Ratcliff, & Wagenmakers, 2016;Ratcliff, 1978Ratcliff, , 2006Ratcliff & McKoon, 2008;Ratcliff, Smith, Brown, & McKoon, 2016) to the behavioral data. We then tested whether individual differences in behavioral performance, as quantified by the DDM, covaried with the individual differences in the size of the corresponding fMRI contrasts (Forstmann, Brown, et al., 2010;Forstmann et al., 2008;Mulder et al., 2014Mulder et al., , 2012. In this way, we tested the hypothesis that the different segments were specifically related to the two latent processes underlying the two task manipulations.

Methods
No part of the study procedures or analysis was publicly preregistered prior to the research being conducted. We report how we determined our sample size, all data exclusions, all inclusion/exclusion criteria, whether inclusion/exclusion criteria were established prior to data analysis, all manipulations, and all measures in the study.

Participants
We analyzed two data sets collected at different scanner sites: Max Planck Institute for Human Brain and Cognitive Sciences in Leipzig, Germany, and the Spinoza Centre for Neuroimaging in Amsterdam, the Netherlands. The behavioral paradigms were identical but for one task parameter, and data acquisition protocols were built to be as identical as possible within the technical constraints of both scanners. Total sample size was determined based on previous studies with comparable experimental paradigms (Forstmann, Brown, et al., 2010;Forstmann et al., 2008;Ho et al., 2009;Mulder et al., 2012).  2 e Illustration of the three STN segments in individual space overlaid on the .5 mm isotropic averaged ME-GRE image from dataset 1 (radiological convention). To obtain these segments, manually delineated STN masks were first transformed to MNI2009c space. On the (demeaned) coordinates of these masks, a principal component analysis was performed to identify the 3D vector that explained most variance in the coordinates, which was the ventromedialedorsolateral axis.
Within each mask separately, all voxels were given a 'ventromedial score', which subsequently was used to divide the mask in three main segments: The 'posterior-dorsolateral' segment A (blue), the central segment B (B), and the 'anteriorventromedial' segment C (red). Top row corresponds to a coronal view, where three white lines indicate the axial cuts illustrated on the bottom. The blue region corresponds to the dorsolateral segment A, the green region to central segment B, and the red region to ventromedial segment C. For an illustration of the three STN segments based on the atlas provided by (Accolla et al., 2014), see Figure S1.
In dataset 1, 19 healthy subjects were scanned (10 males; mean age 26.9, std. age 2.4, range 23e32). All subjects had normal or corrected to normal vision and no history of neurological or psychological disorders. All subjects were righthanded, as confirmed by the Edinburgh Inventory (Oldfield, 1971). All subjects participated in an earlier study using both structural and functional MRI in the basal ganglia . The study was approved by the ethical committee at the Medical Faculty, Leipzig University, Germany. All subjects gave written informed consent and received a monetary reward for their participation, as well as an additional monetary reward based on their task performance.
In dataset 2, 14 healthy subjects were scanned (7 males; mean age 24.14, std. age 3.3, range 20e29). Two participants were left-handed. The study was approved by the local ethical committee of the University of Amsterdam, the Netherlands. All subjects gave written informed consent and received a monetary reward for their participation, as well as an additional monetary reward based on their task performance.
Sample size of the first dataset was determined based on an earlier study into the STN (De Hollander et al., 2017). Sample size of the second data set was determined based on effect size estimates of the first data set. Exclusion criteria were not established prior to the study; no subjects were excluded. We report all manipulations and measures in the study below.

Experimental paradigm
The experimental paradigm is illustrated in Fig. 1A. The task paradigm was a random-dot motion (RDM) task (Ball & Sekuler, 1982;K. H. Britten et al., 1993;K. Britten et al., 1992;Pilly & Seitz, 2009), in which the participant is presented with a cloud of randomly moving dots. A subset of the dots, determined by the coherence level, moves coherently to the left or right. The participant is instructed to decide in which direction the cloud moves on average by making a button press with the left or right hand. Response time is defined as the time between the onset of the RDM stimulus and the button press. We used a factorial design with potential payoff cue type and RDM stimulus difficulty as independent variables. Every trial (384 in total) started with a potential payoff cue (one out of three different options), followed by the RDM stimulus (one out of two different difficulty levels), and feedback. The potential payoff cues consisted of an arrow pointing to the left (25% of the trials) or right (25% of the trials), or a cross ('neutral' condition, the remaining 50% of the trials). Subjects could earn additional monetary reward based on their performance. The potential payoff cue indicated different potential monetary rewards: different responses yielded different payoffs, provided that the response was correct. Specifically, if a response was correct and congruent with the direction of the potential payoff cue arrow, the subject earned .04 euro for that trial. However, if the response was correct, but incongruent with the potential payoff cue arrow, the subject earned only .01 euro for that trial. If a response was correct on a neutral trial, the subject earned .025 euro. When subjects gave the incorrect response, they did not gain any reward, regardless of the cued direction.
The RDM consisted of a cloud of dots in a circle with a diameter of 5 (dva) and with on average 16.7 dots per square degree. The dots moved around with a speed of 5 dva per second. Frames were presented at a speed of 60 frames per second. On the first three frames, the dots were randomly positioned within the circle. The task then subsequently looped over three frames while the dots of the presented frame were repositioned. Before a frame was drawn, a portion of dots, determined by the 'coherence' parameter (determining the difficulty of the trial), was repositioned a fixed amount to the left or right (making sure that the speed was 5 per second). The remaining portion of dots was moved by the same amount, but in a random direction. The RDM was always presented for 1500 msec, independent of the subject responses, to prevent visual stimulus duration from confounding response behavior. In dataset 1, the coherences were set to 16% (easy, 50% of trials) and 8% (hard, 50% of trials). In dataset 2, the coherences were set to 35% (easy, 50% of trials) and 15% (hard, 50% of trials). Since the analyses below rely on withinsubject differences in behavior between difficulty conditions (and no ceiling or floor performance was reached), this between-dataset difference in the coherence levels should not affect the results. Regardless, we accounted for any potential confounds by incorporating a random effect of dataset in the whole-brain GLM to model any dataset-related effects in the group statistics. This random effect was not significant in any of the regions of interest for any of the contrasts of interest.
After the RDM stimulus, feedback was presented on whether they were correct and how much money they earned on that trial 'þV0.01' (incongruent potential payoff cue), 'þV0.025' (neutral potential payoff cue), or 'þV0.04' (congruent potential payoff cue) for correct trials, or 'þV0.00' for incorrect trials. If subjects responded in less than 250 msec, or needed more than 1250 msec, they received the feedback 'too fast' or 'too slow', respectively.

General procedure
After the participants were screened for MRI, they were introduced to the task on a laptop. They were explicitly explained how the cue payoffs determine the amount of money earned per trial, with a few examples. The participants were then also explained why, with limited information, following the potential payoff cues is a rational strategy. After this introduction, subjects performed 384 trials of an RDM task with potential payoff cues and two difficulty levels, divided over 3 runs, each consisting of 128 trials in 19 min and 21 sec (approximately one hour of functional scanning in total). Over these 384 trials, subjects could gain an additional monetary reward, based on their performance, of at most 9 euro and 60 cents. A trial always took 9 sec (3 TRs) in total (see Fig. 1A). It started with a fixation cross for 0, 750, 1500, or 2250 msec (pseudo-randomly sampled), after which the response payoff cue was presented for 1000 msec. After the potential payoff cue, a second fixation cross was presented for 1000, 1750, 2500, or 3250 msec. Subsequently, the dots were presented for 1500 msec and the subject had to respond with their left or right index finger. Immediately after the RDM stimulus, feedback was presented for 500 msec. It showed how much the subject earned on that trial. For the remaining 500e5000 msec in each trial, a blank screen was presented.

MRI scanning protocols
Functional neuroimaging of the basal ganglia at UHF is challenging compared to most cortical brain regions (De Hollander, Keuken, & Forstmann, 2015;Forstmann et al., 2017;Keuken, Isaacs, Trampel, van der Zwaag, & Forstmann, 2018). Therefore, we drew upon earlier validated MRI protocols for both structural (Keuken, Bazin, et al., 2014) and functional Mileti c et al., 2020) imaging of the STN. Specifically, in dataset 1, subjects were scanned in a Siemens MAGNETOM 7 T system (Siemens Healthineers, Germany) with a 32 channel phased array head coil (Nova Medical Inc, USA). We used T 2 *-weighted images from a spoiled multi-echo Gradient-Recalled Echo (ME-GRE) protocol (TEs ¼ 11.22,20.39,29.57 msec) with a resolution of .5 mm isotropic that were collected during earlier studies Keuken, Bazin, et al., 2014) for visualization of the STN and its neighboring structures. We also acquired .7 mm isotropic T1-weighted images using a MP2RAGE sequence (Marques et al., 2010) for registration purposes.
For the functional imaging, we used a recently developed BOLD fMRI protocol that balances spatial resolution with subcortical SNR and has a relatively short echo time, accounting for the relatively very short T 2 *-relaxation values in the STN (Aquino et al., 2009;De Hollander et al., 2017;Deistung et al., 2013;Keuken, Bazin, et al., 2014;Keuken et al., 2017). In dataset 1, a T 2 *-weighted 2D-EPI protocol with a spatial resolution of 1.5 mm isotropic, 90 slices, interleaved acquisition, TR ¼ 3000 msec, TE ¼ 14 msec, flip angle ¼ 60 , bandwidth 1446 Hz/Px, echo spacing .8 msec, FOV 192 Â 192 Â 135 mm, phase encoding direction A >> P, partial Fourier 6/8, GRAPPA acceleration factor 3, matrix size 128 Â 128 was used. Due to the increased number of slices compared to the protocol presented in De Hollander et al. (2017;90 vs 60), the acquired volume now covered the entire brain for all subjects. To correct for field inhomogeneities, a corresponding B0 field map with the same FOV was acquired (TR ¼ 1500 msec, TE 1-2 ¼ 6, 7.02 msec).
In dataset 2, participants were scanned in a Philips Achieva 7 T system (Philips, The Netherlands). The anatomical images of these subjects were already collected as part of a database (Alkemade et al., 2020), in which an MP2RAGE-ME protocol (Caan et al., 2019) was used (.7 mm isotropic) that was used to calculate quantified T1, T 2 *, and QSM images. For functional imaging, we used a T 2 *-weighted 2D-EPI protocol with a resolution of 1.5 mm isotropic, TR ¼ 3000 msec, slice thickness 1.5 mm, 80 slices, interleaved acquisition, TE ¼ 14 msec, flip angle ¼ 60 , bandwidth 1461 Hz/Px, echo spacing .85 msec, FOV 192 Â 192 Â 120 mm, phase encoding direction A >> P, HalfScan .76, SENSE acceleration factor 3, matrix size 128 Â 128 (since the gradients have a lower slew rate, to keep TRs consistent, we could acquire less slices than in dataset 1). Again, we collected a B0 field map with the same FOV.
We would like to note that due to the heavy-duty gradient cycles in our fMRI protocols, there is the potential for serious heating of the gradient coils. Indeed, when running pilots for follow-up studies using the same sequence at the Philips site (dataset 2), we noticed that the temperature in the bore notably increased during scanning. The scanner system never reported temperatures outside its safety limits, but we would like to urge researchers to exhaustively pilot new studies with gradient-heavy fMRI protocols such as ours on a phantom, as well as monitor the development of the temperature inside the bore and on the bore wall during a multi-run scan protocol, before commencing data collection on human subjects, and always ask about the temperature in the bore during subject debriefings.

Anatomical labeling
In dataset 1, the left and right STN were manually delineated on the .5 mm isotropic T 2 *-weighted ME-GRE images by two independent raters, following a previously published protocol Keuken, Bazin, et al., 2014). Similar to earlier studies, only voxels that were labeled as part of the STN by both raters were included in further analyses. In dataset 2, the manual delineations were based on QSM images obtained from the MP2RAGE-ME (Caan et al., 2019) sequence. The database included manual delineations by two independent rates. Again, only voxels that were labeled as part in the STN by both raters were included in the analyses.
To increase the limited functional signal-to-noise ratio in the STN without resorting to spatial smoothing, we subdivided the STN in smaller segments a-priori. All individual STN masks were transformed to the space of the ICBM 152 Nonlinear Asymmetrical template version 2009c (MNI2009c; Fonov, Evans, McKinstry, Almli, & Collins, 2009). On the meansubtracted coordinates of the voxels in these masks, we performed a principal components analysis to find the 3D vector (direction) that explained the most variance in the coordinates . We interpreted this vector as the 'main ventromedial-dorsolateral' axis of the STN (De Hollander et al., 2014;Haynes & Haber, 2013;Temel, Blokland, et al., 2005).
Then, for every individual subject mask separately, all voxel coordinates on this axis were given a 'ventromedial score'. This voxel-wise score allowed us to divide individual left and right STN masks into three segments: the 'posterior-dorsolateral' (segment A), the 'central' segment (segment B), and the 'anterior-ventromedial' segment (segment C; see Fig. 1). Note that these segments always covered the entire individual anatomical mask for each subject separately, with an equal volume. The segments were transformed to individual functional (EPI) space by the inverse transformation from individual to standard MNI space, as provided by the fmriprep registration procedure, using a single interpolation step, and used as regions of interest (ROIs) in further analyses. On average, the segments had a volume of 22.5 mm 3 (std 6.9 mm 3 ), corresponding to approximately 7 voxels in functional space (1.5 mm isotropic resolution), and never overlapped with other segments.
In a second set of analyses, we defined ROIs based on the tripartite subdivision atlas provided by (Accolla et al., 2014), which defined a motor, associative, and limbic part of the STN based on diffusion MRI data. This atlas was warped to individual space using the non-linear transformation provided by fmriprep. Contrary to the PCA-based subdivisions, the atlas-based subdivisions were not of equal volume: The motor segment is 38 mm 3 (std 3.8 mm 3 ), the associative segment 36.3 mm 3 (std 4.9 mm 3 ), and the limbic segment 13 mm 3 (std 2.6 mm 3 ), corresponding to approximately 12, 11, and 4 voxels, respectively, in functional space. In the main text below, we report the ROI analyses and results based on the PCA subdivisions, and we refer to the Supplementary Materials for the results of the same analyses based on the atlas subdivisions. Both sets of masks lead to the same conclusions.

Functional data preprocessing
For each of the 3 BOLD runs found per subject (across all tasks and sessions), the following preprocessing was performed. First, a reference volume and its skull-stripped version were generated using a custom methodology of fMRIPrep. In dataset 1, a B0-nonuniformity map (or fieldmap) was estimated based on a phase-difference map calculated with a dual-echo GRE (gradient-recall echo) sequence, processed with a custom workflow of SDCFlows inspired by the epidewarp.fsl script and further improvements in HCP Pipelines (Glasser et al., 2013). In dataset 2, a B0-nonuniformity map was directly measured with an MRI scheme designed with that purpose (typically, a spiral pulse sequence). The fieldmap was then coregistered to the target EPI (echo-planar imaging) reference run and converted to a displacements field map (amenable to registration tools such as ANTs) with FSL's fugue and other SDCflows tools. For one subject of dataset 2, no B0nonuniformity map was available. For this subject, a deformation field to correct for susceptibility distortions was estimated based on fMRIPrep's fieldmap-less approach. The deformation field is that resulting from co-registering the BOLD reference to the same-subject T1w-reference with its intensity inverted (Huntenburg, 2014;Wang et al., 2017). Registration is performed with antsRegistration (ANTs 2.2.0), and the process regularized by constraining deformation to be nonzero only along the phase-encoding direction, and modulated with an average fieldmap template (Treiber et al., 2016). Based on the estimated susceptibility distortion, a corrected EPI (echo-planar imaging) reference was calculated for a more accurate co-registration with the anatomical reference. The BOLD reference was then co-registered to the T1w reference using bbregister (FreeSurfer) which implements boundary-based registration (Greve & Fischl, 2009). Coregistration was configured with six degrees of freedom. Head-motion parameters with respect to the BOLD reference (transformation matrices, and six corresponding rotation and translation parameters) are estimated before any spatiotemporal filtering using mcflirt (FSL 5.0.9;Jenkinson, Bannister, Brady, & Smith, 2002). BOLD runs were slice-time corrected using 3dTshift from AFNI 20160207 (Cox & Hyde, 1997; RRID:SCR_005927). The BOLD time-series (including slicetiming correction when applied) were resampled onto their original, native space by applying a single, composite transform to correct for head-motion and susceptibility distortions. These resampled BOLD time-series will be referred to as preprocessed BOLD in original space, or just preprocessed BOLD. The BOLD time-series were resampled into standard space, generating a preprocessed BOLD run in MNI152NLin2009cAsym space. First, a reference volume and its skull-stripped version were generated using a custom methodology of fMRIPrep. Several confounding time-series were calculated based on the preprocessed BOLD: framewise displacement (FD), DVARS and three region-wise global signals. FD and DVARS are calculated for each functional run, both using their implementations in Nipype (following the definitions by Power et al., 2014). The three global signals are extracted within the CSF, the WM, and the whole-brain masks. Additionally, a set of physiological regressors were extracted to allow for component-based noise correction (CompCor; Behzadi, Restom, Liau, & Liu, 2007). Principal components are estimated after high-pass filtering the preprocessed BOLD time-series (using a discrete cosine filter with 128s cut-off) for the two CompCor variants: temporal (tCompCor) and anatomical (aCompCor). tCompCor components are then calculated from the top 5% variable voxels within a mask covering the subcortical regions. This subcortical mask is obtained by heavily eroding the brain mask, which ensures it does not include cortical GM regions. For aCompCor, components are calculated within the intersection of the aforementioned mask and the union of CSF and WM masks calculated in T1w space, after their projection to the native space of each functional run (using the inverse BOLDto-T1w transformation). Components are also calculated separately within the WM and CSF masks. For each CompCor decomposition, the k components with the largest singular values are retained, such that the retained components' time series are sufficient to explain 50 percent of variance across c o r t e x 1 5 5 ( 2 0 2 2 ) 1 6 2 e1 8 8 the nuisance mask (CSF, WM, combined, or temporal). The remaining components are dropped from consideration. The head-motion estimates calculated in the correction step were also placed within the corresponding confounds file. The confound time series derived from head motion estimates and global signals were expanded with the inclusion of temporal derivatives and quadratic terms for each (T. D. Satterthwaite et al., 2013). Frames that exceeded a threshold of .5 mm FD or 1.5 standardised DVARS were annotated as motion outliers. All resamplings can be performed with a single interpolation step by composing all the pertinent transformations (i.e., head-motion transform matrices, susceptibility distortion correction when available, and co-registrations to anatomical and output spaces). Gridded (volumetric) resamplings were performed using antsApplyTransforms (ANTs), configured with Lanczos interpolation to minimize the smoothing effects of other kernels (Lanczos, 1964). Non-gridded (surface) resamplings were performed using mri_vol2surf (FreeSurfer). Many internal operations of fMRIPrep use Nilearn .6.1 (Abraham et al., 2014; RRID:SCR_001362), mostly within the functional processing workflow. For more details of the pipeline, see the section corresponding to workflows in fMRI-Prep's documentation (https://fmriprep.readthedocs.io/en/ latest/workflows.html).

Data analysis
Two subjects performed only 256, instead of 384 trials, due to fatigue (the entire experiment took approximately an hour). After careful inspection of the data set, these subjects were included in the analyses to improve statistical power.

Behavioral analyses
To test whether the experimental manipulations were successful, behavioral data were analyzed using frequentist mixed effects models (MEMs; Barr, Levy, Scheepers, & Tily, 2013;Gelman & Hill, 2007). Two MEMs were fit: One logistic model, to test the effects of the two manipulations on choice accuracy, and one linear model, to test the effects on choice response times (RTs). The logistic MEM included two main task manipulations: potential payoff cue congruency (congruent, neutral, incongruent), and difficulty (easy, hard), as well as their interaction. The linear MEM included the aforementioned task manipulations as well as the response accuracy, plus all interactions. For both models, subjects were included as random intercepts. We report estimates of the MEMs coefficients, their standard errors (SE), 95% confidence intervals (CIs), and a p-value obtained from a t-test using Satterthwaite's method to estimate the degrees of freedom (F. E. Satterthwaite, 1941). These analyses made use of software packages lme4 (Bates, M€ achler, Bolker, & Walker, 2015) and lmerTest (Kuznetsova, Brockhoff, & Christensen, 2017) for the R programming language (R Core Team, 2017). Next, we used the DDM to estimate parameters quantifying key properties of the latent cognitive processes underlying decision making. The DDM assumes that participants gradually accumulate noisy evidence for each choice option, until a threshold level of evidence is reached. At that point, the participant commits to the decision and executes the motor response. The process is characterized by four main parameters: The starting point of evidence accumulation z, which lies between thresholds -a and a which quantifies response caution, the mean speed of evidence accumulation v (the drift rate), and the non-decision time t0 that quantifies the time required for perceptual encoding processes and motor execution. The 'full' DDM assumes additional parameters for between-trial variabilities in drift rate s v , starting point s z and non-decision time s t0 , which we did not estimate but fixed to 0. The between-trial variabilities in starting point and drift rate have been shown to be difficult to accurately recover (Boehm, Annis, et al., 2018). However, in an exploratory analysis, we also fit the winning DDM (reported below) including s v and s z , and found that the estimates of v, a, z, and t0 were highly correlated (all r > .99) with the same estimated obtained from a DDM with s v and s z constrained to 0. Hence, not estimating s v and s z did not bias our parameters (for further discussion on estimating the between-trial variability parameters, see Boehm, Annis, et al., 2018). Within-trial noise parameter s was fixed to 1 to satisfy scaling constraints (Donkin, Brown, & Heathcote, 2009;van Maanen & Mileti c, 2021).
To model starting point bias, we estimated the neutral starting point plus a starting point shift in the direction of the potential payoff cue, such that z left cue ¼ z neutral þ z cueÀshift and z right cue ¼ z neutral À z cueÀshift (in our parametrization, the DDM's upper threshold corresponds to the leftward response). Similarly, to model drift rate bias, we assumed that where the sign of v is negative if the dots move towards the right (and the size of v depends on the difficulty condition). In total, the three DDMs had six ðv easy ; v hard ; z neutral ; z cueÀshift ; a; t0Þ, six ðv easy ; v hard ; v cueÀshift ;z;a;t0Þ, and seven ðv easy ; v hard ; v cueÀshift ; z neutral ; z cueÀshift ; a; t0Þ free parameters, respectively.
Since we intend to use the median posterior parameter estimates in correlation analyses, we did not estimate parameters hierarchically, since this would create a dependency between subjectelevel parameters and violate the assumption of independence of observations in correlation tests (Boehm, Marsman, Matzke, & Wagenmakers, 2018). Subjectlevel posterior distributions of each model's parameters were estimated using a combination of differential evolution Cross-over probability (as part of the differential evolution) was set to 2:38= ffiffiffi ffi D p . Migration probability was set to .05 (only during burn-in). Convergence was assessed using visual inspection of the chain traces and Gelman-Rubin diagnostic (Brooks & Gelman, 1998;Gelman & Rubin, 1992) (individual and multivariate potential scale factors <1.02 in all cases). Datapoints with responses faster than 150 msec were considered as guesses and removed prior to fitting (Ratcliff & Childers, 2015;Ratcliff & Tuerlinckx, 2002).
The three DDMs were compared using the Bayesian Predictive Information Criterion (BPIC; Ando, 2007). Lower BPIC values indicate a better trade-off between quality of fit and model complexity. To visualize the quality of model fit of the winning model, we took 100 random samples from the posterior parameter distributions, with which we simulated the DDM (using the same trial numbers as in the experiment). We then calculated the 10th, 30th, 50th, 70th and 90th simulated RT quantile per condition for both the correct and error responses, and compared their (across-simulation) mean values with the empirical data (Fig. 1E).

Whole-brain GLMs
We then fit voxel-wise whole-brain GLMs to the fMRI data, in order to test whether the whole-brain BOLD contrasts are in line with the literature. In the first-level GLMs, we included eight task regressors: Leftward, rightward, and neutral potential payoff cues (on cue onset); easy and hard stimuli (on RDM stimulus onset); left and right responses (on response time); and errors (on feedback onset). Errors were explicitly modeled to decorrelate the effects of task difficulty from the effects of error (feedback) processing. It is well-known that limbic structures show heightened activity after an error has been made (e.g., W. H. Alexander & Brown, 2011;T. A. Klein, Ullsperger, & Danielmeier, 2013;Ullsperger, Harsay, Wessel, & Ridderinkhof, 2010). Therefore, 'drift rate'-related activity in limbic areas such as the insula might be a result of a larger number of errors in trials with a lower drift rate, unless the BOLD response related to errors is modeled. All eight task regressors were convolved in the canonical double-gamma hemodynamic response function (HRF; Glover, 1999), and their temporal derivatives were included in the design matrix. Additionally, we included a set of 16 cosines to model lowfrequency drifts, and seven head-motion parameters (translation and rotation in 3 dimensions, and framewise displacement), DVARS (Power et al., 2014), and the first five aCompCor components (Behzadi et al., 2007) to model physiological noise. GLMs were fit using FSL FEAT (Woolrich, Ripley, Brady, & Smith, 2001). A smoothing kernel (FWHM ¼ 5 mm, c.f. De Hollander et al., 2017;Mileti c et al., 2020) was used to spatially smooth the data prior to fitting (Smith & Brady, 1997). Note that functional data were only smoothed prior to the whole-brain GLMs; the ROI-based analyses (both deconvolutions and GLMs) were based on the unsmoothed functional data.
In a second-level GLM, the three runs per subject were combined using a fixed effects analysis. In a third-level GLM, FSL's FLAME1 (Woolrich, Behrens, Beckmann, Jenkinson, & Smith, 2004) was used. The design matrix included a groupwise intercept, a dummy variable to model potential dataset differences, and three model-based subject-wise covariates: (1) the drift rate difference (v hard À v easy ); (2) the starting point shift (z cueÀshift ); (3) the drift rate shift (v cueÀshift ). These covariates were demeaned.
Three main contrasts of interest were defined: (1) Response left e response right, to test for motor lateralization effects; (2) Payoff e neutral cues, to test for group-level main effects (i.e., testing for regions showing differential BOLD-responses after payoff compared to neutral cues) as well as covariances with potential payoff cue-related starting point or drift rate shifts (i.e., testing for regions in which the BOLD contrast size covaries with z cueÀshift or v cueÀshift ); (3) Hard e easy stimuli, to test for group-level main effects and across-subject covariances with drift rate shifts (v hard À v easy Þ. All third-level statistical parametric maps (SPMs; Fig. 3) were thresholded using FSL's cluster by first voxel-wise thresholding at z ¼ 3.1 and then at P < .05 at the cluster level (Worsley, 2001).

BOLD response deconvolutions
Next, we tested whether we had sufficient BOLD sensitivity, by estimating the shape and size of BOLD responses in the STN segments using deconvolution analyses (Dale, 1999;Gitelman, Penny, Ashburner, & Friston, 2003;Glover, 1999;Poldrack, Mumford, & Nichols, 2011). We first extracted the unsmoothed mean timeseries within each of the six (three per hemisphere) STN segments. Each timeseries was rescaled to percent signal change by dividing the timeseries by the mean signal value, multiplying by 100, and subtracting 100. The timeseries from the three runs per subject were then concatenated. Then, we performed a deconvolution using Fourier basis sets (Bullmore et al., 1996;Gitelman et al., 2003) with 7 regressors (an intercept, three sines, and three cosines) for two events of interest: The moving dots stimulus and potential payoff cue. Since the purpose of this deconvolution analysis is not to identify contrasts between the task manipulations, but instead to characterise the shape of the HRF, we ignored RDM stimulus difficulty and potential payoff cue types. Additionally, the model contained the same set of nuisance variables as used in the whole-brain GLM above: Five aCompCor components, 16 cosines to model low-frequency scanner drift, the DVARS, and seven head motion related parameters (rotation and translation in three dimensions, and the framewise displacement). The model was fit using ordinary least squares. Deconvolution analyses were performed using nideconv (De Hollander, Knapen, & Snoek, 2019).

ROI-wise GLMs
Subsequently, we defined four GLMs, each fit to the same six STN mean timeseries separately, to test how strongly the six segments responded to the task events. Importantly, and contrary to the whole-brain GLMs above, all STN timeseries were extracted from the unsmoothed functional timeseries data to prevent the mixture of signals originating from the different segments (De Hollander et al., 2015). In all four GLMs (detailed below), task events were convolved with the canonical double-gamma HRF (Glover, 1999), and their temporal derivatives were included. Additionally, the design matrix included temporal derivatives of the main regressors of interest, and (as in the previous analyses) the first five aComp-Cor components to model physiological noise, seven head motion related parameters, DVARs, and a set of 16 cosines to model low-frequency scanner drifts. The GLMs were fit using ordinary least squares. The estimated BOLD responses, quantified in the GLM's parameter estimates (betas), were then used as a dependent variable in higher-level analyses to test for functional specialization of STN segments, as detailed below.
2.7.4.1. GLM1: DIFFICULTY AND POTENTIAL PAYOFF CUE VALENCE EF-FECTS. The first GLM included (1) easy and (2) hard stimuli (RDM stimulus onset locked), and (3) payoff and (4) neutral cues (potential payoff cue onset locked) as task events of interest. Firstly, we tested for the presence of the following two contrasts of interest using a within-subject (frequentist and Bayesian) t-test for each segment separately: task difficulty (hard e easy), and potential payoff cue type (payoff e neutral). We hypothesized that 'cognitive' segment B should respond differently to hard compared to easy trials, and ventromedial 'limbic' segment C should show a larger BOLD response after payoff than after neutral cues.
Although these tests indicate whether contrasts were present in each STN segment, the tripartite model of functional specialization holds that the STN segments differ in their responses. Suppose that STN segment B indeed has higher BOLD responses for hard than for easy trials, then this would only support functional specialization of segments if the other segments do not (or less strongly) show that contrast (Nieuwenhuis, Forstmann, & Wagenmakers, 2011). Hence, we then tested for interaction effects between the experimental conditions and the segments using linear MEMs, with task event type, STN segment (A, B or C), and their interaction to predict the BOLD response size: To avoid confusion with the GLM's beta that quantify the BOLD response size, we refer to the MEM parameter estimates with b. We fit MEM (1) twice: Once using easy and hard stimuli as events, and once using payoff and neutral cues as predictors. Subjects were included as random intercepts.
Significant contributions of the segment Â event interaction would support functional specialization of segments. We used both frequentist and Bayesian approaches to quantify whether this interaction was supported by the data. Firstly, we estimated the p-value associated with the interaction term segment Â event. Since the predictor segment consists of three levels, we report F-tests using Satterthwaite's method (F. E. Satterthwaite, 1941) to approximate the denominator degrees of freedom. Secondly, we used a Bayesian model comparison approach, in which we sequentially removed individual terms from MEM (1) until the simplest (intercept-only) model remained (i.e., comparing all seven possible models). These models were compared using Bayes factors, which quantify the likelihood ratios of the models (Kass & Raftery, 1995  and against a hypothesis. To describe the strength of evidence (ranging from 'anecdotal' to 'decisive'), we use the conventions proposed by Wetzels and Wagenmakers (2012), who adapted Jeffrey's (1961) original proposal. Bayesian analyses were performed using the BayesFactor (Morey et al., 2018) software for the R programming language (R Core Team, 2017). 2.7.4.2. GLM2: POTENTIAL PAYOFF CUE LATERALIZATION EFFECTS. Potential payoff cue-related BOLD effects might be lateralized and represent some a-priori motor facilitation or suppression towards a left or right response. Such a potential payoff cuerelated lateralized signal has been shown before in the putamen (Forstmann, Brown, et al., 2010) and could potentially cancel out potential payoff cue-related effects when cue direction is ignored. We therefore also fit GLM2 which took into account the direction of the potential payoff cue. It contained the following task regressors: (1) neutral cue, (2) payoff cue (left), (3) payoff cue (right) (all cue onset locked), (4) easy RDM stimulus, and (5) hard RDM stimulus (both RDM stimulus onset locked). The contrast of interest was payoff cue (left) > payoff cue (right).
We tested whether the BOLD responses are different after contralateral potential payoff cues compared to ipsilateral cues, and whether there is any difference therein between STN segments. We constructed a new MEM (1) but using ipsilateral and contralateral cues (relative to the STN hemisphere) as event types. We used the same frequentist and Bayesian approach as described in 2.7.4.1 to test whether the MEM's interaction term, which would indicate functional specialization of STN segments, is supported by the data. 2.7.4.3. GLM3: DIFFICULTY EFFECTS, DE-CONFOUNDED FROM ERROR-RELATED PROCESSING. As mentioned above, errors are known to cause activity in limbic areas, which could confound drift raterelated activity. To test for activity related to drift rates that cannot be explained by a difference in the error rate, we fit GLM3 that de-confounded the task difficulty from error trials. Here, easy and hard event types were separated by the accuracy of the response on that trial. GLM3 included the following regressors: (1) neutral cue, (2) payoff cue, (3) easy RDM stimulus (correct), (4) easy RDM stimulus (error), (5) hard RDM stimulus (correct), and (6) hard RDM stimulus (error). The two potential payoff cue regressors were locked to the potential payoff cue onset, the four RDM stimulus regressors were locked to the RDM stimulus onset. The main contrast of interest for this GLM was 'hard RDM stimulus (correct) > easy RDM stimulus (correct)'. To test for the presence of functional subregions, we again constructed a MEM as described Equation (1), using the task events 'hard RDM stimulus (correct)' and 'easy RDM stimulus (correct)'. Again, the frequentist and Bayesian methods described in 2.7.4.1 were used to test for functional specialization of STN segments.

GLM 4: MOTOR LATERALIZATION EFFECTS.
To test for lateralized motor activity in the STN, we also fit GLM4, focusing on responses. It included the following regressors: (1) neutral cue, (2) potential payoff cue (both potential payoff cue onset locked), (3) easy trials, (4) hard trials (both RDM stimulus onset locked), (5) left responses, and (6) right responses (both response onset locked). We hypothesized that potential STN 'motor' segments should show lateralized motor activity. Specifically, we expected to find increased activity for left versus right responses for right STN segment A versus the left STN segment A, and vice versa. We repeated the LMEM analysis described in 2.7.4.1 but using left response and right response as events of interest. The same frequentist and Bayesian methods described previously were used to test for functional specialization of segments.

Interindividual correlations
We then tested whether interindividual variability in DDM parameters was related to interindividual variability in the corresponding BOLD effects (Forstmann et al., 2008Lebreton, Bavard, Daunizeau, & Palminteri, 2019). The first hypothesis pertained to potential payoff cue-related bias and STN activity. The shift in starting point of every subject was correlated with the contrast 'payoff cue > neutral cue' (ROIwise GLM1). We expected STN 'limbic' segment C involved in implementing response biases (the putative 'limbic' area) to show such an across-subject correlation between the size of the starting point shift and the size of the BOLD response contrast. We repeated this analysis with the potential payoff cue-related drift rate bias as covariate, with the same hypothesis. Correlations were tested using frequentist and Bayesian tests (Wetzels & Wagenmakers, 2012) as implemented in the BayesFactor software package (Morey et al., 2018).
The second hypothesis pertained to the difficulty manipulation and the resulting drift rate reduction. Specifically, the reduction in drift rate for hard compared to easy trials was correlated, across subjects, with the size of the contrast 'hard RDM stimulus > easy RDM stimulus' (ROI-wise GLM1). We expected that activity in the central, 'cognitive' STN segment B, should show a significant correlation with the drift rate effect of the difficulty manipulation. We test this hypothesis using the same frequentist and Bayesian correlation tests.

Results
We first describe three sets of control analyses. Firstly, we test whether the experimental manipulations led to the expected behavioral effects. Secondly, we perform whole-brain voxelwise GLMs to ensure that the overall patterns of BOLD responses elicited by the experimental manipulations are in line with the literature. Thirdly, we test whether we have sufficient signal quality in the STN segments to be able to detect BOLD responses. After these control analyses, we turn to the confirmatory analysis to test for functional specialization of STN segments.
Hence, the effects of the manipulations can be summarized as follows. Hard trials led to less accurate and slower responses, and the effect on RTs was larger for correct compared to error RTs. Furthermore, congruent potential payoff cues increased accuracy, and if an error was made, the corresponding RT was high (compared to neutral cues). Incongruent trials decreased accuracy, and the errors were fast (compared to neutral cues).
The effects of both manipulations are qualitatively consistent with the DDM's predictions. Choice difficulty is assumed to affect drift rates (e.g., Palmer et al., 2005;Ratcliff & McKoon, 2008;Voss et al., 2004). Increasing the drift rate leads to faster, more accurate responses, which we indeed find in the difficulty manipulation. Potential payoff cues are often thought to cause a bias in the starting point z (Diederich & Busemeyer, 2006;Edwards, 1965;Link & Heath, 1975;Mulder et al., 2012;Ratcliff, 1985;Ratcliff & Rouder, 1998;Voss et al., 2004); and/or drift rate (Ashby, 1983;Diederich & Busemeyer, 2006;Mulder et al., 2012;Ratcliff, 1981;Urai et al., 2019). Both biases lead to overall faster responses and more responses for the biased option than for the unbiased option. However, the two types of bias differ in their predictions about the shapes of the RT distributions, as well as about the relative speed of correct and incorrect choices: A start point bias predicts that choices corresponding to the unbiased option are slower compared to choices corresponding to the biased option, whereas a drift rate bias predicts equally fast biased and unbiased choices. Hence, formal model fitting was used to discover which type of bias was likely present in the participants' data. Three DDMs were fit, all assuming that the difficulty manipulation affected drift rates, but assumed that the potential payoff cues biased starting points (DDM1), drift rates (DDM2; i.e., both the potential payoff cue and difficulty manipulation affect drift rates), or both (DDM3). Formal model comparison (Fig. 1C) indicated that, when summing BPICs across subjects, the third model provided the most parsimonious trade-off between quality of fit and model complexity. However, a comparison of individual participants' BPICs (Fig. 1C, colors) shows that DDM1 was most parsimonious for 17 participants, DDM2 for 7, and DDM3 for the remaining 9. Thus, there is substantial interindividual variability in the strategies that participants used: A starting point shift was most common, followed by both a combination of starting point and drift rate shifts, but a drift rate shift alone was least common. Most participants (26) used one strategy only, but across subjects, DDM3, allowing for both strategies, provided the most parsimonious fit.
To take into account the observed interindividual strategic differences in subsequent analyses, we used Bayesian model averaging (Hoeting, Madigan, Raftery, & Volinsky, 1999) to calculate subject-wise average (across DDMs) parameters. In this approach, a weighted average of each DDM parameter is calculated per subject, with wBPIC-values (Wagenmakers & Farrell, 2004) functioning as weights. Hence, better-fitting DDMs contribute more to the weighted parameter estimate. The averaged drift rate effect of difficulty (Dv difficulty ¼ v hard À v easy ) and the averaged potential payoff cue-related drift rate ðDv cue ) and starting point shift ðDz cue Þ parameters are shown in Fig. 1D.
The posterior predictive distributions in Fig. 1E show the quality of fit of DDM3. Whereas DDM3 accounts for the median RTs in most conditions reasonably well, there is a strong misfit in the predicted RT distribution skewness: The DDM predicts much stronger right-skewed distributions than seen in the data, and also misfits the leading edge. Possible causes of this are the (implicit) deadlines of 1.5 sec and presence of a delay between the choice and feedback, which both have been shown to induce urgency effects that decrease skewness in RT distributions (Evans & Hawkins, 2019;Hawkins, Forstmann, Wagenmakers, Ratcliff, & Brown, 2015;Katsimpokis, Hawkins, & Van Maanen, 2020;Mileti c & Van Maanen, 2019). Another indication that an urgency signal influenced decision making are the error RTs, which are slower than the correct RTs. Since urgency entails that the amount of evidence required to inform a decision decreases with decision time, slow decisions are based on less evidence and therefore more likely to be erroneous than faster decisions (Ditterich, 2006;Hawkins et al., 2015). In summary, the task manipulations led to the expected effects on behavior. While the quality of fit of the DDM could likely be improved by using a model that incorporates urgency, the best-fitting DDM can explain the effects of the manipulations in terms of drift rate and starting point effects.

Whole-brain GLMs
As a second set of control analyses, we fit whole-brain GLMs to test for the effects of the response hand, errors, potential payoff cue, and difficulty. Statistical parametric maps (SPMs) of the main contrasts are shown in Fig. 3; a summary of all cluster locations and sizes is presented in Supplementary  Table S1. Replicating established findings (Dassonville et al., 1997;S.-G. Kim, Ashe, Georgopoulos, et al., 1993;S.-G. Kim, Ashe, Hendrich, et al., 1993), the response hand contrast shows a strong contralateral BOLD response in primary motor c o r t e x 1 5 5 ( 2 0 2 2 ) 1 6 2 e1 8 8 (M1) and sensory cortex (S1), as well as in posterior putamen and thalamus. The expected ipsilateral BOLD response is present in the superior cerebellum (Lotze et al., 1999). An error-related BOLD response was present in insula, anterior cingulate cortex (ACC), and a smaller diffuse cluster in the midbrain. The insula (Eckert et al., 2009;T. A. Klein et al., 2013;Ullsperger et al., 2010) and ACC (Botvinick, Braver, Barch, Carter, & Cohen, 2001;Duncan & Owen, 2000;Paus, 2001;Paus, Koski, Caramanos, & Westbury, 1998;Spunt, Lieberman, Cohen, & Eisenberger, 2012) have frequently been associated with processing of errors and conflicting task demands. A negative BOLD response to errors was present in putamen (which could indicate a negative prediction error (McClure, Berns, & Montague, 2003), although these are more commonly associated with ventral striatum), posterior cingulate cortex (speculatively suggesting a lapse of attention on error trials; Leech & Sharp, 2014), and a diffuse cluster in left occipital cortex. The latter cluster might be the result of attention-lapse related activity modulations of the visual system (Reynolds & Heeger, 2009) and/or involuntary saccades that impaired task performance (Sylvester, Haynes, & Rees, 2005).
Payoff cues caused an increased BOLD response compared to neutral cues in the superior parietal cortex, left premotor cortex, and pre-SMA. Parietal cortex has often been implicated in evidence accumulation during perceptual decision making (Gold & Shadlen, 2001;J. N. Kim & Shadlen, 1999;O'Connell, Dockree, & Kelly, 2012;Shadlen & Newsome, 2001), and the increased BOLD response could reflect an increased starting point or drift rate. However, the model-based analysis showed no evidence that the BOLD response contrast size covaried with either the size of the shifts in starting point or drift rate. The increased pre-SMA and premotor BOLD responses likely reflect an increased motor readiness in response to a payoff cue (Cunnington, Bradshaw, & Iansek, 1996;Cunnington, Windischberger, Deecke, & Moser, 2002;Cunnington, Windischberger, & Moser, 2005;Pedersen et al., 1998). Contrary to expectations, we did not find potential payoff cuerelated BOLD responses in limbic areas such as the orbitofrontal cortex (c.f. Forstmann, Brown, et al., 2010;Keuken, Mü ller-Axt, et al., 2014;Summerfield & Koechlin, 2010).
The difficulty manipulation caused an increased BOLD response in hard trials compared to easy trials in anterior insula (over and above the effect of errors), as commonly found using difficulty manipulations with a variety of decision-making tasks (for a meta-analysis, see Keuken, Mü ller-Axt, et al., 2014). Furthermore, the model-based analysis shows that in right insula, the size of the BOLD response difference (hard e easy) covaries with the drift rate difference ðv hard À v easy Þ, indicating that participants with a larger difference in drift rates between conditions, also show a larger difference in right insular BOLD responses. The right insula specifically has earlier been implicated in drift rate effects in perceptual decision making (Ho et al., 2009). No evidence was found for difficulty-related BOLD responses in the dorsolateral prefrontal cortex (Heekeren et al., 2004).
In sum, the whole-brain results partly led to the expected effects of the task manipulations, consistent with the literature. Unfortunately, no difficulty-related BOLD responses in the dorsolateral prefrontal cortex were found, and the model-based analysis of the effect of the potential payoff cue showed no regions in which BOLD response size covaried with the behavioral effect size. Note that the payoff cue manipulation is known to be relatively weak compared to cue-based probability manipulations (Forstmann, Brown, et al., 2010;Mulder et al., 2012;Voss et al., 2004). Here, we preferred the payoff manipulation to activate limbic processing. Furthermore, the two different strategies of incorporating the payoff information in decision-making behavior (via a starting point or drift rate bias) may rely on partially different brain networks, which could weaken statistical power to find a covariances with DDM parameters. In the supplementary materials, we report SPMs of the limbic manipulation with a liberal, exploratory threshold.

Deconvolution analyses
As a third control analysis, we aimed to test whether there was sufficient BOLD sensitivity in the STN, and whether the shape of the BOLD response was in line with the canonical, double-gamma HRF typically used in GLM analyses. To do so, we applied a deconvolution analysis to the mean timeseries of each STN segment. The deconvolution model used Fourier basis sets with 7 regressors per event type (payoff cue and RDM stimulus), which allows for a high degree of flexibility in terms of the potential shapes of the BOLD response that can be captured, at the cost of increased estimation variance. The fit deconvolution model was visualized by plotting the percentage signal change as a function of time after onset of the event, as shown in Fig. 4. All STN segments show a BOLD response to the RDM stimulus, which reach a peak at approximately .2 percent signal change. The relatively low size of the responses (in units of percent signal change) can partly be explained by the use of a very short echo time of 14 msec, as percent signal change increases linearly with echo time (e.g., Kundu et al., 2017). Furthermore, the shape of the BOLD responses was in line with the canonical double-gamma BOLD response, although we do note between-region variability in the shape. Specifically, in left segments A and B, and in right segment C, the size of the post-stimulus undershoot was larger than commonly assumed in the canonical HRF Mileti c et al., 2020). Between-region variability in the shape of the HRF has been found before (Boillat & Van der Zwaag, 2019;Gonzalez-Castillo et al., 2012;Handwerker, Ollinger, & D'Esposito, 2004;Miezin, Maccotta, Ollinger, Petersen, & Buckner, 2000). Furthermore, it appears that the peak of the BOLD response occurs roughly 4 sec after RDM stimulus onset, thus earlier than the canonical 6 sec. However, some caution must be taken in interpreting this difference, since the TR of 3 sec is relatively high, which may limit our ability to determine the exact time at which the response peaks. Contrary to the RDM stimulus, the potential payoff cue seems to evoke no typical BOLD response in most segments (in line with Keuken et al., 2015), except for a potential trend towards a BOLD response in left STN ventromedial segment A. The flexibility of the deconvolution models comes at the cost of an increased estimation variance, and therefore large confidence intervals and decreased sensitivity. Because the shapes of the deconvolved BOLD responses approximate the c o r t e x 1 5 5 ( 2 0 2 2 ) 1 6 2 e1 8 8 canonical HRF, we subsequently used the canonical HRF to model the BOLD responses in the ROI analyses detailed below. This led to much smaller confidence intervals, and thereby larger sensitivity, than the deconvolution analyses (see also Figure S3).

ROI-wise GLMs
Having confirmed that fMRI signal quality was sufficient to detect BOLD responses in the individual segments of the STN, we fit four GLMs to test our main hypotheses. Fig. 5 summarizes all four GLMs' BOLD response estimates. These BOLD response estimates were used as dependent variables. In Table 1, we present per-segment, within-subject t-tests for all contrasts of interest. Using linear MEMs, we then tested for differences between conditions and differences between segments. Below, we report the results using the PCA-based partitioning of the STN; in the Supplementary Materials, we report the results using the partitioning provided by Accolla et al. (2014). First, we used GLM1 to test whether potential payoff cue valence affected BOLD responses, and whether there were between-segment differences in these BOLD responses. A linear MEM showed no evidence for an effect of potential payoff cue type (payoff vs neutral; F(1, 160) ¼ 2.766, p ¼ .098), segment (A, B or C; F(2, 160) ¼ .434, p ¼ .648), nor, critically, of their interaction (F(2, 160) ¼ .073, p ¼ .929) on the BOLD response size. Bayesian model comparison furthermore preferred an intercept-only model over all seven possible alternative models that included potential payoff cue type, Fig. 4 e Deconvolution of BOLD-responses to the potential payoff cue (blue) and RDM stimulus (orange) using Fourier basis sets. Shaded areas correspond to 95% confidence intervals. For the deconvolution results using the STN segments based on the atlas provided by (Accolla et al., 2014), see Figure S4.  Figure S5.
STN segment, and/or their interaction predictors. Specifically, there was substantial evidence against a model with potential payoff cue type as predictor (BF 01 ¼ 4.368), strong evidence against a model with STN segment as predictor (BF 01 ¼ 17.151), strong evidence against a model with their interaction as predictor (BF 01 ¼ 10.637), and even stronger evidence against more complex MEMs (all BF 01 > 46). Hence, the BOLD responses to payoff and neutral cues were the same in all STN segments.
Second, we used GLM1 to test whether RDM stimulus difficulty affected STN BOLD responses, using the same analysis but including easy and hard stimuli as event types. A linear mixed effects model also indicated no evidence for an effect of difficulty (F(1, 160) ¼ .281, p ¼ .597), nor of STN segment (F(2, 160) ¼ .068, p ¼ .934), or their interaction (F(2, 160) ¼ .147, p ¼ .86) on BOLD response size. Bayesian model comparison again favored an intercept-only model. There was substantial evidence against an influence of difficulty (BF 01 ¼ 6.126), strong evidence against a difference between segments (BF 01 ¼ 18.769), and against their interaction (BF 01 ¼ 10.313). There was even stronger evidence against the more complex models including multiple predictors (all BF 01 > 65). Thus, the BOLD responses to hard and easy stimuli were the same in all STN segments.
Third, we used GLM2 to test for an effect of the potential payoff cue direction on BOLD response sizes. Here, a linear MEM predicting the BOLD response suggested no effect of potential payoff cue laterality (ipsilateral vs contralateral relative to the STN hemisphere; F(1, 160) ¼ .172, p ¼ .679), no effect of STN segment (F(2, 160) ¼ .774, p ¼ .463), nor an interaction (F(2, 160) ¼ .617, p ¼ .541). Bayesian model comparison again preferred an intercept-only model, with substantial evidence against a model including potential payoff cue laterality as predictor (BF 01 ¼ 6.325), strong evidence against a model including STN segment (BF 01 ¼ 15.94), substantial evidence against a model with an interaction between subregion and laterality (BF 01 ¼ 9.431), and stronger evidence against all more complex models (all BF 01 > 59). Thus, we also found that all STN segments responded equally to contralateral and ipsilateral potential payoff cues.
Fourth, we used GLM3 to test whether difficulty affects the STN when only the trials are considered in which the participants gave correct responses, to de-confound the effect of error-related processing. Hence, the MEM with difficulty (hard, correct vs easy, correct), STN segments, and their interaction as predictors, showed no effect of difficulty (F(1, 160) ¼ .876, p ¼ .351), of STN segment (F(2, 160) ¼ 1.732, p ¼ .180), nor their interaction (F(2, 160) ¼ .241, p ¼ .786). Bayesian model comparison favored a null model again, with substantial evidence against an effect of difficulty (BF 01 ¼ 5.363), substantial evidence against an effect of STN segment (BF 01 ¼ 9.603), substantial evidence against an interaction effect (BF 01 ¼ 9.92), and strong to decisive evidence against all more complex models including multiple predictors (all BF 01 > 53). Thus, also when considering only the trials with correct responses, all STN segments responded equally to easy and hard trials.
Fifth, we tested for motor lateralization effects using GLM4. Here, we constructed a linear mixed effects model predicting the BOLD response size using the response direction (contralateral vs ipsilateral) and subregion. Again, we found no main effects of response direction (F(1, 358) ¼ .049, p ¼ .835), no main effect of subregion (F(2, 358) ¼ 2.546, p ¼ .08), nor an interaction (F(2, 358) ¼ .042, p ¼ .959). Bayesian model comparisons favored a null model again, with substantial evidence against a model including a main effect of response direction (BF 01 ¼ 8.887), strong evidence against a model including a main effect of subregion (BF 01 ¼ 11.177), strong evidence against a model including their interaction (BF 01 ¼ 18.892), and decisive evidence against more complex models including multiple predictors (all BF 01 > 102). Note that, although we find no lateralized motor response signals, our results from GLM4 do not preclude the STN from being involved in motor preparation/execution processes bilaterally. In our experimental paradigm, subjects had to respond on all trials and therefore we cannot dissociate bilateral motor processing signals from those related to stimulus processing per se.
To summarize, in this section, frequentist tests suggested no evidence for effects of any of our manipulations (difficulty, potential payoff cue valence, motor response direction) on the STN BOLD responses, and no evidence for differences between the segments or for interactions. The Bayesian analyses consistently showed substantial to strong evidence against such effects.
Table 1 e Results of per-subregion within-subject frequentist and Bayesian t-tests. Frequentist t-tests had 32 degrees of freedom, and none were significant after correcting for the false discovery rate (q < .05). Bayes factors are presented as BF 01 , where values greater than 1 indicate evidence against the presence of a difference, and values lower than 1 evidence in favor of a difference. For the within-subject t-tests based on the Accolla et al. (2014)  c o r t e x 1 5 5 ( 2 0 2 2 ) 1 6 2 e1 8 8 3.5.

Interindividual differences
Next, we took an interindividual differences approach to analyze the data, reasoning that the size of the behavioral differences should covary with the size of the BOLD response differences in the STN. Specifically, we tested whether the size of the shift in the DDM's starting points and drift rates due to a potential payoff cue was correlated to the difference in BOLD responses between payoff and neutral cues (obtained from GLM1). Furthermore, we tested whether the difference between the DDM's drift rates (hard e easy) was correlated with the corresponding difference in BOLD responses (also obtained from GLM1). All correlations, including their coefficients and corresponding Bayes factors, are shown in Fig. 6. Frequentist tests showed that none of the correlation coefficients was significant, and the Bayes factors indicated anecdotal evidence against correlations. In summary, there was no evidence for any interindividual correlations between the size of the behavioral effect and the size of the STN's BOLD responses in all subregions.

Discussion
A prominent hypothesis on the organization of the STN holds that it contains three functionally distinct subregions, respectively specialized in cognitive, limbic, and motor processing (Temel, Blokland, et al., 2005). Here, we tested this hypothesis in healthy volunteers using ultra-high field 7 T BOLD-fMRI with protocols optimized to study anatomy and function of STN (De Hollander et al., 2015, 2017Mileti c et al., 2020;Mulder, Keuken, Bazin, Alkemade, & Forstmann, 2019), manual delineations to maximize anatomical accuracy, and a well-understood task with manipulations hypothesized to elicit cognitive, limbic, and motor effects on behavior. Specifically, we used a random dot motion decisionmaking task with a payoff bias and a difficulty manipulation, and we furthermore analyzed the different response directions. Control analyses confirmed that the task manipulations elicited the expected behavioral effects. Furthermore, whole-brain analyses of the functional neuroimaging data directly replicated well-established findings on the effect of motor lateralization. The difficulty manipulation caused differential BOLD responses in right anterior insula, and the size of the BOLD response difference covaried with the size of the behavioral difficulty effect. Drift rate effects on anterior insula have frequently been reported in decisionmaking tasks with difficulty manipulations (Keuken, Mü ller-Axt, et al., 2014). The payoff cues resulted in increased BOLD responses in parietal cortex, which is commonly implicated in evidence accumulation in perceptual decision making, but the presence of between-subject strategy differences limited our ability to directly compare our results with the (relatively scarce) literature on this manipulation. Deconvolution analyses confirmed that we were able to detect BOLD responses in all STN segments. Despite the promising results of our control analyses, we found no evidence for functionally specialized subregions in the STN. More strongly, Bayesian analyses repeatedly suggested substantial to strong evidence against functionally distinct subregions in the STN.
Other studies that are considered to support a three-partite functional organization include LFP recordings from DBS electrodes in PD patients. Various studies report the peak amplitude of beta oscillations are near the dorsolateral border (de Solages, Hill, Yu, Henderson, & Bronte-Stewart, 2011;Kü hn et al., 2005;Seifried et al., 2012;Trottenberg, Kupsch, Schneider, Brown, & Kü hn, 2007;Weinberger et al., 2006;Zaidel, Spivak, Grieb, Bergman, & Israel, 2010), and alpha peak amplitude more towards the ventromedial border (Horn, Neumann, Degen, Schneider, & Kü hn, 2017). Yet, there is substantial variability in peak site locations, and because DBS surgery targets the dorsolateral part of the STN, the ventromedial tip is inherently undersampled, complicating the interpretation of these findings in terms of the functional organization of the STN (van Wijk et al., 2020).
It is possible that the STN is indeed specialized in functionally distinct subregions, but that we were unable to find these due to experimental design choices. One important consideration is how we operationalized the hypothesized subregions of the STN into three segments. Here, we used a PCA on the individual STN masks to identify the ventromedialedorsolateral axis, along which we divided the STNs in three segments of equal volume. Some other studies suggest more complex shapes of subregions (e.g., Ewert et al., 2018;Horn et al., 2017; but see Lambert et al., 2012), but there is considerable variability in the reported shapes and even number of hypothesized subdivisions (for review, see Keuken et al., 2012). Other work suggests the STN is organized along a gradient of changes with no clear boundaries De Hollander et al., 2014). Some of the mismatches in neuroanatomical data that speaks in favor of two versus three subdivisions might be due to the fact that although the indirect cortico-pallido-subthalamic projections show a motor/ associative/limbic gradient (e.g., Haber, Lynd-Balta, & Mitchell, 1993;Parent & Hazrati, 1995;Shink, Bevan, Bolam, & Smith, 1996), the hyperdirect cortico-subthalamic seems to predominantly involve only (pre)motor and limbic cortical areas (Temiz et al., 2020;Coenen et al., 2022; but see Haynes & Haber, 2013). What the functional significance of these different indirect versus hyperdirect projection patterns might be in the context of speeded perceptual decisionmaking is, however, hard to predict and both models would predict at least some functional specialization across STN subregions. c o r t e x 1 5 5 ( 2 0 2 2 ) 1 6 2 e1 8 8 The PCA-based approach provides a principled way to formalize the hypothesis that the ventromedialedorsolateral axis is the most important axis of organization. We also repeated our ROI analyses based on the subdivision proposed by Accolla et al. (2014), which was based on diffusion MRI. These masks delineate more complex-shaped subdivisions with unequal volumes (see also Figure S1), but the corresponding analyses lead to the same conclusions.
An alternative analysis strategy could have been to use purely voxel-based analyses and define three subdivision ROIs in a more data-driven manner. However, preliminary analyses clearly showed that the signal-to-noise ratio of the BOLD signal at a field strength of 7 T is not reliable enough to elucidate activation patterns in single voxels within single individuals. Therefore, to improve experimental power and to refrain from overly exploratory analysis approaches, we opted to use the Fig. 6 e Interindividual correlations between BOLD-response contrasts and parameter estimate contrasts. Both the BOLDresponse contrasts (y-axis) and the parameter estimate contrasts (x-axis) are z-scored. Colors indicate subregions A (blue), B (green), and C (red). Shaded areas correspond to 67% confidence intervals. Pearson's correlation coefficient r is included in each panel, with the corresponding Bayes factors. For the correlations using the STN segments based on the atlas provided by (Accolla et al., 2014), see Figure S6.
ROI-based approach employed here and advocated elsewhere (De Hollander et al., 2015, see also 2017. We would like to stress that although our analyses used two sets of three predefined ROIs, this did not preclude us from finding a topological organization, even when the true underlying organization would be of a more gradual nature. Specifically, if there is a gradual change along the dorsolateraleventromedial axis of the STN that is relevant to the task and manipulations used in the present study, then we would still expect to find different amounts of task-related activity in the three segments, and especially between the ventromedial 'limbic' and dorsolateral 'motor' segment which form the opposite ends of the main axis of organization. Thus, our ROI-based approach was more of a practical data analysis choice rather than a theoretical commitment and we believe our results also speak against a somewhat 'gradual' tripartite functional organization of the STN, at least with respect to the current task used. A second design consideration is our operationalization of the cognitive, limbic, and motor domains in the experimental paradigm. We used the random-dot motion task since the STN is often thought to play a central role in speeded (perceptual) decision making (Bogacz, 2007;Bogacz & Gurney, 2007;Frank, 2006;Frank et al., 2015;Green et al., 2013;Keuken et al., 2015;Keuken, Van Maanen, Boswijk, Forstmann, & Steyvers, 2018;Mink, 1996). The operationalizations of the cognitive domain by means of a difficulty manipulation, of the limbic domain by means of the potential reward, and the motor domain using the response hands, were based on earlier studies using similar manipulations (Mulder et al., 2014). The difficulty manipulation has previously been shown to affect 'cognitive' brain areas such as the dorsolateral prefrontal cortex (Heekeren et al., 2004;Kaiser et al., 2007). Earlier work also suggested that STN activity reflects a 'conflict' or 'normalization' signal (e.g., Bogacz & Gurney, 2007;Frank et al., 2015;Keuken et al., 2015) that should be inversely proportional to choice difficulty. The detection of conflicts allows the STN to increase response thresholds (e.g., Bogacz & Gurney, 2007;Frank et al., 2015;Herz, Zavala, Bogacz, & Brown, 2016) to prevent premature, inappropriate responses. This proposed inhibitory role of the STN extends to other cognitive tasks (Soh & Wessel, 2021;Wessel et al., 2016), and could be facilitated by the short neural pathways between the visual system and the STN (Coizet et al., 2009) which could allow the STN to quickly detect whether a stimulus is sufficiently clear to allow for a fast response, or whether thresholds should be increased.
The potential reward manipulation has been shown to affect 'limbic' brain areas such as the orbitofrontal cortex (Forstmann, Anwander, et al., 2010;Summerfield & Koechlin, 2010). Unfortunately, we were not able to replicate these findings. We found significant task-related BOLD responses in the STN, but found no differences in the BOLD response between hard and easy trials, between payoff and neutral cues, or between left and right responses. It is possible that the tripartite model is correct, but that our manipulations (despite being tailored to do so) failed to tap into the three distinct cortico-basal ganglia networks, hence failing to elicit specific cognitive, limbic, and motor functioning of the STN. If so, theorists of STN functioning that are proponents of the tripartite model should strive to extend this model with specific attributions of cognitive, limbic, and motor functions and how they would be involved in standard experimental paradigms, as such that the model remains testable. Then, future work can assess whether other experimental designs can elicit BOLD responses that support a tripartite model. Complementary, more data-driven approaches might also be used to leverage functional (in addition to structural) data. For example, unsupervised machine learning techniques that can situate multivariate activity patterns across different gradual axes of connectivity (e.g., Haak, Marquand, & Beckmann, 2018;Huntenburg, Steele, & Bazin, 2018;Margulies et al., 2016) might be used to subdivide the STN using functional MRI data (in addition to structural connectivity data). Preferably, resulting connectivity patterns with respect to cerebral cortex should in turn be related to cognition via automated metaanalysis techniques (e.g., Neurosynth; Yarkoni, Poldrack, Nichols, Van Essen, & Wager, 2011).
Our study is the first to test for functional specialization of STN subregions using ultra-high field 7 T BOLD-fMRI, which, contrary to LFP recordings, allowed for testing healthy volunteers and sample across the entire STN. However, the spatial resolution of the BOLD response can be lower than the nominal scanning resolution due to draining veins that cause BOLD responses in regions downstream from the activated neural tissue (R. Turner, 2002;Uǧurbil, Toth, & Kim, 2003). One study at 7 T estimates the full-width-half-maximum of the BOLD pointspread function in cortical gray matter to be in the range of 1.5e2 mm (Shmuel, Yacoub, Chaimow, Logothetis, & U gurbil, 2007). This is a known issue for laminar fMRI, where non-BOLD fMRI methods (Huber et al., 2014(Huber et al., , 2018(Huber et al., , 2015Huber, Uluda g, & M€ oller, 2019) are more frequently used to disentangle neural activity from the different cortical layers using sub-millimeter scanning protocols. However, the estimated point spread function is still smaller than the length of the STN along the ventromedialedorsolateral axis (approximately 1 cm). Another consideration here is the use of partial Fourier (required to obtain a TE of 14 msec), which decreases the effective resolution in the phase encoding direction (anteriorposterior) in our functional data. The STN, however, extends across multiple slices (see Fig. 2), which increases the number of independent samples from the different segments. Furthermore, compared to lower field strengths, 7 T fMRI appears to be less sensitive to macrovasculature, since the T 2 * of macrovasculature is relatively much lower compared to the microvasculature at 7 T (Markuerkiaga, Barth, & Norris, 2016;Uluda g, Mü ller-Bierl, & U gurbil, 2009). In future work, it will be of utmost importance to investigate the neurovascular coupling in and surrounding the STN in order to understand the limitations of the spatial specificity of the BOLD response in such subcortical regions.
Another limitation of using BOLD-fMRI to study the STN is the gradient of increasing iron concentrations towards the ventromedial tip of the structure (De Hollander et al., 2014;Stü ber et al., 2014). The presence of iron in brain tissue decreases the transversal relaxation time T 2 *, which in turn affects the size of the BOLD response (under a constant echo time): Under the assumption of mono-exponential signal decay, signal-to-noise ratios decrease exponentially with echo time, but BOLD contrast increases linearly (e.g., Kundu, Inati, Evans, Luh, & Bandettini, 2012). Hence, voxels with relatively low T 2 * values are expected to have a relatively large BOLD c o r t e x 1 5 5 ( 2 0 2 2 ) 1 6 2 e1 8 8 response (e.g., in percent signal change) and relatively large amount of signal noise. Optimal BOLD contrast-to-noise ratios (and therefore, highest first-level t-values) are reached when the echo time is equal to the T 2 * of the voxel (Posse et al., 1999). Therefore, it can be expected that both the BOLD contrast estimates as well as the t-values vary along the ventromedialedorsolateral axis of the STN. Future studies supporting the tripartite model using BOLD-fMRI should therefore ensure that the T 2 * variability within the STN did not influence the results.
In addition to spatial limitations, the temporal resolution of functional MRI also remains rather sluggish compared to the millisecond timescale at which speeded decision-making unfolds neurally (Gold & Shadlen, 2007;Yarkoni, Barch, Gray, Conturo, & Braver, 2009). Here, we have tried to circumvent this issue using an experimental design that aims to tear apart different aspects of decision-making throughout time ). An alternative, future approach might be to sample neural activity at a higher temporal resolution using either invasive electrophysiological recordings (van Wijk et al., 2017) or highly accelerated functional MRI at a small volume-of-interest (e.g., using line-scanning fMRI), which can now reach a time scale of dozens of milliseconds (Albers, Schmid, Wachsmuth, & Faber, 2018;Ekman, Kok, & De Lange, 2017;Raimondo et al., 2021;Setsompop, Feinberg, & Polimeni, 2016;Yu, Qian, Chen, Dodd, & Koretsky, 2014). However, the electrophysiological approach suffers from biased sampling, because invasive recordings can only be made in patients with a clinical need for DBS, and neurosurgeons generally target the electrodes dorsolaterally from the STN (van Wijk et al., 2020). The fast fMRI approach will suffer particularly high SNR hits due to the necessary acceleration and the fact that the STN lies near the center of the brain Mileti c et al., 2020). Moreover, some temporal blurring will always occur due to neurovascular coupling (but see for example Ekman et al., 2017;Siero et al., 2013). However, future methodological developments like larger electrode arrays and optimized receiver coils might overcome these limitations.
In sum, we tested for the presence of three functional subdivisions in the STN using ultra-high field 7 T BOLD-fMRI in healthy volunteers with a task paradigm. The results did not support functional subdivisions, and more generally show no BOLD responses in the STN related to task difficulty, payoff cues, or motor response direction. It is important for future work that the tripartite model is further augmented to clarify which experimental paradigms and manipulations are expected to lead to differential activity between subregions, such that these predictions can be empirically tested.

Open practices
The study in this article earned an Open Materials badge for transparent practices. Materials and data for the study are available at https://osf.io/nurde/.

Code & data availability statement
All code used to analyze the data can be found on https://osf. io/nurde/. Due to legal restrictions from GDPR, we are unable to make dataset 1 fully publicly available, because data sharing was not part of the informed consent procedure at the time of data collection, which restricts us from sharing these data outside of the author team. Dataset 2 can be found on https://osf.io/nurde/or https://doi.org/10.21942/uva.20160598.

Conflict of Interest
None.