Decomposing the neural pathways in a simple, value-based choice

Understanding the neural implementation of value-based choice has been an important focus of neuroscience for several decades. Although a consensus has emerged regarding the brain regions involved, including ventromedial prefrontal cortex (vmPFC), posterior parietal cortex (PPC), and the ventral striatum (vSTR), the multifaceted nature of decision processes is one cause of persistent debate regarding the organization of the value-based choice network. In the current study, we isolate neural activity related to valuation and choice selection using a gambling task where expected gains and losses are dissociated from choice outcomes. We apply multilevel mediation analysis to formally test whether brain regions identified as part of the value-based choice network mediate between perceptions of expected value and choice to accept or decline a gamble. Our approach additionally makes predictions regarding interregional relationships to elucidate the chain of processing events within the value-based decision network. Finally, we use dynamic causal modelling (DCM) to compare plausible models of interregional relationships in value-based choice. We observe that activity in vmPFC does not predict take/pass choices, but rather is highly associated with outcome evaluation. By contrast, both PPC and bilateral vSTR mediate the relationship between expected value and choice. Interregional mediation analyses reveal that vSTR fully mediates between PPC and choice, and this is supported by DCM. Together these results suggest that vSTR, rather than vmPFC or PPC, functions as an important driver of choice.

A reasonable starting point to explore neural pathways mediating choice is to begin with relatively simple choices (to simplify the space of choice parameters) and use this to focus on specific choice processes (to restrict the distribution and size of the decision-making network that is engaged). A critical question in value-based choice is which brain region (or regions) represents and integrates value-based information and forms a choice. Given the large, distributed nature of decision-making networks in the brain, we chose to restrict our focus to regions that both encode value and predict choice from exploratory voxelwise analyses. These regions likely include relevant nodes in a network comprising the vmPFC, PPC, and vSTR. Similar value-representation and choice-selection processes have been ascribed to several regions, including the PPC, vmPFC/OFC, and vSTR (see Fig. 1). PPC activity covaries with value representation and choice response times (Basten et al., 2010; Domenech et al., 2017; Rodriguez et al., 2015). VmPFC/OFC activity represents value at decision (Boorman et al., 2009; Boorman et al., 2013; Padoa-Schioppa and Assad, 2006; Strait et al., 2014; Strait et al., 2015), integrates value information (Chib et al., 2009), and is modulated by decision duration (Rich and Wallis, 2016; Sokol-Hessner et al., 2012; Strait et al., 2014). Likewise, vSTR activity represents subjective and chosen values (Levy et al., 2010; Peters and Büchel, 2010; Strait et al., 2015) and reflects dopamine neuromodulation of choice (for a review see Frank and Claus, 2006). These lines of research highlight the distributed nature of choice (Hunt and Hayden, 2017) but may present alternative predictions regarding the hierarchical chain of processing events in value-based choice.
While some research indicates that certain decision-related regions are more proximal for choice than other regions (e.g., parietal and lateral prefrontal regions have been labelled as implementing choice process, but not valuation processes in macaques (Kable and Glimcher, 2009)), more recent views emphasize recurrence within the decision-making network along with hierarchical organization of processing timescales (Hunt and Hayden, 2017).
One reason for debate regarding the value-based choice network is that decision-making is multifaceted and requires many component processes, including stimulus encoding, valuation, integration, and selection. In this study, we chose to isolate take/pass choices that require valuation and selection processes without adding complexity related to value comparisons when selecting between options. This simplified choice architecture isolates component processes that are critical for value-based choice, and our aim is to trace the neural implementation of these isolated decision-making components.
To address this question we isolate decision processes from outcome evaluation and value comparison processes. To isolate decision processes, we exploit probabilistic associations between the expected values and uncertain outcomes and temporal dissociation between actions (e.g., Domenech et al., 2017; Jocham et al., 2014; Metereau and Dreher, 2015; Studer et al., 2012; Tom et al., 2007). To further isolate valuation and selection processes, we employ take/pass choices rather than more complex, multi-option choices that elicit additional comparisons between options. By isolating simple, value-based choice processes, we can explore how they are implemented in brain networks without contamination from related, non-decision processes that may rely on similar regions. To that end, we adapted the Duplex Gamble task (Slovic and Lichtenstein, 1968), in which participants choose to take or pass a probabilistic gain and loss as a single gamble, so that comparator processes are not required.
We present a hierarchical mediation framework to model choices as a direct product of neural activity and environmental parameters. We formally explore whether neural activity mediates choice on a trial-by-trial basis (Bauer et al., 2006; Kenny et al., 2003) and whether trial-level activity in one region mediates the relationship between activity in other regions and choice. Together, isolation of valuation and selection processes combined with innovative analytical methods will provide novel insight into value-based choice networks within the human brain. Given the potential for multiple choice-mediating regions (i.e., more than one region may mediate choice) and mutual mediation between regions (i.e., interregional mediation may work jointly in both directions in recurrent networks), our analyses explore each of these possible mediation relationships. Our approach is both hypothesis-driven, i.e., we expect PPC, vSTR, and/or vmPFC to be a critical substrate that mediates value-based choice, given the prior evidence consistent with this role, and exploratory, i.e., we are agnostic as to the precise pathway(s) and timing of information flow between these regions for choice. Moreover, we examine choice mediation across the whole brain to explore whether other brain regions are among the critical regions for choice. Finally, although PPC, vSTR, and vmPFC have been implicated in decision-making in the literature, we explore whether their roles are restricted to decision-making or extend to the evaluation of choice outcomes.

General
We measured brain activity during a duplex gamble task using the blood-oxygenation-level-dependent (BOLD) signal acquired with functional magnetic resonance imaging (fMRI). Given that fMRI data is inherently hierarchical (i.e., trials nested within scanning runs nested within participants), these sources of variance should be modelled with a multilevel modelling approach (e.g., Chen et al., 2013). We present the first formal, multilevel test of the fully mediated pathway from expected value to neural activity to choice. We then extend the logic of the brain-as-predictor approach (Berkman and Falk, 2013) into a voxelwise, logistic, multilevel framework to model choices as a direct product of concurrent neural activity and environmental parameters. A key difference between our study and previous studies is the voxelwise application of generalized multilevel models instead of analyses restricted to regions of interest (Knutson et al., 2007). In addition, mediation analyses of neural activity (e.g., Wager et al., 2008) are useful in evaluating hypotheses relating to pathways from stimuli to brain activity to behavior as well as interregional relationships, where activity in one region may mediate between activity in other regions and behavior.

Participants
We recruited 23 individuals (14 women; one woman was excluded for being unable to complete the task, 22 individuals were analyzed in the final sample) from the Queen's University community through advertisements. Participants provided consent in accordance with the Queen's University Institutional Review Board. All participants had no history of neurological or psychiatric disorder, had normal or corrected vision, and were right-handed. Participants were compensated with $40.

Experimental design
Participants performed a 10-min pre-scanning task to familiarize them with the gambles to be used in the fMRI scanner. Gambles were presented as color-coded bar charts (red or blue) representing gain and loss magnitude (20, 40, 60, 80 points as bar height) and pie charts representing probabilities (13, 33, 50, 67, 83%) (Fig. 2). Gains and losses were presented simultaneously on the screen. Color and position of gambles (gains on left or right) were counterbalanced across participants. Response position was not varied within participants in order to simplify decision-to-motor mappings, which follow rather than precede decision-making processes; while this choice simplified decision-motor mappings for participants, it may have led to some laterality of effects, though these are not critical for the hypotheses being tested here. Following a 2 s fixation cross, gambles were displayed for 4 s, during which participants made a button press to take or pass the gamble. Participants decided whether to take the potential gain and loss as a whole, i.e., decisions for prospective gain and loss were not separate for each trial. Conceptually this is a choice between taking the gamble and selecting an option of 0 points. Following each decision and a jittered delay of 2, 4, or 6 s, feedback appeared for 2 s indicating how many points had been (or could have been) gained and lost; both the gain and the loss, either one alone, or neither could occur. Participants completed seven blocks of 18 trials; gain and loss magnitude and probability were randomized on each trial.

Fig. 1. Alternatives for the value-based choice network. Potential regions that mediate choice include posterior parietal cortex, ventromedial prefrontal cortex, and ventral striatum. Arrows represent information flow, where each of these regions potentially mediates between value and choice. All possible interregional relationships are displayed.

T.R. Koscik et al. NeuroImage 214 (2020) 116764

Participants were informed that their points would determine their compensation, though in reality these were independent. For fairness to all participants, we paid out the maximum amount regardless of performance.
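As an illustrative sketch (not the authors' code), the expected-value quantities used as predictors in later models can be computed directly from the gamble parameters listed above. The function names are hypothetical, and the outcome distribution assumes that the gain and the loss are resolved as independent draws, which the text implies but does not state explicitly.

```python
from itertools import product

# Task parameters as listed in the Experimental design section.
MAGNITUDES = (20, 40, 60, 80)                    # points
PROBABILITIES = (0.13, 0.33, 0.50, 0.67, 0.83)   # chance that the gain (or loss) occurs


def expected_components(gain, p_gain, loss, p_loss):
    """Return (expected gain, expected loss, net expected value) in points."""
    expected_gain = p_gain * gain
    expected_loss = p_loss * loss
    return expected_gain, expected_loss, expected_gain - expected_loss


def outcome_distribution(gain, p_gain, loss, p_loss):
    """Map each possible outcome of a taken gamble to (net points, probability).

    ASSUMPTION: gain and loss are independent draws.
    """
    return {
        "gain and loss": (gain - loss, p_gain * p_loss),
        "gain only": (gain, p_gain * (1 - p_loss)),
        "loss only": (-loss, (1 - p_gain) * p_loss),
        "neither": (0, (1 - p_gain) * (1 - p_loss)),
    }


# Every gamble that the stated magnitudes and probabilities can generate.
ALL_GAMBLES = list(product(MAGNITUDES, PROBABILITIES, MAGNITUDES, PROBABILITIES))
```

Passing a gamble corresponds to a certain outcome of 0 points, so under this sketch the sign of the net expected value indicates which option has the higher expectation.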

fMRI acquisition
Images were acquired with a Siemens Tim Trio 3T scanner. For whole-brain functional coverage, 32 axial slices (slice thickness = 3.5 mm, 0.5 mm skip) were prescribed parallel to the AC-PC line. Functional images were acquired from inferior to superior using a single-shot gradient echo planar pulse sequence (TE = 25 ms, TR = 2 s, in-plane resolution = 3.5 × 3.5 mm, matrix size = 64 × 64, FOV = 224 mm, TRs per run = 149).

Preprocessing
fMRI scans were preprocessed using FSL (Jenkinson et al., 2012) and included the following operations: (1) motion correction; (2) non-brain voxel removal; (3) spatial smoothing, 5 mm FWHM Gaussian kernel; (4) intensity normalization using 4D grand mean; (5) temporal filtering, high pass Gaussian-weighted least squares straight line fitting (sigma = 70 s); and (6) affine registration to MNI standard space. We obtained single-trial beta estimates for BOLD response magnitude on each trial by modelling fMRI time series with individual trial regressors (Mumford et al., 2012; Rissman et al., 2004) using AFNI (Cox, 1996). For each trial, single-trial beta estimates for decisions were obtained using onsets of gamble options modulated by the response time for each trial. Likewise, single-trial beta estimates for outcomes were obtained using onsets of the outcome display modulated by the duration and amplitude of the outcome on screen. We chose to use individual trial regressors (LSA) to obtain single-trial activity estimates. While summed trial regressors (LSS) may be preferable in many circumstances (e.g., multivariate pattern analysis), particularly for rapid event-related designs, our task design was not rapid given our dissociation of decisions and outcomes in time, and LSA preserves more trial-related variability and provides non-biased parameter estimates, which is preferred in this context (Mumford et al., 2012). Either method could reasonably be used, though the current implementation of the LSS method in AFNI does not allow amplitude modulation; we do not intend to advocate for one over the other. This resulted in 5544 (252/subject) beta estimates at each voxel in the brain, with 2772 (126/subject) corresponding to each of choice and outcome. Given the large number of predictors per scanning run (18) and the number of samples in the fMRI timeseries (149), there is a potential that regression models generating single-trial beta estimates are overfit.
This is not a problem for these estimates, as it is desirable for them to match the observed data as closely as possible (a consequence of overfitting), and there is no need to extrapolate or predict values in a new sample as the regression model only applies to a specific scanning run (overfitting would make these extrapolations inaccurate).
Regarding the potential for overfitting in subsequent models of decision, descriptions of this deconvolution method demonstrate the technique using 60 trials for two tasks (Mumford et al., 2012); the current paradigm has roughly double this number (126 trials per participant), so we are confident that the number of trials is sufficient relative to the number of predictors in our models (up to 4 fixed effects and 2 nested random effects for the most complex models). Furthermore, none of the mixed-effects models that were run resulted in "singular" fits, where a predictor or set of predictors fits the data with close to zero variance or correlations close to ±1; this suggests that the mixed-effects models were not overfit.

Statistical analysis
Brain regions exhibiting activity that mediates between expected value and observed choice must meet the following criteria: (a) neural activity must be predicted by expected value, and (b) neural activity must predict choice while controlling for expected value (for a thorough discussion of mediation see Baron and Kenny, 1986). Demonstrating that neural activity is modulated by expected value (or other choice-related values) is insufficient to conclude that it determines a choice; rather, neural activity in mediating regions must also predict the choice being made (Cisek, 2012). Simply demonstrating that (a) neural activity is predicted by expected value and (b) neural activity predicts choice is not sufficient to demonstrate mediation, as the variance associated with each of these relationships measured separately may be completely independent. Mediation analysis quantifies the effect of one variable on the relationship between others; in other words, it quantifies whether the variance in (a) and (b) is shared in a manner consistent with the mediation model.
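For orientation, criteria (a) and (b) can be written in generic product-of-coefficients form. The notation below is ours rather than the authors' (X for expected value, M for trial-level neural activity, Y for the take/pass choice, with trial i nested in participant j), and the random effects used in the actual multilevel models are omitted for readability.

```latex
% Simplified single-level sketch; symbols are illustrative assumptions.
\begin{align}
M_{ij} &= d_M + a\,X_{ij} + \varepsilon_{ij}
  && \text{(criterion a: value predicts activity)} \\
\operatorname{logit}\Pr(Y_{ij} = 1) &= d_Y + c'\,X_{ij} + b\,M_{ij}
  && \text{(criterion b: activity predicts choice, controlling for value)} \\
\text{indirect effect} &= a \times b
\end{align}
```

Here a nonzero product of a and b, rather than significance of a and b taken separately, is what indicates mediation.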
Critically, we are interested in trial-level mediation: that individuals tend to exhibit a particular response due to changes in neural activity on that trial instead of individual-level effects (e.g., membership in an experimental condition or group). By applying multilevel mediation methods, we can formally test whether neural activity mediates choices on a trial-by-trial basis (Bauer et al., 2006; Kenny et al., 2003). Finally, once regions that mediate choice have been identified, interregional relationships can be tested, where trial-level activity in one region may mediate the relationship between activity in another region and choice. This analytic approach provides insight into process-level neural organization.

Fig. 2. Each trial begins with a fixation cross for 2 s, followed by a gamble. For each gamble, bars indicate points, pies indicate probability, and color indicates gain or loss. Gambles remain on-screen for 4 s, during which time participants choose to take or pass the gamble by making a button response. Participants take or pass the gambles as a whole using a button press to indicate their choice. Gambles are followed by a fixation cross for 2, 4, or 6 s (jittered). Finally the outcome for each gamble is presented. If participants chose to take the gamble, they could receive either both the gain and loss, the gain-only, the loss-only, or neither gain nor loss.

When conducting mediation analyses with fMRI data, we assume that the direction of effects is that brain activity is a product of environmental stimuli and observed choice actions result from brain activity. While it is possible for choice to cause brain activity, we assume this is less likely. When the direction of effects is less clear, such as when considering interregional relationships, mediation provides convergent evidence for, but does not confirm, the directionality between component processes. To explicitly test the directionality of interregional relationships, we employ dynamic causal models (DCM) to compare between hypotheses of interregional connectivity for value-based choice deemed plausible from our multilevel mediation analysis.

Voxelwise multilevel models
We modelled trial-related estimates of BOLD activity at each voxel in the brain in R (R Core Team, 2016) using lme4 (Bates et al., 2015) and lmerTest (Kuznetsova et al., 2015) packages as well as custom software. Core assumptions of multilevel modelling are the same as those of the general linear model (GLM); given that multilevel modelling is literally simultaneous GLM across participants and runs, processing approaches and assumptions are logically identical.
Timepoints were excluded voxelwise where the beta estimate of BOLD activity was greater than 3 standard deviations from the mean beta estimate for each subject (on average 0.15% were removed per voxel, combined across all subjects). These large spikes in the data do not reflect realistic changes in neural activity and likely result from random fluctuations in deconvolution of raw BOLD signals. Variables were mean centred as required for each model, and when necessary (i.e., for logistic models), variance components of BOLD time series estimates were decomposed into specific individual-level, subject by run-level, and trial level signals. Initial multilevel models were computed, then time points were excluded where the standardized residual of a given time point was greater than 3. Approximately 1% of data points were removed on average per model per voxel. The remaining variables were then re-mean centred and variance components were decomposed for logistic models so that outlying data points did not impact calculation of within-subjects and within-run means.
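The first exclusion step can be sketched as follows. This is a minimal illustration for one voxel and one subject, assuming the sample standard deviation is used; the function name is hypothetical, and the second pass on standardized model residuals is not shown.

```python
from statistics import mean, stdev


def exclude_outliers(betas, threshold=3.0):
    """Keep only beta estimates within `threshold` SDs of the mean.

    `betas` holds the single-trial estimates for one voxel and one subject.
    ASSUMPTION: the sample standard deviation is used.
    """
    centre = mean(betas)
    spread = stdev(betas)
    if spread == 0:
        # No variability at all: nothing can be flagged as a spike.
        return list(betas)
    return [b for b in betas if abs(b - centre) <= threshold * spread]
```

Note that with very few trials a single value cannot exceed 3 standard deviations by construction, so a rule of this kind is only meaningful with trial counts on the order of those collected here.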
All p-values are FDR corrected (q = 0.05) using the implementation available in FSL; the cerebellum was excluded from all analyses. Reported coordinates indicate the centre of mass of a cluster in MNI space and the size of the cluster in mm². Multilevel models included random effects of participant and scanning run. Linear multilevel models at decision predicted neural activity with expected gain and loss. Linear multilevel models at outcome predicted neural activity with gain and loss magnitude interacting with choice [coded as 1 (taken), 0 (no response), and −1 (passed)].
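To make the correction concrete, the sketch below applies the standard Benjamini-Hochberg step-up procedure to a list of p-values. It is an assumption that this matches the FSL implementation used in the study in every detail; the function name is hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which tests survive FDR control at level q.

    Step-up rule: find the largest rank k with p_(k) <= q * k / m, then
    declare the k smallest p-values significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= q * rank / m:
            largest_passing_rank = rank
    significant = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_passing_rank:
            significant[index] = True
    return significant
```

Because the rule is step-up, a p-value that misses its own rank threshold can still be declared significant when a larger p-value passes at a higher rank.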
To model choices, we specified multilevel logistic regression models where neural activity at decision (i.e., during the time period preceding a response) predicted choice [coded 1 (taken) and 0 (passed), missed excluded] while controlling for expected gain and loss. Given our interest in trial-related changes, neural activity in choice models was decomposed into three variance components: (1) trial-related variance, beta estimate minus within-scanning run means, (2) scanning run-related variance, within-scanning run means minus within-subjects means, and (3) subject-related variance, within-subjects means minus grand mean.
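The three-way decomposition described above can be sketched as follows. The data layout (a list of subject, run, beta tuples) and the function name are hypothetical, and the grand mean is assumed to be the mean over all trials rather than the mean of subject means.

```python
from collections import defaultdict
from statistics import mean


def decompose(trials):
    """Split each beta into trial-, run-, and subject-related components.

    `trials` is a list of (subject, run, beta) tuples. Returns a list of
    (trial_part, run_part, subject_part) tuples in the same order.
    ASSUMPTION: the grand mean is taken over all trials.
    """
    by_subject = defaultdict(list)
    by_run = defaultdict(list)
    for subject, run, beta in trials:
        by_subject[subject].append(beta)
        by_run[(subject, run)].append(beta)

    grand_mean = mean(beta for _, _, beta in trials)
    subject_mean = {key: mean(values) for key, values in by_subject.items()}
    run_mean = {key: mean(values) for key, values in by_run.items()}

    parts = []
    for subject, run, beta in trials:
        trial_part = beta - run_mean[(subject, run)]                  # (1)
        run_part = run_mean[(subject, run)] - subject_mean[subject]   # (2)
        subject_part = subject_mean[subject] - grand_mean             # (3)
        parts.append((trial_part, run_part, subject_part))
    return parts
```

By construction the three components sum to the grand-mean-centred beta, so entering them as separate predictors lets the trial-related term carry only within-run fluctuations.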

ROI mediation in multilevel models
Potential mediating regions were identified by the conjunction of significant effects at decision: (a) expected gain related to increased activity, (b) expected loss related to decreased activity, and (c) trial-related activity predicted responses. Given this three-way conjunction and the likelihood that it would yield small clusters, we excluded clusters of fewer than 15 voxels to balance retention of regions likely to be involved according to prior research (e.g., PPC, vSTR, and vmPFC) against exclusion of extraneous small clusters and single, spurious voxels. For each region that met these criteria, we calculated the within-subjects mean time series across voxels and then used these values to calculate the models necessary for mediation analysis.
Mediation analysis in multilevel models presents some unique challenges. For example, trial-level mediation effects are nested within person-level effects (Kenny et al., 2003). One could compute separate models for each participant and then average them, but this loses the nested data structure and does not account for individual differences in mediation strength. Thus, we utilized the stacked multilevel regression procedure adapted from previous work exploring mediation analyses in multilevel models to estimate the indirect effect (Bauer et al., 2006).
Stacked regression allows simultaneous modelling of multiple multilevel models, e.g., both components of the mediated pathway (value to brain and brain to choice), which is critical for mediation analysis. Given the nested structure in multilevel data, stacked multilevel modelling accounts for the covariance of slopes in the indirect path (e.g., value to brain and brain to choice) in addition to modelling the variance in slopes for both components of the indirect path (Bauer et al., 2006). These procedures were adapted to account for the fact that some relationships require a linear fit while others (namely when predicting responses) require fitting a binomial distribution. Thus, we employed a hybrid approach and took parameter estimates from individual models that make up the indirect path: (A) trial-related changes in the mean time series predicted by overall expected value (expected gain minus expected loss), and (B) choices predicted by trial-related neural activity controlling for expected value and subject-related differences in neural activity. From these models, we constructed a covariance matrix corresponding to the covariance of models A and B. We calculated the indirect effect and covariance between A and B using a stacked regression with a linear fit. This covariance matrix is then used to calculate confidence intervals for the indirect effect using a Monte Carlo multivariate simulation (Bauer et al., 2006;Mackinnon et al., 2004). This mediation analysis procedure was repeated for each of the identified regions for expected value-choice mediation, and was repeated on pairwise groupings of regions, to explore interregional mediation of choice. 
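The Monte Carlo step can be illustrated with a simplified sketch. It assumes the two path estimates (here `a` for model A and `b` for model B) are jointly normal with a known sampling covariance, and it ignores the additional random-slope covariance term that a full multilevel treatment would include. All names and the numbers in the usage line are hypothetical, not values from the study.

```python
import math
import random


def monte_carlo_indirect_ci(a, b, se_a, se_b, cov_ab=0.0,
                            level=0.95, draws=20000, seed=0):
    """Return (indirect effect, lower bound, upper bound) for a * b.

    Draws (a, b) pairs from a bivariate normal centred on the estimates
    and takes percentiles of the simulated products.
    ASSUMPTION: fixed-effect estimates only; se_a and se_b must be positive.
    """
    rng = random.Random(seed)
    rho = cov_ab / (se_a * se_b)
    if not -1.0 <= rho <= 1.0:
        raise ValueError("cov_ab is inconsistent with the standard errors")
    products = []
    for _ in range(draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        a_draw = a + se_a * z1
        # Correlate the second draw with the first (2x2 Cholesky factor).
        b_draw = b + se_b * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        products.append(a_draw * b_draw)
    products.sort()
    tail = (1.0 - level) / 2.0
    lower = products[int(tail * draws)]
    upper = products[int((1.0 - tail) * draws) - 1]
    return a * b, lower, upper


# Hypothetical estimates, for illustration only.
effect, low, high = monte_carlo_indirect_ci(a=0.5, b=0.3, se_a=0.05, se_b=0.05)
```

An interval that excludes zero is read as evidence for mediation; a percentile interval of this kind does not assume that the product of two normal variables is itself normal.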
In regard to interregional mediation, we chose not to explore potential multisynaptic pathways (i.e., whether a region mediates the effect between two other regions), as choice is not explicitly part of these pathways and the number of combinations quickly increases beyond reasonable computational complexity, e.g., voxelwise exploration in 2 mm³ space would require ~200,000 models per region pair.

Dynamic causal modelling of plausible value-based choice models
Together, voxelwise and interregional multilevel mediation models will isolate a set of plausible neural models of how the brain mediates value-based choice. These methods provide a powerful combination to identify and evaluate the plausible models of value-based choice across the whole brain. As mentioned above, this multilevel mediation approach is less certain when the direction of effects is unclear (as is the case for interregional mediation). By contrast, dynamic causal modelling (DCM) is impractical on a voxelwise, whole brain level, but excels when a model space is constrained to plausible models, and can more strongly infer the direction of effects. Thus, we employ behavioural DCM [SPM12 version 7219, (Penny et al., )] to provide an additional test of the likelihood of plausible interregional relationships. We constrain the model-space for DCM by including only those models where multilevel mediation is unclear in the inference of the direction of interregional effects, and exclude those that do not match mediation models of the relationship between value and choice.
Left and right vmPFC represent received gain and loss at outcome but not expected gain and loss at decision, despite the literature surrounding the vmPFC as a critical substrate for value-based choice. To explore this possibility further, we examined left and right vmPFC at decision, using the regions that represent received gain and loss at outcome. The left vmPFC outcome region had weak relationships at decision to expected gain [β = 0.059, t(2662.  Fig. 7F) mediates between value and choice.

Identification of neural regions for mediation analysis
Potential value-choice mediating regions must (a) be predicted by expected value at decision, increased activity to expected gain and decreased activity to expected loss as demonstrated above, and (b) activity in these regions must also predict choice when controlling for expected value. We calculated voxelwise, logistic multilevel models where choice was predicted with trial-related neural activity while controlling for subject-related and scanning run-related activity and expected gain and loss. Several clusters meeting these criteria were observed: bilateral vSTR [right: Fig. 6A, (12, 6, −8 For whole brain results where trial-related brain activity predicted choice at decision see Fig. A12 and Table A12 and for whole brain potential mediating regions see Fig. A13 and Table A13.

Ventral striatum and PPC mediate between expected value and choice
Within potential value-choice mediating regions, we calculated the mean time series of trial-related activity for each region, then modelled the neural activity in the region as a linear function of expected value (expected gain minus expected loss), and modelled binary choice response as a logistic function of trial-related activity within each candidate region within our multilevel framework. We then used stacked multilevel modelling to estimate the covariance structure of these models simultaneously (see Methods).    Fig. 7D). Thus, it is unlikely that significant mediation is due to some general, shared variance component of neural activity.
It is important to note that motor processing in the brain is necessarily proximal to the actions that implement choice relative to value processes, e.g., motor cortex directly evokes muscle movements to enact choice. Thus, motor cortex and neural substrates of value-driven behavioural control, such as the dorsal striatum (Frank and Claus, 2006), should mediate between value processes and choice. Indeed, we observed a region in left pCG (contralateral to right-handed responses) that mediates between value and choice [IE = 1.097, CI 95% = 0.569-1.745], but not in dorsal striatum. While task design features allow disambiguation of processes up to abstract representations of value-based choice, action-value representations were not conditional on any feature of the gamble task. Thus, these processes could not be distinguished here.

Interregional mediation in the value-based choice network
Since multiple brain regions mediate between expected value and choice, it is important to understand how information flows between them in service of choice. We tested whether these value-choice mediators also mediate between the neural activity in other mediator regions and choice. VmPFC does not predict decision and has already been excluded as a mediator; the remaining alternatives regarding interregional mediation of choice involve the PPC and vSTR (see Fig. 8), where either region may mediate between the other and choice. A third alternative is that they function together to aggregate value information, where there is partial mediation for both pathways. We performed pairwise comparisons between value-choice mediators in both directions, e.g., whether right vSTR mediated between left PPC and choice and whether left PPC mediated between right vSTR and choice.

Ventral striatum mediates the relationship between PPC and choice
First, linear multilevel modelling, where brain activity in one region is used to predict activity in a potentially mediating region, reveals that all pairwise comparisons between regions are significant, which is to be expected with highly correlated brain signals.  Fig. 9A).
Second, generalized multilevel modelling predicted choices with trial-related brain activity in the potential mediating region while controlling for activity in the other brain region and expected value (see Fig. 9B). Activity in right vSTR predicted choice consistently, after controlling for left vSTR [β = 0.289, z = 4.781, p = 1.741 × 10⁻⁶] or left PPC [β = 0.299, z = 6.659, p = 2.760 × 10⁻¹¹], as did left vSTR after con- Finally, given the numerous pairwise comparisons, we report an FDR-corrected 99.75% confidence interval, corresponding to correction for 20 tests. Pairwise mediation analysis (see Fig. 9C The results are consistent with the notion that the vSTR forms a pathway through which information must flow before making a decision (see Fig. 10). We also observe that the right vSTR mediates between left vSTR and choice [IE = 0.219, CI 99.75% = 0.082 to 0.359] but not vice versa [IE = 0.056, CI 99.75% = −0.083 to 0.195], suggesting that vSTR mediation of decision may be somewhat lateralized in this context. To provide additional validation, we tested the mediation models using multilevel structural equation modelling as implemented in lavaan for R (Rosseel, 2012). As with the stacked regression approach,

Dynamic causal modelling agrees with mediation model
To provide an additional test, whereby we can infer the directionality of interregional relationships, we used DCM to compare the plausible models of the value-based choice network. By combining multilevel mediation analysis with DCM, we can effectively reduce the model set to only those models that are plausible and consistent with the prior knowledge generated by multilevel mediation. DCM excels in indicating which models are relatively more plausible (Daunizeau et al., 2011), especially when we can constrain the model set included in the comparison (Stephan et al., 2009) by including only those models that are plausible given our multilevel mediation results. For the simple, value-based choices in our experiment, we can eliminate models from the ostensibly complete set (see Fig. 1) by: (1) excluding models where vmPFC is involved at decision, as activity in this region is related to outcomes not choices per se, and (2) including only models where vSTR mediates between PPC and choice (see Fig. 8). For simplicity, we included only right vSTR in DCM models, as this had the strongest mediation effect and appeared to mediate between left vSTR and choice as well. The models include all models where vSTR mediates between PPC and choice (Fig. 11) and value processing occurs in both regions (except for models E and F, where vSTR receives value information secondarily from PPC only), with: (A) bidirectional PPC/vSTR connections (Fig. 11A); (B) unidirectional connection from vSTR to PPC (Fig. 11B); (C) unidirectional connection from PPC to vSTR (Fig. 11C); (D) no connection between PPC and vSTR (Fig. 11D); (E) bidirectional PPC/vSTR connections (Fig. 11E); and (F) unidirectional connection from PPC to vSTR (Fig. 11F).
DCM indicated that Model C (see Fig. 10), in which both vSTR and PPC process value information, this information flows unidirectionally from PPC to vSTR, and vSTR determines choice, is the most likely model of value-based choice. This result held under both fixed-effects (relative log evidence = 12,346, posterior p = 1) and random-effects (ϕc = 97.2%) inference. All other models were less likely for fixed effects. Model C is most consistent with our multilevel mediation results as well.
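To see why a relative log evidence of this size corresponds to a posterior probability of 1, recall that under a uniform prior over models the posterior model probabilities are the normalized exponentials of the log evidences. The sketch below illustrates that relationship only. The value for Model C is taken from the text; the values for the other models are hypothetical placeholders, since they are not given here, and the function name is an assumption.

```python
import math


def posterior_model_probs(log_evidence):
    """Posterior model probabilities under a uniform prior over models.

    Subtracting the maximum before exponentiating keeps the computation
    numerically stable for very large log-evidence differences.
    """
    top = max(log_evidence.values())
    weights = {name: math.exp(value - top) for name, value in log_evidence.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Only Model C's value (12,346) comes from the text; the rest are made up.
relative_log_evidence = {
    "A": 9000.0, "B": 0.0, "C": 12346.0, "D": 500.0, "E": 8000.0, "F": 10000.0,
}
probs = posterior_model_probs(relative_log_evidence)
print({name: round(p, 4) for name, p in probs.items()})

# A much smaller difference of 3 log units already gives about 95% to the winner.
small = posterior_model_probs({"winner": 3.0, "runner_up": 0.0})
print(round(small["winner"], 4))
```

Because the probabilities depend only on differences in log evidence, a gap of thousands of log units leaves essentially no posterior mass for any competing model.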

Discussion
Understanding value-based choice requires understanding how stimulus values are manipulated by the brain to produce choice. Research in this regard has heretofore predicted neural activity from choices and/or task parameters. We provide the first direct test of how activity in reward-related brain regions mediates between expected value and choice. This mediation constitutes direct evidence that bilateral vSTR and PPC transform incoming information regarding the expected value of a decision to select whether or not to take a gamble. By contrast, vmPFC activity does not predict choice, nor does it mediate between expected value and choice selection. These data are consistent with the notion that deployment of neural resources in the value-based decision network depends on choice parameters (i.e., simple, take/pass gambles); not all network nodes are necessary for all types of choices. Moreover, our observation that activity in the vSTR fully mediates between activity in the PPC and choice suggests that vSTR has a more proximal role in choice selection (i.e., our data are consistent with the notion that the vSTR and not the PPC is the final arbiter of choice for these simple value-based choices), further refining the directionality of processing in the decision network. These results are consistent with literature indicating that activity in the vSTR (also labelled as nucleus accumbens) generalizes to predict future choices as well. For example, when individuals decided whether to fund a proposed project, vSTR activity predicted choices on a trial-by-trial basis and predicted future choices (Genevsky and Knutson, 2015; Genevsky et al., 2017). Likewise, vSTR activity has been shown to be the strongest predictor of response to advertisement (Venkatraman et al., 2015).
The timeline of processing events that is consistent with our results would start with concurrent activity in PPC and vSTR, where PPC extracts numeracy/magnitude information and vSTR extracts valuation information. Numeracy/magnitude information is fed into the vSTR to be utilized for valuation and value comparison. A decision is then formed when valuation information is compared between choices in the vSTR and is then fed forward to motor systems to implement the choice. This is consistent with evidence that PPC activity is more related to the numeric magnitude of options than to their value (Kanayet et al., 2014), which in turn is consistent with the localization of numeric representations (Cohen Kadosh et al., 2011). However, this specific interpretation is limited by the current study design, as numeric magnitude and monetary value were confounded. Additionally, lateralization of effects in the PPC appears to differ as a function of decision (left only) versus outcome (bilateral). During decision, gain and loss magnitudes need to be multiplied by their probabilities, which is consistent with left lateralization of PPC activity. Outcome evaluation, by contrast, only requires addition to or subtraction from running totals, which may be more related to bilateral PPC activity (Chochon et al., 1999). Future research designed to explore these effects would be needed to further delineate the potential for lateralization by numeric computation in the PPC.
Adding nuance to the literature regarding the role of the vmPFC in choice (e.g., Boorman et al., 2013; Padoa-Schioppa, 2011), our results indicate that vmPFC is unlikely to be necessary for the type of simple take/pass choices used in the current study. Indeed, our data are more consistent with a role for vmPFC in outcome evaluation rather than at decision. This result contrasts with other results in the literature in which vmPFC activity was related to expected loss at decision and take/pass gambles were not paid during fMRI scanning (further dissociating decisions from outcomes) (e.g., Tom et al., 2007). These results might be clarified by further examination of the differences between paradigms. For example, our task and that of Tom et al. (2007) differ in the difficulty of calculating the probabilistic relationship between expected values and outcomes (i.e., consistent 50/50 chances are potentially easier to calculate than the variable probability values in the current design). Wunderlich et al. (2009a,b) observed that vmPFC encoded the expected value of chosen options (i.e., after an action has been selected) in a task where decisions and outcomes were probabilistically and temporally dissociated, which is consistent with our results in that vmPFC contributions to choice evaluation come after decisions are made. These, along with other experimental design differences, highlight the need for decision-making paradigms that isolate components of value-based choice, and underscore the notion that deployment of neural resources for decision-making is highly dependent on circumstances rather than on a core region (or set of regions) always necessary for choice. Moreover, isolating components of value-based choice provides a potential base on which we can scaffold more complex decision processes to work toward an understanding of multifaceted, real-world decision-making from the ground up. Our data do not preclude the possibility that the vmPFC/OFC is necessary or sufficient for other value-based choices, e.g., when comparing options, when options vary in abstraction (food vs. money vs. points, etc.), or when time constraints are reduced (Jocham et al., 2014), though we have demonstrated that it is not necessary under all conditions. Further research comparing these differences in experimental manipulation within-subjects may help resolve this and other issues.
T.R. Koscik et al. NeuroImage 214 (2020) 116764
Activity in these regions is related to expected value at decision, trial-related activity in these regions predicts choice, and the indirect effect representing the mediation pathway from expected value through each of these brain regions to choice is significantly different from zero. Despite being highly correlated with activity in mediating regions, activity in visual cortex (D) does not mediate between expected value and choice. Moreover, neither left (E) nor right (F) ventromedial prefrontal cortex mediates decision. The leftmost column of graphs represents the relationship between expected value and neural activity at decision. The central column represents trial-related activity predicting choice. Shaded regions in the left and centre columns represent 95% confidence limits. The rightmost column represents histograms of the posterior distribution of indirect effects; dashed lines indicate zero or no indirect effect, solid lines indicate the lower 95% confidence limit for each distribution, and darkly shaded distributions indicate effects significantly different from zero.
Evidence of significant mediation is insufficient to determine a causal relationship, but can act to rule out models that are causally less plausible (i.e., non-significant results argue against causal mediation, though significant results do not prove causal mediation).
When the causal chain of events is clearly specified, as is assumed to be the case for expected value to brain activity to choice, interpretation is straightforward. One limitation of the current multilevel mediation method is that when the causal chain is less clear, as in interregional mediation, interpretation is less clear-cut and further research will be needed to confirm these directional relationships. Our multilevel mediation method effectively constrains the set of possible models to a small set of plausible models, which can then be compared using dynamic causal modelling. DCM is consistent with our interregional mediation analyses and indicates that a unidirectional flow of information from PPC to vSTR, which in turn flows to choice, is the most likely model.
Given the convergence of our multilevel mediation models along with consistent dynamic causal modelling results, the value-based choice network, at least for the simple take/pass decisions in our task, appears to be fully mediated by the ventral striatum.
Our data suggest that the vSTR, and not the PPC, performs the value computation most proximal to choice. After accounting for vSTR activity, the relationship between PPC and choice is eliminated, suggesting that the vSTR fully mediates the pathway. These results conflict with the conclusion that the PPC must mediate choice because PPC activity relates to between-subjects differences in reaction time and to activity in other value processing regions (Basten et al., 2010; Domenech et al., 2017; Rodriguez et al., 2015). In regard to decision-making, between-subjects differences in response time are not the correct level of analysis. Rather, trial-level changes in response times within-subjects reflect the aggregation processes necessary for decision-making. Moreover, previous analyses of relationships between PPC and value ignored their 'causal' direction.
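The pattern of full mediation described above, in which the PPC-choice relationship disappears once vSTR activity is accounted for, can be illustrated with a small simulation. This is a minimal sketch under strong simplifying assumptions: it uses single-level ordinary least squares on a continuous "choice tendency" rather than the multilevel and generalized models used in the study, and all path strengths and variable names are hypothetical.

```python
import numpy as np


def ols(y, *predictors):
    """Least-squares slopes of y on the predictors (an intercept is fitted, then dropped)."""
    design = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]


rng = np.random.default_rng(0)
n = 5000
a_true, b_true = 0.8, 0.5                     # hypothetical path strengths

ppc = rng.normal(size=n)                      # stand-in for PPC activity
vstr = a_true * ppc + rng.normal(size=n)      # vSTR is driven by PPC
choice = b_true * vstr + rng.normal(size=n)   # choice tendency depends on vSTR only

(c_total,) = ols(choice, ppc)                 # PPC -> choice, ignoring vSTR
(a_hat,) = ols(vstr, ppc)                     # PPC -> vSTR
c_direct, b_hat = ols(choice, ppc, vstr)      # PPC -> choice, controlling for vSTR
indirect = a_hat * b_hat                      # effect carried through vSTR

print(f"total effect    c  = {c_total:.3f}")
print(f"direct effect   c' = {c_direct:.3f}")
print(f"indirect effect ab = {indirect:.3f}")
```

In this linear setting the total effect equals the direct effect plus the indirect effect, so a direct effect near zero alongside a clearly non-zero indirect effect is the signature of full mediation.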
Future research could benefit from targeting trial-level differences in response time and applying the multilevel mediation methods demonstrated here to further probe the direction of interregional relationships. There is also uncertainty regarding the reliability of these measures. There is substantial measurement error associated with brain activity due to the physics underlying the BOLD response, and this likely results in underestimation of the mediation effect (Fritz et al., 2016). In addition, given the number of brain regions and neurons therein, it is likely that relevant variables are omitted. The present approach does not preclude the possibility that activity in other regions might cause both activity in our regions of interest and choices. Omitting such a variable may lead to overestimation of the mediated effect (Fritz et al., 2016). Independent, hypothesis-driven experiments specific to variables of interest are necessary to further elucidate the details of the value-based choice network.
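Both biases mentioned above can be demonstrated in a toy simulation: noise added to the mediator shrinks the estimated indirect effect, whereas an unmeasured common cause of the mediator and the outcome inflates it. The sketch below again assumes a single-level linear model with hypothetical effect sizes and variable names, so it shows the direction of the biases rather than their size in real BOLD data.

```python
import numpy as np


def ols(y, *predictors):
    """Least-squares slopes of y on the predictors (an intercept is fitted, then dropped)."""
    design = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]


def indirect_effect(x, m, y):
    """Product-of-coefficients estimate of the effect of x on y through m."""
    (a,) = ols(m, x)
    _, b = ols(y, x, m)
    return a * b


rng = np.random.default_rng(1)
n = 5000
a_true, b_true = 0.8, 0.5                      # hypothetical; true indirect effect = 0.4

x = rng.normal(size=n)                         # e.g. expected value
m = a_true * x + rng.normal(size=n)            # true mediator activity
y = b_true * m + rng.normal(size=n)            # outcome (choice tendency)
clean = indirect_effect(x, m, y)

# Case 1: the mediator is observed with measurement error.
m_noisy = m + rng.normal(size=n)
noisy = indirect_effect(x, m_noisy, y)

# Case 2: an unmeasured variable u drives both the mediator and the outcome.
u = rng.normal(size=n)
m_conf = a_true * x + u + rng.normal(size=n)
y_conf = b_true * m_conf + u + rng.normal(size=n)
confounded = indirect_effect(x, m_conf, y_conf)

print(f"no error, no confounder : {clean:.3f}")
print(f"noisy mediator          : {noisy:.3f}")
print(f"omitted common cause    : {confounded:.3f}")
```

Because the two biases act in opposite directions, their net effect in a given data set cannot be read off the estimate itself, which is one reason independent experiments are needed.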
Our data suggest that the vSTR implements the selection process most proximal to choice as demonstrated by mediating between expected value and choice as well as between activity in PPC and choice. This vSTR function is consistent with the view that dopamine activity in the vSTR influences action selection and modulation of choice behaviours in conjunction with the dorsal striatum (Frank and Claus, 2006;Salgado and Kaplitt, 2015). Furthermore, our data are consistent with the notion that the vSTR is a necessary common path between cortical and limbic value processing and the motor system (Groenewegen et al., 1996;Nicola, 2007) including influences on cortical motor systems and potential direct (non-cortical) influences on locomotor activity via midbrain regions (Haber et al., 1990;Hikosaka et al., 2000;Takakusaki et al., 2003;Takakusaki et al., 2004). Clarifying what this role is or what the relative contributions of dorsal and ventral striatum are will require further data.
Given the lack of obvious direct connections between PPC and vSTR, this pathway is likely multi-synaptic, and further experiments will need to be designed that target these portions of the pathway. Finally, our results do not distinguish between two possibilities: the vSTR may be involved in evaluating all stimuli regardless of choice, or it may be inextricably linked to implementing value-based choice and active only when value-based choices are required. Future, hypothesis-driven research could leverage these multilevel mediation techniques to further elucidate the role of the vSTR in value-based decision-making and illuminate its role in relation to action selection and motor output regions.

Conclusion
Our results show that neural activity in the vSTR and PPC mediates the relationship between expected value and choice. Moreover, the vSTR provides a final common path between neural representations of value and choice. In addition, we provide an application of linear and generalized multilevel modelling to functional neuroimaging to account for the hierarchically structured error inherent to functional neuroimaging, and demonstrate that mediation models can provide evidence consistent with causal interpretations of brain activation.

Funding sources
This work was supported by the National Science & Engineering Research Council (Canada) [CC 11,378 CFC 205,586 Fund 495,218, Fund 458,036].

Declaration of competing interest
The authors declare no competing financial interests.
For dynamic causal modelling of interregional relationships, we included only models where vSTR mediated between PPC and choice, consistent with our multilevel mediation results. These models include: A) bidirectional PPC/vSTR connections, B) unidirectional connection from vSTR to PPC, C) unidirectional connection from PPC to vSTR, D) no connection between PPC and vSTR, E) bidirectional PPC/vSTR connections, and F) unidirectional connection from PPC to vSTR. In models A-D, both PPC and vSTR receive value-related information as input, but in models E and F only the PPC receives value information as input, which it relays to vSTR.