Decreased reward circuit connectivity during reward anticipation in major depression

Highlights • MDD patients showed decreased temporal difference-related ventral striatum activation during reward anticipation and delivery vs. controls.• In gPPI analyses, during reward anticipation, MDD patients showed decreased functional connectivity in the reward-network.• Exploratory analyses showed functional connectivity patterns were mainly apparent in the MDD group that used antidepressants.


Introduction
One of the core characteristics of major depressive disorder (MDD) is anhedonia, the inability to experience pleasure. Anhedonia affects approximately 37 % of individuals diagnosed with MDD (Pelizza and Ferrari, 2009). A dysfunction of the reward system is thought to comprise the neural basis of anhedonia (Der-Avakian and Markou, 2012;Pizzagalli, 2014;Treadway and Zald, 2011;Whitton et al., 2015). The presence of anhedonia has been found to predict poor treatment response in MDD patients (Spijker et al., 2001;Uher et al., 2012), and impairments in reward-relates processes appear to be insufficiently addressed by current treatments (Calabrese et al., 2014).
Neural reward processing has been related to phasic firing of dopaminergic neurons (Schultz, 1998;Tobler et al., 2005) and is often studied with the Monetary Incentive Delay (MID) task. The MID gives the participant an opportunity to either win or lose rewards based on how fast they respond to a target, and covers two phases of reward processing: anticipation versus delivery (Knutson et al., 2001). In incentive trials, dopamine activity is dependent on the combination of reward anticipation (expectancy) and the subsequent delivery (i.e., consumption or outcome) of the reward. When a reward is anticipated but omitted, there is a decrease in dopaminergic firing (referred to as a negative prediction error (PE)) whereas a phasic burst of dopamine (i.e., positive PE) is observed when the reward delivery is better than expected (Schultz, 1998). Positive and negative PEs can be used as parametric modulators in order to reflect the magnitude of dopaminergic activation. PEs have been predominantly used in fMRI related reinforcement learning models in order to capture reward learning signals (Dombrovski et al., 2015;Geugies et al., 2019;Gradin et al., 2011;Kumar et al., 2008;Rothkirch et al., 2017). However, PEs also exist in incentive fMRI tasks without an explicit learning compound like (card-) guessing tasks or the MID task (Chase et al., 2013;Staudinger et al., 2009;Ubl et al., 2015;Yacubian et al., 2006;Cao et al., 2019), although PEs here are often not distinctively examined.
Whereas studies contrasting experimental conditions from incentive tasks have revealed important information about the neural correlates of reward processing, temporal difference modeling of reward-related PEsignals might give a more accurate representation of the reward system (Staudinger et al., 2009). So far, only few studies investigated rewardrelated PE signaling in depression. Reinforcement learning studies found increased activation of the VTA (Geugies et al., 2019;Kumar et al., 2008) and decreased VS (Gradin et al., 2011;Kumar et al., 2008) and DS (Gradin et al., 2011) activity in (remitted) MDD. Reward expectancy studies revealed reduced frontal and striatal activity during anticipation of gain (Chase et al., 2013;Ubl et al., 2015) and losses (Ubl et al., 2015) in MDD. Moreover, these altered reward-related processes in depressed individuals seems to be substantially associated with anhedonia. Several studies report a negative correlation between anhedonia and basic reward activity in the VS (Der-Avakian and Markou, 2012), as well as temporal difference-related VS activity (Rothkirch et al., 2017), during reward processing in MDD. However, one recent study found that higher anhedonia was associated with higher VS activity during anticipation in MDD (Ubl et al., 2015).
Despite these promising insights regarding neural correlates, there is evidence that MDD is associated with alterations in connectivity between components of the reward circuitry in addition to dysfunction of individual brain areas (Admon et al., 2015b). Admon and colleagues found decreased connectivity between the caudate (i.e. DS) and the dorsal ACC in response to monetary loss outcome and increased connectivity between these two regions in response to monetary gain outcome in MDD patients (Admon et al., 2015b). In line with this finding, Dombrovski and colleagues demonstrated disrupted connectivity between the DS and prefrontal cortex during probabilistic reversal learning in patients with late-life depression (Dombrovski et al., 2015). Despite these interesting findings, it remains largely unexplored if alterations in connectivity between other elements of the reward circuitry besides the DS (i.e. the VS and VTA), exist and whether these alterations can be linked to depression. Therefore, this study aimed to (I) investigate PE-related striatal and VTA activation in MDD in response to anticipation and delivery of monetary rewards (providing a more accurate representation of the reward system), and explore the association with anhedonia. Furthermore, we (II) also wanted to investigate, with an exploratory approach, whether MDD is characterized by alterations in connectivity within the reward circuitry, by looking at abnormal striatal (VS and DS) and VTA connectivity in response to rewards. In line with the literature, we expected reduced PE-related activity in MDD patients compared to healthy controls (HC) in the VS (Kumar et al., 2008;Pizzagalli et al., 2009) and DS (Admon et al., 2015b;Pizzagalli et al., 2009) and increased activation of the VTA (Kumar et al., 2008) during both reward anticipation and outcome. In addition, we expected a negative correlation between reward activity and anhedonia severity during reward processing (Der-Avakian and Markou, 2012;Rothkirch et al., 2017). Moreover, decreased reward-circuitry connectivity in MDD patients compared to HC was expected for the VTA, the VS and the DS (Admon et al., 2015b).

Participants
Data was derived from the Depression In the Picture (DIP) neuroimaging study conducted at the University Medical Center Groningen investigating the neural correlates of depression. Permission for the study was obtained from the local ethics committee and written informed consent obtained from all participants. Twenty-four MDD patients were recruited through specialized mental health care institutions and advertisements at the participating institutions and satisfied the following criteria: (1) presence of at least mild depressive symptoms defined as a Beck Depression Inventory (BDI-II) (Beck et al., 1996) score > 13 at screening, (2) current depressive disorder diagnosis according to the MiniScan (Nienhuis et al., 2010), administered by trained postgraduate students, and 3) age ≥ 18 years. Twenty-four ageand sex-matched HC were recruited by means of advertisements at public places and in local newspapers. Inclusion criterion for HC was a BDI-II < 9 and HCs were excluded if there was a personal history of psychiatric disorders. General exclusion criteria for both groups were: (1) a current or lifetime diagnosis of drug dependence, excluding nicotine dependence or history of alcohol dependence/abuse, (2) current neurological problems that may interfere with task performance, (3) inadequate comprehension of the Dutch language, (4) MRI contraindications such as metal implants, (5) presence of any cardiovascular disease. Exclusion criteria specific for MDD patients were: (1) presence of current or lifetime psychiatric disorders other than MDD or anxiety disorders, (2) concrete suicidal plans, (3) psychotropic medication use other than SSRI/SNRI/TCA or infrequent benzodiazepine use.

Task
After a short practice run before scanning, participants performed a monetary incentive delay (MID) task to asses reward processing. The task was a shortened version of the task design previously described by Pizzagalli et al. (2009). The task consisted of 4 blocks of 13 trials with a total of 20 reward trials, 20 neutral trials, and 12 loss trials. Each trial consisted of the presentation of a cue (+€ / ±€ / − € indicating a reward, neutral or loss trial), a target presentation (blue square), and reward feedback (i.e., +€1.85). Cues and feedback were presented for 1.5 s and the target was presented for a fixed duration of 0.5 s. Monetary outcomes trials varied for successful reward (+€1.75, +€1.85, +€1.95 and +€2.05) and loss (− €1.60, − €1.70 and − €1.80), but were fixed at +€0.00 for non-reward and neutral trials. We used fixed reward success rates because monetary outcomes were determined by task order (not response time). The 80 % reward success rate was chosen to ensure that task activation would be detectable for both anticipation and outcome phases of the task. At very high reward success rates, task activation will be much stronger in the anticipation phase and sensitivity is lost in the outcome phase. However, when reward success rates are at chance level, task activation is expected to be much stronger in the outcome phase and by limited strength of the learned association between cue and outcome, sensitivity is lost in the anticipation phase (Schultz, 2007). Unsuccessful reward trials ensured sufficient variability in reward prediction errors. The inter-stimulus interval varied between trials (inter-stimulus interval between cue and target: 3.5 s -9.5 s; inter-stimulus interval between target and feedback: 2.5 s -8.5 s) to prevent expectancy effects, as was the duration of the fixation cross presented between trials (3 s -7 s). Stimuli were presented in E-prime 2 (Psychology Software Tools, Pittsburgh, PA). Given our aims, neural correlates of loss trials were not examined, but maintained for comparability with previous MID studies and to prevent participants from associating neutral trials with a loss experience. Participants were instructed to press the button on an MRIcompatible button box as quickly as possible after target presentation on each trial, in order to maximize their chances of obtaining a reward. If a participant neglected to press the button, no reward could be obtained for that trial. Reward success rates were fixed at 80 % to ensure a total obtained reward of €10 per participant. This reward was added to the financial compensation for participation, to increase motivation of the participants.

Temporal difference learning model
In order to parametrically modulate fMRI signals, PEs after (repeated) rewards and during (unexpected) non-rewards were computed for the time series of stimuli. Unexpected non-rewards occurred when the button was pressed on time but no reward was obtained. The calculation of temporal difference PEs for all trials was derived from Staudinger and colleagues (Staudinger et al., 2009). This model defines a reward expectation EV that was defined as: where m is corresponding to the expected gain and p is the gain probability. As expected gain we chose average win and loss values from the practice run. The gain probability was set to 0.8 as 80 % of the reward trials resulted in an actual win and the other 20 % in an omission.
The PE was defined as: where R is corresponding to the amount of reward that was actually received.

Analysis of sample characteristics
Sample characteristics and behavioral data was analyzed in SPSS package v22.0 (SPSS Inc., USA). We used independent samples t-tests, χ2-tests and non-parametric Mann-Whitney U test to compare demographic and clinical variables between MDD patients and the HC group.

Analysis of behavioral data
For anhedonia scores, we used a subscale measurement of the Beck Depression Inventory (loss of pleasure, interest, energy and libido; (Pizzagalli et al., 2009)). We used repeated measures analysis of variance to examine main effects of group (MDD and control) and condition (reward and neutral) and a group × condition interaction with reaction times as dependent variable.

Imaging data
Pre-processing and analysis were performed using SPM12 (http s://www.fil.ion.ucl.ac.uk/spm) implemented in Matlab R2013a (The MathWorks Inc., Natick, MA). First the PAR/REC files were converted to NIfTI format. Both structural and functional images were reoriented in AC-PC alignment. Functional images were realigned. To detect possible motion artefacts, framewise displacement (FD) was calculated (Power et al., 2012). Motion was deemed excessive when FD > 0.9 for a certain volume (Siegel et al., 2014). However, the number of volumes with excessive motion was minimal (<10 %) for all participants. Median FD was 0.133 (IQR 0.036) for MDD patients and 0.127 (IQR: 0.039) for HC. We observed no significant difference in FD between the MDD patients and HC group (p = 0.224). Functional images were co-registered to the structural T1 images. All images were spatially normalized to Montreal Neurological Institute (MNI) space. Finally, all images were smoothed using an 8 mm Full Width Half Maximum Gaussian kernel.

Temporal difference-related activity
For each participant, first-level hemodynamic responses for the different conditions were modelled with general linear models. Reward anticipation, reward delivery, neutral anticipation, neutral delivery, loss anticipation and loss delivery were defined as regressors. Onset times were for the anticipation: start of cue presentation and for delivery: start of feedback. Durations for all regressors were 0.75 s. E-prime log files were used to extract onset times and durations. In separate GLMs, prediction errors were entered into the model as parametric modulators. The parametric modulation regressor was mean-corrected by SPM to be orthogonal to the main outome regressor (Staudinger et al., 2009). Low frequency noise was removed via a high pass filter (128 s). Furthermore, realignment parameters, their first derivatives and FD calculations were added to the model to address residual movement not corrected by realignment. For all participants, separate first-level contrasts for the total temporal difference-related activation (RewardAnticipation + RewardDelivery > NeutralAnticipation + NeutralDelivery) and for reward anticipation (RewardAnticipation > NeutralAnticipation) and reward delivery (RewardDelivery > NeutralDelivery) were defined and taken to second level.
A priori regions of interest (ROI) were the striatum (caudate and nucleus accumbens) and VTA. ROI selection was based on the Reinforcement Learning Atlas (Pauli et al., 2018). Accordingly, the boundaries of the Caudate (dorsal striatum) are clear with its tail traveling caudally and ventrally around the lateral ventricle. Exception is its ventral boundary with the nucleus accumbens (ventral striatum). We used the definition that the caudal limit of the NAC coincides with the appearance of the anterior commissure in coronal sections. The VTA lies ventral to the raphe nuclei (RN) at the ventromedial limit of the parabrachial pigmented nucleus (PBP) in coronal sections. Rostrocaudally, the VTA extends from the approximate rostrocaudal midpoint of the RN to just beyond the caudal limit of the RN (coronal sections). The boundary with the RN is a well defined and explicit, but the transition from PBP to VTA are more implicit (Pauli et al., 2018).
At second-level, we used a one sample t-test to investigate main effects of task (RewardAnticipation + RewardDelivery > Neutral contrast). Main effect images were thresholded at P < 0.001 uncorrected. We used independent two-sample t-tests to determine group differences. As we had clear a priori regions of interest, a small volume correction (SVC) was applied with significance defined as P < 0.05 FWE corrected.

Generalized psycho-physiological interaction (gPPI) analysis
We investigated group differences in temporal difference-related connectivity during the reward task with a generalized psychophysiological interaction (gPPI) analysis (McLaren et al., 2012) with VTA, ventral striatum and dorsal striatum as seeds during reward versus neutral, both in anticipation and delivery. The seeds were extracted from the Reinforcement Learning Atlas (Pauli et al., 2018) and were resliced to match the dimensions of the functional data. On first level, separate gPPI models for each seed were estimated for each participant. Each first level model contained regressors for the task conditions, one regressor for the seed, and regressors for the seed × condition interaction. Furthermore, realignment parameters, their first derivatives and FD calculations were added to the model to address residual movement not corrected by realignment. Effects for the obtained interaction variable were convolved using a canonical hemodynamic response function (HRF). For all participants, first-level contrasts for reward anticipation (RewardAnticipation > NeutralAnticipation) and reward delivery (RewardDelivery > NeutralDelivery) separate were defined and taken to second level (so no contrast with TD related activity was taken). On second level, we used independent two-sample t-tests to determine group differences. An initial threshold was set to P < 0.001 uncorrected (voxel level), where group differences were defined significant at P < 0.017 (Bonferroni correction P 0.05/3 ROIs), FWE cluster-level corrected.
In order to interpret temporal difference-related activation and connectivity findings, we also investigated correlations with anhedonia with separate multiple regression analyses with temporal differencerelated activation signal and connectivity findings respectively as the dependent variable, while anhedonia scores, group and the group*anhedonia interaction were examined.

Exploratory analysis: Effect of medication
Because of two recent meta-analyses that indicate that some types of antidepressants may have a small positive effect on cognitive functioning (Keefe et al., 2014;Rosenblat et al., 2015), we chose to do an exploratory analysis by splitting up the patient group into a medication free group N = 14) and an antidepressant using MDD group (MDDmed+, N = 10) in order to rule out any medication effects on the results.

Sample characteristics
No significant differences were observed between MDD patients and HC (Table 1). The exploratory analysis with three groups (HC vs MDD with/without medication) also revealed no significant differences between groups (Table 1).

Behavioral results
We observed no significant differences in reaction times between the two groups (MDD versus HC) and observed no significant group × condition interaction (Fig. 1). There was a main effect of condition (F (2,92) = 10.79, P < 0.001). Post-hoc least significant difference (LSD) comparisons revealed that all participants reacted significantly faster to reward trials and to loss trials compared to neutral trials (P = 0.000 and P = 0.019, respectively).

Main effect of task
We found a main effect of task in reward related areas, especially when we incorporated the parametric modulation of the BOLD-response using the prediction errors (Fig. 2).

Temporal difference-related activity results
We found a trend towards decreased temporal difference-related activation in the VS in MDD patients compared to HC during reward anticipation and delivery combined (P FWE,SVC = 0.052, Cohen's d = 0.82, Table 2, Fig. 3). Cohen's d for the whole right VS (36 voxels) was 0.56.

Functional connectivity (gPPI) results HC vs MDD
Our gPPI analyses revealed that during reward anticipation, MDD    patients exhibited decreased functional connectivity between the VS and precuneus/superior occipital gyrus/cerebellum, angular/middle orbital gyrus, superior/middle frontal gyrus/medial prefrontal cortex (mPFC)/ ACC, superior/middle temporal gyrus and left insula compared to HC (Table 3, Fig. 4). Moreover, MDD patients showed decreased functional connectivity between the VTA and left insula compared to HC during reward anticipation (Table 3, Fig. 5). No group differences were found for the DS seed. For all seeds, no group differences were found in functional connectivity during reward delivery.

Exploratory analysis: effect of medication
When separating the medication free (MDDmed-) from the antidepressant using MDD patients (MDDmed + ) we found no significant effect of medication status on the temporal difference-related activity in ventral striatum or caudate (data available on request). For the gPPI, we found that compared to HC, MDDmed + and MDDmed-patients showed decreased functional connectivity between the VS and precuneus and areas of the occipital lobe during reward anticipation (Table 4). Moreover, there was a trend in decreased functional connectivity between the VS and mPFC, insula and thalamus (p = 0.025-0.063) in the MDD + patients. This trend was not visible in MDD-patients (all p > 0.187). Furthermore, MDDmed + patients showed decreased functional connectivity between the VTA and left insula compared to HC during reward anticipation (Table 4). No group differences were found in   functional connectivity during reward delivery.

Correlation with anhedonia
We found no correlation between temporal difference-related reward activation/connectivity and anhedonia scores, neither when the analysis was corrected for age and gender.

Discussion
The present study explored temporal difference-related responses of the reward system during a monetary incentive delay task. We demonstrated that parametric modulation of the BOLD-response with prediction errors optimizes monetary incentive task activation. Using the temporal difference, we found decreased temporal difference-related activation in the VS in MDD patients compared to HC during reward anticipation and delivery combined. We found no group differences in temporal difference-related VTA activation. Secondly, we exploratory investigated connectivity between reward circuitry brain areas with gPPI. We revealed that during reward anticipation, MDD patients exhibited an overall decrease in reward circuit connectivity compared to HC. Exploratory analysis separating medication free patients from patients using antidepressants revealed decreased functional connectivity between VTA and left insula in the MDD group that used antidepressants, with an additional trendwise decrease in functional connectivity between the VS and mPFC, insula and thalamus. Of note, all group differences were not related to the reward delivery condition, suggesting that these results are specific to reward anticipation.
The decrease in temporal difference-related activation in the VS is supported by a robust body of evidence showing decreased VS activation in MDD during basic reward processing (Pizzagalli et al., 2009;Ubl et al., 2015). Although our results have to be interpreted with caution, as this effect narrowly missed statistical significance (P = 0.052 FWE/SVCcorrected), this finding is bolstered by the fact that it also replicates previous results specifically investigating temporal difference-related VS activation (Kumar et al., 2008). No differences in reaction times were observed between groups, indicating that fMRI findings were not confounded by differences between groups in task difficulties. A similar lack of group differences on behavioral responses has been reported before (Knutson et al., 2008;Pizzagalli et al., 2009;Ubl et al., 2015).
Impaired reward functioning as a construct relevant for depression is further corroborated by our gPPI findings of decreased functional connectivity between the reward system and several other brain areas including the insula. It should be noted that the good fit of our gPPI analyses might be a result of relatively poor fit of the basic reward anticipation model. The insula has been linked to anticipating future rewards (Tanaka et al., 2004) and delayed gratification (Wittmann et al., 2007). Moreover, a recent meta-analysis of 42 studies has demonstrated functional connectivity between the VS and the insula (Cauda et al., 2011). This connectivity is critical in detecting salient external stimuli and adjust behavior to these incentives (Cho et al., 2013). Our observation of decreased VS-insula connectivity during anticipation of rewards in MDD suggests that MDD patients have difficulties in integrating salient information into motivational processes to shape behavior. Besides this involvement, insula activity also appears during PE encoding of reward (Haruno and Kawato, 2006;Jones et al., 2011), suggesting encoding of a salience PE (Gu et al., 2016;Metereau and Dreher, 2013). The decreased VTA-insula functional connectivity in MDD suggests an impairment in encoding these salience PEs.
We also found decreased functional connectivity between the VS and the mPFC and superior/middle frontal gyrus during reward anticipation in MDD patients. Animal studies provide fundamental evidence that the mPFC is part of the reward system and is involved in reward seeking and reward effort (Tzschentke, 2000). The mPFC receives dopaminergic projections from the VTA and sends glutamatergic projections back to the VTA and VS These functional interactions have been suggested to strongly modulate the mesocorticolimbic dopamine circuit (Tzschentke and Schmidt, 2000) and have been suggested to be specifically related to reward anticipation (Balleine et al., 2007;Knutson et al., 2001;Wittmann et al., 2007). Animal studies report that inactivation of the mPFC reduces the firing rate of VS neurons in response to reward-predictive cues (Ishikawa et al., 2008). Disrupted functional connectivity from the VS to the mPFC during anticipation could hamper activation of the mPFC, which in turn may decrease alter the feedback projections to the VTA and VS resulting in mesolimbic reward circuitry abnormalities. These current results substantiate the notion that dysfunctions in frontostriatal activity during reward anticipation are an important marker of MDD (Zhang et al., 2013).
Besides their role in the reward circuitry, the ACC/mPFC are, together with the precuneus, important areas of the default mode network (DMN). In healthy controls, functional connectivity has been reported between the VS and DMN regions including the precuneus and mPFC (Di Martino et al., 2008). A previous study in depressed individuals found that compared with controls, depressed subjects showed decreased connectivity between the precuneus/PCC and the striatum (Bluhm et al., 2009), which is in line with the current results. The DMN has been found to support internal mental activity and is also critical for self-relevance and self-referential processing (Raichle, 2015). It is possible that decreased VS-DMN connectivity causes an impairment in assigning salience to external and internal stimuli, potentially leading to aberrant salience.
Our analyses also revealed decreased functional connectivity between the VS and the cerebellum during reward anticipation in MDD  patients. Interestingly several recent studies have pointed to the important connection between the cerebellum and striatum during the anticipation of reward. Animal studies provided insight in the role of cerebellum in the encoding of reward which may serve to be integrated in goal directed behavior (Wagner et al., 2017) and there are also direct projections from the cerebellum to reward centers in the brain (Carta et al., 2019). In humans there is also accumulating evidence that the cerebellum is active during reward anticipation phase of the monetary incentive delay tasks (Wilson et al., 2018), while there are reciprocal connections between cerebellum and striatum (for a review see Milardi et al., 2019). It is possible that decreased VS-cerebellar connectivity impairs the flexible integration of reward and cognitive resources in the context of motivated behavior in depression. Schutter (2016) has already pointed to the important role of the cerebellum in depression, via its contributing role to reward-related predictive coding in the service of homeostasis. When separating the medication free patients from the patients using antidepressants, we found that the decreased connectivity patterns were mainly apparent in the MDD group that used antidepressants. Given the association between antidepressant use and diminished neural responses of the reward system (McCabe et al., 2010), and the suggestion that SSRI treatment blunts dopaminergic activity, explaining symptoms such as anhedonia and affective blunting (Goodwin et al., 2017), it can be argued that reward related connectivity may be affected by antidepressant treatment, however, this remains entirely speculative.
No differences between groups were observed in temporal difference-related activity during reward delivery. This finding is in line with studies by Stoy and colleagues (Stoy et al., 2012) and Ubl and colleagues (Ubl et al., 2015), who also report depression related dysfunctions during reward anticipation but not during the receipt of reward. Given that other studies report decreased fronto-cingulatestriatal activation during the reward delivery phase (Forbes et al., 2009;Knutson et al., 2008;Pizzagalli et al., 2009), and considering the modest sample size of our study, our null findings should be interpreted with caution. Future studies should reveal the extent of dysfunctions during reward delivery in MDD.
The present study did not identify a correlation between brain activation/connectivity of the reward system and hedonic capacity. This lack of an association is in contrast to other papers (Chase et al., 2010;Ubl et al., 2015). However, differences in task paradigms and anhedonia questionnaires might account for these differences. E.g., Chase and colleagues (Chase et al., 2010) used a probabilistic selection task and Ubl and colleagues (Ubl et al., 2015) employed a modified version of the MID task we used. In both studies hedonic capacity was measured with the Snaith Hamilton Pleasure Scale (SHAPS), while we assessed anhedonia with the BDI anhedonia subscore, which resulted in a narrow range of anhedonia scores. The SHAPS embodies a more extensive measurement of consummatory anhedonia which may have been more sensitive in mapping anhedonia levels.

Strengths and limitations
The current design enabled us to explore functional connectivity alterations in the reward circuitry, which is a novel feature compared to measuring altered activity of reward related brain areas during reward processing, as supported by previous work (Admon et al., 2015a;Pizzagalli et al., 2009;Smoski et al., 2009;Zhang et al., 2013). Furthermore, this study is novel in modeling temporal difference signals in a MID task which might give a more accurate representation of rewardrelated brain activity and connectivity. Nevertheless, potential limitations exist. First, no temporal difference-related VTA task activity was found. The nature of the task used in this study may account for the absence of temporal difference-related activity in the VTA. Traditionally, the MID task has been designed to investigate changes in neural activity in response to basic processing of reward. Activation in the VTA, elicited from firing of dopaminergic neurons during reward-related learning, is most likely best reflected by a classical conditioning paradigm, for example used by Kumar et al. (2008). Second, in our approach we restricted our analyses to a previously validated TD-model of reward anticipation (Staudinger et al., 2009), however more recent drift diffusion models like a complete serial compound (SCS) TD model might show better fit of the data (Luzardo, 2018). Third, ten out of twenty-four MDD patients were receiving antidepressants at time of scanning. Splitting up the patient group into two group in order to rule out any medication effects on the results, showed detrimental effects of antidepressants on reward processing. However, this resulted in small sample sizes per subgroup. Interpretation of these results should therefore be done with caution until they can be replicated in larger samples. Third, the overall sample size of our study with 48 participants in total, and 24 per condition, is generally at the lower end of the spectrum for a neuroimaging study and therefor runs the risk of type 2 error, i.e. false negative findings. Because, however, our study has a clear theoretical framework, clear hypothesis, and we found robust task activations (Fig. 2), we are confident in the findings of our study.

Conclusion
The present study showed that MDD is characterized rather by alterations in reward circuit connectivity than isolated activation impairments in brain areas underlying the reward-system. These findings represent an important extension of the existing literature since improved understanding of neural pathways underlying depressionrelated reward dysfunctions, may help currently unmet diagnostic and therapeutic efforts. The finding that antidepressants might decrease connectivity in the reward-system requires future research with primary interest in the effects of antidepressants in larger samples.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
Data will be made available on request.