Orbitostriatal encoding of reward delayed gratification and impulsivity in chronic pain

Central robust network functional rearrangement is a characteristic of several neurological conditions, including chronic pain. Preclinical and clinical studies have shown the importance of pain-induced dysfunction in both orbitofrontal cortex (OFC) and nucleus accumbens (NAc) brain regions for the emergence of cognitive deficits. Outcome information processing recruits the orbitostriatal circuitry, a pivotal pathway regarding context-dependent reward value encoding. The current literature reveals the existence of structural and functional changes in the orbitostriatal crosstalk in chronic pain conditions, which have emerged as a possible underlying cause for reward and time discrimination impairments observed in individuals affected by such disturbances. However, more comprehensive investigations are needed to elucidate the underlying disturbances that underpin disease development. In this review article, we aim to provide a comprehensive view of the orbitostriatal mechanisms underlying time-reward dependent behaviors, and integrate previous findings on local and network malplasticity under the framework of the chronic pain sphere.


Introduction
Chronic pain disrupts the daily lives of patients and continues to impose a burden on modern-day societies. With no survival benefits, chronic pain instead entraps individuals in an unceasing cycle characterized by catastrophizing and suffering, profoundly eroding their quality of life (Elman and Borsook, 2018; Woolf, 2011). A myriad of animal and human studies have fostered a deeper comprehension of how pain hijacks multiple neurophysiological mechanisms, ultimately leading to a spectrum of cognitive impairments, particularly manifesting as deficits in decision-making (Bushnell et al., 2013; Dourado et al., 2016; Leite-Almeida et al., 2009; Moriarty et al., 2017), attention (Bushnell et al., 2013; Moriarty et al., 2017), learning (Bushnell et al., 2013), and memory (Alemi et al., 2023; Cardoso-Cruz et al., 2013; Cardoso-Cruz et al., 2019; Leite-Almeida et al., 2009). The capacity to adapt our behavior within an ever-changing environment hinges on the continuous maintenance, manipulation, and updating of information. Consequently, organisms strive to optimize their success by ensuring that their decisions yield the most desirable and advantageous outcomes across both short-term and long-term scales. Integrated value, specifically the intricate interplay between time and reward, operates as a currency for ensuring optimal decision-making (Hirokawa et al., 2019). However, chronic pain is considered to disturb this neural trade-off by rendering previously rewarding stimuli less gratifying, diminishing the attractiveness of delayed rewards, and excessively amplifying the cost associated with waiting for postponed rewards. This shift in preference tilts the balance towards instant gratification (Borsook et al., 2016; Martucci et al., 2018).
The orbitofrontal cortex (OFC) is a key region participating in numerous cognitive processes, including associative learning (Izquierdo, 2017; O'Doherty et al., 2017; Stalnaker et al., 2015), expectation representation (Izquierdo, 2017; Rudebeck and Murray, 2014), and emotional risk assessment (Pais-Vieira et al., 2009; Rolls, 2019). In addition, the OFC also contextualizes pain levels, integrating them with rewarding events and corresponding emotional states (Rolls, 2023; Wakaizumi et al., 2019). The intersection of emotional and reward frameworks within the OFC makes this region a hallmark for optimizing outcomes and discerning valence. Moreover, there is evidence indicating alterations in orbitostriatal communication strength in chronic pain conditions (Chang et al., 2014; Ong et al., 2019). Given its strong connection with gratification, the OFC exerts top-down modulatory effects over reward-related areas, such as the nucleus accumbens (NAc), to determine the selection and sustainment of the contextually most advantageous options (Knutson et al., 2001; Stopper and Floresco, 2011). The NAc complex compiles and integrates information from cortical, temporal, and limbic regions, thereby facilitating efficient reward-seeking behaviors (Salgado and Kaplitt, 2015; West et al., 2018). Beyond its primary role as a pleasure center, the NAc is also engaged in pain modulation and the processing of pain-related emotional events (Harris and Peng, 2020). Collectively, the orbitostriatal circuitry can therefore be considered a dominant neural underpinning of integrated value encoding.
This short review focuses on the OFC-to-NAc circuit, encompassing its structural and functional connections, as a pivotal interface for the emergence of time-reward cognitive impairments within chronic pain states. The initial sections delve into the distinct contributions of the OFC and NAc to reward valence and temporal encoding, followed by an integrated overview of orbitostriatal circuitry-related disturbances in the context of chronic pain.

Orbitofrontal cortex
The OFC is a crucial brain area in emotional and reward processing where multiple streams of external sensory information and reward information converge. Located in the ventral portion of the prefrontal cortex (PFC), the OFC in humans and other primates includes Brodmann areas 11, 47/12 and 13 (Öngür and Price, 2000). Although functionally related, it can be distinguished from other regions of the PFC, such as the dorsolateral PFC, the ventrolateral PFC and medial PFC, through neural connections and participation in specific functions (Öngür and Price, 2000). Conversely, the OFC in rodents is located in the dorsal bank of the rhinal sulcus and has completely agranular properties (Izquierdo, 2017). Despite the differences between species, its location and connectivity suggest the rat OFC is partially homologous to the non-human primate OFC (Öngür and Price, 2000; Price, 2007; Rudebeck and Rich, 2018). Furthermore, the OFC can be subdivided into different cytoarchitectonic regions with specific functional differences (Izquierdo, 2017). Several animal neurotracing studies have elegantly demonstrated the variety of anatomic and neurochemical afferents to the OFC from cortical and subcortical regions (Fig. 1) (Barreiros et al., 2021; Cavada et al., 2000; Morecraft et al., 1992; Murphy and Deutch, 2018). These inputs provide representations of the stimuli's identities, independent of their reward value, and include the amygdala, prelimbic and infralimbic cortices, hypothalamus, thalamus, pyriform cortex, inferior temporal cortex, somatosensory cortex, and insula. Additionally, midbrain dopaminergic and non-dopaminergic neurons also innervate the OFC and medial PFC without collateralization (Murphy and Deutch, 2018). This supports the observations of selectivity concerning distinct cognitive components, such as the significant role of the medial PFC in working memory maintenance/manipulation mechanisms, and the involvement of the OFC in cognitive judgement bias and reward valuation (Goldman-Rakic, 1995; Golebiowska and Rygula, 2017). Conversely, the OFC neural output is modulated by a local inhibitory network composed of fast-spiking GABAergic interneurons (Varga et al., 2017; Wright et al., 2021). In addition, glutamatergic projections originating from the OFC densely target regions such as the olfactory tubercle, medial PFC, NAc, amygdala, thalamus, hypothalamus, periaqueductal gray, laterodorsal tegmentum and ventral tegmental area (VTA) (Fig. 1) (Hoover and Vertes, 2011). This dense convergence of limbic, sensory and motor networks reinforces the role of the OFC in action-outcome learning and emotional regulation (Rudebeck and Murray, 2014). Based on its unique positioning, the OFC serves as a hub for encoding information related to emotional states, time-reward dependencies, and outcome expectancy (Rolls, 2023). It accomplishes this by transmitting updates to key structures, signalling the current valuable options (Roesch and Olson, 2004; Rolls, 2023; Schoenbaum and Roesch, 2005; Simon et al., 2015). Over time, distinct neuronal ensembles within the OFC refine their firing patterns in anticipation of receiving desired or undesired outcomes, while also beginning to fire when predictive cues are exhibited (Roesch et al., 2007; Schoenbaum et al., 2003; Schoenbaum et al., 2009; Takahashi et al., 2009).
Several rodent studies have shown that lesioning the OFC or inducing transient/permanent inactivation results in reduced risk assessment (Barrus et al., 2017; Pais-Vieira et al., 2007), preference for high-magnitude rewards (Mar et al., 2011; Pais-Vieira et al., 2007), impaired reversal learning (Dalton et al., 2016; Winstanley et al., 2004), and diminished confidence based on waiting time without disruption of choice accuracy (Lak et al., 2014; Miyazaki et al., 2020). Animals with OFC lesions also exhibit an inability to devalue reinforcers, indicating deficits in accessing the newly updated value of cue-evoked rewards (Gallagher et al., 1999; Izquierdo et al., 2004; Schoenbaum et al., 2003). However, contrasting findings have suggested that OFC damage can lead to risk aversion and a preference for smaller and immediate rewards (Orsini et al., 2015; Sellitto et al., 2010). These divergent observations might be attributed to the specific OFC region targeted (e.g., medial, ventral or lateral), the type of task employed (such as delay discounting, appetitive or monetary risky decision-making), and the timing of inactivation. Together, these complexities could offer insights into the multifaceted and time-sensitive functions of the OFC within varying contexts. Furthermore, the OFC is considered to play a role in prediction-error computation by sharing updates on outcome value information with other brain regions, such as the VTA and the NAc (Stalnaker et al., 2018). By performing computations that compare expected versus actual outcome values and maintaining a continuous representation of subjective reward value, animals adjust their behavioral expression to optimally align with changing environmental conditions. This hypothesis may help explain behavioral flexibility impairments observed in studies involving OFC lesions or inactivation, which encompass deficits in tasks requiring reversal learning based on time-reward dependencies (Mar et al., 2011; Schoenbaum et al., 2009; Takahashi et al., 2009). The link between stimuli and their corresponding emotional states can also exert direct modulation over goal-directed behaviors (Rolls, 2023). Constraints related to the timing and magnitude of reward delivery, combined with the anticipated affective state induced by the reward, may impact the decision to opt for immediate rewards or to postpone immediate action in favour of larger, delayed rewards. In this context, the OFC has been observed to become activated during the assessment of value, choice, and expectancy of present and past events (Kimmel et al., 2020; Sosa et al., 2021).
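The comparison of expected versus actual outcome values described above is often formalised as a reward prediction error. The following is a minimal Python sketch of that idea, assuming a simple delta-rule (Rescorla-Wagner-style) update. The function names, the learning rate, and the simulated reward sequence are illustrative assumptions and are not taken from the studies cited in this review.

```python
def update_value(expected, reward, learning_rate=0.1):
    """Apply one delta-rule update.

    Returns the new expected value and the prediction error
    (actual outcome minus expected outcome).
    """
    prediction_error = reward - expected
    new_expected = expected + learning_rate * prediction_error
    return new_expected, prediction_error


def run_session(rewards, learning_rate=0.1, initial_value=0.0):
    """Track the expected value of a single cue across a sequence of outcomes."""
    value = initial_value
    history = []
    for reward in rewards:
        value, error = update_value(value, reward, learning_rate)
        history.append((value, error))
    return history


# Hypothetical reversal: the cue is rewarded for 20 trials, then never again.
rewards = [1.0] * 20 + [0.0] * 20
history = run_session(rewards, learning_rate=0.2)
```

In this toy session the prediction error is positive while the reward is still surprising, shrinks as the expectation converges, and turns negative at the reversal. A slower or absent update after the reversal would be one simple way to picture the reversal-learning deficits reported after OFC lesions, although this sketch makes no claim about the underlying neural implementation.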

Nucleus accumbens
The NAc is a subcortical structure significantly linked with reward-seeking behaviors, reward encoding, motivation, and emotional processing (Day et al., 2011). The NAc is primarily composed of GABAergic medium spiny neurons (MSNs) that contain either D1 or D2 receptors (D1R and D2R) (Kauer and Malenka, 2007; Soares-Cunha et al., 2020), and a small fraction of cholinergic interneuron populations (Meredith et al., 1993). Traditionally, D1- and D2-MSNs were categorized as participants in direct (reward encoding) and indirect (aversion encoding) pathways, respectively. However, recent research has offered an updated perspective on this dichotomy, suggesting that both D1- and D2-MSNs can encode reward and aversion, with both types involved in direct and indirect pathways of information transmission to the thalamus (Klawonn and Malenka, 2018; Soares-Cunha et al., 2020; Soares-Cunha et al., 2022). Previous research portrays the NAc as the primary input kernel of the basal ganglia. Animal studies employing retrograde, anterograde and immunohistochemistry methods have unveiled several afferent connections (along with their respective neurotransmitters) to this region (Li et al., 2018; Wright et al., 1996). These connections encompass the PFC, hippocampus, VTA, locus coeruleus, motor and sensory cortices, laterodorsal tegmentum, habenula, amygdala, thalamus, hypothalamus, substantia nigra pars compacta, and dorsal raphe nuclei (Fig. 1) (Li et al., 2018; Phillipson and Griffiths, 1985; Salgado and Kaplitt, 2015). Conversely, the projecting MSNs primarily establish connections with other subcortical regions, including the ventral pallidum, bed nucleus of the stria terminalis, and amygdala, as well as several diencephalon regions, including the thalamus, hypothalamus, and habenula (Salgado and Kaplitt, 2015). Moreover, the NAc output also extends to midbrain areas such as the VTA and substantia nigra, as well as brain stem areas like the pedunculopontine nucleus (Fig. 1) (Salgado and Kaplitt, 2015; Williams et al., 1977).
Functional and cytoarchitectural investigations have led to the division of the NAc into two distinct subregions: the core area (NAcC) and the shell area (NAcSh) (Baliki et al., 2013; Richard et al., 2013). Differences in the innervation and projections of NAcC and NAcSh have been documented, encompassing variations in the origin and target, as well as the density of inputs and outputs (Salgado and Kaplitt, 2015). Research indicates that while the NAcC is involved in regulating appropriate responses by assessing behavioral performance and effort costs during decision-making (Ghods-Sharifi and Floresco, 2010), the NAcSh plays a role in integrating and updating outcome information, responding differently to both rewarded and non-rewarded cues, as well as to changes in incentive value (Ambroggi et al., 2011; Floresco et al., 2008; West and Carelli, 2016). The effects of NAcC lesions and inactivation have yielded mixed results. Excitotoxic lesions have led to a reduced preference for high rewards as a function of time delay, while regional deactivation has resulted in a decrease in delay discounting (Cardinal et al., 2001; Moschak and Mitchell, 2014; Steele et al., 2018). Conversely, NAcSh selective lesion or inactivation resulted in attention impairments, excessive reinstatement of appetitive-related conditioned stimulus, reduction of impulse control, and impairments in waiting capacity in animals performing tasks such as the T-maze and 5-CSRTT (Dutta et al., 2021; Feja et al., 2014; Floresco et al., 2008). These findings support the hypothesis that the distinct afferents and efferents of the NAc subregions underlie different roles for each parcel. These roles could either complement the input from the other subregion for a specific role or be entirely selective to either NAcC or NAcSh (Bossert et al., 2007; Feja et al., 2014). Nonetheless, our understanding of the specific roles of NAcC and NAcSh in delay discounting tasks (whether through lesion, pharmacological blockade, or modulation) is still limited. Only a few studies have directly compared the region-specific contributions of NAcC and NAcSh to time-reward dependency during decision-making on the same behavioral task (Feja et al., 2014; Pothuizen et al., 2005).
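Choices in delay discounting tasks such as those mentioned above are commonly summarised with a hyperbolic model, in which the subjective value V of a reward of amount A delivered after delay D is V = A / (1 + kD), with larger k indicating steeper discounting and thus more impulsive choice. The Python sketch below illustrates this standard model only; this review does not specify a discounting function, and the function names, reward amounts, delays, and k values are hypothetical.

```python
def discounted_value(amount, delay, k):
    """Hyperbolic subjective value: V = A / (1 + k * D)."""
    if amount < 0 or delay < 0 or k < 0:
        raise ValueError("amount, delay and k must be non-negative")
    return amount / (1.0 + k * delay)


def prefers_delayed(small_now, large_later, delay, k):
    """True if the larger, delayed reward has the higher subjective value."""
    return discounted_value(large_later, delay, k) > discounted_value(small_now, 0.0, k)


def indifference_delay(small_now, large_later, k):
    """Delay at which the immediate and delayed options are valued equally."""
    if small_now <= 0 or k <= 0:
        raise ValueError("small_now and k must be positive")
    return (large_later / small_now - 1.0) / k


# Hypothetical choice: 1 pellet now versus 4 pellets after 10 s.
shallow = prefers_delayed(1.0, 4.0, delay=10.0, k=0.1)  # shallow discounting
steep = prefers_delayed(1.0, 4.0, delay=10.0, k=0.5)    # steep discounting
```

With these illustrative parameters, the shallow discounter (k = 0.1) values the delayed option at 2 pellet-equivalents and waits, whereas the steep discounter (k = 0.5) values it at about 0.67 and takes the immediate pellet. The indifference delay shortens from 30 s to 6 s as k increases, which is one conventional way of quantifying a shift towards immediate rewards.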

Orbitostriatal circuit in time-reward dependence and chronic pain
At cellular and network levels, the OFC sends monosynaptic glutamatergic projections to the NAc (both NAcC and NAcSh) (Hirokawa et al., 2019; Li et al., 2018). Through this connection, the OFC can exert influence on encoding reward value and outcomes by temporally specific activation of GABAergic ensembles in the NAc area (Knutson et al., 2001; Sesack and Grace, 2010; Stopper and Floresco, 2011).
Additionally, there is a strong functional connectivity shared between the OFC and the NAc (Chang et al., 2014). To best evaluate the interaction between these two areas, Jenni and colleagues used a pharmacological approach to sever communications, and elucidated the importance of the OFC-NAc pathway in the maintenance of decision biases through transmission of reward history information. This OFC-NAc stabilization of task states may allow for facilitation of reward- and risk-related decision-making, and help discern optimal strategies to obtain rewards (Jenni et al., 2022). However, the modulatory drive from the OFC to the NAc also competes with inputs from other cortical areas, contributing to a fine power balance during information processing (Asher and Lodge, 2012; Jenni et al., 2022).
The OFC connections with insular, anterior cingulate, somatosensory, and subcortical (namely the NAc) areas indicate a role in pain processing and modulation, as these are commonly reported as active during noxious stimulation (Apkarian et al., 2005). Human studies have shown that the pain-inhibitory effects of rewards are associated with increased OFC activity (Becker et al., 2017). In this regard, this mediator function is possibly due to the crosslink of information concerning pain value and importance, rather than noxious processing, shared between the OFC and other brain regions (Winston et al., 2014). Moreover, recent clinical research has even proposed the OFC as a potential biomarker for chronic pain states (Shirvalkar et al., 2023). Individuals with chronic neuropathic pain presented temporally specific and long-term stable OFC power differences between transient, evoked pain and sustained, spontaneous pain states (Shirvalkar et al., 2023). This can indicate that the OFC greatly participates in the integration of pain in chronic conditions, and could be used as a therapeutic target for prospective treatments. Furthermore, studies have demonstrated that selective activation of the OFC can lead to a reduction in anxio-depressive behaviors induced by neuropathic pain (Sheng et al., 2020). Due to its anatomical placement, the OFC can act as a gating system for subjective classification of stimuli in a pleasure-pain spectrum, participating in noxious information integration and transmission to the NAc for expression of adequate behavioral responses (Becker et al., 2017; Chang et al., 2014). Notably, rewarding or pleasurable stimuli (such as receiving monetary or appetitive rewards) have been shown in both humans and rats to exert an overriding effect on painful stimuli if they are deemed more valuable (Becker et al., 2013; Becker et al., 2017; Dum and Herz, 1984). The ultimate behavioral response is determined by the revised subjective value assigned to external stimuli, considering factors such as pleasure, relief, aversion, or pain (Rolls, 2004; Rolls, 2023). However, the lack of comprehensive studies evaluating the specific role of the OFC in pain processing mechanisms has constrained our current understanding. Therefore, further research is imperative to unveil the intricate involvement of the OFC in pain-related processes, including pain information integration and transmission to the NAc.
Apart from its well-established role in reward processing, the NAc has also been implicated in the assessment and encoding of persistent pain (Becerra et al., 2001; Becerra and Borsook, 2008; Harris and Peng, 2020; Makary et al., 2020). Due to the large local presence of opioid receptors, the NAc reacts heavily to painful stimuli (Altier and Stewart, 1999; Harris and Peng, 2020; Massaly et al., 2019; Navratilova et al., 2015; Skirzewski et al., 2022). Functional magnetic resonance imaging (fMRI) studies have revealed that the ventral striatum can exhibit distinct connectivity clusters during the processing of nociceptive and rewarding information (Baliki et al., 2010; Baliki et al., 2013). In tasks involving the perception of thermal pain, different subregions within the NAc also showed varied activity patterns: the NAcSh signalled impending pain, while activation of the NAcC was associated with the anticipation of the end of thermal pain (Baliki et al., 2013). At a broader network level, activation of D2R or local administration of lidocaine in rats led to a reduction in neuropathic pain-related behaviors (Sato et al., 2022). This finding suggests that the GABAergic MSNs in the NAc play a crucial role in pain modulation and the transmission of pain signals through descending pain pathways (Chang et al., 2014; Sato et al., 2022; Taylor et al., 2016).
Both the OFC and the NAc have been found to exhibit dysfunctional patterns and altered communications under chronic pain conditions (Fig. 2) (Chang et al., 2014; Ong et al., 2019). Persistent sensory nociceptive overload in rodent models has been shown to lead to an increase in GABAergic activity in the OFC and disrupt emotional decision-making processes (Huang et al., 2021; Pais-Vieira et al., 2009; Pais-Vieira et al., 2012). More specifically, chronic pain onset severely disrupted the encoding of reward magnitude and drastically diminished the fraction of risk-sensitive neurons, leading animals to alter risk preference on a gambling task (Pais-Vieira et al., 2012). In patients with chronic pain, there are significant structural and functional alterations in both brain regions. For instance, chronic pain has been associated with a decrease in grey matter volume in the OFC and NAc, compromising their neural integrity (Ong et al., 2019; Taylor et al., 2016), as well as an abnormal increase in functional connectivity between NAc and prefrontal regions (Baliki et al., 2012; Cardoso-Cruz et al., 2022). These changes in the NAc could potentially underlie the preference for immediate and smaller rewards observed in these individuals (Jenni et al., 2022).
Furthermore, chronic pain has also been shown to lead to local dopamine (DA) depletion in the OFC (Huang et al., 2021; Pais-Vieira et al., 2009) and NAc (Ren et al., 2016). DA activity in these brain regions is recognized as a primary substrate for the expression of goal-directed behaviors and the encoding of reward value (Borsook et al., 2016; Cetin et al., 2004; Martucci et al., 2018; Winstanley et al., 2006). In animal studies, it has been observed that DA depletion specifically in the OFC does not significantly affect the sensitivity to changes in probability for obtaining large/risky rewards, but it does impair the ability to respond to more delayed long-term rewards and leads to impulsive choices in rats (Kheramin et al., 2004; Mai and Hauber, 2015). This suggests that DA signalling in the OFC plays a crucial role in decision-making processes related to delayed reward outcomes. Conversely, humans and animals with chronic pain displayed altered DA signalling in the NAc in response to painful and rewarding stimuli (Kato et al., 2016; Martikainen et al., 2015; Wood et al., 2007). More specifically, DA release in the NAc elicited by pain relief gradually diminished after neuropathy onset in a time-dependent manner (Kato et al., 2016). The changes in DA availability could consequently explain the lower binding potential of DA receptors 2 and 3 in the ventral striatum of patients with chronic pain (Martikainen et al., 2015). This dysregulation in the mesolimbic system over time can potentially compromise several cognitive processes associated with encoding reward information, including delayed rewards (Kobayashi and Schultz, 2008; Saddoris et al., 2015). Since DA neurons increase their activity to reward retrieval with longer delays, the deficiency of DA in the OFC and NAc could be a significant factor contributing to impairments in evaluating time-reward relationships, specifically delayed gratification. The effects of this disruption are most evident in delay discounting and risky decision-making behavioral tasks, with more immediate reward-prone and impulsive profiles observed in both animals and humans under pain conditions (Becker et al., 2017; de Visser et al., 2011; Pais-Vieira et al., 2009; Pais-Vieira et al., 2012). Ultimately, this supports the hypothesis that prefrontal-striatal dysfunctions may contribute to an emotionally-driven state and potentially increase vulnerability to opioid addiction in the context of chronic pain conditions (Borsook et al., 2016). Further information regarding OFC and NAc time-reward dysfunctions in pain can be found in Table 1.
Understanding the factors contributing to altered impulse control during chronic pain is crucial for guiding treatment strategies. Although several studies have reported the individual and joint contributions of these areas to temporal- and reward-related behaviors, research combining OFC-NAc circuitry dysfunction and time-reward dependency relations under chronic pain contexts is currently lacking. This limited exploration is particularly relevant in clinical settings involving manifestation of impulsive traits and addiction proneness in neural diseases (Tompkins et al., 2016). Eventually, chronic pain may transform regular motivational and decision-making processes into heightened incentive salience, with hedonic systems as a prospective player in mediating impulsivity and preference for immediate rewarding or pleasurable actions (Borsook et al., 2016; Tompkins et al., 2016). Over time, these short-term effects only exacerbate the chronicity and comorbidities associated with chronic pain syndromes, and give rise to anxiety, fear and depression-prone affective states (Elman and Borsook, 2018). In the future, real-time prediction of chronic pain state will potentially allow personalisation of on-going therapeutic paradigms, such as deep brain stimulation or pharmacological interventions. Alternatively, recent studies have found that other circuit-unspecific interventions, including cognitive behavioral therapy and physical exercise, can lead to higher OFC activation in healthy individuals and chronic pain patients (Bao et al., 2022; Miyashiro et al., 2021). Further investigation to robustly understand the therapeutic effects of these paradigms on OFC-NAc circuitry and its associated behavioral function is therefore imperative.

Fig. 2. Main behavioral pain-associated alterations in the orbitostriatal pathway related to cognitive processing.

M. Cerqueira-Nunes et al.

Conclusions and future perspectives
Overall, the alterations observed in various brain regions as a consequence of chronic pain, encompassing changes in chemistry, volume and connectivity, underlie cognitive impairments like deficits in mnemonic encoding and alterations in decision-making processes (Baliki et al., 2014; Baliki et al., 2012; Pereira et al., 2023; Yang et al., 2020). The orbitostriatal circuitry, benefiting from its strategic location and extensive connections, plays a crucial role in encoding time-reward information and regulating appropriate behavioral responses. Although human and animal studies have provided valuable insights into the functional contributions of the OFC and NAc brain regions to cognitive and pain-related mechanisms, our understanding remains incomplete. Ongoing investigations into the specific role of this circuitry in processing reward values based on delays, particularly in the context of chronic pain, as well as its involvement in pain perception and modulation, will contribute to a deeper comprehension of orbitostriatal function and the underlying neurotransmitter systems in both normal and pathological conditions.

Fig. 1. A simplified schematic diagram depicting the most relevant cortical and subcortical connections of the orbitofrontal cortex and nucleus accumbens to the initiation of actions and responses to reward stimuli. The orbitostriatal pathway involves mono-synaptic projection and multi-synaptic connections. AMY, amygdala; HIP, hippocampus; IC, insular cortex; LH, lateral hypothalamus; OFC, orbitofrontal cortex; PAG, periaqueductal gray; PFC, prefrontal cortex; MDth, mediodorsal nucleus of the thalamus; and VTA, ventral tegmental area.

Table 1
Example of human and animal studies involving the orbitofrontal cortex and ventral striatum during the encoding of time-dependent reward associations under chronic pain conditions.