Socially relative reward valuation in the primate brain

Reward valuation in social contexts is by nature relative rather than absolute; it is made in reference to others. This socially relative reward valuation is based on our propensity to conduct comparisons and competitions between self and other. Exploring its neural substrate has been an active area of research in human neuroimaging. More recently, electrophysiological investigation of the macaque brain has enabled us to understand neural mechanisms underlying this valuation process at single-neuron and network levels. Here I show that shared neural networks centered at the medial prefrontal cortex and dopamine-related subcortical regions are involved in this process in humans and nonhuman primates. Thus, socially relative reward valuation is mediated by cortico-subcortically coordinated activity linking social and reward brain networks.


Introduction
Many earnings and possessions are countable and can be measured in numerical units. For example, salary can be expressed in dollars and land in acres. In general, more is better than less in the reward domain, so a person might feel happier with an annual salary of sixty thousand dollars than with one of forty thousand dollars. In this way, the valuation of rewards in private contexts is straightforward. However, the situation changes considerably in social contexts. The same person, who had been satisfied with sixty thousand dollars, might now feel a sense of unfairness or envy on finding out that a same-aged peer earns twenty thousand more for assignments with comparable workloads. They might then lower the value they assign to their payment and, consequently, their motivation to work. It is now well documented that social contexts matter for the valuation of rewards and other behavioral attributes, such as ability, performance, and status [1,2,3,4-6]. However, the questions of how reward-related signals for the self and other are represented in the brain at the cellular level and how the two signals are integrated into a subjective value remain unanswered.
Reward valuation and related emotions are complicated in social contexts. As outlined below, rewards can evoke pleasure despite an absolute loss or a painful emotion despite an absolute gain [7]. Why do these apparently opposing responses occur? What neural mechanisms underlie changes in value orientation in the presence of others? In the following sections, I review literature demonstrating that reward valuation in social contexts is made in a relative, and accordingly subjective, manner on the basis of self-other comparison and competition. I then summarize and discuss the current knowledge about the neural foundations of such socially relative reward valuation by first focusing on human neuroimaging studies and then macaque electrophysiological studies. These studies reveal that the medial prefrontal cortex (MPFC) and dopamine-related subcortical regions play a central role in relative reward valuation in social contexts.

Socially relative reward valuation: behavioral background
Why do we care so much about others' rewards? Why is the value of our own rewards not determined simply by their absolute amount, but instead affected so profoundly by the rewards of others? These questions are relevant not only to humans [8,9], but also to our phylogenetically close relatives, nonhuman primates [10,11]. Whether and how individuals consider others' payoffs may form the basis for sociality, such as prosocial behavior and inequality aversion, in both primate species [12,13].
There seem to be at least two psychological factors that underlie other-referential reward processing. The first factor is comparison between the self and others [14]. According to social comparison theory, humans are inclined to understand and evaluate their own attributes and selves by comparing themselves to others in relevant behavioral domains. These domains are not limited to monetary rewards, but can encompass many others, including more abstract ones, such as appearance, competence, status, and achievement [2,3,4-6]. The other psychological factor is competition between the self and others [3,15,16]. Without deliberate social systems such as sharing and redistribution, social organisms inevitably compete against others for limited resources, such as food, territory, and mating partners. Social comparison and competition are ubiquitous in daily life and readily evoke various other-regarding emotions, such as envy, schadenfreude, and admiration [17].
These two factors, social comparison and social competition, are closely related. When two players in a lottery game are presented with each other's payoffs and can compare the two, they spontaneously come to follow competitive strategies [18]. This psychological propensity toward social competition occurs despite the fact that the outcome of one player does not influence the outcome of the other at all. One possible explanation is that both factors are rooted in the same ecological experience: valuable resources are fundamentally finite, and no individual can get everything they want.

Socially relative reward valuation: human neuroimaging
Pioneering work using functional magnetic resonance imaging (fMRI) revealed that unequal payoffs influence cortical activity in humans [9]. Later work addressed this issue systematically, along with emotional consequences of relative reward gain and loss (see below). These studies were carried out mainly in the context of self-other comparisons using monetary rewards as a comparison domain [4]. There were also studies, however, in which comparisons were made at more abstract levels, such as intelligence and attractiveness [19][20][21].
Neuroimaging of relative reward valuation via social comparison has most consistently identified the ventral striatum (VS), mainly the nucleus accumbens, and the MPFC as active foci [4,6,22], which are central nodes in the brain's reward and social networks, respectively (Figure 1). When two subjects both correctly performed a dot-number estimation task, but their payoffs were unequal, activity in the VS tracked one's own payment relative to the other's payment [23]. Specifically, VS activity was highest when the subject acquired more monetary rewards than the other, followed by the condition with equal payment, and was lowest when the subject earned less than the other, even though the subject's absolute gains were the same in all three conditions. In another study in which two subjects chose between lotteries and their win-loss outcomes were concurrently revealed, VS activity was significantly higher when the subject won more than their counterpart (social gain) than when they won the same amount in isolation (private gain), while it was significantly lower in the face of social loss than private loss [18]. In addition to the VS, levels of activation in the amygdala in response to unfair offers were significantly larger in prosocials than in individualists [13], suggesting that the VS and amygdala in prosocials encode reward inequity or subsequent behaviors to achieve equity (e.g. rejection). In the MPFC, activation levels during a choice phase [Montreal Neurological Institute coordinates (x, y, z) = (−3, 42, 39) and (9, 54, 3)] and a feedback phase (0, 54, 9) were significantly higher in social conditions than in private conditions [18]. The MPFC, especially its dorsal part (0, 22, 38), signals relative value (here, the value of a chosen option minus the value of an unchosen option) for both oneself and others in a task-invariant manner [24]. Moreover, a broad expanse of the MPFC (−7, 49, 6) extending into the orbitofrontal cortex (OFC; −4, 31, −15) signals subjective reward value as a function of the closeness between self and other [16].
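The distinction between absolute and relative payoff in the three-condition comparison above can be made concrete with a toy calculation. The sketch below assumes the simplest possible definition, own payment minus the other's payment; the function names and payoff amounts are hypothetical and are not taken from the cited studies, which do not necessarily model the VS signal this way.

```python
def relative_payoff(own: float, other: float) -> float:
    """Own payment minus the other's payment (an assumed, simplest-case definition)."""
    return own - other


def label(relative: float) -> str:
    """Name the social outcome implied by the sign of the relative payoff."""
    if relative > 0:
        return "social gain"
    if relative < 0:
        return "social loss"
    return "equal payment"


# Hypothetical payoffs: the subject's absolute gain is identical in all three conditions.
own_payment = 60.0
other_payments = {
    "other earns less": 30.0,
    "other earns the same": 60.0,
    "other earns more": 120.0,
}

results = {name: relative_payoff(own_payment, other) for name, other in other_payments.items()}
for name, relative in results.items():
    print(f"{name}: absolute = {own_payment:.0f}, relative = {relative:+.0f} ({label(relative)})")
```

Only the relative term differs across the three conditions, which is the property that a signal ordered as more, equal, less would need to track.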

Figure 1. Summary of neural signals in distributed brain regions (left). The location of each region is schematically shown in a sagittal section of the macaque brain (right). MPFC, medial prefrontal cortex; OFC, orbitofrontal cortex; VS, ventral striatum; LH, lateral hypothalamus; DA, dopaminergic midbrain nuclei. (Current Opinion in Neurobiology)
estimates, respectively, when interacting with high and low performers in cooperative contexts; however, the opposite occurred in competitive contexts [3]. The MPFC centered at BA9 (2, 44, 36) tracked others' performance, whereas the more ventrally located perigenual anterior cingulate cortex (pgACC; 0, 40, 6) tracked one's own performance. Critically, the other-performance signal in BA9 predicted the degree to which subjects' self-performance decreased as a function of interacting with high performers, suggesting that BA9 is involved in subjective valuation of one's own ability by taking others' ability into consideration. Activity in the VS in response to social gain predicts choice-related MPFC activity in the next trial [18], implying that the bottom-up projection from the VS to the MPFC may contribute to social decision-making on the basis of relative reward valuation. It should be noted, however, that an inverse top-down influence from the MPFC (−7, 49, 6) to the VS has also been demonstrated during a task in which subjects compete for monetary rewards against others [16]. These findings suggest that bidirectional information flow between the brain's reward networks and social networks mediates relative value modulation under social conditions and subsequent social decision-making.
Relative reward valuation via self-other comparison and competition evokes other-regarding emotions that can override the emotional consequences of absolute reward valuation. Specifically, one can express joy and schadenfreude (gloating) despite absolute loss if another subject loses more money, while one can feel envy despite absolute gain if another gets more [7]. It has been shown that the level of envy is associated with activity in the dorsal anterior cingulate cortex (−2, 10, 52) [25], and the level of schadenfreude with activity in the VS [7,25].
Notably, a subset of single neurons in the human MPFC (areas 24 and dorsal 32) respond to one's own gain and others' loss, potentially encoding schadenfreude [26]. One might hypothesize from these findings that such other-regarding emotions arise from activation of single brain regions. It should be noted, however, that the VS is also activated in private gain conditions in which schadenfreude should not occur [7]. Thus, a more plausible hypothesis is that complex social emotions have little one-to-one correspondence to a particular brain region, but can emerge via inter-areal interactions between reward and social neural networks. Together, these observations in humans demonstrate that the VS and a large expanse of the MPFC are involved in subjective valuation of one's own rewards and other aspects of behavioral domains by taking those of other individuals into consideration.

Socially relative reward valuation: monkey electrophysiology
The refinement of social task paradigms using two interacting monkeys, combined with electrophysiological neuronal recordings at fine spatiotemporal resolutions, has made it possible to study the neural basis of social cognition at the single-neuron level [27]. During performance of a task in which a subject monkey was required to work for a reward given only to itself (individual reward) or to both itself and another non-working monkey (joint reward), the subject was more willing to work in the individual reward condition [28,29]. In these studies, valuation was made by considering the other's payoff, because the absolute self-reward amount remained the same during the private/individual condition and the social/joint condition. Interestingly, when faced with a choice between giving a reward to the other monkey or giving a reward to no one, monkeys are more willing to choose the former, suggesting vicarious reward processing [29]. Neurons in the OFC, in which relative value coding was originally found at the single-neuron level in nonsocial conditions [30], exhibited increased activity when self-reward earnings were increased in the individual condition, but exhibited decreased activity when another monkey also received a reward. Thus, the OFC encodes relative reward values in social contexts. Reward-related and no-reward-related neuronal responses in the anterior striatum, including both dorsal and ventral portions, were also affected by the concomitant reward outcomes to another in a manner consistent with reward inequity signaling [31]. Both the OFC and striatum are richly interconnected with dopaminergic midbrain nuclei [32][33][34], a central node in the brain's reward system.
To study how cortico-subcortical regions linking social and reward neural networks coordinate during socially relative reward valuation, Noritake et al. [35] extended a classical conditioning procedure into a social framework (Figure 2). Here, two monkeys were conditioned with visual stimuli, each of which predicted their reward outcomes with different probabilities. Not surprisingly, the value of self-reward, as indexed by licking and choice behaviors, increased as its probability increased. Despite no objective changes in probability and amount, however, the value of self-rewards decreased as the other's reward probability increased. This value modulation did not occur when the other monkey was replaced with a water-collecting bottle. Critically, such a socially relative reward value was faithfully encoded by dopamine (DA) neurons in the midbrain. It seems likely that activation in the human VS during socially relative reward valuation reflects reward-related DA transmission [36].
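The qualitative pattern described above (value rising with self-reward probability, falling with partner-reward probability, and unaffected when the partner is replaced with a bottle) can be captured by a deliberately simple toy model. The linear form, the weights `w_self` and `w_partner`, and the probability values below are all assumptions chosen for illustration; they are not the model or the parameters reported in [35].

```python
def subjective_value(p_self: float, p_partner: float, partner_is_agent: bool = True,
                     w_self: float = 1.0, w_partner: float = 0.5) -> float:
    """Toy subjective value of a self-reward cue.

    Assumed linear form: value grows with the self-reward probability and is
    discounted by the partner-reward probability only when the partner is a
    social agent (not when it is an inanimate water-collecting bottle).
    """
    discount = w_partner * p_partner if partner_is_agent else 0.0
    return w_self * p_self - discount


probabilities = [0.25, 0.5, 0.75]  # hypothetical cue probabilities

# Self-variable block: partner probability fixed, self probability varies.
self_block = [subjective_value(p, 0.25) for p in probabilities]

# Partner-variable block: self probability fixed, partner probability varies.
partner_block = [subjective_value(0.25, p) for p in probabilities]

# Control: same partner-variable block, but the "partner" is a bottle.
bottle_block = [subjective_value(0.25, p, partner_is_agent=False) for p in probabilities]

print("self-variable block:   ", self_block)
print("partner-variable block:", partner_block)
print("bottle control:        ", bottle_block)
```

In this sketch the objective self-reward is identical across the partner-variable block, so any change in value comes only from the social term.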
Figure 2. (a) Variable-reward probability. In the self-variable block, the probability of self-rewards varies depending on which of three stimuli is presented, but the probability of partner-rewards is constant for all three stimuli. In the partner-variable block, the partner-reward probability varies, but the self-reward probability is constant. (b) Behavioral data for two M1 monkeys. Subjective value as indexed by licking movement increases as the self-reward probability increases, but decreases as the partner-reward probability increases. (c) DA activity encoding subjective value. Ensemble-averaged DA activity increases as the self-reward probability increases, but decreases as the partner-reward probability increases. Grayed areas, early stimulus epoch (151-450 ms after stimulus onset). (d) MPFC activity encoding agent-selective reward probability information. Ensemble-averaged activity for MPFC neurons that differentiate between self-reward probabilities (top) and partner-reward probabilities (bottom). Adapted from [35].
In contrast to DA neurons, neurons in the dorsal portions of the MPFC, that is, area 9m [37] and the pre-supplementary motor area, rarely encoded subjective value signals [35]. Instead, distinct sets of MPFC neurons encoded self-specific and other-specific reward probability information. Furthermore, judging from Granger causality analysis applied to local field potentials (LFPs) recorded simultaneously in the two remote brain regions, neural information flowed predominantly in an MPFC-to-midbrain direction, consistent with human neuroimaging studies [16]. These findings suggest that agent-selective reward probability information in the dorsal MPFC is conveyed to the midbrain, where DA neurons integrate this information into a subjective reward value. It has been documented that DA neurons encode a subjective reward value in nonsocial contexts by taking account of single reward dimensions, such as amount, probability, delay, and cost, individually [38][39][40][41], or different dimensions, such as amount, risk, and type, integrated together [42]. Thus, DA neurons play a key role in relative subjective reward valuation in both nonsocial and social contexts. Regarding the functional role of the MPFC, an alternative explanation remains that agent-selective signals might in fact encode relative value of one's own reward using others' reward as a reference point, rather than the absolute probability information per se, given that the MPFC is involved in social comparison processes [4,18,20,22,25,43]. This hypothesis is in line with the human neuroimaging study discussed above, although the comparison domain is not identical (reward [35] versus performance ability [3]).
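To illustrate the logic of the Granger causality analysis mentioned above, the following minimal sketch asks whether the past of one signal improves prediction of another beyond that signal's own past. It is a lag-1, two-signal version written with the standard library only, applied to synthetic signals in which the "MPFC" series is built to drive the "midbrain" series; the signal names, coupling strength, and model order are assumptions. It is not the analysis pipeline of [35], which would involve multi-lag models, model-order selection, and statistical testing on real LFP data.

```python
import math
import random


def demean(series):
    """Subtract the mean so the autoregressive models need no intercept."""
    mean = sum(series) / len(series)
    return [value - mean for value in series]


def granger_lag1(source, target):
    """Lag-1 Granger causality from source to target: ln(RSS_restricted / RSS_full).

    Restricted model: target[t] = a * target[t-1]
    Full model:       target[t] = a * target[t-1] + b * source[t-1]
    """
    x, y = demean(source), demean(target)
    y_now, y_past, x_past = y[1:], y[:-1], x[:-1]

    s_yy = sum(v * v for v in y_past)
    s_xx = sum(v * v for v in x_past)
    s_xy = sum(p * q for p, q in zip(x_past, y_past))
    t_y = sum(p * q for p, q in zip(y_now, y_past))
    t_x = sum(p * q for p, q in zip(y_now, x_past))

    # Restricted model: least-squares fit on the target's own past only.
    a_r = t_y / s_yy
    rss_restricted = sum((c - a_r * p) ** 2 for c, p in zip(y_now, y_past))

    # Full model: solve the 2x2 normal equations for both regressors.
    det = s_yy * s_xx - s_xy ** 2
    a_f = (t_y * s_xx - t_x * s_xy) / det
    b_f = (t_x * s_yy - t_y * s_xy) / det
    rss_full = sum((c - a_f * p - b_f * q) ** 2 for c, p, q in zip(y_now, y_past, x_past))

    return math.log(rss_restricted / rss_full)


def simulate(n=2000, coupling=0.8, seed=0):
    """Synthetic signals: 'mpfc' drives 'midbrain' with a one-sample delay (assumed)."""
    rng = random.Random(seed)
    mpfc, midbrain = [0.0], [0.0]
    for _ in range(n - 1):
        mpfc_next = 0.5 * mpfc[-1] + rng.gauss(0, 1)
        midbrain_next = 0.5 * midbrain[-1] + coupling * mpfc[-1] + rng.gauss(0, 1)
        mpfc.append(mpfc_next)
        midbrain.append(midbrain_next)
    return mpfc, midbrain


mpfc, midbrain = simulate()
gc_forward = granger_lag1(mpfc, midbrain)   # MPFC -> midbrain
gc_backward = granger_lag1(midbrain, mpfc)  # midbrain -> MPFC
print(f"MPFC -> midbrain: {gc_forward:.3f}")
print(f"midbrain -> MPFC: {gc_backward:.3f}")
```

Because the coupling was built in one direction only, the forward value should clearly exceed the backward one; with real recordings, the asymmetry is an empirical question rather than a given.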
In a follow-up study, Noritake et al. [44] further found that neurons in the lateral hypothalamus (LH), a subcortical center in the brain's reward networks [45,46] and potentially social networks [47], encode a mixture of social reward signals in a time-dependent manner (Figure 3). In response to reward-predicting cues, LH neurons first encode a socially relative reward value and then agent-selective reward probability information. Neural information is conveyed mainly from the MPFC to the LH. These findings from the MPFC-DA and MPFC-LH pathways suggest that coordinated activity between the brain's social networks and reward networks plays a vital role in socially relative reward valuation, consistent with human neuroimaging studies [16,18]. In these subcortical regions, devaluing rewards by considering those of others might be implemented by functional interactions between the dopaminergic midbrain nuclei, LH, and lateral habenula [48][49][50]. How the LH in humans contributes to socially relative reward valuation will be an active area in future research.
Figure 3. Recoverable panel labels: Slope (M1-variable block); Self (52), Value (15), Partner (66), Mirror (8); n.s.; Late epoch. From Isoda, Socially relative reward valuation in the primate brain.
Single neurons in the amygdala can encode various signals that are useful for socially relative reward valuation, such as reward values, social agents, and others' future choices [51], similarly to single neurons in the dorsal [35,52] and ventral [53] MPFC. It has been shown that amygdala volume is positively correlated with social rank [54] and that individual amygdala neurons encode social ranks and reward values in overlapping manners [55].
The rostral ACC gyrus (ACCg) in the MPFC coordinates with the amygdala during the reward-allocation task mentioned above [56]. Monkeys performing this task preferred self-only reward over joint rewards (negative other-regarding preference, ORP), suggesting that their own reward was devalued when another monkey was also rewarded. In contrast, monkeys preferred other-only reward over neither reward (positive ORP), suggesting that observing someone else receive a reward was rewarding when there was no chance of self-rewards. Positive ORP was associated with increased coherence between neuronal spikes in the amygdala and beta-band LFP oscillations in the ACCg as well as between neuronal spikes in the ACCg and gamma-band LFP oscillations in the amygdala. Conversely, negative ORP was associated with a decrease in these coherent activities. The ACCg plays a causal role in the valuation of social signals [57] and contains neurons signaling which monkey (self or other) receives a reward [29]. The human ACCg (8, 32, 12) also signals mainly, albeit not exclusively, the likelihood of rewards for others [58,59]. The similarities and differences in the social reward coding scheme between the dorsal MPFC and ACCg need to be determined in more detail in both humans and nonhuman primates.
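The sign convention for other-regarding preference described above can be illustrated with a simple contrast index. The sketch below assumes a normalized difference between choice counts, so that the index is positive when the option that rewards the other monkey is chosen more often and negative otherwise; this formula, the function name, and the choice counts are hypothetical illustrations, not the definition or data used in [56].

```python
def preference_index(n_other_rewarded: int, n_alternative: int) -> float:
    """Assumed contrast index in [-1, 1].

    n_other_rewarded: choices of the option that delivers a reward to the other monkey
    n_alternative:    choices of the option that does not
    """
    total = n_other_rewarded + n_alternative
    if total == 0:
        raise ValueError("no choices recorded")
    return (n_other_rewarded - n_alternative) / total


# Hypothetical choice counts out of 100 trials per context.
# Context 1: joint reward (self and other) versus self-only reward.
orp_self_vs_both = preference_index(n_other_rewarded=35, n_alternative=65)

# Context 2: other-only reward versus neither reward.
orp_other_vs_neither = preference_index(n_other_rewarded=70, n_alternative=30)

print(f"joint vs self-only:    {orp_self_vs_both:+.2f}")
print(f"other-only vs neither: {orp_other_vs_neither:+.2f}")
```

With these made-up counts the first context yields a negative index and the second a positive one, matching the direction of the behavioral pattern reported in the text.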

Conclusion
In social contexts, valuation of one's own reward is often made in reference to others' rewards. This form of reward valuation readily invokes complex other-regarding emotions depending on the context at hand, ranging from those that can hinder interpersonal relations, such as envy and schadenfreude, to those that can promote productive social exchanges, such as empathy, reciprocity, and vicarious happiness. Although socially relative reward valuation is mediated by multiple brain regions, core components are centered in social and reward neural networks. These findings invite an interesting hypothesis: it is not a single brain region, but the combination of regions within the distributed neural networks and their coherent interaction, that determines the type of other-regarding emotions and subsequent social decisions. Thus, a critical next step is to better understand the fine-grained mechanisms underlying social rewards and emotions at the pathway level via electrophysiological decoding and pathway-selective intervention using well-controlled social task paradigms, a strategy that has been developed in macaque monkeys [60]. Currently, the domain of comparisons between self and other is confined to rewards in monkey studies. However, other domains, such as status and performance ability, would also be testable given that monkeys are sensitive to hierarchical relationships [61] and are equipped with metacognitive capability [62][63][64].

Conflict of interest statement
Nothing declared.