Social monitoring of actions in the macaque frontopolar cortex

The frontopolar cortex (FPC) emerged as a major innovation in the evolution of anthropoid primates and has been placed at the top of the prefrontal hierarchy. The only study to date that investigated the activity of FPC neurons in monkeys performing a cognitive task suggested that these cells are involved in the monitoring of self-generated actions. We recorded the activity of neurons in the FPC of two rhesus monkeys while they performed a social variant of a nonmatch-to-goal task that required monitoring the actions of a human or computer agent. We discovered that the role of FPC neurons extends beyond self-generated actions to include monitoring others' actions. This monitoring activity was highly specific. First, neurons in the FPC encoded the spatial position of the target but not its object features. Second, a dedicated representation of the human agent's actions was tied to the time of target acquisition and was reduced or absent in subsequent trial epochs. Finally, this other-specific neural substrate did not emerge during interaction with a virtual agent such as a computer. These results provide a new perspective on the functions of a uniquely primate brain area, suggesting that the FPC might play an important role in social behavior.


Introduction
The frontopolar cortex (FPC), also known as Brodmann's area 10, remains one of the least investigated parts of the prefrontal cortex. Located at the rostral end of the brain, it is the most extended part of the human prefrontal cortex (Semendeferi et al., 2001) and connects with several cortical regions, mainly in the prefrontal and temporal lobes (Petrides and Pandya, 2007; Rosa et al., 2019; He et al., 2020). From an evolutionary perspective, a peculiarity of the FPC is that it is unique to anthropoid primates, and it has expanded in both size and percentage of brain volume across primate species. The increase in brain size across evolution has been linked to the need to deal with the advanced social skills that are typical of primates (Dunbar, 2009). The evolution of the prefrontal cortex, and of its phylogenetically more recent regions such as the FPC, may thus have been driven by social requirements. For this reason, the FPC might process social information, and despite differences across species (Semendeferi et al., 2011; Petrides et al., 2012), a comparative approach using nonhuman primate models appears essential to deciphering the functions of this area (Passingham, 2009; Koechlin, 2011).
What we know about the cognitive processes in which the FPC is involved comes mainly from noninvasive neuroimaging studies in human subjects. These studies suggest that this area contributes to a wide range of cognitive abilities, including the integration of operations performed by other prefrontal areas (Ramnani and Owen, 2004), cognitive branching (Koechlin et al., 1999), management of internal and external stimuli (Gilbert et al., 2005, 2006a, 2006b), mnemonic functions (Burgess et al., 2011; Benoit et al., 2012; Kim et al., 2015), and exploratory decision making (Daw et al., 2006). Two studies (Mansouri et al., 2015; Boschin et al., 2015) have investigated the consequences of FPC lesions on a variety of cognitive skills in nonhuman primates. Most of the behaviors tested were not impaired; the main deficits observed were in the ability to learn novel rules and in rapid learning. Furthermore, lesioned monkeys' ability to return to the original task after a distraction task was even enhanced, suggesting that, compared with control monkeys, monkeys with FPC lesions did not explore alternatives and remained focused on the main task. Combined, these results suggest that the FPC mediates exploration in relation to the relative values of alternative stimuli.
Little is known about the properties of single FPC cells because of the shortage of neurophysiological studies of this area; to date, only one study has investigated this topic (Tsujimoto et al., 2010). That study used a cued strategy task that required choosing between two targets based on two instructed strategies. The authors found that FPC neurons did not show the rich patterns of modulation that are typical of other prefrontal areas involved in representing multiple task aspects (Tanji and Hoshi, 2008; Wallis, 2010; Tsujimoto et al., 2011; Rigotti et al., 2013). Rather, their modulation was limited to the spatial position of the chosen target after the choice was made, suggesting that the FPC is involved in monitoring self-generated actions. In the present study, therefore, we investigated whether the FPC is also involved in monitoring others' actions in a social context. We trained two male macaque monkeys to perform a social variant of the nonmatch-to-goal (NMTG) task (Falcone et al., 2016, 2017; Cirillo et al., 2018), which was designed to study the ability of monkeys to monitor the actions of a human agent (Fig. 1A), and we recorded single-unit activity in the FPC (Fig. 1B). The monkey and a human agent alternated between the roles of actor and observer within a block of trials (Fig. 1C, top). As actor, the monkey had to choose between two peripheral targets (PTs) displayed on opposite sides of a touchscreen, following a nonmatch-to-goal rule: it had to reject the target chosen in the previous trial and choose the alternative. As observer, it had to monitor the human agent, who followed the same rule, in order to make its own future choice correctly. In a subset of sessions, a computer interaction block was performed in addition to the human interaction block. The computer performed the task via a cursor moving on the screen, making choices according to the same rules as the human agent and alternating with the monkey (Fig. 1C, bottom).

Subjects
Two monkeys (Macaca mulatta; Monkey 1: male, age 7 years, weight ~8 kg; Monkey 2: male, age 15 years, weight ~17 kg) were trained to perform a social interactive task before the start of the recording sessions. The monkeys sat in a primate chair with the head fixed, facing a touchscreen monitor (3M MicroTouch M1700SS 17" LCD touch monitor, 1280 × 1024 resolution). Animal care, housing, and experimental procedures conformed to the European (Directive 2010/63/EU) and Italian (DD.LL. 116/92 and 26/14) laws on the use of nonhuman primates in scientific research. The research protocol was approved by the Italian Health Ministry (Central Direction for the Veterinary Service).

Behavioral task
In the nonmatch-to-goal (NMTG) task, the rule requires monkeys to discard the target chosen in the previous trial and select the new target. Two noncommercial software packages, CORTEX (NIMH, Bethesda, MD, USA; for Monkey 1) and MonkeyLogic (NIMH, Bethesda, MD, USA; for Monkey 2), were used to display the targets on the monitor, control reward delivery, and record behavioral responses. Eye positions during task performance were monitored and recorded with a ViewPoint Eye Tracker system (220 Hz; Arrington Research, Scottsdale, USA). Trials started with the appearance of a white central target (CT) on the monitor (Fig. 1A), which the monkey had to touch and keep touching for 500 or 800 ms (Holding CT epoch).
After that, two peripheral targets (PTs) appeared on either side of the central target, one on the left and one on the right. The two peripheral targets were selected from four possible objects (Fig. 1A, top left): one was the same object that had been chosen in the previous trial (in Fig. 1A, a pink rhombus), and the other was randomly chosen from the remaining three objects (in Fig. 1A, a green triangle). A delay period of 800 or 1200 ms separated the appearance of the peripheral targets from the "go" signal, which was the disappearance of the central target. The monkey had to touch one of the peripheral targets for 400 or 600 ms (Holding PT epoch), and then feedback appeared to indicate whether its choice was correct or not. Two types of feedback for correct and incorrect answers were used, in which different colors and shapes were randomly alternated (Fig. 1A, top right). After 400 or 600 ms, the feedback period ended, and a reward (a drop of fruit juice) was delivered in the case of a correct choice. An incorrect choice led to the repetition of the trial (correction trial). If the monkey broke off its touch during any of the periods, the trial was aborted and started again.

Fig. 1. NMTG task and behavioral results. A) Sequence of task events and epochs. At the top are the four stimuli used as peripheral targets (PTs) and the four stimuli used as feedback. B) Approximate location of the FPC (area 10) in the macaque brain. C) Examples of two sequences of trials during the human and computer interaction blocks. D) Behavioral performance across 24 (Monkey 1) and 22 (Monkey 2) sessions in After-monkey (AM) and After-human (AH) trials. In each box, the central yellow line represents the median, the upper and lower bounds of the box represent the 25th and 75th percentiles, and the vertical lines extend to the highest and lowest values. n = number of sessions.
The monkeys were trained to perform a social variant of the NMTG task, in which they had to alternate with a human agent in performing the roles of actor and observer across trials. The human agent was seated next to the monkey, facing the monitor, and he could intervene in the task by moving his hand toward the center of the screen during the intertrial period to signal the role switch to the monkey. The human agent could perform from one to four trials in a row, while the monkey observed. The human agent conducted the task following the NMTG rule: he rejected the object chosen in the previous trial and selected the alternative one. The human agent performed only correct trials; in all human trials the reward was delivered to the monkey. The removal of the human's hand from the center of the screen signaled to the monkey to switch to the actor role in the next trial. The upper panel of Fig. 1C illustrates an example sequence of trials in which the monkey and the human agent alternate as the actor.
In a subset of sessions, the monkeys performed the same task but they interacted also with a computer agent. These trials proceeded in the same way as with the human agent: the monkey and the computer agent switched roles as actor and observer from trial to trial. A computer trial started with a red central target instead of a white one to signal to the monkey that it had to stop and observe. All the periods of the trial were identical to those in the human interaction trials. If, at the beginning of a computer trial, the monkey did not touch the screen, a gray bar appeared on the central target, and after the go signal, it moved toward the correct target. The reaction and movement times of the computer bar were set to simulate those measured for the human agent (~900 ms in total). The computer performed only correct trials and performed one to four trials in a row; in all computer trials the reward was delivered to the monkey. The appearance of a white central target signaled to the monkey that it was its turn to perform the trial. The bottom panel of Fig. 1C illustrates an example sequence of trials during which the monkey and the computer agent alternate as the actor.
We discarded sessions in which performance was below 60% correct in either trial class (5 of 51 recorded sessions were discarded: three for Monkey 1 and two for Monkey 2). The very low number of interruptions between consecutive trials in both trial classes indicates that the monkeys were engaged in the task and could correctly take turns with the human agent (Monkey 1 completed 93.1% and 92% of AM and AH trials, respectively; Monkey 2 completed 98.6% and 99.4% of AM and AH trials, respectively).

Surgery and data collection
The two monkeys were bilaterally implanted with high-density chronic microelectrode arrays to record extracellular activity in the FPC (CerePort Utah Array, Blackrock Microsystems, Salt Lake City, UT, USA; a 96-channel array in each hemisphere for Monkey 1 and a 48-channel array in each hemisphere for Monkey 2) (Fig. S1). Electrical signals were amplified and processed with an RZ2 bioamp processor (Tucker-Davis Technologies, Alachua, FL, USA) and sampled at 24.414 kHz. Raw data were then band-pass filtered (300-3000 Hz), and spike sorting was performed with a fully automatic algorithm (MountainSort v4 0.2.3 and dependencies; Chung et al., 2017). To detect single-unit activity, the individual clusters identified by the algorithm in each channel were further filtered by applying thresholds to their quality parameters (firing rate > 0.1 Hz; noise overlap < 0.1; isolation > 0.95; signal-to-noise ratio > 2; for a detailed description of how these parameters were computed, see Chung et al., 2017). Finally, we visually inspected the mean waveforms of the remaining clusters to exclude units with irregular shapes. Our final dataset was thus composed of 907 single neurons: 626 from Monkey 1 across 24 recording sessions (448 from the right array and 178 from the left array; mean ~28 single neurons per session) and 281 from Monkey 2 across 22 recording sessions (197 from the right array and 84 from the left array; mean ~13 single neurons per session). To check for possible repetitions of single neurons across sessions, we applied an additional filter to our database. Because the spacing between electrodes in the Utah array (400 µm) is large enough to prevent the same neuron from being recorded on different channels, we only checked whether the same electrode could have recorded the same neuron across different recording days.
Starting from the second recording session and proceeding in chronological order, we removed from the dataset all neurons that had been recorded with the same electrode in the previous session. For example, if a neuron was recorded on channel five in both the first and second sessions, we removed the neuron from the second session. This led to the removal of 333 single neurons (36.7% of the total recorded neurons), for a remaining total of 574 neurons (377 for Monkey 1 and 197 for Monkey 2). We refer to this new dataset as the reduced dataset, and we used it to run control analyses of the main results.
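As an illustration, the session-to-session duplicate filter described above can be sketched in a few lines of Python. The data layout (a chronologically ordered list of (session, channel) pairs) and the function name are our own assumptions for the sketch, not the authors' actual pipeline:

```python
def remove_possible_duplicates(neurons):
    """Keep a neuron only if its channel did NOT carry a neuron in the
    immediately preceding session; neurons are (session, channel) pairs."""
    kept = []
    prev_channels = set()      # channels with a neuron in session s - 1
    current_channels = set()   # channels with a neuron in session s
    current_session = None
    for session, channel in sorted(neurons):
        if session != current_session:
            # moving to a new session: yesterday's channels become the filter
            prev_channels = current_channels
            current_channels = set()
            current_session = session
        current_channels.add(channel)
        if channel not in prev_channels:
            kept.append((session, channel))
    return kept
```

Applied to a toy dataset in which channel 5 carries a unit in three consecutive sessions, only the first occurrence survives, matching the rule described in the text.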

Epochs of interest
We focused our main analyses on two different classes of trial (correct monkey and human trials) and on five different epochs of 400 ms each during each type of trial: Early Delay (from the appearance of the peripheral targets to 400 ms after), Late Delay (from 400 to 800 ms after the appearance of the peripheral targets), Touch PT (from 200 ms before the Touch PT to 200 ms thereafter), Feedback (from the appearance of the feedback to 400 ms thereafter), and Reward (from the delivery of the reward to 400 ms thereafter). Unless otherwise specified, we only used complete and correct trials for the analysis.

Single-unit analysis
For each epoch of interest, we studied the encoding of two different task variables: the correct object (four alternatives: rhombus, rectangle, triangle, and cross) and the correct spatial position (two alternatives: right and left). We used a one-way ANOVA to analyze the mean firing rate activity for monkey and human trials separately. To determine whether the proportion of significant neurons for each agent in each epoch was significantly above chance, we performed a permutation test with 1000 iterations for each neuron while shuffling the labels of the conditions. We used a one-way ANOVA to analyze each iteration and obtain a null distribution of percentages of significant cells to be compared with the real percentage value, with a cutoff for significance at the 95th percentile of the distribution. Next, we investigated the overlap between the two groups of neurons using a hypergeometric distribution test (Marcos et al., 2017): we identified the number of neurons that encoded the spatial position in monkey trials (nM), the number that encoded the spatial position in human trials (nH), and the number that were selective for both (nB). To assess whether nB was significant, we calculated the probability of selecting nM neurons from the total number of recorded cells and obtaining a number equal to or higher than the number of nB neurons that belonged to the nH group of neurons. We then calculated the p-value as the sum of the probabilities of selecting a number equal to or higher than the number of nB neurons.
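The hypergeometric overlap test can be sketched with SciPy; the counts below are hypothetical and chosen only to illustrate the computation, and the variable names are ours:

```python
from scipy.stats import hypergeom

def overlap_pvalue(n_total, n_m, n_h, n_b):
    """Probability of drawing n_m cells at random from n_total recorded
    cells and finding n_b or more of them among the n_h human-selective
    cells (upper tail of the hypergeometric distribution)."""
    return hypergeom.sf(n_b - 1, n_total, n_h, n_m)

# Hypothetical counts: 100 recorded cells, 30 selective in monkey trials,
# 25 selective in human trials, 15 selective in both.
p = overlap_pvalue(n_total=100, n_m=30, n_h=25, n_b=15)
```

The hypergeometric distribution is symmetric in the two groups, so swapping n_m and n_h yields the same p-value.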

Decoding procedure
To determine the contribution of the whole population of recorded neurons to the monitoring of self and others' actions, we used a decoding procedure to discriminate the spatial position of the correct target in the three epochs in which we found a significant number of modulated neurons. We used the procedure described by Meyers (2013), with a maximum correlation coefficient classifier to discriminate between right and left in monkey and human trials separately. The procedure was as follows: for each neuron, data were binned in the epoch of interest, and trials were labeled based on the condition (right vs left). For each neuron, firing rate activity was normalized using a z-score transformation, then the classifier was implemented using a k-splits procedure, where k represents the maximum number of available trials for each condition for each neuron. For example, given two recorded neurons, the first with 20 right trials and 15 left trials and the second with 10 right trials and 15 left trials, the number of k-splits available is 10 for each condition for both neurons. The classifier was trained on the activity of k − 1 trials in the right and left conditions separately (training trials) and tested on the remaining trials in the right and left conditions (test trials). This procedure was repeated k times, each time leaving out a different trial as a test trial, and for n resample runs (n = 50) repeating the k-splits procedure from the top to obtain a mean classification accuracy. The whole procedure (k x n) was repeated again 1000 times to obtain a distribution of mean classification accuracies. The mean classification accuracy was obtained based on the number of correct guesses, which occurred when the correlation coefficient between training and testing trials belonging to the same condition was higher than that between training and test trials belonging to different conditions. 
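A minimal sketch of the leave-one-out maximum-correlation-coefficient classifier described above (omitting the resampling and label-shuffling loops, and assuming the pseudo-trial matrices have already been z-scored; the function name is ours):

```python
import numpy as np

def max_corr_accuracy(right, left):
    """Leave-one-split-out maximum-correlation classifier.
    right, left: (k, n_neurons) arrays of z-scored pseudo-population
    firing-rate vectors, one row per pseudo-trial."""
    k = right.shape[0]
    correct = 0
    for i in range(k):
        train_idx = [j for j in range(k) if j != i]
        mean_r = right[train_idx].mean(axis=0)   # class templates from
        mean_l = left[train_idx].mean(axis=0)    # the k - 1 training trials
        for test, label in ((right[i], 'R'), (left[i], 'L')):
            # a guess is correct when the test vector correlates more
            # strongly with the template of its own condition
            r_corr = np.corrcoef(test, mean_r)[0, 1]
            l_corr = np.corrcoef(test, mean_l)[0, 1]
            guess = 'R' if r_corr > l_corr else 'L'
            correct += (guess == label)
    return correct / (2 * k)
```

On synthetic, well-separated population vectors this classifier reaches perfect accuracy, while shuffled labels would bring it toward chance (0.5).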
This procedure was performed separately for monkey and human trials to assess classification accuracy for the right and left conditions within the specific agent trials (congruent decoding). We then repeated this entire procedure, except that the condition labels were randomly shuffled to obtain a null distribution. To determine the similarity across real and shuffled distributions, we computed an overlapping index (η) that defined the proportion of the overlapping area between the probability density functions of two distributions (Pastore and Calcagnì, 2019). The same procedure was applied to discriminate the correct object in the same three epochs.
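The overlapping index η can be approximated with a simple histogram estimate of the area shared by the two probability density functions; this is only a sketch of the idea (Pastore and Calcagnì, 2019, use smoother density estimates):

```python
import numpy as np

def overlapping_index(a, b, bins=50):
    """Histogram estimate of eta: the shared area (0 to 1) under the
    probability density functions of two samples."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    edges = np.linspace(min(a.min(), b.min()), max(a.max(), b.max()), bins + 1)
    pa, _ = np.histogram(a, bins=edges, density=True)
    pb, _ = np.histogram(b, bins=edges, density=True)
    # integrate the pointwise minimum of the two densities
    return float(np.minimum(pa, pb).sum() * (edges[1] - edges[0]))
```

η close to 1 means the real and shuffled accuracy distributions are essentially indistinguishable; η near 0 means they barely overlap.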
To test whether the coding of the target spatial position generalized between agents, or in other words, whether there was an invariant representation of the correct target position across agents (monkey and human), we used cross-agent decoding with a modification to the procedure that has been described above. In this case, the classifier was trained with right and left trials performed by one agent, and then tested with right and left trials performed by the other agent (incongruent decoding). A high classification accuracy would indicate that, at the population level, spatial coding generalized across the two classes of trial (monkey and human). The classification accuracy significance was assessed again by shuffling the condition labels and repeating the whole procedure 1000 times. To determine the similarities across congruent and incongruent distributions, we computed the η overlapping index.
The same procedure was used to investigate cross-agent decoding between human and computer trials and between monkey and computer trials. We did not use the cross-agent training and testing procedure for the correct-object decoding, since the classification accuracy in the congruent condition was not above chance. To quantify the invariance of the spatial representation between different agents during the Touch PT epoch, we calculated a generalization index as the ratio of the sum of the mean classification accuracies between conditions (incongruent decoding) to the sum of the mean classification accuracies within the same conditions (congruent decoding). Values of approximately 1 indicate high generalization across conditions, with lower values indicating lower generalization of coding across conditions.
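The generalization index reduces to a one-line computation; the accuracies below are hypothetical and serve only to show the arithmetic:

```python
def generalization_index(incongruent, congruent):
    """Ratio of summed cross-agent (incongruent) to summed within-agent
    (congruent) mean classification accuracies; values near 1 indicate
    full generalization of the spatial code across agents."""
    return sum(incongruent) / sum(congruent)

# Hypothetical accuracies: training/testing across agents recovers
# only part of the within-agent decoding performance.
gi = generalization_index(incongruent=[0.5, 0.5], congruent=[0.8, 0.8])  # 0.625
```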

Preference index
To further investigate the properties of agent-specific neurons, we examined the spatial preference of Monkey-only and Human-only cells across the three epochs of interest. The preference was assessed using a receiver operating characteristic (ROC) analysis (Britten et al., 1992), which reflects the performance of a binary classifier. The area under the ROC curve (AUROC) indicates not only the accuracy of the discrimination between two conditions based on the firing rates of individual neurons but also which of the two conditions was preferred. AUROC values range from 0 to 1, with 0.5 indicating no discrimination; in our case, values < 0.5 indicate a neuron with a preference for right trials, while values > 0.5 indicate a neuron with a preference for left trials. We calculated AUROC values for Monkey-only and Human-only neurons in both monkey and human trials, and we performed a linear regression analysis with the AUROC values calculated in monkey trials as the independent variable and those calculated in human trials as the dependent variable for Monkey-only cells (and vice versa for Human-only cells). A significant p-value indicates that, as a population, the neurons share the same preferred target position between monkey and human trials.
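The AUROC for a single neuron can be computed directly from its trial-by-trial firing rates via the rank (Mann-Whitney) formulation; this sketch uses our own function name and sign convention matching the text (values > 0.5 indicate a preference for left trials):

```python
import numpy as np

def auroc(rates_left, rates_right):
    """Area under the ROC curve for left vs. right trials from a single
    neuron's firing rates: 0.5 = no preference, > 0.5 = higher rates
    (a preference) on left trials."""
    left = np.asarray(rates_left, dtype=float)
    right = np.asarray(rates_right, dtype=float)
    # fraction of (left, right) trial pairs in which the left-trial rate
    # exceeds the right-trial rate, counting ties as half
    greater = (left[:, None] > right[None, :]).sum()
    ties = (left[:, None] == right[None, :]).sum()
    return (greater + 0.5 * ties) / (left.size * right.size)
```

The population-level step would then regress, for example, the Monkey-only cells' AUROC values in human trials on their AUROC values in monkey trials (e.g., with scipy.stats.linregress).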
Results

For the neural analyses, we divided the trials into two classes: trials performed by the monkey and trials performed by the human agent. For each epoch of interest (see Methods), we calculated the percentage of neurons that selectively encoded the correct target object or position, separately for monkey and human trials (one-way ANOVA, p < 0.05). We found no modulation by the target object in any epoch in either trial class (approximately 5% of significant neurons in the five epochs; Table S1). Instead, we found that a significant number of neurons were modulated by the target position in three epochs, from the Touch PT epoch, through the Feedback epoch, to the Reward epoch (Table S2), in both monkey and human trials. Interestingly, there was no evidence of spatial modulation in the delay period, although information about the target position was already available after the presentation of the targets on the screen. Thus, neither the planning of one's own actions nor the prediction of others' actions was encoded by FPC neurons.
We divided the neurons into three groups (Fig. 2A) based on whether they encoded the target position only during trials performed by the monkey (Monkey-only), only during trials performed by the human agent (Human-only), or during both (Both-agents). During the Touch PT epoch, 266 of the 907 neurons (29.3%) showed spatial modulation, but only 31 of these did so in both monkey and human trials (p = 0.06, hypergeometric distribution test). The majority of the spatially modulated neurons thus separately encoded the target position during self- (Monkey-only neurons, Fig. 2B) or others' actions (Human-only neurons, Fig. 2C). These two types of neurons were found in similar proportions (12.6% and 13.3% of the total recorded cells, respectively). Similar agent-selective spatial modulation was observed in the Feedback and Reward epochs (Table S2). However, the number of Both-agents neurons, despite remaining low, was significantly different from that expected by chance in both epochs (p = 0.01 and p < 10⁻⁵ for the Feedback and Reward epochs, respectively; hypergeometric distribution test). The proportion of these neurons increased through the three epochs (Table S4), and similar results were obtained with the reduced dataset used to control for possible repetitions of neurons across sessions (Table S5).
As a further control, we analyzed eye position during the task (Fig. S2) and checked whether the activity of neurons classified as Monkey-only or Human-only could be related to this variable. Because target location and eye position were associated in the epochs in which we found target selectivity, we examined eye position modulation in an earlier epoch, the Early Delay period, in which we did not find any coding of the correct target position. We analyzed 26 sessions out of 46 (8 sessions for Monkey 1 and 18 sessions for Monkey 2); eye data from the remaining sessions were discarded because of the poor quality of the signal (typically due to difficulties in the calibration procedure). We calculated the percentage of neurons that selectively encoded eye position independently of whether it was a right or left trial, or a monkey or human trial. We found that 48 of the 488 neurons tested (9.8%; one-way ANOVA, p < 0.05) showed significant modulation by eye position in the Early Delay epoch, but only a minority of these neurons were classified as Monkey-only or Human-only in the subsequent epochs (Table S6).
To determine the contribution of the whole population of neurons to the monitoring of self and others' actions, we used a decoding procedure to classify the spatial position of the correct target, and we found that the classification accuracy was above that predicted by a null model in all three epochs (external bars, Fig. 2D). We then performed a cross-agent decoding analysis to quantify the extent to which classification accuracy was agent specific or generalized between agents. In the Touch PT epoch (internal bars, Fig. 2D, top), the cross-agent decoder poorly classified the target position in human trials when the classifier had been trained on monkey trials, and vice versa, showing that coding of the target position was highly agent specific. In contrast, during the Feedback and Reward epochs, the cross-agent classification accuracy was higher and similar to the classification accuracy obtained when training and testing were performed within the same class (internal bars, Fig. 2D, middle and bottom). We obtained comparable results when the monkeys were analyzed separately (Fig. S3) and when we used the reduced dataset (Fig. S4).

Fig. 2 caption (partial): Green markers indicate feedback appearance. C) Raster plot of a single neuron classified as Human-only. Markers as in (B). D) Classification accuracy for the right and left target positions in monkey and human trials obtained using a decoding procedure in the three epochs of interest, with training and testing performed within the same conditions (external bars, congruent decoding) and between conditions (internal bars, incongruent decoding). n = number of neurons. Blue bars: decoding accuracy. Yellow bars: decoding accuracy obtained by shuffling the labels. Orange and purple vertical lines: mean standard deviation over resamples for blue and yellow bars, respectively. / η > 0.05, * η < 0.05, ** η < 0.01, *** η < 0.001.
This can be partially explained by the increase in the proportion of Both-agents cells across the three epochs. However, the main reason for this result was that, within this group of cells, the proportion of neurons with the same spatial preference in monkey and human trials increased from 61.3% in the Touch PT epoch (19 of 31 cells) to 93.7% and 91.4% in the Feedback and Reward epochs (30 of 32 and 32 of 35 cells, respectively), a significant increase (Pearson's chi-squared test: χ2 = 14.495; df = 2; p = 0.000712). The same decoding procedure was used to classify the correct object in monkey and human trials separately; as expected, we did not find any significant result in any of the epochs (Fig. 3). Since the task rule required keeping in memory the object chosen in the previous trial and then discarding it and selecting the new one, we further investigated whether the previous correct object (which corresponds to the 'incorrect' object in the current trial) was encoded in any epoch, including the epochs that preceded the target selection (Touch CT and Delay). Again, we did not find any significant result in either monkey or human trials, in any of the epochs (Fig. S5).
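The reported chi-squared statistic can be reproduced directly from the counts given in the text (19 of 31, 30 of 32, and 32 of 35 Both-agents cells with the same spatial preference in the three epochs):

```python
from scipy.stats import chi2_contingency

# Rows: Both-agents cells with the same vs. opposite spatial preference;
# columns: Touch PT, Feedback, and Reward epochs (counts from the text).
same = [19, 30, 32]
opposite = [31 - 19, 32 - 30, 35 - 32]
chi2, p, df, expected = chi2_contingency([same, opposite])
# chi2 ≈ 14.495, df = 2, p ≈ 0.0007, matching the values reported above
```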
We investigated the source of this cross-agent increase in accuracy (internal bars, Fig. 2D), and examined the spatial preference of the Monkey-only and Human-only neurons identified in the three epochs. In the Feedback and Reward epochs, the area under the receiver operating characteristic curve (AUROC) values, calculated from the activity of Monkey-only neurons in monkey trials, were a significant predictor of the AUROC values calculated for human trials, and vice versa for Human-only neurons (Fig. 4, middle and bottom panels). Since AUROC values range from 0 to 1, with lower and higher values indicating opposite preferences, this shows that even if single neurons significantly encoded the target position only for a specific agent, they significantly shared the same preference between agent trials at the population level. This relationship was absent during the Touch PT epoch (Fig. 4, upper panels), indicating an agent-specific representation of the target position during this epoch, in contrast to a generalized representation between the agents during the Feedback and Reward epochs.
In a subset of sessions, monkeys interacted with a virtual agent, represented by a bar moving on the screen. Performance was assessed in AM, AH, and "After Computer" (AC) trials (Fig. 1C, bottom). Monkey 1 performed accurately in all trial classes (6 sessions; for AM, AH, and AC, respectively: mean ± standard deviation [STD], 75.9 ± 7.2%, 75.9 ± 8.7%, and 73.0 ± 7.7%; Fig. 5A). However, Monkey 2 did not perform accurately in the AC trials (15 sessions; for AM, AH, and AC, respectively: 82.6 ± 7.9%, 69.9 ± 7.2%, and 55.1 ± 14.3%), indicating that it failed to monitor the computer's actions, in contrast to its successful monitoring of the human agent. We therefore excluded the electrophysiological data collected from Monkey 2 during the computer condition. During the human interaction block, we had found that neurons in the Touch PT epoch monitored the actions of the human agent separately from the monkey's own actions. To test whether this difference could simply be due to the absence of movement by the monkey, we compared the coding of the target position during human and computer trials in this epoch. This analysis involved 126 neurons that were recorded during both the human and computer interaction blocks (6 of the 24 sessions performed by Monkey 1). Of these, 36 (28.6%) had previously been identified as Human-only, since they had encoded the target position only during human trials, and not during monkey trials, in the human interaction block. Most of these did not encode the target position during trials performed by the computer (Human-specific neurons, Fig. 5B). The proportion of Observation neurons (neurons that encoded the target position in both observation conditions, human and computer) was at chance level (4.8%), as was that of Computer-specific neurons (neurons that encoded the target position only in computer trials, and not in monkey or human trials; 6.3%). Fig. 5C shows an example of a Human-specific neuron across the three agent conditions. The cross-agent decoding procedure showed that, at the population level, the spatial coding did not generalize across the two observation conditions (Fig. 6A). Furthermore, neurons that encoded the target position in human trials did not share the same spatial preference (right or left) between human and computer trials (Fig. 6B). Contrary to our initial expectations, the spatial coding generalized more when we compared computer and monkey trials (Fig. 6C), in contrast to the cross-agent decoding accuracy obtained by comparing the two observation conditions. This was confirmed by the shared spatial preference between monkey and computer trials (Fig. 6D). The similarities across each pair of agents are summarized in Fig. 6E by a generalization index, highlighting the specificity of the human trials and the similarity of the monkey and computer trials.

Fig. 3. Classification accuracy for the correct objects in monkey and human trials obtained using a decoding procedure in the three epochs of interest, with training and testing performed within the same conditions. Note that since the decoding accuracy was not above chance, training and testing between conditions (as in Fig. 2D) was not performed. n = number of neurons. Blue bars: decoding accuracy. Yellow bars: decoding accuracy obtained by shuffling the labels. Orange and purple vertical lines: mean standard deviation over resamples for blue and yellow bars, respectively. / p > 0.05, * p < 0.05, ** p < 0.01, *** p < 0.001.
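The statement that a proportion of selective neurons is "at chance level" can be illustrated with a simple binomial test. This is a sketch under our own simplifying assumption that each neuron is falsely classified as selective with probability alpha = 0.05; the counts echo those reported above (rounded to match the stated percentages), and the choice of test is ours, not necessarily the study's.

```python
# Sketch: is the observed proportion of selective neurons above chance?
# Assumption (ours): each neuron has a 5% false-positive probability of
# being classified as selective, so chance counts follow Binomial(n, 0.05).
from scipy.stats import binomtest

n_recorded = 126  # neurons recorded in both human and computer blocks
counts = {
    "Human-only": 36,        # 28.6% of 126
    "Observation": 6,        # 4.8% of 126
    "Computer-specific": 8,  # 6.3% of 126
}
for label, k in counts.items():
    res = binomtest(k, n_recorded, p=0.05, alternative="greater")
    print(f"{label}: {k}/{n_recorded} neurons, p = {res.pvalue:.3g}")
```

With these numbers, only the Human-only proportion exceeds the 5% chance level; the Observation and Computer-specific proportions do not, consistent with their description as being at chance.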

Discussion
In this study, we examined the properties of single FPC neurons during a social interactive task. In line with previous findings (Tsujimoto et al., 2010), we found that neurons in the FPC encoded the target position from the time of target acquisition. We extended this finding by showing that FPC neurons are involved in the monitoring not only of self-generated actions, but also of others' actions, and that the two are distinctly represented only at the time of target acquisition. FPC neurons were not, however, involved in encoding object identity.
The only other electrophysiological investigation of FPC functions in monkeys (Tsujimoto et al., 2010) used a task rule that required the monkeys to remember the spatial position chosen in the previous trial, and to stay at or shift from that position depending on the strategy dictated by the object. In that study, therefore, the task goal was represented by the spatial position, not by the object. The authors found no coding of the object features, but left open the possibility that the object was not encoded because it represented an instruction cue indicating the strategy to adopt, rather than the task goal itself. In the NMTG task, the key feature to monitor is the object, which is the task goal. Despite this task rule, which required monkeys to select the new object regardless of its spatial position, we found no coding of the object at any time during the trial, although we did find strong encoding of the target position. This suggests that, at least in the context of these specific experimental paradigms, the FPC does not encode object features, whether or not the object represents the task goal, but does encode spatial position.
The main aim of our study was to investigate whether the monitoring of actions extends to others' actions, and whether these actions have a dedicated representation, i.e., one encoded in a neural substrate separate from that encoding one's own actions. We found that FPC neurons were strongly modulated during the observation of others' actions. In human trials, the target position was encoded after target acquisition, and this encoding lasted until the reward period. Encoding of the target position by single neurons was highly agent-specific throughout the epochs, with few neurons encoding it for both agents. However, all these neurons (Both-agents, Human-only, and Monkey-only) consistently shared the same spatial preference across agents in the Feedback and Reward epochs, in contrast to the Touch PT epoch. This congruence in spatial preference at the level of single neurons means that, at the population level, the agent specificity is limited to the moment of target acquisition and dissipates quickly thereafter.
The computer condition allowed us to further investigate whether the coding of the target position that we found in human trials at the moment of target acquisition could be generalized to any observation condition, or whether it was specific to the human agent. It is possible that what initially appeared to be an other-specific signal may instead be related to the inactivity of the monkey during observation trials (e.g., associated with action inhibition). If that were the case, Human-only neurons should encode the target position similarly in computer trials, and a cross-agent decoding procedure would generalize across the observations of the two agents. Instead, the coding in human and computer trials was independent, while it generalized between computer and monkey trials. This similarity between monkey and computer trials could be interpreted as a mental rehearsal process elicited by an interaction with a nonphysical, inanimate agent (Cisek and Kalaska, 2004), while an interaction with a physical, animate agent may promote a separate, dedicated representation of others' actions (Nougaret et al., 2019; Ferrucci et al., 2021). By using both a human hand and a cursor as a control, we have shown that the representation of others' actions occurred regardless of the type of visual stimulation, whether biological or non-biological. However, we cannot assess whether the difference in the representation of the monkey's and human's actions, which is absent between monkey and computer, might have been influenced by the view of the human biological hand in different spatial positions. Therefore, it remains to be determined whether the specificity of the human action is driven by visual-social variables, such as the view of the human hand, or reflects a more abstract process of self-vs-other distinction.

Fig. 5. Computer interactive condition. A) Behavioral performance of Monkey 1 across 6 sessions in the human and computer interaction blocks in After-monkey (AM), After-human (AH), and After-computer (AC) trials. In each box, the central yellow line represents the median, the upper and lower bounds of the box represent the 25th and 75th percentiles, and the vertical lines extend to the highest and lowest values. Crosses represent outliers. n = number of sessions. B) Overlap between the populations of Human-only neurons (n = 36, neurons modulated in human trials but not in monkey trials) and Computer-only neurons (n = 14, neurons modulated in computer trials but not in monkey trials). Observation neurons are neurons that were modulated in both interaction trial types (human and computer), while Specific neurons are neurons that were modulated in only one trial type (Human- or Computer-specific). C) Raster plot of a single neuron classified as Human-only in the human interaction block that did not encode the target spatial position in the computer trials (Human-specific neuron). The activity is aligned to the Touch PT. The light blue box indicates the period that was tested for significance (200 ms before the touch to 200 ms after). Green markers indicate feedback appearance.
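The cross-agent decoding logic discussed above (train a decoder on the target position in one agent's trials, test it on another agent's trials) can be illustrated with a sketch on synthetic population activity. The classifier, the population model, and the simulated pattern of results (an independent spatial code for human trials, a shared code for monkey and computer trials) are all illustrative assumptions, not the study's exact procedure.

```python
# Sketch (synthetic data): cross-agent decoding of target position.
# We simulate an independent spatial code for "human" trials and a shared
# code for "monkey" and "computer" trials; all parameters are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_neurons, n_trials = 80, 60

def simulate(pref):
    """Population activity (trials x neurons) plus left/right labels."""
    side = rng.integers(0, 2, n_trials)  # 0 = left, 1 = right target
    X = rng.normal(10.0, 2.0, (n_trials, n_neurons)) + np.outer(side, pref)
    return X, side

pref_monkey = rng.normal(0, 2, n_neurons)
pref_human = rng.normal(0, 2, n_neurons)  # independent code: no transfer
pref_computer = pref_monkey               # shared code: transfer expected

monkey = simulate(pref_monkey)
human = simulate(pref_human)
computer = simulate(pref_computer)

def cross_decode(train, test):
    """Accuracy of a linear decoder fit on one agent, tested on another."""
    (X_tr, y_tr), (X_te, y_te) = train, test
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

print("human -> computer:", cross_decode(human, computer))    # expected ~chance
print("monkey -> computer:", cross_decode(monkey, computer))  # expected high
```

In this toy setup, decoding transfers between the agents that share a spatial code and stays near chance between the agents with independent codes, mirroring the pattern summarized by the generalization index in Fig. 6E.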
Framing these results in a broader context that incorporates our understanding of the role of the FPC developed using different methodologies and models is a complex challenge. A first endeavor to establish the role of the FPC in cognition using imaging studies (Burgess et al., 2007) proposed the gateway hypothesis, which considers this region important in regulating the attention paid to self-generated or externally produced signals; in our case, these signals are one's own or the other's action during social interaction. A recent review (Mansouri et al., 2017) suggested that the FPC plays a crucial role in exploratory behavior, in contrast to posterior parts of the prefrontal cortex, which are primarily devoted to exploiting the current task. This exploratory behavior is expressed by monitoring the relevance of current goals, and eventually by redistributing cognitive resources to other goals. From this perspective, our results are compatible with a dedicated neural code for monitoring others' actions that is limited to the moment in which the choice becomes explicit, that is, when the action leads to target acquisition, which is the most relevant moment, and is lost immediately thereafter. This other-specific information may be sent to posterior areas to contribute to different aspects of social cognition. Among these areas, the medial frontal cortex (MFC) seems to play a key role in processing other-related information, as shown by several studies (Yoshida et al., 2011, 2012; Haroush et al., 2015; Falcone et al., 2017, 2022). Using a role-reversal task in a monkey-monkey interaction paradigm, Yoshida and colleagues (2011, 2012) were the first to report that neurons in the MFC selectively encoded not only the action of the partner at the time of the choice but also the partner's erroneous actions.
Subsequent studies showed that the MFC also carries activity predictive of the action of the social partner (Haroush et al., 2015; Falcone et al., 2017), and how this predictive activity during a delay period relates to the monitoring activity during action observation (Falcone et al., 2022). Our results show that the FPC, an area whose single-neuron social functions are largely unknown, exhibits a similar activation during the observation of others' actions, but lacks the predictive activity found in the MFC, and hosts a dedicated representation that is strongly tied to target acquisition.
Fig. 6. […] accuracy obtained by shuffling the labels. Orange and purple vertical lines: standard deviation over resamples for blue and yellow bars, respectively. * p < 0.05, ** p < 0.01, *** p < 0.001. B) Left: normalized z-score population activity during human trials for all neurons that encoded the target spatial position in human trials, divided by preferred and anti-preferred positions that were assessed during human trials. Right: normalized z-score population activity during human trials for all neurons that encoded the target spatial position in human trials, divided by preferred and anti-preferred positions that were assessed during computer trials. The light blue box indicates the period in which the preference was assessed. Purple bars at the bottom indicate time periods in which population activity was significantly different (Wilcoxon rank sum test with Bonferroni correction, p < 0.01). Shaded areas represent ± standard error of the mean (SEM). n = number of neurons. C) Classification accuracy for right and left target positions in monkey and computer trials obtained using the decoding procedure in the Touch PT epoch, with training and testing performed within the same condition (external bars, congruent decoding) and between conditions (internal bars, incongruent decoding). n = number of neurons. Blue bars: decoding accuracy. Yellow bars: decoding accuracy obtained by shuffling the labels. Orange and purple vertical lines: standard deviation over resamples for blue and yellow bars, respectively. / p > 0.05, * p < 0.05, ** p < 0.01, *** p < 0.001. D) Left: normalized z-score population activity during computer trials for all neurons that encoded the target spatial position in computer trials, divided by preferred and anti-preferred positions that were assessed during computer trials. Right: normalized z-score population activity during computer trials for all neurons that encoded the spatial position in computer trials, divided by preferred and anti-preferred positions that were assessed during monkey trials. Markers as in (B). E) Generalization index calculated from data shown in Fig. 2D (top), 6A, and 6C. In each box, the central red line represents the median, the upper and lower bounds of the box represent the 25th and 75th percentiles, and vertical lines extend to the highest and lowest values. Crosses represent outliers.

According to the social brain hypothesis (Dunbar, 2009), the larger ratio between brain size and body size found in primates compared to other mammals evolved to cope with their complex social life. For this reason, it should not be surprising that the prefrontal cortex, and especially the FPC, which have undergone expansion during evolution, play an active role in the representation of social information. For instance, it has been found that social status and social network size positively correlate with grey matter increases in the macaque rostral prefrontal cortex (Sallet et al., 2011; Noonan et al., 2014), showing how the social environment contributes to changes in prefrontal cortex structures. From an anatomical point of view, recent studies (Sliwa and Freiwald, 2017; He et al., 2020) have indeed shown that some regions of the FPC are connected with cortical regions involved in the social-interaction network, in which other-related activity has already been investigated (Chang et al., 2013; Chang et al., 2017; Isoda et al., 2018; Ferrucci et al., 2021). However, we must point out that the current results do not prove that the main evolutionary advantage of the FPC lies in the social domain; other functions could have been equally important for the expansion of this area during evolution, such as fast learning, as proposed by Boschin et al. (2015). To understand whether the frontopolar cortex provides a unique and specific contribution to social demands in particular, it will be critical to record simultaneously from other areas for comparison. Our findings represent a first step in improving our understanding of the role of the FPC in social behaviors, here restricted to a monitoring function that could support, for example, observational learning or the evaluation of each other's actions in joint-action tasks. Further neurophysiological studies will be crucial for understanding the contribution of this area to representing social information in the broader social-interaction network.

Funding
This work was supported by the Sapienza University of Rome [Avvio alla ricerca AR12117A862169C0 (to L.

Declaration of Competing Interest
The authors declare no competing interests.