Ventrolateral prefrontal neurons of the monkey encode instructions in the ‘pragmatic’ format of the associated behavioral outcomes

The prefrontal cortex plays an important role in coding rules and producing context-appropriate behaviors. These processes necessarily require the generation of goals based on current context. Indeed, instructing stimuli are prospectively encoded in prefrontal cortex in relation to behavioral demands, but the coding format of this neural representation is, to date, largely unknown. To study how instructions and behaviors are encoded in prefrontal cortex, we recorded the activity of ventrolateral prefrontal neurons of monkeys (Macaca mulatta) in a task requiring them to perform (Action condition) or withhold (Inaction condition) grasping actions on real objects. Our data show that there are neurons responding in different task phases, and that the neuronal population discharge is stronger in the Inaction condition when the instructing cue is presented, and in the Action condition in the subsequent phases, from object presentation to action execution. Decoding analyses performed on neuronal populations showed that the neural activity recorded during the initial phases of the task shares the same type of format with that recorded during the final phases. We propose that this format has a pragmatic nature, that is, instructions and goals are encoded by prefrontal neurons as predictions of the behavioral outcome.

Monkey electrophysiology and human fMRI studies evidenced a rostro-caudal gradient in the PF functional organization (Koechlin et al., 2003;Badre and D'Esposito, 2007;Koechlin and Summerfield, 2007;Nee and D'Esposito, 2016;Riley et al., 2016;Tang et al., 2021;Jung et al., 2022). Despite theoretical differences among the authors, there is general agreement that within this gradient the mid-portion of PF is involved in behavior selection and organization. This implies a strong relation among neurons coding the different aspects and temporal phases of a behavioral task. In line with the idea of a functional gradient, anatomical studies in monkeys showed that this sector of the ventrolateral PF (VLPF) is anatomically connected with the parieto-premotor circuits (Barbas and Pandya, 1989;Cavada and Goldman-Rakic, 1989;Cipolloni and Pandya, 1999;Borra et al., 2011;Gerbella et al., 2013;Saleem et al., 2014) involved in organization and control of grasping actions (Rizzolatti et al., 2014;Borra et al., 2017). The inclusion of VLPF in the grasping circuit suggests that this region plays a crucial role in the context-based behavioral control (Miller, 2000;Tanji and Hoshi, 2008;Rozzi and Fogassi, 2017).
Most electrophysiological studies on monkey PF cortex adopted tasks involving the presentation of an instructing cue arbitrarily associated with a specific behavior, to be executed after a delay. There is clear evidence that the activity of PF neurons recorded during cue presentation, delay period, and behavior production is deeply influenced by the general rule governing the paradigm (Asaad et al., 1998;White and Wise, 1999;Hoshi et al., 2000). Generally, these studies employed as behavioral output the execution of eye movements (Funahashi et al., 1989;Spaak et al., 2017;see Funahashi, 2014) or simple forelimb movements (Hoshi et al., 1998;Yamagata et al., 2012), while the role of PF in the guidance of object-oriented natural actions has rarely been investigated (Tanila et al., 1992;Bruni et al., 2015;Simone et al., 2015). Previous works from our lab demonstrated that VLPF contains grasping-related neurons and visually responsive neurons that activate more strongly when an observed object is going to become a target for a grasping action (Simone et al., 2015;Rozzi et al., 2021). These studies focused on specific neuronal populations responding to object observation and/or during grasping execution. However, many neurons recorded in these studies also responded in other task epochs, indicating that our knowledge about the role of prefrontal neurons in linking the context to the actions is still largely incomplete.
The first aim of the present study is to assess the role of VLPF in linking the abstract rules, guiding a Visuo-Motor task, and the object features with the instructed behavior. To this aim, we recorded VLPF neurons in a Go/NoGo task requiring executing or withholding object-oriented grasping actions.
The second aim of the study is to describe the format in which instructions and behaviors are encoded by VLPF neurons. To this aim, we described the temporal dynamics of VLPF neural activity during task unfolding and employed temporal-cross decoding analyses of specific neuronal populations.

Methods
The experiment was carried out on two female Rhesus monkeys (Macaca mulatta, M1, M2) weighing about 4 kg. The animals had previously been employed in a series of experiments, whose results have already been published (Simone et al., 2015, 2017; Rozzi et al., 2021). All methods were carried out in accordance with the European (2010/63/EU) and the ARRIVE guidelines. The experimental protocols, the animal handling, and the surgical and experimental procedures complied with the European guidelines (2010/63/EU) and the Italian laws in force on the care and use of laboratory animals, and were approved by the Veterinarian Animal Care and Use Committee of the University of Parma and authorized by the Italian Health Ministry.

Training and surgical procedures
The monkeys were first trained to sit on a primate chair and to familiarize themselves with the experimental setup. At the end of the habituation sessions, a head fixation system (Crist Instruments Co. Inc.) was implanted. Then, they were trained to perform the task described below. After completion of the training, a recording chamber (32 × 18 mm, Alpha Omega, Nazareth, Israel) was implanted on VLPF, based on MRI scan. All surgeries were carried out under general anesthesia (ketamine hydrochloride, 5 mg/kg, i.m. and medetomidine hydrochloride, 0.1 mg/kg, i.m.), followed by postsurgical pain medication.

Experimental apparatus
During training and recording sessions the monkeys were seated on a monkey chair with the hand contralateral to the hemisphere to be recorded on a resting position, located 9 cm in front of the abdomen. A box containing three different objects was located in front of the monkey, 22 cm from the monkey's chest. A small door (7 × 7 cm) facing the monkey at eye level allowed, when opened, the presentation of the objects, one at a time. The objects were a small sphere (diameter 1 cm), a large cube (side 2 cm) and a horizontally oriented cylinder (length 4 cm, diameter 1.5 cm) and were chosen so as to elicit three different types of grip, i.e., precision grip, whole hand prehension and finger prehension, respectively. Two laser spots (instructing cues) of different colors (green and red) could be projected onto the box door or onto the object, depending on the task phase.

Visuo-Motor task
The Visuo-Motor task is the same as that described in Simone et al. (2015). Briefly, the task consisted of two basic conditions: Action and Inaction (Fig. 1). Each trial started with the monkeys' hand on the starting position. Then, one of the two instructing cues (green = Action condition; red = Inaction condition) turned on and was projected onto the closed box door. In both conditions, the monkeys had to maintain fixation within a 6° × 6° fixation window centered on the instructing cue for a randomized time interval (500-1100 ms). Then, the box door opened allowing the monkeys to see one of the three objects.
In the Action condition, during object presentation, the monkeys had to maintain fixation with the green cue still on, projected onto the object. After a randomized time (700-1100 ms), the green cue turned off (Go signal), instructing the monkeys to reach for, grasp the object and pull it. Note that after the Go signal the monkey was not overtly required to keep fixation, but both monkeys maintained fixation until reward delivery.
In the Inaction condition, the monkeys were instructed by a red cue. The condition unfolding was the same as in the Action condition till the red cue turned off, after which the monkeys were required to keep fixating for further 600 ms, refraining from acting.
The order of presentation of both objects and conditions was randomized.
If the monkeys correctly performed a trial, a drop of liquid reward was delivered at the end of it. A trial was discarded when one of the following types of error occurred: 1) releasing the hand from the resting position before reward delivery in the Inaction condition or before the Go signal in the Action condition; 2) breaking fixation before reward delivery in the Inaction condition or before the Go signal in the Action condition; 3) failing to reach and grasp the object; 4) grasping the object with an incorrect prehension. Discarded trials were repeated at the end of the sequence to collect at least 30 correct trials per condition (10 trials × 3 objects).

Recording techniques and task events acquisition
Neuronal recordings were performed by means of a multi-electrode recording system (AlphaLab Pro, Alpha Omega Engineering, Nazareth, Israel) employing glass-coated microelectrodes (impedance, 0.5-1 MΩ) inserted through the intact dura. The microelectrodes were mounted on an electrode holder (MT, Microdriving Terminal, Alpha Omega) allowing electrodes displacement, controlled by a dedicated software (EPS; Alpha Omega). The MT holder was directly mounted on the recording chamber. Neuronal activity was filtered, amplified, and monitored with a multichannel processor and sorted using a multi-spike detector (MCP Plus 8 and ASD, Alpha Omega Engineering). Spike sorting was performed using the Off-line Sorter (Plexon, Inc, Dallas TX, USA).
The experiment was controlled by a homemade Labview software. Digital output signals determined the onset and offset of laser spots, opening of the door and reward release, and contact-detecting electric circuits provided the digital signals related to monkey hand contact and release of the resting position and the beginning and end of object pulling. Eye movements were recorded using an infrared pupil/corneal reflection tracking system (Iscan Inc., Cambridge, MA, USA) positioned above the box. Sampling rate was 120 Hz. The voltage related to the X and Y coordinates of eye position was fed as analog input to the system for recording neural signal and in parallel to the computer controlling the behavioral paradigm allowing the online control of the task (see above) and offline analyses of the oculomotor behavior.

Neuronal analysis
The digital signals were also employed to align neuronal activity and to create the response histograms and the data files for statistical analysis.
The neural activity was recorded for at least 60 successful trials (30 per condition, 10 for each object). For statistical analysis of the neural activity, nine epochs were defined (see also Simone et al., 2015), based on the digital signals (Fig. 1), as follows: 1) Baseline: from 750 ms to 250 ms before the onset of the instructing cue; 2) Pre-cue: 250 ms preceding the onset of the instructing cue; 3) Cue: 250 ms following the onset of the instructing cue; 4) Pre-presentation: 500 ms preceding the opening of the box door; 5) Presentation: 500 ms following door opening (object presentation); 6) Set: 250 ms before the offset of the instructing cue; 7) Go/NoGo: from the offset of the instructing cue to the release of the hand starting position (Action condition) or 250 ms following the offset of the instructing cue (Inaction condition); 8) Grasping/Fixation: from 250 ms before to 250 ms after the Pulling onset (Action condition) or a time period ranging from 250 ms to 500 ms after the offset of the instructing cue (Inaction condition); 9) Reward: 500 ms following reward delivery.
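The fixed-duration epochs listed above can be turned into per-trial firing rates with a few lines of code. The following is a minimal sketch in Python, not the analysis code used in the study: the function name, the event keys (`cue_on`, `door_open`, `cue_off`) and the example spike times are hypothetical, and only epochs with a fixed window around a single event are shown.

```python
# Illustrative sketch only: names and example values are assumptions,
# not taken from the original analysis pipeline.

# Each epoch: (reference event, window start, window end), times in seconds.
EPOCHS = {
    "Baseline":     ("cue_on",    -0.750, -0.250),
    "Pre-cue":      ("cue_on",    -0.250,  0.000),
    "Cue":          ("cue_on",     0.000,  0.250),
    "Presentation": ("door_open",  0.000,  0.500),
    "Set":          ("cue_off",   -0.250,  0.000),
}

def epoch_rates(spike_times, events):
    """Return the firing rate (spikes/s) of one trial in each fixed-duration epoch."""
    rates = {}
    for name, (event, start, end) in EPOCHS.items():
        t0 = events[event] + start
        t1 = events[event] + end
        n_spikes = sum(1 for t in spike_times if t0 <= t < t1)
        rates[name] = n_spikes / (end - start)
    return rates

# Hypothetical trial: event times and spike times in seconds.
events = {"cue_on": 1.0, "door_open": 1.8, "cue_off": 2.7}
spikes = [0.30, 0.60, 1.05, 1.10, 1.20, 1.85, 1.90, 2.00, 2.20, 2.60]
rates = epoch_rates(spikes, events)
```

Epochs whose boundaries depend on two events (Go/NoGo in the Action condition, for example) would need a start and an end event instead of a fixed window; that extension is left out here to keep the sketch short.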
Single-neuron responses were statistically evaluated by means of a 9 × 2 ANOVA (Factors: Epoch, Condition, p < 0.01) followed by Newman-Keuls post hoc tests. Since trials were randomized, changes of the baseline activity across trials were not expected, and the neurons showing a significant difference between baselines were discarded. Accordingly, neurons were included in our dataset and were defined as task related when the 9 × 2 ANOVA revealed at least one of the two following effects: 1) a significant main effect Epoch (p < 0.01) with the relative post-hoc test showing a significant difference between the activity recorded in the Baseline epoch and in at least one of the other epochs (Condition-independent neurons); 2) an Interaction effect (Condition × Epoch, p < 0.01), with the subsequent post-hoc test showing a significant difference between at least one epoch of one condition and both its baseline and the corresponding epoch of the other condition (Condition-dependent neurons). Considering that the epochs of Pre-cue and Reward fall in the inter-trial period, when eye movements are not controlled, we decided to consider, for our analysis, the remaining six epochs plus the Baseline.
In order to test Condition and Object selectivity in the neurons active during object presentation, all the neurons with significant responses in the Presentation epoch were further analyzed with a 3 × 2 ANOVA (factors: Object and Condition, p < 0.01) followed by Newman-Keuls post hoc test. Neurons were considered object selective when the analysis revealed a significant main effect Object (p < 0.01) and the relative post-hoc test showed a significant difference between at least two objects.

Population analysis
To characterize the time course and the discharge rate of different neuronal populations with respect to the main task phases, the neuronal activity of each population was aligned with the main behavioral events. For each of these analyses we considered the neurons responding with an increase in discharge (excited) in a specific epoch. The population activity was computed as follows. The mean single neuron activity over trials, in term of firing rate, was calculated for each 20 ms bin in the two conditions. The average baseline activity was then subtracted from the mean single neuron activity over trials for each bin. Thus, in this analysis, 0 represents baseline activity. The net average discharge frequency of each neuron was used for subsequent statistical analysis. Each neuron contributed one entry to each data set. The statistical design adopted was the same as that employed for single neuron activity (see above). Statistical analysis was performed with a significance criterion of p < 0.01.
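The baseline subtraction and averaging described above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the original code: the function names, the toy spike counts and the choice of which bins count as baseline are hypothetical.

```python
# Illustrative sketch: names and toy data are assumptions.

def net_activity(trial_counts, bin_s, baseline_bins):
    """Mean firing rate per bin over trials, minus the mean baseline rate.

    trial_counts: list of trials, each a list of spike counts per bin.
    bin_s: bin width in seconds (0.020 for 20 ms bins).
    baseline_bins: indices of the bins that belong to the Baseline epoch.
    """
    n_trials = len(trial_counts)
    n_bins = len(trial_counts[0])
    mean_rate = [
        sum(trial[b] for trial in trial_counts) / n_trials / bin_s
        for b in range(n_bins)
    ]
    baseline = sum(mean_rate[b] for b in baseline_bins) / len(baseline_bins)
    return [r - baseline for r in mean_rate]

def population_mean(neurons_net):
    """Average the net activity across neurons, one entry per neuron."""
    n = len(neurons_net)
    return [sum(neuron[b] for neuron in neurons_net) / n
            for b in range(len(neurons_net[0]))]

# Two hypothetical neurons, two trials each, four 20 ms bins;
# the first two bins are treated as baseline.
neuron_a = [[0, 0, 1, 2], [0, 0, 1, 0]]
neuron_b = [[1, 1, 2, 1], [1, 1, 0, 1]]
net_a = net_activity(neuron_a, 0.020, [0, 1])
net_b = net_activity(neuron_b, 0.020, [0, 1])
population = population_mean([net_a, net_b])
```

In this toy case neuron A rises 50 spikes/s above its baseline in the last two bins while neuron B stays at baseline, so the population net activity is 0 during baseline and 25 spikes/s afterwards.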

Demixed principal components analysis
In order to evaluate how the population of task-related neurons encodes the two conditions and the three objects during task unfolding, we adopted a data-simplification method: the demixed principal components analysis (dPCA), using freely available code provided by Kobak and coworkers (Kobak et al., 2016; http://github.com/machenslab/dPCA).
Besides reducing the dimensionality of the dataset, dPCA uses information related to specific task factors (Condition and Object) to calculate the percentage of variance explained by the identified factors of the task. In addition, this analysis allows the identification of components unrelated to the chosen factors, reflecting the dynamic changes of the population activity in time, which are similar for all factors. The toolbox uses a linear classifier (stratified Monte Carlo leave-group-out cross validation) to evaluate at which time points the given task elements belonging to a factor (i.e. Action vs. Inaction; Sphere vs. Cylinder vs. Cube) are significantly different from each other (see below).
First, since in our paradigm time intervals were randomized, we aligned the neural activity with different events and defined the following time periods: 1) Baseline: 750-250 ms before cue onset; 2) Cue: 0-500 ms following cue onset; 3) Presentation: 0-500 ms following object presentation; 4) Decision: −200 to +200 ms around the Go/NoGo signal; 5) Behavioral response: −300 to +100 ms around the beginning of object holding in the Action condition and 200-600 ms after the NoGo signal in the Inaction condition. Subsequently we joined these time periods to create a matrix of the same time length for each trial.
Then, we classified the activity of task related neurons according to the six possible types of trial (i.e., the combination of 2 conditions and 3 objects), and calculated trial-by-trial the averaged 20-ms bins firing rate. The result was smoothed with a Gaussian-weighted moving average filter with a window of six 20 ms bins.
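A Gaussian-weighted moving average over a six-bin window can be sketched as below. This is a hedged illustration rather than the filter actually used: the standard deviation (one fifth of the window), the placement of the even-length window (three bins before and two after the current bin) and the renormalization at the edges are all assumptions of this example.

```python
import math

def gaussian_smooth(values, window=6, sigma=None):
    """Gaussian-weighted moving average over `window` bins.

    Assumptions of this sketch: sigma defaults to window / 5, the window
    covers offsets -window//2 .. window - window//2 - 1, and weights are
    renormalized where the window is truncated at the edges.
    """
    if sigma is None:
        sigma = window / 5.0
    half = window // 2
    offsets = range(-half, window - half)
    smoothed = []
    for i in range(len(values)):
        weighted_sum = 0.0
        weight_total = 0.0
        for k in offsets:
            j = i + k
            if 0 <= j < len(values):
                w = math.exp(-0.5 * (k / sigma) ** 2)
                weighted_sum += w * values[j]
                weight_total += w
        smoothed.append(weighted_sum / weight_total)
    return smoothed

# Hypothetical firing-rate trace in 20 ms bins: a step from 0 to 10 spikes/s.
step = [0.0] * 10 + [10.0] * 10
smoothed_step = gaussian_smooth(step)
flat = gaussian_smooth([5.0] * 10)
```

Because the weights are normalized, a constant trace is left unchanged and a step is turned into a gradual ramp bounded by the original values.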
Starting from this dataset, we calculated the first 30 principal components. The number of repetitions used for optimal lambda calculation was 10, the number of iterations for cross-validation was 100, and the number of shuffles used to compute the Monte Carlo chance distribution was set to 100.
Finally, we plotted the time course of the two largest demixed principal components for which variance was mainly attributable to the Condition and the Object factors as well as to other possible factors unrelated to them. We considered a statistical separation of the curves when the actual classification accuracy exceeded all 100 shuffled decoding accuracies in at least 10 consecutive time bins.
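The separation criterion stated above (actual accuracy above all shuffled accuracies for at least 10 consecutive bins) can be written compactly. The sketch below is an illustration in Python with hypothetical names and toy numbers; it is not part of the dPCA toolbox.

```python
def significant_bins(actual, shuffled, min_run=10):
    """Flag time bins that belong to a run of at least `min_run` consecutive
    bins in which the actual accuracy exceeds every shuffled accuracy.

    actual: list of accuracies, one per time bin.
    shuffled: list of shuffles, each a list of accuracies per time bin.
    """
    exceeds = [all(a > s[t] for s in shuffled) for t, a in enumerate(actual)]
    significant = [False] * len(actual)
    run_start = None
    # A trailing False closes a run that reaches the last bin.
    for t, flag in enumerate(exceeds + [False]):
        if flag and run_start is None:
            run_start = t
        elif not flag and run_start is not None:
            if t - run_start >= min_run:
                for i in range(run_start, t):
                    significant[i] = True
            run_start = None
    return significant

# Toy example: 12 bins above chance, one dip, then a run of only 5 bins.
actual = [0.9] * 12 + [0.4] + [0.9] * 5
shuffled = [[0.5] * 18, [0.6] * 18]
flags = significant_bins(actual, shuffled)
```

Here only the first run is long enough, so the first 12 bins are flagged and the final 5-bin run is not.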

Decoding analysis
In order to estimate which type of information is coded by the different neuronal populations considered in this work and how this information is encoded in dynamic patterns of activity, we adopted a population decoding approach according to the methodology described by Meyers and coworkers (Meyers, 2013; http://www.readout.info/; see also Loriette et al., 2022).
First, for each neuron, the activity was aligned with the main behavioral events as described in the dPCA section and binned as follows: for each trial we calculated the average firing rate in bins of 60 ms, sampled at 20 ms intervals. Thus, each trial expressed in bins was defined as data point. We concatenated data points of each neuron to obtain a population of binned data characterized by a number of data points corresponding to the number of trials per decoding factor (30 data points x 2 conditions for condition decoding; 20 data points x 3 objects for object decoding). Labels were then assigned to each data point to identify the corresponding factors to be analyzed (Condition or Object) in an N-dimensional space (where N is the total number of neurons considered for each analysis). Next, we randomly grouped all the available data points for each neuron into k non-overlapping splits, where k is the number of data points per decoding factor (30 in the condition decoding, 20 in the object decoding). A split contained a number of data points corresponding to 2 (number of Conditions) in the Condition decoding or 3 (number of Objects) in the Object decoding, for each neuron used for the analysis. Note that each split contained a "pseudopopulation", namely, a population of neurons that were possibly recorded separately but treated as if they were recorded simultaneously. Then, we performed a cross-validation procedure consisting in training a Poisson naïve Bayes classifier on k-1 splits and testing it on the remaining one. This procedure was repeated k times, leaving out a different split each time. Finally, to increase the robustness of the results, the whole decoding procedure was repeated 50 times using different data points in the training and test splits, and the decoding accuracy from all these repetitions was averaged.
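The cross-validation scheme described above can be illustrated with a small, self-contained Poisson naive Bayes decoder. This is a sketch in plain Python under stated assumptions, not the Neural Decoding Toolbox implementation: the function names, the rate floor of 1e-6 and the synthetic spike counts are hypothetical, and trials are shuffled per class rather than per neuron for brevity.

```python
import math
import random

def fit_poisson_nb(train):
    """Estimate one Poisson rate per neuron and class (mean spike count)."""
    model = {}
    for label, vectors in train.items():
        n = len(vectors)
        # A small floor avoids log(0) for silent neurons (assumption of this sketch).
        model[label] = [max(sum(col) / n, 1e-6) for col in zip(*vectors)]
    return model

def predict(model, x):
    """Return the class with the highest Poisson log-likelihood for counts x."""
    def loglik(rates):
        # The log(x!) term is the same for every class and is omitted.
        return sum(xi * math.log(r) - r for xi, r in zip(x, rates))
    return max(model, key=lambda label: loglik(model[label]))

def cv_accuracy(data, rng):
    """Leave-one-split-out cross-validation; split i holds one trial per class."""
    k = len(next(iter(data.values())))
    order = {label: rng.sample(range(k), k) for label in data}
    correct = 0
    for held_out in range(k):
        train = {
            label: [data[label][i]
                    for pos, i in enumerate(order[label]) if pos != held_out]
            for label in data
        }
        model = fit_poisson_nb(train)
        for label in data:
            x = data[label][order[label][held_out]]
            correct += int(predict(model, x) == label)
    return correct / (k * len(data))

def decode(data, repeats=50, seed=0):
    """Average the cross-validated accuracy over repeated random splits."""
    rng = random.Random(seed)
    return sum(cv_accuracy(data, rng) for _ in range(repeats)) / repeats

# Synthetic pseudopopulation of 2 neurons, 30 trials per condition (toy counts).
data = {
    "Action":   [[8 + i % 3, 1 + i % 2] for i in range(30)],
    "Inaction": [[1 + i % 2, 8 + i % 3] for i in range(30)],
}
accuracy = decode(data)
```

With these well-separated toy counts every held-out trial is classified correctly; real data would of course give lower and time-varying accuracies.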
In order to evaluate the dynamics of the temporal evolution of information coding, we applied a temporal-cross decoding analysis, which consists in training the classifier at time t and testing it at all the other time bins.
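Temporal-cross decoding amounts to filling a matrix of accuracies indexed by training bin and testing bin. The following Python sketch shows the idea with a compact Poisson naive Bayes classifier and a leave-one-trial-out scheme; the data layout, function names and toy counts are assumptions made for illustration only.

```python
import math

def fit(train):
    """Mean spike count per neuron and class, floored to avoid log(0)."""
    return {lab: [max(sum(col) / len(vs), 1e-6) for col in zip(*vs)]
            for lab, vs in train.items()}

def predict(model, x):
    """Class with the highest Poisson log-likelihood (log(x!) omitted)."""
    def score(rates):
        return sum(xi * math.log(r) - r for xi, r in zip(x, rates))
    return max(model, key=lambda lab: score(model[lab]))

def cross_temporal_accuracy(data):
    """data[label][trial][time_bin] is a list of spike counts per neuron.

    Returns acc[t_train][t_test]: fraction of held-out trials classified
    correctly when training at t_train and testing at t_test.
    """
    labels = list(data)
    k = len(data[labels[0]])
    n_bins = len(data[labels[0]][0])
    acc = [[0.0] * n_bins for _ in range(n_bins)]
    for held_out in range(k):
        for t_train in range(n_bins):
            train = {lab: [data[lab][i][t_train]
                           for i in range(k) if i != held_out]
                     for lab in labels}
            model = fit(train)
            for t_test in range(n_bins):
                for lab in labels:
                    if predict(model, data[lab][held_out][t_test]) == lab:
                        acc[t_train][t_test] += 1
    total = k * len(labels)
    return [[v / total for v in row] for row in acc]

# Toy data, 2 neurons, 4 trials, 4 time bins:
# bin 0 carries no condition information, bins 1 and 2 share the same code,
# bin 3 uses the reversed code.
action_trial = [[5, 5], [9, 1], [9, 1], [1, 9]]
inaction_trial = [[5, 5], [1, 9], [1, 9], [9, 1]]
data = {"Action": [action_trial] * 4, "Inaction": [inaction_trial] * 4}
acc = cross_temporal_accuracy(data)
```

High off-diagonal values (here between bins 1 and 2) indicate a code that is shared across periods, whereas a classifier trained on bin 1 fails on bin 3, where the code is reversed.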
The data alignment on task events and the binning procedure described above led to merge in the same bin the activity at the border between two subsequent periods of the task (bins of 60 ms, sampled at 20 ms intervals). Accordingly, in our analysis, we removed the last two bins of each task period considered in the analysis.
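The reason for dropping two bins per period follows from the arithmetic of the sliding window: a 60 ms window stepped by 20 ms spans three 20 ms steps, so the last two windows that start inside a period extend into the next one. The sketch below illustrates this in Python; the function names and the period lengths (taken from the periods defined for the dPCA alignment, expressed in 20 ms steps) are assumptions of this example.

```python
def sliding_rates(counts_20ms, width=3, bin_s=0.020):
    """Firing rate in windows of `width` consecutive 20 ms steps (60 ms),
    advanced one step (20 ms) at a time."""
    return [sum(counts_20ms[i:i + width]) / (width * bin_s)
            for i in range(len(counts_20ms) - width + 1)]

def within_period_windows(period_lengths, width=3):
    """Start indices of windows that lie entirely inside one task period.

    period_lengths: number of 20 ms steps in each concatenated period.
    The last (width - 1) windows starting in each period are excluded.
    """
    keep = []
    start = 0
    for n in period_lengths:
        keep.extend(range(start, start + n - width + 1))
        start += n
    return keep

# Baseline, Cue and Presentation last 500 ms (25 steps); Decision and
# Behavioral response last 400 ms (20 steps).
periods = [25, 25, 25, 20, 20]
counts = [1] * sum(periods)          # toy data: one spike per 20 ms step
rates = sliding_rates(counts)
keep = within_period_windows(periods)
clean_rates = [rates[i] for i in keep]
```

In this example 113 sliding windows fit in the concatenated trial and 105 remain after discarding those that straddle a border between periods.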

Anatomical reconstruction of the neuronal properties
The recording region was reconstructed based on the location of the penetrations in stereotaxic coordinates plotted onto the MRI scans of the brain of each investigated monkey (see Fig. 2). Penetration depth, as reported by the protocol, was matched with its location with respect to the sulci.

Properties of task-related neurons
We recorded neural activity from the left VLPF of two monkeys during the execution of the Visuo-Motor task. The recorded sector covers a large cortical region including most of VLPF, excluding its rostralmost sector, and slightly extending in the dorsal prefrontal cortex (Fig. 2 A).
We recorded 2929 neurons (M1: 1462; M2: 1467), of which 1382 (M1: 596; M2: 786) were classified as task-related based on the criteria described in the Methods section. Of them, 672 had a significant response with respect to the baseline in only one epoch, while the remaining had a significant response in two or more epochs. Fig. 2 B shows the number of neural responses recorded in each of the considered epochs. The most represented epoch is that of Presentation followed by Grasping/Fixation, Go/NoGo and Cue. Out of all task-related neurons, 430 (31%; M1: 166; M2: 264) showed a significant differential discharge between Action and Inaction conditions in at least one epoch (Condition-dependent neurons; Interaction effect, followed by Newman-Keuls post-hoc test, p < 0.01), while 952 (69%; M1: 430; M2: 522) did not show any significant difference between conditions (Condition-independent neurons). Fig. 2 C depicts for each epoch the percentage of Condition-dependent neurons preferring the Action (green) or Inaction (red) condition. It is clear that, while in the Cue epoch there is a much higher percentage of neurons preferring the Inaction condition, this difference reverses from Presentation epoch onward.

Fig. 2. A Reconstructions of the left hemisphere of the two monkeys (M1 and M2), showing the recorded region (shaded area). IA: inferior arcuate sulcus; L: lateral fissure; P: principal sulcus; SA: superior arcuate sulcus; ST: superior temporal sulcus. B Distribution of neuronal responses in the different task epochs. For each epoch the grey bar indicates the number of Condition-independent neurons, the green bar that of Action-related neurons, and the magenta bar that of Inaction-related neurons. C Percentage of Condition-dependent neurons preferring the Action or Inaction condition in the different epochs. For each epoch, the ratio, expressed as percentage, is calculated as the number of neurons with preferential response in each condition divided by the total number of neurons showing an Interaction effect in that epoch.

Condition-dependency of task related neurons
We carried out a demixed principal component analysis (dPCA, see Methods) on the task-related neurons to evaluate how this population encodes the two conditions and the three objects during task unfolding. The analysis was conducted on 1350 neurons, after the exclusion of 32 units for which some digital event was missing due to technical problems that occurred during recordings. Fig. 3 A shows the distribution of variance among factors. Most of the variance of the activity (69%) is factor independent. For the rest, a larger part of the variance (18%) is captured by the Condition factor, a smaller one (9%) by the Object factor and the remaining (4%) depends on the interaction between the two factors. Fig. 3 B depicts the first two demixed principal components for which variance is mainly attributable to Factor-independent parameters (upper row), Condition factor (middle row), and Object factor (lower row). The trajectories of the first two principal components independent of the two considered factors show two peaks on cue and presentation onset. Note that component 2 remains modulated after object presentation in the late phase of the task, during the Decision and Behavioral response periods (see Methods for the definition of the temporal periods). In both the components related to the Condition factor the firing rate in the Action and Inaction conditions starts to differ just after cue onset. In component 4 the difference between conditions gradually increases until the end of the task, reaching its maximum in the Behavioral response period, while in component 7 it reverts a first time just before object presentation and a second time in the Behavioral response period. In both the components related to the Object factor, the firing rate difference among objects emerges after object presentation and remains significant till the end of the task.
Fig. 3 C shows the net mean activity relative to the two conditions of the population of task-related neurons showing an increase in discharge (excited) in at least one of the considered statistical epochs (n = 978). In order to evaluate differences in the neuronal population discharge in different epochs and conditions we carried out a 9 × 2 ANOVA on the same factors used for single neuron analysis (see Methods). From the figure, it is evident that there are three main task phases that elicit a strong neural discharge. The first corresponds to cue onset, in which the peak during the Inaction condition is higher than that of the Action condition (9 × 2 ANOVA, Interaction effect p < 0.01); the second peak, the highest, occurs after object appearance, the activity being stronger in the Action condition. The third corresponds to the period going from the Go/NoGo signal to the beginning of object pulling, in which the activity is stronger in the Action condition. Note that the difference in activity between the two conditions abruptly ceases on object pulling.

Coding of the instructing cue
Four hundred fifty-one neurons (M1: 218; M2: 233) had a significant response in the Cue epoch (see Methods). Of them 359 (80%) were Condition-independent, while 92 (20.4%) were Condition-dependent, showing a preference for the Inaction (68) or the Action (24) condition. Fig. 4 A shows the distribution of condition preference expressed in the different epochs by all neurons responding in the Cue epoch. Note that, while Cue related neurons preferring the Action condition show a quite balanced condition preference in the subsequent Presentation Epoch, those preferring the Inaction condition tend to lose their Inaction preference, some of them developing a preference for the Action condition (see Supplementary Table 1 for details). Fig. 4 B-D shows examples of neurons responding to the appearance of the instructing cue. The neuron depicted in Fig. 4 B responds equally well to the two cues, those in C and D have a clear preference for the green or red cue, respectively. Fig. 5 A-D shows the mean net activity of the populations of neurons responding with an increase in discharge (excited) to cue appearance. The four graphs correspond to the response of whole population of neurons (n = 402), the Condition-independent neurons (n = 310), the Condition-dependent neurons preferring the Action (n = 24) and Inaction (n = 68) condition, respectively.
The whole population of neurons (A) has a strong activity in the Cue and Presentation epochs in both conditions. The mean Inaction-related activity in the Cue epoch is significantly higher than the Action-related one, and this preference reverses in the Presentation epoch (9 × 2 ANOVA, Interaction effect, p < 0.01, see Methods). It is also evident that the population shows, only in the Action condition, a prolonged response above baseline from the Go signal till the beginning of holding, after which the response has an abrupt decrease. A similar profile is also visible in the Condition-independent population (B). The two populations of Condition-dependent neurons show different profiles. In fact, the Action-related neuronal population (C) shows a significant preference for the Action condition in the Cue epoch, which tends to remain, though not statistically significant, in the Presentation epoch. Differently, the population of Inaction-related neurons (D) shows a preference for the Inaction condition in the Cue epoch which reverses in the Presentation epoch. The preference for the Action condition in the Presentation epoch in both Condition-dependent populations does not completely account for that observed in the whole population of cue-related neurons, since it is also present in the population of Condition-independent neurons. Fig. 5 A'-D' shows the accuracy level of the decoding of the Condition factor when the classifier was trained and tested on different time periods (temporal-cross decoding plots, see Methods) for the four Cue-related populations. In all populations, the highest decoding accuracies occur along the diagonal, with the lowest level of accuracy shown by the population of Condition-independent neurons (n = 358). In addition, along the diagonal there is a clear decrease in accuracy mainly around the Decision period in all populations, especially evident in that of Action-related neurons (n = 24).
Moreover, in this latter population of neurons, the decoding performance is also high when training on data from the Cue period and testing on data from the Behavioral response one and vice versa. This result might indicate that there is a common pattern of activity encoding the Condition factor between the Cue and the Behavioral response periods. Finally, in the population of Inaction-related neurons (n = 67), as well as in the whole population (n = 443), the decoding performance is quite high when training on data from the Decision period and testing on data from the Cue period, while it is weaker in the opposite direction.

Neural responses to object presentation
Condition dependency. About half of task-related neurons showed a significant response in the Presentation epoch (678, 49%; M1: 253; M2: 425). Of them, 551 (81.2%) were Condition-independent, while 127 (18.7%) were Condition-dependent, 38 preferring the Inaction and 89 the Action condition. Fig. 6 A shows the distribution of condition preference expressed in the different epochs by all neurons responding in the Presentation epoch. In all populations the response to object presentation is the highest among all epochs. Note that in the whole population of neurons (n = 530) as well as in that of Condition-independent neurons (n = 407) the response to object presentation is significantly higher in the Action than in the Inaction condition (9 × 2 ANOVA, Interaction effect, p < 0.01, see Methods). A second, although smaller, response present in all populations occurs in the Cue epoch. A third response, present in all populations but that of Inaction-related neurons, occurs only in the Action condition and starts after the Go signal, abruptly ending after the beginning of object holding. The two populations of Condition-dependent neurons (Action: n = 88; Inaction: n = 38) show markedly different profiles. In fact, while in Action-related neurons there is a significant preference for the Action condition in the Presentation, Go/NoGo and Grasping/Fixation epochs, in the Inaction-related neurons there is a preference for the Inaction condition only in the Presentation epoch.

Fig. 3. A Pie chart shows how the total signal variance is split between the factors indicated on the right. B Each panel depicts the time course of the projections of the two largest demixed principal components that can be attributed to the Factor-independent (first row), Condition (second row) and Object (third row) factors. In each panel the dashed vertical lines indicate the beginning of the considered time periods (see Methods). Horizontal thick lines below the trajectories indicate the time intervals where task factors (Condition and Object) are reliably decoded (see Methods). C Temporal profile of the discharge of the neuronal population. In the upper part of each panel, the magenta and green curves indicate the population mean net activity in the Inaction and Action condition, respectively, of task-related neurons showing a significant increase of discharge in at least one epoch. The shaded area around each curve represents standard errors. In the lower part of each panel, the magenta curve represents the differential activity (Action minus Inaction), and the blue bars represent three standard errors. The neuronal activity is aligned on the main task events (vertical dashed lines), that are used for the identification of statistical epochs (magenta and green bars). Abscissae: time. Ordinates: mean net activity.

Fig. 5. A, B, C, D Temporal profile of the mean net and differential activity of the populations of neurons responding with an increase in discharge to cue appearance. A Whole population; B Condition-independent neurons; C Action-related neurons; D Inaction-related neurons. Alignments and conventions as in Fig. 3 C. A', B', C', D' Temporal-cross decoding analysis of the Condition factor (Action and Inaction) in the populations of neurons responding to cue appearance. A' Whole population; B' Condition-independent neurons; C' Action-related neurons; D' Inaction-related neurons. The decoding accuracy (color-coded) is computed in bins of 60 ms, sampled at 20 ms intervals. For each plot, the vertical and horizontal lines indicate the beginning of the considered time period (see Methods). Decoding periods of testing and training are indicated on the X and Y axes, respectively.

Fig. 6. A […] showing Condition dependency in the Presentation epoch, having a preference for the Action or Inaction condition in the different task epochs. B Example of neuron showing a similar discharge in the two conditions. C Example of neuron responding to object presentation exclusively in the Action condition. D Example of neuron discharging for object presentation only in the Inaction condition. Rasters and histograms are aligned (vertical dashed line) with the onset of object presentation. Green circles: Action cue appearance/Go signal; Magenta circles: Inaction cue appearance/NoGo signal; Blue triangles: release of the hand starting position (Action condition); Cyan diamonds: beginning of object pulling; Ocher squares: reward delivery. Other conventions as in Fig. 4.

Fig. 7 A'-D' shows the accuracy level of the decoding of the Condition factor when the classifier was trained and tested on different time periods (temporal-cross decoding plots, see Methods) for four Presentation-related populations, presented in the same order as that of Fig. 5. In all populations, the highest decoding accuracies occur along the diagonal, with the lowest level of accuracy shown by the population of Condition-independent neurons. A clear decrease in accuracy is present along the diagonal, mainly in the Decision period in the populations of Condition-independent (n = 551) and in that of Inaction-related neurons (n = 38). In this latter population the decrease in accuracy continues also in the Behavioral response period. In addition, in this population of neurons, the decoding performance is high also when training on data from the Cue period and testing on data from the Behavioral response one and vice versa.
Concerning the population of Action-related neurons (n = 88), this is characterized by a high decoding accuracy when the classifier was trained on data from the initial and late phase of the Presentation period and tested on the Decision and Behavioral response period. The decoding performance is high also when training on data from the Decision period and testing on data from the Behavioral response one and vice versa. This suggests that in these two periods there is a common pattern of activity encoding the Condition factor. This static pattern also extends to specific phases of the Presentation and Cue periods, although the decoding performance is oscillating in terms of accuracy. The general decoding performance observed in the whole population (n = 673) is very similar to that of Action-related neurons.
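The logic of the temporal-cross decoding described above can be illustrated with a minimal Python sketch. This is not the analysis code used in the study: the 60 ms bins sampled every 20 ms follow the text, whereas the nearest-centroid classifier, the function names (sliding_bins, bin_counts, cross_temporal_accuracy), the data layout and the toy firing rates are assumptions introduced only for the example; the actual classifier is the one specified in the Methods.

```python
from statistics import mean

def sliding_bins(t_start, t_stop, width=60, step=20):
    """Windows (start, stop) in ms: `width`-ms bins sampled every `step` ms."""
    return [(t, t + width) for t in range(t_start, t_stop - width + 1, step)]

def bin_counts(spike_times, bins):
    """Spike counts of one neuron on one trial in each (overlapping) window."""
    return [sum(1 for s in spike_times if lo <= s < hi) for lo, hi in bins]

def centroid(vectors):
    """Mean population vector across trials (one value per neuron)."""
    return [mean(col) for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two population vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cross_temporal_accuracy(train, test):
    """Train a nearest-centroid classifier (an assumption) on bin i, test on bin j.

    train, test: dict mapping condition -> list of trials; each trial is a
    list over time bins of population vectors. Returns acc[i][j], where
    i is the training bin and j is the testing bin.
    """
    conds = sorted(train)
    n_bins = len(train[conds[0]][0])
    acc = []
    for i in range(n_bins):
        cents = {c: centroid([trial[i] for trial in train[c]]) for c in conds}
        row = []
        for j in range(n_bins):
            hits = total = 0
            for c in conds:
                for trial in test[c]:
                    pred = min(conds, key=lambda k: sq_dist(trial[j], cents[k]))
                    hits += int(pred == c)
                    total += 1
            row.append(hits / total)
        acc.append(row)
    return acc

# Toy data (hypothetical): 2 neurons, 3 time bins; bin 1 carries no condition signal.
train = {
    "Action":   [[[5, 1], [2, 2], [5, 1]], [[6, 1], [2, 2], [6, 0]]],
    "Inaction": [[[1, 5], [2, 2], [1, 5]], [[0, 6], [2, 2], [1, 6]]],
}
test = {
    "Action":   [[[5, 2], [2, 2], [4, 1]]],
    "Inaction": [[[1, 4], [2, 2], [2, 5]]],
}
acc = cross_temporal_accuracy(train, test)
```

In this toy example the off-diagonal entries acc[0][2] and acc[2][0] equal 1.0, which corresponds to what the text calls a static pattern shared between two periods, whereas every entry involving the uninformative middle bin stays at chance (0.5).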
Object preference. In order to assess whether neurons responding to object presentation had some type of object and/or condition preference, we ran a 3 × 2 ANOVA (factors: Object and Condition, p < 0.01; see Methods). This analysis revealed that 27% of presentation neurons respond differentially in the two conditions, while a smaller percentage (11%) shows some type of object selectivity. This selectivity is almost equally distributed among the three objects (33, sphere; 23, cube; 21, cylinder). Fig. 8 A shows an example of a neuron responding to the presentation of the three objects only in the Action condition. Fig. 8 B depicts a neuron preferentially responding to the presentation of the cylinder in both conditions. Only 3% of presentation neurons show a significant Interaction effect (Condition × Object), indicating a high selectivity for a specific object in a given condition, as shown by the neuron in Fig. 8 C, which responds most strongly to the presentation of the cylinder in the Action condition. Fig. 8 D, E show the accuracy level of the decoding of the Object factor when the classifier is trained and tested on different time periods for the populations of neurons responding to object presentation with a differential response for conditions (n = 182) and objects (n = 76). In the population of neurons differentially responding in the two conditions, the decoding accuracy is high only along the diagonal in the second half of the presentation phase. In that of neurons differentially responding to objects, the highest decoding accuracy is present after object presentation, but it is also evident when training on data from the Presentation period and testing on data from the Behavioral response period and vice versa.
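To make the structure of the 3 × 2 (Object × Condition) test concrete, the following Python sketch computes the F statistics of a balanced two-way ANOVA for a single hypothetical neuron. It is only an illustration under stated assumptions: a balanced, between-trial design is assumed, the function name two_way_anova_f and the firing rates are invented for the example, and the exact procedure applied to the recorded neurons is the one given in the Methods. Converting each F value into a p-value to be compared with the 0.01 threshold would additionally require the F distribution (for instance scipy.stats.f.sf), which is omitted here to keep the example dependency-free.

```python
def two_way_anova_f(cells):
    """F statistics for a balanced two-way design.

    cells[i][j] is the list of single-trial responses of one neuron for
    level i of factor A (here Object) and level j of factor B (here Condition).
    Returns ((F_A, F_B, F_AB), (df_A, df_B, df_AB, df_error)).
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    cell_mean = [[sum(c) / n for c in row] for row in cells]
    mean_a = [sum(row) / b for row in cell_mean]
    mean_b = [sum(cell_mean[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(mean_a) / a

    # Sums of squares for the two main effects, the interaction and the error
    ss_a = n * b * sum((m - grand) ** 2 for m in mean_a)
    ss_b = n * a * sum((m - grand) ** 2 for m in mean_b)
    ss_ab = n * sum((cell_mean[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_err = sum((x - cell_mean[i][j]) ** 2
                 for i in range(a) for j in range(b) for x in cells[i][j])

    df_a, df_b, df_ab, df_err = a - 1, b - 1, (a - 1) * (b - 1), a * b * (n - 1)
    ms_err = ss_err / df_err  # assumes some trial-to-trial variability
    f = (ss_a / df_a / ms_err, ss_b / df_b / ms_err, ss_ab / df_ab / ms_err)
    return f, (df_a, df_b, df_ab, df_err)

# Hypothetical neuron: 3 objects x 2 conditions (Action, Inaction), 2 trials per cell.
rates = [[[9, 11], [4, 6]],   # sphere
         [[9, 11], [4, 6]],   # cube
         [[9, 11], [4, 6]]]   # cylinder
(f_object, f_condition, f_interaction), dfs = two_way_anova_f(rates)
```

For these invented rates only the Condition factor carries variance (f_condition = 37.5, while f_object and f_interaction are 0), which mirrors the case of a neuron responding to all three objects only in the Action condition.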

Neuronal activity during the behavioral response phase
Since the pattern of activity of the population of task-related neurons is very similar in the Go/NoGo and Grasping/Fixation epochs (see Fig. 3 C), we decided to analyze the neurons responding during these two epochs together (behavioral response phase). Note also that the large majority (68%) of neurons active in the Go/NoGo and/or Grasping/Fixation epochs actually respond in both epochs.
About half of task-related neurons showed a significant response in the behavioral response phase (710, 51%; M1: 312; M2: 398). Of them, 494 (70%) were Condition-independent, while 216 (30%) were Condition-dependent, 52 preferring the Inaction and 164 the Action condition. Fig. 9 A shows the distribution of condition preference expressed in the different epochs by all neurons responding in the behavioral response phase. Fig. 9 B-D shows examples of neurons responding in the behavioral response phase, with no statistical difference between the two conditions (Fig. 9 B), or with a clear preference for the Action (C) or Inaction (D) condition. Fig. 10 A-D shows the net mean activity of the populations of neurons showing a significant increase in discharge (excited) in the Go/NoGo and/or Grasping/Fixation epochs with respect to the Baseline epoch. The whole population of neurons responding in this phase with an increase in activity (n = 424) shows a strong discharge in the Presentation epoch and in the behavioral response phase, but a significant difference between the two conditions emerges only in the Go/NoGo and/or Grasping/Fixation epochs. In the population of Condition-independent neurons (n = 222) the response to object presentation is the highest, while it decreases in the behavioral response phase, and no significant difference between conditions is present in any epoch (9 × 2 ANOVA, Interaction effect: n.s.). In the population of Action-related neurons (n = 160) the differential response starts growing during the Presentation epoch, further increases in the subsequent epochs, peaking on object pulling, and decreases during object holding. In the population of Inaction-related neurons (n = 44) the discharge increases during the behavioral response phase.
Fig. 10 A'-D' shows the accuracy level of the decoding of the Condition factor when the classifier was trained and tested on different time periods for the four Behavioral response-related populations, presented in the same order as that of Fig. 5. In all populations, the highest decoding accuracies occur along the diagonal, with the lowest level shown by the population of Inaction neurons (n = 52), in which the accuracy is quite low in the Cue and Decision periods. Note that in this population the highest level of accuracy is reached in the last part of the Behavioral response phase, and only weak decoding performance is evident outside the diagonal. Concerning the population of Action-related neurons (n = 158), this is characterized by a high decoding accuracy when the classifier was trained on data from the Presentation period and tested on the Behavioral response phase, and vice versa. This suggests that in these two periods there is a common pattern of activity encoding the Condition factor. In the whole population of neurons (n = 693), the pattern of decoding is very similar to that of Action-related neurons, but the static pattern also partly extends to the Cue period, although in the latter the decoding performance oscillates in terms of accuracy.

Discussion
In this work, we recorded single neuron activity from ventral prefrontal cortex to investigate its role in coding execution or withholding of intentional actions instructed by abstract rules and based on object features. In particular, we focused on the temporal dynamics of functionally identified populations of neurons.
The results of our work show that: a) the main factor influencing neural discharge is the behavioral condition (Action/Inaction), while object coding is a less relevant factor; b) there is a clear preference of the whole neuronal population for the Inaction condition when the instructing cue is presented, while this preference inverts in favor of the Action condition, from object presentation onward; c) the study of the dynamic response of specific populations during the different phases of the task reveals that the same type of neural coding is shared between different epochs.
In order to study typical prefrontal functions such as action selection and inhibition, we chose a Go/NoGo paradigm. However, our Go/NoGo task necessarily implied a difference in complexity between conditions: while in the Inaction condition the choice of the behavioral response relied only on the instruction cue, in the Action condition the type of object and the timing of action initiation were also relevant. Accordingly, the behavioral assessment clearly showed that the percentage of errors was higher in the Action condition. These considerations allow us to better interpret the results, which show a difference in discharge dynamics between the two conditions (see below).

The cue instructing the task rule is prospectively encoded in the format of its associated behavioral outcomes
The majority of cue-responsive neurons are Condition-independent, while about 20% are Condition-dependent. The response of the first neuronal category could be triggered by arousing or attentional factors such as the increase in luminosity or the beginning of the task, which requires the monkey to fixate, to "read" the current instruction, and to prepare for the subsequent phases of the task. Indeed, neural responses related to attentional factors or to the occurrence of relevant task phases have been previously described in prefrontal cortex (di Pellegrino and Wise, 1993; Everling et al., 2002; Ninokura et al., 2003; Shima et al., 2007). Note that the Condition-independent population develops, in the Presentation epoch, a preference for the Action condition, suggesting that part of its neurons achieve a specificity for this condition when additional object-related information is provided.
Most Condition-dependent neurons prefer the Inaction cue, very likely because at this time the associated behavior is already set. In fact, the subsequent object presentation and NoGo signal, although necessary for accomplishing the Inaction condition, are not relevant for the decision of which behavior to perform. On the other hand, the preference of the minority of Condition-dependent neurons for the Action cue could indicate that the monkey is going to actually execute an action and that subsequent events will be relevant for fulfilling the task. A similar neural pattern has already been described and interpreted in terms of rule coding (White and Wise, 1999; Asaad et al., 2000; Hoshi et al., 2000; Wallis et al., 2001; see Miller and Cohen, 2001; Tanji and Hoshi, 2008). Like Condition-independent neurons, some of the Condition-dependent Inaction neurons also show a change in condition preference after object presentation. Similar evidence has been described in a study on prefrontal neurons recorded from the same region (Bruni et al., 2015), showing that the great majority of visually responsive neurons discharged more strongly when the monkey could decide which behavior to perform, thanks to the integration of visual information about the object with a previously presented acoustic cue. Further confirmation of this concept has been provided in a subsequent work (Rozzi et al., 2021), showing that the response to object presentation is modulated by the contextual information (passive observation vs observation to withhold action vs observation to grasp).
A factor possibly contributing to the observed changes in condition preference is reward expectation. Indeed, there is evidence that this factor is relevant in tasks based on associative learning and that the expected rewards modulate information processing (Amiez et al., 2006; Kennerley and Wallis, 2009; Matsumoto et al., 2003; Watanabe, 1996; Watanabe et al., 2007). In our task, the amount of reward is the same, but the condition difficulty is different, thus the same reward could have for the monkey a different value, leading to a different modulation of the neuronal response.
The decoding analysis applied to Action and Inaction populations of cue-responsive neurons allowed us to determine their coding format. In fact, in the Action population, the activity recorded in the Behavioral response period allows an accurate condition decoding in the Cue period, while this phenomenon is slightly weaker in the other direction.
In the Inaction population, decoding accuracy is high when training the classifier on data from the Decision period and testing it on the Cue period, while it is much lower in the opposite direction. Note that the "generalization" found with the decoding analyses between different time periods is typically stronger backward than forward in time also in populations of neurons coding other task epochs. This phenomenon very likely depends on the fact that the final phases of the task contain more condition-related information than the early ones.
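The backward/forward asymmetry discussed above can be quantified from a temporal-cross decoding matrix as the difference between two off-diagonal blocks. The Python sketch below shows one possible way to do so; the index name generalization_asymmetry, the matrix layout acc[train][test] and the accuracy values are hypothetical and are not taken from the data of this study.

```python
def mean_block(acc, train_bins, test_bins):
    """Average accuracy over a block of the matrix acc[train][test]."""
    vals = [acc[i][j] for i in train_bins for j in test_bins]
    return sum(vals) / len(vals)

def generalization_asymmetry(acc, early_bins, late_bins):
    """Backward minus forward generalization between two task periods.

    Backward: classifier trained on the late period and tested on the early one.
    Forward: classifier trained on the early period and tested on the late one.
    A positive value means that generalization is stronger backward in time.
    """
    backward = mean_block(acc, late_bins, early_bins)
    forward = mean_block(acc, early_bins, late_bins)
    return backward - forward

# Hypothetical 2 x 2 accuracy matrix: bin 0 = early (e.g. Cue), bin 1 = late period.
toy_acc = [[0.90, 0.60],
           [0.80, 0.95]]
asym = generalization_asymmetry(toy_acc, early_bins=[0], late_bins=[1])
```

With these invented values the index is positive (0.80 backward against 0.60 forward), which is the direction of the asymmetry described in the text.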
We believe that Cue-related neurons encode the instructions in terms of behavioral outcomes ('pragmatic coding'). For this type of coding, feedback signals about the correctness of the performed behaviors, such as keeping the hand on the start position or object grasping/pulling, are very relevant for the subjective judgement. A further final feedback is represented by reward delivery (see also the last section of the discussion).

Neural response to observed objects is coded in terms of behavior selection and execution
During object presentation, the response in the Action condition prevails over that in the Inaction condition, both in terms of number of neurons and of average population discharge. The presentation of the object, in the Action condition, allows the monkey to progress from the general programming of the behavioral goal (to act) to the specific motor program to be executed.
In the whole population of object presentation neurons, as well as in that of Action-related neurons, the response is stronger in the Action condition not only in the Presentation epoch, but also from the Go signal to object pulling. The persistence of this preference until the end of the action is confirmed by the decoding analysis, showing a high performance when training the classifier on data from the Presentation period and testing it on the Decision and Behavioral response periods and vice versa. Note also that the decoding accuracy, very high in the first and late phases of presentation, falls during its middle phase. This could be due to two possible, not mutually exclusive factors: a) during this phase, some other feature is encoded (see below discussion on object preference); b) the high accuracy in decoding the first, very short phase of presentation is not actually related to presentation per se but is a sort of tail of the pre-presentation activity. We favor the latter interpretation for two main reasons: first, the activity of PF neurons recorded in delay periods preceding an event typically ceases, or peaks and abruptly falls, just after the event occurrence (Funahashi et al., 1989; Watanabe, 1996; Saga et al., 2011; see Funahashi, 2014); second, the first phase of presentation (about 60 ms), characterized by high decoding accuracy, occurs before the population activity reaches its peak (200 ms; for a similar timing see Freedman et al., 2001; Yamagata et al., 2012; Rozzi et al., 2021).
It is well known that VLPF neurons activate during the observation of visual stimuli (Wilson et al., 1993; Ó Scalaidhe et al., 1997; Romanski, 2007; Rozzi et al., 2021). Indeed, our recorded region includes sectors connected with inferotemporal and/or parietal cortex (areas 12, 45 and 46; Barbas, 1988; Preuss and Goldman-Rakic, 1989; Petrides and Pandya, 2002; Saleem et al., 2008; Borra et al., 2011; Gerbella et al., 2013). The prefrontal neurons described in the present work, in general, do not show a marked object selectivity, likely because our task does not overtly require object discrimination. If the task had required object discrimination or categorization, the percentage of object-selective neurons would very likely have been higher (Freedman et al., 2001, 2002; Kusunoki et al., 2010). Nonetheless, about 10% of neurons responding to object presentation have some type of object preference. In addition, the decoding of the Object factor in the population of neurons with object preference shows a high accuracy in the Presentation and Behavioral response periods, suggesting that this population contains neurons similar to visual and visuomotor neurons involved in the parieto-premotor grasping circuit (Murata et al., 1997, 2000; Raos et al., 2006; Rozzi et al., 2021). Accordingly, we propose that object coding, in the Action condition, is related to motor implementation, in line with our previous studies on prefrontal cortex (Simone et al., 2015; Rozzi et al., 2021). Thus, the object presentation phase in the Action condition has a double coding valence: general goal (acting) and specific goal (type of action).
Inaction neurons have a sustained response from Presentation to NoGo signal, when the monkeys must actually withhold the movement. This discharge could be related to the inhibition of the unwanted action. Thus, the 'pragmatic' interpretation can also apply to these neurons and is supported by clinical literature showing that PF damage in human patients leads to compulsory actions on objects (e.g., Utilization Behavior) and to behavior disinhibition (Lhermitte, 1986; see Iaccarino et al., 2014).

Encoding and monitoring behavioral goals
In the Behavioral response phase, the monkey, already instructed on the condition to perform and on the object to grasp (in the case of the Action condition), can complete the task at cue disappearance. Among Condition-dependent neurons, the large majority showed a clear preference for the Action condition. This is also evident from the population of task-related neurons significantly active in this phase. These findings are in line with the widely accepted idea that prefrontal neurons prospectively encode the behavioral output (Rainer et al., 1999; see Passingham and Sakai, 2004). In addition, in the Action condition the population activity decreases abruptly during object holding, likely signaling goal achievement. As discussed above, a differential coding of the behavioral output is already evident from the presentation phase of the task, especially in the population of Action-related neurons. In fact, decoding analysis carried out on this population, as well as on that of all neurons responding in the Behavioral response period, reveals high accuracy when training on the Presentation period and testing on the Behavioral response period and vice versa, indicating that goal achievement is already predicted when enough information to fulfil the task requirements is provided.
Although less represented, there are neurons preferring the Inaction condition. The activity of this population of neurons, enhanced during the fixation period, falls to baseline when reward is delivered. This evidence is in agreement with the decoding analysis, showing that the highest accuracy is reached in the final phase of the Behavioral response period (i.e. just before reward delivery) and suggests that these neurons encode the Inaction condition in terms of the final goal of the task/reward achievement. Furthermore, the same analysis reveals that the task goal/reward is already predicted from the Presentation period onward. In our task, it is not possible to determine whether this neuronal activity is more related to reward expectation or goal achievement. We favor the second interpretation, also based on the available literature showing that, although in PF there are neurons coding the expectancy of reward already in the delay period preceding reward delivery (Watanabe, 1996), some of them encode both the reward amount and the monkeys' behavioral response. This suggests that this cortical sector may use reward-related information to monitor goal achievement and thus control behavior (Watanabe, 1996; Wallis et al., 2001; Wallis and Miller, 2003).
Altogether, these observations indicate that the VLPF neurons responding in the behavioral phase are involved in coding the crucial aspects of the intended behavior and monitoring it until its goal is achieved (e.g. grasping/holding, or maintaining fixation without moving, and getting the reward).

VLPF neurons encode general task goals in terms of their outcome
The observation of the pattern of activity of the whole population of task-related neurons (Fig. 3 C) can provide a general picture of the role of the investigated sector of VLPF in encoding the execution and withholding of intentional actions. Both the temporal profile of activity and the results of the dPCA (Fig. 3) clearly show that the whole population codes the two conditions differently in the different phases of the task and that in the final phases the activity drops with different timings (taking possession and pulling the object in the Action condition and reward delivery in both conditions). Note that, in both conditions, reward delivery objectively signals the correct execution of the trial, but only in the Action condition is there, before that, a further feedback signal about the accomplishment of the goal of the grasping action.
Notably, the activity observed in the Action condition during the behavioral response resembles the post-saccadic activity described in prefrontal cortex by Funahashi and coworkers (Funahashi et al., 1991; see Funahashi, 2014, 2022), since it begins with movement initiation, is context-dependent, and, in some cases, neurons also discharge in relation to an instructing cue. In agreement with the interpretation provided in these studies, we propose that VLPF neurons encode the goal of intended actions in terms of the prediction of the critical events signaling the behavioral outcome (e.g., taking possession and pulling the object and reward delivery; 'pragmatic' hypothesis). This internal representation of goals would be crucial for keeping the sensory-motor representation of an action active during its selection, programming and execution. This process probably relies on a top-down modulation of the parieto-premotor grasping neurons anatomically connected with those of the investigated sector (Miller and Cohen, 2001). The feedback signals sent by the parietal and premotor areas would, in turn, confirm the outcome prediction (goal) and suppress VLPF activity related to goal representation, ending the action.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability
Data will be made available on request.