Using experience to improve: how errors shape behavior and brain activity in monkeys

Previous works have shown that neurons from the ventral premotor cortex (PMv) represent several elements of perceptual decisions. One of the most striking findings was that, after the outcome of the choice is known, neurons from PMv encode all the information necessary for evaluating the decision process. These results prompted us to suggest that this cortical area could be involved in shaping future behavior. In this work, we have characterized neuronal activity and behavioral performance as a function of the outcome of the previous trial. We found that the outcome of the immediately previous trial (n−1) significantly changes, in the current trial (n), the activity of single cells and behavioral performance. The outcome of trial n−2, however, does not affect either behavior or neuronal activity. Moreover, the outcome of difficult trials had a greater impact on performance and recruited more PMv neurons than the outcome of easy trials. These results give strong support to our suggestion that PMv neurons evaluate the decision process and use this information to modify future behavior.


INTRODUCTION
The consequences of actions are fundamental for shaping future behavior. The way they are detected, represented and evaluated in the brain to guide behavior has been the focus of several research lines related to learning, in the context of value-based decisions (for reviews, see Glimcher, 2013; Schultz & Dickinson, 2000; Wallis & Rushworth, 2013). In perceptual decision making, however, the outcomes and their influence on future behavior have received less attention (Purcell & Kiani, 2016). This may be due to the fact that, once participants are trained up to their psychophysical thresholds, there is little room for learning and performance is assumed to depend mainly on sensory factors that do not change dramatically from trial to trial (Gold & Shadlen, 2007). However, it is known that, even under these circumstances, the outcomes of the preceding trials provoke behavioral adjustments, such as post-error slowing (PES; Dutilh et al., 2012; Notebaert et al., 2009; Ullsperger & Danielmeier, 2016), post-error its accuracy (post-error behavioral accuracy). There was no effect whatsoever when neural activity and behavior were conditioned on the outcomes of trial n-2. Finally, the effect of the previous outcome was stronger, both at the behavioral and neuronal levels, when the previous trial was difficult than when it was easy.

MATERIALS AND METHODS
General
Experiments were carried out on one male monkey (Macaca mulatta). The animal (BM7, eight kg) was handled according to the standards of the European Union (86/609/EU), Spain (RD 1201/2005) and the SFN Policies and Use of Animals and Humans in Neuroscience Research. The experimental procedures were approved by the Bioethics Commission of the University of Santiago de Compostela (Spain; 15005AE/10/fFUN01/Fis02). The monkey's head was fixed during the task, and the animal looked binocularly at a monitor screen placed 114 cm away from its eyes (one cm subtended 0.5° to the eye). The room was isolated and soundproof. Two circles (1° in diameter) were horizontally displayed 6° to the right and 6° to the left of the fixation bar (a vertical bar; 0.5° long, 0.02° wide) displayed in the center of the screen. The visual stimuli (lines with different lengths) were displayed in the center of the screen. The monkey used right and left circles to indicate, with an eye movement, whether the second stimulus (S2) was shorter or longer than the first stimulus (S1), respectively. Binocular eye movements were recorded with SMI iView X Hi-Speed Primate, sampled at 500 Hz and acquired with MonkeyLogic 1.0 (www.monkeylogic.org; Asaad & Eskandar, 2008). The eye position was calibrated daily using the five-point automatic routine included in this toolbox. Visual stimuli were created on a 2.67 GHz Intel® Core™ i7 PC using a 1024 MB NVIDIA GeForce GT 240 graphics card and presented on an ASUS VH226H monitor. The monitor mode when the task was running was 1,920 × 1,080 (75 Hz). MonkeyLogic 1.0 was used for task control and to generate visual stimuli.
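The viewing geometry stated above (a monitor at 114 cm, with one cm subtending about 0.5°) can be checked with the standard visual-angle formula. The following is a minimal Python sketch, not part of the original methods; the function names are illustrative, and only the 114 cm distance is taken from the text.

```python
import math

VIEWING_DISTANCE_CM = 114.0  # monitor distance stated in the text


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of the given size,
    centered on the line of sight, at the given viewing distance."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


def size_for_angle_cm(angle_deg, distance_cm):
    """On-screen size (cm) needed to subtend the given visual angle."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# One cm on the screen, seen from 114 cm: about 0.50 degrees
one_cm_angle = visual_angle_deg(1.0, VIEWING_DISTANCE_CM)

# Physical length of a 2 degree line (the shortest S1): just under 4 cm
s1_length_cm = size_for_angle_cm(2.0, VIEWING_DISTANCE_CM)
```

Under this geometry, the conversion is close to linear for the small angles used in the task, which is why a fixed "one cm per 0.5°" factor is a reasonable shorthand.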
The experiment consisted of two phases. The first one (training phase) lasted for about 12 months and was aimed at training the monkey and estimating the psychometric functions relating performance (proportion of "longer than" choices) with the difference in length between S1 and S2. These functions were then used to select a reduced set of stimuli for the second phase (recording phase), based on the discrimination capability of the monkey. Single cell recordings were performed during the second phase.

Stimuli
The visual stimuli were stationary bright lines, subtending 0.15° in width. During the training phase, three different lengths (2°, 2.18° and 2.36°) were used as S1. For each of them, 10 lengths (five longer and five shorter) were used as S2, in 0.09° steps. This stimulus set allowed us to confirm that the animal was using the difference in length between S2 and S1 to solve the task and to reliably estimate the psychometric functions relating this difference with the probability of perceiving S2 as "longer than" S1. These functions were then used to select the stimulus set for the recording phase depending on the monkey's capability to discriminate. In this second phase, we used the same three lengths as S1 and four lengths (two longer and two shorter) as S2 for each S1. The lengths of the S2 were selected, independently for S2 shorter than S1 and S2 longer than S1, so as to provoke 90 and 70% correct discriminations, for easy and difficult conditions, respectively.
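The selection step described above, reading off the S2 lengths expected to produce 90 and 70% correct choices, can be illustrated as follows. This is only a sketch under a stated assumption: it treats the psychometric function as a cumulative Gaussian of the S2 - S1 difference, which the text does not confirm (the fitting procedure is described in Pardo-Vazquez, Leboran & Acuña, 2008). The function name and the fit parameters are hypothetical.

```python
from statistics import NormalDist


def s2_offsets(mu_deg, sigma_deg, targets=(0.90, 0.70)):
    """Length differences (S2 - S1, in degrees) expected to yield each target
    proportion of correct choices, separately for longer and shorter S2.

    ASSUMPTION: P("longer than") is a cumulative Gaussian of (S2 - S1).
    """
    curve = NormalDist(mu_deg, sigma_deg)
    offsets = {}
    for p_correct in targets:
        offsets[p_correct] = {
            # S2 > S1: the correct choice is "longer", so P("longer") = p_correct
            "longer": curve.inv_cdf(p_correct),
            # S2 < S1: the correct choice is "shorter", so P("longer") = 1 - p_correct
            "shorter": curve.inv_cdf(1.0 - p_correct),
        }
    return offsets


# Hypothetical fit parameters, NOT taken from the paper
example = s2_offsets(mu_deg=0.0, sigma_deg=0.2)
```

With an unbiased curve (mean of zero) the two offsets are symmetric; a non-zero mean would make them asymmetric, which is one reason to choose the shorter and longer S2 independently.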

Discrimination task
The monkey was trained for about one year to discriminate up to its psychophysical threshold in a two-alternative forced-choice task, the length discrimination task (LDT) sketched and explained in Fig. 1A. The lengths of S1 and S2 changed randomly from trial to trial and aborted trials were repeated. The inter-trial interval was 1,500 ms. During the training phase, in which 30 conditions were presented, trials were grouped in blocks of 240 trials. During the recording phase, in which only 12 conditions were presented, trials were grouped in blocks of 192 trials.

Figure 1 Behavioral task, behavioral performance and recordings localization. (A) Sequence of events during the length discrimination task. The fixation target (FT) and two response buttons appear on screen. The monkey initiates the trial by fixating on the FT and, after a 200 ms fixation time, two lines of variable length (S1 and S2) are presented sequentially, separated by a 1 s delay. S1 is presented for 500 ms and S2 remains on screen until the subject indicates, with an eye movement, whether S2 is longer (right button) or shorter (left button) than S1. After the behavioral response (BR), the monkey has to hold the response for 500 ms and then feedback (FB) is provided. Correct choices are indicated with a 60 dB SPL, high pitch (3,000 Hz) sound (200 ms) and a drop of liquid; incorrect choices are indicated with a 60 dB SPL, low pitch (500 Hz) sound (200 ms). The inter-trial interval is 1,500 ms. (B) Psychometric functions, one per S1 length, estimated at the end of the training phase. Dots represent proportions averaged across 30 blocks of trials; error bars represent 95% CI. (C, D) Behavioral accuracy (mean ± 95% CI) and RTs (median ± 1 quartile), respectively, during the recording sessions, showing the effect of difficulty on discrimination accuracy and reaction times. *** p < 0.001.
DOI: 10.7717/peerj.5395/fig-1

Recordings
Extracellular single unit activity was recorded with tungsten microelectrodes (1.5-3.5 MΩ) in the posterior bank of the ventral arm of the sulcus arcuatus and adjacent surface in the PMv in the two hemispheres of the monkey. Standard histological techniques were used to confirm the location of the penetrations. Briefly, the metal chamber for holding the microdrive was implanted over PMv following stereotaxic coordinates and the position of the electrode, which changed from session to session, was recorded daily to reconstruct the penetrations map. In each recording session, the electrode was advanced slowly, perpendicularly to the surface of the cortex, until clear spikes were observed. Then a behavioral block (192 trials) was run. Once the block was completed, the electrode was advanced further (a minimum of 300 μm) and a new behavioral block was run. The number of blocks per penetration ranged between 1 and 5 (mean = 2.08, SD = 0.96). After several months of daily recordings, the animal was euthanized and the localization of the chamber was confirmed using external landmarks (i.e., the sulcus arcuatus and the sulcus subcentralis anterior).

Spike sorting
Continuous data was processed offline with AutoSort (DataWave Technologies; www.dwavetech.com) to detect and sort spikes. Spike detection was performed with an amplitude threshold manually set far from the average amplitude of the recording (background "noise" level). Spikes were sorted with a semiautomatic procedure based on principal component analysis. Usually, we were able to isolate between one and three neurons simultaneously recorded (i.e., during the same behavioral block).

Data analysis
All analyses were carried out using custom-made programs in Matlab 2012b (http://www.mathworks.com). Firing rates were estimated by counting the number of spikes within 10 ms bins and then averaging them with a 200 ms sliding window (10 ms steps). The effect of previous outcomes on neuronal activity was assessed with Receiver Operating Characteristics (ROC; Green & Swets, 1966) analysis, which measures the degree of overlap between two response distributions. For each neuron with at least 10 correct and 10 incorrect trials, we computed the area under the ROC curve (AUC ROC) within a 200 ms bin that was slid in 10 ms steps (see Pardo-Vazquez, Leboran & Acuña, 2008 for detailed information on this method). The significance of AUC ROC values was assessed with Fisher's exact tests (n = 2,000 iterations), and the significance level was set at p < 0.01, corrected for multiple comparisons (Fujisawa et al., 2008). Psychometric curves were estimated with the procedure described elsewhere (Pardo-Vazquez, Leboran & Acuña, 2008). Significance of behavioral comparisons was evaluated with Fisher's exact tests (n = 20,000 iterations).
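The analysis steps above can be sketched in a few lines. This is an illustrative Python version, not the authors' Matlab code. Two choices are assumptions: the AUC is defined as the probability that a post-correct rate exceeds a post-error rate (so values above 0.5 mean Post-C > Post-E, matching the values reported in the Results), and significance is assessed with a label-shuffling permutation test, as named in the Figure 3 caption. The multiple-comparison correction (Fujisawa et al., 2008) is not reproduced here.

```python
import random

BIN_MS = 10       # bin width stated in the text
WINDOW_MS = 200   # sliding-window width stated in the text


def sliding_rate(spike_times_ms, t_start_ms, t_stop_ms):
    """Firing rate (spikes/s) in 200 ms windows advanced in 10 ms steps.
    Spikes are first counted in 10 ms bins; each window sums 20 bins."""
    n_bins = (t_stop_ms - t_start_ms) // BIN_MS
    counts = [0] * n_bins
    for t in spike_times_ms:
        idx = int((t - t_start_ms) // BIN_MS)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    per_window = WINDOW_MS // BIN_MS
    return [sum(counts[i:i + per_window]) / (WINDOW_MS / 1000.0)
            for i in range(n_bins - per_window + 1)]


def auc_roc(post_correct, post_error):
    """P(post-correct rate > post-error rate), with ties counted as 0.5."""
    wins = 0.0
    for c in post_correct:
        for e in post_error:
            if c > e:
                wins += 1.0
            elif c == e:
                wins += 0.5
    return wins / (len(post_correct) * len(post_error))


def permutation_p_value(post_correct, post_error, n_iter=2000, seed=0):
    """Two-sided p-value for |AUC - 0.5| obtained by shuffling trial labels."""
    rng = random.Random(seed)
    observed = abs(auc_roc(post_correct, post_error) - 0.5)
    pooled = list(post_correct) + list(post_error)
    n_c = len(post_correct)
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        if abs(auc_roc(pooled[:n_c], pooled[n_c:]) - 0.5) >= observed:
            extreme += 1
    return (extreme + 1) / (n_iter + 1)
```

In use, `sliding_rate` would be applied to each trial, and `auc_roc` and `permutation_p_value` to the per-trial rates of one window at a time, giving one AUC value and one p-value per 10 ms step.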

RESULTS
The activity of single cells was recorded while one monkey performed the LDT (Fig. 1A). A total of 146 penetrations (60 and 86 in the left and right hemispheres respectively) were performed in the PMv. In the task, two lines (S1 and S2) were presented sequentially and separated by a 1 s delay, and the monkey had to decide whether S2 was longer or shorter than S1 and communicate its decisions with an eye movement.
Psychometric functions, stimulus selection and behavioral performance during the recordings
During the training phase, three S1 and 10 S2 per S1 (see Methods) were used to estimate the psychometric functions relating the proportion of responses "longer than" with the length of S2 (Fig. 1B). These functions, which suggest that the subject based its choices on the difference between S2 and S1, were then used to select the stimulus set for the recording phase: S2 lengths that provoked 90 and 70% correct choices, for easy and difficult conditions respectively, were independently chosen for each S1 and for "shorter than" (S2 < S1) and "longer than" (S2 > S1) conditions. For the 2° S1, the S2 could be 1.78° and 2.35° for easy conditions and 1.95° and 2.16° for difficult conditions. For the 2.18° S1, the S2 could be 1.9° and 2.45° for easy conditions and 2.08° and 2.29° for difficult conditions. Finally, for the 2.36° S1, the S2 could be 2.06° and 2.61° for easy conditions and 2.23° and 2.44° for difficult conditions. Performance during the recording sessions (Fig. 1C) shows that accuracy in the LDT depends on the difficulty of the discrimination and that the monkey was using the lengths of S2 and S1 to solve the task also for this subset of conditions. For S2 < S1 trials, the percentages of correct responses were 91 and 71% for easy and difficult conditions, respectively. For S2 > S1 trials, the percentages of correct responses were 82 and 67%. The differences between easy and difficult trials were significant for both S2 < S1 and S2 > S1 discriminations (p < 0.001). Note that the average percentage of correct responses closely matches the target (90 and 70%) we established for selecting the stimulus set.
Regarding discrimination speed (Fig. 1D), task difficulty had little effect on reaction times (RTs), which were slightly and significantly faster only for easy S2 > S1 discriminations. For correct S2 > S1 trials, the median RTs were 210 ms (first Q = 201 ms; third Q = 226 ms) and 205 ms (first Q = 192 ms; third Q = 216 ms) for difficult and easy discriminations, respectively, and they were significantly different (p < 0.001). For correct S2 < S1 trials, the median RTs were 205 ms (first Q = 188 ms; third Q = 223 ms) and 206 ms (first Q = 189 ms; third Q = 223 ms) for difficult and easy discriminations, respectively, and they were not significantly different (p = 0.91).
The outcome of the previous trial (n-1) has significant effects on behavior and neuronal activity during the current trial (n)
Since we were interested in proving the contribution of PMv neurons to shaping behavior based on the outcomes of previous trials, the first step was to verify whether these outcomes had significant effects on performance in the LDT. To this end, we analyzed all the data gathered during the recording sessions (88,400 trials divided in 442 blocks of 200 trials). Behavioral accuracy was calculated as the mean percentage of correct choices with respect to the number of completed trials, for the subsets of trials preceded by errors and correct choices separately. We found a significant (p < 0.01) post-error improvement in accuracy (PIA), from 76.9% (SD = 5.5%) correct decisions after correct trials to 78.5% (SD = 9%) after errors (Fig. 2A).
Since, in perceptual tasks, performance is mainly limited by sensory constraints that are not expected to change dramatically from trial to trial, we looked for changes in behavioral engagement, as reflected in an alteration of the percentages of completed trials. We found that the outcome of the preceding choice produced a significant post-error improvement in engagement (PIE; Fig. 2C): the percentage of completed trials was significantly higher (p < 0.001) in post-error (79.7%, SD = 10.1%) than in post-correct trials (72.7%, SD = 14.3%). These differences in performance could result from an imbalance in the frequency of difficult trials; erroneous decisions could be followed by easy trials more often than correct decisions, thus showing a spurious improvement in performance, even when conditions were selected pseudorandomly in each trial. To rule out this explanation, we compared the percentages of difficult conditions after correct (49.9%) and incorrect (49.8%) trials and did not find significant differences (p = 0.403).
The outcome of preceding trials is also known to affect speed; to compare the RTs following correct choices and errors, we split the 88,400 trials in 442 blocks of 200 trials, estimated the median RT after correct and error trials for each block and then compared their means. We found the RTs to be slightly, but significantly, faster (p < 0.001) in post-error trials (mean = 205 ms, SD = 8.9 ms) than in post-correct trials (mean = 207 ms, SD = 7.2 ms).
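The behavioral measures described above (accuracy over completed trials, percentage of completed trials, and median RT, each conditioned on the outcome of an earlier trial) can be sketched as below. This is an illustrative Python version, not the authors' code. Two points are assumptions because the text does not specify them: trials whose conditioning trial was aborted are skipped, and the median RT is taken over all completed trials. The function name and the outcome labels are hypothetical.

```python
from statistics import median


def post_outcome_summary(outcomes, rts_ms, lag=1):
    """Summarize behavior in trial n conditioned on the outcome of trial n - lag.

    outcomes: one label per trial, in order: "correct", "error" or "abort".
    rts_ms:   reaction time per trial (ignored for aborted trials).
    ASSUMPTION: trials whose conditioning trial was an abort are skipped.
    """
    summary = {}
    for prev in ("correct", "error"):
        following = [i for i in range(lag, len(outcomes))
                     if outcomes[i - lag] == prev]
        completed = [i for i in following if outcomes[i] != "abort"]
        n_correct = sum(outcomes[i] == "correct" for i in completed)
        summary[prev] = {
            # engagement: completed trials over all trials that were started
            "engagement": len(completed) / len(following) if following else None,
            # accuracy: correct choices over completed trials
            "accuracy": n_correct / len(completed) if completed else None,
            "median_rt": median(rts_ms[i] for i in completed) if completed else None,
        }
    return summary


# Toy sequence for illustration only (not real data)
toy_outcomes = ["correct", "error", "correct", "abort",
                "correct", "error", "error", "correct"]
toy_rts = [200, 210, 205, None, 207, 215, 203, 201]
toy_summary = post_outcome_summary(toy_outcomes, toy_rts)
```

Applied per block, the values for "correct" and "error" would then be compared across blocks. Calling the same function with `lag=2` conditions on trial n-2 irrespective of the outcome of trial n-1.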
Therefore, our behavioral results confirm that, in the LDT, the outcome of the previous trial had a significant effect on performance in the current trial, increasing both the accuracy of the discriminations and the probability that the monkey was engaged in the task and completed the trial. To assess whether neuronal activity during the current trial (n) reflects the outcome of the previous trial (n-1), we compared the firing rate of single cells in post-correct (Post-C) and post-error (Post-E) trials and used ROC analyses to quantify that effect. We focused the analysis on the time period going from the onset of S1 to the end of the delay, to avoid differential neuronal responses due to the decision process in the current trial. Figure 3 shows the activity of two example neurons during trial n, conditioned on the outcome of trial n-1. One of them shows higher firing rates after incorrect trials (Post-C < Post-E; Figs. 3A and 3C) and the other responds more strongly after correct trials (Post-C > Post-E; Figs. 3B and 3D). ROC analysis confirmed, for both neurons, the existence of a trial-by-trial representation of the outcome of the preceding choice in the activity of the current trial (Figs. 3E and 3F).

Figure 3 (caption, partial). Significance thresholds (dashed lines) were estimated with a permutation test (n = 2,000). The analysis was conducted with a sliding window (200 ms, 10 ms steps); α = 0.01; corrected for multiple comparisons. DOI: 10.7717/peerj.5395/fig-3
Out of 658 neurons with enough trials to be analyzed (see Methods), 175 (27%) showed significant AUC ROC values (p < 0.01) within the relevant period, that is, at least one significant bin during the first 1,500 ms of the trial. These neurons were recorded in 86 different penetrations, which represent 59% of the total number of locations we explored (Fig. 4A). Out of the 175 neurons, 104 (59%) showed higher firing rates after errors (Post-C < Post-E neurons) and 71 after correct choices (Post-C > Post-E neurons). In 39 penetrations we found only Post-C < Post-E neurons; in 26 penetrations we found only Post-C > Post-E neurons; and in 21 penetrations we found both types. Given the proximity of the penetrations (see Fig. 4A), our results suggest no spatial segregation of these two neuronal populations. Figure 4C shows the AUC ROC averaged across Post-C < Post-E and Post-C > Post-E populations of neurons; the joint activity of these populations represents, during the whole duration of the current trial, the outcome of the previous choice. Figures 4E and 4F show the localization of these neurons in the cortex and the distribution of minimum and maximum significant AUC ROC values, respectively; the means for Post-C < Post-E and Post-C > Post-E neurons were 0.31 (SD = 0.07) and 0.67 (SD = 0.06), respectively.
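The selection criterion above (at least one significant bin in the analysis period) and the split into the two populations can be sketched as follows. This is an illustrative Python version under a stated assumption: a neuron is labeled by its most extreme significant AUC value, with values above 0.5 meaning Post-C > Post-E. The text does not say how a neuron with significant bins in both directions was handled, so that rule is hypothetical, as is the function name.

```python
def classify_neuron(auc_by_bin, significant_by_bin):
    """Return (label, number of significant bins) for one neuron.

    auc_by_bin:         AUC ROC value for each sliding-window step.
    significant_by_bin: True where the AUC passed the significance test.
    ASSUMPTION: the label follows the most extreme significant AUC.
    """
    significant = [auc for auc, flag in zip(auc_by_bin, significant_by_bin) if flag]
    if not significant:
        return None, 0  # no significant bin: neuron not counted
    extreme = max(significant, key=lambda auc: abs(auc - 0.5))
    label = "Post-C > Post-E" if extreme > 0.5 else "Post-C < Post-E"
    return label, len(significant)
```

The second element of the returned pair, the count of significant bins, gives a simple measure of how long the outcome signal lasts within the trial.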
To further characterize their functional properties, we checked whether these neurons represent the outcome of the current decision after feedback is provided. Consistent with our previous works (Pardo-Vazquez, Leboran & Acuña, 2008, 2009), PMv neurons encode in their firing rates the consequences of current decisions: when correct choices were compared against errors, one hundred and eight neurons (85%) showed significant AUC ROC values (p < 0.01) within a 500 ms window starting at feedback presentation. This suggests the representation of the outcomes in PMv is conveyed from trial n-1 to trial n. However, the temporal dynamics of this representation are complex: while we found more Post-C < Post-E neurons, the firing rate is usually higher immediately after correct choices (C > E neurons, 65%).
The outcome of trial n-2 does not affect either behavior or neuronal activity in the current trial (n)
We assessed the duration of post-error improvements by repeating the same analyses but conditioned on the outcome of trial n-2, and irrespective of the outcome of trial n-1 (i.e., in post-error (n-2) trials, the sequence (n-2, n-1) could be error-correct or error-error and, in post-correct (n-2) trials, it could be correct-correct or correct-error). When the percentage of correct choices over the total number of completed trials (behavioral accuracy) was compared (Fig. 2B), no significant difference (p = 0.53) was observed as a function of the outcome of the previous trial (n-2): the mean percentages of correct choices were 77.8% (SD = 5.3%) and 77.4% (SD = 8.9%) after correct choices and errors, respectively. Behavioral engagement was also very similar for post-correct (n-2) and post-error (n-2) trials (Fig. 2D): the percentage of completed trials after correct choices in the trial n-2 (75.8%; SD = 9.3%) was not different (p = 0.298) from the percentage after errors in the trial n-2 (76.6%; SD = 12%). Remarkably, we found the same pattern when analyzing the effect of the outcome of trial n-2 on neuronal activity: less than 5% of the neurons (26 out of 654 neurons analyzed) reached significant AUC ROC values (Fig. 4B) and the average AUC ROCs for post-correct and post-error preferring neurons were very close to 0.5 (Fig. 4D). Therefore, while the outcome of trial n-1 had a significant impact on the activity of PMv neurons and behavioral accuracy and engagement, the outcome of trial n-2 had a much smaller effect on the neuronal activity and no significant effect on behavioral performance.
During the current trial (n), the outcome of difficult trials (n-1) has stronger effects on behavior and recruits more PMv neurons than the outcome of easy trials
To study the combined effect of the outcome and the difficulty of the previous trial, we analyzed outcome-related changes in behavior and neuronal activity after easy and difficult choices separately. The difficulty of the previous trial (n-1) had significant effects on behavioral adjustments.
On the one hand, we only found a significant PIA after difficult trials (Fig. 5A). For post-easy trials, the mean percentages of correct responses after correct and error trials were 76.9% (SD = 7.6%) and 78.3% (SD = 16.7%), respectively, and this difference was not significant (p = 0.11). For post-difficult trials, the mean percentages of correct responses after correct and error trials were 76.7% (SD = 9.1%) and 79% (SD = 11%), and this difference was significant (p < 0.01).
On the other hand, although PIE was significant both after easy and difficult trials, it was higher in the latter case (Fig. 5B). For post-easy trials, the mean percentages of completed trials after correct and error trials were 73% (SD = 14.7%) and 79% (SD = 16.1%), respectively. For post-difficult trials, these percentages were 72.4% (SD = 15%) and 80.1% (SD = 10.9%), respectively. Both differences were significant (p < 0.001). Together, these results show that behavioral improvement after errors was higher when those errors happened in difficult trials. To rule out an imbalance between easy and difficult conditions in trial n as a function of the difficulty of trial n-1 as a possible explanation for these differences, we compared the percentage of difficult conditions in trial n after easy and difficult trials. These percentages, 50% and 49.8% respectively, were not significantly different (p = 0.213).
Regarding neuronal activity, when only post-easy trials were considered, 48 (13%) neurons out of the 380 with enough trials showed significant AUC ROC values (Fig. 5C). Of these, 28 (58%) were Post-C < Post-E neurons and 20 were Post-C > Post-E. When only post-difficult trials were considered, 103 (17%) neurons out of the 616 with enough trials showed significant AUC ROC values (Fig. 5D). Of these, 66 (64%) were Post-C < Post-E neurons and 37 were Post-C > Post-E. Furthermore, the duration of the effect of the outcome of the previous trial on the current one also differed as a function of the difficulty of trial n-1: the average number of significant bins after easy and difficult discriminations was 11 (SD = 16.4) and 20 (SD = 28.6), respectively. The temporal profile of the neuron count also shows this difference: during the 1,500 ms on which our analysis was focused, the percentage of recruited neurons is clearly higher after difficult trials throughout the trial duration (Figs. 5C and 5D). Therefore, although there are PMv neurons representing the outcome of the previous choice (n-1) both after easy and difficult trials, our results suggest that this effect is stronger in the latter case, as happened with behavior.

DISCUSSION
In this work, we show, for the first time, that single neurons from PMv represent, during the current trial, the outcome of the immediately previous one. Post-correct and post-incorrect firing rates are significantly different in 27% of the analyzed neurons. Behavioral results demonstrate that the outcome of the preceding trial affects performance in a visual discrimination task, increasing accuracy and engagement in the task. These changes in neuronal activity and behavior during the current trial are only provoked by the outcome of the immediately previous trial. Finally, when the difficulty of the preceding trial was considered, we found that the effect of the outcome is stronger after difficult trials, again at both the neuronal and behavioral levels.
Post-error behavior in the LDT is consistent with previous works and also shows a new effect, post-error increase in engagement, which should be considered when using animal models for addressing behavioral adjustments resulting from the consequences of previous actions. PES has been shown, mostly in humans, in a variety of behavioral tasks (Debener et al., 2005; Dutilh et al., 2012; King et al., 2010; Notebaert et al., 2009; Schiffler, Bengtsson & Lundqvist, 2017); it has also been described in monkeys performing the random dots motion (RDM) task (Purcell & Kiani, 2016) and in rats performing a time estimation task (Narayanan & Laubach, 2008). In saccade countermanding tasks, neither monkeys (Emeric et al., 2007; Pouget et al., 2011) nor humans (Emeric et al., 2007) showed PES. The phenomenology of PES is complex and it seems to depend on many task variables (Dutilh et al., 2012; Notebaert et al., 2009), such as inter-trial interval and outcome frequency. In fact, when correct trials are infrequent, RTs are faster after error trials (Núñez Castellar et al., 2010; Notebaert et al., 2009). In the current work, we found a weak but significant post-error speeding that can be explained by two features of the LDT: first, unlike in the RDM task, S1 and S2 do not change in time and the monkey does not get a clear benefit from increasing the sampling time, as suggested by the effect of the difficulty on the RTs; and second, performance in the LDT does not depend only on correctly perceiving the length of S2, but also on perceiving and maintaining in working memory the length of S1. Therefore, increasing the RT after errors would not have a great impact on accuracy.
Despite the abundance of research focused on understanding the functional meaning of PES, it is still under debate whether this behavioral adjustment has an adaptive role, that is, whether it translates into better chances of correctly solving post-error trials (Notebaert et al., 2009; Ullsperger & Danielmeier, 2016; Van Der Borght, Desmet & Notebaert, 2016). Together with others (Purcell & Kiani, 2016), our results might help shed light on this debate. On the one hand, in tasks in which there is an advantage in sampling the stimulus for a longer time, animals slow down their decisions after errors (Purcell & Kiani, 2016). On the other hand, when the behavioral task is designed so that sampling for a longer time does not benefit the decision, as in the LDT, the subject does not show slower RTs after errors. Therefore, it seems that animals only slow down the decision, after an error, when this strategy can be beneficial to them.
Regarding PIA, it has been previously described in humans, mostly using attentional paradigms such as the flanker task (Maier, Yeung & Steinhauser, 2011;Marco-Pallarés et al., 2008). Other experiments, however, did not find differences in accuracy as a function of the outcome of the preceding trial (Hajcak, McDonald & Simons, 2003;Hajcak & Simons, 2008;King et al., 2010) and in some cases even decreased accuracy after errors was reported (Núñez Castellar et al., 2010;Notebaert et al., 2009;Schiffler, Bengtsson & Lundqvist, 2017). In monkeys (Purcell & Kiani, 2016) and rodents (Narayanan & Laubach, 2008), errors did affect RTs, but no PIA was reported. Although significant, the size of PIA in the current work was modest (less than 2%), as expected given the nature of our task. Perceptual decisions depend mostly on sensory processing and dramatic changes in discrimination thresholds are not to be expected from one trial to the next. Therefore, in subjects trained up to their psychophysical thresholds, there is little room for improving performance after an error is detected. This is the first time, to our knowledge, that PIE is described. This effect could result from an increase in motivation and attentional resources devoted to the task to avoid further errors (Maier, Yeung & Steinhauser, 2011). It could also explain faster post-error RTs in the LDT, as a consequence of the monkey being highly motivated and engaged in the task after making an error. Such a behavioral adjustment is hard to find in humans, since they typically do not fail to complete many trials. However, PIE might be relevant when studying the effects of the outcomes of previous trials and their neural basis in rodents and non-human primates, which usually abort a significant percentage of the trials (e.g., more than 20% of the trials in the LDT).
Differential neuronal responses after correct and error trials have been described in the dorsomedial prefrontal cortex (dmPFC) of rats performing a time estimation task (Narayanan & Laubach, 2008). These authors provided further evidence of the involvement of this area in post-error behavioral adjustments, since they reported its inactivation attenuated PES. In dmPFC, two populations of neurons were described, one with sustained increased firing rates after incorrect trials and the other after correct trials. These results were interpreted as evidence of a form of retrospective memory aimed at monitoring task performance. The neuronal responses described in dmPFC of rats are very similar to the ones we found, in the present work, in monkey's PMv.
EEG and imaging studies in humans have shown that the ACC is involved in error detection (Dehaene et al., 1994; Ito et al., 2003) and that this information is then used by decision-related cortical areas, mostly prefrontal and parietal, for adapting future behavior (King et al., 2010; Li et al., 2008). Evidence suggests that, in perceptual decisions, error signals encoded mostly in the ACC affect future behavior in two ways: (1) decreasing activation in motor-related areas; and (2) increasing sensitivity to task-relevant features in decision and perception-related areas (King et al., 2010). The PMv represents many of the components of the decision process in different sensory modalities (Lemus, Hernández & Romo, 2009; Pardo-Vazquez, Leboran & Acuña, 2008; Romo, Hernández & Zainos, 2004), and it receives projections from different brain regions involved in performance monitoring, such as the PFC, cingulate cortex, supplementary eye fields, and basal ganglia (Boussaoud et al., 2005; Dancause et al., 2006; Dum & Strick, 2002; Ghosh & Gattera, 1995; Hoover & Strick, 1993). Moreover, the dorsal part of the premotor cortex (PMd) is involved in representing trial history in a countermanding task (Marcos et al., 2013); in this paradigm, previous trials affect behavioral performance and neuronal variability in the PMd. All this evidence, together with the present results, points at the PMv as a potential actor in a network of cortical and subcortical regions involved in using the outcome of previous trials for shaping future behavior.

CONCLUSIONS
Previous works have suggested that PMv neurons could be involved in shaping behavior based on the consequences of previous decisions (Pardo-Vazquez, Leboran & Acuña, 2008). In this work, we show post-error effects on behavior and neural activity during a visual discrimination task. Remarkably, the effect of the consequences of the previous trials on neuronal response in the PMv mimicked post-error behavioral adjustments. Firstly, while the outcome of trial n-1 showed significant effects on discrimination performance and neuronal activity in trial n, the outcome of trial n-2 showed no effect on either of them. Secondly, behavioral and neuronal effects were stronger after difficult trials as compared to easy ones. These results provide support to our claim that PMv could be involved in using the consequences of previous actions to shape future behavior, most likely as part of a broader network of decision-related brain regions, including parietal (Purcell & Kiani, 2016) and prefrontal (Narayanan & Laubach, 2008) cortices. Further research, using methods such as electrical stimulation or reversible inactivation to manipulate neural activity in this area, should be conducted in order to establish a causal relationship between neuronal activity in PMv and post-error behavioral adjustments.

ADDITIONAL INFORMATION AND DECLARATIONS
Funding
This research was supported by the following grants: Human Frontier Science Program Long-Term Award (LT00042/2012) to Jose L. Pardo-Vazquez; from Ministerio de Ciencia e Innovación (MICINN), Spain, to Carlos Acuña. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.