Ventromedial prefrontal cortex is not critical for behavior change without external reinforcement

Cue-approach training (CAT) is a novel paradigm that has been shown to induce preference changes towards items without external reinforcement. In this task, the mere association of a neutral cue and a speeded button response has been shown to induce a behavioral change lasting months. The paradigm includes several phases: after individual items are trained, behavior change is manifested through binary choices between items with similar initial values. Neuroimaging data have implicated the ventromedial prefrontal cortex (vmPFC) during the choice phase of this task. However, the neural mechanisms underlying training remain unclear. Here, we sought to determine whether the ventromedial frontal cortex (VMF) is critical for the non-reinforced preference change induced by CAT. For this purpose, 11 participants with focal lesions involving the VMF and 30 healthy age-matched controls performed the CAT. We found that at the individual level, a similar proportion of VMF and healthy participants showed a preference shift following CAT. The VMF group performed similarly to the healthy age-matched control group in the ranking and training phases. As a group, the healthy age-matched controls exhibited a behavior change, but the VMF participants as a group did not. We did not find an association between individual lesion patterns and performance in the task. We conclude that a fully intact VMF is not critical to induce non-externally reinforced preference change and suggest potential mechanisms for this novel type of behavioral change.


Introduction
Decision neuroscience has contributed to the understanding of maladaptive motivated behavior in conditions such as substance abuse, pathological gambling, and obesity (Bechara, 2005; Davis  The CAT procedure includes several phases. First, participants rank the stimuli to indicate their subjective preference. Based on the initial ratings, items are chosen to be associated with the button press and the cue in the following training phase. During training, the entire stimulus set is presented on the screen several times, with some of the items consistently associated with the cue and the button press ("Go" items). Then, preference change is probed in a binary choice phase where two items of similar initial rankings are pitted against each other. If training did not influence preference, participants are expected to be indifferent between the two items (i.e. at chance). While behavioral studies have shown that CAT produces a replicable group effect of about 60-65% preference for the trained Go items, the cognitive and neural mechanisms underlying this effect remain unclear. Eye-gaze data during the probe phase showed greater gaze towards Go items than towards No-Go items, even when the Go items were not chosen. This suggests that the induced shift of preference in the CAT relies on attentional mechanisms to transform the low-level visual, auditory and motor features of the training into an updated value of the associated items. Functional magnetic resonance imaging (fMRI) studies of CAT have implicated the ventromedial prefrontal cortex (vmPFC) in the probe phase of the task, showing greater activations for choices of Go items compared to choices of No-Go items, modulated by the preference for individual items (Bakkour et al., 2016b; Schonberg et al., 2014). In the original study (Schonberg et al., 2014), activation of vmPFC was also observed at the end of the training phase; however, similar activation was found for both Go and No-Go items.
While these studies implicate the vmPFC in the CAT during the training and choice phases, they do not reveal whether this region plays a crucial role in this preference manipulation.
Activity in vmPFC and the adjacent medial orbitofrontal cortex (mOFC), together termed the ventromedial frontal cortex (VMF), has been implicated in the representation and dynamic updating of value in both animals and humans (Wallis, 2012). In humans, activity within this area has been shown to scale with increasing subjective value across a range of stimulus types and tasks and, in some paradigms, to predict value-based choice (Bartra et al., 2013; Levy and Glimcher, 2012 Henri-Bhargava et al., 2012). Importantly, VMF damage was recently found to disrupt biasing of attention to rewarding features of the environment, suggesting that this area is critical to the interplay of attention and value in decision-making (Vaidya and Fellows, 2015).
Activation within VMF during choice in the fMRI studies of CAT suggests that this region might be necessary for the value update that underpins preference change in CAT.
Alternatively, other structures (e.g. the visuomotor network) may encode the value update, with VMF merely active during the choice of the preferred item, reflecting the updated value rather than making a causal contribution to the preference change. These two models propose different roles for VMF in CAT (Fig. 1): in the first, this region dynamically assigns credit following low-level attentional training; in the second, VMF is not involved in modifying value, but is involved in value representation during choice. The models make different predictions regarding the effects of VMF damage on CAT performance. If an intact VMF is necessary for the CAT effect, individuals with VMF damage will show an attenuated or absent shift of preference following CAT. Alternatively, if VMF is not necessary for the value updating during training or value retrieval during choice, VMF damage will not affect preference shifts following CAT. In the current study, we tested these competing hypotheses by examining whether focal VMF damage affects the shift of preferences observed following CAT. Understanding the role VMF plays in behavior change with CAT will shed light both on this novel non-externally reinforced procedure and on the role VMF plays in value construction and assignment during value-based decision making more generally.

Participants
Participants with focal lesions involving the orbitofrontal cortex (OFC) and ventromedial prefrontal cortex (vmPFC), together referred to here as ventromedial frontal cortex (VMF; N = 11, mean age = 59.4 years, 5 males), were recruited from the Cognitive Neuroscience Research Registry at McGill University. All had fixed, circumscribed lesions of at least 6 months duration (mean duration = 9.3 [5.4-16.5] years). Lesions were due to ischemic stroke, tumor resection, or aneurysm rupture. Thirty age-matched healthy control participants were recruited through local advertisements in Montréal. They were free of neurological or psychiatric disease and were not taking any psychoactive drugs. One control participant was excluded from the analysis due to extremely inconsistent choices (choice prediction accuracy = 0.52, z = -3.62; see Results for details). For the 29 participants included in this group, the mean age was 60.5 [44-79] years and 15 were female. All participants provided written, informed consent in accordance with the Declaration of Helsinki and were paid a nominal fee for their time. The study protocol was approved by the local Research Ethics Board.

Lesion Analysis
Individual lesions were traced from the most recent clinical computed tomography or magnetic resonance imaging onto the standard Montreal Neurological Institute (MNI) brain using MRIcro software (Rorden & Brett, 2000; www.mccauslandcenter.sc.edu/mricro/) by a neurologist experienced in imaging analysis and blind to task performance. MRIcron (www.nitrc.org/projects/mricron) was used to generate lesion overlap images (Fig. 2).

Procedure
Sixty identically sized color images of computer-generated fractal art served as the stimuli ("Fantastic Fractals," 2013). The experiment was run using MATLAB (MathWorks, Inc., Natick, MA, USA) on a 21-inch screen.

Binary ranking
A forced-choice binary ranking procedure was used to estimate participants' baseline subjective preferences for each of the stimuli. In this task, 60 stimuli were randomly paired to form 300 unique pairs. For each pair of stimuli, participants had 2500 ms to choose their preferred stimulus, followed by a 500 ms choice confirmation screen and 500 ms fixation cross (Fig. 3A). Based on the assumption of choice transitivity from rational choice theory preference pattern leads to more distributed ranking scores. From these rankings, we quantified a transitivity score for each participant as the standard deviation of the participant's ranking scores.
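The transitivity score described above can be illustrated with a minimal sketch. The ranking algorithm itself is not specified in this text, so the example assumes a simple win-proportion score as a stand-in, and assumes the population standard deviation; the function names `ranking_scores` and `transitivity_score` are hypothetical. The intuition is that consistent choices spread the scores apart, whereas inconsistent choices pull them together.

```python
from statistics import pstdev


def ranking_scores(choices, n_items):
    """Return one score per item from a list of (chosen, unchosen) index pairs.

    Assumption: the score is the proportion of trials an item won.
    This stands in for the study's ranking algorithm, which is not given here.
    """
    wins = [0] * n_items
    appearances = [0] * n_items
    for chosen, unchosen in choices:
        wins[chosen] += 1
        appearances[chosen] += 1
        appearances[unchosen] += 1
    # Items that never appeared get a neutral score of 0.5.
    return [w / a if a else 0.5 for w, a in zip(wins, appearances)]


def transitivity_score(scores):
    """Standard deviation of the ranking scores (population SD assumed)."""
    return pstdev(scores)


# Three items, fully transitive choices: 0 beats 1, 0 beats 2, 1 beats 2.
transitive = [(0, 1), (0, 2), (1, 2)]
# Three items, cyclic (intransitive) choices: 0 beats 1, 1 beats 2, 2 beats 0.
cyclic = [(0, 1), (1, 2), (2, 0)]

print(transitivity_score(ranking_scores(transitive, 3)))
print(transitivity_score(ranking_scores(cyclic, 3)))
```

In this toy case the cyclic chooser gives every item the same score, so the spread collapses to zero, while the transitive chooser produces clearly separated scores.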

Cue-approach training
Following the baseline evaluation of subjective preferences using the binary ranking procedure, participants underwent 16 training runs of the cue-approach training procedure.

Probe
Preference change following CAT was evaluated in a probe phase. On each probe trial, two items appeared to the right and left of a central fixation cross and participants were asked to select their preferred stimulus. In each pair, both items were of similar initial value (either high-value or low-value), but only one item was a Go item, i.e. associated with a cue during training. For each pair, participants had 1500 ms to select their preferred stimulus, followed by a 500 ms choice confirmation and a fixation cross for a jittered duration with an average of 3000 ms (range of 1000-11000 ms, in 1000 ms intervals; Fig. 3C). In addition to these comparisons, as in previous CAT experiments, 'sanity check' trials were incorporated in the probe phase to measure preference consistency. In these trials, participants were asked to choose between pairs of items in which one item was of initial high value and the other of initial low value (both Go or both No-Go items), to validate the stability of the initial preference evaluation across time. The probe phase included two runs with 152 total trials, with all unique probe pairs presented in a random order in each run.

Memory
At the end of the experiment, participants performed two sequential memory tasks. The first assessed memory for fractals presented during the experiment compared to novel items (Old/New). The second assessed whether participants remembered which images were associated with the cue (Go/No-Go).

Data sharing
Behavioral data and analysis code are available at osf.io/d8ceg/.

Results
See Table 2 for summary of behavioral results for all tasks.

Binary ranking
For each participant, we estimated choice consistency as the prediction accuracy of a "leave one out" model.
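A minimal sketch of a leave-one-out consistency estimate follows. The predictive model used in the study is not described in this text, so the example makes several labeled assumptions: each held-out choice is predicted from win-proportion scores computed on all remaining trials, the item with the higher score is the predicted choice, and ties count as half correct. The names `win_proportions` and `loo_accuracy` are hypothetical.

```python
def win_proportions(choices, n_items):
    """Proportion of trials each item won; 0.5 for items never seen."""
    wins = [0] * n_items
    appearances = [0] * n_items
    for chosen, unchosen in choices:
        wins[chosen] += 1
        appearances[chosen] += 1
        appearances[unchosen] += 1
    return [w / a if a else 0.5 for w, a in zip(wins, appearances)]


def loo_accuracy(choices, n_items):
    """Leave-one-out prediction accuracy over (chosen, unchosen) pairs.

    Assumption: a trial is predicted correctly when the chosen item has the
    higher score on the remaining trials; ties are scored as 0.5.
    """
    correct = 0.0
    for i, (chosen, unchosen) in enumerate(choices):
        remaining = choices[:i] + choices[i + 1:]
        scores = win_proportions(remaining, n_items)
        if scores[chosen] > scores[unchosen]:
            correct += 1.0
        elif scores[chosen] == scores[unchosen]:
            correct += 0.5
    return correct / len(choices)


# Fully transitive choices over four items (lower index always chosen).
transitive = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
# Cyclic choices over three items.
cyclic = [(0, 1), (1, 2), (2, 0)]

print(loo_accuracy(transitive, 4))
print(loo_accuracy(cyclic, 3))
```

Under these assumptions, a participant whose choices are well predicted by the remaining trials scores near 1, and a participant choosing at random scores near 0.5, which is the sense in which prediction accuracy indexes choice consistency.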

Cue-approach training
There were no differences between the groups in the button-press reaction time to the tone cue (calculated as time after cue) during training (t (

Group analysis
To assess preference changes following training, we analyzed the proportion of probe trials in which participants preferred the Go items over the No-Go items, using a two-tailed repeated-measures logistic regression. In each pair, both items were of similar initial preference based on the baseline evaluation phase. As in previous studies, we hypothesized that the cue-approach effect would enhance preferences for the Go items above the chance level of 50% of trials (log-odds = 0; odds-ratio = 1).
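The correspondence between the 50% chance level, log-odds of 0 and an odds-ratio of 1 can be made explicit with a short sketch. This only converts a proportion of Go choices into odds and log-odds; it is not the repeated-measures logistic regression itself. The function names are hypothetical, and the value 0.6 is an illustrative proportion, not a result from this study.

```python
import math


def odds(p):
    """Odds of choosing the Go item, given the proportion p of Go choices."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p / (1.0 - p)


def log_odds(p):
    """Natural log of the odds; 0 at the chance level p = 0.5."""
    return math.log(odds(p))


def odds_ratio_vs_chance(p):
    """Odds at p divided by the odds at chance (which equal 1)."""
    return odds(p) / odds(0.5)


chance = 0.5
illustrative = 0.6  # hypothetical proportion of Go choices

print(log_odds(chance), odds_ratio_vs_chance(chance))
print(log_odds(illustrative), odds_ratio_vs_chance(illustrative))
```

A proportion above 0.5 therefore gives positive log-odds and an odds-ratio above 1, which is the direction predicted by the cue-approach hypothesis.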
Following cue-approach training, control participants consistently preferred the Go items

Individual participant analysis
For each participant, we calculated the individual probability of obtaining a preference shift.
We defined the threshold of individual learning based on the binomial distribution compared to chance: P random choice probability = 0.
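A binomial criterion of this kind can be sketched as follows. The threshold value is cut off in the text above, so the example uses alpha = 0.05 purely as an illustrative assumption, together with a one-sided test and a hypothetical trial count; none of these numbers should be read as the study's actual parameters. The function names are also hypothetical.

```python
from math import comb


def binomial_tail(k, n, p=0.5):
    """Probability of at least k Go choices in n trials under random choice."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_go_choices(n, alpha=0.05, p=0.5):
    """Smallest number of Go choices whose tail probability falls below alpha.

    alpha = 0.05 is an illustrative assumption, not the study's threshold.
    Returns None if no count in 0..n satisfies the criterion.
    """
    for k in range(n + 1):
        if binomial_tail(k, n, p) < alpha:
            return k
    return None


n_trials = 10  # hypothetical number of probe trials, chosen for readability
k_needed = min_go_choices(n_trials)
print(k_needed, binomial_tail(k_needed, n_trials))
```

With a criterion like this, each participant is classified individually, which allows the proportion of participants showing a preference shift to be compared across groups.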

Reaction time during probe
We found a difference between the control group and VMF group in choice reaction time

Memory
At the end of the experiment, we tested participants' memory of the items (were the items presented in the experiment, or novel) and of the training condition for the items (were the items associated with a cue and button-press response, or not). We did not find a difference in the proportion of items recognized between the control and VMF groups (t (

Discussion
In this study, we examined whether VMF is critical for behavior change that does not rely on external reinforcement, using the novel cue-approach task. We aimed to differentiate between two potential underlying models of the effect: in one, VMF has a crucial role in the transformation of the visual, auditory and motor features of training into an enhanced value of the cued items. In the other, preference change with CAT does not critically depend on VMF. We tested these competing models by studying participants with VMF damage and healthy age-matched controls, and examined performance across the different phases of the CAT task. We found that the same proportion of participants in the VMF group and healthy age-matched controls showed an individual preference change effect.
As a group, the VMF participants performed similarly to the healthy age-matched controls in all task phases. The VMF participants in our study were able to make consistent preference-based choices between abstract fractal images in the binary ranking phase, similar to the control group. In the training phase, the VMF participants were indistinguishable from controls, showing intact ability to rapidly respond with a button press to the auditory cue.
Thus far, the CAT effect has been tested at the group level (e.g. Schonberg et al., 2014; Salomon et al., 2018). Here, we replicated this effect for the first time in elderly controls. The VMF participants did not differ significantly from the controls, but did not show the effect as a group. This was due to a subset of VMF participants that showed the opposite effect  Second, we provide further support to previous findings that people with VMF damage can make consistent preference judgements, at least under specific conditions. Here we adopted the triangulation approach to study the neural basis of a novel non-reinforced behavioral change task (Munafò and Davey Smith, 2018). Although we did not obtain clear-cut findings, we conclude that there is no strong evidence that an intact VMF is critical for inducing non-externally reinforced preference change through the mere association of cues and button presses. Between-participant variability in the CAT effect, seen in both healthy and VMF participants, limits the strength of this conclusion. The fact that the VMF group performed similarly in all other components of the procedure speaks to the ability of this group to perform value-based choices and valuation, at least of fractal art images. The finding of a correlation between reaction times and the degree of effect in the task calls for a better understanding of the individual differences that underlie non-reinforced behavioral change with CAT. Further research is needed to fully determine the neural mechanisms underlying CAT and the role that value representation in the VMF plays within it, with a view to future applications of this task.