Enhanced bottom-up and reduced top-down neural mechanisms drive long-lasting non-reinforced behavioral change

Behavioral change studies and interventions mostly focus on self-control or external reinforcements as means to influence preferences. Cue-approach training (CAT) has been shown to induce preference changes lasting months following a mere association of items with a neutral cue and a speeded response, without external reinforcements. We used this unique paradigm to study preference representation and modification in the brain. We scanned 36 participants with fMRI during a novel passive viewing task before, after and 30 days following CAT. We found that bottom-up neural mechanisms, involving visual processing regions, drive immediate behavioral change and that reduced top-down parietal activity and enhanced hippocampal activity underlie long-term change. We suggest that these findings are evidence of a novel neural mechanism of preference representation and non-reinforced behavior change. These findings support implementation of bottom-up instead of top-down targeted interventions to accomplish long-lasting behavioral change.


Introduction
Changing behavior is key to solving a broad range of challenges in public health. One of the most important targets of behavioral change interventions is the decision-making process.
Understanding how preferences are constructed and modified is a major challenge in the research of human behavior with broad implications, from basic science to offering long-lasting behavioral change programs (1,2).
Most behavioral interventions for treating conditions such as addictions and eating disorders have thus far relied on reinforcements and effortful self-control (3). However, previous studies suggest that these interventions tend to fail in the long term (4)(5)(6). In a unique paradigm, named cue-approach training (CAT), preferences for snack food items were successfully modified in the absence of external reinforcements (7). In the CAT paradigm, the mere association of images of items with a cue and a speeded button-press response leads to preference changes lasting months following a single training session (see Fig. 1) (7). Current theories in the field of value-based decision-making would not predict that a simple association of an image with a neutral cue and button press would affect choices months into the future. However, replicated results of over 30 samples in multiple laboratories show that participants significantly choose high-value paired food items ('Go items') over high-value non-paired items ('NoGo items') following CAT (7)(8)(9)(10)(11)(12). Salomon et al. (13) recently showed that CAT can be used to change preferences towards various types of stimuli (unfamiliar faces, fractal art images and positive affective images) with different types of cues (neutral auditory, aversive auditory and visual cues), demonstrating that the underlying mechanisms of the task are general. Preference change following the task has been shown to last up to six months following a single training session lasting less than one hour, suggesting the task has potential to be translated into a real-world intervention.
Understanding the neural mechanisms underlying non-reinforced behavioral change could potentially set the ground for new theories of value-based decision-making, and for new behavioral change interventions targeting automatic processes for long-lasting change, benefiting the lives of millions around the world. However, the underlying neural mechanisms driving this replicable long-lasting change remain largely unknown. Previous studies showed that eye-gaze during choices was drawn more towards high-value Go items than towards high-value NoGo items, even when the Go items were not chosen (7). Functional MRI demonstrated an amplified BOLD signal in the ventro-medial prefrontal cortex (vmPFC), a region associated with value-based decision-making (14), during choices of high-value Go items alone and compared to NoGo items (7).
Together, these results indicate the involvement of attentional mechanisms and a neural signature of the value change during choices of Go compared to choices of NoGo items. However, they do not reveal the mechanism underlying the preference change induced by the task that can hint at an undiscovered pathway for value change.
The training in the task is performed on single items and thus induces changes of preferences towards individual items, later manifested in the binary choice phase. The low-level nature of the task, involving neither external reinforcements nor high-level executive control, provides a unique opportunity to study preference representation and modification in the brain. Here, we aimed to use the novel CAT paradigm to study how preferences toward individual items are represented and modified in the brain. A previous study used multi-voxel pattern analysis on fMRI data acquired during CAT, but was not able to point to differences induced during CAT between Go and NoGo items (9). Therefore, we introduce a novel passive viewing task of the items before, after and one month following CAT. During this task, pictures of snack food items were individually presented on the screen, while participants performed a sham counting task (see Fig. 1b,d). We aimed to test the different neural responses to the same images of Go versus NoGo items after training compared to baseline, as well as, for the first time, the neural changes one month following training. Regions in the brain showing preference-related functional plasticity immediately after training and one month later could reveal a novel general mechanism of preference representation in the brain and specifically indicate how non-externally reinforced training leads to robust long-lasting preference changes.

Figure 1. (a) [...] (30). (b) "Passive viewing", a new task in which items are individually presented on the screen, while participants passively observe them and perform a sham counting task. (c) Cue-approach training: Participants were instructed to press a button as fast as they could whenever they heard an auditory cue, and before the item disappeared from the screen. Items were presented on the screen one by one. Go items were consistently paired with the cue and button press response, while NoGo items were not. (d) The "passive viewing" task was repeated after training. (e) In the probe task, participants chose their preferred item between pairs of items with similar initial subjective preferences, one Go and one NoGo item. (f) A recognition memory task. (g) The BDM auction was repeated. Stages e-g were performed again in the one-month follow-up session.

Based on previous findings (6,8) we hypothesized that preference changes are dependent on attentional and memory-related mechanisms, affecting value representation. In our pre-registered (https://osf.io/q8yct/?view_only=360ad8ba027b4a85ab56b1586d6ad6c9) hypotheses, we predicted greater BOLD activity after CAT in response to high-value Go items in episodic memory-related regions in the medial temporal lobe, top-down attention-related dorsal parietal cortex and prefrontal value-related regions. In addition, we hypothesized that we would replicate previous CAT results showing a significant behavioral effect of choosing high-value Go over high-value NoGo items during the binary choice probe phase and enhanced BOLD activity in the vmPFC during choices of high-value Go items (7)(8)(9)(10)(11)(12)(13).

Behavioral probe results
After CAT: Participants (N = 36) significantly preferred Go over NoGo items in high-value probe choices (mean = 0.590, SE = 0.032, Z = 2.823, P = 0.002, one-sided logistic regression) and marginally also in low-value probe choices (mean = 0.561, SE = 0.038, Z = 1.639, P = 0.051; Fig. 2). The proportion of Go item choices was significantly higher for high-value compared to low-value items (indicating a differential effect of CAT on preference for stimuli of the two value categories, Z = 2.184, P = 0.015, one-sided logistic regression). These results were predicted based on previous studies and replicated them (7-9,13).

Figure 2. Behavioral results of Go choices during probe:
The mean proportion of trials in which participants chose Go over NoGo items is presented for high-value (dark gray) and low-value (light gray) probe pairs, for each session (session 1 / follow-up). Means of the single participants are shown with dots over each bar. The dashed line indicates chance level of 50%; error bars represent standard error of the mean. Asterisks reflect statistical significance in a one-tailed logistic regression analysis.

Imaging results
Behavioral results with snack food items from previous studies (7,9,13) and from the current study demonstrated a consistent differential pattern of the change of preferences across value levels: Preference modifications were more robust for high-value compared to low-value items.
Therefore, in our imaging results, we chose to focus on the functional changes in the representation of high-value items, which had a more dominant behavioral modification effect. We further tested two kinds of relations between the behavioral effect and the neural response: Modulation across items, meaning that the change in activity was stronger for items that were later more preferred during the subsequent probe phase (within-participant first-level parametric modulation); and correlation across participants, meaning that the change in activity was stronger for participants that later showed a stronger behavioral probe effect, quantified as a higher ratio of choosing high-value Go over high-value NoGo items (between-participants group-level correlation). Finally, for a subset of three pre-hypothesized and pre-registered regions (vmPFC, hippocampus and superior parietal lobule) we performed a small volume correction (SVC) analysis (see online methods).

Passive viewing imaging results
To investigate the functional changes in the response to individual items following CAT, we scanned participants with fMRI while they were passively viewing the items. Participants completed this task before, after and one month following CAT (N = 36 before and immediately after, and N = 27 after one month).
After versus before CAT (Fig. 3, for description of all activations see Supplementary Table 2): BOLD activity while passively viewing high-value Go compared to passively viewing high-value NoGo items was increased after compared to before CAT in the left and right occipital and temporal lobes (Fig. 3a), along the ventral visual processing pathway (16).
Results of the SVC analyses revealed enhanced BOLD activity during passive viewing of high-value Go items after compared to before CAT in the vmPFC (Fig. 3b).
One-month follow-up versus before (Fig. 4, for description of all activations see Supplementary Table 3): BOLD activity in the vmPFC was found to be enhanced one month following compared to before CAT with SVC analyses (Fig. 4a), similar to the short-term change. In addition, BOLD activity in the left orbitofrontal cortex (OFC) in response to high-value Go items was positively modulated by the choice effect across items in the follow-up compared to before CAT (whole-brain analysis; Fig. 4b). SVC analyses revealed that BOLD activity in response to high-value Go items in the right anterior hippocampus was positively modulated by the choice effect across items in the follow-up compared to before training (Fig. 4c), while BOLD activity in response to high-value Go minus high-value NoGo items in the right SPL was negatively correlated with the choice effect across participants in the follow-up compared to before training (Fig. 4d).
Inspection of the uncorrected results (z > 2.3) revealed enhanced activity for high-value Go compared to high-value NoGo items in the follow-up compared to before CAT, in visual regions similar to the ones found to be enhanced after CAT. However, these clusters did not exceed statistical significance following whole-brain cluster correction and were not pre-registered; therefore, we did not perform an SVC analysis for these regions. No other significant activations were found in the comparison of the BOLD activity in response to high-value Go items in the follow-up compared to before CAT with whole-brain correction without modulation.

Probe imaging results
To investigate the functional response during choices, we scanned participants with fMRI while they completed the probe (binary choices) phase, as was done in previous studies (7,9). Participants completed the probe task immediately after CAT (N = 33). In the current study, we also scanned for the first time the probe session in the one-month follow-up (N = 25).
Immediate Probe (Fig. 5a-[...] Table 4 and Supplementary Table 5. [...] Go items during probe; Fig. 5b) and negatively modulated by the choice effect across items (Fig. 5c). SVC analysis revealed that BOLD activity in the right SPL while choosing high-value Go items after CAT was negatively correlated with the choice effect across participants (Fig. 5d) and negatively modulated by the choice effect across items (Fig. 5e).
One-month follow-up Probe (Fig. 5f-g, for description of all activations see Supplementary Table 5): BOLD activity in the precuneus, bilateral superior occipital cortex and bilateral middle and superior temporal gyrus while choosing high-value Go items in the follow-up probe was positively modulated by the choice effect across items (Fig. 5f). BOLD activity in the precuneus/posterior cingulate cortex (PCC) and right post-central gyrus while choosing high-value Go items in the follow-up probe was positively correlated with the choice effect across participants (Fig. 5g).

Discussion
Research of value-based decision-making and behavioral change has been focused on top-down mechanisms such as self-control or external reinforcements as the main means to change preferences (3,17). The cue-approach training (CAT) paradigm has been shown to change preferences using the mere association of images of items with a cued speeded button response without external reinforcements. The paradigm is highly replicable with dozens of studies demonstrating the ability to change behavior for months with various stimuli and cues (7)(8)(9)(10)(11)(12)(13). The behavioral results obtained in the current study (see Fig. 2) replicated previous studies, demonstrating enhanced preferences towards high-value cued (high-value Go) compared to high-value non-cued (high-value NoGo) items following CAT (7-13).
Here, we introduced a novel passive viewing task to study the functional plasticity of response to single items before, after and one month following CAT. We aimed to reveal the neural mechanisms driving non-reinforced preference modification, both in the short and in the long term. Prior to data analysis, we hypothesized and pre-registered that the underlying neural mechanisms would involve memory, attention and value-related brain regions.

Long-lasting non-reinforced behavioral change: suggested mechanism
Until this study, the mechanisms underlying the CAT effect were unclear and the effect could not be explained by current value-based decision-making and behavioral change theories. Based on our findings in this study, we suggest that the low-level association of the visual, auditory and motor systems during training modifies valuation of items via a network including enhanced bottom-up perceptual processes in the short term, and long-term maintenance by memory enhancement and inhibition of top-down attentional control, thus resulting in a long-lasting behavioral change (see Fig. 6 for an illustration of the suggested mechanism).

Figure 6. Suggested mechanism:
Training leads to enhanced perceptual processing, which leads to value enhancement in the short-term and thus to immediate behavioral change. The enhanced perceptual processing further enhances memory activation and accessibility, which drives long-lasting behavioral change. In addition, the involvement of top-down attention is reduced following training, further enhancing the long-term behavioral change. Green arrows indicate enhancement while red arrows indicate inhibition.

Bottom-up mechanisms in the short-term.
Visual processing was enhanced for high-value Go compared to high-value NoGo items following CAT (see Fig. 3a). By recording eye-gaze from a sub-group of our participants during this task, we found that this enhanced visual activity was most likely not the result of longer gaze duration on paired items (see Supplementary Data). Activity in low and high-level visual regions was previously shown to be related to value, but only to past rewards and not to subjective values (18). We show here for the first time that activity in high-level visual processing occipito-temporal cortex is related to subjective values, without external reinforcements. We suggest that the functional changes in visual regions reflect modifications in the bottom-up perceptual representation of the paired items (19,20). In the short-term, the enhanced bottom-up processing and representation change of individual paired items leads to enhanced value-related processing and enhanced preferences towards these items during choices.

Long-term maintenance via memory processes.
In the one-month follow-up, hippocampal activity during passive viewing was stronger for high-value Go items that were later chosen more during the subsequent probe (see Fig. 4c). This finding suggests that memory-related processes supported the long-term maintenance of the behavioral effect. Importantly, it is the first demonstration of the relation between value-based decision-making and memory during passive [...] immediately after CAT (see Fig. 3b) and in the one-month follow-up (see Fig. 4a). In the one-month follow-up, value change was further reflected in the OFC, where activity was stronger while passively viewing high-value Go items that were later chosen more during the subsequent probe phase (see Fig. 4b).
These findings indicate a long-lasting value change signature of individual items not during choices (23,24). Overall, these results reveal for the first time an item-level value change during passive viewing (18,23), in line with previous findings of enhanced activity in the vmPFC during binary choices of more preferred high-value Go items (7). In addition, this is the first time, to the best of our knowledge, that such enhancement in value-related prefrontal regions is found one month following a behavioral change paradigm.

Functional activity during binary choices reflects enhanced bottom-up and decreased top-down mechanisms of preference modification
Neural responses during binary choice were also consistent with the proposed novel mechanism for non-reinforced behavioral change (Fig. 6), demonstrating enhanced perceptual processing in the short-term and involvement of memory processes in the long-term, as well as decreased top-down attention mechanisms.
When participants chose high-value Go over high-value NoGo items, activity in perceptual regions, both visual and auditory, was enhanced (see Fig. 5a). These findings suggest that in the short-term, retrieval of the low-level visual and auditory associations constructed during training, [...] These findings demonstrate again that reduced top-down mechanisms are involved in the behavioral change following CAT.
We were not able to replicate previous results showing enhanced activity in the vmPFC during choices of Go items that were chosen more overall (7,9). These previous results were found for high-value Go items when the group's behavioral effect of choosing high-value Go items was significant but weak relative to other samples (study 3 in Schonberg et al., 2014). Similar results were found for choices of low-value Go compared to choices of low-value NoGo items, and not for choices of high-value Go items, when the behavioral effect was strong for high-value items and weak for low-value items (9). Therefore, a possible explanation for the lack of replication of these findings in the current study is that this contrast of modulation across items depends on the variance of the choice effect across items, which seems to be smaller here compared to previous samples that found this effect.
Another surprising finding was that activity in the striatum, a region known to be involved in reinforcement learning and habit-based learning (28,29), was negatively correlated with choices of high-value Go over NoGo items (see Fig. 5b-c). These findings potentially suggest that cue-approach training shifted the process of goal-directed decision-making during binary choices to rely more on bottom-up non-reinforced mechanisms. This is the first study with CAT to observe these effects during probe and thus it remains to be replicated in future studies.
Overall, neural activity during binary choices supports our suggested new mechanism of non-reinforced behavioral change (Fig. 6), demonstrating patterns similar to those shown in the passive viewing task: enhanced perceptual processing in the short-term, long-term manifestation of the behavioral change through memory-related mechanisms, and reduced top-down involvement (here both in the short and long-term).

Conclusions
Current interventions that rely on reinforcement and self-control fail to change behavior for the long-term. Our findings emphasize the importance and great potential of targeting bottom-up rather than top-down mechanisms to induce long-lasting behavioral change. Our results further emphasize the involvement of memory processes in value-based decision-making (even in the absence of choice or memory task) and its relevance to the durability of the behavioral change. We presented a suggested novel mechanism underlying this change. These findings can lead to new theories relating perceptual processing, memory and attention to preferences and decision-making.
They hold great promise for new long-term behavioral change interventions targeting this novel path for value change based on bottom-up mechanisms, which can improve the quality of life for people around the world.

Study Design
Participants: Forty healthy right-handed participants took part in this experiment. The sample size was chosen before data collection and pre-registered during data collection (https://osf.io/kxh9y/?view_only=4476c6fd74a84f0eb5a893df7e46700a). We initially planned to collect n = 35 participants based on previous imaging CAT samples and based on predicted 10% attrition for the one-month follow-up. However, during data collection we realized that attrition rates were higher than expected; thus, the planned sample size was increased to n = 40 (before exclusions and attrition) and re-registered. The total number of participants included in the final analyses of the first session is 36 (19 females, age: mean = 26.11, SD = 3.46 years). Twenty-seven participants completed the follow-up session (15 females, age: mean = 26.15, SD = 3.44 years).
All participants had normal or corrected-to-normal vision and hearing, no history of eating disorders or psychiatric, neurologic or metabolic diagnoses, had no food restrictions and were not taking any medications that would interfere with the experiment. They were asked to refrain from eating for four hours prior to arrival to the laboratory (7). All participants gave informed consent.
The study was approved by the institutional review board at the Sheba Tel Hashomer Medical Center and the ethics committee at Tel Aviv University.
Exclusions: A total of four participants were excluded: One participant due to incompletion of the experiment, one based on training exclusion criteria (7.5% false alarm rate during training) and two participants with incidental brain findings.
Experimental procedures: The general task procedure was similar to previous studies with CAT (7,13). In order to test for functional changes in the neural response to the individual items following CAT, we added a new passive viewing task before, after and one month following training.
First, we obtained the subjective willingness to pay (WTP) of each participant for each of the 60 snack food items using a Becker-DeGroot-Marschak (BDM) auction procedure (30), performed outside the MRI scanner (see Fig. 1a,g). Then, participants entered the scanner and completed two "passive viewing" runs while scanned with fMRI (see Fig. 1b,d), followed by anatomical and diffusion-weighted imaging (DWI) scans. Afterwards, participants went out of the scanner and completed cue-approach training (CAT) in a behavioral testing room at the imaging center (see Fig. 1c). They then returned to the scanner and were scanned again with anatomical and DWI.
Then, they were scanned with fMRI while performing two more runs of the "passive viewing" task and four runs of the probe phase, during which they chose between pairs of items (see Fig. 1e). Finally, participants completed a recognition task outside the scanner (see Fig. 1f), during which they were presented with snack items that appeared in previous parts of the experiment, as well as new items, and were asked to indicate for each item whether it was presented during the experiment and whether it was paired with the cue during training. As the last task during the first day of scanning, they again completed the BDM auction to obtain their WTP for the snacks.
Approximately one month after the first day of the experiment, participants returned to the lab.
They entered the scanner, were scanned with anatomical and DWI scans and completed two "passive viewing" runs as well as another probe phase (without additional training). Finally, participants completed the recognition and BDM auction parts, outside the scanner.
Anatomical and diffusion-weighted imaging data were obtained for each participant before, immediately after and one month following training. Analyses of diffusion data are beyond the scope of this paper.
Stimuli: Sixty color images of familiar local snack food items were used in the current experiment.
Images depicted the snack package and the snack itself on a homogenous black rectangle sized 576 x 432 pixels (see Supplementary Table 1; Stimuli dataset was created in our lab and is available online at http://schonberglab.tau.ac.il/resources/snack-food-image-database/). All snack food items were also available for actual consumption at the end of the experiment. Participants were presented with the real food items at the beginning of the experiment in order to promote incentive compatible behavior throughout the following tasks.

Procedure
Initial preferences evaluation (see Fig. 1a,g): In order to obtain initial subjective preferences, participants completed a BDM auction procedure (30). Participants first received 10 Israeli Shekels (ILS; equivalent to ~2.7$ US). During the auction, 60 snack food items were presented on the screen one after the other in random order. For each item, participants were asked to indicate their willingness to pay (WTP) for the presented item. Participants placed their bid for each item using the mouse cursor along a visual analog scale, ranging from 0-10 ILS (task was self-paced).
Participants were told in advance that at the end of the experiment, the computer would randomly generate a counter bid ranging between 0-10 ILS (with 0.5 increments) for one of the sixty items.
If the bid placed by the participant exceeds the computer's bid, she or he will be required to buy the item for the computer's lower bid price. Otherwise, the participant will not be allowed to buy the snack but gets to retain the allocated 10 ILS. Participants were told that at the end of the experiment, they will stay in the room for 30 minutes and the only food they will be allowed to eat is the snack (in case they "won" the auction and purchased it). Participants were explicitly instructed that the best strategy for this task was to indicate their actual WTP for each item.
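The auction rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' task code: the names `draw_counter_bid`, `resolve_bdm` and `AuctionOutcome` are hypothetical, the counter bid is assumed to include both endpoints (0 and 10 ILS), and the price is assumed to be deducted from the 10 ILS endowment.

```python
import random
from dataclasses import dataclass

ENDOWMENT_ILS = 10.0  # amount each participant received before the auction


@dataclass
class AuctionOutcome:
    purchased: bool    # True if the participant buys the snack
    price_paid: float  # the computer's counter bid if purchased, else 0
    money_left: float  # assumed: endowment minus price paid


def draw_counter_bid(rng: random.Random) -> float:
    """Random counter bid in 0.5 ILS steps (assumed range 0 to 10 inclusive)."""
    return rng.randrange(0, 21) * 0.5


def resolve_bdm(bid: float, counter_bid: float) -> AuctionOutcome:
    """Apply the rule from the text: buy at the counter bid only if bid > counter bid."""
    if bid > counter_bid:
        return AuctionOutcome(True, counter_bid, ENDOWMENT_ILS - counter_bid)
    return AuctionOutcome(False, 0.0, ENDOWMENT_ILS)


won = resolve_bdm(bid=6.0, counter_bid=4.5)
lost = resolve_bdm(bid=3.0, counter_bid=4.5)
tie = resolve_bdm(bid=4.5, counter_bid=4.5)
```

Because the price paid never depends on the participant's own bid, overbidding or underbidding cannot improve the outcome, which is why indicating the actual WTP is the best strategy.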
Item selection: For each participant, items were rank ordered from 1 (highest value) to 60 (lowest value) based on their WTP. Then, 12 items (ranked 7-18) were defined as high-valued items to be used in probe, and 12 items (ranked 43-54) were defined as low-valued items to be used in probe.
Each group of twelve items (high-value or low-value) was split into two subgroups with identical mean rank. Six of the 12 items were chosen to be paired with the cue during training (Go items; [...]
[...] A baseline measurement before CAT, after CAT and in a one-month follow-up. In this task, participants passively viewed a subset of 40 items, which were presented in the training procedure (see item selection section and Supplementary Fig. 1). The task consisted of two runs (in each session). On each run, each of the 40 items was presented on the screen for a fixed duration of two seconds, followed by a fixed inter-stimulus interval (ISI) of seven seconds. Items were presented in random order. To ensure participants were observing and processing the presented images, we asked them to perform a sham task of silently counting how many items were of snacks containing either one piece (e.g. a 'Mars' chocolate bar) or several pieces (e.g. an 'M&M' snack) in a new package. At the end of each run, participants were asked how many items they counted. Task instructions (count one / several) were counterbalanced between runs for each participant. The time elapsed between the two runs before training and two runs after training was about two hours (including cue-approach training, anatomical and diffusion weighted scans before and after training and time to exit and enter the scanner).
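The fixed timing of a passive viewing run can be made concrete with a short sketch. It assumes that a run consists only of the 40 stimulus-plus-ISI trials (any lead-in or lead-out period is not described in the text and is ignored here); the function name and the item labels are hypothetical.

```python
import random

STIMULUS_S = 2.0  # fixed presentation duration per item
ISI_S = 7.0       # fixed inter-stimulus interval


def build_passive_viewing_run(items, rng):
    """Return (onset_in_seconds, item) pairs for one run, in random order."""
    order = list(items)
    rng.shuffle(order)
    trial_length = STIMULUS_S + ISI_S
    return [(index * trial_length, item) for index, item in enumerate(order)]


# Hypothetical labels standing in for the 40 snack images.
items = [f"snack_{i:02d}" for i in range(1, 41)]
schedule = build_passive_viewing_run(items, random.Random(1))
run_duration_s = len(schedule) * (STIMULUS_S + ISI_S)
```

Under these assumptions each run lasts 40 x 9 = 360 seconds of task time.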
Cue-approach training (see Fig. 1c): Training was performed outside the scanner. The training task included the same 40 items presented in the passive viewing task. Each image was presented on the screen one at a time for a fixed duration of one second. Participants were instructed to press a button on the keyboard as fast as they could when they heard an auditory cue, which was consistently paired with 30% of the items (Go items). Participants were not informed in advance that some of the items would be consistently paired with the cue, or of the identity of the Go items. The auditory cue consisted of a 180ms-long sine wave. The auditory cue was heard initially 750ms after stimulus onset (Go-signal delay, GSD). To ensure a success rate of around 75% in pressing the button before stimulus offset, we used a ladder technique to update the GSD. The GSD was increased by 16.67ms following every successful trial and decreased by 50ms if the participant did not press the button or pressed it after the offset of the stimulus (1:3 ratio). Items were followed by a fixation cross that appeared on the screen for a jittered ISI with an average duration of two seconds (range 1-6 seconds). Each participant completed 20 repetitions of training, each including all 40 items presented in a random order. A short break was given following every two training repetitions, in which the participants were asked to press a button when they were ready to proceed. The entire training session lasted about 40-45 minutes, depending on the duration of the breaks, which were controlled by the participants.
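The ladder rule for the Go-signal delay can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the function names are hypothetical, the 16.67ms step is assumed to be 50/3 ms (consistent with the stated 1:3 ratio), and no lower or upper bound on the GSD is applied because none is given in the text.

```python
INITIAL_GSD_MS = 750.0     # cue onset relative to stimulus onset on the first Go trial
STEP_UP_MS = 50.0 / 3.0    # about 16.67 ms, added after a successful trial (assumed exact 1:3 ratio)
STEP_DOWN_MS = 50.0        # subtracted after a missed or late response


def update_gsd(gsd_ms: float, success: bool) -> float:
    """Make the task harder after a success (later cue) and easier after a failure."""
    return gsd_ms + STEP_UP_MS if success else gsd_ms - STEP_DOWN_MS


def equilibrium_success_rate(step_up_ms: float, step_down_ms: float) -> float:
    """Success rate p at which the expected GSD change is zero:
    p * step_up = (1 - p) * step_down, so p = step_down / (step_up + step_down)."""
    return step_down_ms / (step_up_ms + step_down_ms)


# Three successes followed by one failure bring the GSD back to its starting value.
gsd = INITIAL_GSD_MS
for outcome in (True, True, True, False):
    gsd = update_gsd(gsd, outcome)

target_rate = equilibrium_success_rate(STEP_UP_MS, STEP_DOWN_MS)
```

The balance condition shows why a 1:3 step ratio targets roughly 75% success: the delay stops drifting only when three successes occur for every failure.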
Probe (see Fig. 1e): Probe was conducted while participants were scanned with fMRI. The probe phase was aimed to test participants' preferences following training. Participants were presented with pairs of items that had similar initial rankings (high-value or low-value), but only one of the items in each pair was associated with the cue during training (e.g. high-value Go vs. high-value NoGo). They were given 1.5 seconds to choose the item they preferred on each trial, by pressing one of two buttons on an MRI-compatible response box. Their choice was highlighted for 0.5 second with a green rectangle around the chosen items. If they did not respond on time, a message appeared on the screen, asking them to respond faster. A fixation cross appeared at the center of the screen between the two items during each trial, as well as during the ISI, which lasted on average three seconds (range 1-12 seconds).
The probe phase consisted of two blocks. In each block, each of the six high-value Go items was compared with each of the six high-value NoGo items (36 comparisons), and each of the six low-value Go items was compared with each of the six low-value NoGo items (36 comparisons). Thus, overall there were 72 unique Go-NoGo pairs (each presented twice during the probe, once in each block). In addition, in each block we compared each of two high-value NoGo items with each of two low-value NoGo items, resulting in four probe pairs that were used as "sanity checks" to ensure participants chose the items they preferred according to the initial WTP values obtained during the BDM auction.
Each probe block was divided into two runs, each consisting of half of the 76 unique pairs (38 trials per run). All pairs within each run were presented in a random order, and the location of the items (left/right) was also randomly chosen. Choices during the probe phase were made for consumption to ensure they were incentive-compatible. Participants were told that a single trial would be randomly chosen at the end of the experiment and that they would receive the item they chose on that specific trial. The participants were shown the snack box with all snacks prior to the beginning of the experiment.
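The pair counts above (36 + 36 + 4 = 76 unique pairs, 38 per run) can be reproduced with a short sketch. Item labels are placeholders, and the random assignment of pairs to the two runs is an assumption, since the text does not state how pairs were allocated to runs.

```python
import random
from itertools import product


def build_probe_pairs():
    """Return the 76 unique probe pairs of one block as (item_a, item_b, pair_type)."""
    pairs = []
    for value in ("high", "low"):
        go_items = [f"{value}_go_{i}" for i in range(6)]
        nogo_items = [f"{value}_nogo_{i}" for i in range(6)]
        # 6 x 6 = 36 Go vs. NoGo comparisons per value level
        pairs += [(g, n, f"{value}_value") for g, n in product(go_items, nogo_items)]
    # Sanity checks: 2 high-value NoGo x 2 low-value NoGo items = 4 pairs
    high_sanity = [f"high_sanity_{i}" for i in range(2)]
    low_sanity = [f"low_sanity_{i}" for i in range(2)]
    pairs += [(h, l, "sanity") for h, l in product(high_sanity, low_sanity)]
    return pairs


def split_into_runs(pairs, seed=0):
    """Shuffle the pairs, split them into two runs and randomize left/right position."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    runs = []
    for run_pairs in (shuffled[:half], shuffled[half:]):
        run = []
        for item_a, item_b, pair_type in run_pairs:
            left, right = (item_a, item_b) if rng.random() < 0.5 else (item_b, item_a)
            run.append({"left": left, "right": right, "type": pair_type})
        runs.append(run)
    return runs
```

Calling `split_into_runs(build_probe_pairs())` yields two runs of 38 trials each, matching the design described above.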
Recognition task (see Fig. 1f): Participants completed a recognition task, in which the items from the probe phase, as well as new items, were presented on the screen one by one. For each item, participants were asked to indicate whether or not it had been presented during the experiment and whether or not it had been paired with the cue during training. Analysis of this task is beyond the scope of this paper.

MRI Acquisition
Imaging data were acquired using a 3T Siemens Prisma MRI scanner with a 64-channel head coil, at the Strauss imaging center on the campus of Tel Aviv University. Functional data were acquired

fMRI analysis: Imaging analysis was performed using FEAT (fMRI Expert Analysis Tool) v6.00, part of FSL (FMRIB's Software Library) (39).

Univariate imaging analysis - passive viewing:
The functional data from the passive viewing task were used to examine the functional changes underlying the behavioral change of preferences following CAT in the short and long-term. We used a general linear model (GLM) with 13 regressors: Six regressors modelling each item type (high-value Go, high-value NoGo, high-value sanity, low-value Go, low-value NoGo and low-value sanity); six regressors with the same onsets and duration, and a parametric modulation by the mean-centered proportion of trials each item was chosen in the subsequent probe phase (the number of trials each item was chosen during the subsequent probe divided by the number of probe trials including this item, mean-centered) and one regressor for all items with a parametric modulation by the mean-centered WTP values acquired from the first BDM auction. These 13 regressors were convolved with the canonical double-gamma hemodynamic response function, and their temporal derivatives were added to the model. We further included at least nine motion regressors as confounds, as described above. We estimated a model with the above described GLM regressors for each passive viewing run of each participant in a first level analysis.
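The choice-based parametric modulator described above can be sketched as follows. The function name and the example counts are hypothetical. Centering on the mean of the items within a single regressor is an assumption, as the text does not specify over which set of items the mean was taken.

```python
def choice_modulator(times_chosen, times_presented):
    """Mean-centered proportion of probe trials on which each item was chosen.

    times_chosen[i]    -- number of probe trials on which item i was chosen
    times_presented[i] -- number of probe trials that included item i
    """
    proportions = [c / n for c, n in zip(times_chosen, times_presented)]
    mean = sum(proportions) / len(proportions)
    # Assumption: centering is done within the items of one regressor.
    return [p - mean for p in proportions]


# Hypothetical example: six high-value Go items, each appearing in 12 probe trials
# (6 pairs x 2 blocks).
modulator = choice_modulator([9, 6, 12, 3, 6, 6], [12] * 6)
```

Because the modulator is mean-centered, its values sum to zero, so the modulated regressor captures variance related to subsequent choice over and above the mean response modelled by the unmodulated regressor.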
In the second level analysis (fixed effects), runs from the same session were averaged and compared to the other session. Two second level contrasts were analyzed separately: after compared to before CAT and follow-up compared to before CAT.
The second level analyses of all participants (after minus before CAT, or follow-up minus before CAT) were then entered into a group level analysis (mixed effects), which included two contrasts of interest: one for the main effect (indicating the group mean) and one for the mean-centered probe effect of each participant (the demeaned proportion of choosing Go over NoGo items during the subsequent probe in the relevant pair type, i.e. either high-value, low-value or all probe pairs). The second contrast was used to test the correlation between the fMRI activations and the behavioral effect across participants.
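One common way to set up such a group level model is a design matrix with a column of ones and a demeaned covariate, with one contrast per column. The sketch below assumes this layout. The function name, contrast names and example values are hypothetical and are not taken from the actual FEAT design files.

```python
def group_design_matrix(probe_effects):
    """Build one row per participant: [group mean column, demeaned probe effect]."""
    mean = sum(probe_effects) / len(probe_effects)
    return [[1.0, effect - mean] for effect in probe_effects]


# Contrast vectors over the two columns (assumed layout).
CONTRAST_GROUP_MEAN = [1, 0]     # main effect across participants
CONTRAST_PROBE_EFFECT = [0, 1]   # association with the behavioral effect

# Hypothetical proportions of Go choices for four participants.
design = group_design_matrix([0.70, 0.50, 0.60, 0.40])
```

Demeaning the covariate makes the two columns orthogonal, so the first column can be interpreted as the group mean at the average behavioral effect.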
All reported group level statistical maps were thresholded at Z > 2.3 and cluster-based Gaussian Random Field corrected for multiple comparisons at the whole-brain level with a (corrected) cluster significance threshold of P = 0.05 (40).
Since we found a behavioral effect only for high-value items, similar to previous cue-approach samples with snack food items (7,13), we focused our analyses on the contrasts for high-value items: high-value Go items, high-value Go items modulated by choice and high-value Go minus high-value NoGo items.
Univariate imaging analysis - probe: Imaging analysis of the probe data was similar to previous imaging studies with CAT (7,9). We included 16

Small volume correction (SVC) analysis:
We pre-hypothesized (and pre-registered) that value-, attention- and memory-related regions would be involved in the behavioral change following CAT: prefrontal cortex, dorsal parietal cortex and medial-temporal lobe, respectively (https://osf.io/6mysj/). Thus, in addition to the whole-brain analyses described above for the passive viewing and probe tasks, we ran similar group level analyses once for each of these pre-hypothesized regions (bilateral hippocampus, bilateral SPL and vmPFC), with a mask containing the voxels that were part of the region. All masks were based on the Harvard-Oxford atlas (see Supplementary Fig. 2). The anatomical regions for the vmPFC mask were based on those used in previous CAT studies (7,9).

Statistical Analysis
Behavioral analysis of the probe phase: Similar to previous studies using the cue-approach task (7,13), we performed a repeated-measures logistic regression to compare the odds of choosing Go items against chance level (log-odds = 0; odds ratio = 1) for each trial type (high-value / low-value). We also compared the ratio of choosing the Go items between high-value and low-value pairs by adding the value level as an independent variable. These analyses were conducted for each session separately.
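The chance level used in this test can be made concrete with a small sketch that converts a proportion of Go choices into odds and log-odds. This is only the arithmetic behind the null hypothesis: it pools trials and ignores the repeated-measures structure, so it is not a substitute for the regression itself. The function name and the counts are hypothetical.

```python
import math


def odds_and_log_odds(n_go_chosen, n_trials):
    """Odds and log-odds of choosing the Go item, from pooled trial counts.

    Undefined when n_go_chosen equals 0 or n_trials (odds of 0 or infinity).
    """
    p = n_go_chosen / n_trials
    odds = p / (1 - p)
    return odds, math.log(odds)


# Chance level: Go items chosen on half of the trials.
chance_odds, chance_log_odds = odds_and_log_odds(36, 72)

# Hypothetical preference for Go items: odds above 1, log-odds above 0.
go_odds, go_log_odds = odds_and_log_odds(45, 72)
```

A proportion of 0.5 corresponds to odds of 1 and log-odds of 0, which is why testing the regression intercept against zero is equivalent to testing Go choices against chance.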

Imaging analysis:
We performed a multi-level analysis with FSL. In the first-level analysis (fixed effects), we ran the GLM as described above (see fMRI analysis section) on each run of each session of each participant. We then ran a second-level analysis (fixed effects) for each participant, in which runs from each session were averaged and compared against the other session. We then performed a group-level (mixed effects) analysis for each comparison of sessions (after compared to before CAT and one-month after compared to before CAT). Group level statistical maps were thresholded at Z > 2.3 and corrected for multiple comparisons with Gaussian Random Field correction at the whole brain level with P = 0.05 as the corrected cluster significance threshold. We also performed a small volume correction analysis on three pre-hypothesized and pre-registered regions, in which we corrected for multiple comparisons only across voxels within each region.

Supplementary Materials and Methods: Eye-tracking
Eye-tracking data were recorded from a subset of participants, using an EyeLink 1000 Plus eye-tracker (SR Research). For the passive viewing task, we had usable eye-gaze data from 10 participants after CAT and 10 participants in the one-month follow-up (we did not obtain eye-tracking data prior to training). Eye-tracking data from the passive viewing task were used to test whether the time spent looking at Go items differed from the time spent looking at NoGo items after CAT. For each participant, we averaged the time spent looking at Go items and the time spent looking at NoGo items, and performed a paired t-test to test for differences in observation time between Go and NoGo items.
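The paired comparison can be sketched with the standard paired t statistic, computed on the per-participant differences between mean Go and mean NoGo looking times. The function name and the example durations are hypothetical and do not represent the recorded data.

```python
import math
from statistics import mean, stdev


def paired_t(go_durations, nogo_durations):
    """Paired t statistic and degrees of freedom for per-participant mean durations."""
    diffs = [g - n for g, n in zip(go_durations, nogo_durations)]
    n = len(diffs)
    # t = mean difference divided by its standard error (sample SD / sqrt(n))
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical mean looking times (seconds) for three participants.
t_stat, df = paired_t([1.0, 2.0, 3.0], [0.0, 2.0, 1.0])
```

The resulting statistic is compared against a t distribution with n - 1 degrees of freedom, where n is the number of participants with usable data in both conditions.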
We found no differences in eye-gaze duration between high-value Go and NoGo items during the

Fig. 5f-g. For each cluster, the list presents all regions from the Harvard-Oxford atlas that contained at least 10 active voxels within the cluster, as well as the X/Y/Z location for the peak activation in MNI space.