Go for broke: The role of somatic states when asked to lose in the Iowa Gambling Task

The Somatic Marker Hypothesis (SMH) posits that somatic states develop and guide advantageous decision making by “marking” disadvantageous options (i.e., arousal increases when poor options are considered). This assumption was tested using the standard Iowa Gambling Task (IGT), in which participants win/lose money by selecting among four decks of cards, and an alternative version, identical in both structure and payoffs, but with the aim changed to lose as much money as possible. This “lose” version of the IGT reverses which decks are advantageous/disadvantageous, and so reverses which decks should be marked by somatic responses, which we assessed via skin conductance (SC). Participants learned to pick advantageously in the original (Win) IGT and in the (new) Lose IGT. Using multilevel regression, some variability in anticipatory SC across blocks was found but no consistent effect of anticipatory SC on disadvantageous deck selections. Thus, while we successfully developed a new way to test the central claims of the SMH, we did not find consistent support for the SMH.


Introduction
The Iowa Gambling Task (IGT; Bechara, Damasio, Damasio, & Anderson, 1994) was devised in order to understand the decision making deficits shown by patients with damage to their ventromedial prefrontal cortex (VMPFC); in particular, their tendency to repeat disadvantageous courses of action. The decrement in these patients' personal, financial and social decision making following their brain damage (despite intact intelligence, attention, memory and language skills) led to the development of the Somatic Marker Hypothesis (SMH; Damasio, Tranel, & Damasio, 1991; Damasio, 1994). Reflecting these patients' difficulties expressing emotions, and their altered physiological responses to emotional but not neutral stimuli, Damasio hypothesized that the VMPFC played a role in successful decision making (Damasio et al., 1991; Damasio, 1994). The SMH proposes that the emotions we experience act as biasing signals (somatic markers; e.g., as assessed by skin conductance) that help guide decision making. Poor outcomes elicit intense somatic signals that 'mark' the course of action that led to those poor outcomes. When this course of action is considered in a subsequent decision, the somatic signals are activated and so serve to reduce the likelihood of repeating previous poor decisions.
The IGT requires participants to select from four decks of cards, from which they either receive a monetary reward, or a combined monetary reward and punishment (loss), which are revealed upon selecting the card (see Table 1). Two decks (termed "bad decks") offer high ("immediate") rewards but large ("delayed") punishments. The other two ("good") decks offer lower ("immediate") rewards and smaller ("delayed") punishments. To be successful at the IGT, participants must learn to forgo large immediate gains in order to avoid the larger delayed punishments. Bechara et al. (1994) presumed that the structure of the rewards and punishments makes it unlikely that participants can calculate the exact long-run average outcomes of the decks. Instead, participants must use more intuitive decision making processes that, according to the SMH, are determined by emotional hunches that participants develop about the decks when playing the task. Healthy control participants should learn to select more from the good decks by the end of 100 selections. Patients with VMPFC damage, however, continually select from the bad decks throughout the game (Bechara et al., 1994). More recent research has shown, however, that the reward structure is cognitively penetrable (Maia & McClelland, 2004) and that, while not all healthy participants learn to select from the good decks, some VMPFC patients have shown learning on the IGT (e.g., Fellows & Farah, 2005).

Table 1
Reward and punishment structure of the IGT (original version; Bechara et al., 1994).

Bechara, Tranel, Damasio, and Damasio (1996) hypothesized that ("anticipatory") somatic states arising prior to card selections differentiate between good and bad decks (thereby facilitating advantageous selections). Results from skin conductance (SC) data show that both control and patient groups have greater skin conductance responses (SCR) after selecting a card containing a punishment compared to a card containing a reward only, thereby marking poorer outcomes with greater arousal. However, while control participants develop elevated anticipatory SC in the few seconds prior to selecting cards from bad decks, VMPFC patients do not (Bechara et al., 1996).
However, findings for these "anticipatory skin conductance responses" (aSCR) are not consistent. Bechara and Damasio (2002) reported variance in the aSCRs of healthy participants who performed poorly in the IGT: some developed anticipatory markers, as would be predicted, but this did not facilitate advantageous play. Other studies find elevated aSCRs only in the highest performing sub-groups of participants (e.g., Carter & Smith Pasqualini, 2004; Crone, Somsen, Van Beek, & Van der Molen, 2004). Crone et al.'s (2004) moderately performing group showed lower aSCRs but improvement in deck selections across the game, indicating that learning can take place in the absence of somatic markers. Another challenge comes from Suzuki, Hirota, Takasawa, and Shigemasu (2003), who found no difference between anticipatory SC on early and later trials, failing to provide support for anticipatory markers developing as the game progresses and subsequently guiding behaviour in the IGT.
The structure of the reward and punishment schedule, as distinct from each deck's expected value (i.e., mean loss/gain), has been suggested as an alternative explanation for elevated aSCRs found for bad decks. Modifying the good decks to have the higher rewards and punishments (but still an overall net gain), Tomb, Hauser, Deldin, and Caramazza (2002) found greater aSCRs prior to selecting from good decks; the opposite of what the SMH predicts. Tomb et al. (2002) suggested somatic markers were driven by the immediate action being taken, rather than by longer-term outcomes. Yen, Chou, Chung, and Chen (2012) modified the IGT to test whether aSCRs were due to differences in expected value (EV) or differences in the riskiness of the decks. They found that anticipatory SC marked the preferred choices across different stages of learning in the game: greater for the high-risk bad deck early on, then greater for the low-risk good deck later in the task. Chiu et al. (2008) also adapted the IGT to create the Soochow gambling task: the good and bad decks had the same EVs as in the original IGT, but punishments occurred on 4/5 cards in the good decks and on only 1/5 cards in the bad decks. Chiu et al. (2008) found that participants chose more from the bad decks, suggesting that the frequency of gains and losses took precedence over EV.
If conscious awareness of advantageous play can occur before aSCRs develop, this would negate the need to use somatic states to guide decision making. Bechara, Damasio, Tranel, and Damasio (1997) measured SCRs during the IGT but also assessed participants' knowledge about the decks at points throughout the game to determine when participants became aware of the best strategy for advantageous play. The assessment of conscious knowledge led Bechara et al. to describe four conceptual stages in the game: 'pre-punishment', 'pre-hunch', 'hunch' and 'conceptual'. They concluded that, in healthy controls, covert somatic markers develop in response to experienced outcomes and influence decisions, and that this occurs prior to the generation of overt responses to such outcomes. Maia and McClelland (2004) examined participants' knowledge using Bechara et al.'s (1997) questions but posed additional, more detailed questions and found that participants had consciously available knowledge, which enabled them to perform well, at an earlier stage in the IGT than Bechara et al. (1997) reported. Other studies have supported Maia and McClelland (e.g., Gutbrod et al., 2006; Evans, Kemish, & Turnbull, 2004) but only Fernie and Tunney (2013) replicated Maia and McClelland's exact method (using questions from Bechara et al., 1997, and from Maia and McClelland, 2004) while additionally measuring SC. Fernie and Tunney (2013) found no differences in aSCRs between the decks, or between the question groups, prior to acquiring task knowledge. Outcome SC following punishments was larger for the disadvantageous decks in the pre-knowledge period, but only for participants who went on to display knowledge. The authors concluded that a lack of conceptual knowledge together with a lack of differential aSCRs does not hinder successful play in the IGT.
Maia and McClelland (2004) suggested the poor performance of patients with VMPFC damage could be better explained by an inability to carry out reversal learning by inhibiting the win-stay lose-shift strategy typical of many learning-from-feedback tasks (Restle, 1958; Rolls, 2005) when experiencing a punishment in the advantageous decks. To investigate this, Fellows and Farah (2005) switched the IGT deck structure so that the disadvantageous decks were no longer the better decks during the initial trials, and found VMPFC damaged patients' performance on the task equaled that of healthy controls. However, Bechara, Damasio, Tranel, and Damasio (2005) state that reversal learning is not the only requirement for successful IGT performance; rather, a "stop signal" (which could take the form of an emotional signal) would also need to develop.
Research from other experiential decision tasks (i.e., where the participant receives feedback on their choices) highlights that, even if participants successfully inhibit win-stay lose-shift responses, they may still have difficulty choosing well. The principle "do what works best most of the time" is a good heuristic for predicting patterns of choice in experiential tasks (Rakow & Newell, 2010). For example, Yechiam, Rakow, and Newell (2015) found that, even when decision makers are informed about each option's payoff distribution, disadvantageous options with a rare but "catastrophic" outcome can be popular choices if the feedback one receives emphasizes that, on almost all occasions, this option delivers a better payoff than a (safe) option with higher EV. This conforms to the patterns of preference observed for deck B in the IGT, from which nine in every ten cards yield a positive outcome: Steingroever, Wetzels, Horstmann, Neumann, and Wagenmakers (2013) report a preference for this low-frequency-of-punishment bad deck over the good decks in most studies; and this "prominent deck B phenomenon" has also been discussed by Lin, Chiu, Lee, and Hsieh (2007) and Dunn, Dalgleish, and Lawrence (2006).
To further investigate the influence of the IGT's EV and punishment frequency on subsequent choices, and the development of somatic markers, we created a lose version of the IGT, which simply reversed the original instruction from winning, to losing money.
We asked participants to play either the win IGT or our lose IGT. The aim of the win IGT is to find cards with gains, and minimize losses; the aim of the lose version of the IGT is to find cards with losses, and to minimize gains. The contingencies of the decks were identical to the original IGT in both versions but relative to the standard IGT, this lose version of the IGT reverses which cards progress the participant towards his/her goal, and which hinder. Thus, the "good" decks in the original IGT (C & D) are now "bad" in the lose version (because they deliver net gains); whereas the original bad decks (A & B, which deliver net losses) become "good" when the goal is to lose. The frequency of punishments is also switched for decks, with Deck B and D becoming high frequency punishment decks in the lose version and Deck A and C low frequency punishment decks, providing an additional interesting examination of the Deck B phenomenon. This lose version of the IGT allows us to further test the SMH to examine whether somatic markers develop that help to differentiate options which are conducive to your current goal from those that are not whilst guarding against the threat of introducing confounds by changing the task.
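The effect of reversing the goal can be illustrated with a minimal sketch. It assumes the per-ten-card payoffs commonly reported for the original IGT (gains of 100 per card in decks A and B and 50 per card in decks C and D; total losses per ten cards of 1250 and 250, spread over five cards in A and C or concentrated on one card in B and D). These values and all names below are assumptions for illustration, not a reproduction of Table 1.

```python
# Assumed per-10-card summary of the original IGT schedule (illustrative values,
# not copied from this paper's Table 1).
DECKS = {
    "A": {"gain": 100, "loss_per_10": 1250, "loss_cards_per_10": 5},
    "B": {"gain": 100, "loss_per_10": 1250, "loss_cards_per_10": 1},
    "C": {"gain": 50, "loss_per_10": 250, "loss_cards_per_10": 5},
    "D": {"gain": 50, "loss_per_10": 250, "loss_cards_per_10": 1},
}


def net_per_10(deck):
    """Net outcome over ten cards: total gains minus total losses."""
    d = DECKS[deck]
    return 10 * d["gain"] - d["loss_per_10"]


def is_advantageous(deck, version):
    """A deck helps the goal if it nets gains (win) or nets losses (lose)."""
    net = net_per_10(deck)
    return net > 0 if version == "win" else net < 0


def hindering_cards_per_10(deck, version):
    """Cards per ten that work against the goal:
    loss cards when winning, gain-only cards when losing."""
    n_loss = DECKS[deck]["loss_cards_per_10"]
    return n_loss if version == "win" else 10 - n_loss


for version in ("win", "lose"):
    for deck in "ABCD":
        print(version, deck, net_per_10(deck),
              is_advantageous(deck, version),
              hindering_cards_per_10(deck, version))
```

Under these assumed values the payoffs never change between versions; only the sign test in `is_advantageous` and the definition of a goal-hindering card do.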

Hypotheses and analysis
We retain the standard letter labels for the IGT decks (A-D), which are defined according to their payoff distributions (see Table 1). The SMH predicts that participants begin the IGT by selecting from all four decks (for a number of trials) and then:
H1a. In the original Win version of the IGT, participants mainly select decks C and D in the later trials.
H1b. In the Lose version of the IGT, participants mainly select decks A and B in the later trials.
However, if frequency of punishment drives preference, a different pattern of results is expected, which also varies between the Win and Lose versions because reversing the goal reverses the rank-order for the frequency of disadvantageous cards in the decks. These (partially) competing predictions are:
H2a. In the Win version, participants prefer low-frequency loss decks (B and D) to high-frequency loss decks (A and C).
H2b. In the Lose version, participants prefer high-frequency loss decks (A and C) to low-frequency loss decks (B and D).
The SMH predicts that, in response to outcomes experienced from selecting the cards, outcome SC responses will develop, with outcome SC greater for non-rewarded cards than for rewarded cards. Therefore, the SMH predicts that:
H3a. In the Win version, cards containing a loss generate greater outcome SC than those that contain only a gain.
H3b. In the Lose version, cards that contain only a gain generate greater outcome SC than those that also contain a loss.
The SMH predicts that participants (with no cognitive impairment) develop anticipatory SC responses prior to picking from the decks, which will be greater for the disadvantageous decks. The SMH therefore predicts that:
H4a. In the Win version, later in the IGT, greater anticipatory SC predicts selecting from decks A and B.
H4b. In the Lose version, later in the IGT, greater anticipatory SC predicts selecting from decks C and D.
Previous studies examining SC in the IGT have examined "successful" and "unsuccessful" participants (based on their task performance) separately (e.g., Crone et al., 2004). In keeping with this, and because the SMH predicts that SC effects will be largest for participants who were playing advantageously, we will include task performance as a predictor to determine whether this is the case.

Participants
There were 75 participants (54 female) with a mean (standard deviation, SD) age of 20.46 (4.31) years and a range (interquartile range; IQR) of 17-43 (18-21) years. Three participants were removed for being non-responders (see procedure) and were replaced to ensure equal cell sizes (N = 36 in each version of the IGT). Participants were university students recruited from the University of Essex Psychology Department Volunteer list. Participants received a performance-contingent payment of £0.50 for every £1000 of task balance; plus a show-up fee of £5 (or course credit).

Apparatus
SC activity was recorded using the Mind Media NeXus-10, a multi-channel physiological monitoring and feedback platform, with a sampling rate of 32 samples per second, and employing BioTrace+ software. In order to help control for individual differences in SC before the task (Figner & Murphy, 2011), a measure of baseline activity was taken (duration calculated as a ratio of 1:5 of the average time taken to complete the task). SC activity was recorded continuously throughout the task, and for critical events in each task a trigger was sent from within the software that controlled the study via a button box to mark the SC reading. This enabled us to take timeframe windows of SC data around this trigger. Standard analyses of SC data were then performed using Ledalab, MATLAB-based software for SC data analysis, applying Continuous Decomposition Analysis with no downsampling of the data (Benedek & Kaernbach, 2010).

Materials and design
A computerized version of the original IGT (Bechara et al., 1999) was programmed using REAL Studio. Participants chose repeatedly (100 selections, number not specified in advance to the participant) between four decks of cards by clicking on them. When a participant clicked on a card, the card displayed the amount of (notional) money won and (on some trials) lost. A green bar on the screen indicated the amount of money won or lost by increasing or decreasing in length; no values were shown but the original starting balance of £2000 was represented by a red bar alongside the green bar to show participants how well they were doing. The instructions were slightly modified from the instructions given in the computerized IGT (Bechara, Damasio, Damasio, & Lee, 1999). The only aspect differentiating the versions was that participants in the Lose version were instructed to lose (rather than to win) as much money as possible.
The four decks (with images resembling playing cards) were positioned in a single row, with deck locations randomized for each participant. There were 80 cards in each deck (40 cards kept in the original order, repeated twice) to help prevent decks running out. The font on the face of each "turned" card was red or black (40 of each color per deck), sequenced as per the original IGT (Bechara et al., 1994). A sound of a card turning played when the participant selected a card. Participants performed either the original Win version or the Lose version in a between-subjects design.

Procedure
The study took about 30 min to complete, and the entire session lasted approximately 50 min. A consent form described the "exchange rate" (i.e., notional-earnings-to-actual-payment) and the NeXus-10 equipment. Once consent was obtained, participants cleaned the palm side of the first and third fingertips of their nondominant hand with an alcohol wipe and the experimenter applied EEG paste. The experimenter then attached the electrodes to the distal phalanges of the participant's first and third digits; participants then placed their hand palm upwards on a cushion on the desk to keep it as still as possible throughout the experiment. As recommended, the sensors were given five minutes to settle and participants were asked to take a deep breath to determine whether they showed a response in the SC signal (Figner & Murphy, 2011). They read the instructions for either the Win or Lose IGT during this time.
Participants selected from among the four decks (see Table 1) by clicking their chosen deck with the computer mouse, using their dominant (preferred) hand. Each card displayed the outcome for ten seconds, then all the cards were enabled again and the backs of the cards were shown for each deck. There was no time limit for participants to make their card selections. Upon completion, the accrued balance appeared on the screen.

Measures and data analysis techniques
The measure of SC reported here was the mean phasic driver within the response window, which is the most accurate representation of phasic activity. The window of SC data used for outcome SC was 1 to 4 s following a trial outcome (a card being selected) during which the card outcome was displayed on screen. The window for SC data used for anticipatory SC was the 3 s prior to the trial outcome (a card being selected). These timeframes were exported from Ledalab for each trial for all participants.
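The two response windows can be sketched as follows. This is a minimal illustration, assuming the phasic driver has already been estimated (here by Ledalab) and is available as a plain list sampled at 32 Hz; the function names and the handling of windows that fall outside the recording are hypothetical choices, not part of the reported pipeline.

```python
SAMPLE_RATE_HZ = 32  # sampling rate reported for the recording


def window_mean(signal, start_s, end_s):
    """Mean of the samples in [start_s, end_s) seconds.

    Returns None if the window falls outside the recording.
    """
    start = round(start_s * SAMPLE_RATE_HZ)
    end = round(end_s * SAMPLE_RATE_HZ)
    if start < 0 or end > len(signal) or start >= end:
        return None
    segment = signal[start:end]
    return sum(segment) / len(segment)


def trial_sc(signal, trigger_s):
    """Anticipatory and outcome SC for one trial.

    trigger_s is the time (in seconds) at which the card was selected.
    """
    return {
        # 3 s immediately before the card selection
        "anticipatory": window_mean(signal, trigger_s - 3.0, trigger_s),
        # 1 to 4 s after the card selection
        "outcome": window_mean(signal, trigger_s + 1.0, trigger_s + 4.0),
    }
```

For example, with a ten-second signal and a trigger at 5 s, the anticipatory window covers seconds 2 to 5 and the outcome window covers seconds 6 to 9.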
Because SC and card selections were measured repeatedly across trials, regressions were run using a multilevel model. Multilevel models are used to assess data that contain a natural hierarchy or clustering of cases within variables. This is appropriate with the current data because the 100 card selections represent a cluster of observations for each participant. 2 This technique allows examination of trial-by-trial data in a principled fashion (e.g., by not treating trials as if they were independent observations). In previous studies, analysis of variance (ANOVA) has primarily been used to assess the physiological data from the IGT, comparing mean outcome SC, or mean anticipatory SC, between the four decks (often, but not always, across a series of 20-trial blocks). This creates problems (see Dunn et al., 2006) because data are missing when participants fail to select from a deck in a given block (as, indeed, should be so if participants have learned to avoid bad decks). The resulting unequal cell sizes are usually resolved (for these within-subjects data) by eliminating those participants' data, thus reducing statistical power. Therefore, we did not use ANOVA to analyse the SC data because it would require removing participants from the analysis and because we believe that multilevel regression is a better way to analyse these data (a view explicitly endorsed by one of our reviewers).
Research has shown both inter- and intra-individual variability in the rise and recovery time of SCRs (Breault & Ducharme, 1993; Edelberg & Muller, 1981). The anticipated variation between participants in SC (Figner & Murphy, 2011) was accounted for by entering participant as a level 2 random intercept within the multilevel model. Multilevel modeling was utilized in order to distinguish within- and between-participant variations in SC (Goldstein, 1995; Hox, 2010). If SC was entered as a predictor in the fixed part of the model, we also included it at level 1 as a random slope in the random part of the model to account for within-participant variability, in addition to participant as a level 2 random intercept. The level 1 variables were at the individual trial level (100 data points per participant in this study) and included participants' card selections (e.g., which deck picked, advantageous/disadvantageous selection) and SC measures (outcome SC; anticipatory SC). We checked for outliers using the Blocked Adaptive Computationally-efficient Outlier Nominator (BACON; Billor, Hadi, & Velleman, 2000) procedure, which identifies multivariate outliers in a set of predictor variables, and removed those outliers from all regression analyses.
2 Multilevel models differ from standard regression models (e.g., Ordinary Least Squares) in dividing the error variance into separate components. This allows the model to control for the patterns of the structured data: patterns in the error from the model are assumed to have a reliable structure and are not just noise.
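One way to write the random-intercept, random-slope structure described above, for trial i within participant j, is sketched below. The notation (γ, u, e) is introduced here purely for illustration and is not taken from the original analysis; the fitted models also contained further fixed-effect terms (e.g., Version and Block), which are omitted for brevity.

```latex
\begin{align}
y_{ij} &= \beta_{0j} + \beta_{1j}\,\mathrm{SC}_{ij} + e_{ij},
  & e_{ij} &\sim N(0, \sigma_e^2) \\
\beta_{0j} &= \gamma_{00} + u_{0j} \\
\beta_{1j} &= \gamma_{10} + u_{1j},
  & (u_{0j}, u_{1j})^{\top} &\sim N(\mathbf{0}, \Sigma_u)
\end{align}
```

Here γ00 and γ10 are the fixed intercept and SC slope, u0j is the participant-level (level 2) random intercept, and u1j lets the SC slope vary between participants. The error variance is thereby split into a trial-level component (σe²) and participant-level components (Σu).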

Deck selections
To examine the prediction that participants will initially explore and pick from all decks before shifting to exploit the advantageous decks by the end of the task (H1a, H1b), we ran a 5 (Block: 100 card selections split into 5 blocks of 20) × 4 (Deck: A, B, C, D) × 2 (Version: Win, Lose) ANOVA with card selections from each of the decks as the dependent variable. Where Mauchly's test showed that the assumption of sphericity had been violated, the p-value was adjusted using the Greenhouse-Geisser correction (likewise in all subsequent analyses). All post hoc comparisons used Tukey's test. There was a significant three-way interaction between Deck, Block and Version of IGT, F(12, 840) = 14.62, p < 0.001, partial η² = 0.173, so a separate 5 (Block) × 4 (Deck: A, B, C, D) ANOVA was run for each version.
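The dependent variable for these ANOVAs is a per-participant table of selection counts by block and deck. A minimal sketch of how such a table could be built from one participant's sequence of 100 choices is given below; the function name and the list-of-dicts layout are assumptions made for illustration.

```python
from collections import Counter


def block_deck_counts(selections, block_size=20):
    """Return one dict per block giving how often each deck was chosen.

    selections is a sequence of deck labels ("A".."D") in trial order.
    """
    blocks = []
    for start in range(0, len(selections), block_size):
        counts = Counter(selections[start:start + block_size])
        # Report every deck, including decks never chosen in this block.
        blocks.append({deck: counts.get(deck, 0) for deck in "ABCD"})
    return blocks
```

Decks that a participant never selects in a block receive an explicit count of zero, so every participant contributes a complete Block × Deck table of counts.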
In the Win version there was a significant interaction between Block and Deck, F(5, 196) = 6.35, p < 0.001, partial η² = 0.154, see Fig. 1. Over the course of the game, as predicted by the SMH, the two disadvantageous decks (A and B) were selected less, while selections of the two advantageous decks (C and D) increased. Over the first 20 trials, disadvantageous deck B had the most selections. Comparing the selection of disadvantageous deck B and the other disadvantageous deck A (the 3rd most frequently selected) at block 1 revealed a significant difference, F(1, 35) = 14.04, p < 0.001, with participants preferring the low-frequency-of-loss disadvantageous deck to the high-frequency-of-loss disadvantageous deck. Comparing disadvantageous deck B to the two advantageous decks C and D combined at block 1 revealed that deck B was picked significantly more than the two advantageous decks, F(1, 35) = 13.67, p < 0.001. Examining these comparisons at the end of the game in block 5 revealed that the two advantageous decks were now selected significantly more than the disadvantageous deck B, F(1, 35) = 7.68, p = 0.009. After 100 trials, disadvantageous deck B was not picked more than the other bad deck A, F(1, 35) = 1.86, p = 0.182, indicating no preference for the low-frequency-of-loss disadvantageous deck B over the high-frequency-of-loss disadvantageous deck A. There was no difference in selection between the high-frequency-of-loss advantageous deck C and the low-frequency-of-loss advantageous deck D at the end of the game in block 5, F < 1.
In the Lose version there was a significant interaction between Block and Deck, F(6, 212) = 13.86, p < 0.001, partial η² = 0.284, see Fig. 2. Initially the disadvantageous decks (C & D) were selected more frequently than the advantageous decks (A & B). In block 1, the preferred deck, disadvantageous deck C (M_DeckC = 6.69), was picked significantly more than the least-preferred deck, advantageous deck B (M_DeckB = 3.78), F(1, 35) = 20.04, p < 0.001. By block 5, advantageous deck A was picked significantly more than the other advantageous deck B, F(1, 35) = 24.61, p < 0.001. There was no difference between the selection of advantageous deck B and the two disadvantageous decks (C & D) combined, F < 1. Participants were picking the advantageous deck A (M_DeckA = 10.81) significantly more than the other decks by the end of the game. The advantageous deck B (M_DeckB = 3.10) was picked equally to the two disadvantageous decks. Disadvantageous deck C (M_DeckC = 4.14) was picked significantly more than the disadvantageous deck D (M_DeckD = 2.00) in the final block of the game (block 5), F(1, 35) = 11.61, p = 0.002.
An additional ANOVA examined the net score (the total number of cards selected from the advantageous decks minus the total number of cards selected from the disadvantageous decks) by Block and by Version (Win, Lose). This revealed a large and significant main effect of Block, F(4, 280) = 32.61, p < 0.001, partial η² = 0.318, reflecting a progressive improvement in task performance over blocks. There was a tendency for more advantageous selections in the Lose version than in the Win version but this was not significant (F < 1).
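The net score used here depends on the participant's goal, because the advantageous decks are C and D in the Win version and A and B in the Lose version. A minimal sketch, with hypothetical names, of computing it per 20-trial block:

```python
# Decks that advance the participant's goal in each version of the task.
GOOD_DECKS = {"win": {"C", "D"}, "lose": {"A", "B"}}


def net_scores(selections, version, block_size=20):
    """Advantageous minus disadvantageous selections for each block."""
    good = GOOD_DECKS[version]
    scores = []
    for start in range(0, len(selections), block_size):
        block = selections[start:start + block_size]
        n_good = sum(1 for deck in block if deck in good)
        n_bad = len(block) - n_good
        scores.append(n_good - n_bad)
    return scores
```

The same sequence of choices therefore yields net scores of opposite sign in the two versions, which is what makes the Block effect comparable across Win and Lose.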

Multilevel modelling
To examine H3a and H3b we regressed Outcome SC (log10-transformed) on Reward Type (rewarded = 0, non-rewarded = 1), Version (Win, Lose), Block (dummy coded with block 1 as the reference category) and their interactions. There was a significant three-way interaction between Reward Type, Version and Block, b = −0.07, z = −4.59, p < 0.001, so we examined Outcome SC separately for each Version. In the Win version there was no effect of Reward Type (z = −1.45, p = 0.148). Therefore we found no support for H3a that Outcome SC should be greater following a non-rewarded card than a rewarded card in the Win IGT. There was a main effect of Block with Outcome SC decreasing across the game (Block2: b = −0.27, z = −8.82, p < 0.001; Block3: b = −0.27, z = −8.76, p < 0.001; Block4: b = −0.26, z = −8.11, p < 0.001; Block5: b = −0.19, z = −5.67, p < 0.001). The interaction between Reward Type and Block was not significant (z = 0.01, p = 0.546). In the Lose version there was an effect of Reward Type, b = 0.11, z = 2.26, p = 0.024. Therefore we found support for H3b that Outcome SC should be greater following a non-rewarded card than a rewarded card in the Lose IGT. There was a main effect of Block with Outcome SC decreasing across the game (Block2: b = −0.27, z = −7.95, p < 0.001; Block3: b = −0.34, z = −8.89, p < 0.001; Block4: b = −0.26, z = −5.89, p < 0.001; Block5: b = −0.23, z = −4.60, p < 0.001). The interaction between Reward Type and Block was also significant, b = −0.03, z = −2.33, p = 0.020; Outcome SC was higher following a non-rewarded card than a rewarded card initially, in line with H3b, but only for the first two blocks of the game (see Fig. 3), and then was higher for rewarded cards until the end of the task.
When we included Task Performance (successful participants were those who ended the task with more than £2000 in the Win IGT and less than £2000 in the Lose IGT) in the above regressions it did not change the pattern of results found and was not a significant predictor.
To examine H4a and H4b, a multilevel logistic regression was run to see whether disadvantageous selections (0 = good decks: decks A and B in the Lose IGT, decks C and D in the Win IGT; 1 = bad decks: decks C and D in the Lose IGT, decks A and B in the Win IGT) could be predicted from Anticipatory SC. Anticipatory SC was entered at level 1 and participant at level 2. We initially regressed disadvantageous selection on Anticipatory SC for that trial (centered) but found no significant effect, Odds Ratio = 0.95, z = −0.51, p = 0.610. To check for any effect of Version, we regressed disadvantageous selection on Version (Win, Lose), together with Anticipatory SC and its interaction with Version. Anticipatory SC remained a non-significant predictor, Odds Ratio = 1.00, z = 0.02, p = 0.986. The interaction term for Anticipatory SC with Version was also not significant, Odds Ratio = 0.89, z = −0.56, p = 0.573. We then included Block (dummy coded with block 1 as the reference category) and its interaction with Anticipatory SC (centered) to control for the time point in the task, whilst retaining Version, Anticipatory SC and their interaction. The main effect of Block was significant; for each of blocks 2 through 5, the odds of picking disadvantageously were substantially lower than those for block 1 (Block2: Odds Ratio = 0.54, z = −7.55, p < 0.001; Block3: Odds Ratio = 0.34, z = −12.88, p < 0.001; Block4: Odds Ratio = 0.27, z = −15.22, p < 0.001; Block5: Odds Ratio = 0.27, z = −15.23, p < 0.001). The interaction of Anticipatory SC with Block showed significant differences between block 1 (base) and block 3 (Odds Ratio = 0.62, z = −2.43, p = 0.015) and between block 1 and block 5 (Odds Ratio = 0.66, z = −2.18, p = 0.029), see Fig. 4.
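As an aid to reading these odds ratios, the sketch below converts each Block odds ratio reported above into a coefficient on the log-odds scale and into an implied probability. The baseline probability of 0.5 for block 1 is a hypothetical value chosen only for illustration, and the function name is an assumption. In a multilevel model such odds ratios are conditional on the participant-level random effects, so the implied probabilities are indicative rather than estimates from the data.

```python
import math


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability implied by multiplying the baseline odds by odds_ratio."""
    odds = baseline_prob / (1.0 - baseline_prob) * odds_ratio
    return odds / (1.0 + odds)


# Block odds ratios reported in the text, relative to block 1.
block_odds_ratios = {2: 0.54, 3: 0.34, 4: 0.27, 5: 0.27}

# Hypothetical block 1 probability of a disadvantageous pick (illustration only).
baseline = 0.5

for block, oratio in block_odds_ratios.items():
    log_odds = math.log(oratio)  # coefficient on the log-odds scale
    implied = apply_odds_ratio(baseline, oratio)
    print(block, round(log_odds, 2), round(implied, 2))
```

An odds ratio below 1 corresponds to a negative log-odds coefficient, that is, to a lower chance of a disadvantageous pick than in block 1.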
The presence of a significant Block by Anticipatory SC interaction indicates some variability in the regression coefficients for anticipatory SC across blocks, but this variability was not reliable enough to suggest a consistent effect of anticipatory SC on disadvantageous deck selections as predicted in H4a and H4b. Specifically, further analysis of each block individually indicated that only in block 4 did a positive effect of anticipatory SC on disadvantageous selections approach significance (p = 0.054). This was further qualified by a significant interaction between anticipatory SC and version (p = 0.019), reflecting that only in the Win version was anticipatory SC positively related to deck selections. We again included Task Performance as a predictor at each step in the above analyses; however, it did not change the pattern of results and was never a significant predictor itself.

Discussion
The support for our study hypotheses is summarized in Table 2. We find that, irrespective of whether participants are asked to win, or to lose, they successfully learn (over the course of the IGT) to pick more from those decks that are advantageous to their goal. Notably, however, while in the (original) Win version, participants' final preferences favoured both advantageous decks (C and D) to a similar degree, in the Lose version it was the net loss deck with the high frequency of losses (deck A) that drove advantageous play. A preference for the low-frequency-of-loss deck (B) is common in the original (Win) version of the IGT (Steingroever et al., 2013), which we also find, though only in the initial trials. The punishment schedule in deck B is such that participants do not experience a loss until card 9, and then not again until card 14, so this initial preference is not surprising. If participants do not always easily learn that deck B is "bad" in IGT studies, it is perhaps unsurprising that our participants fail to identify this as a good choice when the goal reverses (in the lose IGT) and they "should" select it. However, the patterns of deck-preference do not simply reverse when the goal is reversed. We argue that this points to the importance of considering the frequency of losses and gains, not just the expected value (EV) of options in the IGT (as assumed by the SMH). Three lines of research support this interpretation: (1) studies that have adapted the IGT payoff structures; (2) research using "decision-from-experience" tasks in which, like the IGT, one learns from previous choices; and (3) research on how biased samples affect exploration (i.e., sampling) of the environment.
One example of the first line of evidence comes from studies that adapted the EV in the IGT, such as Chiu et al. (2008) and Yen et al. (2012); these studies showed the importance of the frequency of wins and losses for deck selections. Second, research from other experiential decision tasks finds that the principle "do what works best most of the time" is a good heuristic for predicting patterns of choice (Rakow & Newell, 2010). Thus, choosing a high-frequency over a low-frequency "punishment" deck (i.e., deck A over B) when trying to lose is compatible with choosing the action most likely to progress one's goal, even though, on average, decks A and B are equally "good" when trying to lose. Likewise, in the standard (Win) IGT, deck B delivers the best possible outcome 90% of the time, which presumably explains the comparative attractiveness of this disadvantageous option early on in the task, as the first punishment is not experienced until the 9th card in the fixed punishment schedule. Third, the developing pattern of deck preferences that we observed can also be understood from a sampling perspective. Denrell (2005) identifies an intriguing asymmetry that arises when sampling information is costly (as in the IGT, where turning a card to learn more about a deck risks forgoing a better option). If an option initially appears good, people will keep selecting it, though if initial outcomes are better than the long-run average for that option they will eventually learn to disfavor that option. In contrast, if an option initially appears bad, people will stop selecting it, and so forgo the opportunity to learn that the option is, on average, better than they initially experienced.
This is termed the "hot stove effect" (Denrell & March, 2001), following an anecdote from Mark Twain, who observed that a cat need only sit once on a hot stove for it never to do so again, but it will consequently never learn that most of the time the stove is a perfectly decent place to rest. In the standard (Win) IGT, Deck B is like Twain's "stove": most of the time it is a good option, but getting "badly burned" once or twice teaches you not to go there (Fig. 1). In the Lose IGT, Deck B contains "rare treasures" (Teodorescu & Erev, 2014): if you don't persevere, you will never know what bounty awaits you (Fig. 2). This neatly explains the patterns of preference we observe for deck B in the first block of trials, in which the first eight picks deliver gains before occasional very large losses begin to appear. Thus, in the Win version it is initially attractive, though participants persist long enough with it to learn that they should resist selecting it. In contrast, deck B is initially unattractive in our Lose IGT, and few participants persist with selections long enough to learn that it is advantageous to their goal.
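The contrast between expected value and outcome frequency, and the way deck B's apparent value flips once its first loss arrives, can be illustrated with a short calculation. This is a minimal sketch and not the analysis used in this study: it assumes the commonly cited Bechara et al. (1994) payoffs per ten cards (a gain on every card, with the loss amounts listed below), and it assumes that each deck B loss is 1250 and falls on cards 9 and 14, as described above. The names DECKS, summarize and running_mean_deck_b are hypothetical.

```python
# Assumed payoffs per 10 cards (commonly cited Bechara et al., 1994 schedule).
# Every card pays "gain"; "losses" lists the penalties incurred within 10 cards.
DECKS = {
    "A": {"gain": 100, "losses": [150, 200, 250, 300, 350]},
    "B": {"gain": 100, "losses": [1250]},
    "C": {"gain": 50, "losses": [25, 50, 50, 50, 75]},
    "D": {"gain": 50, "losses": [250]},
}


def summarize(deck):
    """Return the net outcome per 10 cards and the proportion of cards with a loss."""
    net = 10 * deck["gain"] - sum(deck["losses"])
    return {"net_per_10": net, "loss_freq": len(deck["losses"]) / 10}


def running_mean_deck_b(n_cards, loss_positions=(9, 14), gain=100, loss=1250):
    """Mean net outcome per card after each of the first n_cards picks from deck B.

    loss_positions and loss are assumptions taken from the description in the text.
    """
    total = 0
    means = []
    for card in range(1, n_cards + 1):
        total += gain - (loss if card in loss_positions else 0)
        means.append(total / card)
    return means


summary = {name: summarize(deck) for name, deck in DECKS.items()}
means_b = running_mean_deck_b(14)

for name, stats in summary.items():
    print(name, stats)
print("Deck B mean after 8 cards:", means_b[7])
print("Deck B mean after 9 cards:", round(means_b[8], 1))
```

Under these assumptions, decks A and B have the same net outcome per ten cards but differ fivefold in how often a loss occurs, and a participant who abandons deck B within its first eight cards has seen only gains. That sample looks attractive when trying to win and unattractive when trying to lose, which is the sampling asymmetry described above.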
The SMH predicts SCRs will develop after experiencing the outcomes of the decks. We find that experiencing a non-rewarded card predicted greater outcome SC, but only in the Lose version. The interaction between block and reward type showed higher outcome SC to the non-rewarded cards (as predicted by the SMH), but only in the first 40 trials of the task. There is thus some support for the claim that outcome SC differentiates between outcomes, marking those that are not conducive to one's current goal with higher SC; however, even when this difference is found, it does not persist throughout the entire 100 trials of the game, and it was not evident in the Win IGT. Bechara et al. (1997) did not report outcome SC in their findings, so comparisons cannot be made with their previous work; it is also not clear from the SMH whether differences in outcome SC should persist throughout the task at a similar magnitude or, rather, should decrease over time. However, Fernie and Tunney (2013) reported that outcome SC following rewards was higher for participants who displayed knowledge, and that outcome SC reduced for advantageous decks in the trials after participants displayed knowledge. This suggests that knowledge influences outcome SC, and our drop in outcome SC may reflect the point at which our participants were becoming knowledgeable in the Lose IGT.
We did not find that greater anticipatory SC predicted selecting from disadvantageous decks in either the Win or the Lose version. Controlling for version, the interaction between anticipatory SC and block suggests that anticipatory SC varies across the IGT but does not have a consistent impact on selections. We did not expect anticipatory SC to aid selections early on in the task, but it is difficult to know exactly where the "sweet spot" lies, where there should be a relationship between selections and anticipatory SC. We did not assess conscious knowledge in our study, which has been used to distinguish different levels of understanding experienced by participants in the IGT. In Bechara, Damasio, and Damasio (2000), anticipatory SC developed and was significantly higher prior to selecting disadvantageous decks in the "pre-hunch phase" (where participants could not articulate a successful strategy), and this difference remained until the end of the game, even once (presumed) conceptual understanding of the game was acquired (and presumably somatic markers are no longer needed to guide decisions).
A cruder assessment of participants' awareness of successful play in the task is to examine whether participants ended the game as winners (or losers). This may miss participants who grasp the concept too late in the game to recover previous losses (or gains); however, we found no beneficial effect of SC for those who performed better in the task, unlike in previous studies (e.g., Crone et al., 2004). In sum, the predictions of the SMH regarding the development of outcome and anticipatory SC were not reliably supported in either the Win IGT or our Lose IGT, where the predictions regarding SC were sometimes found to reverse. This suggests that advantageous play in both versions of the IGT can occur in the absence of somatic markers. However, as found in other IGT studies (e.g., Bechara & Damasio, 2002), there was considerable inter-individual variation in performance: 19/36 Win-version participants finished "up" or "even" on their starting balance, and 21/36 Lose-version participants finished the game with less than the original endowment (as per their goal). There is therefore a considerable proportion of participants who did not master the IGT quickly enough to finish "better off" than they started. Others have also reported that a subgroup of their (healthy) participants perform poorly in the IGT, which they attribute to participants using one of a number of different strategies (e.g., Crone et al., 2004); and so analyses have also been split into those who win versus those who lose, or into ranges of performance. Typically these report that poor performers show no anticipatory SC, or high variability in anticipatory SC (e.g., Bechara & Damasio, 2002; Carter & Smith Pasquilini, 2004; Crone et al., 2004).
However, when we included performance as a factor, we failed to find any difference in SC between our successful and unsuccessful performers, and found no moderating effects of task performance (such as those expected if only successful participants develop somatic markers that subsequently guide their choices).
More recently, Overman and Pierce (2013) reviewed the impact of real versus virtual versions of the IGT on performance, and other factors such as the impact of gender on card selections. The real/virtual IGT devised by Overman and colleagues required participants to select from real decks of cards, which were then also represented virtually on a computer in front of them. This task removes the possibility of participants questioning whether the decks and cards interact, and performance has been shown to improve, with selections of advantageous cards reaching 70-80% (depending on the number of trials administered) (e.g., Overman et al., 2004, 2006). The use of a purely virtual IGT may explain the low rates of successful performance in both the Win and Lose IGT in the current study. Overman and Pierce (2013) also found that women show a preference for the high frequency of reward decks, Decks B and D, and ruled out differences in mathematical ability, hormones and response perseveration. This gender difference may explain the Deck B preference seen in other research and evident in the early blocks of our Win IGT, which 24 females played out of the 36 participants. It may also explain the preference for Deck A in the Lose IGT, where females may drive the preference for a high frequency of punishment deck when trying to lose. Further examination of the effect of gender on performance in the Lose IGT would help shed further light on the deck selections.
In this paper, we compared the standard IGT against a novel adaptation of the IGT that reversed the predictions that the SMH makes for the task, thereby facilitating novel tests of the SMH. Participants were, in general, able to learn to succeed in either version of the task, with their pattern of deck selections strongly suggesting that the frequency of losses and gains (as distinct from EV) is critical to predicting how participants will perform in such tasks. While we observed elevated SC in response to bad outcomes, this occurred only in the initial card selections of our new Lose IGT. We did not find consistent support for the development of anticipatory SC or for its ability to guide advantageous play, a key assumption of the SMH. This was despite using more powerful methods of analysis than are typically used for the IGT.