Effects of depressed mood

Evidence shows that there are individual differences in the extent to which people attend to and integrate information into their decisions about the predictive contingencies between events and outcomes. In particular, information about the absence of events or outcomes, presented outside the current task frame, is often neglected. This trend is particularly evident in depression, as well as other psychopathologies, though reasons for information neglect remain unclear. We investigated this phenomenon across two experiments (Experiment 1: N = 157; Experiment 2: N = 150) in which participants, scoring low and high in the Beck Depression Inventory, were asked to learn a simple predictive relationship between a visual cue and an auditory outcome. We manipulated whether or not participants had prior experience of the visual cue outside of the task frame, whether such experience took place in the same or different context to the learning task, and the nature of the action required to signal occurrence of the auditory outcome. We found that all participants were capable of including extra-task experience into their assessment of the predictive cue-outcome relationship in whatever context it occurred. However, for mildly depressed participants, adjacent behaviours and similarity between the extra-task experience and the main task influenced information integration, with patterns of 'over-integration' evident, rather than neglect as we had expected. Findings are suggestive of over-generalised experience on the part of mildly depressed participants.

Learning about the relationships between stimuli or behaviour and subsequent outcomes is fundamental to human ability to function adaptively within an environment. Given our evolutionary success, it might make sense to assume that human learning follows normative rules (e.g., Crocker, 1981; Tversky & Kahneman, 1989) that would facilitate appropriate future behaviour. However, such rules often include the assumption that all information relevant to a given relationship is equally weighted. Evidence actually shows that most people weight some information types more highly than others (Kao & Wasserman, 1993; Mandel & Lehman, 1998; Wasserman, Elek, Chatlosh, & Baker, 1993) and almost neglect other equally relevant information (Mata, Garcia-Marques, Ferreira, & Mendonça, 2015; White, 2002). As we will explain below, the neglect of specific types of information is subject to individual differences and is much more apparent in depressed than non-depressed people (Msetfi, Murphy, Simpson, & Kornbrot, 2005), although the reasons for this neglect remain unclear. The current research examines this difference.
The matrix shown in Fig. 1 (below) provides a simple framework for studying the information that is thought to contribute to learning about the co-occurrence of two stimuli. Here, we refer to the stimuli as the event and the outcome in order to distinguish between them and to emphasise the temporal order of causal relationships. The letters in the cells of the matrix denote the frequency of each information type or event-outcome conjunction. ΔP is the normative measure of the strength and direction of the relationship and is based on the assumption that each event-outcome conjunction is equally relevant and important to the contingency between event and outcome (Allan, 1980). People's actions in anticipating, predicting or generating outcomes, and their judgements about the strength of the relation, can be mapped onto ΔP and this assumption tested. In the laboratory, researchers have exposed participants to event-outcome information over a number of experimental trials, usually separated by inter-trial intervals (e.g., Alloy & Abramson, 1979; Dickinson, Shanks, & Evenden, 1984), though some have used tabular presentation (e.g., Greville & Buehner, 2007; Vallée-Tourangeau, …). The events and outcomes used in previous research include, for example, the brief presentation of an auditory or visual stimulus (e.g., geometric shape), fictional real world events (e.g., fertiliser given, flower blooms) or an action made by the participant (e.g., button press), with predictive behaviours (e.g., Kattner & Green, 2016), operant behaviours (e.g., Msetfi, Murphy, & Kornbrot, 2012) and contingency judgements (e.g., Allan & Jenkins, 1983) used as dependent measures. Findings show that such measures are sensitive to different levels of contingency, and that correlations with the normative ΔP model are high (r = 0.8, Allan & Jenkins, 1980; r = 0.98, Wasserman et al., 1993), with variability that depends on the precise procedure used.
In spite of such high correlations between dependent measures and the normative model, evidence also suggests that the information in the cells of the contingency matrix is weighted unequally. One example is Kao and Wasserman's (1993) study, which involved comparisons between pairs of contingency conditions that were constructed specifically to identify the weight that participants gave to each cell or conjunction. Over a series of experiments, with both tabular and trial-by-trial presentation, data suggested that participants weighted the information unequally, such that cell 'a' > cell 'b' > cell 'c' > cell 'd', in terms of contribution to the contingency. Numerous other studies have reported evidence consistent with this finding (e.g., Crocker, 1981; Shaklee & Wasserman, 1986; Wasserman, Dorner, & Kao, 1990). Thus, overall, it seems that occasions when events and outcomes are present are more salient and perhaps are considered to be more important than their absence, even though their absence is equally relevant to the problem at hand according to a normative analysis. Furthermore, event and outcome absence information is considered of least importance and is often neglected or discounted in learning (Mata et al., 2015).
Thus, data from contingency learning studies show high correlations with the normative model along with unequal weighting and sometimes neglect of 'absence' information. In spite of this, evidence also shows that people can and will integrate 'absence' information, with a strong effect on learning. This type of integration is particularly clear when the information is presented outside of the current task frame. One example of this was provided by Msetfi et al. (2005), who exposed participants to zero contingency conditions in which the event was an action and the outcome was a light flash. The key finding was that healthy participants judged the action-outcome relation to be stronger when inter-trial intervals (ITI) were long than when they were short. It was argued that the ITI, although taking place outside of the task frame itself (the trial), contained no actions and no outcomes and as such was conceptually the same as the absence information contained in cell 'd' of the contingency matrix. Thus, results suggested that participants integrated this information into their contingency judgements and that the true contingency was inflated as a consequence. Therefore, in spite of evidence for the neglect of cell 'd' absence information, these data show that people can and will use it (see also for goal directed behaviour, Mata et al., 2015).
Another classic example of the integration of absence information from outside the task frame involves cell 'b' of the contingency table. In latent inhibition procedures (Lubow & Moore, 1959), participants are pre-exposed to an event in the absence of an outcome before the main task itself commences. They are subsequently required to learn that the same event is predictive of an outcome. Typically, results show that the absence information, which is presented initially (event-no outcome), is integrated into subsequent learning because such learning is slower in pre-exposed participants in comparison to those who did not experience pre-exposure (Escobar, Arcediano, & Miller, 2003; Gray et al., 2001). Therefore, people can and will use absence information and it has a powerful effect on learning.
So far, we have described evidence that, in some procedures, absence information is considered relevant and is integrated (e.g., Escobar, Arcediano, & Miller, 2002; Msetfi et al., 2005), yet in other procedures, absence information is neglected or simply plays a less salient role in human learning and behaviour. Whilst that topic is of interest in its own right, in the current study we are particularly interested in understanding how depression influences information neglect and integration.
One example of this is that people with mild depression are less likely than non-depressed people to integrate absence information from outside the task frame into their contingency learning. Evidence comes from studies showing that extending the duration of the inter-trial interval had no effect on contingency learning for participants with mild levels of depression (Msetfi, Murphy, & Simpson, 2007; Msetfi et al., 2005). Although other disorders are not the focus of this paper, we note that there are instances of information neglect related to other individual differences. Specifically, people with schizophrenia (e.g., Baruch, Hemsley, & Gray, 1988; Gray, Hemsley, & Gray, 1992; Guterman, Josiassen, Bashore, Johnson, & Lubow, 1996), and healthy people with high levels of schizotypy (Braunstein-Bercovitz & Lubow, 1998; Gray, Fernandez, Williams, Ruddle, & Snowden, 2002; Lubow, Ingberg-Sachs, Zalstein-Orda, & Gewirtz, 1992) also neglect absence information under pre-exposure conditions in latent inhibition procedures. Counterintuitively, given that the individual difference relates to psychopathology, this neglect leaves their learning intact. Depending on the particular situation, information integration could be helpful or unhelpful to people's ability to behave adaptively in a given situation and it is important that we understand the conditions and individual differences that affect integration.
In this paper, we are particularly interested in the effects of depression on information neglect and integration. Briefly, theoretical explanations for other individual difference effects have involved associative learning processes, invoking impaired attention and an inability to filter irrelevant information or to learn about context (e.g., Hall & Rodriguez, 2010; Lubow, Feldon, & Weiner, 1988; Msetfi, Wade, & Murphy, 2013) amongst others. However, it has also been argued that what is being measured in many such studies is not learning per se but rather (and related to methodological issues) a decision mechanism based on computation of conditional probabilities (for a discussion see: Le Pelley & Schmidt-Hansen, 2010) as in the contingency matrix given in Fig. 1.
The contingency framework is useful for our purposes here and may provide some much needed clarity. Rather than focusing on the underlying learning processes and the purpose of each component of the task (although we plan to return to these later), the contingency framework merely identifies the type of information, which is present in a given task, and has or has not been integrated into learning. This also allows us to compute a normative metric that simply describes the strength of the relationship in either case. This framework avoids some of the theoretical and interpretational confusion that is less than helpful when trying to understand individual differences. Moreover, it should allow us to identify the specific subtypes of absence information to which individual differences are sensitive.

Fig. 1.
Generic contingency matrix showing the relationship between the occurrence of an event (e.g., action or event) and the occurrence of an outcome. The notations a, b, c, and d refer to the frequencies of each event-outcome conjunction. The normative model for the one-way relation between them is ΔP, where ΔP = a / (a + b) − c / (c + d), which generates a number between −1 and +1 denoting the strength and direction of the relationship. This is based on the assumption that the weighting of each conjunction is equal (a = b = c = d).
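As a minimal sketch of the normative metric described for the contingency matrix, the ΔP computation can be written as a short function. The function name, the argument order and the example frequencies are illustrative assumptions, not materials from the study; the cell labels follow the usage in the text (cell b = event without outcome, cell d = neither event nor outcome).

```python
def delta_p(a: int, b: int, c: int, d: int) -> float:
    """Return the contingency ΔP = a/(a+b) - c/(c+d).

    a: event present, outcome present
    b: event present, outcome absent
    c: event absent,  outcome present
    d: event absent,  outcome absent
    """
    if a + b == 0 or c + d == 0:
        # ΔP is undefined if either row of the matrix has no observations.
        raise ValueError("each row of the matrix needs at least one observation")
    return a / (a + b) - c / (c + d)


# Hypothetical frequencies, chosen only for illustration.
print(delta_p(15, 5, 5, 15))    # 0.75 - 0.25 = 0.5
print(delta_p(10, 10, 10, 10))  # zero contingency: 0.0
```

Note that the function weights all four cells equally, which is exactly the normative assumption that the weighting studies discussed above call into question.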
Returning to information neglect in depression, there are a number of dissociable components to the extra-trial (inter-trial interval) information that people with depression do not integrate into their learning. Thus, it is not clear whether depression compromises sensitivity to one or a number of these aspects (see Table 1).
In the experiments reported in this paper, we planned to examine the different types of information shown in Table 1, to elucidate reasons for increased informational neglect in depression specifically and depression effects on learning more generally. As in many previous studies, participants were categorised (Msetfi, Cavus, & Brosnan, 2016; Msetfi et al., 2013) by their scores on the Beck Depression Inventory (BDI: Beck, Ward, Mendelson, Mock, & Erbaugh, 1961) as mildly depressed or not depressed. These criteria are well validated for categorising student participants into these groups and have yielded depression differences on similar learning tasks (Chase et al., 2011).
We chose to use a simple behavioural task in which participants had to learn about the relationship between the occurrence of a visual stimulus and the subsequent occurrence of a brief auditory stimulus. Absence information presented outside the current task frame was manipulated and subsequent learning was then measured via the recording of predictions made on each trial. These predictions were made by participants through their behaviour, such that both instances of actions and no actions were predictions of the occurrence and nonoccurrence of the auditory stimulus.
This procedure deviates in important ways from the no/action and no/outcome contingency procedures used to detect depression effects in previous research (Alloy & Abramson, 1979; Msetfi et al., 2005). In previous studies, a trial marker stimulus would indicate the participants' opportunity to act, which might generate the occurrence of an outcome. In the present study, an analogous set of stimuli was experienced but for somewhat different reasons. First, the predictive stimulus occurred, then the participants made their predictive response (action, or no action) and then the outcome stimulus occurred or not. So, the stimulus sequence (event, action, outcome) is almost identical between this and previous studies. This description simplifies the difference between this and previous research with the aim of allowing us to study, systematically, the effects of the various aspects of information neglect and integration, specifically 'absence' information, on learning.

Experiment 1
In this first experiment, we tested whether depressed and nondepressed people's learning about the relationship between a predictive stimulus and an outcome is affected by exposure to contextual information and the absence of outcomes which occur outside the current task frame (see Table 1). In this study, in order to make the 'additional information' explicit, the contingency learning task was divided into two phases, as in a latent inhibition task (e.g., Lubow & Moore, 1959), with the additional information presented in the first phase.
In Phase 1, 80 trials exposed participants to the stimuli of critical interest in this study. Half of the participants experienced the predictive stimulus (pre-exposure group) in the absence of reinforcement, whilst the other half (no pre-exposure group) experienced context alone in the absence of reinforcement. In addition, the experimental task context was either the same or different to the subsequent second phase. In all cases, there was no exposure to the outcome stimulus in Phase 1.
Then, in the second phase, all participants were asked to predict the occurrence of a brief auditory stimulus (explosion) by performing an action (button press). Over 160 trials, the predictive stimulus was either present (stimulus = triangle, k = 20) or it was absent (stimulus = square, k = 140). A summary of the procedure used in Phase 2 is given in Fig. 2 below, and summarises the stimuli, actions and outcomes present or absent on each trial. In no pre-exposure conditions (Fig. 2, left), there is a perfect contingency, ΔP = 1, between the predictive stimulus and the occurrence of the outcome. However, in pre-exposure conditions the perfect ΔP value is degraded by numerous presentations of the predictive stimulus outside of the task frame in Phase 1 (see +80 added to the cell frequency), and the ΔP value computes to a weak 0.2 contingency. The manipulations of context and outcome absence were extra-task-frame variables designed to influence learning in Phase 2. This allowed us to test sensitivity to two of the five information types in Table 1, whilst controlling for the others. If participants are sensitive to the absence of reinforcement information presented in Phase 1, then Fig. 2 shows that there should be weaker learning in the pre-exposure group in Phase 2. In addition, if participants are sensitive to the contextual aspect of this learning task, this pattern should only be evident in participants for whom Phase 1 context is the same as Phase 2 context. Moreover, depressive differences in sensitivity to the context, either increased or decreased sensitivity, should be evident in the size of pre-exposure effects in the different contexts.

Table 1
Dissociable informational components of inter-trial interval information.
Inter-trial intervals | Type of information
(i) No outcomes occur | Absence of reinforcement
(ii) No events occur | Absence of stimulus
(iii) No actions occur | Absence of behaviour
(iv) Inter-trial intervals take place in a context | Contextual information is present
(v) Inter-trial intervals take place outside of the experimental trial | Additional information outside the task frame

Fig. 2.
Overall summary of procedure used in Experiment 1 with ΔP calculations given for each pre-exposure group. The values in the contingency matrices relate to the frequencies of each event-outcome conjunction. The triangle is the predictive stimulus (event) and the square represents the absence of the predictive stimulus. NB. '+80' refers to the additional no outcome trials presented in Phase 1 to the pre-exposure group. * refers to the predictive response participants must make in each trial in Phase 2.
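The two ΔP values quoted for Experiment 1 can be checked with a few lines of arithmetic. The sketch below assumes the cell frequencies implied by the text (20 stimulus-present trials always followed by the outcome, 140 stimulus-absent trials never followed by it, and 80 non-reinforced pre-exposures added to cell b); the figure itself is not reproduced here, so these counts are a reading of the description rather than the authors' own table.

```python
from fractions import Fraction


def delta_p(a, b, c, d):
    """ΔP = a/(a+b) - c/(c+d), computed exactly with rational arithmetic."""
    return Fraction(a, a + b) - Fraction(c, c + d)


# Phase 2 alone (no pre-exposure group): a = 20, b = 0, c = 0, d = 140.
no_pre_exposure = delta_p(a=20, b=0, c=0, d=140)

# Pre-exposure group: the 80 Phase 1 presentations of the stimulus
# without an outcome are counted as extra cell-b experience.
pre_exposure = delta_p(a=20, b=0 + 80, c=0, d=140)

print(no_pre_exposure, float(pre_exposure))  # 1 0.2
```

Under these assumed counts, only the first term of ΔP changes (20/20 versus 20/100), which is why pre-exposure alone is enough to move the normative contingency from 1 to 0.2.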

Participants
Participants completed the BDI online to measure their mood state before being invited to participate and then, again, on arrival to take part in the experiment. These participants were recruited from two university populations (n1 = 117, 75%; n2 = 40, 25%). The resulting sample of 157 participants was assigned to the high BDI (n = 60, female: n = 39, male: n = 13) or low BDI (n = 97, female: n = 58, male: n = 24) groups, based on a median cut-off of 5 on the BDI (BDI > 5 = high, BDI ≤ 5 = low), as we have previously observed BDI effects with a cut-off point of 5 (see Chase et al., 2011). Overall and as expected, the BDI groups differed on BDI scores, t(155) = 14.92, p < 0.001, as well as DASS anxiety, depression and stress scores, all ts > 5.05, all ps < 0.001.
Participants in each mood group were then randomly assigned to one of four experimental groups: pre-exposure/same context, no pre-exposure/same context, pre-exposure/different context, or no pre-exposure/different context. The four experimental groups were successfully matched on digit span and estimated IQ: univariate ANOVAs gave F(3, 153) < 1, p = 0.977, and F(3, 153) < 1, p = 0.497, respectively, for these measures. The groups did not differ on BDI or DASS anxiety, depression and stress scale scores, all Fs < 1, all ps > 0.43. Participants from each data collection site were equally distributed across experimental groups (Table 2).

Design
A fully factorial 2 × 2 × 2 design was employed with BDI group (low BDI, high BDI), exposure group (pre-exposure, no pre-exposure) and context group (same context, different context) as between-subjects factors. Two dependent variables were analysed in this study. The first dependent variable was the number of trials taken to reach the success criterion (see Gray et al., 1992), and was recorded as the number of trials taken in Phase 2 to reach five consecutive correct responses with no false alarms. The second dependent variable used here was D-prime (D′), which compares the target 'hit' rate to the false alarm rate (i.e. Z[p(hit)] − Z[p(false alarm)]). This psychophysical measure provides an index of participants' sensitivity to the target, but normalises this for number of target versus non-target presentations, and response tendencies. The latter might be argued to inflate traditional trials to criterion measures. In addition, D′ can be calculated over trial blocks and it is therefore possible to examine speed of learning in each condition. Here we report both measures in order to ensure comparability with previous work.
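The D′ definition above, Z[p(hit)] − Z[p(false alarm)], can be sketched as follows. The function name and the example counts are hypothetical. The source does not say how hit or false alarm rates of exactly 0 or 1 were handled, so the log-linear correction used here (adding 0.5 to each count) is an assumption made only so that the inverse normal transform stays finite.

```python
from statistics import NormalDist


def d_prime(hits: int, misses: int,
            false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity index D' = Z[p(hit)] - Z[p(false alarm)].

    ASSUMPTION: a log-linear correction (+0.5 per cell) is applied so
    that rates of exactly 0 or 1 do not produce infinite Z scores.
    The original analysis may have used a different correction or none.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical single block of 40 trials: 5 target and 35 non-target trials.
perfect_block = d_prime(hits=5, misses=0, false_alarms=0, correct_rejections=35)
chance_block = d_prime(hits=5, misses=5, false_alarms=5, correct_rejections=5)
print(perfect_block > chance_block)  # True
```

Because the rates are computed separately for target and non-target trials, the index does not depend on there being far fewer targets than non-targets in each block.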
1.1.3. Materials
1.1.3.1. Beck Depression Inventory (BDI: Beck et al., 1961). The BDI is a self-report inventory of depression symptoms and has been used in research with clinical and student populations for many years. Participants were asked to choose from 21 statements that best describe them. These ranged from neutral statements (e.g., I do not feel like a failure) scored as 0, to more extreme mood related statements (e.g., I feel I am a complete failure as a person) scored up to a value of 3. Total scores could range from 0 to 63 where higher scores indicate higher levels of depression. The BDI has been validated in student samples, with correlations of 0.77 being reported between BDI scores and a psychiatric rating of severity of depression (Bumberry, Oliver, & McClure, 1978).
1.1.3.2. Depression Anxiety Stress Scales (DASS: Lovibond & Lovibond, 1995). The DASS is a 42-item self-report questionnaire that yields three subscales, measuring the severity of depression, anxiety and stress symptoms. Participants rate each item (e.g., I found myself getting upset by quite trivial things) on a scale of 0 to 3, indicating the extent to which this had applied to them in the past week. A score of 3 would indicate that the statement had applied to the participant most of the time. There are 14 items for each of the emotional states and each subscale can yield a maximum possible score of 42.
1.1.3.3. Learning task. All experimental stimuli were programmed and presented using a computer (REALbasic, 2009, Release 2.1). Realistic graphics of rooms containing a TV cabinet and a TV were used to represent distinct contexts (see Appendix 1). For the same context condition, the room in Phase 1 and Phase 2 was always a blue bedroom. The different context condition exposed participants to the same blue bedroom in Phase 1, and a green bedroom in Phase 2.
For all participants, Phase 1 involved the presentation of three-letter codes constructed using three sets of 40 randomly selected letters presented over 80 trials. The letters were shown on the screen in a sequence, like 'H Q J', and written in white bold Helvetica size 36 font with 25 pixels in between each letter. The top of each letter was positioned to be 25 pixels below the top of the shape on which it was superimposed. For participants in the pre-exposure group, the predictive stimulus appeared behind the letter codes. Participants in the no pre-exposure group simply saw the letter codes. In Phase 2, a 500 ms auditory stimulus occurred on 20 of the 160 experimental trials following presentation of the predictive stimulus. The predictive stimulus was a blue triangle shape and was 150 pixels wide and 100 pixels in height. On the other 140 trials, the predictive stimulus was absent and a blue square shape, 100 pixels wide and 100 pixels in height, was shown. During presentation, both stimuli were positioned centrally on the TV screen and the TV itself was positioned centrally in terms of screen width and 75% of screen height from the top of the screen (see Appendix 1).

Procedure
After reading an information sheet and having the opportunity to ask questions, participants gave informed consent to their participation. They then completed a range of demographic questions, the digit span task, the BDI and DASS. Demographic data were used to estimate premorbid IQ (for method and equations see Barona, Reynolds, & Chastain, 1984).
Following this, participants were asked to read the instructions for Phase 1 of the learning task (see Appendix 2). Participants would play a game during which they would be taken to a virtual room, containing a TV and games console, visualised by a realistic picture displayed on the computer screen. A series of three-letter codes would appear on the TV screen and participants were asked to note the fourth code and count how many times it reoccurred. There were 40 codes and each one was displayed twice over 80 trials in a random order. Each trial and stimulus exposure lasted for 1500 ms and was separated from the next trial by a 500 ms inter-trial interval. For the pre-exposure condition, the three-letter code was superimposed over a blue triangle on the television screen. In the no pre-exposure condition, the three-letter code simply appeared on the TV screen. In all cases, the three-letter code was centred on the TV screen. At the end of the 80 trials, participants were required to enter the number of times they had seen the fourth code using the computer keyboard.
Phase 2 instructions stated that participants would be taken to a room with a TV and games console and that a series of three-letter codes would appear on the TV screen (see Appendix 2). As in Phase 1, the room was a virtual room represented by pictures on the computer screen, and was the same for the context same group, and different to Phase 1 for the context different group. Participants were told that, during this second game, they might notice the occurrence of some small explosions. Their task was to work out the rule that guided the occurrence of the explosions. Participants were asked to press the space bar every time they thought the explosion was about to occur. The same 40 three-letter codes used in Phase 1 were also used in Phase 2, with each block of 40 codes presented in a random order during four blocks of 40 experimental trials, resulting in a total of 160 experimental trials. The timing of each trial was the same as Phase 1.
For each block, 5 of the 40 trials included presentation of the predictive stimulus for 1500 ms, followed by the sound stimulus, which had a duration of 500 ms. On the other 35 trials, the predictive stimulus was absent for 1500 ms after which the sound stimulus was not played and there was silence for 500 ms. Trial timing was therefore exactly the same as for Phase 1. Responses were only possible during the 1500 ms display of the predictive stimulus. At the end of the 1500 ms interval, the trial was coded as a response or non-response trial. Order of trials was randomised across blocks. After 160 trials, the game ended and participants were debriefed, thanked and received a nominal payment in return for their participation.
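The Phase 2 trial structure described above can be sketched as a small schedule generator. This is an illustrative reconstruction, not the original REALbasic program: the function and field names are hypothetical, the three-letter codes are omitted, and shuffling trial order independently within each block is an assumption about how the randomisation was done.

```python
import random

BLOCKS = 4               # four blocks in Phase 2
TRIALS_PER_BLOCK = 40    # 40 trials per block, 160 in total
TARGETS_PER_BLOCK = 5    # trials on which the predictive stimulus appears
STIMULUS_MS = 1500       # display (and response window) duration
OUTCOME_MS = 500         # sound, or silence on non-target trials


def make_phase2_schedule(seed=None):
    """Build a list of 160 trial descriptions for Phase 2.

    ASSUMPTION: trial order is shuffled independently within each block.
    """
    rng = random.Random(seed)
    schedule = []
    for block in range(1, BLOCKS + 1):
        trials = ([True] * TARGETS_PER_BLOCK
                  + [False] * (TRIALS_PER_BLOCK - TARGETS_PER_BLOCK))
        rng.shuffle(trials)
        for stimulus_present in trials:
            schedule.append({
                "block": block,
                "stimulus_present": stimulus_present,
                # The sound follows the predictive stimulus on every
                # target trial and never occurs otherwise.
                "outcome": stimulus_present,
                "stimulus_ms": STIMULUS_MS,
                "outcome_ms": OUTCOME_MS,
            })
    return schedule


schedule = make_phase2_schedule(seed=1)
print(len(schedule), sum(t["stimulus_present"] for t in schedule))  # 160 20
```

Fixing the number of targets per block, rather than sampling each trial independently, keeps the 5:35 ratio constant so that block-wise measures are computed over the same number of target and non-target trials.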

Results and discussion
Response data were collected on every trial in Phase 2, and a score based on trials taken to reach criterion and a value for D′ for each trial block were calculated for each participant. These data are displayed separately below and our observations were tested using between-subjects and mixed factorial ANOVAs where appropriate, with alpha held constant at 0.05 throughout unless stated otherwise. Fig. 3 suggests that for all participants, irrespective of their mood state, pre-exposure to the non-reinforced predictive stimulus in any context in Phase 1 resulted in the perception of a weak relationship between stimulus and outcome in Phase 2.

Trials to criterion
The results of the ANOVA showed that there was a significant main effect of pre-exposure condition, F(1, 149) = 39.37, p < 0.001, ηp² = 0.209, with participants in the pre-exposure group taking on average 7 more trials to reach criterion than the no pre-exposure group. There were no other significant main effects or significant interactions, all Fs < 2.09, all ps > 0.14.
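For readers who want to relate the reported F ratio to its effect size, partial eta squared can be recovered from F and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is a reader-side consistency check, not part of the original analysis, and the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Main effect of pre-exposure on trials to criterion: F(1, 149) = 39.37.
print(round(partial_eta_squared(39.37, 1, 149), 3))  # 0.209
```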

D-prime
As there was no evidence of a context effect, for clarity, we have collapsed D′ data over context conditions in Fig. 4. Increasing values for D′ indicate that target sensitivity increased rapidly over trial blocks, with the pre-exposure effect evident from block 1 to block 4.
The results of the mixed factorial ANOVA supported these observations, with a large significant effect of pre-exposure group, F(1, 149) = 34.88, p < 0.001, ηp² = 0.19. The main effect of trial block was significant, F(3, 447) = 139.01, p < 0.001, ηp² = 0.483, as was the block by pre-exposure interaction, F(3, 447) = 6.10, p < 0.001, ηp² = 0.039. Follow-up simple effects analyses showed that whilst the pre-exposure effect was significant across trial blocks, it reduced in size significantly by block 4 (block 1 ηp² = 0.312, block 2 ηp² = 0.153, block 3 ηp² = 0.091, block 4 ηp² = 0.086). There were no other effects or interactions that were reliable or approached the significance criterion. Overall, the results of this experiment showed that exposure to a stimulus in the absence of outcomes, outside of the frame of the current task, weakens learning as predicted in Fig. 2. However, the context manipulation had no effect. This was the case for both depressed and non-depressed participants. Therefore, Experiment 1 indicated no differences in learning between depressed and non-depressed people and showed that subsequent learning is sensitive to the presentation of non-reinforced stimuli presented outside the frame of the current task.
These data suggest that sensitivity to neither no-outcome information nor context is compromised in mild depression, and perhaps that previous reports of learning impairments in people with depression reflect the processing of other aspects of absence information. For example, the experimental manipulations in Experiment 1 were situated in the top row of the contingency matrix, shown in Fig. 1. As we described in the introduction, the top row of the contingency matrix relates to the frequencies of outcomes and no outcomes in the presence of the predictive stimulus. This experimental manipulation of pre-exposing participants to instances of the predictive stimulus in the absence of the outcome might be thought of as 'cell b' type experience. This manipulation failed to evidence effects of depression on learning and shows that depressed people can and will integrate information from outside the task frame into subsequent performance.
However, as we noted in the introduction, previous studies have specifically suggested that depression effects are focussed on information contained in the least weighted cell of the contingency matrix, 'cell d'. Not only does cell d include no outcomes and context, it also involves the absence of behaviour or actions. Therefore, in order to test sensitivity to absence of actions and outcomes, Experiment 2 reversed participants' predictive response requirement in Phase 2. This meant that the behaviour required to predict the outcome was 'no action', whereas an 'action' was required to predict a 'no outcome' trial. A secondary effect of this manipulation is to equate the informational content of Phase 1 pre-exposure with Phase 2.

Experiment 2
In order to test participants' sensitivity to the absence of actions and outcomes, rather than the absence of outcome only, we adjusted the experimental procedure and reversed the response requirements. In all other aspects, the procedure was identical. Thus, in Experiment 2, the absence of the predictive stimulus (rather than its presence, in Fig. 5 labelled occurrence of event) required an action; this might be thought of as an 'all clear' action. The predictive stimulus itself did not require an action because participants were required to take 'no action' if they thought the outcome was about to occur (in Fig. 5, labelled absence of event). Conceptually, this pre-exposure manipulation now involves the bottom row of the contingency table (cell d) rather than the top row (cell b). So, in other words, actions were to predict the 'all clear' (no outcome) and no action would predict outcome occurrence. The aim of this manipulation was to test participants' sensitivity to pre-exposed information when marked by the absence of actions thus equating the informational content of Phases 1 and 2. In summary, all aspects of Experiment 2 were identical to Experiment 1, with the exception of the action required on each trial.

Method
Only details that are different to Experiment 1 are given here.

Procedure
Participants were requested to press the spacebar on the computer keyboard when they thought the sound stimulus was not likely to occur, and to do nothing when the sound stimulus was likely to occur.
Fig. 5. Overall summary of procedure used in Experiment 2 with ΔP calculations given for each pre-exposure group. NB. '+80' refers to the additional non-reinforced trials presented in Phase 1 to the pre-exposure group. * refers to the predictive response participants must make in each trial in Phase 2.
R.M. Msetfi et al. Acta Psychologica 178 (2017)
Fig. 6. For low BDI groups, the data are similar to the previous experiment. Low BDI participants in the pre-exposure group took 8 trials longer to reach criterion than low BDI participants in the no pre-exposure group, who reached criterion at the first opportunity. However, high BDI participants appeared to take equally as many trials to reach criterion irrespective of pre-exposure (Fig. 6).
In order to explore the absence of the pre-exposure effect in high BDI participants further, we compared their trials to criterion scores across Experiments 1 and 2. For the no pre-exposure groups, performance was significantly worse in Experiment 2 than Experiment 1, F(1, 63) = 10.02, MSE = 46.05, p = 0.002. However, for the pre-exposure groups, there was no reliable difference in performance between the two experiments, F < 1, p > 0.5. Fig. 7 shows corresponding D′ data across each trial block and = 8.41, p = 0.004, ηp² = 0.056. There were also significant interactions between trial block and pre-exposure, F(3, 426) = 3.95, p = 0.008, ηp² = 0.027, and BDI group and pre-exposure, F(1, 142) = 4.93, p = 0.028, ηp² = 0.034. The interaction between trial block and context group approached but did not reach the level of significance, F(3, 426) = 2.60, p = 0.052, ηp² = 0.018. Follow up simple effects analyses showed that, as in Experiment 1, the size of the pre-exposure effect reduced over trial blocks (block 1 ηp² = 0.156, block 2 ηp² = 0.113, block 3 ηp² = 0.073, block 4 ηp² = 0.031), whilst remaining significant throughout. The further exploration of the BDI by pre-exposure interaction showed that whilst low BDI groups showed a strong pre-exposure effect, F(1, 142) = 21.56, p < 0.001, ηp² = 0.132, high BDI groups did not, F(1, 142) = 2.38, p = 0.125, ηp² = 0.016. In addition, whilst there was a significant difference between high and low BDI D′ values in no pre-exposure conditions, F(1, 142) = 12.99, p < 0.001, ηp² = 0.084, there was no difference between the groups in pre-exposure conditions, F < 1, suggesting that high BDI performance was equally poor irrespective of pre-exposure.
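For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom using the standard identity F × df_effect / (F × df_effect + df_error). The short sketch below applies that identity to the F values reported above; the function name is ours and the code is a reader's check, not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported partial eta squared), copied from the text.
reported = [
    (3.95, 3, 426, 0.027),   # trial block x pre-exposure
    (4.93, 1, 142, 0.034),   # BDI group x pre-exposure
    (2.60, 3, 426, 0.018),   # trial block x context group
    (21.56, 1, 142, 0.132),  # pre-exposure effect, low BDI
    (2.38, 1, 142, 0.016),   # pre-exposure effect, high BDI
    (12.99, 1, 142, 0.084),  # high vs low BDI, no pre-exposure
]

recovered = [partial_eta_squared(f, df1, df2) for f, df1, df2, _ in reported]

for (f, df1, df2, eta), value in zip(reported, recovered):
    print(f"F({df1}, {df2}) = {f:5.2f}  reported = {eta:.3f}  recovered = {value:.3f}")
```

Each recovered value agrees with the reported value to three decimal places.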
Taken together, our findings here suggest that non-depressed participants are equally sensitive to stimuli presented with no outcomes outside of the immediate task frame, irrespective of context and of whether that information is accompanied by the presence or absence of action. Depressed participants, however, only displayed sensitivity to the pre-exposure manipulation when it was marked by actions and not when it was marked by the absence of actions. In the second experiment, depressed participants' learning was equally poor whether or not they were pre-exposed.

General discussion
In this series of experiments, we set out to answer the question of why people with mild depression do not integrate contingency information presented outside the current task frame into their learning as non-depressed people do. We reasoned that properties of the information itself caused this pattern, such as differential sensitivity to absence information (of stimuli, of outcomes, of behaviour), to context, or simply having a very specific frame of reference (e.g., constrained focal set of events, Cheng & Novick, 1990) for on-going learning. We found that information presented outside of the frame of the current task influenced all of our participants' contingency learning, in some circumstances. According to these findings, people who are mildly depressed are as capable as others of integrating presentations of stimuli, which are unaccompanied by outcomes, into their contingency learning. However, when that information was accompanied by the absence of action across both experimental phases, rather than just the pre-exposure phase, sensitivity to the presence versus absence of that information was eliminated and performance was generally poor. In order to explicate these findings further, we first discuss possible explanations for these effects before discussing implications for theory, depression and limitations of the work.

Theoretical implications
Learning about the relationships between events or actions and outcomes has been subject to analysis using a contingency framework for the last 50 years. Researchers have investigated the fit between human contingency judgements and various rules for combining information from the contingency matrix into one measure (e.g., Allan, 1980;Cheng, 1997). All these rules share the assumption of equal weighting of contingency information, although this has been shown to be violated on numerous occasions (e.g., Crocker, 1981;Shaklee & Wasserman, 1986;Wasserman et al., 1990). A more fundamental assumption is that the weighting of contingency information (cells a through d) in learning is based on properties of the relevant stimuli themselves. So, for example, cell 'a' is highly weighted because the stimulus is present and the reinforcing outcome is also present. Cell 'd' is weighted low or neglected because both stimulus and reinforcing outcome are absent.
One implication of our findings may be that weighting based on 'stimulus properties' themselves may not always occur. 'Adjacent' stimulus information, such as whether other stimuli or actions are present, also influences the extent to which people weight and integrate information into their learning. We have come to this initial conclusion because the key difference between Experiments 1 and 2 is whether the pre-exposed predictive stimulus was accompanied by different actions across Phases 1 and 2, as in Experiment 1 (Phase 1: no action; Phase 2: action), or by the same actions, as in Experiment 2 (Phase 1: no action; Phase 2: no action).
For the non-depressed, the difference or similarity between the two phases did not matter; they still evidenced the predicted perception of a weak contingency between stimulus and outcome in pre-exposure conditions. This is consistent with the idea that exposure to or learning about the predictive stimulus itself and its properties governed their performance in Phase 2. At first glance, it seemed that in Experiment 2, pre-exposure information had no impact on depressed participants, suggesting complete information neglect. However, closer examination of both data sets (statistical comparison of Experiment 1 versus Experiment 2) suggested a different conclusion. In Experiment 2, the performance of depressed participants was equally poor in both conditions and comparable to their pre-exposure performance in Experiment 1.
This analysis suggests then that, for depressed participants, it was not their experience of the predictive stimulus that determined their performance but the adjacent non-action experience. Furthermore, the integration or neglect of specific types of information may not be a given based simply on the properties of the stimuli themselves. Rather, based on the performance of our depressed participants, information weighting or neglect may be dynamic and dependent on adjacent stimuli. This conclusion is based on the excellent performance that depressed participants displayed in Experiment 1 versus their very poor performance observed in Experiment 2. This cross-experiment comparison suggests that depressed participants completely integrated the adjacent no-action information in Experiment 2, whether or not a specific stimulus was pre-exposed. Thus it could be argued that the net result of this, for depressed participants, was that Phase 1 trials were integrated into their cell 'd' experience in both the pre-exposure and no pre-exposure groups, as shown in Fig. 8. This would result in a degraded contingency for both groups and is consistent with the poor performance we observed in both pre-exposed and non pre-exposed participants.
On the basis of that analysis, we can return to the contingency calculation to check the predictive power of the two stimuli that were presented during Phase 2 of the task. Recall that the triangle, the stimulus which predicts outcomes, is now relocated to the lower row of the contingency table because both in Phase 1 and Phase 2 it occurs along with no actions (see Fig. 8, bottom half of the contingency tables). Assuming the complete integration of Phase 1 trials (+80), the predictive power of the triangle computes to a weak contingency, where ΔP = 0.2. The predictive power of the square stimulus, which does not predict the outcome, also computes to a weak contingency, but in the opposite direction, where ΔP = −0.2. Therefore, the contingency analysis of these observed effects indicates that, for depressed participants, the predictive and non-predictive cues had equally weak predictive power (i.e., ΔP = 0.2 and ΔP = −0.2). Fig. 8 further suggests that the individual differences reported here are entirely consistent with a contingency framework applied to information integration. As we argued in the introduction, the contingency framework is attractive because it often provides an accurate account of the decisions people make in such conditions (Le Pelley & Schmidt-Hansen, 2010) and allows us to focus on the informational content of experience on which individual differences are based. However, this approach does not provide much insight into the processes underlying the individual difference.
For such insight, we might look to associative models, such as the Rescorla-Wagner model (Rescorla & Wagner, 1972) and Pearce's model of stimulus generalisation (Pearce, 1987), which explicitly show how 'adjacent' stimuli can reduce the amount of associative strength (a construct which is used as isomorphic to contingency strength) that any one stimulus can accumulate. Thus in our contingency analysis, shown in Fig. 8, of the depression effect reported here, we showed how depressed participants might be exposed to two very weak contingency conditions, with their performance being consistent with this. At asymptote, associative models often produce identical predictions to the contingency framework but for different reasons. From that perspective, every stimulus present shares the limited amount of associative strength available for any given outcome. Thus, additional stimuli present on learning trials will always compromise the ability of the others to acquire strength because of the shared nature of associations (e.g., Shanks, 1989). This process is known as cue competition. In addition, it is also the similarity between configurations or groups of stimuli, which includes all stimuli present in a given context, that affects the extent to which learning in one situation (e.g., Phase 1) generalises to another situation (Phase 2) (Pearce, 1987).
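Cue competition of the kind described above can be illustrated with a minimal simulation of the Rescorla-Wagner learning rule, in which each cue present on a trial is updated in proportion to the shared prediction error (λ minus the summed strength of all cues present). The sketch below is an assumption-laden illustration, not a model fitted to the present data: the salience values, the number of trials, the single learning-rate parameter β and the cue labels are all hypothetical, and Pearce's configural model is not implemented.

```python
def rescorla_wagner(trials, alpha, beta=1.0, lam=1.0):
    """Simulate Rescorla-Wagner learning.

    trials: list of (cues_present, outcome_present) pairs.
    alpha:  dict mapping each cue name to its salience (hypothetical values).
    Returns a dict of associative strengths after all trials.
    """
    strength = {cue: 0.0 for cue in alpha}
    for cues, outcome_present in trials:
        target = lam if outcome_present else 0.0
        # Prediction error is shared by all cues present on the trial.
        error = target - sum(strength[cue] for cue in cues)
        for cue in cues:
            strength[cue] += alpha[cue] * beta * error
    return strength


salience = {"cue": 0.2, "adjacent": 0.2}  # assumed equal salience

# Predictive cue reinforced on its own for 50 trials.
alone = rescorla_wagner([(("cue",), True)] * 50, salience)

# Same cue always accompanied by an adjacent stimulus (e.g., context or no-action).
shared = rescorla_wagner([(("cue", "adjacent"), True)] * 50, salience)

print(f"cue alone:       V = {alone['cue']:.2f}")
print(f"cue in compound: V = {shared['cue']:.2f}")
```

With these assumed parameters, the cue trained alone approaches the full asymptote, whereas the cue trained alongside an equally salient adjacent stimulus acquires only about half of the available associative strength.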
These theories might help answer the question as to why there are differences between the depressed and non-depressed. It could be that cue competition itself is affected by depression. If this were the case then, for depressed people, this would mean that the predictive stimulus, and arguably the non-action, accumulate little of the available associative strength. Hence, in Experiment 2, their performance was equally poor across conditions. Another suggestion links to the similarity between the task phases. It could be argued that the components of the learning task (e.g., context, no action) were more similar across the two learning phases in Experiment 2 than in Experiment 1. This might indicate that, for our depressed participants, the greater similarity between Phases 1 and 2 drove strong generalisation from all of the events that occurred in Phase 1 to the events that occurred in Phase 2. Note that we use the term generalisation loosely here to indicate that Phase 1 exposure affected and compromised Phase 2 performance similarly in pre-exposure and no pre-exposure conditions. So, from the experimenter's perspective, the only pre-exposed stimulus was the subsequently predictive stimulus (the blue triangle), and this pre-exposure effect was always evident for non-depressed participants. However, for depressed participants, the entire array of stimuli including context and non-actions may have acted as the pre-exposed stimulus, which then generalised to Phase 2, compromising learning in both pre-exposure and no pre-exposure groups. In order to explain compromised performance, we have to assume that both groups of depressed participants experienced a pre-exposure effect. Indeed, our data are consistent with this idea.
There are multiple theoretical approaches which attempt to explain the processes which underlie pre-exposure effects, including retrieval failure due to non-reinforced stimulus exposure (e.g., Bouton, 1993) and loss of salience (Rescorla & Wagner, 1972) or associability (e.g., McLaren & Mackintosh, 2000), but there is little consensus in the literature on which theory provides the most convincing account (Le Pelley & Schmidt-Hansen, 2010). However, the important point to note here is that it was the additional between-phase similarity evident in Experiment 2 that affected depressed people's performance patterns, but not those of non-depressed people, and this is why we might term their performance 'over-generalised'.
This suggestion is interesting because in many cases depression's effects on contingency learning have led to 'better' learning about actions and outcomes (e.g., depressive realism: Alloy & Abramson, 1979) because of context information being neglected in their judgements. In this case, when depression effects were apparent, poorer learning resulted and this was not linked to learning about context. This explanation is consistent with other evidence of 'over-generalisation' in depression (e.g., Rekart, Mineka, & Zinbarg, 2006;Williams & Scott, 1988), which contributes to depressive thinking styles and over-generalisation errors as described in the cognitive theory of depression (Beck, 1967;Clark, Beck, & Alford, 1999). Similar generalisation trends are evident in schizophrenia (e.g., Wood, Brewin, & McLeod, 2006) and other psychopathologies (see Hackmann & Holmes, 2004), suggesting that generalisation may play a role in other well-known failures to integrate information from outside the task frame (e.g., Baruch et al., 1988;Gray et al., 1992;Guterman et al., 1996). A further implication is that for non-depressed participants there is an optimal generalisation profile, based on generalising relevant information, which only results in impaired learning in specific conditions. We do note, however, that this is only one possible explanation for our findings.
For example, it could be argued that differences in learning between our depressed and non-depressed participants are based on the manipulation of the response alternatives between Experiments 1 and 2, and what we are reporting here is merely an artefact of behavioural passivity (e.g., Blanco, Matute, & Vadillo, 2009;Msetfi, Kumar, Harmer, & Murphy, 2016) in depression. We can reject this suggestion for several reasons. Firstly, in both experiments, behavioural action and non-action were both required for good performance. A participant who, in Experiment 1, was able to predict the auditory outcome with the action very successfully would also have to withhold that action on non-predictive trials. An error on either trial type would result in the trials to criterion count being reset to zero. Similarly, in Experiment 2, whilst non-predictive trials required the action and predictive trials did not, an error on either trial type would reset the count. Any depressed participant with aberrant response levels would produce poor scores in both experiments. Moreover, in both experiments, we calculated a target sensitivity measure, known as D-prime, which is used in the psychophysical literature to measure target sensitivity whilst discounting any kind of response bias. The D-prime data produced identical results to the success criterion data. This suggests that our findings of depression effects are not simple behavioural artefacts of the different response alternatives but are valid learning effects.
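To make the logic of the D-prime argument concrete, the measure is conventionally computed as the difference between the z-transformed hit rate and the z-transformed false alarm rate. The sketch below is a minimal illustration with hypothetical response counts. The log-linear correction for extreme rates (adding 0.5 to each count and 1 to each total) is our assumption; the correction used in the original analysis, if any, is not stated here.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index: z(hit rate) - z(false alarm rate).

    A log-linear correction (assumed here) keeps rates away from 0 and 1,
    where the z-transform is undefined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical participant who discriminates predictive from non-predictive trials.
discriminating = d_prime(hits=18, misses=2, false_alarms=2, correct_rejections=18)

# Hypothetical participant who responds on every trial regardless of the cue.
always_responds = d_prime(hits=20, misses=0, false_alarms=20, correct_rejections=0)

# Hypothetical participant who never responds.
never_responds = d_prime(hits=0, misses=20, false_alarms=0, correct_rejections=20)

print(f"discriminating:  d' = {discriminating:.2f}")
print(f"always responds: d' = {always_responds:.2f}")
print(f"never responds:  d' = {never_responds:.2f}")
```

In this sketch, indiscriminate responding and indiscriminate passivity both yield a sensitivity of zero, which is why a response bias alone cannot produce a high D-prime score in either experiment.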

Limitations
An obvious limitation of this study is that the depressed participants (high BDI groups) scored above the median on the depression scale rather than scoring as depressed based on any standardised, clinically informed cut-off score (Beck et al., 1961). Here, as in other studies (Chase et al., 2011), we used this strategy for pragmatic reasons, to maximise our sample size, but also under the assumption that the differences between non-depressed and depressed are continuous rather than qualitative in nature (Cox, Enns, Borger, & Parker, 1999) and should therefore be present to a lesser degree in participants who score just above the median. In order to reassure readers on this point, we reanalysed our data excluding participants who scored between 5 (cut-off used in this experiment for mild depression) and 9 (clinical cut-off) on the BDI. Consistent with the continuity assumption, effects were stronger in this reduced sample (ηp² = 0.07) than in the mildly depressed sample who scored above the median (ηp² = 0.05).

Conclusions
As an answer to the question we posed in this paper, we conclude that depressed and non-depressed participants are both able to integrate no outcome information from outside the current task frame into subsequent predictive learning. However, for mildly depressed participants, integration was dependent on the similarity of accompanying information, and we conclude that they experience strong patterns of generalisation which can compromise their contingency learning in some conditions.

Appendix A. Appendix 1
Stimuli and room pictures.

Appendix B. Appendix 2
Phase 1 instructions. "This experiment consists of two stages. You will be given the instructions for stage 2 after you have completed the first stage. During the game, you will be in a room where there is a TV. The TV is connected to a games console, which has been set up and placed ready for you to play. The room will look like the one shown on the right of this screen. The game involves computer codes. In the first stage of the game, you will see a series of 3-letter codes appear on the TV screen one after the other. It is very important that you pay attention to these codes. Your task is to identify the 4th code and then to count how many times it is repeated. You will be asked to report the number of repetitions at the end of this stage of the experiment. This part of the experiment will take about 3 minutes. Please ask the experimenter if you have any questions."
Phase 2 instructions, Experiment 1. "In the second phase of the experiment, you will be in a room where the TV and the games console have been set up and placed ready for you to play a game. The room will look similar to the picture you can see on the right of the screen now. You will see a series of 3-letter computer codes presented one after the other on the TV screen. During the game, you will also notice the occurrence of some small explosions. It is your task to discern the rule that guides when the explosion happens. Whenever you think an explosion is ABOUT to take place, you should press the SPACEBAR on the computer keyboard. Even when you think you know the rule, please continue with the game. This part of the experiment will take about 6 minutes. Please ask the experimenter if you have any questions."
Phase 2 instructions, Experiment 2. "In the second phase of the experiment, you will be in a room where the TV and the games console have been set up and placed ready for you to play a game. The room will look similar to the picture you can see on the right of the screen now. You will see a series of 3-letter computer codes presented one after the other on the TV screen. During the game, you will also notice the occurrence of some small explosions. It is your task to discern the rule that guides when the explosion is about to happen and when it is not about to happen. So you need to do two things: Whenever you think that an explosion IS about to happen, signal this by doing nothing. So DO NOT PRESS the spacebar on the computer keyboard when you think that there WILL be an explosion. Whenever you think an explosion IS NOT about to happen, signal this by pressing the spacebar. In other words, only PRESS the spacebar on the computer keyboard every time you do not think an explosion will occur. Even when you think you know the rule, please continue with the game. This part of the experiment will take about 6 minutes. Please ask the experimenter if you have any questions."