The Importance of Retrieval Context in Visual Working Memory

Two theoretical accounts of visual working memory capacity, the Discrete Slot Model and the Shared Resource Model, offer contrasting views of how people remember a set of objects. The Discrete Slot Model defines working memory capacity in terms of number alone, whereas the Shared Resource Model also takes into account the resolution of the stored representations. Research has raised questions about the retrieval arrays of the change-detection tasks used to assess these models, in particular the relative benefits of single and multiple retrieval probes. The current investigation addressed the retrieval probe question (single versus multiple) by assessing visual working memory with an adapted change-detection paradigm. The paradigm was manipulated to provide three working memory retrieval contexts: 1) a full array with a cue; 2) a single probe in a peripheral location; and 3) a single probe in a central location. Results suggested no overall benefit of a multiple retrieval probe over the single retrieval probes; however, a suggestion was made about the benefits of using single-probe arrays only, to avoid an advantage from spatial memory cues. Results are discussed in terms of methodological improvements for the future research suggested.


Introduction
Early perspectives of memory discussed short-term memory as the temporary storage of information [1], with visual, verbal and spatial types of material being stored in one single short-term memory component [2][3][4]. In more recent views, such as those of Baddeley [5] and Logie [6], short-term memory has been redefined as 'working memory', which involves not only basic storage mechanisms but also the processes involved in maintaining the stored information. Both Baddeley [5] and Logie [6] identified the separation of different types of material stored within working memory, suggesting that there are separate phonological and visual stores. Logie [6] later distinguished between separate visual and spatial stores within memory, identifying the visual cache as the visual-specific component and the inner scribe as the spatial-specific component. When considering any working memory approach, thought must be given to the type of task used to investigate the different aspects of working memory (phonological, spatial and visual). Hamilton et al. [7] examined spatial working memory using a spatial tapping task specifically designed to target the spatial working memory components (the inner scribe in Logie's model and the visuospatial sketchpad of Baddeley [5]). One finding from the investigation indicated that, although the task was designed to assess spatial working memory, it also drew on more general attentional components of working memory such as the central executive, as well as long-term memory and potentially phonological information. Hamilton et al. [7] concluded that future researchers must be careful when using tasks to assess particular aspects of working memory, as tasks may not be as domain specific as first thought.
A similar study from Brown and Wesley [8] used a visual matrix task and found evidence of potential verbal strategy involvement, suggesting that the task is not specific to visual memory. A more recent visual working memory task, known as change detection [2], has also been designed to assess visual working memory. It has been shown that this task may also involve verbal working memory; however, there are other factors that need to be considered when investigating visual working memory, such as the array size presented and the type of array presented at retrieval. The current paper discusses an experimental investigation of a version of such a change-detection task and the potential contexts that need to be considered before the task is used in further settings.
Large categorical changes, also known as quantitative changes, in visual memory capacity can be assessed using change detection paradigms similar to those of Luck and Vogel [2]. Luck and Vogel researched these quantitative changes in visual arrays, proposing the Discrete Slot Model (1997) as a model for defining visual working memory capacity. As identified in chapter 1, this model proposes a simple 'slot' approach to visual working memory storage, with a limit of approximately 3-4 items. Change detection paradigms have been employed to investigate the Discrete Slot Model in numerous pieces of literature [2,9,10]; however, there remains an issue regarding the potential organisation of the array elements and its implications for the type of retrieval context used within these paradigms. Given the potential for hierarchical organisation [11,12], are single or multiple stimulus retrieval probes the most appropriate to use in a visual change detection task? The current investigation aims to address this issue by using a methodology which includes both single and multiple retrieval probe types.
Brady and Alvarez [11,12] investigated visual working memory capacity, suggesting that multiple objects in a retrieval/probe array could influence the recall of an object in working memory. In their research, participants were asked to remember the size of one red, blue or green circle out of an array of seven circles. The circles were of differing sizes; however, each type of colour was presented with the same diameter. Results were in line with the predictions made, demonstrating that participants would pay attention to the full array, being biased towards the size of a retrieval circle that was the same colour as in the encoding arrays. For example, if a participant was shown a blue circle with a 1 cm diameter at retrieval but had been shown blue circles with a 3 cm diameter at encoding, the participant would recall the single blue circle as having a 3 cm diameter. Brady and Alvarez [11] suggested a configuration issue here, meaning that other objects in the array could influence how accurately participants recalled the retrieval object. In the research presented by Brady and Alvarez [11], participants used the full array to aid recall and therefore paid attention to all objects in the array. As the circles were not all the same size, errors in recall arose because participants misjudged the size of the circles based upon the presence of other stimuli within the array.
Luck and Vogel [2], on the other hand, argued that participants store objects as independent units; therefore, there are no influences from other array items as Brady and Alvarez [11] had suggested. In their earlier work, Luck and Vogel [2] had proposed that people do not pay attention to the full array when encoding the image. In a series of smaller investigations, it was demonstrated that people store individual items within memory, whether these are single-featured or multiple-featured items. In this research, it was suggested that people could store simple shapes such as squares and also more complex shapes such as squares with different coloured borders. Because each item fills a slot within working memory, there is no influence of the remaining objects in the array, meaning that multiple probe retrieval contexts do not have any advantage over single probes. This underpins one of the hypotheses, as predictions will be made regarding the differences between the multiple and single retrieval probes.
Jiang et al. [12] investigated the organisation of material in visual short-term memory, in particular the influence of spatial information in arrays, such as the location of an object. The researchers suggested that the configuration of each array can affect how accurately a series of objects is remembered. For example, participants need to pay attention to spatial aspects of an item, such as location, so that they can combine both visual and spatial information in memory to aid recall. Jiang et al. [12] suggested that arrays are stored based on global configuration, meaning that arrays are stored as a whole, with people considering both spatial and visual aspects at the same time [13].
In a series of eight smaller experiments, the basic paradigm from Luck and Vogel [2] was used; however, it was adapted to incorporate conditions where the spatial configuration was manipulated. Overall results demonstrated that configuration within a stimulus element is an important aspect of visual working memory storage, for example, being able to combine colour and location in memory to assist with recall. This process is known as binding and is discussed in detail by Allen et al. [14]. Jiang et al. [12] highlighted the importance and advantage of using multiple probes here. In a single probe condition, there is no combination of colour and location at retrieval, meaning it may be more difficult to identify which colour was present at the time of encoding. For the purposes of the current investigation, an advantage of this full retrieval context would mean that this context is not appropriate for the measurement of visual working memory capacity. As researchers are aiming to avoid the use of spatial cues such as those presented in full arrays, any spatial advantage of the full arrays would mean that this retrieval context would not be used.
A key piece of literature from Wheeler and Treisman [1] also raised questions with regards to the retrieval context of the Luck and Vogel [2] change-detection task protocol. The researchers investigated retrieval arrays in terms of single versus multiple objects in these arrays. It was found that multiple object probe arrays can cause problems such as mis-binding in working memory, contradicting the research from Brady and Alvarez [11]. As part of their initial investigation, Wheeler and Treisman [1] also used the paradigm created by Luck and Vogel [2], and it was suggested that when participants are presented with several objects, it can be difficult to bind the correct object's colour and location from their own capacity store, causing binding errors in memory recall. Wheeler and Treisman [1] suggested using single object retrieval arrays as a way of reducing this error; however, it was not investigated which type of single retrieval probe should be used, for example a single central or single peripheral retrieval probe. Such a hypothesis would suggest an advantage in performance within the single probe context and could give current researchers an indication of whether a single retrieval context is the most appropriate to use within memory research.
Prior change detection research has successfully employed a single probe retrieval context. Jackson et al. [15] used single retrieval probes successfully when investigating the visual working memory capacity for faces. The researchers presented between one and four faces in the encoding array; however, as with Wheeler and Treisman [1], only one face was presented at retrieval. This provides further support for the view that a single retrieval probe may be the most appropriate to use, as single probes have been used in both shape and face contexts within visual memory capacity research.
The previous research, above, raises a question about the retrieval context which needs to be addressed. The aim of this initial study is to identify whether performance in the single probe retrieval context differs from the conventional Luck and Vogel [2] full array protocol.
Participants will be exposed to three types of retrieval arrays. The first condition will consist of a full retrieval array with one cued square [2]. The second condition will consist of a single central square and the third condition will consist of a single peripheral square in any of the eight possible locations. Wheeler and Treisman [1] had concluded that single retrieval probes were the most appropriate to use; however, they did not specify whether a central location or a peripheral location was the most appropriate, so this will also be investigated. Array sizes of 1, 2, 4, 6 and 8 will be used during this pilot investigation as these had been used by Wheeler and Treisman [1] and Luck and Vogel [2]. It was decided not to use the array size of 12 which had been used by Luck and Vogel [2], as this size was too demanding on memory and did not reach the 60% performance level required for the current doctoral investigations.

Calculating K-scores
As a way of calculating the working memory capacity of each retrieval context, K-scores will be created using the formulae from Cowan et al. [16]. There are two versions of the formula: one for the full array retrieval context, K = N * (H + CR - 1) / CR, and one for the single retrieval conditions of the peripheral and central probes, K = N * (H + CR - 1). The formulae take into consideration both the hit rate (H), the proportion of changes correctly detected, and the correct rejection rate (CR), the proportion of non-change trials correctly identified as non-changes. The array size (N) is also an important factor in these equations, as the array size will influence the number of items held in memory.

1)
Participants will perform more accurately in the single central retrieval array condition, with this condition having the highest average score across all array sizes.

2)
Array size 8 (in all conditions) will be the most difficult array size to complete; therefore, participants will have poorer performance on this block of trials.

3)
Array sizes 1 and 2 will have performance levels of nearly 100%, similar to those of Luck and Vogel [2].

4)
The K scores for the single central retrieval condition will be higher than those of the full array with a cue and the single peripheral condition, indicating more accurate performance and a larger capacity score in this condition.

Method

Design
A 3 x 5 repeated measures design was used, as all participants took part in each experimental condition. Factor 1 was the retrieval context, consisting of three levels: single central probe, single peripheral probe and full array with cue. Factor 2 was the array size, consisting of five levels: array sizes 1, 2, 4, 6 and 8.

Participants
A total of 15 undergraduate psychology students (11 females and 4 males; mean age = 21, SD = 2.05) received four course credit points for their participation in the experiment. All participants were recruited from a North East University.

Materials
There was only one quantitative change detection task used during this experiment, created using E-Prime 2.0.

Quantitative change detection task (all conditions)
A practice task was created for all participants to complete before the main experimental phase began. This practice task contained arrays from all five array sizes and all three retrieval conditions, with three trials for each separate retrieval condition and array size. The experimental change detection task contained five randomly ordered experimental blocks with each block containing one array size. The array sizes were 1, 2, 4, 6 and 8 coloured squares. In each experimental block 60 trials were shown, giving a total of 300 trials in the full experiment. The 60 trials in each experimental block consisted of: 20 retrieval arrays that were created using a full array (of the chosen array size) with one cued square; 20 retrieval arrays which presented only one square in a central location and the final 20 trials contained only one square at a randomly chosen peripheral location. After each group of 20 trials, a three-minute rest break was given to participants; however, participants could carry on without a rest break by pressing the 'SPACE' bar on the keyboard. Please see

Trial procedure
Each trial began with a fixation display presented for 1000 ms, followed by an encoding array of 1, 2, 4, 6 or 8 squares presented for 500 ms on a computer screen [2]. A maintenance display containing one central cross was then shown for 900 ms. Finally, a retrieval array was presented for 3000 ms or until the participant pressed the corresponding key on the keyboard.
Participants had to decide whether the highlighted square in the retrieval array was the same colour as, or a different colour from, one in the encoding (memory) array. Participants pressed 'z' on the keyboard when the same colour had been presented and 'm' when a different colour had been presented. In the example shown in Figure 1.3, a 'z' response would be required.

Procedure
The current investigation was ethically approved by the University Health and Life Sciences Ethics Committee. The total testing session lasted one hour. The task was fully explained to participants before the testing phase began; participants read an information sheet and signed a consent form.
Participants were asked to work through the task themselves, pressing the appropriate keys on the keyboard when prompted. Before the task began, instructions were displayed on the screen and participants completed a series of practice trials. Participants then had to press the 'SPACE' bar to continue to the experimental phase of the task. Before each new testing block began, the array size was displayed on the screen so that participants knew the array size being presented next. This prepared the participant for the upcoming array size to reduce any confusion. When rest breaks were given during the task, participants were instructed to press the 'SPACE' bar on the keyboard to continue with the task. The researcher did not prompt the participant to do so.
When testing was completed, participants were notified on screen and asked to wait for further instructions from the researcher. Participants were then thanked and fully debriefed, including a reminder of the right to withdraw.

Scoring
The first part of the analysis used a simple scoring procedure, total correct score, where participants were awarded 1 point for a correct response and 0 for an incorrect response. This enabled a qualitative comparison with the original Luck and Vogel [2] pattern of results. The total score for the whole task had a maximum of 300. The maximum score for each array size was 60, and the maximum score for each array size within each of the three retrieval conditions was 20.
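The scoring scheme can be sketched in Python as follows. The trial records, the condition labels and the name `score_trials` are hypothetical; the sketch assumes each trial is stored as an (array size, retrieval condition, correct) tuple and only illustrates how the maxima of 300, 60 and 20 arise from the design.

```python
from collections import Counter
from itertools import product

ARRAY_SIZES = (1, 2, 4, 6, 8)
CONDITIONS = ("full_cued", "single_central", "single_peripheral")
TRIALS_PER_CELL = 20  # 20 trials per retrieval condition within each block


def score_trials(trials):
    """Sum 1 point per correct response for each (array size, condition) cell."""
    cell_scores = Counter()
    for array_size, condition, correct in trials:
        cell_scores[(array_size, condition)] += 1 if correct else 0
    return cell_scores


# Hypothetical participant who answers every trial correctly.
perfect = [
    (size, cond, True)
    for size, cond in product(ARRAY_SIZES, CONDITIONS)
    for _ in range(TRIALS_PER_CELL)
]
cells = score_trials(perfect)

total = sum(cells.values())
per_size = {s: sum(cells[(s, c)] for c in CONDITIONS) for s in ARRAY_SIZES}
print(total, per_size[8], cells[(8, "single_central")])  # 300 60 20
```

A perfect score therefore gives 300 overall, 60 per array size and 20 per array size within each retrieval condition.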
For the purposes of this analysis, researchers were primarily concerned with the differences between the retrieval array types (full with cue, single central or single peripheral) and the differences between the five array sizes (1, 2, 4, 6 and 8 squares). This was followed by the more appropriate K score analyses. Once participants with negative K scores had been excluded, only 11 participants were included in the K analysis instead of the original 15.

Overall raw data ANOVA results
A 3 (retrieval condition) × 5 (array size) repeated measures ANOVA was conducted on the raw data to look at any potential effects of array size and retrieval condition. This was an initial analysis using all 15 participants' raw data.
There was no significant main effect of the probe/retrieval condition on recall accuracy, F(2,13)=0.535, p=0.598, partial η²=0.076, suggesting that participants did not perform more accurately in any one condition. However, a significant main effect of array size was found, F(4,11)=59.39, p<0.001, partial η²=0.956. Bonferroni-corrected comparisons showed that performance at array size 8 (M=38.20, SD=6.85) was less accurate than at all other array sizes. Please see Table 1 for details of the means and standard deviations of each array size.

Performance levels
In order to make a qualitative comparison with the original Luck and Vogel [2] findings, performance levels were calculated for each array size by totalling each participant's score for all three conditions in each array size. This total was then divided by the highest possible score of 300.

K-scores
For this section of the analysis, 4 participants' data were excluded because their K-scores were calculated to be negative. As working memory capacity cannot have a negative score, these data were not used.
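The K-score formulae given earlier can be illustrated with a minimal Python sketch. The function name `k_score` and the example rates are assumptions made for illustration and are not taken from the study's data; the sketch simply shows that a participant whose hit and correct rejection rates sum to less than 1 obtains a negative K, which is the situation that led to exclusion here.

```python
def k_score(n, hit_rate, cr_rate, full_array):
    """Capacity estimate K for one array size.

    n          -- array size (N)
    hit_rate   -- H, proportion of change trials answered 'different'
    cr_rate    -- CR, proportion of no-change trials answered 'same'
    full_array -- True for the full cued array, False for a single probe
    """
    corrected = hit_rate + cr_rate - 1  # equals H minus the false alarm rate
    if full_array:
        if cr_rate == 0:
            raise ValueError("CR must be above 0 for the full array formula")
        return n * corrected / cr_rate
    return n * corrected


# Hypothetical rates, chosen only to illustrate the arithmetic.
print(k_score(4, 0.9, 0.8, full_array=False))  # single probe formula
print(k_score(4, 0.9, 0.8, full_array=True))   # full cued array formula
print(k_score(8, 0.4, 0.5, full_array=False))  # negative K: would be excluded
```

With identical hit and correct rejection rates, the full array formula returns a larger K than the single probe formula whenever CR is below 1, because of the division by CR.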

K-score ANOVAs
Because the previous raw data analysis showed no effect of retrieval context, a simple analysis was conducted on the K-scores to see if any effects were present. A 3 (retrieval condition) x 5 (array size) repeated measures ANOVA conducted on the K-scores showed a significant main effect of retrieval context (F(2,9)=10.071, p=0.005, partial η²=0.691) and also a significant main effect of array size (F(4,7)=278.736, p<0.001, partial η²=0.994). There was no interaction effect present (F(8,3)=2.650, p=0.228, partial η²=0.876). Please see Table 2 for the averages of each individual K condition (Figure 2.3).
Bonferroni post hoc analyses on the retrieval conditions revealed a significant difference between the full cued array condition (M=2.72, SD=0.96) and the single peripheral location condition (M=1.93, SD=0.78, p=0.012), with the full cued arrays having a higher K score than the single peripheral condition. A significant difference between the full cued array condition and the single central condition was also found. This indicates that although there were no significant differences in the raw scores between the conditions, there was a difference in the K-scores of each condition, suggesting higher K-scores for the full arrays.
Bonferroni post hoc analyses on the array size data also demonstrated that most array size differences were significant (all p<0.05) except the differences between array sizes 2-8 (p=0.088), 4-6 (p=1), 4-8 (p=1), and 6-8 (p=1). It could be suggested that once the participant had hit their capacity limit of 3-4 items, then there was no difference in storage of the higher array sizes indicating that approximately 3-4 items were stored regardless of array size.
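As a consistency check on the effect sizes reported for the ANOVAs above, partial η² can be recovered from an F ratio and its degrees of freedom using the standard identity partial η² = (F × df1) / (F × df1 + df2). The sketch below is illustrative only; the function name is an assumption, and the inputs are the F values and degrees of freedom reported in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) as reported in the text
reported = [
    ("raw scores: retrieval condition", 0.535, 2, 13),
    ("raw scores: array size", 59.39, 4, 11),
    ("K-scores: retrieval context", 10.071, 2, 9),
    ("K-scores: array size", 278.736, 4, 7),
    ("K-scores: interaction", 2.650, 8, 3),
]
for label, f_value, df1, df2 in reported:
    print(f"{label}: {partial_eta_squared(f_value, df1, df2):.3f}")
```

Each computed value rounds to the partial η² reported alongside the corresponding F statistic (0.076, 0.956, 0.691, 0.994 and 0.876).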

K-score correlations
Average scores of array sizes 4, 6 and 8 were taken across each retrieval context condition. Array sizes 1 and 2 were not used in this analysis as they displayed ceiling effects, with performance levels of over 90%. Correlations were then conducted on the data to look at the relationships between the retrieval conditions. There was no significant correlation between the full cue retrieval context and the single central retrieval context (r=0.487, p=0.129). The correlation between the single central and single peripheral retrieval contexts was significant (r=0.621, p=0.018), indicating a positive relationship between the two single retrieval conditions. There was no significant correlation between the full cue retrieval context and the single peripheral retrieval context (r=0.003, p=0.992). The presence of a link between the two single retrieval conditions, but not between either of them and the full cued condition, indicates that these types of retrieval conditions could be measuring working memory capacity in different ways (Table 3).

Discussion
The current experiment aimed to investigate the importance of retrieval context within working memory change detection paradigms and to discover which of three retrieval conditions was the most appropriate for assessing visual working memory capacity. The change detection protocol identified by Luck and Vogel [2] was adapted to include three retrieval contexts: a full array, a single central array and a single peripheral location array. It was predicted that the single central retrieval probe would be the most appropriate retrieval context to use, as participants would perform more accurately in this condition; however, this was not the case, and no retrieval context effects were found in the initial raw data analyses. It was also predicted that array size 8 would be the array size on which participants would score the lowest. This prediction was supported, with array size 8 having the lowest performance levels, and results also indicated ceiling effects with array sizes 1 and 2. With regards to the K scores, it was predicted that higher K scores would be shown in the single central retrieval condition, demonstrating higher working memory capacity with no influence of spatial cues or the full arrays. This prediction was not supported, as there were significant differences between the full arrays and both of the single probe retrieval conditions, with the full array condition demonstrating a higher K score. The lack of correlation between the full and single probe conditions could indicate that there is an advantage within one condition. It can be suggested that the full array condition does have a spatial advantage, which would explain why it does not correlate with the single retrieval conditions.
As no initial effect of retrieval context was found, the current results show a surprising contrast to the findings of Wheeler and Treisman [1]. In that research, it was demonstrated that single retrieval probes could be more accurately recalled, as binding errors (of colour and location) would be reduced with the use of only one retrieval probe. The current study demonstrated that this was not the case, with no differences between the retrieval contexts in the raw score analyses. However, analyses of the K scores indicated a larger K score for the full array retrieval condition, in contrast to the research of Wheeler and Treisman [1]. This could suggest a spatial advantage for the full retrieval arrays; therefore, this condition will not be used for the remainder of the thesis, and a single retrieval probe will be used instead to eliminate any advantage from such spatial cues and binding errors.
Within their research, Wheeler and Treisman [1] did not distinguish between where the single retrieval probe was positioned and did not specify whether a peripheral location or a central location probe would be more beneficial to participants; therefore, the current study investigated this. The current study has added to this literature by demonstrating that there are no differences between the recall of single central and single peripheral retrieval contexts.
Both the current study and the study by Wheeler and Treisman [1] used the basic paradigm created by Luck and Vogel [2]; therefore, it was hoped that results would be similar. The current study contrasts with the work of Wheeler and Treisman [1], suggesting firstly that binding errors may not be the primary explanation for errors of memory recall and secondly that single probe trials appear to have no advantage over multiple probes.
As the K score analyses found significant differences between the single and multiple retrieval arrays within the current study, they do show support for the findings of Brady and Alvarez [11] and Jiang et al. [12]. Brady and Alvarez [11] suggested that when people look at an array consisting of multiple probes, other items in the array can affect how well the array is recalled. This type of influence means that participants pay attention to all items in an array, and a bias can occur when trying to recall items from memory if confusion arises from differences between the encoding and retrieval items. Current findings are in support of this, with larger K scores for the full arrays compared to both of the single retrieval conditions. However, the influence of other items in an array may not always be desirable. As current researchers are aiming to create an accurate measure of visual working memory capacity without any influence of spatial memory or of other items in an array, the full retrieval condition will not be used for the remainder of the doctoral thesis. The higher K scores for the full array condition indicate that there could be some form of visuospatial advantage when multiple retrieval probes are presented in the same array. This advantage would need to be considered in all subsequent analyses throughout further research, as any differences in results could be due to the way the encoding and retrieval arrays are presented. As a way of eliminating the advantage of other items in the array, single retrieval arrays will be used in the remainder of the thesis as a more accurate measure of visual working memory capacity.
One difference between the current study and Brady and Alvarez [11] was the types of stimuli used. The current investigation used a well-established change detection paradigm from Luck and Vogel [2] whereas Brady and Alvarez [11] created their own stimuli which may not have been widely used in different contexts. It may be an advantage to carry out the study of Brady and Alvarez [11] using the stimuli of Luck and Vogel [2] to see if similar results occur with the use of a procedure that has a different full array configuration to Luck and Vogel [2]. Similar to Brady and Alvarez [11], Jiang et al. [12] also concluded that multiple probe arrays present an advantage. The results from the current investigation again do not support this notion.
Jiang et al. [12] suggested that colour memory (visual memory) can rely upon location memory (spatial information) to increase memory performance and recall. This is because people will pay attention to the spatial organisation of the material as well as the coloured details. For the current experiment, the results were in support of this, with the full arrays presenting larger K scores. This again indicates a potential advantage of using spatial cues within an array and leads current researchers to eliminate full retrieval arrays.
As there was a decrease in performance levels for array sizes 4, 6 and 8 in the current study, this suggests that the Discrete Slot Model [2] is the more appropriate model to describe visual working memory capacity in the current quantitative change detection task. As more items have to be remembered with the larger set sizes, performance levels and K-values decrease as the slots are filled. When capacity reaches the level of 3-4 items, the remaining items simply do not get stored and capacity is reduced.
If a shared resource account [17] had been used, then the array sizes would have equal performance as the allocation of resource use would be equal across all items in the array. This would have meant that all items in the array would have been remembered no matter how many items were in the array.
A potential limitation of the current investigation is that the encoding intervals of the stimuli were not varied. Luck and Vogel [2] had suggested that an encoding time of 500 milliseconds was sufficient to allow the encoding of all items and that there were no differences between the 100 millisecond and 500 millisecond encoding interval. However, recent research from Lin and Luck [18] used a 100 millisecond interval in their research successfully and did not use the original 500 millisecond interval as proposed by Luck and Vogel [2], showing that 100 milliseconds was enough time to encode all visual information within an array. It may be of benefit to repeat the current investigation using a 100 millisecond encoding interval to discover if any effects are present with more updated stimuli.
The current investigation used array sizes 1, 2, 4, 6 and 8 to assess the different retrieval contexts in visual working memory capacity. These array sizes were taken from the original work of Luck and Vogel [2], as this work formed the basis of the current doctoral thesis. A proposal for future research could be to include other array sizes, such as array size 5. Lin and Luck [19] used array size 3 successfully in their work on visual working memory, finding that similarity between items led to improved performance on the change detection task. Lin and Luck [19] also used other shapes, such as diamonds, which could also have been used in the current change detection task to look at the effect of colour change detection with different shapes. However, researchers must also consider how the use of different shapes could affect the way the task is presented, for example by increasing the possibility of binding errors. With different shapes and colours, participants would have to bind these features within memory, potentially causing errors if these bindings were disrupted.
The current investigation looked at the retrieval contexts within a visual change detection task. As the full arrays were shown to potentially have an advantage over the single retrieval probes, current researchers will use single central retrieval probes in the remainder of the doctoral thesis. This is because the advantage from the full array may be enough to improve working memory capacity through the influence of other items in the array (spatial configuration). As Wheeler and Treisman [1] did find an advantage of single retrieval probes, the next stage of the doctoral thesis will use single retrieval probes instead of multiple ones. The use of single central retrieval probes will also eliminate any spatial advantage there may be for the single peripheral probes. In future studies, researchers will consider whether the paradigm by Luck and Vogel [2] is purely a visual task before using it in a developmental setting. Before using this change detection task further, and to avoid any use of spatial memory, current researchers will eliminate the full retrieval arrays with a cue and also the single peripheral location retrieval probes, as these can also rely on spatial cues. By using only single central retrieval probes, the use of spatial (location) cues can be eliminated so that the information presented is purely visual, as participants will have to pay attention to colour only.
During the current study, as set size 8 was poorly performed and set sizes 1 and 2 were very accurately performed creating ceiling levels, current researchers will use only the size 6 and size 4 arrays in the next stage of the current research. This will ensure that the change detection task consists of the appropriate level of difficulty for the adult population.