Spatial Demands of Concurrent Tasks Can Compromise Spatial Learning of a Virtual Environment: Implications for Active Input Control

While active explorers in a real-world environment typically remember more about its spatial layout than participants who passively observe that exploration, this does not reliably occur when the exploration takes place in a virtual environment (VE). We argue that this may be because an active explorer in a VE is effectively performing a secondary interfering concurrent task by virtue of having to operate a manual input device to control their virtual displacements. Six groups of participants explored a virtual room containing six distributed objects, either actively or passively while performing concurrent tasks that were simple (such as card turning) or that made more complex cognitive and motoric demands comparable with those typically imposed by input device control. Tested for their memory for virtual object locations, passive controls (with no concurrent task) demonstrated the best spatial learning, arithmetically (but not significantly) better than the active group. Passive groups given complex concurrent tasks performed as poorly as the active group. A concurrent articulatory suppression task reduced memory for object names but not spatial location memory. It was concluded that spatial demands imposed by input device control should be minimized when training or testing spatial memory in VEs, and should be recognized as competing for cognitive capacity in spatial working memory.


Introduction
Studies of human memory and spatial cognition have benefited from the use of virtual environments (VEs; Gamberini, 2000), for example, where real-world exploration is limited by practical circumstances or where the manipulation of experimental variables is impossible given the constraints of the real world (e.g., Foreman, Stanton-Fraser, Wilson, Duffy, & Parnell, 2005; Stanton, Wilson, & Foreman, 2003; Wilson & Peruch, 2002). Environments used to train and assess aspects of memory have ranged from single rooms containing a few objects in an otherwise empty space (Sandamas & Foreman, 2003; Wilson, 1998) to more complex environments, such as homes, schools, hospitals, office blocks, and shopping malls (e.g., Brooks, Attree, Rose, Clifford, & Leadbetter, 1999; Foreman, Sandamas, & Newson, 2004; Foreman et al., 2005) to a part of a city (Maguire et al., 1998). A major consideration in the use of VEs for psychological research is that learning in a VE results in the acquisition of representations of space that are (at least, functionally) similar or equivalent to those acquired from real-world exploratory experience (e.g., Foreman et al., 2005; McComas, Dulberg, & Latter, 1997; Wilson & Peruch, 2002).
However, there is controversy over the degree to which virtual and real environmental exploration is affected by the active or passive status of participants. The usual finding in real-world studies is that active engagement confers better spatial learning for adults and children. Vehicle passengers tend to learn less than drivers about the spatial layout of a town (Appleyard, 1970; Hart & Berzok, 1982) and real-world spatial learning is generally better in children after active than passive learning (Foreman, Foreman, Cummings, & Owens, 1990; Gibson, 1966; Herman & Siegel, 1978).
Spatial learning in VEs, however, does not appear to be reliably affected by the active or passive status of the participant, and active status has sometimes been found to be a disadvantage. This apparent anomaly was first evidenced by Arthur (1996). Wilson and Peruch (2002, Experiment 1) conducted a study in which participants either actively explored a virtual environment or passively observed an active participant's exploration, and then attempted to remember the locations of four targets. Surprisingly, subsequent orientation and way-finding measures found more accurate judgments for passive observers than active ones. In a second experiment, using a within-subject design, Wilson and Peruch (2002) found no difference between active and passive participants. Sandamas and Foreman (2007) found that passive participants were more accurate on a subsequent object placement task than active participants whose displacements they observed.
A possible explanation for these phenomena is that in some situations working memory (WM) can become overloaded and the performance of active participants is compromised by the imposition of having to use an unfamiliar input device to interface with the computer (Sandamas & Foreman, 2007). The impact on performance of having to operate an input device has been generally overlooked in VE (or other) research, possibly because device operation (whether mouse, joystick, keyboard, or tailored device) has been regarded as nonchallenging. It has not been regarded as demanding a significant amount of cognitive capacity, in the same way that when a real-world environment is explored on foot, bipedal locomotion is not considered a parallel task because it is so familiar and automatic and requires no training. Nevertheless, this cannot be assumed to apply to the manual guidance of virtual exploration. Indeed, it is interesting that the manual movements required to operate an input device and interface with software (especially 3-D environment software) are similar to those required by the tasks commonly used as concurrent tasks to disrupt real-world spatial learning, including map learning (cf. Coluccia, 2008; Garden, Cornoldi, & Logie, 2002). Baddeley (1993) has previously emphasized that training and task familiarity must be crucial factors in determining the extent to which concurrent tasks interfere with each other. In terms of spatial learning, there is evidence for such a model, since Sandamas, Foreman, and Coulson (2009) restored the advantage that active explorers should theoretically have by giving participants greater time to familiarize themselves with the input device. Active participants in this situation were more accurate on the same object placement task as in Sandamas and Foreman (2007) than passive participants, indicating that extra training with the input device helps alleviate the concurrent demand on WM caused by using such a device.
Clearly, the model adopted here, which is the same as that adopted by Coluccia (2008), Garden et al. (2002), and Baddeley (1993), makes no assumptions about where in WM the bottleneck or information overload occurs; it might, for example, occur in a visual, spatial, or motoric memory store, or at central executive level. Nevertheless, it makes a clear prediction that the greater the spatial load applied via a concurrent task, the less spatial information is likely to be obtained from simultaneous exploration of a VE.
This approach is standard: the dual-task approach is the most commonly used paradigm for gauging resource demands on WM (Guttentag, 1989) and has reliably indicated that as the demands of concurrent tasks increase, performance on a central task diminishes. For instance, Garden et al. (2002, Experiment 1) found that both spatial tapping and articulatory suppression tasks interfered with the primary task of route learning from a segmented map. Coluccia, Bosco, and Brandimonte (2007) found that map learning, as demonstrated through map drawing, was impaired when participants performed the same concurrent spatial tapping task as Garden et al. (2002), but not when performing a concurrent articulatory suppression task. Their findings were interpreted as supporting their hypothesis that visuo-spatial WM is implicated in spatial learning from maps.
The present study tested both active and passive groups for their memory for the layout of a virtual room containing six objects. While observing a pre-recorded exploration session, each passive group was challenged with a concurrent task, with tasks varying according to their spatial demands. Active explorers of VEs were expected not to show any advantage over passive observers because of the extra load placed on WM by having to operate an input device (i.e., having to perform a concurrent spatial task).
Furthermore, to examine whether the spatial loading of a concurrent task could be manipulated to negatively impact to a lesser or greater degree on passive observers, different concurrent tasks were carefully selected for their degree of spatial demand. These required simple hand/finger movements with no spatial sophistication (simple card turning), hand/finger movements requiring spatial decisions (sequential compass-point card-sorting), or hand/finger movements to tap keyboard keys in a prescribed order (mimicking the complex movements made by active participants using keyboard keys to guide their virtual spatial displacements). A final secondary task was also used that was entirely non-spatial, or not spatially demanding, in which condition spatial learning was expected to remain unaffected. This was a verbal task (articulatory suppression), which may rely on the WM phonological loop, and while not affecting spatial memory per se was anticipated to reduce memory for the names of objects encountered within the VE.

Participants
One hundred and fifty undergraduate participants (60 male, 90 female), aged between 18 and 54 years (M = 23 years, SD = 5.5), were recruited and awarded course credits for participation.

[Figure note: The top of the gramophone can be seen in the foreground.]

Apparatus and Procedure
All participants except those in the Active condition (see below) watched an exploration of the VE that a confederate participant had recorded prior to the start of the experiment. The VE was modeled using SuperScape VRT 3.0 software and constructed to represent a large room in which six objects were arranged at floor level. These were a flower in a pot, computer monitor, bottle, road traffic cone, triangular road sign, and gramophone (see Figure 1). The floor was green in color and the walls were lilac. One wall had gray cupboards against it and another had windows and wall-mounted radiators. The room was devoid of any other objects such as tables or chairs. The confederate was allowed to move freely about the environment, using four keyboard keys to direct their displacements (forward, back, left rotate, right rotate), and was requested to visit each of the floor-level objects twice, but in an unsystematic way. A visit was defined as moving close to an object so that it occupied the entire screen. The VE was displayed for all participants on a 21-inch monitor. The entire exploration lasted 140 s. The route was recorded using HyperCam 2 digital recording freeware. Participants in the active condition each explored the VE for 140 s, using the same input device and visiting each object twice as the confederate had done. All participants were simply instructed to "get to know the layout of the VE." Each participant was allocated to one of the following six conditions:
(1) Controls: These passive participants watched the recording of the confederate's exploration of the VE with no additional task.
(2) Simple card-sorting: Using a standard pack of playing cards, participants were asked to pick up each card in turn, turn it over, and place it face down next to the original pack while viewing the recorded VE exploration. Participants were asked to maintain a turnover speed of approximately one card per second without pause. They were asked to start turning cards prior to the commencement of the recording.
(3) Complex card-sorting: The same standard pack of cards was used as for the simple card-sorting condition. However, participants in this group were asked to pick up the first card and place it directly above the pack. The next card was placed to the right of the pack, the next beneath the pack, and the fourth to the left of the pack; in other words, participants placed the cards around the pack in a clockwise direction to four compass points. This sequence (up-right-down-left) was then repeated until the video sequence ended. Participants were allowed a short practice session prior to the commencement of the recording to ensure that they understood and were able to carry out the instructions.
(4) Verbal task (articulatory suppression): Participants were asked to repeat the days of the week, starting with Monday, out loud at a rate of approximately one per second, while viewing the recording.
(5) Spatial tapping: Participants were asked to repeatedly tap a 12-key sequence on the number keypad of a computer QWERTY keyboard, starting with the 1 in the bottom left corner, at a rate of one per second following a predefined boustrophedon sequence of 123654789654. Participants' keystrokes were displayed on a separate computer screen from that on which the VE was displayed so that their performance could be monitored. If participants' keystrokes became too slow (less than approximately one every 2 s) or erratic, the experimenter gave them a verbal prompt. This happened on only two occasions. This task has been used previously by Garden et al. (2002) and Coluccia (2008).
(6) Active: Participants explored the VE for 140 s using the keyboard arrow keys to control displacements. They were required to visit each object in the VE twice. A visit was defined as moving close to an object so that it occupied the entire screen.
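As an illustration only, the 12-key sequence described for the spatial tapping condition (5) can be derived from the keypad layout and used to screen a keystroke log. The following is a minimal Python sketch and not the procedure used in the study, where the experimenter monitored keystrokes on a second screen. The function names, the (time, key) log format, and the automated 2 s gap check are assumptions introduced here.

```python
from itertools import cycle

# Number-keypad rows from bottom to top (the 1 key is in the bottom left corner).
KEYPAD_ROWS = ["123", "456", "789"]


def boustrophedon_cycle(rows):
    """Snake up through the rows, then back down through the middle rows."""
    # Alternate direction on each row: left-to-right, then right-to-left.
    up = [row if i % 2 == 0 else row[::-1] for i, row in enumerate(rows)]
    # On the way back, revisit only the rows between the top and bottom rows.
    down = up[-2:0:-1]
    return "".join(up + down)


SEQUENCE = boustrophedon_cycle(KEYPAD_ROWS)  # "123654789654"


def check_tapping(keystrokes, max_gap_s=2.0):
    """Count sequence errors and slow gaps in a hypothetical keystroke log.

    keystrokes: list of (time_in_seconds, key) tuples in the order typed.
    Returns (number of keys that deviate from the repeating sequence,
             number of gaps between successive keystrokes longer than max_gap_s).
    """
    errors = sum(
        1 for (_, key), expected in zip(keystrokes, cycle(SEQUENCE)) if key != expected
    )
    slow_gaps = sum(
        1
        for (t0, _), (t1, _) in zip(keystrokes, keystrokes[1:])
        if t1 - t0 > max_gap_s
    )
    return errors, slow_gaps


# Hypothetical log: one wrong key ("4" instead of "5") and one gap of 2.4 s.
log = [(0.0, "1"), (1.0, "2"), (2.1, "3"), (4.5, "6"), (5.4, "4")]
print(check_tapping(log))  # (1, 1)
```

A check of this kind would only flag moments at which a verbal prompt might be warranted; it does not measure how much attention the task draws away from the screen.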

Assessing Spatial Memory
At the end of the video sequence, each participant was taken to a table several meters from the video screen, and given a sheet of paper on which was depicted a bird's-eye screenshot of the room layout but with only one of the floor objects depicted (the road traffic cone). Colors were authentic, exactly as those in the VE. Participants were asked to indicate the positions of the missing floor objects by drawing five crosses and to label each cross with the name of the object. Where an object name was not recalled, this was recorded and the participant was reminded of the identity of the object and was asked to guess its location by labeling one of the unlabeled crosses. This procedure was necessary so that each participant had five scores for the Placement Error Score DV. Participants were given unlimited time, though almost all completed the exercise within 1 to 2 min. Object placement is a widely used and recognized measure of spatial learning in both real-world (Herman, 1980; Herman, Kolker, & Shaw, 1982; Herman & Siegel, 1978) and virtual (Brunswick, Martin, & Marzano, 2010; Sandamas & Foreman, 2007; Sandamas et al., 2009) studies.

Results
Performance was assessed on two measures, one spatial (object location) and one non-spatial (object name). The spatial measure (placement error) was calculated by placing a transparent acetate sheet, on which all six floor objects were depicted, over the floor plan on which participants had indicated where they thought the object positions were and measuring the distances between participant-placed objects and the objects' original positions. Thus, five error distances in millimeters were obtained for each participant. From these, average placement error scores were calculated. The non-spatial measure was based simply on participants' memory for the names of objects encountered within the VE; as indicated above, this was recorded when participants were making their placement judgments.
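The placement error calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the scoring procedure itself, which was carried out by hand with an acetate overlay: the coordinates below are hypothetical values in millimeters on the response sheet, and the function names are introduced here for the example.

```python
import math


def placement_errors(placed, actual):
    """Straight-line distance (mm) between each placed cross and the true position."""
    return {name: math.dist(placed[name], actual[name]) for name in placed}


def average_placement_error(placed, actual):
    """Mean of the per-object error distances for one participant."""
    errors = placement_errors(placed, actual)
    return sum(errors.values()) / len(errors)


# Hypothetical (x, y) positions in mm for the five objects the participant must place
# (the road traffic cone is already printed on the sheet, so it is not scored).
actual = {
    "flower": (30, 40),
    "monitor": (100, 40),
    "bottle": (60, 120),
    "road sign": (150, 90),
    "gramophone": (20, 150),
}
placed = {
    "flower": (33, 44),       # 5 mm off
    "monitor": (100, 50),     # 10 mm off
    "bottle": (66, 128),      # 10 mm off
    "road sign": (150, 90),   # exact
    "gramophone": (35, 170),  # 25 mm off
}

print(average_placement_error(placed, actual))  # 10.0
```

Averaging over the five distances gives each participant a single score, which is what allows the measure to be entered into a between-groups analysis.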

Average Placement Error Scores
Average Placement Error (APE) scores were entered into a two-way, 6 × 2 (Condition × Gender) analysis of variance (ANOVA). Comparisons showed that complex card-sorting (p = .018) and spatial tapping (p = .044) participants performed significantly worse than controls. All other comparisons were nonsignificant at p = .05 (see Table 1). Figure 2 shows placement error (in mm) in order of magnitude by condition.

Memory for Objects
Memory for the names of the objects encountered was initially entered as a dependent variable (DV) in the two-way ANOVA, as used above with APE scores; however, Levene's test for homogeneity of variance was highly significant (p < .001) and the data were highly negatively skewed; therefore, the following non-parametric treatment of the data was conducted.
As can be seen from Figure 3, only the median score (3.0) for participants in the verbal (articulatory suppression) condition was markedly lower than that of participants in the other conditions, all of whom had a median score of 5.0. Table 2 shows the spread of data (frequencies) for object memory by condition. Examination of the figures shows the negative skew of the data in that most participants remembered all of the objects they encountered within the VE. The exception was the verbal (articulatory suppression) condition. Participants' object memory data were subjected to a Kruskal-Wallis analysis with condition as the independent variable. The result was significant, χ²(5, N = 150) = 44.37, p < .001.
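For readers unfamiliar with the test, the Kruskal-Wallis statistic can be computed as in the minimal Python sketch below. It is offered as an illustration, not as the analysis software used in the study; the function name and the example scores are hypothetical and are not the study's data. Because object-name scores can take only a few values, many ranks are tied, so the sketch applies the standard tie correction. With six conditions the statistic has 6 − 1 = 5 degrees of freedom, which matches the value reported above.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic and its degrees of freedom.

    groups: list of lists of scores, one list per condition.
    Raises ZeroDivisionError if every score is identical (H is undefined).
    """
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    counts = Counter(pooled)

    # Assign each distinct value the mean of the ranks it occupies (midrank).
    rank_of = {}
    next_rank = 1
    for value in sorted(counts):
        ties = counts[value]
        rank_of[value] = next_rank + (ties - 1) / 2
        next_rank += ties

    # Sum over groups of (rank sum squared / group size).
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values.
    correction = 1 - sum(t**3 - t for t in counts.values()) / (n_total**3 - n_total)
    return h / correction, len(groups) - 1


# Hypothetical object-name scores (0 to 5) for three small groups.
example = [[5, 5, 4, 5], [5, 4, 5, 5], [3, 2, 4, 3]]
h, df = kruskal_wallis_h(example)
print(f"H = {h:.2f}, df = {df}")
```

The resulting H is referred to a chi-square distribution with the stated degrees of freedom, which is why the result is reported as a χ² value.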

Discussion
This study set out to further explore the idea that the spatial learning of active participants in virtual studies may be negatively affected by the imposition of having to use an input device and also that this imposition can be replicated by having passive participants perform a spatially demanding concurrent task. To some extent the findings supported this notion: the hypothesis that spatial learning would be negatively affected by a spatially demanding concurrent task, but not by a non-spatially demanding one, was supported. Participants in the complex card-sorting and spatial tapping conditions were significantly worse than those in the control group, making significantly greater placement errors, whilst participants in the simple card-sorting and verbal conditions performed at a level equivalent to that of controls. These findings concur with previous research, which indicates that a spatially demanding secondary task interferes with effective spatial learning about an environment (e.g., Coluccia, 2008; Coluccia et al., 2007; Garden et al., 2002) while a non-spatially demanding one does not (e.g., Coluccia, 2008; Coluccia et al., 2007). Moreover, as hypothesized, participants in the verbal (articulatory suppression) condition remembered significantly fewer object names than participants in the control group and all other groups but were no worse at remembering object positions, illustrating perhaps a dissociation between spatial and semantic memory and that the verbal and spatial concurrent tasks in the current study selectively disrupt WM as in the studies of Coluccia et al. (2007), Coluccia (2008), and others. Participants were carefully observed for evidence of their shifting attention overtly from the screen to the concurrent task, though no systematic evidence of this was seen.
Unsurprisingly, the keyboard-tapping task was most prone to this, seen occasionally in some participants, although this task has been successfully used as a concurrent task in previous comparable real-world studies (Coluccia, 2008; Garden et al., 2002). Moreover, while both of the complex concurrent tasks suppressed performance on the spatial memory measure, no obvious overt shifts of attention could be detected for the complex card-sorting task. Future work could use measures of performance on concurrent tasks (including speed-accuracy trade-off) that could provide further evidence for the distribution of attention and cognitive capacity among the primary and concurrent tasks. For present purposes, it was clear that participants in all conditions were engaged with the tasks given to them.
Of particular interest here, however, is how participants in the active condition performed compared with the other groups. As discussed above, based on the findings of previous real-world studies such as Foreman et al. (1990), Gibson (1966), Herman and Siegel (1978), and others, one would expect participants who actively explored the environment to have an advantage in terms of spatial learning over passive observers. This advantage, however, is not reliably found in virtual studies (e.g., Arthur, 1996; Sandamas & Foreman, 2007; Wilson & Peruch, 2002), and here participants in the active condition were no better than passive controls and performed at a level equivalent to that of participants undertaking a spatially demanding concurrent task. Although not fully congruent with our hypotheses, as active participants were arithmetically worse (but not statistically significantly worse) than passive controls, these findings still provide additional evidence to that of Sandamas et al. (2009) in support of the notion that spatial learning in VEs is disrupted by the imposition of using an input device to explore and that any advantage that active explorers of a VE might have is compromised by this. We accept that the present complex concurrent tasks may have made greater demands on cognitive capacity than the operation of an input device. However, the pattern of current results strongly suggests that in VE-based training and testing, active participants are prevented from taking advantage of their active status, by virtue of having to use up some visuo-spatial WM capacity in input device control. This is further suggested by the results of Sandamas et al. (2009) who found that additional training with an input device restored the active advantage in children, indicating that the extra training reduced input device demands on WM.
The result poses questions regarding the degree of spatial-motor disruption that occurs in the performance of familiar real-world tasks, where an active advantage over passive exploratory experience is usually obtained. For example, the motor movements made in controlling a motor vehicle (depressing pedals, steering, and operating gears) might also be expected to disrupt spatial learning, yet anecdotally (see Hart & Berzok, 1982), drivers typically obtain more spatial information than a passive passenger. It is likely that in well-trained motor tasks, the impact of spatial-motor movements is reduced. Driving becomes an automatic behavior, except when conscious attention is required to modify a sub-program, as when traffic suddenly slows and a driver has to react. It is likely that at moments when such distractions occur, spatial information cannot be processed. Likewise, a novice driver is unlikely to acquire as much spatial information after driving a route in an unfamiliar town as an experienced motorist.
The present data are also of interest in relation to previous studies in which children with disabling conditions were able to find their way around school buildings after a period of virtual exploration (Foreman et al., 2005). In some cases, children unable to operate an input device were trained by having them observe the displacements of an active explorer, who took instructions but operated the input device on their behalf. Far from disadvantaging the disabled children, it is likely that they were allowed more cognitive capacity to apply to the learning of the environment and would have been disadvantaged by having to operate an unfamiliar input device. This clearly has wider training implications.
In summary, this study was conducted to further investigate the suggestion of Sandamas and Foreman (2007) that active explorers of VEs do not demonstrate the expected advantage in spatial learning over passive observers due to the added demands on WM of having to use an unfamiliar input device to interface with the computer. Sandamas et al. (2009) addressed this problem by giving participants extra time to familiarize themselves with the input device. They found that this restored the expected advantage for activity and proposed that this was because the extra training reduced input device demands on WM. Here we have approached the problem using the concurrent task paradigm and found that concurrent tasks with a spatial element, estimated to load WM at least to the same extent as using an unfamiliar input device, can disrupt spatial learning of a VE layout thereby lending further support to the initial proposition of Sandamas and Foreman (2007).
Although the current findings are congruent with our own previous research findings regarding spatial learning in VEs and the research findings of others regarding spatial learning with other media and the implication of WM (see above), further research is currently underway in which our procedures are being refined to address the possible criticism of the current findings that they may not dissociate all of the WM components implicated in the observed effects. For instance, both complex spatial tasks involve elements of planning and decision making (arguably Central Executive functions) not present in simpler and non-spatial tasks. It may also be possible to dissociate visual working memory from spatial working memory with refinements of the current approach, using VEs, to determine where within the WM system information bottlenecks occur.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research and/or authorship of this article.