Visuospatial attention in children with Autism Spectrum Disorder: A comparison between 2-D and 3-D environments

Abstract
Previous research has illustrated the unique benefits of three-dimensional (3-D) Virtual Reality (VR) technology in Autism Spectrum Disorder (ASD) children. This study examined the use of 3-D VR technology as an assessment tool in ASD children, and further compared its use to two-dimensional (2-D) tasks. Additionally, we aimed to examine attentional network functioning in ASD children. We administered a battery of visual processing and attentional tests to 18 ASD children and 18 age-matched typically developing counterparts. Results showed that both groups performed comparably on 2-D and 3-D visual processing and attention tasks, although the ASD group was significantly slower in the 3-D task. Intact attentional network functioning was also revealed in the ASD group. These findings have validated the use of VR technology as an assessment of ASD functions, and contributed to the understanding of functions in young ASD children.


ABOUT THE AUTHORS
Our team consists of experts in Computer Sciences, Psychology, Pediatrics, and Rehabilitation Science. We have been collaborating on a number of projects aiming to provide innovative interventions for the enhancement of social, emotional, and cognitive development in children with special learning needs. One current focus is to develop and validate the use of virtual reality technology to assess and train children with Autism Spectrum Disorder. Our team has successfully set up hardware capable of inducing a series of immersive virtual environments (e.g. classroom, library, and/or playground), and developed a package of child-friendly and engaging game-based training software. Apart from training, we have embarked on projects that compare children's performances in traditional psychoeducational and virtual reality-based tests.

PUBLIC INTEREST STATEMENT
Individuals with Autism Spectrum Disorder (ASD) possess a unique behavioral profile that presents a challenge to clinicians and researchers in terms of assessment and treatment. Following recent developments in computer technology, the preliminary use of Virtual Reality (VR) technology in intervention programs for ASD has met with some success. This perspective article examines the use of VR technology in assessment tools for ASD. VR technology presents several advantages for individuals with ASD, who report a preference for interacting with computer technology. Capitalizing on this preference, this perspective article has successfully implemented VR technology as an assessment tool for ASD children. This preliminary success not only provides a novel and ecologically valid perspective into the understanding of ASD but also opens up avenues for future research to adopt VR technology in clinical assessments and education tools.
Given their unique behavioral profiles, it is no surprise that the assessment and treatment of ASD individuals presents a challenge to clinicians and researchers. However, there have been recent developments in novel computerized approaches to training and intervention programs for ASD individuals, capitalizing on their interest in computerized learning (Boucenna et al., 2014; Grynszpan, Weiss, Perez-Diaz, & Gal, 2014). Virtual Reality (VR), a computer-based technology capable of simulating immersive virtual environments (VEs), holds particular promise for learning and assessment for ASD individuals (Konstantinidis, Luneski, Frantzidis, Costas, & Bamidis, 2009). Recently, some studies have investigated the use of VR technology as a learning tool in small-scale ASD intervention programs. For example, researchers have developed VEs for fire safety drills (Self, Scudder, Weheba, & Crumrine, 2007), road safety skills (Josman, Ben-Chaim, Friedrich, & Weiss, 2008), as well as social VE scenarios such as job interviews and meeting new people (Kandalaft, Didehbani, Krawczyk, Allen, & Chapman, 2013). Other preliminary studies have also successfully adopted VR technology for social skills training, reporting improvements in social cognition skills such as better recognition of facial emotions, emotional gestures, and emotional situations (Ip et al., 2016; Serret, 2012). However, no study has yet examined the use of VR as an assessment tool, which offers several potential advantages, particularly for the ASD population. Firstly, VR technology can create highly realistic settings that are safe, controlled, and tailored to suit these individuals' needs. Secondly, their preference for VR technology promotes task motivation and engagement, thus enabling responses that are of higher ecological validity. As such, VR technology can provide a novel understanding of functioning in ASD which may not be possible using traditional paper-and-pencil tests.
Performance on a VR-based assessment may also serve as a predictor of other learning outcomes. For example, in one study of autism, it was found that children's attention to both social and non-social virtual characters and objects was linked to their academic achievement, even after the effects of IQ were controlled for (Rajendran, 2013). Moreover, it is essential to validate the usefulness of VR technology for ASD children before such technology can be widely used. According to the Technology Acceptance Model (TAM), the acceptance of VR technology by potential users depends on the perceived usefulness of the technology (Davis, Bagozzi, & Warshaw, 1989). For these reasons, the present study will explore the use of 3-dimensional (3-D) VR technology as an assessment tool in ASD individuals.

3-D virtual reality
VR refers to a simulated environment that leads to telepresence, or the mediated perception of an environment. In other words, telepresence refers to the extent to which one experiences the mediated environment rather than one's immediate physical environment (Steuer, 1992). VEs primarily stimulate users through visual experiences, which encourage an optimal level of interaction and immersion within the VR experience. To date, there are only a handful of studies validating the use of VR and interactive media technologies for training the ASD population. In particular, these studies investigated how different levels of immersiveness in a virtual environment could affect the quality of user-VR interactions. For example, Matsentidou and Poullis (2014) used a Cave Automatic Virtual Environment (CAVE; Cruz-Neira, Sandin, DeFanti, Kenyon, & Hart, 1992) to deliver social skills training content to children with autism. Similar CAVE technology also enabled social adaptation training in inclusive education settings for school-aged children with ASD (Ip et al., 2016). Lorenzo, Lledó, Pomares, and Roig (2016) employed an L-shaped VR environment for emotional skills training of children with ASD, while a cylindrical panoramic virtual environment was used to simulate a dolphinarium for children with ASD (Cai et al., 2013). All these VR environments consist of one or more stereoscopic viewing projections arranged in various configurations. The use of stereoscopic viewing projections allows users to perceive depth, which should provide more immersive experiences.
In VEs, realistic 3-D scenes are encountered in "real time", such that movements in the VE correspond to actual movements in reality (Parsons, Mitchell, & Leonard, 2004). This idea is conceptualized as an affordance, namely representational fidelity, in the model of learning in 3-D virtual learning environments (Dalgarno & Lee, 2010). Learners who experience a high level of representational fidelity are expected to obtain enhanced spatial knowledge representation in the 3-D environment. This maximizes the user's task motivation and engagement levels (Burdea & Coiffet, 2003; Mitchell, Parsons, & Leonard, 2007; Self et al., 2007).

Visuospatial and attentional processing in ASD individuals
Over the past few decades, a growing amount of research has revealed unique capabilities in higher level visuospatial processing in ASD individuals (Allen & Courchesne, 2001; Keehn et al., 2010; Kuschner et al., 2007; Simmons et al., 2009). One such capability is the perception of local aspects of a stimulus from a global context. In particular, ASD individuals have been found to perform better in certain visuospatial attention tasks such as shape identification in the complex designs of the Children's Embedded Figures Test (CEFT) (Jolliffe & Baron-Cohen, 1997; Shah & Frith, 1983), the Block Design subtest in the Wechsler Intelligence Scale for Children (WISC) (Shah & Frith, 1993), the Navon hierarchical letter test (Plaisted, Swettenham, & Rees, 1999), as well as conjunctive visual search tasks (Plaisted, O'Riordan, & Baron-Cohen, 1998).
Studies have also frequently reported abnormalities in sustained and focused attention in ASD individuals, which have even been suggested as a distinguishing characteristic of later ASD diagnosis (Chien et al., 2014; Elsabbagh et al., 2009; Zwaigenbaum et al., 2005). Yet, it is only recently that studies have started to conceptualize attention in ASD children in terms of the Attention Network Theory, in which attention is categorized into three distinct networks: the alerting, orienting, and executive control networks (Fan, McCandliss, Fossella, Flombaum, & Posner, 2005; Fan, McCandliss, Sommer, Raz, & Posner, 2002; Posner & Petersen, 1990). The alerting network accounts for the ability to attain and sustain a state of alertness (Keehn et al., 2010). The orienting network is responsible for orienting visual attention, which refers to the disengaging, shifting, and re-engaging of attention (Posner, Walker, Friedrich, & Rafal, 1984). Finally, the executive control network is responsible for functions such as task switching, working memory, and inhibition (Miyake et al., 2000). The conceptualization of attention into three separate neural networks is proposed to offer greater efficiency in identifying specific attentional abnormalities in individuals and in examining the relationships between the networks (Casagrande et al., 2012; Keehn et al., 2010). This is especially important given recent suggestions that early attentional abnormalities may contribute to the development and emergence of ASD (Keehn, Müller, & Townsend, 2013). A better understanding of attentional functions in ASD may thus lead to improved early identification and screening of the disorder.
Previous administration of the Child version of the Attentional Network Test (ANT-C) to ASD children and adolescents aged 8 to 19 revealed significantly impaired performance only on the orienting attentional network, suggesting its insufficient modulation (Keehn et al., 2010). Other studies, however, have produced inconsistent results supporting both intact and impaired alertness, visual attention, and executive function in ASD individuals. Research on attentional network functions in these individuals thus remains inconclusive to date (Corbett, Constantine, Hendren, Rocke, & Ozonoff, 2009; Goldberg et al., 2005; Keehn & Joseph, 2008; Keehn et al., 2013; Palkovitz & Wiesenfeld, 1980). A potential concern with these studies is the participants' age. While Rueda et al. (2004) observed little change in the development of all three attentional networks from age 6 to adulthood, little is known about the functioning of these networks in younger children. The present study will thus make an important contribution to the field by extending the current understanding of attention network functioning to young ASD children.

The present study
The present study is the first of its kind in adopting VR technology as an assessment tool for ASD children. As fully immersive systems may not be appropriate for use with young ASD children, we have chosen to employ a simpler VR system consisting of 3-D stereo projections projected onto a screen and viewed with 3-D glasses. There are three major research questions in this study. First, do young ASD children experience visuospatial processing and attention deficits? It is hypothesized that the ASD children will perform significantly worse than the typically developing (TD) controls in all relevant measures. Second, are the results obtained through tests presented in three types of media (paper-and-pencil vs. 2-D computerized test vs. 3-D computerized test) consistent? As predicted by the model of learning in 3-D virtual learning environments (Dalgarno & Lee, 2010), representational fidelity can enhance the sense of presence for VR users and promote spatial knowledge representation. If ASD children experience difficulty in processing visual information in a 2-D but not a 3-D environment, this would imply that the difficulties observed in ASD children are partly due to the medium of presentation. In contrast, if ASD children show deficits in both 2-D and 3-D environments, the deficit is unlikely to be medium-specific. Third, the study will examine the performance of ASD children in three areas, namely the alerting, orienting, and executive control networks, as proposed by the Attention Network Theory (Fan et al., 2005).

Participants
Thirty-six children, aged 4.0 to 6.6 years (mean age 4.99 ± .76), were recruited from local kindergartens and a non-governmental organization for children with special needs. Eighteen were children with Autism Spectrum Disorder (ASD group), while 18 were age-matched typically developing children (TD group). Demographic information for both groups is displayed in Table 1. Children in the ASD group had been diagnosed with Autism, Asperger's Syndrome, or Autistic features by clinical psychologists, developmental and behavioral pediatricians, or psychiatrists. Children were excluded from the study if found to display any of the following: (a) uncorrected vision problems, (b) comorbid physical, cognitive, or neurological difficulties, (c) any physical condition that precluded them from engaging in the VR task, or (d) extreme fear or anxiety in a VR environment.

Procedure
Human participation ethics approval was obtained from the Education University of Hong Kong prior to the study. Eligible children were invited to a laboratory for a 1.5 h assessment after obtaining informed written consent from participants' caregivers. Participants were encouraged to complete the assessments without parental assistance or involvement. The demographic questionnaire was completed by participants' primary caregivers.

Motor tasks
Two motor tasks served as control measures to evaluate motor speed and performance. The first motor task assessed overhead arm elevation movement, which was the movement pattern involved in the 3-D VR tasks. In this test, participants were instructed to assume a sitting position with both hands resting on their thighs, before performing a full arm elevation and returning to the resting position. This movement was demonstrated beforehand to ensure the participants' understanding of the procedure. To maximize consistency among participants, three practice trials were conducted. Ten consecutive trials were conducted for the dominant arm, during which the number of elevations made within 10 s was recorded. The second motor task measured the speed and performance of participants' finger tapping movement. This test emulated the finger tapping movement required for the key press response in the ANT-C. Participants were instructed to tap a single button on the keyboard for 50 s, after which the total number of key presses was recorded. A 5 s practice trial of finger tapping was provided beforehand. Only the dominant hand was assessed. To minimize potential distractions, a piece of cardboard was used to cover all other keys on the keyboard that were not used.

Raven's coloured progressive matrices test (CPM)
The CPM also served as a control measure and assessed non-verbal reasoning ability. The CPM has previously been established to have good psychometric properties as illustrated by high test-retest reliability (.70-.90) and internal consistency (.80-.90) (Raven, Court, & Raven, 1995).

Children's color trails test (CCTT)
The CCTT was employed as a 2-D paper-and-pencil assessment of visual attention and cognitive flexibility in this study. The time in seconds to complete each test and the number of errors made were recorded. While the CCTT is typically designed for use with those aged 8-16, it should be noted that it has also been administered successfully to children as young as 5 years old in clinical settings (Llorente, Williams, Satz, & D'Elia, 2003). Good reliabilities have been reported for the CCTT, with alternate-form reliability coefficients ranging from .85 to .90 and test-retest reliability coefficients of .90 to .99 (Llorente et al., 2003).

Child version of the attentional network test (ANT-C)
The ANT-C (Petersen & Posner, 2012; Posner & Petersen, 1990) formed the 2-D computerized assessment of visual attention and attentional network functioning in the present study. The ANT-C is frequently used to assess different diagnostic groups such as individuals with Attention Deficit Hyperactivity Disorder (ADHD) (Casagrande et al., 2012) and ASD (Dye, Baril, & Bavelier, 2007). In this test, each trial began with a fixation cross, followed by one of the following four types of warning cues: center cue, double cue, spatial cue, or no cue. Center cues consisted of an asterisk at the location of the fixation cross, while double cues comprised two asterisks, one above and one below the location of the fixation cross. Spatial cues consisted of a single asterisk at the location of the target fish. This was followed by the presentation of the target fish, presented either above or below the fixation cross. In congruent trials, flanking fish were presented facing the same direction as the target fish, while in incongruent trials, they faced the opposite direction from the target fish. In neutral trials, the target fish appeared alone without any flanking fish. Participants were required to decide which direction the central target was facing (i.e. left or right), and to make the corresponding key press response. High test-retest reliability has been reported for the ANT-C as a sensitive measure of the three distinct systems of visual attention (Fan et al., 2002).
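The trial structure described above can be sketched as a factorial grid of cue type, flanker condition, target position, and target direction. This is a minimal illustration only: the label strings and the function name `build_trial_types` are hypothetical, and the number of repetitions per trial type and the stimulus timings are not stated here, so they are left out.

```python
from itertools import product

# Factor levels taken from the task description; the label strings are illustrative.
CUES = ("none", "center", "double", "spatial")
FLANKERS = ("congruent", "incongruent", "neutral")
POSITIONS = ("above", "below")      # target position relative to fixation
DIRECTIONS = ("left", "right")      # direction the target fish faces


def build_trial_types():
    """Enumerate every cue x flanker x position x direction combination once."""
    trial_types = []
    for cue, flanker, position, direction in product(
        CUES, FLANKERS, POSITIONS, DIRECTIONS
    ):
        if flanker == "neutral":
            flanker_direction = None          # target fish appears alone
        elif flanker == "congruent":
            flanker_direction = direction     # flankers face the same way
        else:
            flanker_direction = "left" if direction == "right" else "right"
        trial_types.append(
            {
                "cue": cue,
                "flanker": flanker,
                "target_position": position,
                "target_direction": direction,
                "flanker_direction": flanker_direction,
                "correct_response": direction,
            }
        )
    return trial_types


trial_types = build_trial_types()
```

Under these assumptions the grid contains 4 × 3 × 2 × 2 = 48 distinct trial types, which a presentation script could shuffle and repeat as required.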

3-D virtual reality (3-D VR) program
A 3-D VR program, specially designed for the purpose of the present study, formed the 3-D assessment of visuospatial processing and attention. A stereoscopic viewing projection system was used to create an immersive virtual environment and induce telepresence in participants. This system consisted of two projectors, a silver screen, and kinetic motion sensors. The 3-D stereo projection was operated to maximize immersion. Linear polarization was achieved by placing polarized filters in front of the projectors to separate the left-eye view from the right-eye view. This enabled participants to have an immersive experience with the aid of 3-D glasses. Kinetic motion sensors were employed as interaction devices to detect participants' movements and gestures. Video recorders were also set up at the front and back of the room to capture participants' reactions in the experimental session. The setup is illustrated in Figure 1.
Two 3-D VR games, namely Bubble Poking and Balloon Poking, were administered in the testing session (Figures 2(a) and (b)).
Both games were essentially the same except for graphic differences in the type of stimuli used (i.e. balloons or bubbles), which were designed to introduce some variation and minimize boredom for the children. In addition to observing a demonstration of the task procedure, participants were given a practice trial for each program to ensure their understanding of the procedure. To minimize environmental distraction, the background music of the programs was turned off and the room lights were dimmed. Calibration of the motion sensors was conducted for every participant prior to each test session. Each game consisted of six blocks (five trials in each block). Both the Bubble Poking and Balloon Poking programs required the participants to perform an arm elevation to burst the bubble/balloon stimuli as soon as they reached the black line on the screen. A simple movement pattern (overhead arm elevation) was chosen to minimize possible confounding variables such as motor clumsiness, poor motor planning, and inadequate physical endurance. Moving targets were used as they required good temporal-spatial capabilities of motor control. To start with, participants were required to place their hand on their thigh. Once the target stimulus touched the black vertical line, the participant was required to elevate their arm through its full range before returning to the resting position (hand on thigh). Real-time measurement of the movement pattern was recorded by the motion sensors (Kinect; Microsoft, 2010). Only a complete movement pattern would be identified as a successful attempt by the motion sensors, which resulted in the bursting of the target balloon/bubble stimulus. Visuospatial performance was measured in terms of both accuracy rate (i.e. ability to burst the target stimulus) and time taken (msec). A delayed reinforcement scheme was implemented in the two games (Figure 3(a)), with the presentation of a smiling face only upon completion of each block. Figure 3(b) shows the superimposed image as experienced by the participants in the VR scenario.
Source: Adapted from Ip, Bryne, Cheng, Kwok and Lam (2011).
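The two 3-D outcome measures can be illustrated with a short scoring sketch. It assumes, hypothetically, that each trial is logged as a pair of a success flag and a time in msec, and that the time measure is averaged over successful attempts only; the function name `summarize_game` and this log format are not taken from the study software.

```python
from statistics import mean

BLOCKS_PER_GAME = 6      # six blocks per game, as described above
TRIALS_PER_BLOCK = 5     # five trials per block


def summarize_game(trials):
    """Return (accuracy_rate, mean_time_ms) for one game.

    `trials` is a list of (burst, time_ms) tuples, one per trial.
    Assumption: time is averaged over successful attempts only.
    """
    expected = BLOCKS_PER_GAME * TRIALS_PER_BLOCK
    if len(trials) != expected:
        raise ValueError(f"expected {expected} trials, got {len(trials)}")
    success_times = [time_ms for burst, time_ms in trials if burst]
    accuracy_rate = len(success_times) / len(trials)
    mean_time_ms = mean(success_times) if success_times else None
    return accuracy_rate, mean_time_ms


# Made-up log: 24 successful bursts at 1200 msec each and 6 missed targets.
example_log = [(True, 1200.0)] * 24 + [(False, None)] * 6
accuracy_rate, mean_time_ms = summarize_game(example_log)
```

Keeping accuracy and time as separate measures matters for this design, because a group can match on accuracy while differing on time taken.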

Results
The purpose of the current study was to examine whether young children with ASD show similar performance in visuospatial tests presented in 2-D vs. 3-D environments. We were also interested in examining performance on the three distinct attentional functions in ASD children. Before testing the major hypotheses, we evaluated children's performance on the control variables, namely motor performance and non-verbal reasoning. Independent t-tests revealed no significant differences between groups on finger tapping, overhead arm elevations, and non-verbal reasoning, as shown in Table 1.
For a parallel comparison of visual attention performance between 2-D and 3-D environments, we chose the neutral, no-cue condition of the ANT-C computer task as the 2-D task to compare with performance in the 3-D VR task. This condition was selected as it was thought to provide an unbiased measure of visual attention without the presence of cues or interference. A MANCOVA was performed to compare 2-D visuospatial performance between groups. Accuracy rate and reaction time in the 2-D ANT-C computer task were entered as dependent variables. Group (ASD vs. TD) was entered as the independent variable. Although finger tapping speed and non-verbal reasoning were similar in the ASD and TD groups, these variables were nevertheless entered as covariates to control for any non-significant differences that might still contribute to error variance. Results revealed no significant differences between groups with regard to reaction time, F(1, 23) = .25, p = .63, η² = .01, or accuracy rate, F(1, 23) = .01, p = .93, η² = .00, indicating intact visuospatial attention in the current ASD sample.
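As a rough check on the effect sizes above, the sketch below recovers an effect size from each F statistic and its degrees of freedom. It assumes that the reported η² values are partial eta squared, which is not stated explicitly; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom.

    Assumption: the reported effect sizes are partial eta squared.
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the 2-D ANT-C comparison reported above.
reaction_time_effect = partial_eta_squared(0.25, 1, 23)   # about .011
accuracy_effect = partial_eta_squared(0.01, 1, 23)        # about .0004
```

Rounded to two decimals, these values agree with the reported .01 and .00, which is consistent with the partial eta squared interpretation for this analysis.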

Attentional network performances
No significant differences between groups were obtained across all conditions (congruent, incongruent, and neutral) and cue types (no cue, double cue, center cue, and spatial cue) in the ANT-C, as displayed in Table 3. Subsequently, performance scores for the three attention network systems (Alerting, Orienting, and Executive Control) were calculated using Rueda et al.'s (2004) subtractions: Alerting: median Reaction Time (RT) for no-cue trials − median RT for double-cue trials; Orienting: median RT for center-cue trials − median RT for spatial-cue trials; Executive Control: median RT for incongruent flanker trials − median RT for congruent flanker trials. Finally, a MANCOVA was conducted to examine any differences in attentional network functioning between ASD and TD groups, while controlling for finger tapping speed and non-verbal reasoning. Results revealed no significant differences between groups in functioning on all three attentional systems (Alerting: F(1, 25) = .53, p = .47, η² = .02; Orienting: F(1, 25) = .89, p = .36, η² = .03; Executive Control: F(1, 25) = .37, p = .55, η² = .02), indicating intact attentional network functioning in the current ASD sample.
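The three subtractions can be sketched as follows. The trial record format (dictionaries with `cue`, `flanker`, `rt`, and `correct` keys) is hypothetical, and restricting the medians to correct trials is an assumption rather than a detail stated above.

```python
from statistics import median


def network_scores(trials):
    """Compute Alerting, Orienting, and Executive Control scores in msec.

    `trials` is a list of dicts with keys 'cue', 'flanker', 'rt', 'correct'.
    Assumption: only correct trials contribute to the median RTs.
    """
    def median_rt(key, value):
        return median(t["rt"] for t in trials if t[key] == value and t["correct"])

    return {
        "alerting": median_rt("cue", "none") - median_rt("cue", "double"),
        "orienting": median_rt("cue", "center") - median_rt("cue", "spatial"),
        "executive": median_rt("flanker", "incongruent")
        - median_rt("flanker", "congruent"),
    }


# Made-up trials for one participant, for illustration only.
example_trials = [
    {"cue": "none", "flanker": "congruent", "rt": 900, "correct": True},
    {"cue": "double", "flanker": "congruent", "rt": 840, "correct": True},
    {"cue": "center", "flanker": "incongruent", "rt": 980, "correct": True},
    {"cue": "spatial", "flanker": "incongruent", "rt": 930, "correct": True},
    {"cue": "none", "flanker": "neutral", "rt": 700, "correct": False},
]
scores = network_scores(example_trials)
```

Larger Alerting and Orienting scores indicate a greater benefit from the warning and spatial cues, whereas a larger Executive Control score indicates greater interference from incongruent flankers.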

Discussion
This study is the first of its kind to explore the use of 3-D VR technology as an assessment tool for visuospatial processing and visual attention in preschool children with ASD. Despite some concerns about the applicability of immersive 3-D VR technology in ASD individuals, the majority of the ASD children were able to complete the VR tasks without problems. This included the appropriate usage of 3-D glasses. Our evidence lends support to the TAM (Davis et al., 1989), which postulates that the perceived usefulness of VR technology in ASD children can promote the acceptance of the use of VR in education. With wider application of VR technology in ASD children, a more precise form of measuring test performance using VEs modeled after real-life settings can be obtained. Educators can thus be better informed about the expected training outcomes in the real world. VEs are especially useful for allowing children to be the actual agent in tasks that are hard to control ethically and experimentally in everyday life.

Visuospatial attention
In the subsequent 3-D VR tasks, ASD children were not found to differ significantly from TD children in terms of successful attempts to burst the bubble or balloon stimuli, although the ASD group required a significantly longer time for successful attempts. Given that both groups were matched on performance speed in the 2-D ANT-C task, as well as movement speed in the arm elevation motor control task, performance differences between the groups in the 3-D tasks are unlikely to reflect differences in motor abilities. Rather, it is likely that the differing nature of the stimuli in the tasks contributed to these findings. While the stimuli in the 3-D VR task were in motion, the stimuli in the 2-D ANT-C task were static. As such, slower performance in the 3-D task but not the 2-D task may reflect challenges in the processing of motion stimuli rather than weaker task motivation specific to the 3-D task. Indeed, some literature has previously suggested that ASD individuals are less sensitive to several aspects of motion, namely global motion, second-order motion, and motion coherence (Dakin & Frith, 2005). This reduced sensitivity may cause problems in the processing and analysis of dynamic information, which involves the integration of information over time. A longer period of temporal integration may thus be required for these individuals. Their slower performance may also be accounted for by poor integrative functioning between subsystems (visual-motion detection, temporal sequence processing, and motor coordination) in the present 3-D visuospatial task (Greffou et al., 2012). Future studies can make use of different VEs to decompose a complex task into different subtasks, so that the precise area of deficit can be identified.
Our findings have also clarified the hypothesized enhancement of representational fidelity in the model of learning in 3-D virtual learning environments (3D-VLEs; Dalgarno & Lee, 2010). While the model suggests that a 3-D environment can enhance viewers' spatial knowledge representation, our ASD children showed accurate but delayed performance in the visuospatial processing task. This result calls for further specification of the nature of spatial knowledge suggested by the model of 3D-VLEs. It is probable that "static" and "dynamic" spatial knowledge are processed differently in VEs.

Attentional network functioning
Intact sustained and divided visual attention was observed in the current ASD sample using the 2-D paper-and-pencil CCTT. Although inconsistent with previous reports of attentional abnormalities in ASD, this finding is congruent with the present findings of intact attentional network functioning on the 2-D computerized ANT-C. That is, ASD children were not impaired in achieving and sustaining alertness, orienting visual attention, or executive functioning. Such results are consistent with those of Keehn et al. (2010), in which intact functioning was found in the alerting and executive control networks of their ASD sample. These intact skills in ASD children enable them to pay enough attention to the entire VE and therefore benefit from VR-based social skills training (Ke & Im, 2013). However, unlike in the present study, Keehn et al. (2010) found significant impairment on the orienting network in their ASD sample. It is likely that these differential results are attributable to participants' age. In the present study, we recruited younger ASD children aged 4 to 6, whereas other studies have mostly recruited older samples (Keehn et al., 2010). Given that ASD is a developmental disorder, certain abilities or deficits such as attentional dysfunction may only present themselves over time. If this is the case, the development of attentional abnormalities opens up further questions, such as its relationship to the development of other deficits in ASD. For instance, it remains unknown whether attentional dysfunction is a consequence or a cause of other ASD symptoms.

Limitations
It should be noted that this study was not without its limitations, the first of which was its small sample size. Despite efforts to recruit an equal ratio of male and female ASD participants, the current sample remained predominantly male. This limited the generalizability of the present findings. Secondly, we acknowledge that the visuospatial and attentional tasks of differential 2-D and 3-D modalities were not entirely parallel. While these tasks were employed with the intentions of minimizing practice effects, they may have limited our ability to draw strong comparisons across participants' performances on 2-D and 3-D tasks. This is also somewhat related to the design of the 3-D VR task, which in providing an increased ecological validity has resulted in a loss of experimental control and task specificity. As such, we are able to provide comparisons in performances between 2-D and 3-D tasks only in the broad domain of visuospatial attention, but are unable to tease apart the specific aspects of visuospatial attention performance in the VR task. Nevertheless, the present study has validated the use of VR technology as an assessment tool and paved the way for future research to adopt similarly ecologically valid measures that examine specific processes in visuospatial attention.

Conclusion
The current study has provided evidence for the successful application of VR technology as an assessment tool for a sample of ASD preschool children, and has also provided a novel perspective into the understanding of visuospatial processing and attention. Such research carries growing relevance in the face of increasing ASD prevalence rates. Present findings of intact attentional functions in young ASD children carry potential implications for identification and diagnosis, although future research is needed to validate these findings.