Investigation of spatial ability test completion times in virtual reality using a desktop display and the Gear VR

Guzsvinecz, Tibor; Orbán-Mihálykó, Éva; Sik-Lányi, Cecília; Perge, Erika

doi:10.1007/s10055-021-00509-2

S.I. : VR and Cognitive Science
Open access
Published: 17 March 2021

Volume 26, pages 601–614 (2022)
Virtual Reality

Tibor Guzsvinecz ORCID: orcid.org/0000-0003-3273-313X¹,
Éva Orbán-Mihálykó²,
Cecília Sik-Lányi¹ &
…
Erika Perge³

Abstract

The interaction time of students who took spatial ability tests in a virtual reality environment is analyzed. The spatial ability test completion times of 240 and 61 students were measured: the former group used a desktop display and the latter group used the Gear VR. Logistic regression analysis was used to investigate the relationship between the probability of correct answers and completion times, while linear regression was used to evaluate the effects and interactions of the following factors on test completion times: the users’ gender and primary hand, the test type and the device used. The findings were that while completion times are not significantly affected by the users’ primary hand, the other factors have significant effects on them: they are decreased by the male gender in itself, while they are increased by solving Mental Rotation Tests or by using the Gear VR. The largest significant increase in interaction time in virtual reality during spatial ability tests occurs when Mental Rotation Tests are completed by males with the Gear VR, while the largest significant decrease in interaction time occurs when Mental Cutting Tests are completed with a desktop display.

1 Introduction

Gardner (1983) proposed a theory according to which every human has multiple types of intelligence, spatial intelligence being one of them. This theory was refined by Maier (1996), who concluded that spatial intelligence is made up of five different parts: spatial perception, visualization, mental rotation, spatial relations and spatial orientation. According to Miller and Bertoline (1991), this ability is not a biological susceptibility, but can be improved over time: improvement can occur simply through life experiences or through exposure to certain learning environments. Miller (1992) suggested that spatial ability training should be included in the curriculum of engineering studies. According to Ghiselli (1973), success in the fields of engineering, mathematics and architecture is related to the spatial skills of the person.

A well-developed spatial ability is important in the modern age and becomes more relevant with each passing day, as it is required by many jobs. With it, a person can understand relations between objects and space. A considerable number of paper-based tests have been developed over the years to improve the spatial intelligence and ability of people. These tests include the Mental Rotation Test (MRT) (Ault and John 2010), the Mental Cutting Test (MCT) (Bosnyak and Nagy-Kondor 2008) and the Purdue Spatial Visualization Test (PSVT) (Branoff and Connolly 1999). Since these are paper-based tests, the following question can arise: “what happens when these tests are taken in virtual reality (VR)?”

This is an essential question because, according to Burdea and Coiffet (2003), a VR system is made up of five important components: the VR engine itself, its software/database(s), I/O devices, tasks and the users themselves. This means that users are as important as the other components of a VR system (Heldal 2007; Schroeder et al. 2006). As users are part of the system, interactions can occur between its components, as shown by several studies: VR can improve not only the learning skills of users (Horváth 2016; Kovari 2018; Wilson 2019), but their spatial skills as well (Dünser et al. 2006; Macik 2018; Mclellan 1998; Molina-Carmona et al. 2018a; Parsons et al. 2004; Torner et al. 2016). The conclusion of the last two studies is that the mental rotation performance of males is better than that of females in VR and in augmented reality (AR). The last study also suggests that AR could be a good tool for improving spatial ability.

Studies also exist which present the design of VR spatial ability tests and the research plans of their authors. An MCT test in VR was developed by Hartman et al. (2006) with the goal of helping other scientists create future MCT tests. A testing method was outlined by presenting the procedure and the data analysis. As that study is about the creation of VR MCT tests, no results are published in it. A VR MRT test outline was created by Rizzo et al. (1998b), who presented their future plan and their first results at the time of writing: the rates of correct answers between a pre-test and a post-test were investigated. Later, their preliminary results on these tests were published (Rizzo et al. 1998a). According to them, the VR MRT test helped users, as their results improved on the post-tests. A study with different results also exists in the literature (Jiang and Laidlaw 2019), in which the results of two groups that took the MRT test type were compared: a desktop environment was used by one group and a virtual environment (VE) by the other. According to the authors, low spatial ability participants benefited from learning between the pre-test and the post-test. The conclusion was that performance on MRT tests was not significantly affected by using VR technologies.

However, Oman et al. (2000) found that the performance of users who used head-mounted displays (HMDs) was slightly better than that of those who did not use them. Their conclusion was that VR is well suited to spatial ability training. Chang et al. (2017) developed a perspective-test in VR to measure the spatial skills of users. Similarly to the previous study, pre-tests and post-tests were conducted. Users were divided into three groups: those who interacted with the application with motion; those who interacted with a keyboard and a mouse; and those who interacted with motion, but performed nonspatial tasks. Their conclusion was that the first two groups improved between the tests. However, significant improvements were only found in the case of the first group. The PSVT-R test in VR was created and evaluated by Molina et al. (2018b). Two groups of users were tested: a desktop display was used by one group and an HMD by the other one. Pre-tests and post-tests were done by both of them. The conclusion of the study was that the spatial skills of both groups improved, but the improvement is significant when an HMD is used.

Our earlier research showed that results similar to those of paper-based tests can be obtained when using a desktop display (Guzsvinecz et al. 2020a). However, when using an HMD such as the Gear VR, they change significantly. This also confirms the positive influence of an HMD; however, new facts arose in the referenced study: while a significant difference exists between the results of males and females when using a desktop display, this difference disappears when the Gear VR HMD is used. Moreover, a similar phenomenon can be observed between right-handed and left-handed users: in the case of a desktop display, the performance of right-handed people on the tests is significantly better than that of their left-handed counterparts, but with the Gear VR, left-handed users achieve significantly better results than right-handed ones.

As can be seen, while the transition from paper-based tests to digital ones is not easy, it has certain advantages, especially for different user groups. This is because when a user is placed inside a VE, the interaction between the human and the machine changes: for example, the user does not use a pen and paper, but a sensor (or other input devices) to take tests. This also means that in VR the human-computer interaction (HCI) and human-computer interfaces can differ from application to application (Kortum 2008): new interfaces and various I/O devices have to be learned in every VR application and the required tasks could differ between each of them. Due to this, the developers of VR applications have to take these differences into account (Sutcliffe et al. 2019), and the focus should be on user-centric development (Drettakis et al. 2007). To help users with HCI, a toolkit was developed by Takala (2014) which makes it easier to create VR applications using building blocks. With it, applications can be created for HMDs. In that study, the spatial graphical user interface ideas of students are presented and the toolkit is evaluated. According to the author, both received positive feedback.

Therefore, to investigate multiple aspects of different types of user interaction in VR and the influences of display devices and display parameters on it, a spatial ability measuring VR application was developed, which can use a desktop display and the Gear VR (Guzsvinecz et al. 2019). Regarding the tests, since some factors and influences on correct answers were investigated in another paper (Guzsvinecz et al. 2020b), this study seeks the factors that affect the spatial ability test completion times. What can be extrapolated from the results is how to make interaction with the computer in VR less time-consuming. As mentioned in several studies (Chang et al. 2017; Guzsvinecz et al. 2020a; Molina et al. 2018b; Oman et al. 2000), using an HMD has a positive influence on the users’ ratio of correct answers, but this effect differs between user groups. Thus, it is possible that completion times could be affected by different display devices, users’ various characteristics, and test types. If this is the case, designers of VR applications which require spatial skills (such as applications for education and even for cognitive rehabilitation) could use the results to create VEs which are less time-consuming and/or more user-friendly, as users’ characteristics are taken into account during development.

2 Research questions and hypotheses

As mentioned in the introductory section, the complexity of a VR system is reflected to some extent in completion times and even in rates of correct answers: such systems are very complex, and users, tasks and I/O devices are integral components of them. Therefore, the test completion times in VR were investigated, because using different display devices can have certain positive or negative influences on the rates of correct answers of users with various characteristics. In this study, these human characteristics are the users’ gender and their primary hand. These two characteristics were chosen because significant differences were found in their rates of correct answers in our previously referenced research. In addition, the completion times of each test type are also investigated to see whether they interact with different display devices and various user groups.

Therefore, firstly, the connection between completion times and the probabilities of correct answers was investigated. Secondly, the influence of the display devices used on completion times needed to be analyzed with regard to various user groups and test types. Lastly, the goal was to find the combinations of display devices, human characteristics, and certain test types which result in either the smallest or the largest completion times. Therefore, before the research commenced, three research questions (RQs) were set up, which are the following:

RQ1: Are completion times and probabilities of correct answers independent from each other?
RQ2: Are completion times affected by different display devices, users’ various characteristics, and test types? Do they interact with each other?
RQ3: Which combination of mentioned factors results in either the smallest or the largest test completion times?

After setting up the RQs, the same number of hypotheses (Hs) was formulated. These three Hs, which contain both the null hypotheses and their alternatives, are the following:

H1: Completion times and probabilities of correct answers are independent from each other, opposite to: completion times and probabilities of correct answers are dependent.
H2: No significant effects on completion times are provided by different display devices, human characteristics and test types, opposite to: significant effects on completion times are provided by different display devices, human characteristics and test types.
H3: The smallest and the largest completion times are not significantly affected by some of the mentioned factors, opposite to: the smallest and the largest completion times are significantly affected by some of the mentioned factors.

3 Methodology

In order to answer the mentioned RQs, a spatial ability testing application was developed at the University of Pannonia in 2019. The Unity game engine was used for development and two versions of the application exist: one was made for Windows 7 or newer operating systems and the other one was developed for the Gear VR SM-R322 HMD, which uses the Android operating system. A Samsung Galaxy S6 Edge+ smartphone was placed inside the Gear VR.

While the two versions of spatial ability testing applications are built similarly, two main differences exist between them. In the case of the desktop display version, interaction is done with a keyboard and/or mouse. When using the Gear VR however, interaction is done with a touchpad which can be found on its right side. As the I/O devices are integral components of a VR system, this is a critical difference. Another distinction is that the virtual camera can rotate when the Gear VR is used: the smartphone inside it has accelerometer(s) and gyroscope(s), thus rotation of users’ head can be followed, meaning the virtual camera can rotate accordingly. Thus, all objects could be seen from slightly different perspectives and the students felt that they were inside the VE. However, the virtual camera cannot move in any direction due to the Gear VR being only able to handle rotations. In case of the desktop display version, the virtual camera could not be rotated or moved, thus the objects could only be seen from a frontal point of view: the immersion of users was not as high, because they were outside of the VE.

Testing was done in two groups during September 2019. At the University of Debrecen, an LG 20M37A (19.5”) desktop display device was used for testing by 240 students. Those who tested at the University of Debrecen were either architect and civil engineering or mechanical engineering students. The ones who came to tests were 23.5 years old on average with a dispersion of 3.1 years. For all tests, a computer laboratory was used. Due to its small size, twelve groups of twenty students were made. Testing was done during three weekdays and thus, was completed within a week. At the University of Pannonia, 61 users tested with the Gear VR: this group consisted of information technology (IT) and non-IT students. They were 19.7 years old on average with a dispersion of 1.5 years. In this case the tests’ duration was three weeks long as only one Gear VR was available at the University. This means that all testers had to come in a sequential order, one-by-one. Each of them required at least thirty minutes and an hour at most to complete the tests. Thus, the skills of 8 students were measured at most per day, while the smallest number of testers per day was 2. As they were students, their appointments were made according to their classes so they could come to the tests before or after their classes.

During measurements, each test type had to be done three times, in other words, in three sequences. Each test type could be found in every sequence. In all of them, the order was the MRT, MCT, PSVT test types. As each test type consisted of ten rounds (questions), thirty questions could be found in every sequence. After one sequence of tests was completed, users could rest if they wanted to, and after that they started the next sequence, which consisted of the same test types, but the solutions to the spatial ability problems were changed using a randomization technique. In total, each student was asked 90 questions on the tests. In Figs. 1, 2, 3 the MRT, MCT and PSVT test types can be seen in the application, respectively.
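The structure described above can be checked with a short sketch. The constant and function names below are illustrative assumptions and are not taken from the authors' Unity application; the count of timed blocks relies on the statement in the Results section that a completion time is logged after every ten questions.

```python
SEQUENCES = 3
TEST_TYPES = ("MRT", "MCT", "PSVT")  # fixed order within every sequence
QUESTIONS_PER_TEST = 10


def build_schedule():
    """Return (sequence, test_type, question) tuples in presentation order."""
    return [
        (seq, test, question)
        for seq in range(1, SEQUENCES + 1)
        for test in TEST_TYPES
        for question in range(1, QUESTIONS_PER_TEST + 1)
    ]


schedule = build_schedule()
questions_per_sequence = len(TEST_TYPES) * QUESTIONS_PER_TEST
# One completion time is logged per block of ten questions.
timed_blocks_per_student = SEQUENCES * len(TEST_TYPES)

print(len(schedule), questions_per_sequence, timed_blocks_per_student)
# prints: 90 30 9
```

Under these assumptions, each student contributes nine completion-time measurements, three per test type.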

Naturally, on the tests, answers had to be chosen, while the completion times of users were also measured (as seen in the upper right corners in Figs. 1, 2, 3). This was done differently in each version of the application due to the two types of interactions. In case of the desktop display version, answers could be chosen by pressing certain numbers on the keyboard: 1–4 in case of the MRT test type and 1–5 during other test types. These numbers correspond to objects on screen from left to right. Selecting answers could also be done by clicking on a certain object with a mouse. However, when the Gear VR version is used, students had to look at objects they wanted to choose. Afterward, the touchpad had to be tapped on the right side of the Gear VR to select an object.

As mentioned earlier, the virtual camera’s position is locked in both versions, but it can be rotated when using the Gear VR. While the number of rotations was measured during testing, students were asked to not rotate the virtual camera to correctly investigate their spatial skills.

Regarding students at both universities, every person who was willing to do the tests could join. This means that no selection criteria were applied. Moreover, since the spatial skills of students were measured, no information was gathered about their height and body weight. To respect their anonymity, their names were not gathered.

It should be noted that information is logged into a .csv file about the users (age, gender, primary hand, years spent at a university, field of study), the test (test type, completion time, number of correct answers), and the technical parameters used (virtual camera type, its rotation, its field of view, the contrast ratio between the foreground object and the background, whether shadows are turned on in the scene, and the device used). Since completion times are in our focus, the effects of the users’ gender, primary hand, the test type, and the device used on them are investigated.
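To make the logged fields concrete, the following is a minimal sketch of such a record and its serialization. The column names, units and example values are assumptions made for illustration only; the actual layout of the authors' .csv file is not given in the text.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class TestRecord:
    # User-related fields (names are hypothetical)
    age: int
    gender: str
    primary_hand: str
    years_at_university: int
    field_of_study: str
    # Test-related fields
    test_type: str
    completion_time_s: float
    correct_answers: int
    # Technical parameters
    camera_type: str
    camera_rotation_deg: float
    field_of_view_deg: float
    contrast_ratio: float
    shadows_on: bool
    device: str


def write_records(records, stream):
    """Write records as CSV with a header row derived from the dataclass."""
    writer = csv.DictWriter(stream, fieldnames=[f.name for f in fields(TestRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


# Made-up example values, not measured data.
example = TestRecord(
    age=20, gender="male", primary_hand="right", years_at_university=2,
    field_of_study="IT", test_type="MRT", completion_time_s=200.4,
    correct_answers=7, camera_type="perspective", camera_rotation_deg=0.0,
    field_of_view_deg=60.0, contrast_ratio=7.0, shadows_on=True,
    device="Gear VR",
)

buffer = io.StringIO()
write_records([example], buffer)
csv_text = buffer.getvalue()
```

Keeping one row per block of ten questions would match the way completion times are described as being logged.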

As mentioned, the goal is to identify the effects of the mentioned factors on test completion times. Thus, the correlation between the probabilities of correct answers and test completion times is also investigated. To evaluate the relation between probabilities and completion times, logistic regression analysis was used, while linear regression analysis was used to analyze the factors’ effects on the latter (Hosmer Jr et al. 2013; Walpole et al. 2011). All calculations were performed with the R statistical software package (R Core Team 2018).
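The logistic regression step can be illustrated with a small sketch. The authors performed their calculations in R; the code below is a swapped-in pure-Python Newton-Raphson fit, and the model form (completion time as the only predictor, with the number of correct answers out of ten as a binomial response) is an assumption. The data are made up.

```python
import math


def fit_binomial_logistic(times, correct, n_questions=10, iterations=25):
    """Fit P(correct) = 1 / (1 + exp(-(b0 + b1 * t))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for t, k in zip(times, correct):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * t)))
            weight = n_questions * p * (1.0 - p)   # binomial variance term
            residual = k - n_questions * p         # observed minus expected
            g0 += residual
            g1 += residual * t
            h00 += weight
            h01 += weight * t
            h11 += weight * t * t
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 Newton system H * delta = g explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up data: completion time in minutes, correct answers out of ten.
times_min = [1.0, 2.0, 3.0, 4.0, 5.0]
correct_counts = [4, 5, 6, 6, 8]
intercept, slope = fit_binomial_logistic(times_min, correct_counts)
```

A positive slope in such a model would mean that the estimated probability of a correct answer rises with completion time. Times are given in minutes here only to keep the toy example numerically well behaved.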

4 Results

In this section, completion times are analyzed in each case, and are measured in seconds. These are logged into the mentioned file after ten questions are answered (as one test type consists of ten questions). The smallest completion time is 7.9 seconds and the largest one is 1168.43 seconds, which is approximately 20 minutes. Their average is 200.388 seconds with a dispersion of 123.279 seconds.

The distribution of completion times is not normal as the Kolmogorov-Smirnov test resulted in p-value \(< 2.2 \times 10^{-16}\). Therefore, the hypothesis of normal distribution is rejected. The histogram of test completion times is presented in Fig. 4.
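As an illustration of the normality check, the sketch below computes a one-sample Kolmogorov-Smirnov statistic against a normal distribution whose mean and standard deviation are estimated from the sample. This is an assumption about how the test was set up; the text does not state the reference parameters. The p-value uses the asymptotic Kolmogorov series without any correction for estimated parameters, and the two samples are synthetic.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_normal(sample):
    """Return (D, asymptotic p-value) for a KS test against a fitted normal."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # Largest gap between the empirical and the fitted CDF at each step.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    lam = math.sqrt(n) * d
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    p_value = min(1.0, max(0.0, 2.0 * series))
    return d, p_value


# Synthetic samples: exact normal quantiles and a right-skewed transform.
n = 50
normal_like = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
skewed = [math.exp(x) for x in normal_like]

d_normal, p_normal = ks_normal(normal_like)
d_skewed, p_skewed = ks_normal(skewed)
```

A right-skewed sample, such as completion times with a long upper tail, yields a larger statistic and a smaller p-value than the symmetric sample.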

Since completion times and probabilities of correct answers are numerical values, the correlation coefficient can be used to evaluate whether they correlate or are independent.

The correlation coefficient between completion times and the probability of correct answers equals 0.223. A test of whether it can be considered zero yielded a p-value \(< 2.2\times 10^{-16}\); therefore, the hypothesis that the correlation is zero is rejected. This means that these variables are not independent of each other. The positive sign of the correlation coefficient means that when completion time increases, the probability of correct answers increases as well. The value of the correlation coefficient shows that the linear relationship between these two variables is not strong. If the correlation coefficient between the logarithm of completion time and the probabilities is considered, a somewhat larger correlation of 0.299 is obtained.
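The correlation analysis can be sketched as follows. The Pearson coefficient and the usual t statistic for testing a zero correlation are standard; the sample size used below (301 students times nine timed blocks each) is an assumption derived from the described procedure, not a figure reported in the text, and the small data set is made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Made-up data: completion times in seconds and rates of correct answers.
times_s = [60.0, 120.0, 180.0, 240.0, 300.0]
p_correct = [0.4, 0.5, 0.6, 0.6, 0.8]
r_raw = pearson_r(times_s, p_correct)
r_log = pearson_r([math.log(t) for t in times_s], p_correct)

# Reported coefficient with an ASSUMED sample size of 301 * 9 records.
assumed_n = 301 * 9
t_reported = t_statistic(0.223, assumed_n)
```

With a sample of that order, even a weak correlation such as 0.223 gives a t statistic far from zero, which is consistent with the very small p-value reported.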

Relations between completion times and the mentioned factors are analyzed by regression, as seen in Table 1. In all Tables where the results of linear regression analysis are presented, the estimated coefficients (Est.), standard errors (Std. err.), test statistics (t value) and p-values are shown in each case. It should be noted that the latter is the probability of a type I error (Pr(\(>|t|\))). In the Tables where the results of logistic regression analysis are presented, z values are shown instead of t values.
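How the Est., Std. err. and t value columns arise can be shown with a minimal ordinary least squares sketch for a single binary factor. The 0/1 coding and the data are assumptions for illustration; the authors' models were fitted in R and include more factors and interactions.

```python
import math


def simple_ols(xs, ys):
    """Return (intercept, slope, slope_std_err, slope_t_value) for y ~ x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    residual_variance = sse / (n - 2)
    std_err = math.sqrt(residual_variance / sxx)
    return intercept, slope, std_err, slope / std_err


# Made-up data: hypothetical indicator (0 = desktop display, 1 = Gear VR)
# against completion time in seconds.
device = [0, 0, 0, 1, 1, 1]
time_s = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0]
est_intercept, est_slope, std_err, t_value = simple_ols(device, time_s)
```

With a 0/1 indicator, the intercept is the mean of the reference group and the slope is the difference between the two group means, so a positive estimate indicates longer completion times for the group coded 1.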

Table 1 Results of logistic regression analysis of the relation between completion times and probabilities of correct answers
