
Computers & Education

Volumes 92–93, January–February 2016, Pages 64-75

Embodied learning using a tangible user interface: The effects of haptic perception and selective pointing on a spatial learning task

https://doi.org/10.1016/j.compedu.2015.10.011

Highlights

  • Multisensory learning is investigated using a tangible user interface.

  • Interactive selective pointing of labels is contrasted with a permanent display.

  • Higher interactivity leads to worse learning performance with the tangible interface.

  • We introduce Embodied Cognitive Load Theory as an extension of Cognitive Load Theory.

  • Interactivity and haptic perception are discussed in terms of a cost-benefit model.

Abstract

Tangible user interfaces (TUIs) offer new ways of interacting with virtual objects, yet little research has been conducted on their learner-friendly design in the context of spatial learning. Although frameworks such as Embodied Cognition stress the importance of sensory perception and movement, studies have found that high interactivity can be overwhelming and may lead to lower learning performance. In a 2 × 2 factorial design, participants (n = 96) learned heart anatomy using a 3D model that was controlled either with a mouse or with a tangible object, i.e., a motion-tracked plastic model of the virtual heart. As the second factor, we varied the interaction mode: either a selective pointing mode, in which only the label currently activated by the user was displayed, or a permanent display of all labels. Retention performance, cognitive load scores, and motivation measures indicate that the tangible object leads to significantly higher learning outcomes. The effect of the label display mode differs between the two input devices: with the mouse, selective pointing yields better performance than permanent display; with the TUI, the pattern is reversed. Based on these results, we propose extensions for Embodied Cognition and Cognitive Load Theory.

Introduction

Constructing spatial representations of three-dimensional objects is a complex cognitive task relevant in many areas of education including STEM fields and medicine. In recent years, researchers have begun to investigate the benefits of virtual learning environments as well as the cognitive foundations for their learner-friendly design (Nicholson et al., 2006, Pan et al., 2006, Stull et al., 2009). Despite the introduction of theoretical frameworks aimed at explaining and predicting the effects of interactive and perceptually rich learning environments, studies in this field often lead to mixed results. We propose a more systematic approach for research on spatial cognition in virtual environments by contrasting predictions derived from Cognitive Load Theory (CLT; Sweller, 1999, Sweller et al., 1998) with current research on Embodied Cognition (EC; Barsalou, 1999, Shapiro, 2010). Based on this analysis, we conducted an experimental study focused on the influence of multimodal perceptual input and an interactive display mode using a tangible user interface (TUI). While the main focus of research on TUIs lies in devising more natural interaction modes and creating more engaging experiences (Hornecker and Buur, 2006, Shaer and Hornecker, 2010, Xie et al., 2008), we are primarily concerned with the use of TUIs for generating spatial representations.
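To make the interaction concrete: a motion-tracked tangible can drive a virtual model by copying the tracked orientation onto the model each frame. The sketch below is illustrative only; the quaternion convention (w, x, y, z; unit quaternions) and all names are our assumptions, not the apparatus software used in the study.

```python
# Illustrative sketch: rotate the virtual model by the tangible's tracked
# orientation. Assumes unit quaternions in (w, x, y, z) order.

def quat_rotate(q, v):
    """Rotate 3D vector v by unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * cross(q.xyz, v)
    tx = 2 * (y * vz - z * vy)
    ty = 2 * (z * vx - x * vz)
    tz = 2 * (x * vy - y * vx)
    # v' = v + w*t + cross(q.xyz, t)  (expanded quaternion sandwich product)
    return (vx + w * tx + (y * tz - z * ty),
            vy + w * ty + (z * tx - x * tz),
            vz + w * tz + (x * ty - y * tx))

# The identity orientation leaves a model vertex unchanged:
print(quat_rotate((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 1.0)))  # → (0.0, 0.0, 1.0)
```

In a tracked setup, each rendering frame would read the plastic model's pose from the tracker and apply `quat_rotate` to the vertices (or, in practice, pass the quaternion to the renderer as a model transform).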

Spatial learning can be investigated using virtual learning environments that allow users to interact with three-dimensional objects or scenes. Although several studies have shown that spatial learning using interactive media can lead to improved learning performances (Barrett et al., 2015, Huk, 2006, Stull et al., 2013, Stull et al., 2009), some results indicate that high interactivity and perceptual richness can be detrimental to learning (Levinson et al., 2007, Song et al., 2014). These results have been most commonly discussed within the framework of CLT and, in recent years, EC.

CLT holds that our limited working memory capacity requires learning materials to be designed in a manner that leaves as much memory capacity as possible reserved for the actual learning contents (Sweller et al., 1998). Furthermore, all cognitive load resulting from content or interactions not directly contributing to learning should be avoided. While CLT is most commonly used in the fields of instructional and educational psychology, its basic architecture lends itself to a wide variety of research questions that involve the analysis of cognitive tasks. CLT proposes three types of cognitive load (Sweller et al., 1998): Intrinsic load is defined as the load resulting from the complexity of the learning material (or task); the more elements that must be kept in working memory simultaneously, the higher the intrinsic load. Therefore, learning materials that enable a segmentation of a learning task lead to better learning outcomes, as the intrinsic load is lowered and more cognitive resources are available for learning. The second component, extraneous load, results from the design of the learning materials: a learner-friendly presentation of the learning content lowers extraneous load by facilitating access to relevant information and thus aiding understanding. A series of fMRI investigations reviewed by Whelan (2007) suggests that extraneous load can be linked to processing constraints in brain areas that modulate attention across sensory modalities. More specifically, extraneous load activates modality-specific neural structures in the posterior parietal cortex and Wernicke's area, which we interpret as an indication that resources for sensorimotor control, integration of spatial information, and language processing can be affected by extraneous load. Lastly, germane load is the load devoted to constructing mental representations of the learning content so that it can be stored in long-term memory.
While most CLT researchers use Baddeley's models of working memory (Baddeley, 1992, Baddeley, 2000) as their basis, there have been recent efforts to expand these models to include movements and related aspects of embodiment (Wong et al., 2009; Ayres, Marcus, Chan, & Qian, 2009).

One of the main conclusions that can be drawn from CLT research is that highly interactive and perceptually rich learning environments can be disadvantageous to learning due to the high extraneous load imposed on the user (e.g., Stull & Mayer, 2007). At the same time, it should be noted that CLT research often uses a dichotomy of “high” and “low” interactivity that can usually be translated into “active” (requires the use of more elaborate controls) or “passive” (requires little additional activity from the user) interaction designs. The well-established theoretical framework of CLT is empirically supported by a wide variety of studies in the fields of learning and human–computer interaction (for an overview, see Plass, Moreno, & Brünken, 2010); however, the model faces growing criticism for its poor ability to explain why some forms of additional interactivity, such as gesturing and pointing (De Koning and Tabbers, 2013, De Nooijer et al., 2013), can be beneficial for learning outcomes. While a “less is more” design approach derived from CLT often leads to better learning performance than highly interactive learning environments that overstrain learners, it is important to study how and why some forms of added perceptual and motor load have recently proven to be helpful.

Research on the role of multimodal perception and bodily movement in cognitive processes is predominantly conducted on the basis of the EC framework, which stresses the importance of bottom-up processes and modality-based perceptual representations (e.g., Glenberg, 2010, Kirsh, 2013). Accounts of EC such as the Perceptual Symbol Systems model (Barsalou, 1999, Barsalou, 2008) aim to describe cognition as the re-enactment of perceptual states. Thus, thinking about an object cannot be adequately defined as the mental processing of abstract representations of that object, realized through non-perceptual mental contents such as semantic networks. Instead, EC characterizes cognitive processes as the re-activation of neural states that were active during the perception and encoding of that object (Barsalou, Kyle Simmons, Barbey, & Wilson, 2003). This approach suggests that humans are able to store perceptually rich memories including information gathered through all sensory modalities, hence allowing tactile percepts and motor affordances to be memorized in conjunction with visual and auditory information in multimodal representations.

Although a rapidly growing body of research within cognitive psychology and neuroscience supports these claims, attempts to utilize these results in more applied fields of psychology have led to mixed results. A number of studies found positive effects of perceptually enhanced learning materials featuring haptic feedback (Minogue and Jones, 2006, Schönborn et al., 2011, Wiebe et al., 2009), and interactivity (for a review, see Plass, Homer, & Hayward, 2009). Consequently, researchers have asked for updates to the “less is more” stance of CLT to be compatible with findings from EC (Paas & Sweller, 2011). By contrast, a number of studies designed with the principles of EC in mind have revealed that highly interactive and multimodal user interfaces do not always lead to an improved learning performance or may even overburden the learner (e.g., Post et al., 2013, De Nooijer et al., 2013). A typical example of such studies was carried out by Song et al. (2014): Participants were assigned to one of four versions of a medical learning tool for stroke syndromes differing in their degree of interactivity. One group watched a non-interactive presentation of a tissue lump traveling through the brain, while another group was given a version in which a slider had to be used to control the presentation. Two other groups used highly interactive versions in which the tissue lump had to be moved either by clicking on the different brain structures or by dragging it along a pre-defined path. Performance on a transfer test was best in the two conditions featuring a low degree of interactivity, suggesting that the more interactive versions overwhelmed participants by imposing a high extraneous cognitive load on them.

We want to go a step further with the interpretation of this result by claiming that highly interactive design versions requiring substantial effort such as motor coordination lead to worse learning outcomes only if users do not receive benefits for the added motoric (and extraneous) cognitive load that high interactivity usually entails. In the case of the study by Song et al. (2014), the additional load needed for motor coordination (and possibly for the learning of more complex interface controls) could not be compensated by any advantage related to the learning task. As the test questions did not include items in which the motor activities performed in the more interactive conditions could have made a direct impact (e.g., a task in which a motion trajectory should be drawn on paper), the higher degree of interactivity only filled up cognitive resources without offering learning benefits. A review of EC learning studies by de Koning and Tabbers (2011) can be considered to support this interpretation, as it presents several studies in which task-appropriate actions during learning increased learning performance, contradicting the results of Song et al. (2014).

Another major aspect of EC relevant to spatial learning is the potential of using multimodal perceptual input to aid in the formation of spatial representations. In a recent review by Pouw, van Gog, and Paas (2014), the authors discuss a variety of studies featuring instructional manipulatives, i.e. physical objects or interactions used in learning situations, and conclude that sensorimotor experiences offer a high potential for reducing cognitive load as well as for transfer of learning. However, based on mixed results regarding the usefulness of physical manipulatives, the authors claim that the benefits are dependent on several design factors: Most importantly, additional information gained through haptic modalities must be relevant in the process of forming an understanding to become valuable for learners. They claim that this is particularly the case if sensory input from one modality is simply not enough to understand a given learning content. Furthermore, they discuss a study on chemistry education in which a 3D model aided task performance only if participants made active use of the model, thereby externalizing mental rotations and facilitating learning (Stull, Hegarty, Dixon, & Stieff, 2012), again confirming that task-appropriate interactions can be helpful. By examining whether additional haptic input from a TUI is advantageous in the formation of spatial representations, we want to test whether predictions derived from EC can be supported.

Regarding the design of more interactive learning scenarios, studies focusing on gesturing and pointing have revealed promising findings for the application of EC to learning (Pouw et al., 2014a, Chu and Kita, 2011), in particular through increased motor activation and a corresponding focus of attention (Brucker et al., 2015, De Koning and Tabbers, 2013). Despite the positive results of EC-based pointing and gesturing studies, research based on CLT has revealed that interactive learning features that require action in order for learning elements to appear (which we call selective pointing) can negatively affect learning results. Rey and Diehl (2010) showed this with a selective pointing implementation that hides all information not currently touched by the mouse cursor (similar to mouse-over effects in web browsers), demonstrating that additional motor activity may increase cognitive load. On the other hand, hiding information and relying on knowledge stored in working memory has been linked to increases in learning performance (Gray and Fu, 2004, Souza et al., 2014). On the basis of these conflicting results, we want to include a selective pointing feature in our study design to assess whether the feature helps in the context of spatial learning tasks and to gain insight into whether the effect of pointing differs from input device to input device.
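The contrast between the two label modes can be sketched in a few lines. The geometry and all names below are illustrative assumptions, not the study software: in selective pointing a label is drawn only while the pointer hovers within its anchor region, whereas permanent display draws every label regardless of the pointer.

```python
# Minimal sketch of the two label-display modes (hypothetical names).
from dataclasses import dataclass

@dataclass
class Label:
    name: str      # anatomical label, e.g. "aorta"
    x: float       # anchor position in screen coordinates
    y: float
    radius: float  # hover radius around the anchor

def visible_labels(labels, pointer_x, pointer_y, selective=True):
    """Return the label names to draw for the current pointer position."""
    if not selective:  # permanent display: all labels shown at once
        return [lb.name for lb in labels]
    shown = []
    for lb in labels:  # selective pointing: hover test per label
        if (pointer_x - lb.x) ** 2 + (pointer_y - lb.y) ** 2 <= lb.radius ** 2:
            shown.append(lb.name)
    return shown

labels = [Label("aorta", 100, 50, 20), Label("left ventricle", 160, 120, 20)]
print(visible_labels(labels, 105, 55))             # → ['aorta']
print(visible_labels(labels, 0, 0, selective=False))  # → ['aorta', 'left ventricle']
```

The extra motor demand of the selective mode is visible in the sketch: the learner must steer the pointer onto each anchor to reveal its label, instead of reading all labels at once.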

Since TUIs can be implemented with unimanual and bimanual interaction designs, we want to provide a short overview of research in this area. For 2D interaction scenarios, the use of bimanual interaction modes instead of one-handed designs has resulted in performance gains in a positioning-and-scaling task in 2D space as well as in a navigation-and-selection task (Buxton & Myers, 1986). Similarly, Leganchuk, Zhai, and Buxton (1998) found beneficial effects of bimanual interaction, in terms of performance and cognitive load, in an area-sweeping task commonly found in graphics software; they explain these effects using the kinematic chain model by Guiard (1987). This model supposes that the non-dominant hand provides the frame of reference for the action of the dominant hand, precedes it in time, and acts at a coarser granularity. As a result, the two hands are used in an asymmetric but cooperative manner during bimanual interactions, thereby lowering cognitive load. Turning to bimanual interaction design in 3D space using TUIs, a study by Hinckley, Pausch, and Proffitt (1997) revealed that people are better able to memorize the alignment of two objects with a bimanual interaction mode than with a unimanual mode. However, this study featured a task in which the primary aim was motor learning, while the anatomy learning task used in our study deals with visuospatial learning in a 3D environment. Nevertheless, the results from research on bimanual interaction are in line with the usual predictions derived from EC claiming that bodily activity can reduce cognitive load.

In addition, we aim to assess whether the input device affects users’ attitudes regarding motivation and usability. Studies such as Xie et al. (2008) report motivational gains for TUI use, similar to recent augmented reality research by Cai, Wang, and Chiang (2014), in which positive attitudes towards an interactive augmented reality learning tool were found. In particular, we aim to assess whether the TUI can increase interest and whether an interactive display mode can affect self-efficacy. Regarding usability, we want to ascertain whether multimodality and an interactive display mode are perceived as stimulating and practical.

Based on this theoretical background, we formulated the following hypotheses:

H1

The use of a TUI will lead to a better learning performance, as compared with the use of a mouse.

H2

  • a)

    A selective pointing feature will lead to different learning outcomes with the two input devices.

  • b)

    Based on EC, selective pointing should lead to a higher learning performance compared with the permanent display of all labels, particularly when combined with a TUI due to the concurrent motor activation.

  • c)

    Based on CLT, selective pointing should lead to lower learning performance with any input device due to the higher degree of interactivity.

H3

The use of a TUI will result in less cognitive load during a learning task compared with a mouse interface.

H4

The TUI will create more interest in the learning contents compared with a mouse interface while a more interactive display mode will increase self-efficacy. The TUI will be rated as more practical and stimulating compared with the mouse controls.

Section snippets

Participants

The participants were 101 university students enrolled predominantly in Media Communication and Psychology at a German university. Five participants were excluded from the analysis because they either stated that they had learned heart anatomy during the last year or studied a subject that requires in-depth knowledge of heart anatomy and physiology, such as Sports, leaving 96 participants (Mage = 23.9 years, SDage = 3.43; 65 female, 31 male) in the analysis. Ninety-four of these were native speakers

Prior knowledge and spatial ability

Participants in the four experimental groups did not differ in their performance on the prior knowledge test, as tested using an ANOVA, F(3, 92) = .2014, p = .895. A Kruskal–Wallis test for the question item asking if and when participants had learned heart anatomy revealed no significant difference between the four groups either [χ2 (3, N = 96) = 2.19, p = .535]. With regard to spatial abilities, there was neither a difference in spatial ability between
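Group-balance checks of this kind can be reproduced with standard tests. The scores below are synthetic and only the procedure mirrors the analysis; group sizes and score scale are assumptions for illustration.

```python
# Illustrative group-balance checks for a four-group (2 x 2) design,
# using synthetic data in place of the study's prior-knowledge scores.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Four groups of n = 24 with identical score distributions (no true difference)
groups = [rng.normal(10, 2, 24) for _ in range(4)]

# One-way ANOVA on the (interval-scaled) prior-knowledge scores
f_val, p_anova = stats.f_oneway(*groups)

# Kruskal-Wallis for an ordinal item (e.g. when heart anatomy was last learned);
# its H statistic is chi-square distributed, as reported in the text above
h_val, p_kw = stats.kruskal(*groups)

print(f"ANOVA: F = {f_val:.3f}, p = {p_anova:.3f}")
print(f"Kruskal-Wallis: H = {h_val:.3f}, p = {p_kw:.3f}")
```

With balanced groups, non-significant p-values on such checks are the desired outcome, since they indicate that later differences in retention cannot be attributed to unequal prior knowledge.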

Discussion

The main objective of this study was to investigate whether additional haptic input and interactive label presentation are beneficial when learning with 3D objects. We will now discuss the effects of the input device and the selective pointing feature on the dependent variables learning performance, cognitive load, motivation, and usability. Based on these results, we propose an extension of CLT as well as a cost-benefit model of EC.

References (56)

  • Z. Pan et al.

    Virtual reality and mixed reality for virtual learning environments

    Computers & Graphics

    (2006)
  • L.S. Post et al.

    Effects of simultaneously observing and making gestures while studying grammar animations on cognitive load and learning

    Computers in Human Behavior

    (2013)
  • K.J. Schönborn et al.

    Exploring relationships between students' interaction and learning with a haptic virtual biomolecular model

    Computers & Education

    (2011)
  • H.S. Song et al.

    The cognitive impact of interactive design features for learning complex materials in medical education

    Computers & Education

    (2014)
  • A.T. Stull et al.

    Usability of concrete and virtual models in chemistry instruction

    Computers in Human Behavior

    (2013)
  • M. Valcke

    Cognitive load: updating the theory?

    Learning and Instruction

    (2002)
  • R.R. Whelan

    Neuroimaging of cognitive load in instructional multimedia

    Educational Research Review

    (2007)
  • E.N. Wiebe et al.

    Haptic feedback and students' learning about levers: unraveling the effect of simulated touch

    Computers & Education

    (2009)
  • A. Wong et al.

    Instructional animations can be superior to statics when learning human motor skills

    Computers in Human Behavior

    (2009)
  • R. Amthauer et al.

    Intelligenz-Struktur-Test [Intelligence-Structure-Test] 2000 R

    (2001)
  • A. Baddeley

    Working memory

    Science

    (1992)
  • L.W. Barsalou

    Perceptual symbol systems

    Behavioral and Brain Sciences

    (1999)
  • L.W. Barsalou

    Grounded cognition

    Annual Review of Psychology

    (2008)
  • W. Buxton et al.

    A study in two-handed input

  • M. Chu et al.

    The nature of gestures' beneficial role in spatial problem solving

    Journal of Experimental Psychology: General

    (2011)
  • B.B. de Koning et al.

    Facilitating understanding of movements in dynamic visualizations: an embodied perspective

    Educational Psychology Review

    (2011)
  • B.B. De Koning et al.

    Gestures in instructional animations: a helping hand to understanding non-human movements?

    Applied Cognitive Psychology

    (2013)
  • T.H.S. Eysink et al.

    Learner performance in multimedia learning arrangements: an analysis across instructional approaches

    American Educational Research Journal

    (2009)