A comparison of the effects of haptic and visual feedback on presence in virtual reality

In the current consumer market, virtual reality experiences are predominantly generated through visual and auditory feedback. Haptics are not yet well established, but are increasingly introduced to enhance the user's sense of 'reality'. With haptic (vibrotactile) feedback now part of the built-in mechanism of VR consumer devices, there is an urgent need to understand how different modalities work together to improve the user experience. This paper reports an experiment that explores the contributions made to participants' sense of presence by haptic and visual feedback in a virtual environment. Participants experienced a virtual ball bouncing on a virtual stick resting across their avatar hands. We found that presence was enhanced when they could both see and feel the ball's action, with a strong suggestion that haptic feedback alone gave rise to a greater sense of presence than visual alone. Similarly, whilst visual or bimodal feedback enhanced participants' ability to locate where the ball bounced on the stick, our results suggest that the action itself was more readily discerned haptically than visually.


Introduction
Virtual Reality (VR) has been described by Lanier (2017) as (among other things) "the substitution of the interface between a person and their physical environment with an interface to a simulated environment". That is to say that our natural perceptual apparatus is hijacked by the VR system (Klevjer, 2011). However, in practice, only selected parts of the interface are substituted: we are never fully removed from our physical environment. In today's mainstream technology, visual and auditory feedback predominate, with increasing attention being paid to haptics; other modalities remain in the experimental stage.
The perceptual interface is of particular significance in understanding presence in virtual environments. The literature suggests a core aspect, described as "spatial presence", "place illusion", or simply "being there", that is central to presence. This is widely understood to be dependent on the nature, extent and veridicality of our sensorimotor interaction with the virtual environment, and how that relates to our normal engagement with the real world.
This paper explores the relationship between perception and presence, with an emphasis on the holistic nature of multi-modal perception. We argue that, whilst different perceptual modalities are represented in a VR system by totally distinct technological interfaces, we should not lose sight of the ways in which modalities combine to form a perceptual whole. In reviewing how presence has been understood in the literature, we propose an alternative perspective, based on Lanier's Fourth VR Definition, as quoted above. This approach highlights the way we engage with the virtual environment, drawing on previous perceptual experience-both real and virtual-to make sense of the perceptual feedback generated by the VR.
With vibrotactile haptic feedback now being part of the mainstream consumer VR device, we believe there is an urgent need to explore how haptic and visual feedback contribute to a participant's sense of presence in a limited VR environment. Our experiment was inspired by an earlier study reported by Berger et al. (2018), which involved 'funnelling': a technique that represents haptic events at different distances from the points at which the feedback is delivered. Funnelling, and its derivation from earlier psychophysical research, are discussed, along with an outline of Berger et al.'s study.
As with Berger et al.'s experiment, our participants experienced a virtual ball bouncing on a virtual stick held across their avatar hands. In different conditions, the ball bounced either in a single location throughout the condition, or in a variety of locations; feedback was delivered in different combinations of haptic and visual, so that participants might see the ball bounce, or feel it, or both or neither. Our aim was to provide an account for the results reported by Berger et al., by broadening the scope of the study to compare both visual and haptic feedback.

Perceptual modalities in VR
The objective of introducing haptic feedback to VR is to increase the player's sense of 'reality' by reproducing more closely the kind of 'real world' conditions portrayed in the VR experience. The use of different technologies independently generating visual, auditory and haptic feedback encourages the idea that the modalities are separable. This makes it easy to suppose that 'adding haptics' will necessarily improve the experience. However, real-world perception is almost invariably multimodal and holistic (Nanay, 2018). Our perceptual apparatus has evolved to make sense of changing sensations that originate naturally and coherently from the same event or percept, even though received through multiple modalities. By contrast, in VR, individually crafted digital feedback must be actively co-ordinated to appear as if from a single source. Coherence is only as good as design and technological limitations afford; yet our perceptual apparatus still expects to make sense of continuous and coherent multi-modal percepts. Theoretical science has also long assumed that individual senses and perceptual modalities can be studied independently of each other (Nanay, 2018). For example, sensorimotor theory (Noë, 2004; O'Regan et al., 2001) discusses 'sensorimotor contingencies' (SMCs) in terms of individual modalities: the way an object visually appears bigger, or a sound grows louder, as we move towards it; how the apparent (visual) shape of an object varies as we move around it; or the particular configuration our hand must take as we pick it up. In reality, SMCs are multimodal. As we lift a glass to our lips, there is no incongruence between its changing visual aspect and the constant configuration of the hand holding it: we understand that both are characteristic of that object in those circumstances.
Along with these consistencies, the seen and felt curvature of the glass, the sound of effervescence and the bouquet of a good wine, together with its taste as it reaches our lips, all combine to speak to us of a glass of bubbly, rather than of orange squash. This idea of multimodal percepts is reflected in Nanay's argument (Nanay, 2018) that if some senses are occluded, we draw on multimodal mental imagery to complete the holistic percept. That is, if there is no relevant physical stimulation in one sense modality, perceptual processing in that modality may be triggered by sensory stimulation in another modality. Thus we hear and smell the characteristic sound and aroma of coffee brewing in another room: visual mental imagery allows us to experience 'coffee machine' holistically.
Whilst experimental studies may still focus on individual senses in isolation, interpretation of outcomes should take account of multimodality. Our study seeks to apply this insight to our engagement with virtual reality, just as with our 'real', physical environment. Not all SMCs (eg those relating to taste, smell, tactile shape etc.) are represented in current VR systems: if multimodal imagery can compensate for absent modalities, how does this affect presence? In the present study, we explore the effects on presence of visual only feedback ('minus haptic'), haptic only ('minus visual') and both together.

Presence in virtual environments
Presence is generally understood as a 'sense of "being there" in the virtual environment' (Slater et al., 1994). It has also been characterised as 'an illusion that a mediated experience is not mediated' (Lombard and Ditton, 1997), and as 'the (suspension of dis-) belief of being in a world other than the physical one' (Lessiter et al., 2001). However, there is some confusion as to just how this should be measured: what factors actually constitute Presence, and which merely contribute to its attainment? For example, Table 1 summarises three instruments that have been developed to measure Presence: the Presence Questionnaire (PQ) by Witmer and Singer (1998); the ITC Sense of Presence Inventory (ITC-SOPI) from Lessiter et al. (2001); and the Igroup Presence Questionnaire (IPQ) from Schubert et al. (2001). There are clear parallels between the main factors addressed by each questionnaire, but little consistency in their characterisation: as subjective experience (eg PQ: Immersion), by subjective contributing factors (eg ITC-SOPI: Engagement), or through objectively measurable contributing factors (eg ITC-SOPI: Ecological Validity).
McMahan (2003) notes that 'The terms immersion and presence ... have been so loosely defined as to be interchangeable-which they often are.' This lack of clarity extends also to related concepts such as engagement and involvement, which are variously characterised as aspects of presence or immersion, or as pre-requisite or contributing factors for either. This confusion of terminology is clear in Table 1.
To reduce confusion, Slater (2009) proposes the concept of Place Illusion (PI) as 'the strong illusion of being in a place in spite of the sure knowledge that you are not there', excluding 'other multiple meanings that have since been attributed to the word "presence"'. This corresponds with the main factor in each of the three instruments discussed above, as shown in Table 1. He sees PI as contingent on immersion, which is determined by the set of 'valid actions' it supports-that is, 'the actions that a participant can take that can result in changes in perception or changes to the environment'. Valid sensorimotor actions are characterised by the SMCs they give rise to. Valid actions-and thus immersion-are completely determined by the physical properties of the VE: the extent and quality of tracking of user actions; the range and fidelity of perceptual feedback; and the interaction between them.
Slater also suggests Plausibility Illusion (Psi) as 'the illusion that what is apparently happening is really happening (even though you know for sure that it is not)'. Relating specifically to events ('what is happening'), this has some correspondence with the third category in Table 1; however, in the three instruments, the concept of 'realness' seems to apply rather to the material environment as a whole than just to events.
It is notable that, in all three instruments, involvement or engagement is defined largely in terms of attention. But is there more than mere 'paying attention' in this? Our engagement with a VE is perceptual, an activity of making sense of the substitute environment, just as we do with the real world. Froese et al. (2012) characterise this sense-making as "emergent from the ways in which an agent's movements are not just random physical events, but are goal-directed actions". When we pay attention to the stimuli, our actions are purposefully directed to maintaining a meaningful relationship with the VE: not just our attention but our intentionality is directed towards the VE and away from our physical environment. This 'real world dissociation' is also associated with the concept of 'duality' in the study of picture perception: our awareness that, as well as visually presenting a virtual space (Noë, 2004), a picture also exists as an object within the wider, real world visual environment (Hecht et al., 2003).
Lanier's characterisation of Virtual Reality as "the substitution of the interface between a person and their physical environment with an interface to a simulated environment" (Lanier, 2017) offers an alternative perspective on the idea of Presence in VR. Perception can be understood as an interface between a person and their physical environment (Fig. 1(a)), the means by which we make sense of the world we find ourselves in. VR technology comes between us and our natural environment (Fig. 1(b)): it hijacks our perceptual apparatus, so that we find ourselves engaging with the Virtual Environment generated by the VR. So now, our perceptual apparatus is telling us that we are 'in' the Virtual rather than our natural world (Fig. 1(c)). This, we suggest, is the essence of Presence. On this understanding, so long as the VR presents us with an environment we can make sense of as we interact with it, we should expect to feel at least some sense of presence. The degree of presence felt will depend on the ease and extent to which we can make sense of the perceptual feedback by which the VE is presented.
Sensorimotor theory (Noë, 2004;O'Regan et al., 2001) tells us that we make sense of the world by our skilful mastery of SMCs; a mastery that we continually build on, interpreting new perceptual experiences in the light of what we have already mastered. Thus, if the SMCs presented by a VR are close to those we are used to, we can easily make sense of them, and feel present in the environment: 'I am in a room' or 'I am in the countryside'; 'I can pick up that cup/flower'; 'I can look behind that table/tree', etc. Novel VR encounters can readily be understood in relation to real world experience: 'that creature is human-like/bird-like/horse-like'. But previous virtual experience can also contribute: for example, experience of moving desktop icons with a computer mouse may help us to recognise the 'picking up' and 'letting go' actions of a new VR system, even though this requires a button press on the hand-controller rather than the grasping action we use in the real world; and even though we cannot feel the object that we are 'holding' in our avatar hand. All this is generally consistent with Slater's PI, in that the valid actions the VR supports, and the SMCs they give rise to, are fundamental to a sense of presence.
Typical Valid Actions in VR include head movement, tracked by the HMD to give rise to the visual flow we are used to in the real world. Similarly, a swinging movement with the hand-held controller can, in an appropriate VR interface, give rise to visual SMCs typical of swinging a bat towards an approaching ball, with auditory and haptic feedback signifying the resulting impact. Even intentional inaction (holding the virtual bat still to allow a ball to bounce on it) can constitute a Valid Action.
The importance of Valid Actions and SMCs finds some echoes in the research behind the three instruments, and is also widely reflected in the literature generally. Klevjer (2011) argues that, in a first person shooter, 'your relationship with the navigable camera [and other controls] has hi-jacked your way of being in the world as a body'. Steuer (1992) relates Presence to 'distal attribution or externalization' where, for example, vibration of the controller in the hand is experienced as ball hitting bat. Or, as Klevjer puts it, "a car, in the moment of driving, can become ... a prosthetic extension of your own body. You ... start inhabiting the world as a unit of driver-and-vehicle, a new kind of being, an avatar." Biocca et al. (2002)'s work investigating inter-sensory integration has demonstrated that it is possible to use visual cues (but not audio cues) to induce the illusion of haptic feedback. However, when it comes to presence, it was found that although visual feedback cues linked to motor actions did not increase presence, audio feedback cues of the same nature did increase certain aspects of presence (i.e., spatial presence).

Haptics and presence
Many studies have found that the inclusion of haptic feedback significantly improves task performance (IJsselsteijn et al., 2000; Sallnäs, 1999). Regarding presence, however, the results are less conclusive. For instance, Kaul et al. (2017) found that participants reported higher levels of presence when they received vibration feedback on their head. In the study by Viciana-Abad et al. (2010), participants played the 'Simon says' memory game either in midair (without haptics) or on a table top (with passive haptics), using either a stylus or their bare hand. Although passive haptics improved task performance, it significantly increased presence scores only when using a stylus, not with bare hands. In a more recent study, Kreimeier et al. (2019) compared vibrotactile feedback, force feedback and visual-only feedback; their results were again inconclusive regarding presence. However, there was some evidence that force feedback, compared to vibrotactile and visual only, improves performance in the throwing and stacking tasks. Most of these studies use lab-made or existing haptic devices that are not part of the current consumer market to generate haptic feedback. Brasen et al. (2018) looked at haptic vibrotactile feedback generated by one of the mainstream VR controllers (i.e. HTC VIVE). In their study, each participant performed two tasks (dialing and stirring) twice, once with vibrotactile feedback and once without. Participants reported that they found the dialing task easier, more immersive, and more real with the vibrotactile feedback than without. The stirring experience produced similar results (better performance, more immersive, more real). However, this result relies on perceived performance rather than actual performance. Similarly, their measure of immersion was based on a single question in which participants had to indicate directly which method was "more immersive".
Fig. 1 (caption, partly recovered). (b) The substitute interface: the VR system interposed between us and our environment. (c) Our perceptual system tells us that we are 'in' the Virtual Environment depicted through this substitute interface. (Adapted, with permission, from an image by Samuel H Kenyon, https://www.science20.com/eye_brainstorm/enactive_interface_perception_and_affordances-84602).

Haptics and funnelling in VR
Our object is to explore the contribution that different perceptual modalities in VR make to the player's sense of presence in the VE. The use of visual and auditory feedback is well established in current mainstream VR systems, but haptics are still relatively undeveloped. Vision is generally accepted as the human's primary mode of engagement with our environment (Colavita, 1974;Dennett, 1993). In addition, visual and audio technology are highly developed for a wide range of purposes from entertainment (TV, film, CGI etc) through general IT and AI to specialisms such as teleoperation, robotics and drones. All this provides a natural resource for designing sophisticated audio-visual feedback, which remains the core technology for VR. By comparison, haptic feedback in current mainstream systems is limited to vibrating motors in hand-held controllers: vibration can vary in intensity, frequency and duration, with some scope for symbolic patterns. The range of haptic experience on offer is extremely narrow, compared with the rich array of pressure, texture, vibration, temperature and pain that we encounter daily across our whole body surface, as well as internally. A wider range of experience is available with haptic gloves, vests and whole-body suits (eg Lindeman et al. (2004); Pacchierotti et al. (2017)); and with external devices such as air vortex (Sodhi et al., 2013) and tactile drones (Knierim et al., 2017). However, these are not current commercial systems.
Funnelling is a technique whereby haptic stimuli applied to the hands are experienced as originating somewhere between the hands (Miyazaki et al., 2010). Normally, with a hand-held controller such as Oculus Touch or HTC Vive, a player can experience virtual objects as vibrating in the hand, or (with brief bursts) as tapping or impacting against the palm. This can be extended, by distal attribution, to impacts on virtual hand-held objects, such as a ball striking a bat. Using the technique of funnelling (Lee et al., 2013), more sophisticated experiences can be offered, with the illusion of an impact somewhere between the two hands. In the real world, this illusion requires a physical bridge between the hands (eg a stick or ruler), so that the impact seems to occur on the bridging object (Miyazaki et al., 2010). However, Lee et al. (2012, 2013) showed that it can be achieved if a virtual object spans virtual hands visually presented through VR: with no physical object connecting their two hands, participants experienced simultaneous stimuli on their fingertips as a single sensation occurring on the virtual ruler spanning their virtual fingers. In a recent paper, Berger et al. (2018) applied the principle of funnelling to compare different levels of sophistication for haptic feedback in a VR system. Participants were visually presented with avatar hands they could move by means of hand-held controllers. A virtual stick was seen to be resting across the two virtual hands, and haptic feedback from the controllers gave the sensation as of a small ball impacting on the stick.
In four experimental conditions, participants received (a) no haptic feedback; (b) Central haptic feedback: equal stimulation at each hand, giving the impression of impact in the centre of the stick; (c) Varied haptic feedback: unequal stimulation at the two hands, so that impacts appeared to be at different locations along the stick; and (d) the same Varied haptic feedback, with the addition of a visual representation of the ball touching the stick. These conditions comprised a single Independent Variable, seen as offering progressively more sophisticated haptic feedback, against which they measured participants' sense of presence in the VR. Rather surprisingly, they found a significant drop in presence for condition (c); they reported this as an "uncanny valley of haptics", due to enhancing haptic feedback without corresponding visual feedback.

Hypotheses
The present study was designed to be comparable with Berger et al.'s experiment (Berger et al., 2018), with additional conditions to cast further light on their surprising key result. The idea of an "uncanny valley" arose from the representation of their data as a linear graph, mapping presence against four experimental conditions: (1) "no feedback"; (2) "haptic generic", with haptic feedback representing impacts in a single location at the centre of the stick; (3) "haptic spatialised", with haptic feedback representing impacts in varied locations on the stick; and (4) "haptic plus visual spatialised", with the addition of visual feedback to the haptics of condition 3. The authors represented these experimental conditions as contributing to a single independent variable (IV), which they characterised as "increasing sophistication of haptics".
Our initial interest was to find whether the unexpected drop in presence between haptic generic and haptic spatialised would also appear between generic and spatialised conditions with only visual feedback; also, whether visual only feedback would give rise to a greater or lesser sense of presence than haptic only. To achieve this, we separated their single "sophistication" IV into two IVs: Modality (with values of Haptic only, Visual only, and Haptic plus Visual); and Location (with values of Central (their "generic") and Varied (their "spatialised")). This 3x2 experimental design gave rise to 6 conditions, numbered 2-7, as illustrated in Fig. 3. To these we added condition 1 (not illustrated), with neither Haptic nor Visual feedback and therefore, in effect, neither Central nor Varied Location (their "no feedback"). With the additional conditions, our approach provides a more systematic structure for exploring how haptic and visual feedback, individually and collectively, interact with different aspects of presence. Condition 1 was taken as a baseline measure for both IVs: responses in all other conditions were calculated relative to this baseline (see Section 4).
Our hypotheses were:
H1. Events occurring in varied locations will give rise to a greater sense of presence than in a single location.
H2. Bimodal perceptual feedback (haptic plus visual) will give rise to a greater sense of presence than either single modality alone.
H3. Visual feedback alone will give rise to a greater sense of presence than haptic feedback alone.
These hypotheses were framed to provide a context for Berger et al.'s "Uncanny valley of haptics" which appears to contradict Hypothesis 1. Hypothesis 3 is based on the accepted understanding that vision is our primary mode of perception (see Section 2.4).

Participants
A total of 26 participants were recruited from the students and local community of the university. Of these, three withdrew because of technical issues. Participants classified themselves by age-range, gender and experience with IT, digital games and VR, as shown in Table 2. Of the 23 who actually took part, almost all (20) were under 35. They included 15 women, 7 men and one non-binary. All were regular IT users, and a majority (19) played digital games monthly or more. Most (17) had experienced VR "hardly ever" or "never". The non-gamers were also among those with little or no VR experience.

Study environment
A VR experience comparable to that of Berger et al. (2018) was developed in the Unity environment, as illustrated in Fig. 2. The experience was delivered through an Oculus Rift S headset, with Oculus Touch handsets to control avatar hand movement, and to deliver haptic feedback.
In each condition, a small ball dropped on to the stick and bounced back off it. This was programmed to approximate to motion under gravity, but modified to ensure that the ball always hit the stick at the desired location. Thus, the action was realistic so long as the stick was held still, but could otherwise appear distorted. In visual conditions, the ball was rendered visually, but rendering was disabled for non-visual conditions. In haptic conditions, the Oculus Touch hand controller vibrated briefly to give the impression of an impact in the hand. Haptic feedback in the Touch controller is supplied by a vibrating motor, so a single, clean impact cannot be delivered; rather, we looked for a burst of vibration short enough to feel like an impact, but long enough to be clearly detectable. In preliminary trials, we found that, at the maximum amplitude of 255, a duration of 80 ms gave a good approximation; this was also consistent with Berger et al.'s experiment.
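To make the idea of funnelling concrete, the following is a minimal sketch of how a single impact might be split between the two controllers. The maximum amplitude (255) and pulse duration (80 ms) are taken from the text; the linear split between the hands, the function name `funnel_amplitudes` and the 0-to-1 position scale are illustrative assumptions, not a description of the software used in the study.

```python
MAX_AMPLITUDE = 255  # maximum controller amplitude reported in the text
PULSE_MS = 80        # pulse duration reported in the text


def funnel_amplitudes(position, max_amplitude=MAX_AMPLITUDE):
    """Return (left, right) vibration amplitudes for one impact.

    position: 0.0 = at the left hand, 0.5 = centre of the stick,
              1.0 = at the right hand (hypothetical scale).
    Assumption: amplitude is shared linearly between the hands, so the
    hand nearer the impact vibrates more strongly.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0.0 and 1.0")
    left = round(max_amplitude * (1.0 - position))
    right = round(max_amplitude * position)
    return left, right


if __name__ == "__main__":
    for pos in (0.0, 0.25, 0.5, 0.75, 1.0):
        left, right = funnel_amplitudes(pos)
        print(f"position {pos:.2f}: left={left}, right={right}, {PULSE_MS} ms")
```

Under this assumed scheme, equal amplitudes correspond to a central impact and unequal amplitudes shift the apparent impact towards the more strongly stimulated hand.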
To trigger this activity, the participant had to align the stick with a "cloud", as shown in Fig. 2(c). The "cloud" then disappeared and, for the remainder of each condition, the participant had only to hold the stick in position. In any condition, ten groups of ball bounces were programmed, together lasting about 30 seconds. Within a group, the ball bounced five times at the same location. In Central location conditions, all groups were located at the centre of the virtual stick; in Varied location conditions, groups were randomised between the five positions shown in Fig. 3, visiting each location twice per condition.
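The bounce schedule described above can be sketched as follows. The counts (ten groups, five bounces per group, five positions each visited twice in Varied conditions) come from the text; the function name, the position labels and the use of Python's `random` module are illustrative assumptions.

```python
import random

# Position labels are hypothetical names for the five locations on the stick.
POSITIONS = ["left end", "quarter", "centre", "three-quarter", "right end"]
GROUPS_PER_CONDITION = 10
BOUNCES_PER_GROUP = 5


def bounce_schedule(varied, rng=None):
    """Return the list of bounce locations for one condition.

    varied=False: every group is at the centre of the stick.
    varied=True:  each of the five positions is used for exactly two
                  groups, in random order.
    """
    if rng is None:
        rng = random.Random()
    if varied:
        groups = POSITIONS * (GROUPS_PER_CONDITION // len(POSITIONS))
        rng.shuffle(groups)
    else:
        groups = ["centre"] * GROUPS_PER_CONDITION
    # Within a group, the ball bounces repeatedly at the same location.
    return [pos for pos in groups for _ in range(BOUNCES_PER_GROUP)]


if __name__ == "__main__":
    print(bounce_schedule(varied=True, rng=random.Random(1))[::BOUNCES_PER_GROUP])
```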

Study conditions
The experiment used a 3x2 within-participants design with independent variables Modality and Location. This gives six conditions (numbered 2-7) as illustrated in Fig. 3. Levels of the Modality IV comprised different combinations of haptic and visual feedback received by the participants, as shown in the diagram. Levels of the Location IV related to how sophisticated the feedback was: "Central", where the ball was programmed to bounce only on the centre of the stick; and "Varied", where groups of impacts varied among five positions on the stick: central, left or right end, or at the quarter or three-quarter position. To these six, a "No Feedback" condition (Condition 1) was added, providing a baseline against which the other conditions can be measured. Participants experienced the seven conditions in counterbalanced order to avoid carry-over effects. A sequence of four 7x7 Latin Squares was generated, and the first 23 rows used for counterbalancing.
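The counterbalancing step can be illustrated with a short sketch. The text does not say how the Latin Squares were constructed, so the method below (a cyclic square with shuffled rows, columns and symbols) is an assumption, as are the function names and the fixed seed.

```python
import random


def random_latin_square(n, rng):
    """Build an n x n Latin square from a cyclic square.

    Assumption: rows, columns and symbols of the cyclic square are
    shuffled; the source does not state its construction method.
    """
    rows = list(range(n))
    cols = list(range(n))
    symbols = list(range(n))
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(symbols)
    return [[symbols[(r + c) % n] for c in cols] for r in rows]


def counterbalanced_orders(n_conditions=7, n_squares=4, n_participants=23, seed=0):
    """Stack several Latin squares and keep one row per participant."""
    rng = random.Random(seed)
    orders = []
    for _ in range(n_squares):
        orders.extend(random_latin_square(n_conditions, rng))
    # Conditions are numbered 1-7 in the text, so shift from 0-based.
    return [[c + 1 for c in row] for row in orders[:n_participants]]


if __name__ == "__main__":
    for participant, order in enumerate(counterbalanced_orders(), start=1):
        print(participant, order)
```

Because 23 is not a multiple of 7, the last square is used only in part, so position-by-condition balance is approximate rather than exact across the sample.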

Questionnaire
The effect of the different conditions was measured through a questionnaire in terms of the participants' sense of presence in the VR, and of their experience of the ball-bounce events as depicted visually and/or haptically. The questionnaire was adapted from that used by Berger et al. (2018) for their haptic-centred study, so that our study would be as nearly comparable with theirs as possible. Their questionnaire, in turn, derived from a subset of the Igroup Presence Questionnaire (Schubert et al., 2001), with additional questions relating to the specific action of their study environment.
With adapted wording to allow for the inclusion of additional visual conditions, and to provide consistent responses across all items, our resulting questionnaire comprised a series of 14 statements which the participants rated on a seven-point Likert scale from "Disagree completely" (coded 1) to "Agree completely" (coded 7). Responses were collected through a VR scene following each condition scene, and stored as comma-separated numerical values (CSV) for analysis. Using a questionnaire scene avoided the need for participants to remove the headset between conditions: a potential break in immersion which might take time to rebuild each time.

Main and supplementary questions
Responses to the ten numbered questions in Table 3 form the basis of this study. Four supplementary questions were included for validation only: the first two confirmed that participants could discriminate between Central and Varied location conditions; the remaining two confirmed that they experienced Visual and Haptic feedback in bimodal conditions as depicting a single event in a single location.

Study results
Data from the participant responses were compiled into a single dataset with one row per participant per condition (161 rows in total). As discussed in Section 3.4, Condition 1 (No Feedback) provides a baseline against which responses in other conditions are measured. Responses to three questionnaire items-1, 4 and 8-with negative sense were reversed to bring them in line with the questionnaire as a whole.
For each participant $p$, the response $R_{p,q,1}$ to a given questionnaire item $q$ in Condition 1 was subtracted from each of that participant's responses $R_{p,q,c}$ to that item in the other six conditions $c$. This gave a set of Relative Responses $RR_{p,q,c}$, each relative to the baseline $R_{p,q,1}$ for that item and participant, as follows:

$$RR_{p,q,c} = R_{p,q,c} - R_{p,q,1}, \qquad c \in \{2, \dots, 7\}$$

All Condition 1 (baseline) responses were then removed from the dataset, as they were not to be included in analysis.
The Relative Responses fell in the range -6 to +6, with the baseline at 0. These were normalised by mapping them on to the range 0 to 1, with the baseline at 0.5. Analysis was carried out on this normalised dataset.
Row mean values were calculated to give the mean normalised response from all questionnaire items for a participant in a single condition; this is taken as a measure of that participant's sense of presence for that condition. Analysis of Variance was carried out using the SPSS statistical package, V23. Principal Component Analysis was carried out using scikit-learn in Python.
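The scoring pipeline described in this section (reverse-coding, subtraction of the baseline, normalisation and row means) can be sketched as below. The reversed items (1, 4 and 8), the 1-7 scale and the mapping of -6..+6 on to 0..1 come from the text; the function names and the dictionary-based data layout are illustrative assumptions.

```python
REVERSED_ITEMS = {1, 4, 8}   # negatively worded items, as stated in the text
SCALE_MIN, SCALE_MAX = 1, 7  # seven-point Likert coding


def reverse_code(item, response):
    """Flip negatively worded items so that 7 always means 'more presence'."""
    if item in REVERSED_ITEMS:
        return SCALE_MIN + SCALE_MAX - response
    return response


def normalised_presence(baseline, condition):
    """Mean normalised response for one participant in one condition.

    baseline, condition: dicts mapping item number -> raw response (1-7)
    for Condition 1 and for the condition being scored (hypothetical layout).
    """
    span = SCALE_MAX - SCALE_MIN  # relative responses lie in -6..+6
    scores = []
    for item, raw in condition.items():
        relative = reverse_code(item, raw) - reverse_code(item, baseline[item])
        scores.append((relative + span) / (2 * span))  # -6..+6 -> 0..1
    return sum(scores) / len(scores)


if __name__ == "__main__":
    base = {1: 5, 2: 3, 3: 4}
    cond = {1: 3, 2: 5, 3: 4}
    print(normalised_presence(base, cond))
```

A participant whose responses match their own baseline scores exactly 0.5, which is the dashed reference line referred to in the descriptive statistics.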

Descriptive statistics
In the dataset, each row was coded for Modality (IV1) and Location (IV2). Thus, each row for Condition 2 was coded {Haptic, Central}, Condition 3 {Visual, Central} etc. Fig. 4 shows mean responses plotted against the two independent variables, using the GGraph facility in SPSS. In this box plot, the dashed line shows the normalised baseline against which all responses were measured. The order of Modality levels on the x-axis, and of groupings within the x-variable (Location) is designed to show the effect of increasing the sophistication of feedback. Thus, Central→Varied Location and Haptic→Visual→Both Modalities each represents an expected trend of increasing sophistication. The arbitrary placing of Haptic before Visual along the x-axis reflects the expectation of Hypothesis 3 (Section 3) that Visual would contribute more to user experience. The overall trend of the plot indicates that engaging both modalities offers a greater sense of presence than either one alone. For each Modality condition, it appears that Varied Location also enhances presence compared with Central. However, the plot also suggests that Haptic alone offers a better experience than Visual alone, contrary to expectation. Finally, we note that the Haptic Spatialised condition appears to give greater presence than Haptic Central, contrary to Berger et al.'s reported result. This surprising result may be due to differences in methodology to accommodate the addition of visual conditions, as discussed in Section 5.2.

Analysis
Two-way repeated measures ANOVA was carried out in SPSS. For the 3 levels of Modality, and for the interaction between Modality and Location, Mauchly's Test for sphericity was included: in cases of nonsphericity, adjustments are required to correct for increased Type 1 error rate. Sphericity was not violated in either case (Mauchly p = 0.12 and 0.28 respectively), so no correction was needed (Field et al., 2012).
Effect sizes were calculated using partial η² (η²p). The first section of

Location
ANOVA shows a main effect for Location (F(1, 22) = 9.98; p = 0.005), with a large effect size of 0.31. Thus, we can say that the sense of presence arising with Varied locations is substantially greater than with Central only.
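As a consistency check, partial η² can be recovered from a reported F ratio and its degrees of freedom, using the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below assumes that the effect sizes quoted in this section are partial η², as stated in the Analysis section; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values as reported in the text.
location_effect = partial_eta_squared(9.98, 1, 22)    # Location: F(1, 22) = 9.98
modality_effect = partial_eta_squared(13.67, 2, 44)   # Modality: F(2, 44) = 13.67

print(round(location_effect, 2), round(modality_effect, 2))
```

Both values round to the effect sizes reported for Location (0.31) and for Modality (0.38).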

Perceptual modalities
ANOVA shows a main effect for Modality (F(2, 44) = 13.67; p < 0.001). A large effect size of 0.38 is seen. Pairwise comparison (Table 5, Main Analysis) strongly supports an effect between Haptic and Both modalities (p = 0.001), and between Visual and Both (p < 0.001); this confirms the suggestion in the box plot that feedback in two modalities is better than one. The more surprising indication, that Haptic feedback alone gives rise to a greater sense of presence than Visual alone, falls short of statistical significance (p = 0.069), although this might merit further study.

Supplementary Questionnaire Items for validation:
It seemed as if different events occurred at different locations between the virtual hands
Within each group of 5, events seemed to occur at a single location
I could locate where the events seemed to occur on the stick
The appearance of the ball and the feeling of impact seemed to belong together
* Responses to questionnaire items that suggest negative Presence were reversed before normalisation.
Note that the Principal Component loadings shown here will be addressed in Section 4.3 below.

Interaction effect
There is no evidence of an interaction effect between Location and Modality (F(2, 44) = 0.30; p = 0.75). It seems that the two are not dependent on one another.

Principal component analysis
PCA was carried out using Scikit Learn in Python, to explore possible factors contributing to the results obtained. Sampling adequacy for this analysis was confirmed by the Kaiser-Meyer-Olkin measure (Kaiser, 1970): MSA = 0.77, which Kaiser regards as "middling". Bartlett's test of sphericity (p < 0.001) indicated that the correlations between questionnaire items were large enough for PCA to be appropriate. Whilst this analysis is somewhat limited by the small sample size, and an apparent imbalance in the questionnaire, it does suggest some interesting avenues for further research.
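For readers unfamiliar with these two adequacy checks, the sketch below shows how the overall Kaiser-Meyer-Olkin measure and the Bartlett statistic are commonly computed from an item correlation matrix. It is an illustration only: the data are synthetic, the function names are hypothetical, and nothing here reproduces the tooling or the values used in the study.

```python
import numpy as np


def bartlett_sphericity(data):
    """Return (chi_square, degrees_of_freedom) for Bartlett's test of sphericity.

    The null hypothesis is that the item correlation matrix is an identity matrix.
    """
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, log_det = np.linalg.slogdet(corr)
    chi_square = -(n - 1 - (2 * p + 5) / 6) * log_det
    dof = p * (p - 1) // 2
    return chi_square, dof


def kmo_overall(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy (MSA)."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations between each pair of items, controlling for the rest.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    q2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + q2)


# Synthetic data: 100 rows, 6 items sharing one common factor (illustrative only).
rng = np.random.default_rng(0)
common = rng.normal(size=(100, 1))
example_data = common + 0.8 * rng.normal(size=(100, 6))

chi_square, dof = bartlett_sphericity(example_data)
msa = kmo_overall(example_data)
print(chi_square, dof, msa)
```

If SciPy is available, a p-value for the Bartlett statistic can be obtained from the upper tail of the chi-square distribution, for example with `scipy.stats.chi2.sf(chi_square, dof)`.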
In the PCA, three components had eigenvalues over Kaiser's criterion of 1 (Kaiser, 1960), together explaining 72% of the total variance. As all three components gave negative loadings for the highest contributing questionnaire items, we have used the inverse of each PC for ease of interpretation. Items contributing most to each Principal Component are listed in Table 6. Table 3 shows the PC loadings to be applied to each item for further analysis, with the highest contributors flagged in bold italic.
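Kaiser's criterion and the sign inversion described above can be illustrated with a short sketch. It assumes that the components are taken from the item correlation matrix, which is the setting in which an eigenvalue threshold of 1 is usually applied; the text does not state whether the items were standardised. The data and function names are hypothetical.

```python
import numpy as np


def kaiser_components(data):
    """Return eigenvalues > 1, their eigenvectors, and the share of variance they explain."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)  # returned in ascending order
    order = np.argsort(eigenvalues)[::-1]             # re-sort, largest first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    keep = eigenvalues > 1.0                          # Kaiser's criterion
    explained = eigenvalues[keep].sum() / eigenvalues.sum()
    return eigenvalues[keep], eigenvectors[:, keep], explained


def invert_if_negative(loadings):
    """Flip the sign of any component whose largest-magnitude loading is negative."""
    flipped = loadings.copy()
    for j in range(flipped.shape[1]):
        top_item = np.argmax(np.abs(flipped[:, j]))
        if flipped[top_item, j] < 0:
            flipped[:, j] = -flipped[:, j]
    return flipped


# Synthetic data: 120 rows, 8 items driven by two latent factors (illustrative only).
rng = np.random.default_rng(1)
factors = rng.normal(size=(120, 2))
mixing = rng.normal(size=(2, 8))
example_data = factors @ mixing + 0.5 * rng.normal(size=(120, 8))

kept_values, kept_vectors, explained_share = kaiser_components(example_data)
oriented = invert_if_negative(kept_vectors)
print(len(kept_values), explained_share)
```

Flipping the sign of a component changes neither the variance it explains nor the relative size of its loadings, so the inversion affects only the direction in which the component is read.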
Items 10 and 9 contributed most to the variance represented by PC1, with item 10 contributing 76%. The variance represented by PC3 also came mostly from the same two items, with item 9 contributing 61%. Together, these two items contribute 35% of the total variance found in this data.
The variance represented by PC2, by contrast, is fairly evenly spread across the questionnaire. Items 1, 3, 8 and 9 contribute a little more than the others, with the highest from item 9 at only 23%.
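One common way to express how much each item contributes to a component is the squared loading as a share of the sum of squared loadings. The sketch below assumes that this is the sense in which the percentages above are meant, which the text does not confirm; the loadings are hypothetical.

```python
def contributions(loadings):
    """Percentage contribution of each item: squared loading over the sum of squares."""
    total = sum(w * w for w in loadings)
    return [100 * w * w / total for w in loadings]


# Hypothetical loadings for three items on one component (not the study's values).
example_loadings = [0.0, 0.6, -0.8]
example_contributions = contributions(example_loadings)
print(example_contributions)
```

Because the loading is squared, an item with a large negative loading still counts as a large contributor; the sign matters only when the component is interpreted.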

Interpretation of principal components
Of the three PCs, both the first and third components loaded mostly on items 9 and 10, which relate to the impact of ball on stick. The second loaded fairly evenly across all items.
Item 10 contributes three-quarters of the variance in the first component, with item 9 adding nearly a third of the remainder. The same two items dominate component 3, with item 9 contributing almost two-thirds, and nearly half of the remainder coming from item 10. Although both relate to impact events, it seems that there is an underlying distinction. Item 9 concerns participants' awareness of the ball's action against the stick, whilst item 10 reflects awareness of the location of that action on the stick. We suggest, therefore, that PC1 represents a participant's ability to discern and locate ball-bounce events ('Locate Events').
PC3 is harder to interpret, particularly as this largely accounts for residual variance after PCs 1 and 2. Its main contributor is item 9 ('impacts happened'), with a substantial negative contribution from item 10 ('where impacts happened'); we can consider this latter contribution as reflecting uncertainty about impact location. Thus, we suggest that PC3 represents participants' ability to discern impacts that they cannot locate ('Discern Events'). The level of this composite factor would be increased when participants are most aware of the ball's action against the stick; and also increased when they are most uncertain about its location.
PC2 was the only component that loaded to any extent on items other than 9 and 10; even here, item 9 was still the highest contributor, albeit by a small margin. In fact, if we combine the contributions of items to this component by their IPQ categories (Spatial Presence, Involvement and Realness), these categories scored fairly evenly at around 30% each. We suggest, therefore, that PC2 reflects participants' general sense of presence (General Presence), in all its aspects.

ANOVA on principal components
To investigate the effects of our IVs on the PCs, two-way repeated measures ANOVA with pairwise comparison of variable levels was carried out on mean normalised responses, after loading for each inverted PC in turn. The results from these analyses are given in the relevant sections of Tables 4 and 5.
Box plots of these loaded responses are shown in Fig. 5. Because none of the ANOVAs showed any interaction effect (see below), separate plots are given for Modality and Location, for each PC.
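The loading step can be sketched as below. This is one plausible reading, assumed here for illustration: each item's normalised response is multiplied by the corresponding loading of the inverted component, and the row mean of the weighted values is then taken. The text does not give the exact computation, and all names and numbers are hypothetical.

```python
def invert(loadings):
    """Invert a principal component by negating every loading."""
    return [-w for w in loadings]


def loaded_row_mean(normalised_responses, loadings):
    """Row mean of the normalised item responses after weighting by PC loadings."""
    if len(normalised_responses) != len(loadings):
        raise ValueError("one loading is needed per questionnaire item")
    weighted = [w * x for w, x in zip(loadings, normalised_responses)]
    return sum(weighted) / len(weighted)


# Hypothetical three-item example (not the study's loadings or responses).
raw_loadings = [-0.2, -0.4, -0.8]
responses = [0.5, 0.75, 1.0]
example_score = loaded_row_mean(responses, invert(raw_loadings))
print(example_score)
```

Under this reading, an item with a negative loading on the inverted component lowers the score as its response rises, which is how a single component can combine awareness of one thing with uncertainty about another.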
PC1: Locate Events. With PC1 weightings, we can see evidence of a main effect for Modality (corrected F(1.39, 30.57) = 9.50; p = 0.002). There is no evidence of a main effect for Location, nor of any Interaction effect. Pairwise comparisons confirm significant differences between each single modality and Both (Haptic∼Both p = 0.002; Visual∼Both p = 0.013); and between Haptic and Visual alone (p = 0.045). The mean differences shown in the Pairwise Comparisons indicate that Both modalities score more highly than Visual only, which in turn is higher than Haptic only. This is reflected in the box plot for PC1 Modalities (Fig. 5a). Fig. 5d confirms that there is little difference in PC1 between Central and Varied Locations. Thus we conclude that participants found it hardest to locate impacts with only Haptic feedback, somewhat easier with only Visual feedback, and easiest of all when Both Modalities were in play. The smaller positive contribution from item 9 is consistent with the idea that if the impact is hard to discern, then locating it will also be difficult. This ability to locate events was unaffected by whether impacts were all in Central or in Varied Locations.
PC2: General Presence. With PC2 weightings, we can again see a main effect for Modality (corrected F(1.66, 36.53) = 17.22; p < 0.001); also for Location (F(1, 22) = 9.602; p = 0.005). Again, ANOVA offers no evidence of an Interaction effect. Pairwise comparisons support significant differences between all three pairs: Haptic∼Visual (p = 0.008); Haptic∼Both (p = 0.004); Visual∼Both (p < 0.001). The Modalities box plot (Fig. 5b), and the mean differences shown in the pairwise table, indicate that PC2 is greatest for Both modalities, lowest for Visual only, with Haptic only between the two. Fig. 5e shows that PC2 is greater for Varied Location than for Central. We conclude that different Modality conditions had a substantial effect on General Presence: with Haptic only giving rise to a greater sense of presence than Visual only, and Both Modalities greater than either alone. This was the only Principal Component for which Location also showed a significant effect, with Varied Location offering a greater sense of presence than Central. Both these results are very similar to our main analysis of the data without taking account of principal components; although in that analysis, the difference between Haptic only and Visual only fell short of statistical significance, whilst here it is clearly significant.
PC3: Discern Events (with uncertainty of location). With PC3 weightings, ANOVA again offers a main effect only for Modality (corrected F(1.35, 29.69) = 23.43; p < 0.001), with no evidence for Location or interaction effects. Pairwise comparisons again confirm significant differences between all three pairs: Haptic∼Visual (p < 0.001); Haptic∼Both (p = 0.002); Visual∼Both (p < 0.001). The box plot (Fig. 5c) and mean differences show that PC3 is greatest for Haptic only, lowest for Visual only, with Both modalities lying between. The Location box plot (Fig. 5f) again confirms very little difference in PC3 between Central and Varied Locations.
Since awareness of impacts is the greater contributor, we conclude that participants were most aware of the ball's action against the stick in the Haptic only condition; at the same time, as we have seen from PC1, uncertainty about location was also greatest for this condition, which would further increase the level of PC3. By the same argument, Visual only feedback appears to give a lower awareness of impact, along with less uncertainty about location, which leads to the lowest levels of PC3 in this condition. When Both Modalities are active, participants benefit from a high awareness of impact through the Haptic feedback, and from low uncertainty about location through the Visual feedback; since these affect PC3 in opposite directions, we find that Both Modalities falls between the two individual modalities in this component.
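The "corrected" degrees of freedom quoted for the Modality effect in the three PC analyses are what one would expect when a sphericity correction multiplies both degrees of freedom by a common factor ε. The sketch below recovers the implied ε from the reported values, assuming uncorrected degrees of freedom of (2, 44) as in the main analysis. The text does not say which correction (for example Greenhouse-Geisser or Huynh-Feldt) was applied, so no particular method is implied.

```python
def corrected_df(epsilon, df_effect, df_error):
    """Apply a sphericity correction factor to both degrees of freedom."""
    return epsilon * df_effect, epsilon * df_error


# Corrected degrees of freedom for Modality, as reported in the text.
reported = {"PC1": (1.39, 30.57), "PC2": (1.66, 36.53), "PC3": (1.35, 29.69)}

implied_epsilon = {}
for name, (df_effect, df_error) in reported.items():
    # The same factor should be recovered from either degree of freedom.
    from_effect = df_effect / 2
    from_error = df_error / 44
    implied_epsilon[name] = (from_effect, from_error)
    print(name, round(from_effect, 3), round(from_error, 3))
```

For each component the two estimates of ε agree to within rounding, so the reported degrees of freedom are internally consistent.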

Haptic vs visual feedback
Our main experimental results show clearly that users have a greater sense of presence when the sophistication of feedback is increased (from Central to Varied, Hypothesis H1). Receiving both Haptic and Visual feedback also clearly enhances presence compared with either modality alone (Hypothesis H2). These effects appear to be independent of each other, and are very much what might be expected. However, we note that Berger et al.'s "Uncanny valley", which would be contrary to H1, was not replicated. This will be considered in Section 5.2.
More surprising is the suggestion that the sense of presence may be greater with Haptic only feedback than with Visual only (counter to Hypothesis H3). Although not significant in the main analysis, this suggestion is supported by a significant increase in PC2 (which we have interpreted as reflecting Presence generally) for Haptic vs Visual feedback. Since vision is accepted as the primary source of human perception (Colavita, 1974; Dennett, 1993), it was reasonable to suppose that this modality would make the greater contribution.
On the holistic view of perception, we might better understand these two conditions as "minus Visual" and "minus Haptic": both equally unrealistic, compared with the real world, for events of this kind. It is hard to envisage a real-world situation where the stick and hands are visible, but the source of impact can't be seen. Unlike the coffee-machine-next-door (Section 2), we can't account for the discrepancy by visual occlusion. Equally, if we can see a ball hitting a stick touching our hands, we'd expect to feel the impact.
On the other hand, for most of our participants, "real world" SMCs are not the only SMCs of which they have skillful mastery. Regular engagement in interactive digital games would give rise to their own set of SMCs for players to master. Few players will be unfamiliar with the wide range of audio-visual only games that have dominated the market for many years. In suspending disbelief in relation to the absence of haptics and other sensory modalities, so as to master these games, players will have added new sets of audio-visual SMCs to their repertoire. With more recent games, audio-visual-haptic SMCs will also be added. Each gamer or non-gamer will bring to the VR a very different range of mastered SMCs, some including haptics, others not: all this may well influence what seems "real" to them, and how they respond to the inclusion or exclusion of a particular modality in the VR.
Reduced perceptual coherence might also contribute to this result. Limitations in the algorithm for the ball's visible movement (almost, but not quite, motion under gravity) might subliminally interfere with participants' making sense of the events; even though this feedback is consistent, participants might need more time to master these new SMCs fully. In the same way, haptic representation of the ball's hitting the stick as brief vibrations rather than as single impacts, whilst not consciously noticed, might have a similar effect. Further study is needed to explore this.

Relation to Berger et al. (2018)'s study
One of our motivations was to better understand the "uncanny valley" that Berger et al. reported, by broadening the scope to include comparison between vibrotactile, visual and bimodal feedback. However, our study did not replicate the "uncanny valley".
As with Berger et al.'s study, the overall trend of Presence vs sophistication (whether of Location or Modality) in our study was upward.
The key difference was that their "Haptic/Spatialised" condition gave a lower sense of presence than their "Haptic/Generic"; in our corresponding conditions, Haptic/Varied (Condition 5) gave a greater sense of presence than Haptic/Central (Condition 2). A number of factors may have contributed to this. Whilst we aimed to replicate the earlier study, there were methodological differences. The addition of three visual or visual/haptic conditions may have affected participants' overall experience of the experiment. Although conditions were counterbalanced, many participants experienced some form of visual feedback before the two haptic only conditions, which might have influenced their expectation of later conditions. Changes in the questionnaire to accommodate the use of both visual and haptic feedback may also have had an effect. The 3×2 factorial design also required ANOVA rather than their simpler set of pairwise comparisons. Other differences in methodology are less likely to have had an effect: participants seated rather than standing, handling of the trigger mechanism, or visual depiction of ball motion rather than simple appearance and disappearance on the stick. It appears that Berger et al.'s pairwise comparisons were based on their first Principal Component, although this is not entirely clear. Our PCA gave very different results: as participant numbers in both studies were low for PCA, we should place more confidence in the main analysis, although our PC2 closely reflected the main results. Again, there may have been differences in the study population, with their participants drawn from a corporate environment, as against the university context of ours. It is important to note that our participants were predominantly female (17 out of 23), which is different from most VR studies, where females are often underrepresented (Peck et al., 2020).
Finally, our questionnaire scenes were designed to avoid breaking immersion by the removal of the headset between conditions. However, it may be that the nature of these questionnaire scenes introduced a different kind of confounding. This is something that we should seek to review in future experiments.
Our results have widened our understanding of the relationship between perceptual feedback and presence. By separating Berger et al.'s single IV into perceptual modality and location of impact (each an aspect of the complexity of feedback), we have shown that the effects of these two IVs are not related. By introducing conditions with visual and bimodal feedback to match those of haptic feedback, we have also seen that visual only feedback may not, contrary to expectation, give rise to a greater sense of presence than haptic only; although it is clear that both together consistently enhance presence more than either alone.

Discernment and location of impacts
Analysis of PC1, interpreted as participants' ability to Locate Events, shows that participants found this easier with Visual feedback than with Haptic, and easiest of all with Both. Given the dominance of vision in human perception, this is unsurprising. It seems likely that in a comparable real-world situation, we should find it more difficult to locate impacts of a real ball on a real stick with eyes closed than with eyes open, although we have not found any empirical study on this.
Our results on participants' ability to Discern Events, even when they are unclear about their location, are confused by the composite nature of PC3. It is reasonable to conclude that this discernment is greater in Haptic only conditions than in Visual only. We suggest that discernment may be greatest with Both Modalities, but that the composite PC3 is pulled down because uncertainty of location is lowest for these conditions. Further study will be required to clarify this.

Conclusions and future study
Of our three hypotheses, our results support H1 and H2, so we conclude that more sophisticated feedback, both in terms of Varied vs Central Location, and of bimodal vs unimodal feedback, does lead to a greater sense of presence, as expected.
More interestingly, Hypothesis H3 (that Presence is greater with Visual only than with Haptic only feedback) is not supported. Our results, while not conclusive, suggest on the contrary that Haptic only feedback contributes more to Presence than Visual only. Further study with a larger population may provide confirmation or otherwise of this suggestion. If confirmed, then a between-subjects exploration with a better balanced questionnaire, focused on Modality alone, would be of interest: to clarify the contribution of participants' range of real and virtual perceptual experience, and the relative plausibility of technology-generated visual and haptic feedback.
Our exploration of principal components suggests that Haptic and Visual feedback also have different effects on participants' ability to discern the ball's action on the stick, or to identify where on the stick this happens. Unsurprisingly, Visual feedback offers more accurate location of the action, but it seems that that action is more readily discerned with Haptic feedback. However, the latter effect is unclear because it appears mainly through PC3, which only accounts for about one tenth of total variance; also because it is closely associated with uncertainty about location. Again, further study with a larger population would allow a more robust PCA and might offer a clearer picture.
In our study, we focused particularly on the vibrotactile form of haptic feedback as this technology is now routinely available with current consumer VR devices. This should allow our results, and those of related research, to be applied directly to the design of current VR experiences. For example, we offer a better understanding of the often subtle relationship between presence and different perceptual modalities. This could inform decisions of when and how to design for increased presence, or to sacrifice presence for different aspects of user experience. Developers may, for example, choose to subvert the accustomed SMCs of the real world as part of their game mechanics.