The Use of Virtual Reality in the Study of People's Responses to Violent Incidents

This paper reviews experimental methods for the study of the responses of people to violence in digital media, and in particular considers the issues of internal validity and ecological validity or generalisability of results to events in the real world. Experimental methods typically involve a significant level of abstraction from reality, with participants required to carry out tasks that are far removed from violence in real life, and hence their ecological validity is questionable. On the other hand studies based on field data, while having ecological validity, cannot control multiple confounding variables that may have an impact on observed results, so that their internal validity is questionable. It is argued that immersive virtual reality may provide a unification of these two approaches. Since people tend to respond realistically to situations and events that occur in virtual reality, and since virtual reality simulations can be completely controlled for experimental purposes, studies of responses to violence within virtual reality are likely to have both ecological and internal validity. This depends on a property that we call ‘plausibility’ – including the fidelity of the depicted situation with prior knowledge and expectations. We illustrate this with data from a previously published experiment, a virtual reprise of Stanley Milgram's 1960s obedience experiment, and also with pilot data from a new study being developed that looks at bystander responses to violent incidents.

INTRODUCTION
There has been a long-standing methodological debate in social psychology contrasting laboratory-based experiments with observational data based on studies of natural occurrences in the field (Anderson and Bushman, 1997; Anderson et al., 1999). This is especially the case in the domain of the study of violence and aggression, the focus of this paper, where particular laboratory procedures for the investigation of aggression remain a matter of controversy with arguments on both sides, for example (Ferguson et al., 2008; Giancola and Parrott, 2008; Ferguson and Rueda, 2009). It has been argued that while lab-based experimental studies can have internal validity, they are typically not generalisable to 'real life'; they are not 'ecologically valid' (Schmuckler, 2001). Studies of events in the field are ecologically valid (they are based on events that have actually happened) but are likely to have low internal validity, with results based mainly on correlations from which it is difficult to extract causal relationships because of the lack of control of confounding variables. It has been argued before (Loomis et al., 1999; Blascovich et al., 2002) that immersive virtual reality offers a possible way out of this problem for psychology and social psychology since there is evidence that people tend to respond realistically to virtual simulations of real-life events, but on the other hand, the portrayed situation is completely under the control of a computer program that can be designed to present scenarios conforming to a laboratory controlled experiment. Hence both internal and ecological validity may be possible using this technology: an experimental setting that mirrors a real-life situation, but with, for example, the experimental design manipulating particular factors of interest in exposures of different groups of subjects to the experience.
Here we argue that immersive virtual reality is especially interesting for the study of how people respond to violent incidents where a perpetrator attacks a victim. We are interested in the circumstances under which bystanders are likely to intervene in order to prevent harm to the victim. Immersive virtual reality provides an ecologically valid setting in which to study this issue while at the same time removing the problem of physical danger, and overcoming the many ethical issues involved in the study of violence.
In the rest of this paper we first briefly review some of the literature relating to responses to violence in desktop-based systems such as video games, before going on to describe what we mean by immersive virtual reality, and how this is profoundly different with respect to engaging people in realistic responses to virtual situations. Next we review one experiment where participants inflict violence on virtual characters in immersive virtual reality, a reprise of the Stanley Milgram obedience experiments. We then apply these ideas to new research that is using virtual reality to study bystander responses to violent incidents, and we describe some qualitative results from ongoing pilot studies. Finally we discuss recommendations for the use of virtual reality in the study of violence. Our experiments discussed in this paper have been approved by the UCL Research Ethics Committee.

VIOLENT BEHAVIOUR IN VIDEO GAMES
The vast majority of research on violence in the context of digital media has been concerned with the extent to which the engagement with violence in video games might increase the chance of people engaging in aggressive acts in real life. This type of research has a very long history going back to the 1950s when there was concern about the effect of television on children (Himmelweit et al., 1958), with studies in particular concentrating on the possibility that exposure to violence on TV causes violent crime; for example, Messner (1986) found this hypothesis not to be supported. In recent years the preoccupation with the impact of violent content in TV has shifted to the effects of playing violent video games, where results appear to suggest that there is an effect: that exposure increases physiological arousal and propensity to aggression (Anderson and Bushman, 2001) and that high trait aggressive individuals are more prone to be affected than low trait aggressive individuals (Bushman, 1995; Anderson et al., 2008). Anderson et al. (2003) go so far as to say that the evidence is 'unequivocal' that exposure to violent media positively impacts aggressiveness amongst youth. However, some of these results have been called into question as the product of publication selection bias (Ferguson, 2007a, 2009). Moreover, Ferguson et al. (2008) provide evidence against the ecological validity of one of the standardized tests used in the laboratory, the Taylor Competitive Reaction Time Test (TCRTT), where the experimental subject enters into a reaction-time competition with an opponent, setting an electric shock level that the opponent would receive upon losing a particular round (and likewise receiving shocks from the opponent), the opponent in fact being a confederate or computer program. (There are variations of this test using sound blasts rather than electric shocks.)
Their results suggest that there is no statistically significant correlation between scores of aggression obtained from the TCRTT and trait aggressiveness, and that the measurements obtained from the test do not clearly measure the degree of aggression exhibited in the experiment.
Whatever the validity of such laboratory-based tests, it has to be admitted that such procedures are very far from events in real life. It is argued that such tests can be thought of as abstractions from the complexities of real life, and that when employed in theory- and hypothesis-guided confirmatory studies they can provide us with valuable insight into some of the mechanisms and relationships involved in the production of aggressive behaviour. However, they still fall far short of providing any confidence that people would actually act in the predicted way when confronted with events that they know to be naturally occurring rather than in the constraints of a laboratory (with all the known, given, and explicitly stated controls that pertain therein). One alternative to this paradigm is to assess the thoughts and feelings evoked in people during stressful experiences, for example, the Articulated Thoughts During Simulated Situations paradigm, where subjects are encouraged to voice their thoughts and feelings in real-time during such events (Davison et al., 1983). This has been employed, for example, in comparing the different beliefs and cognitive biases of men who are violent or non-violent towards their spouse (Davison et al., 1983), and in studying dating violence amongst teenagers (Rayburn et al., 2007). Other approaches that rely on people's verbal reports look at how they say they would behave when confronted with a real situation, for example (Laner et al., 2001), and a questionnaire approach to how a rape prevention program influenced prosocial behaviour in the context of bystander intervention is exemplified by (Banyard, 2008).
Methodological approaches for the exploration of violent behaviour include on the one hand experimental paradigms that abstract away from real-world issues to focus on highly specific features of an artificial situation, and on the other, methods based on verbal reports and questionnaires in response to descriptions or video viewing of a situation; see also (Levine, 2003) for a review of different methodologies. We believe that each of these exhibits a kind of 'reality gap': the lab-based experiments may suffer from the problem of ecological validity, and the verbal report methods cannot take into account the fact that people may not act in the way that they say they would when actually confronted by a real situation. In a study relating to bystander intervention, Levine et al. try to bridge the reality gap by placing experimental subjects into real-world scenarios that closely relate to the bystander situation. In this case a confederate wearing (or not) a particular football team shirt faked an injury, and the experimental subjects, all recruited from fans of that team, were observed with respect to their degree of intervention to help the victim (Levine et al., 2005). This does not show directly how people would behave in a situation regarding violence, but does help to elucidate factors involved in promoting prosocial behaviour, such as in-group and out-group identification.

IMMERSIVE VIRTUAL REALITY AND THE REPRISE OF MILGRAM'S OBEDIENCE EXPERIMENT
An immersive virtual reality (IVR) system is functionally and ideally one that displays life-sized simulated environments consistently in all sensory modalities, that completely surround the participant, and where the displays are a function of real-time body tracking, in particular head-tracking. There are various types of system, the most common being head-tracked head-mounted displays and stereo surrounding projection systems generically referred to as Caves (Cruz-Neira et al., 1992). An overview is given in Sanchez-Vives and Slater (2005), with a conceptual review of virtual environments presented in Ellis (1991). A critical aspect that we take as part of the very definition of an IVR is that such a system affords the possibility of perception through sensorimotor contingencies (SCs) (Noë, 2004) that approximate reality. In other words participants can use their body to perceive in much the same way as in physical reality: moving their eyes closer to a (virtual) object in order to see it more closely, moving their head to see past an obstacle, bending their whole body down in order to look underneath something, reaching out with their whole body and hands to grab something, and so on. This unifies both the tracking and display capabilities of the system: realistic SCs require, for example, natural visual processes such as automatic head-turns in response to events in peripheral vision (thus peripheral vision must be enabled, requiring wide field-of-view displays), high visual resolution (otherwise looking closely at an object has severe limits), generalized haptic feedback (so that a collision with any part of the body can be felt) and so on. In practice, systems in existence today offer only crude approximations of natural SCs; perhaps only flight training simulators come anywhere close, typically being mixed-reality systems that deploy virtual displays within physically accurate cockpit settings.

Rovira et al. Violent incidents in virtual reality
We have argued elsewhere that when perception can be achieved through approximations to natural SCs participants can experience Place Illusion (PI), the illusion of being in the place depicted by the IVR (Slater, 2009). This illusion of 'being there' in the virtual scenario is most often referred to in the literature as presence (Held and Durlach, 1992; Sheridan, 1992; Barfield et al., 1995; Ellis, 1996; Sheridan, 1996; Slater and Wilbur, 1997; Draper et al., 1998), but increasingly this term has come to be over-interpreted with many different meanings, including, for example, presence as a result of watching a movie (Hu and Bartneck, 2008) or even using an iPhone (Bracken and Pettey, 2007). By PI we mean strictly the strong illusion of being in the place depicted by the IVR system, a place where you can use your body to perceive as if it were a real place.
PI refers to a static aspect of the response to virtual reality: it endows the experience with a place-like sensation, but the environment itself could be completely uneventful. Plausibility (Psi) refers instead to the dynamics within a virtual environment, to the unfolding events. A Psi Illusion occurs when the events that are happening within the virtual environment are taken as real. This does not mean that participants believe that the events are real, but that they find themselves exhibiting automatic behaviours and responses as if the events were real. For example, a human-looking virtual character talks to and smiles at a participant who in turn talks to and smiles back at the character, knowing for sure that in reality there is no one there. Psi is a more difficult illusion to engineer than PI. We believe that it requires the following features implemented within the IVR. First, it should be correlational: actions of the participant result in correlated reactions within the virtual world. For example, the participant stares at a virtual character who as a result stares back, or the participant walks through a crowd of virtual characters who move away in order to allow a path through for the participant. Second, there should be self-reference, meaning that there should be aspects of the environment that contingently refer directly to the participant. For example, a character spontaneously speaks to or otherwise engages with the participant in a way that unambiguously signals the presence of the participant to the character. Third, and the most difficult to achieve, is credibility: when the scenario depicts events that could happen in physical reality, they should unfold according to the knowledge and prior expectations of the participant. This credibility aspect typically requires a great deal of detailed domain knowledge on the part of the scenario designer.
It is our hypothesis that when PI and Psi operate together then participants will respond realistically to virtual events and situations: PI locates the participant within the virtual space, Psi is the illusion that what is happening there is real. Automatically, in spite of cognitive knowledge that nothing real is happening, people do find themselves responding realistically to the IVR experience. This relates to the issue of construct validity in psychology research (Cronbach and Meehl, 1955), which is concerned with the extent to which a measure of a trait (e.g., 'aggression') really does measure what it is supposed to measure. However, in our approach to date the specific problem of construct validity does not arise, since we are not attempting to measure a trait but rather to observe how people do respond within virtual reality when they become a witness of a violent attack by one (virtual) person on another. The important issue is the extent to which the responses of people are generalisable to how they might behave when confronted with a similar situation in reality. Our research rests on the theoretical framework discussed briefly above, which suggests that such realistic behaviour is likely when there is PI and Psi, and we have given some pointers as to how these may be achieved. There is also a lot of evidence that people do respond realistically in virtual reality, even to the extent that virtual reality has been successfully used in psychotherapy; for a review see (Rizzo and Kim, 2005). There are also several examples of this discussed in (Sanchez-Vives and Slater, 2005; Slater, 2009).
We consider an example now in more detail to illustrate these points in the context of violent behaviour: the virtual reprise of the Stanley Milgram obedience experiments (Slater et al., 2006). Milgram's series of experiments, carried out mainly in the 1960s (the whole collection is described in depth in Milgram (1974)), examined the conditions under which an authority figure might persuade ordinary members of the public to carry out actions that would harm a stranger. The scenario is very well known, and nearly 50 years after the original experiments there is still substantial reference to the original work which, in spite of being highly controversial, has had a major impact on social psychology. For example, Benjamin and Simpson (2009) and Blass (2009) are two of the six papers in the January 2009 issue of the American Psychologist journal devoted to Milgram's experiments. The basic paradigm was a supposed word-pair learning experiment, where the learner was given an electric shock of increasing voltage each time he chose the wrong word as being paired with a cue word. The learner was a confederate, and the experimental subject administered the shocks. The question was how high a voltage the subject would administer to the learner, especially given the increasing and vociferous protests from the learner that he wanted to stop the experiment. In the basic condition 60% of subjects gave the maximum shock of 450 V. Due to the ethical outcry that followed publication of these results these experiments have not been replicated, until a recent partial replication (Burger, 2009; Miller, 2009).
In 2006 we carried out one of Milgram's conditions using an IVR system. The purpose was not to explore obedience as such, but rather to use the paradigm to explore the extent to which people would exhibit signs of realistic response, in particular stress at giving the shocks to a virtual character. The experiment was carried out in a Cave-like system, a Trimension Reactor, which has three walls and the floor as projection screens that deliver a real-time surrounding stereo image to the participant as seen through stereo shutter glasses. The participants also wore a head-tracker so that the visual displays were updated as a function of head movement and head-gaze direction. Figure 1A shows the setup of the environment. The learner was the virtual woman who would appear as if behind a glass partition in front of the participant, who would be seated by the table on which there was located an 'electric shock' machine. The experimenter would sit to the right of the participant, and answer any questions during the course of the procedure. The participant's view is shown in Figure 1B. It is important to realise that these pictures cannot convey the vital role of stereo vision and head-tracking, by which the participant would perceive the environment through close to natural sensorimotor contingencies. Moreover, this was a mixed-reality setup, since the chairs, the desk, the shock machine, and the real experimenter were physically present, and the virtual scenario was blended together with this. The cue word and four possible responses were shown on the wall in front of the participant as can be seen in Figure 1. For each of the 32 trials there was one cue word (the topmost one) and four possible responses shown underneath. The correct response was the one in capital letters. The participant read out all five words to the learner, and she then responded with one of the four words.
If the answer was the wrong one then the participant was instructed to advance the voltage on the shock machine by one unit, and press a button on the machine to administer a shock. In response to the shocks the learner would show signs of becoming increasingly uncomfortable eventually demanding that the experiment be stopped. There were 23 participants in this 'visible condition' who saw and interacted with the virtual learner throughout and 11 in a 'hidden condition' who saw the virtual character for a short introduction, and then interacted with her only through text.
Regarding the Psi components, both the correlational and self-reference aspects were satisfied. Throughout the visible condition there was action on the part of the participant that met with a response from the virtual learner: she responded to each trial by providing the word answer, and she responded with pain and complaints to the pressing of the shock button. The learner also made unsolicited comments to the participant. For example, after the experimenter reminded the participant that if the learner does not answer the question then this should be taken as an incorrect answer, she addressed the participant directly saying: "Don't listen to him, I want to stop now!" and there were several other such interventions.
Of the 23 participants in the visible condition 6 withdrew before completing the experiment, but none withdrew in the hidden condition. The results showed that those in the visible group became more physiologically aroused and showed greater stress than those in the hidden group, as shown by analysis of skin conductance, heart rate and heart rate variability. Moreover, when the learner did not answer on the last two trials, participants in the visible group waited significantly longer before administering the required shock than those in the hidden group, providing further evidence that at some level they were treating the situation as real.
One question asked of the participants after their experience was 'How much did you want to stop on a scale of 1-10 where 1 means you had no thoughts at all about stopping and 10 means you really desperately wanted to stop?' This is important since although only 6 out of 23 actually did stop, more than half said in answer to another question that they had wanted to stop, but typically did not do so since they kept reminding themselves that it was only virtual reality and an experiment. Hence their feelings about stopping could be a useful indicator of their actual state of mind that was not reflected in all of their observed behaviour. Another important subjective variable was based on the Autonomic Perceptions Questionnaire (APQ) (Mandler et al., 1958), where participants assessed their own physiological state by marking their degree of agreement with 24 statements on a continuous scale ('trembling or shaking', 'face becoming hot', 'perspiration', and so on). Higher scores represent greater awareness of such states, and the overall APQ score is the difference between the mean score on the questionnaire completed immediately after the experience and the mean score on the one completed beforehand. We found that there is a positive correlation between the wanting to stop question and the APQ score (r = 0.49, P < 0.02), suggesting that perceived greater physiological discomfort was one of the factors that led people to want to stop. Participants also answered a standard questionnaire with three components: (PI) five questions relating to the sense of being there; (situation) six questions about their responding realistically to the situation (e.g., 'How much did you behave within the training room as if the situation were real?'); and (virtual learner) six questions about how much they felt that they responded realistically to the virtual learner (e.g., 'How much did you behave as if the character were real?'). These were all scored on a Likert 1-7 scale, where 7 was the most affirmative answer.
We found that there was no significant correlation between the mean of the answers to the PI questions or the situation questions and the 'wanting to stop' score. However, there was a significant positive correlation with the realistic responses to the virtual learner questions (r = 0.47, P < 0.025), a result mostly accounted for by two questions: 'How much did you find yourself automatically behaving as if the character were real?' (r = 0.53, P < 0.01), and 'How much was your emotional response to the character as if she were real?' (r = 0.46, P < 0.03). There is no significant correlation between the APQ and mean response to virtual learner scores (r = 0.27, P = 0.21). The results suggest that when the participants were aware of physiological responses that were appropriate to the stressful situation (e.g., 'Increases in intensity of heartbeat', 'Bodily reactions becoming bothersome') and when they found themselves automatically behaving towards the character as if she were real, including realistic emotional reactions, this made the situation unpleasant enough that they wanted to stop. Each of these factors relates to what we have called Psi: internal feelings that are appropriate to the situation, and automatic responses as if real to the virtual character.
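The APQ change score and the correlations reported above follow standard formulas. As a minimal sketch (not the authors' analysis code), the following Python computes a per-participant APQ score as the mean rating after the experience minus the mean rating beforehand, and a Pearson correlation between two lists of scores. The function names and the example ratings are hypothetical and chosen only for illustration; no data from the experiment are reproduced here.

```python
from math import sqrt
from statistics import mean


def apq_change(before, after):
    """APQ score: mean rating after the experience minus mean rating before."""
    return mean(after) - mean(before)


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings for one participant, shortened to 4 statements
# (the APQ described in the text has 24 statements).
before = [1.0, 2.0, 1.5, 1.0]
after = [3.0, 4.0, 2.5, 3.5]
print("APQ change:", apq_change(before, after))

# Hypothetical 'wanting to stop' scores (1-10) and APQ changes for 5 participants.
want_to_stop = [2, 5, 7, 3, 9]
apq_scores = [0.1, 0.8, 1.9, 0.4, 1.5]
print("r =", round(pearson_r(want_to_stop, apq_scores), 2))
```

A significance level for r would additionally require the sample size and a t or permutation test, which is omitted here to keep the sketch dependency-free.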

BYSTANDER RESPONSES IN IMMERSIVE VIRTUAL REALITY
Using the Milgram paradigm to explore people's responses to extreme situations in IVR was the precursor to our current stream of research, which is concerned with an exploration of bystander behaviour in violent emergencies. The study of bystander behaviour arose out of the murder and rape of Kitty Genovese in 1964 while apparently 38 bystanders did nothing in response to her cries for help (Latané and Darley, 1969). The phenomenon of bystander non-intervention has been a subject of significant research in social psychology ever since. The issue is, unfortunately, still topical today: in October 2009 there was a similar case in Richmond, California, where apparently 20 bystanders did nothing during a violent rape that they witnessed. Of course it is very difficult to study such bystander behaviour experimentally. Levine et al. (2002) provide a review of earlier bystander literature, and a description of two related video-based experiments. In virtual reality participants can be placed in a situation in which a perpetrator violently attacks a victim in order to explore how they respond to this. In particular we concentrate on football-associated violence, with manipulation of in-group and out-group affiliations.
The scenario we are developing, at the time of writing in its pilot phase, involves the participant entering a virtual reality depiction of a bar. A character (V) approaches the participant and engages him or her in a conversation about football (Figure 2A). In the in-group situation the character V wears an Arsenal football shirt and particularly discusses the Arsenal team, and in the out-group situation V wears a neutral shirt and talks generally about football. All of the participants would be Arsenal fans. After this brief conversation a second male character (P) who had been sitting alone by the bar suddenly stands up and moves towards V (Figure 2B). These are the first few lines of the ensuing conversation:
Over the 2 min and 20 s of the scenario the assault by P(erpetrator) becomes increasingly threatening, in language (with significant shouting and swearing), and aggressive gestures (Figure 2C), until finally the perpetrator begins to violently push V(ictim) against the wall (Figure 2D). In terms of body size, gestures and also voice tone, the overall demeanour of P is threatening and aggressive, and that of V is submissive and wanting to avoid trouble. However, whatever answer is given by V, P uses this to escalate the argument to a more dangerous level. It is important to note that from the point of view of the volunteer the virtual characters are life-sized (Figure 3), displayed in 3D stereo, with movements based on motion capture from real people, and voices that are recordings from actors. Since the volunteers are head-tracked the characters can be programmed to look them in the eye.
Our fundamental question concerns the extent to which the participant, an unrelated bystander, would attempt to intervene, and how this propensity to intervention might vary with his in-group or out-group relation with the victim as determined by whether or not the victim is an Arsenal fan. This in-group/out-group classification utilises the football shirt mechanism of (Levine et al., 2005).
To date, as we have been developing the scenario, we have been carrying out informal trials with volunteers, who enter into the environment and then are interviewed afterwards. They are told beforehand that this is not a formal study but rather a contribution to our emerging experimental design, and they are also warned about its realistic and violent content. These volunteers, recruited by word of mouth from around the University, are typically not Arsenal fans, and in many cases are not even football fans; thus the in-group/out-group factor has not been explored. Our purpose at this stage is to get some idea of the types of response that may be expected, and also where technically we need to improve the scenario itself.
Two actors provided the voices of the characters; their movements were captured using a Vicon motion capture system and applied to the virtual characters. Overall the gestures and movements of the characters are therefore quite realistic, though with some anomalies such as the lack of any detailed hand movements (the hands are not motion captured), no eye blinks, and no lip sync, so that when the characters talk their mouths do not move (however, their gestures clearly indicate who is talking at any moment). The scenario was controlled by an operator unseen by the participants who, during the initial conversation between V and the participant, could select the utterance and its timing from a palette of pre-recorded phrases. Normally the conversation would follow a set pattern, but sometimes participants would say something unexpected, and a set of general responses (such as 'very interesting') could be triggered by the operator.
The scenario is rendered in the Trimension Reactor system depicting a bar, and the participant is free to move around the space, with head-tracking enabled thus supporting almost natural sensorimotor contingencies for visual perception. The initial conversation supports some degree of Psi: the character talks to the participant in a seemingly ad lib conversation. For example, on entry into the bar the conversation (for one participant, X) started as follows:
V: You all right mate?
X: Oh, hello, yes.
V: Good. Where you from?
X: Uh, Kent originally.
V: You're Arsenal yeh?
X: Yeh yeh sure.
V: Get you!
X: [laughing]
V: What do you think of the team last year?
X: Well they got better as the season progressed.
V: Totally agree with you. When did you last go and see a match?
X: Um well, I'm on the waiting list for a season ticket, but I went to see a pre-season friendly last year.
V: Come on mate, really?
X: Yeh.
V: Who's your favourite player?
…
However, once the confrontation between P and V begins there is no further intervention that the participant can actually make that would have any effect. During the pilot experiments we have manipulated one factor, which is whether or not V ever looks towards the participant during the course of the argument. When glances are enabled, V looks towards the participant five times, each time for 1 s. Our hypothesis is that glances towards the participant will enhance the probability of the Psi illusion.
To date 25 volunteers have experienced the scenario, 13 of them with the glances activated and the remainder not. Although in fact there is nothing that the participants can do to change the course of the argument, they do not know this, and so an attempt at intervention is certainly possible. We have taken as signalling an 'intervention' a statement towards the virtual characters by the participant, a physical attempt to intervene by reaching out as if to touch one of the characters, or moving their body directly into the field-of-view of the characters. Also from the pilot studies we have realised that non-intervention may be the realistic response for some volunteers, since they explain that in a similar situation in real life they would not have intervened, and that they had the same thoughts in response to witnessing the simulation.
Of the 11 out of 25 who did intervene, 7 experienced the gaze condition and 4 did not. Three who did not intervene but said that they would not have intervened in reality were all in the non-gaze condition. The remaining 11 who did not intervene were almost equally divided between the two conditions. The verbal interventions that occurred were as follows, each statement made to the perpetrator:
• "Calm down mate, there is no problem here".
• "What's wrong with Arsenal?"
• "Come on mate, we were just talking about football". He also put his hand out trying to reach the perpetrator a couple of times.
• "Leave him alone", "Relax", and tried to reach him.
• "I don't think he was looking at you" and he tried to reach the victim.
• "I am looking at you now".
• "Relax".
• "Guys, there's no point to fight" and "Calm down".
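As a rough check on the counts above, the split of interveners by condition (7 of 13 with glances, 4 of 12 without) can be arranged as a 2×2 table and examined with Fisher's exact test. The sketch below is illustrative only: the choice of test, the function name fisher_exact_two_sided and the table layout are assumptions made for this example and are not part of the pilot analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d      # participants per condition
    col1 = a + c                   # total number who intervened
    total = row1 + row2

    def weight(k):
        # Number of ways to place k interveners in row 1 given the margins.
        return comb(row1, k) * comb(row2, col1 - k)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    observed = weight(a)
    # Integer weights avoid floating-point ties when comparing tables.
    extreme = sum(weight(k) for k in range(k_min, k_max + 1)
                  if weight(k) <= observed)
    return extreme / comb(total, col1)


# Counts taken from the pilot data reported above (hypothetical layout).
gaze_yes, gaze_no = 7, 13 - 7          # glances activated
nogaze_yes, nogaze_no = 4, 12 - 4      # glances not activated

p_value = fisher_exact_two_sided(gaze_yes, gaze_no, nogaze_yes, nogaze_no)
print(f"intervention rate with glances:    {gaze_yes / 13:.2f}")
print(f"intervention rate without glances: {nogaze_yes / 12:.2f}")
print(f"two-sided Fisher exact p-value:    {p_value:.3f}")
```

With only 25 volunteers a test of this kind has little power, so the output is best read as a description of the pilot counts rather than as evidence for or against the gaze hypothesis.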
General statements about their responses by the participants in the interview after the experience included:
• The guy was overreacting, if it was a real situation I might have done more, I would have stopped it.
• I had the same feeling about them as the feeling I had in a similar experience in the real life. I thought that they were acting stupidly.
• I did not feel anxious, but it made me feel I had to intervene, I should say something.
• First seconds of the conversation I was quite shocked.
• I recoiled from both of them, I wanted to get away.
• I had no feeling at all, but at the end, when the aggressor started acting wildly, I could feel my body temperature rising and the heartbeat rate slightly increasing.
• I felt a bit uncomfortable. It was an intense clash between two people that does not make much sense to me.
• I was feeling uncomfortable, not very pleasant being there.
• I could feel my hands sweating.
• I knew it was not real, so I did not want to intervene.
• I felt a bit uncomfortable, I did not want to be there.
• I felt a natural feeling that I wanted to do something.
• I was quite scared that the aggressor would have turned around and looked at me. I felt like stepping into the discussion.
• I was wondering if it was to involve me. I was feeling sorry for the guy with the red T-shirt. I thought I would have actually intervened (to test the system). I moved closer to the character to get into his field-of-view. I felt quite uncomfortable.
• During the confrontation, I was trying to get involved, but there was a detachment when I saw no interaction from them. From this point I felt more as a spectator.
• Put hand out a couple of times, trying to reach P.
• I had this strong feeling that I had to intervene. I noticed that I was moving as if I was between the two and I had to step back. More people around would have made me less likely to intervene, because I do not want to embarrass myself.
• I had the feeling that I wanted to do something, step in.
• I felt this kind of paralysis when you are aware that something is about to happen, and you should do something, as in real life.
• I was a kind of scared, I did not know what to do. I was thinking about whether to say something, but I was not sure if I could interact. I would have said something to defend the victim, like "he was not looking at you". I had the feeling that I could not interact with them, like I was watching a movie.
• I was the third party in there, but I was ignored.
• I stepped back, as I would do in real life.
• I felt anxious. I was more concerned about my own safety than for victim.
We noticed in very early pilot experiments that participants invariably suddenly started to look around at some moment, and when questioned they said that they were looking to see if someone else was around in the scenario. Ten of the 25 volunteers did look around, and we asked them about this: Did you look around to look for other people? Their answers included:
• Yes, when the confrontation starts, looking for an exit, to find somebody else to talk to, to break off from the Arsenal guy because it seemed it would escalate violently.
• Yes, frequently, I was scared about the possibility that more people would come and escalate it.
• Yes, I was just exploring.
• Both for help and somebody who would have engaged in the discussion.
• I looked around looking for the barman a few times.
• Looked for other people to try to stop it.
• I glanced around to see if other characters would be introduced to see if somebody else would step in, whether to escalate or deescalate.
• I looked around looking for somebody who might escalate the confrontation.
The item most commonly cited as reducing the overall credibility of the scenario was the lack of interaction with the participant during the confrontation (7 participants). Five stated that the dialogue itself was not realistic. Ten drew attention to the lack of lip synch, eight to the lack of realism of the hand movements, and five to the lack of eye blinking; other comments were made by individual participants.
There are two fundamental conclusions from this set of pilot trials. The first is that, in spite of the technical issues (e.g., the lack of lip synch), a number of people did become involved in the scenario in a realistic way: they spontaneously made remarks (mainly to the perpetrator) that were clear signs of intervention. Many who did not intervene reported feelings and thoughts about intervention, or about their personal safety in that situation. The second major conclusion is that people are less likely to intervene if they know (from a technical point of view) that their intervention cannot achieve anything. This is a matter of Psi: their actions have no response. They move into the field-of-view of the characters, attempt to reach out and touch the characters, or even talk to them, and nothing happens. As one participant said, once this point is realised the game is lost: the volunteer becomes a spectator rather than a participant or potential bystander. We have observed in other experiments that PI can be temporarily broken (for example, by reaching out to touch an object and feeling nothing) but that it can quickly re-form once natural SCs continue to operate. However, once Psi is broken it typically does not form again: once Psi is lost, the events in the scenario are no longer personally applicable to the participant (it becomes more like a movie).

DISCUSSION
It is not straightforward to develop a convincing virtual reality scenario for situations as complex as the ones that we are tackling. The good news, however, is that many aspects of the simulation can be technically wrong, but people still tend to have a range of realistic responses. For example, in the virtual reprise of the Milgram experiment, no one could ever be fooled into believing that the virtual learner was real: she did not look like a realistic human, and did not behave like one. Nevertheless the physiological and emotional responses to the situation were strong. We believe that the most critical issues to get right are those that are concerned with what we have termed 'Plausibility' (Psi). The participants must realise that their actions can have appropriate responses in the virtual world, that they themselves are recognised as being in that world since events spontaneously are directed towards themselves, and finally the scenario itself has to be credible, one that fits with the beliefs, expectations and experiences of reality. PI is a necessary condition for realistic responses, but it is not sufficient. Moreover, much past research into presence has confounded these two quite distinct aspects of the experience: being there on the one hand, and the realness of what is happening there on the other. We maintain that the former is relatively easily attainable through providing a system that affords almost natural sensorimotor contingencies for perception, but the latter requires very careful design informed by knowledge of the domain being simulated.
As we argued in Slater et al. (2006), the gap between reality and virtual reality is what makes these experiments possible from an ethical point of view. If participants could not distinguish between reality and virtual reality, and if their responses were identical in the two cases, then we would have returned to some of the ethical problems raised by Milgram's original experiments. However, here participants know that they are operating within a simulation, and although there is deception (virtual reality necessarily deceives the senses, otherwise it could not work at all) it is a paradoxically explicit deception known to all involved. However, if the responses of people are not exactly as they would be in reality, are we not back to the problem of ecological validity? Can we really generalise from virtual reality experiments to the real world? We maintain that IVR-based experiments are likely

CONCLUSIONS
In this paper we have briefly reviewed some methodological approaches to the study of violent situations, and we have argued for a methodology that employs the presentation of simulated scenarios through IVR. Virtual reality has the power to transport people to another place, and give them the illusion that what is happening there is real. To the extent to which this can be realised, virtual reality offers the possibility of carrying out laboratory-based controlled studies that also have a greater degree of ecological validity than more traditional lab-based approaches, which tend to work within paradigms whose content is far removed from real situations. A major benefit of using virtual reality for these types of studies is that it is very easy to control and manipulate many different variables. For example, in our bar scene the perpetrator is currently quite large and looks dangerous, but it would be straightforward to make him look smaller and weaker. How would that affect the propensity to intervene? We can also manipulate the environment, by having virtual bystanders who behave in different ways under different experimental conditions. Moreover it should be noted that such environments (apart from being useful for studying bystander behaviour) may also be useful for rehabilitation, both of victims and of bystanders who become disturbed by their own behaviour in response to a real situation. Virtual reality has already been used, for example, in the case of post-traumatic stress disorders (Rothbaum et al., 2001).
Careful design of simulated environments, and implementations that give participants the belief that they can actually effect changes in the virtual world, and that spark physiological, emotional, behavioural and cognitive responses that are similar to what would occur in reality, present an interesting way forward in the study of extreme social situations.

ACKNOWLEDGMENTS
The research described in this paper is funded by the UK EPSRC Project EP/F032420/1 'Visual and Behavioural Fidelity of Virtual Humans with Applications to Bystander Intervention in Violent Emergencies'. Bernhard Spanlang's contribution is supported by the EU FET Project PRESENCCIA contract number 27731. We would like to thank Dr Mark Levine, University of Lancaster, for discussions that contributed to the design of the bystander study. We thank the referees for helpful comments and suggestions.
to have results with greater validity than thought experiments or watching and responding to videos that portray a scenario. In virtual reality a person can actually live through a scenario, and the types of thoughts and emotions that would be had in real life are likely to be generated (as we can see in some of the statements by volunteers reported above), even if participants do not act out their responses through overt behaviours. This means that people are more likely to be able to reflect in an informed way about how they might react in similar circumstances in reality, since the simulation in which they have participated with their whole bodies engaged surely results in an internal mental simulation of how their responses would be. We argue that this provides a methodology that is more likely to lead to generalisability than either carrying out lab-based actions that are very far removed from reality or basing inferences purely on what people say they might do in thought experiments or after video exposures to violent scenarios. Role-playing offers a methodological approach similar to virtual reality, but it is in fact more expensive to set up in the long run, and offers neither the flexibility nor the reproducibility of experiments based on virtual reality simulation.
It could be argued that these types of experiments are not ethical since they can cause stress to participants. However, we do not accept this argument. The participants are adults, who freely agree to participate in the study, and who are told that they are free to withdraw at any time, and even warned that they may experience stress. If they decide to continue in spite of experiencing stress that is their choice; they are under no obligation to continue. People voluntarily choose to engage in activities that are far more stressful than anything we have ever subjected them to in virtual reality: watching horror movies, doing dangerous sports, or even simply attending a football match might be a highly stressful activity. Our experimental participants are responsible for their own actions, and provided that they are not tricked or deceived into entering a situation that might cause them difficulties without forewarning, it is up to them to participate or not. Of course there are limits, and a major ethical consideration is to weigh the benefits of the research, in terms of knowledge gained, against any negative aspects of the experiment.
The other ethical issue is 'desensitisation': could participating in these types of experiment make participants more likely to engage in, or become indifferent to, aggressive acts? As we have seen, this question is an empirical one: does involvement in violent virtual scenarios result in greater aggressive behaviour in real life? We saw above that this is an issue much studied with respect to violent video games, and the jury is still out (see, for example, Ferguson (2007b)). Also one could argue equally well that having experienced a virtual reality scenario where you found yourself carrying out an act that causes stress and unpleasant feelings to a virtual character and ultimately to yourself, or where you were confronted by a