Immersive virtual reality in the treatment of auditory hallucinations: A PRISMA scoping review

Background: A large group of psychiatric patients suffer from auditory hallucinations (AH) despite relevant treatment regimens. In mental health populations, AH tend to be verbal (AVH) and their content critical or abusive. Trials employing immersive virtual reality (VR) to treat mental health disorders are emerging. Objective: The aim of this scoping review is to provide an overview of clinical trials utilizing VR in the treatment of AH and to document knowledge gaps in the literature. Methods: PubMed, Cochrane Library, and Embase were searched for studies reporting on the use of VR to target AH. Results: Sixteen papers were included in this PRISMA scoping review (ScR). In most studies, VR therapy (VRT) was employed to ameliorate treatment-resistant AVH in schizophrenia spectrum disorders. Only two studies included patients with a diagnosis of affective disorders. The VRT was carried out with the use of an avatar to represent the patient's most dominant voice. Discussion: The research field employing VR to treat AH is promising but still in its infancy. Results from larger randomized clinical trials are needed to establish substantial evidence of therapy effectiveness. Additionally, the knowledge base would benefit from more in-depth qualitative data exploring the views of patients and therapists.


Introduction
Auditory hallucinations (AH) are generally defined as a sound experienced by an awake individual in the absence of a corresponding external stimulus (Blom, 2015). AH can be present in a variety of disorders, including both psychiatric and somatic illnesses (Waters and Fernyhough, 2017). A majority of psychiatric patients who experience AH report that these are verbal in nature (McCarthy-Jones et al., 2014), commonly referred to as auditory verbal hallucinations (AVH) or as "hearing voices". AVH are estimated to be present in approximately 75 % of patients suffering from schizophrenia, with approximately 60 % experiencing negative and hostile content (Waters and Fernyhough, 2017). Some patients hear voices that command them to harm themselves or others, and the risk of patients complying with violent commands is a major concern for patients, relatives, psychiatry, and society in general (Birchwood et al., 2018). Additionally, the content of AVH impacts patients' emotional well-being and is a significant predictor of emotional distress as well as contact with mental health services (Beavan and Read, 2010).
Pharmacological therapies are commonly used to ameliorate AH, with antipsychotic medication being the preferred choice. There are, however, significant difficulties associated with the use of antipsychotic medication, especially over the course of illness in schizophrenia, with side effects being common (Rognoni et al., 2021) and related to poor quality of life (Katschnig, 2000). Additionally, around 30 % of patients with schizophrenia spectrum disorders continue to suffer from psychotic symptoms in spite of treatment with antipsychotic medication (Stępnicki et al., 2018). As a result, other therapeutic modalities, such as cognitive-behavioral therapy (CBT), have been advocated as a supplement or alternative to pharmacological treatments (Sommer et al., 2012). However, meta-analyses show that the efficacy of CBT is only of small to moderate magnitude, with even smaller effect sizes being reported at longer-term follow-up (Hazell et al., 2016; Van der Gaag et al., 2014).
Virtual Reality (VR) has been introduced in mental health research during the last decades as an ecologically valid instrument (Geraets et al., 2021). In immersive VR (referred to simply as VR in this paper), the subject wears a head-mounted display and encounters a virtual environment in 3D with sound delivered through headphones. Thus, VR can represent social environments that trigger responses, reactions, and emotions in the user equivalent to those triggered by our shared physical world (Freeman et al., 2016).
VR has been applied in the treatment of various psychological conditions and has proven efficacious in treating post-traumatic stress disorder (Buck et al., 2019), social anxiety disorder (Emmelkamp et al., 2020), and specific phobias (Freitas et al., 2021). In contrast to the abundance of studies on the use of VR in predominantly anxiety disorders, it has only recently been introduced as a treatment tool in psychotic disorders (Bisso et al., 2020).
Given the rapid development of new technologies and therapies, it is of interest to explore the current state of research on interventions using VR technology in the treatment of AH. Hence, the objective of this scoping review is to outline the literature on the use of VR targeting AH, with a focus on studies reporting on empirical data. We aim to provide an overview of the field while also identifying current knowledge gaps.

Protocol and registration
This scoping review was carried out using the guidelines outlined in the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) Extension for Scoping Reviews (PRISMA-ScR) (Tricco et al., 2018). The review protocol was entered in the OSF Registries, Registration DOI: 10.17605/OSF.IO/2SNZJ.

Search strategy and selection criteria
We conducted a comprehensive search across the electronic databases PubMed, Cochrane Library, and Embase, utilizing various search terms relating to VR and hallucinations/psychosis without any temporal limitations (refer to appendix A for the search query). The first search was performed on September 28th, 2022, with an update conducted on July 1st, 2023, to include newly published or registered trials.
The inclusion criteria of this review were:
• Articles focusing on the application of immersive VR in the treatment of AH
• Clinical trials, encompassing randomized trials, case studies, and study protocols
• Entries in clinicaltrials.gov, unless results have been disseminated elsewhere
• Articles published in English
Exclusion criteria:
• Non-empirical studies such as opinion articles
• Conference abstracts, primarily excluded due to the potential overlap with published papers and the variability of preliminary results versus final results

Publication selection
For screening and selecting the data, we used the online review tool Covidence (Covidence - Better Systematic Review Management, n.d.).
Titles and abstracts were independently reviewed by two authors and coded as to whether they met the inclusion criteria. Disagreements were resolved by discussion until consensus was reached or were decided by a senior author.

Results
A total of 4203 articles were imported for screening after the database search. Two studies were added after hand search (Beaudoin et al., 2021 and a trial entry to clinicaltrials.gov by Alexandre Dumais: NCT04054778), bringing the total to 4205 records, of which 1324 were duplicates and were removed. Of the remaining records, 2818 were excluded during screening. A total of 63 studies were full-text screened, and 47 were excluded as they did not fulfill the inclusion criteria (Fig. 1).
The 16 articles identified (including two study registrations at clinicaltrials.gov) all originated from either a high-income country (number of papers in parentheses): Canada (11), Switzerland (1), and Denmark (1), or a middle-income country: China (3). The earliest identified study was that of du Sert et al. (2018), which according to the trial registration was initiated in 2015. All included studies were published in 2018 or later. Of the 16 identified studies, nine reported on quantitative data, five on qualitative data, and two included both quantitative and qualitative data (see appendix B, data-charting form, for an overview). Eight studies are original trials (Dellazizzo et al., 2021; Dellazizzo et al., 2018a; du Sert et al., 2018; Liang et al., 2021, 2022; Smith et al., 2022; Dumais: NCT04054778; and Egger: NCT04099940), while the remaining eight papers are based on data (or additional follow-up data on patients) from earlier trials (Beaudoin et al., 2021; Dellazizzo et al., 2020; Dellazizzo et al., 2018b; Hudon et al., 2022; Hudon et al., 2023a, 2023b, 2023c; Liang et al., 2023).
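The record counts reported above can be reconciled with a short arithmetic check. The following is a minimal sketch in Python, assuming that duplicates were removed before title and abstract screening; the function name and dictionary keys are illustrative and do not come from the review or from Covidence.

```python
def prisma_flow(imported, hand_searched, duplicates,
                excluded_at_screening, excluded_at_full_text):
    """Reconcile PRISMA flow counts (hypothetical helper for illustration)."""
    identified = imported + hand_searched            # all records identified
    screened = identified - duplicates               # records left after de-duplication
    full_text = screened - excluded_at_screening     # records assessed in full text
    included = full_text - excluded_at_full_text     # records included in the review
    return {
        "identified": identified,
        "screened": screened,
        "full_text": full_text,
        "included": included,
    }


# Counts as reported in the Results section.
flow = prisma_flow(imported=4203, hand_searched=2, duplicates=1324,
                   excluded_at_screening=2818, excluded_at_full_text=47)
print(flow)
```

Under this assumption the reported figures are internally consistent: 63 records reach full-text assessment and 16 are included.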

Patient populations
All studies recruited adult participants (≥18 years), and the majority (14 out of 16 studies) focused on patients with schizophrenia spectrum disorders experiencing AVH that were treatment resistant to some degree, i.e., patients heard voices in spite of treatment with antipsychotic medication. The exact criteria for treatment resistance vary across studies. The two studies differing from the others with regard to population were (1) Liang et al. (2021), who included a patient with psychotic major depression, and (2) the trial registration by Egger (NCT04099940), which plans to recruit patients suffering from AVH independent of diagnosis.
The mean ages of the participants in the randomized clinical trials (RCTs) were 42.9 (du Sert et al., 2018), 43.6 (Dellazizzo et al., 2021), and 25.3 (Liang et al., 2022). There is an overlap between the patients participating in the RCTs and those in the qualitative studies. The ages of participants in each study are thus not listed here (see appendix B).

Type of intervention
Therapeutic method
All of the studies applied interventions building on AVATAR therapy as developed by Professor Julian Leff (Craig et al., 2018; Leff et al., 2013, 2014). In this type of psychotherapy, patients use computer software to create a computerized image, an avatar, of the imagined source of their AVH. Additionally, by using a voice transformation program, the therapist's voice is transformed to sound like the voice heard by the patient (i.e., the AVH). If more than one voice is present, the patient is asked to identify and work on the most dominant or hostile voice. In therapy sessions, role-playing is conducted in which the patient engages in a dialogue with the avatar (displayed on a computer screen), while the therapist takes on the role of the avatar.
Within the first sessions the patient is encouraged to stand up to the avatar. In later sessions the patient discusses their positive personal qualities with the avatar with the aim of increasing self-esteem. The role-playing is prepared in advance together with the patient, allowing the therapist to become familiar with typical abusive remarks stemming from the patient's AVH (the dominant voice). During therapy, the avatar gradually becomes less dominant and starts to consider what the patient is saying. Overall, this type of therapy focuses on renegotiating the relationship between the avatar and the patient, with the aim of the patient reclaiming power and control over the voice.
In the reviewed studies the avatar is displayed in VR, not on a computer screen (see next section on VR environments and technological equipment). Since the therapist acts as the avatar representing the voice, the chosen dominant voice must be in a language spoken by the therapist.
Different names for the VR-assisted therapy (VRT) are utilized by different groups and in different papers. The terms VRT and Avatar Therapy are used interchangeably, describing the same type of intervention. The number of sessions in the reviewed studies ranged from 7 to 9, except for the trial by Dellazizzo et al. (2020), in which the patients received CBT prior to VRT and consequently 18 (9 + 9) sessions of therapy (see Table 1).
VR environments and technological equipment
VR environments differ across studies. In the studies by the Canadian research group (Dellazizzo et al., 2021; du Sert et al., 2018), the avatar is standing in a dark room. In the study protocol by Smith et al. (2022) the avatar is displayed in an office space, either on a computer screen (on a desk) or with a 3D body seated behind a desk. This allows for initiating treatment with a 2D image of the avatar (representing the voice) in participants exhibiting elevated levels of anxiety when confronted with the avatar (i.e., the voice). Additionally, the distance to the avatar can be adjusted gradually as the patient becomes increasingly comfortable with its presence in VR. None of these studies state whether there is an avatar or virtual body parts to represent the patient's body.
In the Liang et al. (2021) study, it is mentioned that the avatar is "standing in a virtual scene", but without further description of the environment. Liang et al. (2021, 2022) differ from other studies by delivering VRT through an advanced VR system named CATS (virtual-reality-based computer avatar therapy system). CATS is defined by the authors as a VR setup that also captures the therapist's facial expressions and body posture (in real time), creating an even more realistic dialogue with the avatar. None of the other reviewed studies report whether the avatars are capable of expressing emotion.
The most detailed account of technological equipment and its usability was provided by the research team from Switzerland. Brander et al. (2021) employed the freely available Unity Game engine (Bonfiglio, 2018) to develop their VRT software. It is reported that the system reached excellent usability, particularly when employed by therapists (psychiatrists and psychologists), who gave significantly higher scores than nursing staff and administrative personnel on the System Usability Scale (SUS). Unfortunately, the visual aspects of the VR environment are not described, but judging from an available screenshot, the VR environment can be described as a rather large room with a couch, conceivably an office or a living room. Two avatars are situated in the environment engaged in a conversation, possibly depicting both an avatar representing the voice and an avatar representing the patient. Since the screenshot is from the clinician's screen, it is unclear whether the patient takes on a third-person perspective or holds a first-person perspective into the virtual world, as in the other reviewed studies. Regarding assessment of usability on behalf of patients, the trial by Smith et al. (2022) employs the Simulator Sickness Questionnaire (Kennedy et al., 1992) to measure motion sickness.
In most papers, the term "presence" is mentioned, but commonly not defined as a construct or measure. Only two studies describe how presence is measured. In the study by du Sert et al. (2018), presence is defined as a feeling of being in the presence of the persecutor (i.e., a social presence of the voice), which was rated on a scale from 0 to 10 (with 10 being a very strong feeling of presence). The mean score on this item was 7.5 (SD = 1.5). In the trial registration by Dumais (NCT04054778), it is stated that the Igroup Presence Questionnaire (IPQ, Schubert et al., 2001) is employed to measure presence. The IPQ can be understood as measuring a subjective sense of being in a virtual environment (i.e., being transported to the environment). Although not stated in the study protocol by Smith et al. (2022), it can be added that participants in this trial rate how well the avatar matches their own experience of the voice on visual and auditory characteristics (on a visual analogue scale, VAS, ranging from 0 to 10). Likewise, participants rate how real the experience was (on a VAS from 0 to 10). Finally, they are asked questions to further explore (qualitatively) their experiences in VR and their encounter with the avatar.

Studies reporting on processes of VRT
Five studies (all Canadian) focused on specific aspects or processes of the VRT. These studies reported on the part of the session conducted in VR and did not cover the dialogue between the patient and therapist before or after the virtual role-play. Dellazizzo, du Sert et al. (2018) conducted a qualitative content analysis of 12 patients undergoing VRT to identify the main themes that emerged in the therapy dialogues between the patient and the avatar. All courses of therapy comprised at least one of these five themes: emotional responses to the voices, beliefs about voices and schizophrenia, self-perceptions, coping mechanisms, and aspirations. Beaudoin et al. (2021) conducted a content analysis using transcripts of the dialogue between 18 patients and their avatars in previous trials (Dellazizzo et al., 2021; du Sert et al., 2018). Dialogues were mainly confrontational at the start but gradually became more constructive. Hudon et al. (2023b) conducted an unsupervised machine-learning-driven analysis of verbatim transcripts (dialogue between avatar and patient), utilizing the same transcripts as the aforementioned study by Beaudoin et al. (2021) and comparing the results from the two studies. The machine learning analysis identified three clusters of avatar interactions and four clusters of patient interactions. The clusters overlapped with, but were not identical to, those of the study by Beaudoin et al. (2021), i.e., the avatar interactions were similar to those of the previous study, whereas the patient interaction clusters differed. Also, Hudon et al. (2023a) conducted a content analysis of 32 therapy courses that were part of previous trials (Dellazizzo et al., 2021; du Sert et al., 2018) and identified five dyadic interactions between the patient and the avatar that occurred with a mean frequency of more than 10 times for each participant. The most common dyad was the avatar uttering a reinforcement statement (e.g., "you can tell me directly like this, if you have something to tell me") and the patient reacting with self-affirmation (e.g., "well, it's just that I felt badly when you elevated the tone of your voice. You seemed angry and it affected me"). Similarly, Hudon et al. (2023c) conducted a content analysis of transcripts of avatar dialogues for 16 patients from earlier and ongoing trials (Dellazizzo et al., 2021 and the trial registration on clinicaltrials.gov: NCT04054778) in order to identify the emotions that were expressed in the dialogues. The authors report that the most common emotional reactions from patients were neutral, joy, and anger, whereas the avatar tended to react with interest, disgust/contempt, and neutral emotions.

Studies reporting on the effect of VRT
A total of 11 of the reviewed studies reported findings on the effect of VRT on AVH. Four papers reported results from RCTs. Two of these were studies conducted in Canada by the same research group. The first study by du Sert et al. (2018), a randomized, partial cross-over trial with treatment as usual (TAU) as the control condition (N = 19), found symptoms of AVH (PSYRATS-AH, Haddock et al., 1999) to be significantly reduced after cessation of treatment (three months post baseline).
The study by Dellazizzo et al. (2021) was a randomized parallel comparative trial (N = 74) comparing VRT with CBT, with the primary outcome assessed at treatment cessation (three months post baseline). Patients were re-assessed six and 12 months post baseline. The VRT significantly reduced AVH symptoms (PSYRATS-AH), which was not the case for the CBT intervention, although the differences between groups did not reach significance. Liang et al. (2022) conducted an RCT comparing the aforementioned VR intervention CATS with CBT (N = 65). The aim of the trial was to test the efficacy of CATS while reporting on clinical outcome measures. A verbal fluency task (VFT) was performed during functional near-infrared spectroscopy (fNIRS) to measure task-dependent regional cerebral blood flow (rCBF) before, during, and after the intervention. The between-group difference in outcome measures post intervention did not reach significance, and AVH improved in both groups. The P300 amplitude was found to show a significant interaction effect and a correlation with severity of AVH. The increment in P300 amplitude post intervention was significantly higher in the CATS group than in the CBT group. Building on additional evidence (Liang et al., 2022), Liang et al. (2023) propose that P50 sensory gating may be a possible neurophysiological correlate of AVH symptom recovery after CATS. The authors found that the P50 ratio and the improvement on PSYRATS-AH after the CATS intervention were significantly correlated, and thus suggest that CATS may decrease the P50 ratio and thereby improve sensory gating.
These randomized studies are all considered pilot studies by the authors. The number of participants in each study is relatively low (range 19-74), and the assessors engaged in outcome evaluations were not blinded to treatment allocation. Due to methodological issues (non-blinding, drop-outs not included in analysis, etc.), effect sizes are not reported here but are listed in appendix B and briefly mentioned in the discussion.
Three additional RCTs on VRT for AVH have been registered, but results are not available yet. Alexandre Dumais registered an RCT (ClinicalTrials.gov, identifier: NCT04054778) in which the plan is to recruit 136 patients. Assessment is conducted by blinded assessors at baseline, one week after treatment, and then again at three, six, and 12 months follow-up. The main outcome (PSYRATS-AH total) is assessed one week after cessation of treatment. Smith et al. (2022) are conducting a large-scale randomized trial with blinded assessors, recruiting 266 patients. The main outcome (PSYRATS-AH total) is assessed post therapy, i.e., at 12 weeks, and an additional follow-up assessment is conducted at 24 weeks post baseline. Another RCT, registered by Egger (ClinicalTrials.gov, identifier: NCT04099940), plans to recruit 100 patients. The main outcome is psychopathological assessments (symptom questionnaires, not further specified in the registration) at baseline, at 3 and 6 weeks, and at 6 and 12 months post baseline.
Apart from RCTs, other types of studies also reported on outcome. Two of these were case studies. Dellazizzo et al. (2018b) described the beneficial results of conducting avatar therapy with a treatment-resistant patient diagnosed with paranoid schizophrenia. The patient had heard voices for almost 20 years and had not responded to other types of interventions, including antipsychotic compounds, electroconvulsive therapy (ECT), repetitive transcranial magnetic stimulation (rTMS), and CBT for psychosis (CBT-p). The patient improved significantly after VRT. Both positive symptoms and depressive symptoms diminished, and the patient's most distressing auditory hallucination was alleviated. Qualitative data from interviews with the patient, his relatives, and the treating psychiatrist supported these findings. Liang et al. (2021) reported on two patients receiving CATS, adding data from fNIRS measuring task-dependent regional cerebral blood flow (rCBF) during a verbal fluency task (VFT). One patient was diagnosed with paranoid schizophrenia, the other with psychotic major depression. Both patients improved on all outcome measures (comprising PSYRATS and additional measures listed in appendix B). The patient with depression ceased to experience auditory hallucinations. Both patients' fNIRS measurements showed a trend in the direction of improvement.
Combining VRT with CBT
In a study with 10 patients, Dellazizzo et al. (2020) examined the benefits of combining CBT with VRT by offering VRT after a course of CBT. The effect of the CBT course was not statistically significant. VRT alone significantly reduced AVH symptoms (PSYRATS-AH total), while the combination of CBT and VRT was also significant and yielded a larger effect size than that of VRT alone. Four out of ten patients were considered treatment responders, with a decline of at least 20 % on PSYRATS-AH total after completing VRT.
Managing the COVID-19 pandemic after VRT or CBT
Hudon et al. (2022) conducted a qualitative follow-up study using semi-structured interviews to explore how patients who had received either VRT or CBT in previous trials were influenced by the COVID-19 pandemic. More patients from the CBT group expressed depressive symptoms in direct relation to the pandemic, while more patients from the VRT group expressed anxiety symptoms. However, most patients did not report either depressive or anxiety symptoms and expressed stability in AVH symptoms.
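The responder criterion mentioned above (a decline of at least 20 % on PSYRATS-AH total) can be expressed as a simple rule. The sketch below is illustrative only: the function names and the example scores are hypothetical, and the original study may have handled rounding or missing data differently.

```python
def percent_reduction(baseline, post):
    """Percentage decline from baseline to post-treatment score."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - post) / baseline


def is_responder(baseline, post, threshold=20.0):
    """True if the score declined by at least `threshold` percent."""
    return percent_reduction(baseline, post) >= threshold


# Hypothetical PSYRATS-AH total scores, not data from any reviewed study.
print(is_responder(baseline=30, post=24))  # decline of exactly 20 %
print(is_responder(baseline=30, post=25))  # decline below 20 %
```

Note that a percentage-based threshold depends on the baseline score: the same absolute change counts as a response for a patient with a lower baseline but not for one with a higher baseline.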

Acceptability and feasibility of VRT
None of the studies reported any adverse events in connection with VRT. Dropout rates from VRT were listed in three papers ranging from 6.25 % in the study by Liang et al. (2022)  In connection with the RCT by Dellazizzo et al. (2021), 15 patients were interviewed regarding acceptability and feasibility of therapy. Most patients found the intervention (VRT/CBT) adequate (content, sequence, dose, tailoring, timing, mode of delivery, and equipment). Pertaining to VRT, 37.5 % of participants stated that the intervention had been stressful at first. Regarding CBT, 42.9 % stated that homework had been uninteresting or in other ways inadequate. A third of the patients found the interventions (VRT/CBT) too short.
In the study by Dellazizzo et al. (2020), in which the patients received first CBT and then VRT, this sequence of therapy was appreciated by all patients. Both interviews with patients and therapists' notes revealed that CBT helped patients share their experiences and gain insight into their illness, while also assisting patients in managing difficult emotions. Several patients stated that CBT prepared them for the more difficult confrontation in the VRT. Of note, none of the patients in the original trial (Dellazizzo et al., 2021) requested to try CBT after VRT.

Therapists' background and training
In most papers it was stated that the therapy was conducted by skilled (experienced) therapist(s) and that the treatment was manualized (Beaudoin et al., 2021; du Sert et al., 2018; Liang et al., 2022; Smith et al., 2022). No studies mentioned training, and only very few studies mentioned supervision or whether assessment of program fidelity was conducted (see appendix B for an overview). This may be because the studies can generally be considered pilot studies and, due to the pioneering aspects of the method, few therapists may be considered competent at supervising the intervention. Future larger-scale studies need to address these aspects when reporting on outcomes to enable quality assessment of studies and inform implementation.

Discussion
Based on this PRISMA ScR, it can be stated that VRT for auditory hallucinations has so far shown great potential. The method of employing an avatar to represent the most dominant voice appears to be a promising psychotherapeutic innovation for patients with psychosis. While the evidence base on the efficacy of VRT for AVH symptoms is accumulating, it should be noted that there is a high risk of bias in the identified RCTs, e.g., none report outcomes rated by blinded assessors, and information on training, supervision, and fidelity to the method is scarce. Additionally, the finalized trials are to be considered pilot studies (Dellazizzo et al., 2021; du Sert et al., 2018; Liang et al., 2022), and only the study protocols by Smith et al. (2022) and by Egger and Dumais (clinicaltrials.gov: NCT04099940 and NCT04054778) are larger single-blind RCTs enrolling ≥100 patients. Due to the inclusion of a variety of study designs and differing research methods, risk of bias assessment is not applicable for a scoping review (Tricco et al., 2018). Yet, it is evident from the current review that employing VR to ameliorate AH is a relatively new method. Thus, it is advisable to await upcoming larger-scale trials before conducting a rigorous systematic review (including risk of bias assessment) on the efficacy of VRT. It can be added that much is yet to be explored regarding the processes of therapy as well. Some of the reviewed studies identified themes and emotions explicitly expressed during patient-avatar dialogues (Beaudoin et al., 2021; Dellazizzo, Percie du Sert, et al., 2018; Hudon et al., 2023a, 2023c). However, the meaning that patients construct from the dialogue with the avatar was not explored, and consequently it is not evident whether certain aspects of the dialogues were particularly meaningful for patients or may relate to therapy outcome. The processes of therapy are not directly addressed in the identified trial registrations, and it is unclear if or how therapy process will be studied in upcoming trials.
Regarding the study samples in the trials, all participants were adults, and almost all were diagnosed with schizophrenia spectrum disorders and experienced treatment-resistant AVH. Few studies included patients with affective disorders (Liang et al., 2021; Egger: NCT04099940). The identified studies all targeted AVH as opposed to non-verbal AH, which is identical to the original AVATAR therapy format (Craig et al., 2018; Leff et al., 2013). The choice to recruit treatment-resistant patients may be due to the method being rather new, with relatively sparse knowledge on the risk of adverse events. Of note, however, none of the reviewed studies reported adverse events relating to VRT. Studies investigating intervention efficacy in younger, possibly medication-naïve patients are warranted, i.e., the VRT may serve a more preventive purpose, potentially improving clinical and functional outcome for patients with a first-episode psychosis.
The PSYRATS-AH (Haddock et al., 1999) scale was the main outcome measure in all of the reviewed RCTs, making the outcome of VRT interventions on AH comparable across these studies. The reported effect sizes tend to be very high (Cohen's d ranging from 1.0 to 1.23) (see appendix B), but due to the abovementioned methodological issues they should be interpreted cautiously. Nevertheless, du Sert et al. (2018) hypothesize that using immersive VR (as opposed to a 2D computer screen (Craig et al., 2018)) may enhance presence and as such therapy efficacy. In keeping with this, Liang et al. (2023) speculate that the even higher effect size found in their study may be due to the more advanced CATS system, in which full-motion capture technology registers the therapist's facial expressions, which are then simultaneously acted out by the avatar in VR, i.e., this VRT is not conducted through mere lip-synched animation as in the VRT of earlier trials (Dellazizzo et al., 2021; du Sert et al., 2018), potentially allowing the avatar dialogue to become even more realistic and effective (Liang et al., 2023). We have yet to see studies testing different types of VR environments and technologies head-to-head. To our knowledge, no study has compared the effect of using 2D avatars on a computer screen to 3D animated avatars displayed in immersive VR or in even more technologically advanced VR (e.g., CATS). The reviewed studies compared VRT to either CBT, supportive counselling, or TAU. Correspondingly, no study has compared VR role-plays to non-VR role-play or non-VR exposure to voices (e.g., Hayward et al., 2017; Lincoln et al., 2021).
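For readers less familiar with the effect size metric cited above, the following minimal Python sketch shows one common way of computing Cohen's d, namely the between-group form with a pooled standard deviation. This is an assumption for illustration: the reviewed papers may have computed d differently (for example from pre-post change within a group), and the function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Between-group Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical symptom-reduction scores, not data from any reviewed study.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(round(d, 2))
```

By convention, values of d around 0.2, 0.5, and 0.8 are often described as small, medium, and large, which is why values above 1.0 from small, unblinded pilot trials call for cautious interpretation.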
It is yet to be explored what properties of VR environments are best suited for simulating the experience of AVH, and how those properties may influence therapy efficacy. For example, the sense of presence (e.g., Sanchez-Vives and Slater, 2005) and embodiment (e.g., Kilteni et al., 2012) are two established constructs of VR experiences in Human-Computer Interaction (HCI) research. The term presence was mentioned in most of the papers, as well as realism, which is considered to be a part of presence (e.g., the IPQ questionnaire used in one study to measure the sense of presence has a subscale for realism). Body ownership and agency are commonly agreed constructs of embodiment in HCI. It was unclear whether studies presented an avatar of the patient in VR to induce embodiment, but in one study (Smith et al., 2022), using a 2D image of the avatar representing the voice and increasing the distance to the avatar was an option when closer confrontation with a 3D avatar (i.e., full-body representation) would be too overwhelming for the patient. Therefore, both the sense of presence and embodiment seem to be factors that are considered in the design of the VRTs, but their interactions with the treatment's efficacy remain unknown. For example, subjectivity may be radically altered in schizophrenia (Parnas et al., 2021), and AVH can be seen as "disembodied" experiences (Metzinger, 2018). For these reasons, measuring the sense of presence and embodiment with validated questionnaires or physiological and behavioral measures could help further assess which properties of the VR environments are best suited for VRT and how those properties could improve the efficacy of VRT.
As expected, conducting VRT for AH is a rather novel approach. All reviewed studies were published in 2018 or later, and in middle- to high-income countries. Considering the cost of VR equipment (and the cost of developing VR software), this is not surprising. If the VRT is found to be effective, implementation will most likely be initiated in countries with a relatively high per capita income.

Future studies
VRT for treating AVH has so far yielded compelling and promising preliminary evidence, but the field is in its initial stages, with a multitude of areas to explore in future studies. Larger, rigorously conducted RCTs are needed to reach firmer conclusions on the effect of VRT for AVH. The maximum follow-up period to date is one year (Dellazizzo et al., 2021; Dumais: ClinicalTrials.gov, NCT04054778), stressing the necessity of additional studies with longer follow-up periods to draw conclusions on the durability of the intervention.
In the reviewed studies, the language spoken by the patient's dominant voice had to be a language spoken by the therapist, excluding patients from minority populations hearing voices in other languages. Future studies may investigate whether translating the dialogue into a language shared by the patient and the therapist is feasible, which may broaden the group of patients that could potentially benefit from VRT.
Regarding studies reporting on the processes of therapy, conducting content analysis with transcripts from therapy dialogues between avatar and patient has been the preferred method to date (Beaudoin et al., 2021; Dellazizzo, Percie du Sert, et al., 2018; Hudon et al., 2023a, 2023b, 2023c), which may be because audio recordings from therapy sessions tend to be available for transcription and as such are conveniently accessible for analysis. These studies generate valuable insight into processes of therapy, but there are limitations to focusing on the in-session dialogue; e.g., these studies do not take into account the processes involved in therapy sessions before or after the VR exposure, or experiences of patients and therapists not stated explicitly during therapy. Only Dellazizzo et al. (2021) have so far conducted a study analyzing semi-structured interviews with patients regarding acceptability and feasibility of the intervention. One of three patients stated that they wished for more therapy sessions (Dellazizzo et al., 2021). Accordingly, an upcoming trial aims at establishing the optimal dosage of therapy sessions (Garety et al., 2021).
The role of the therapist (acting as the avatar in VR) is markedly different in the reviewed studies than in traditional therapeutic interventions. Furthermore, in the studies by Liang et al. (2021, 2022), not only the therapist's words but also the therapist's facial expressions are conveyed to the avatar through full-motion capture, further emphasizing the actor-like quality of the role. We have yet to see studies exploring the experiences and practices of therapists engaged with the VRT, which could conceivably broaden our understanding of both the VRT and what is needed in the training of new therapists.
Building on the evidence from studies using VR to treat AVH in schizophrenia, this type of treatment may possibly be modified for the treatment of other psychiatric disorders (e.g., PTSD or mood disorders), in which AVH may also be prevalent. Correspondingly, it might be possible to explore the use of VR to treat AVH in somatic illnesses (e.g., dementia or Parkinson's disease). We have yet to see studies aimed at altering the content of non-verbal AH.

Limitations
In this scoping review, we only included articles written in English and may consequently have overlooked research currently being conducted in non-English speaking countries. Based on the decision to focus on empirical trials reporting on results, this PRISMA ScR does not cover opinion articles or papers that may delineate theoretical work on VRT.
Several of the identified studies reported on numerous exploratory outcomes. It is beyond the scope of this paper to report on all outcomes. Hence, we refer to appendix B for an overview of measures employed in the reviewed studies. Moreover, we only focused on studies explicitly targeting AH; therefore, this review does not cover interventions in which AH may be ameliorated as part of a broader psychotherapeutic strategy to reduce symptoms (e.g., studies using mindfulness in VR).
to 24 % and 21 % in the studies by Dellazizzo et al. (2021) and du Sert et al. (2018), respectively. Only du Sert et al. (2018) listed the mean number of sessions completed, which was 5.7 out of 7 sessions.

Table 1
Overview of studies, main findings.
▪ VRT achieved larger effects on AVH (PSYRATS-AH) than CBT.
▪ For both groups the within-group improvements were significant.

This is an active trial, i.e., the results are not available yet. Follow-up: main outcome