1 Introduction

Stress plays an important role in the onset and course of psychosis (Norman and Malla 1993). Stress factors, such as cumulative life events (Nuechterlein et al. 1994; Shevlin et al. 2008), daily hassles (Norman and Malla 1994), exposure to urban life (Marcelis et al. 1998), and high levels of expressed emotion (Butzlaff and Hooley 1998; Nuechterlein et al. 1994), can trigger or worsen psychosis in vulnerable individuals (van Winkel et al. 2008). Accumulating evidence suggests that the hypothalamic–pituitary–adrenal (HPA) axis, a key physiological stress response pathway, is dysregulated in schizophrenia and related psychoses (Pruessner et al. 2017; Walker et al. 2008; Walker and Diforio 1997).

One of the most reliable and widely used laboratory tests to assess the physiological stress response is the Trier social stress test (TSST) (Dickerson and Kemeny 2004). The TSST consists of a short mock job interview and a mental arithmetic task in front of an audience of two or three people. Thus, the TSST induces social-evaluative threat and uncontrollability, the two main factors for robust activation of the HPA axis (Dickerson and Kemeny 2004). The various physiological responses evoked by the TSST make it a valuable tool for studying the biological mechanisms that mediate the effects of psychosocial factors on adverse disease progression in many psychiatric disorders. The main finding of three systematic reviews (Ciufolini et al. 2014; Dauvermann and Donohoe 2019; Zorn et al. 2017) on the response of the HPA axis to psychosocial stress in patients with schizophrenia is that the cortisol response is blunted during the first episode of schizophrenia (Seitz et al. 2019; van Venrooij et al. 2012) and in chronic schizophrenia (Jansen et al. 2000; Lange et al. 2017a, b; Wieck et al. 2013). However, several studies have reported no significant difference in the cortisol response in patients with chronic schizophrenia (Brenner et al. 2011; Nugent et al. 2015). In contrast, cardiovascular responses to psychosocial stress, i.e., increased heart rate (HR) or blood pressure (Brenner et al. 2009; Jansen et al. 2000), are consistently reported in patients with schizophrenia.

The value of virtual reality (VR) lies in the ability to measure real-time responses and to control and manipulate aspects of the environment (Valmaggia et al. 2016). Patients with psychosis find it difficult to enter socially stressful situations and often avoid them. However, because they know that a VR environment is not a real situation and that the level of social stress can be adjusted, patients with psychosis may learn to cope with social stress delivered in VR. One review concluded that VR can be used safely in patients with psychosis and that it is a valid approach to studying the mechanisms of psychosis (Veling et al. 2014). Contrary to the concerns that patients with psychosis might be suspicious of the technology or find it hard to distinguish between the virtual and real worlds, Freeman et al. (2022) reported that patients with psychosis were generally very positive toward VR therapy, valued having the opportunity to try the technology, and experienced few adverse effects. VR therapy has been increasingly used to treat psychological disorders, such as paranoia, social phobias, and acrophobia (Freeman et al. 2018, 2019; Gujjar et al. 2019; Kampmann et al. 2016). Adapting the TSST to a VR environment has the potential to transform how researchers elicit stress reactivity. This approach could solve some of the common TSST limitations originating from constraints in the design of different environments or set-ups and the assembly of trained evaluators/audiences. Studies using the TSST in VR (VR-TSST) have reported significant stress reactions in healthy controls (HCs), with subjective and peripheral physiological reactions comparable with those in response to the in vivo TSST but with lower cortisol responses (Shiban et al. 2016; Zimmer et al. 2019). No study has used the VR-TSST in patients with psychosis or schizophrenia. One related study combined VR with a different social stress paradigm (a virtual café with avatars looking angrily at the participants) and applied it to individuals with higher psychosis liability (Veling et al. 2016). The present study was undertaken to investigate physiological responses to the VR-TSST in patients with psychosis and HCs. Commonly used parameters capturing autonomic nervous system responses are the skin conductance level (SCL), HR, and the RR interval (an index of heart rate variability), all of which are reported to be impaired in schizophrenia (Castro et al. 2008; Fujibayashi et al. 2009; Lim et al. 1999). Therefore, we included those three signals as physiological measures in the study.

2 Methods

2.1 Participants

Patients were recruited from the outpatient clinic of the Department of Psychiatry at Jeonbuk National University Hospital. The inclusion criteria were (1) a diagnosis of a schizophrenia spectrum disorder (schizophrenia, schizoaffective disorder, or schizophreniform disorder), psychotic disorder not otherwise specified, or delusional disorder according to the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, IV-TR edition (First et al. 2002), (2) age 19–65 years, (3) a stable condition with no change of medication for the last 2 months, and (4) the ability to comprehend the procedure and aims of the present study. The exclusion criteria were (1) a diagnosis of mental retardation (IQ ≤ 70), (2) a history of head trauma, (3) a serious neurological disorder (epilepsy, stroke, Parkinson’s disease, and/or dementia), and/or (4) an acute, unstable, and significant medical illness. Healthy volunteers were recruited as the HC group via advertisements. They underwent a psychiatric interview using the screening module of the Structured Clinical Interview for the DSM-IV (First et al. 2002). HC subjects were excluded if they had a current or previous diagnosis of a mental disorder, had a significant medical condition, or had first-degree relatives with psychosis (to minimize the effects of genetic loading). All procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the 1975 Declaration of Helsinki, as revised in 2008. All procedures involving human subjects/patients were approved by the Ethics Committee of the Jeonbuk National University Hospital (Approval Number CUH 2014-11-002-053). All participants provided written informed consent.

2.2 Virtual reality equipment and the TSST

Participants were immersed in the VR world via a head-mounted display (HMD; Oculus Rift CV1). The HMD had a 90 Hz refresh rate, a 110° field of view, and a high resolution (1080 × 1200). The VR-TSST consisted of three scenarios, i.e., a resting phase (2 min), a job interview (5 min), and a mental arithmetic task (5 min), adopted from a modified version of the TSST (Kirschbaum et al. 1993). The resting condition (baseline) was a 3D video of nature scenes. The job interview was recorded using a 3D 360° camera, and the arithmetic task was developed using an animated 3D cartoon. Three committee members (two middle-aged men and one middle-aged woman) were present in the job interview video. The two men alternately asked the following five questions: “Why did you apply to our company?” (Interview 1), “What makes you a good candidate?” (Interview 2), “What certificates do you have?” (Interview 3), “What are your strengths and weaknesses?” (Interview 4), and “What are the strengths and weaknesses of your physical appearance?” (Interview 5). If a participant did not use the full 1 min to answer, a follow-up prompt was delivered in a firm tone of voice to give a sense of interaction: “There is still time left. Can you say more?” The woman looked at the participant and wrote down scores in a cold, businesslike manner several times. None of the interviewers expressed emotion or provided social feedback, and all presented a neutral facial expression. In the arithmetic task, participants were required to serially subtract 6 starting from 1,000 (Math 1, 3 min) and 7 starting from 100 (Math 2, 2 min). When an answer typed on the VR screen was correct, participants were told so; otherwise, they had to start over from the beginning. No anticipatory phase was provided before the job interview session. Example screenshots of the VR-TSST are shown in Fig. 1. The VR-TSST was generated using the Unity engine (Unity Technologies, San Francisco, CA, USA).
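The restart rule of the arithmetic task can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the Unity implementation used in the study; the function name and the example answer sequences are hypothetical, and it assumes the task is a serial subtraction in which any wrong answer sends the participant back to the starting number.

```python
def run_serial_subtraction(start, step, answers):
    """Replay typed answers against a serial-subtraction task.

    Assumption (not stated in detail in the text): after any wrong
    answer the participant restarts from `start`.
    Returns (number_of_correct_answers, number_of_restarts).
    """
    expected = start - step
    correct = 0
    restarts = 0
    for answer in answers:
        if answer == expected:
            correct += 1
            expected -= step          # next target in the series
        else:
            restarts += 1
            expected = start - step   # start over from the beginning
    return correct, restarts


# Math 1: subtract 6 starting from 1,000; the third answer is wrong.
math1 = run_serial_subtraction(1000, 6, [994, 988, 981, 994, 988])
# Math 2: subtract 7 starting from 100; no errors.
math2 = run_serial_subtraction(100, 7, [93, 86, 79])
print(math1, math2)
```

Counting restarts separately from correct answers keeps the two sources of task pressure (progress and failure feedback) distinguishable in any later analysis.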

Fig. 1 Screenshots of examples of the virtual reality Trier social stress test: a job interview and b mental arithmetic task

2.3 Measures

Blood samples of 5 mL were drawn from a catheter into EDTA tubes at baseline (10 min before the test), during the test, and at 5–10 min and 30 min after the test was completed to measure plasma cortisol levels. The first blood sample was taken 30 min after insertion of the catheter to avoid possible stress effects of the insertion on plasma levels. The cortisol levels were measured by electrochemiluminescence immunoassay (D-68305; Roche Diagnostics GmbH, Mannheim, Germany) with a minimum detection limit of 0.4 μg/dL (11 nmol/L). The SCL was recorded using two gold-coated, flat, dry surface electrodes (Ø = 10 mm) embedded in the upper part of the HMD, which were placed tightly on the forehead. The sampling rate was 100 Hz. Skin conductance was expressed as the mean SCL (μS) across the data points. The HR was recorded using an earlobe pulse plethysmograph (FNIKorea Co., Ltd, Gwacheon, Korea) at a sampling rate of 256 Hz. The RR interval, which is the time between consecutive heartbeats, was also measured. The SCL, HR, and RR interval recordings were subdivided into the following seven periods, and the arithmetic mean was calculated for each: Baseline (2 min), Interview 1 (1 min), Interview 2 (1 min), Interview 3 (1 min), Interview 4 (1 min), Math 1 (3 min), and Math 2 (2 min). The severity of symptoms was evaluated using the Positive and Negative Syndrome Scale (Kay et al. 1987) 1 week before testing. The total daily dose of antipsychotic medication at the time of testing was converted into chlorpromazine (CPZ) dose equivalents based on the defined daily dose (Leucht et al. 2016).
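The per-period averaging described above can be sketched as follows. This is a minimal illustration, not the authors' processing pipeline: the period durations come from the text, but the onset times, the function name, and the synthetic signal are assumptions made only for this example.

```python
from statistics import mean

# Period durations in seconds, as listed in the text.
PERIOD_DURATIONS_S = {
    "Baseline": 120,
    "Interview 1": 60,
    "Interview 2": 60,
    "Interview 3": 60,
    "Interview 4": 60,
    "Math 1": 180,
    "Math 2": 120,
}


def period_means(samples, sampling_rate_hz, onsets_s):
    """Return the arithmetic mean of a 1-D signal within each named period.

    `onsets_s` maps period name -> onset time in seconds from the start
    of the recording. Onsets are passed in explicitly because the source
    does not report them.
    """
    result = {}
    for name, onset in onsets_s.items():
        start = int(round(onset * sampling_rate_hz))
        stop = start + int(round(PERIOD_DURATIONS_S[name] * sampling_rate_hz))
        window = samples[start:stop]
        if not window:
            raise ValueError(f"no samples recorded for period {name!r}")
        result[name] = mean(window)
    return result


# Hypothetical onsets: the three scenarios run back to back, and
# Interview 5 (360-420 s) is not among the analysed periods.
onsets = {
    "Baseline": 0, "Interview 1": 120, "Interview 2": 180,
    "Interview 3": 240, "Interview 4": 300, "Math 1": 420, "Math 2": 600,
}

# Synthetic SCL trace at 100 Hz: 1.0 uS at rest, 2.0 uS during the tasks.
scl = [1.0] * (120 * 100) + [2.0] * (600 * 100)
means = period_means(scl, 100, onsets)
print(means)
```

The same function applies to the HR trace by passing 256 as the sampling rate.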

2.4 Procedure

Participants were asked to refrain from physical exercise and alcohol for at least 24 h before testing and to refrain from consuming anything but water for 2 h before testing. Because of the circadian variation in cortisol levels, sessions were conducted only between 9 a.m. and 12 p.m. to ensure relatively low pre-stress levels and to maximize comparability among subjects. All participants were invited to the clinical laboratory 30 min before the start of the test and were given instructions for the VR-TSST. Then, they were told to relax and remain calm for approximately 10 min, during which the first blood sample for the baseline cortisol level was collected. The participants then put on the HMD, and the TSST began.

2.5 Statistical analysis

The data analysis was performed using R Statistical Software (version 4.3.0; The R Foundation for Statistical Computing, Vienna, Austria). The normality of the data distribution was tested using the Shapiro–Wilk test. The t-test and χ² test were performed to assess differences in demographic variables between patients and controls. Sex and education, which differed significantly between the two groups, were used as covariates in further analyses. Changes in the cortisol level, SCL, HR, and RR intervals during testing were analyzed using two-way repeated measures analysis of covariance (ANCOVA). The dependent variables were the four physiological measures; time of measurement was the within-subject factor, and group was the between-subject factor. For the post hoc tests, each time point was analyzed separately using ANCOVA comparing the groups (patients vs. controls) while controlling for baseline data, sex, and education. Paired t-tests were performed to compare the physiological measures obtained at baseline with those at the remaining six time periods. Given that a greater cortisol response to stress was expected in men (Kirschbaum et al. 1992) and there was an unequal sex distribution between the two groups, an additional exploratory analysis of cortisol levels in men was performed. Cortisol responders and non-responders were defined as participants whose area under the curve for increase was > 0 and ≤ 0, respectively, based on the criteria suggested by Lange et al. (2017a, b). The sample size for each measure varied because of rejected blood samples or technical errors during measurement of the physiological parameters.
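The responder criterion can be made concrete with a short sketch. It assumes the commonly used trapezoidal formulation, in which the area under the curve for increase is the total trapezoidal area minus the rectangle defined by the first (baseline) value; the source does not spell out the formula, and the function names, sampling times, and cortisol values below are hypothetical.

```python
def auc_ground(times_min, values):
    """Trapezoidal area under the curve with respect to zero."""
    if len(times_min) != len(values) or len(values) < 2:
        raise ValueError("need at least two (time, value) pairs")
    return sum(
        (values[i] + values[i + 1]) / 2 * (times_min[i + 1] - times_min[i])
        for i in range(len(values) - 1)
    )


def auc_increase(times_min, values):
    """Area under the curve with respect to the first (baseline) value."""
    baseline_area = values[0] * (times_min[-1] - times_min[0])
    return auc_ground(times_min, values) - baseline_area


def is_responder(times_min, values):
    """Responder if the area under the curve for increase is > 0."""
    return auc_increase(times_min, values) > 0


# Hypothetical sampling times (min) for the four blood draws and
# hypothetical cortisol values (ug/dL) for two participants.
times = [0, 15, 30, 50]
rising = [10.0, 12.0, 14.0, 11.0]
falling = [10.0, 9.0, 8.0, 8.0]

print(auc_increase(times, rising), is_responder(times, rising))
print(auc_increase(times, falling), is_responder(times, falling))
```

Because the baseline rectangle is subtracted, a participant whose cortisol simply follows the morning decline ends up with a negative value and is classified as a non-responder.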

3 Results

Among the demographic variables, sex (χ²(1) = 16.61, p < 0.001) and education (t(95.93) = −2.55, p = 0.012) were significantly different between the patient and HC groups. The diagnoses of the patients were schizophrenia, schizophreniform disorder, schizoaffective disorder, psychotic disorder not otherwise specified, and delusional disorder, and the frequencies of each diagnosis are shown in Table 1. A significant main effect of time was detected for cortisol (p = 0.001). Significant main effects of group (p = 0.007; p < 0.001) and time (p < 0.001; p < 0.001) were observed for SCL and HR, respectively. There were no interaction effects for cortisol, SCL, or HR. The RR intervals had significant main effects for group and time and an interaction effect (p < 0.001; p < 0.001; p < 0.001) (Table 2).

Table 1 Demographic and clinical characteristics of the participants
Table 2 Main and interaction effects for cortisol, SCL, HR, and RR intervals

Post hoc analyses indicated significant differences in the SCL between the two groups at baseline (t = −2.72, p < 0.001), during Interview 2 (t = −2.46, p = 0.042), and during Math 1 and 2 (t = −2.6, p = 0.021; t = −3.25, p = 0.021). HR was significantly different between the two groups at baseline (t = −2.82, p < 0.001), during Interviews 1–4 (t = −2.1, p = 0.007; t = −3.37, p = 0.001; t = −3.36, p = 0.004; t = −3.34, p < 0.001), and during Math 1 and 2 (t = −2.34, p = 0.021; t = −3.06, p = 0.021). The RR intervals were significantly different between the two groups at baseline (t = 3.26, p < 0.001), during Interviews 1–4 (t = 3.1, p < 0.001; t = 3.67, p < 0.001; t = 3.79, p < 0.001; t = 3.51, p < 0.001), and during Math 1 and 2 (t = 2.76, p = 0.028; t = 3.35, p = 0.021) (Table 3 and Fig. 2). Neither the cortisol levels in the male subgroup nor the responder rates (36% and 40% in the patient and HC groups, respectively) differed significantly between the two groups (results not shown).

Table 3 Post hoc results between the patient and control groups
Fig. 2 Effects of the TSST on a cortisol, b SCL, c HR, and d RR intervals over time. Two-way repeated measures ANCOVA with baseline data, sex, and education as covariates and Bonferroni correction. Data are given as mean ± SD. *p < .05, **p < .005. Note: BPM, beats per minute; HR, heart rate; SCL, skin conductance level; TSST, Trier social stress test

4 Discussion

The advantages of the VR-TSST are that it requires only modest space and can be performed at any time because no actors need to be hired. In previous studies, the VR-TSST produced physiological responses in HCs similar to those elicited by the in vivo TSST. However, no study has used the VR-TSST in patients with psychosis. We sought to develop the VR-TSST and validate its usefulness in patients with psychosis.

We observed no significant differences in cortisol levels between the patient and HC groups at any time point. The lack of a difference at baseline is consistent with most studies (Brenner et al. 2009, 2011; Jansen et al. 2000; Nugent et al. 2015; Rubio et al. 2015; Steen et al. 2011; Wieck et al. 2013). No differences in post-stress levels (during the test or at 5–10 min or 30 min after the test) were detected, and the area under the curve for increase values agreed with those in previous studies (Brenner et al. 2011; Nugent et al. 2015). However, most previous studies reported a blunted cortisol response in patients compared with HCs (Jansen et al. 1998, 2000; Lange et al. 2017a, b; Rubio et al. 2015; Steen et al. 2011; Wieck et al. 2013). Unexpectedly, a low proportion of cortisol responders was observed among the participants. This suggests that our version of the VR-TSST was insufficient to produce a significant cortisol response, although some individuals reported that the testing was quite stressful. One possible cause of this negative finding is that participants quickly adapted to the VR environment, knowing that it was not real and was controllable. Alternatively, it may be due to the absence of an anticipatory phase or the low quality of immersion and interaction in the VR-TSST. Therefore, the VR-TSST content must be improved in terms of immersion and interaction to produce a cortisol response comparable with that of the in vivo TSST. Giving participants the sense of walking into the interview room, or using an artificial intelligence avatar capable of interactive conversation in real time, could be considered.

Significant differences in the SCL were observed between the patient and HC groups at baseline and during Interview 2, Math 1, and Math 2. Pronounced skin conductance activity has been reported in children during the TSST (Gilissen et al. 2008). As SCL can be used as an index of the arousal level of the autonomic nervous system (Takano et al. 1993), these findings suggest that the VR-TSST induced stronger sympathetic activation of the sweat glands in patients than in the HCs. Interestingly, we observed higher HRs at all time points in patients with psychosis compared with the HCs and an increased HR during Interviews 1–3 relative to baseline in the patients only. Most previous studies reported significantly higher HRs regardless of time in patients with schizophrenia compared with HCs and a significant increase in HR relative to baseline in both patients with schizophrenia and HCs (Brenner et al. 2009; Jansen et al. 1998, 2000; Rubio et al. 2015; van Venrooij et al. 2012). Only one study reported a higher HR in patients with schizophrenia compared with HCs but no increase in HR relative to baseline in the patients (HRs during baseline and the task were 84.9 and 85.5, with no significant difference) (Dinzeo et al. 2008). Our HR results suggest that the patients were more anxious about the test and showed heightened autonomic activation compared with HCs before and during the test. Significantly shorter RR intervals were observed in the patients compared with the HCs at all time points. A pilot study using the VR-TSST in HCs detected no significant effects on heart rate variability (Jönsson et al. 2010). A relative increase in sympathetic activity shortens the time between heartbeats (the RR interval), whereas a relative increase in parasympathetic activity lengthens it (Billman 2011). Thus, our RR interval findings are in line with the HR findings. These significant group differences may reflect greater sensitivity to blood sampling or to the HMD in patients with psychosis. However, it should be noted that we minimized the stress of blood sampling by using a catheter.
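The inverse relationship between HR and the RR interval can be shown with a small worked example. It uses the standard conversion of 60,000 ms per minute divided by the heart rate in beats per minute; the function names are hypothetical, and the conversion is exact only beat by beat, so applying it to period means (as done here with the HR values cited from Dinzeo et al. 2008) gives an approximation.

```python
def hr_to_rr_ms(hr_bpm):
    """Convert a heart rate (beats per minute) to an RR interval in ms."""
    if hr_bpm <= 0:
        raise ValueError("heart rate must be positive")
    return 60000.0 / hr_bpm


def rr_ms_to_hr(rr_ms):
    """Convert an RR interval in ms back to a heart rate in beats per minute."""
    if rr_ms <= 0:
        raise ValueError("RR interval must be positive")
    return 60000.0 / rr_ms


# Baseline and task HRs cited in the text; a higher HR gives a shorter RR.
rr_baseline = hr_to_rr_ms(84.9)
rr_task = hr_to_rr_ms(85.5)
print(round(rr_baseline, 1), round(rr_task, 1))
```

A 0.6 bpm difference in HR corresponds to a change of only about 5 ms in the RR interval at these rates, which is in keeping with the absence of a significant difference reported for that study.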

The present study had several limitations. First, we modified the TSST: there was no anticipatory phase, and a job interview was used for the stress procedure instead of free speech. These changes may have contributed to the negative cortisol findings and the HR response during the test. Second, the post-TSST measurement window for cortisol was relatively short (i.e., 30 min) compared with previous studies (up to 90 min). Moreover, there were no post-TSST measurements for the SCL, HR, or RR intervals. This short measurement duration may have obscured the full picture of the physiological response, preventing the identification of different responses between patients and HCs during the late recovery phase. Third, covariates, such as sex and antipsychotic medications, were not well controlled. There is evidence that men show a greater cortisol response to stress (Kirschbaum et al. 1992; Liu et al. 2017), and the use of antipsychotic medications attenuates basal and morning cortisol levels in patients with psychosis (Chaumette et al. 2016; Girshkin et al. 2014). Given the sex imbalance between the two groups, the results should be interpreted cautiously. Lastly, a wider range of content needs to be developed before the VR-TSST can be used for job interview skills training or stress management therapy. Nevertheless, the strengths of the present study are that this was the first application of the VR-TSST in patients with psychosis and HCs and that its validity was tested by measuring physiological parameters. Most patients were interested in and excited about trying a new technology, VR.

In conclusion, the VR-TSST produced no discernible difference in the cortisol response between patients with psychosis and HCs. However, a higher SCL and HR, as well as shorter RR intervals, were observed in patients than in the HCs. These findings suggest that the current version of the VR-TSST induces stronger autonomic and cardiovascular, but not endocrine (cortisol), responses in patients with psychosis compared with HCs. It was also acceptable and tolerable to all participants. We demonstrated the VR-TSST to be a valuable tool to evaluate or train the stress response in patients with psychosis.