Context-dependent generalization of conditioned responses to threat and safety signals.

Contextual information can modulate the conditioned response to a threat signal (conditioned stimulus, CS+): fear responses are either potentiated or attenuated depending on whether the context is threatening or safe. In this study, we investigated the influence of context on conditioned fear as well as on generalization of conditioned fear. Thirty-two participants underwent a cue-in-context learning protocol in virtual reality (VR). On Day 1 (acquisition), participants received a mild painful electric shock (unconditioned stimulus, US) in one virtual room (fear context, CTX+) at the offset of one colored light (CS+), but never at the offset of another colored light (CS-). In a second room (safety context, CTX-), the two lights were also presented, but not the US. Successful cue conditioning was indicated by aversive ratings and startle potentiation but not skin conductance responses (SCR) to CS+ versus CS- in CTX+ and not in CTX-. On Day 2 (generalization), participants re-visited both fear and safety contexts plus a generalization context (G-CTX), which was an equal mix of CTX+ and CTX-. The two CSs were shown again in all three contexts. Generalization of conditioned fear was revealed in affective ratings (CS+ was rated more aversive than CS- in G-CTX), but not in physiological measures (equal startle potentiation to CS+ versus CS- in all contexts). In sum, contextual information modulates the responses to a threat signal such that a safety context can inhibit conditioned fear. Interestingly, generalization processes also depend on contextual information.


Introduction
A reliable identification of threatening events is crucial for an organism's survival. To facilitate such identification, organisms easily learn to predict an aversive event (i.e., unconditioned stimulus, US) by associating it with an initially neutral stimulus (Pavlov, 1927) that contiguously and contingently precedes the threatening event (Rescorla, 1988). As a result of this associative learning, organisms show defensive responses in the presence of the initially neutral stimulus, which is then labelled conditioned stimulus (CS+). As an indication of defensive responses, healthy human individuals show startle potentiation (Hamm and Weike, 2005; Lindner et al., 2015; Lipp et al., 1994), stronger physiological arousal (Büchel et al., 1999; Haaker et al., 2013; Hamm and Weike, 2005; Lipp et al., 1994), amygdala activation (Andreatta et al., 2012; Büchel et al., 1999; Lindner et al., 2015) and aversive ratings (Andreatta et al., 2012; Haaker et al., 2013; Hamm and Weike, 2005) in response to the threat signal (i.e., CS+) as compared to a safety signal. A safety signal (or CS-) is normally a second stimulus, which is additionally presented during a classical conditioning protocol (named differential conditioning), but never associated with the aversive US (Lonsdorf et al., 2017).
Associative learning has been proposed as a simple and reliable model for the etiology and maintenance of anxiety disorders (Craske et al., 2009; Mineka and Oehlberg, 2008; Pittig et al., 2018). In particular, anxiety patients seem to have altered safety learning (Duits et al., 2015; Lissek et al., 2005). Thus, compared to healthy controls, these patients show startle potentiation to the safety signals (i.e., CS-), which suggests generalization of conditioned fear (Duits et al., 2017), and/or delayed extinction of conditioned fear (Duits et al., 2017; Michael et al., 2007). In other words, anxious as compared to healthy individuals have greater difficulties in identifying safety and consequently tend to generalize their fear (Lohr et al., 2007; Struyf et al., 2015).
Generalization of conditioned fear refers to fear responses elicited by stimuli that have never been associated with a threat, but that share physical or semantic properties with the threat signal (Dunsmoor and Paz, 2015; Dymond et al., 2015). Thus, the more similar a stimulus is to the CS+, the stronger the fear responses it elicits. The pattern of these responses defines the generalization gradient, i.e., a generalization index. Normally, healthy individuals respond with fear to the cues that are physically, conceptually or semantically (Dunsmoor and Murphy, 2014) most similar to the CS+, resulting in a steep generalization gradient. In contrast, panic disorder patients, for example, respond with fear to a much broader range of cues, which results in a less steep generalization gradient (Lissek et al., 2014; Lissek et al., 2010).
Conditioned responses are not only modulated by the cues' properties, but also and importantly by the contexts in which they are presented (Bouton, 2002; Urcelay and Miller, 2014). Context is a complex stimulus (Bouton et al., 2006; Maren et al., 2013; Rudy, 2009), which can be physical (i.e., a variety of background stimuli as well as the room or the cage in which the learning happens), temporal (i.e., the passage of time during which some aspects remain constant, while others change) or "internal" (i.e., the psychological or emotional state of an organism).
https://doi.org/10.1016/j.ijpsycho.2020.06.006 Received 19 November 2019; Received in revised form 9 June 2020; Accepted 12 June 2020
Several studies have investigated the modulatory role of the context on fear learning in humans (Alvarez et al., 2007; Baas et al., 2004; Hermann et al., 2016; Huff et al., 2011; Kalisch et al., 2006; Muhlberger et al., 2014; Sjouwerman et al., 2015). Notably, the heterogeneity in the applied paradigms and in the experimental goals is quite striking. Nevertheless, the current evidence clearly supports the crucial role of the context in the return of conditioned fear. Specifically, in some of these studies (Alvarez et al., 2007; Baas et al., 2004; Hermann et al., 2016; Huff et al., 2011; Kalisch et al., 2006; Muhlberger et al., 2014) acquisition and extinction of conditioned fear were conducted in distinct contexts. Results indicate that, even after successful extinction, conditioned fear was (re-)elicited by the CS+ when it was presented again in the acquisition context, as indicated by higher amygdala activation (Kalisch et al., 2006), larger SCR (Alvarez et al., 2007; Hermann et al., 2016; Huff et al., 2011; Kalisch et al., 2006), startle potentiation (Alvarez et al., 2007) and stronger fear ratings. In accordance with animal studies (for a review see Bouton et al., 2006), these human studies confirm the role of the context as a decisive factor for eliciting either the fear acquisition memory trace (i.e., the CS-US association) or the extinction memory (i.e., the CS-no-US association, Milad and Quirk, 2012; Quirk and Mueller, 2008).
Interestingly, other studies (Baas et al., 2004; Muhlberger et al., 2014; Sjouwerman et al., 2015) revealed generalization of conditioned fear when CS+ and CS- were presented in a novel or safety context. Thus, healthy individuals showed startle potentiation (Baas et al., 2004) and larger SCR (Sjouwerman et al., 2015) to CS+ versus CS- in a context where the aversive event was never delivered. Furthermore, in a virtual reality study (Muhlberger et al., 2014) discriminative startle responses to CS+ and CS- were observed in a fear context (i.e., a virtual office in which the aversive US was delivered at CS+ offset), but not in a safety context (i.e., a second virtual office in which no aversive US was delivered). Strikingly, when CS+ and CS- were presented in a third, novel virtual room, which had never been visited during acquisition and in which, therefore, no US had been delivered, individuals showed startle potentiation to CS+ as compared to CS-, suggesting generalization of conditioned fear. Notably, this novel context shared semantic or conceptual features with the fear context (both were offices), but importantly it differed from the fear context in physical features (e.g., furniture). In other words, this study (Muhlberger et al., 2014) demonstrated semantic-related generalization processes induced by the environment in which the cues are presented (Dymond et al., 2015). However, it is still unclear whether and how physical-related generalization processes are induced by a context in which CSs are presented.
Generalization processes of contextual learning related to the physical features of the contexts have been previously investigated (Andreatta et al., 2019; Andreatta et al., 2017), however only in a pure context conditioning protocol without CSs. Healthy individuals showed generalization of contextual anxiety on the verbal level, but no generalization of contextual fear on the physiological level. Specifically, successful contextual learning was indicated by stronger subjective anxiety and startle potentiation in the virtual office associated with an unpredictable aversive US (i.e., anxiety context or CTX+) as compared to the virtual office in which no US was delivered (i.e., safety context or CTX-). Interestingly, during a subsequent generalization test participants reported stronger subjective anxiety in a new generalization virtual office than in the safety context, and this anxiety was comparable to the subjective anxiety reported in the anxiety context. However, startle responses were as attenuated in the generalization context as in the safety context. Notably, and in contrast to the aforementioned virtual reality study, the generalization context in these studies shared physical properties with both anxiety and safety contexts, meaning that half of the furniture was taken from the anxiety office and the other half from the safety office. The present study combined the design of Muhlberger et al. (2014) with that of the context generalization studies described above to further investigate the modulatory role of context on cue-conditioned fear with a focus on generalization processes related to physical similarities of contexts (Dunsmoor and Paz, 2015). Moreover, in Muhlberger's study the test phase followed the acquisition phase by a few minutes and consequently the recent fear learning may have interfered with generalization processes (for a broader discussion see Maren, 2014).
Therefore, we extended the paradigm over two days in order to disentangle recency effects from the learning mechanisms underlying context-dependent fear generalization. To this end, thirty-eight healthy individuals underwent a cue-in-context conditioning protocol in virtual reality on Day 1. Participants were passively guided through two rooms in which two colored lights were alternately turned on and off. In one office (i.e., fear context or CTX+), a mild painful electric shock (the aversive US) was always delivered at the offset of one light (CS+), but never at the offset of the other light (CS-). No US was delivered in the other office (i.e., safety context or CTX-). On Day 2, participants visited both the CTX+ and the CTX- plus a third office (generalization context or G-CTX), which was an equal mix of the other two; in all three offices the two lights, i.e., CS+ and CS-, were presented as well. Based on previous studies (Muhlberger et al., 2014), we expected aversive ratings, startle potentiation and larger SCR to CS+ versus CS- in the fear context, but not in the safety context on Day 1. On Day 2 we expected, based on previous studies (Alvarez et al., 2007; Hermann et al., 2016; Kalisch et al., 2006; Muhlberger et al., 2014), aversive ratings, startle potentiation and larger SCR to CS+ versus CS- in the fear as well as the generalization context, but not in the safety context. Considering that a generalization context elicits conditioned anxiety responses comparable to those in the threatening context, we expected comparable discriminative responses in these two contexts.

Participants
Sixty-seven volunteers participated in the study, which was approved by the ethics committee of the Psychology Faculty of the University of Würzburg and conducted in accordance with the Declaration of Helsinki. Exclusion criteria were age (younger than 18 years or older than 35 years), history of psychiatric or neurological disorders, current use of psychoactive drugs, chronic pain, pregnancy, or color blindness. Students of psychology received course credits and were only included if they were at most in their second semester of bachelor study, because knowledge about conditioning is a possible confounding factor. All other participants received 16 € for taking part in the study. Eighteen participants had to be excluded from the analysis due to technical problems (e.g., the amplifier for the startle probes was not turned on by the experimenter, or the VR program was interrupted by updates), seven because of missing startle responses in one of the conditions, and ten due to low startle amplitude (overall mean amplitude below 5 μV, see Data reduction). The final sample consisted of 32 participants (20 females; mean age = 25.06 years, SD = 3.42).
All participants read and signed an informed consent form. They were informed about the possible side effects of VR (e.g., nausea, sweating, disorientation), the loud noise and that they would receive mild painful electric shocks during the experiment. Participants were told that they could find out the relationship between the electric shock and the stimuli, but there was no further mention of the US contingency.

Unconditioned stimulus (US)
A constant current stimulator (Digitimer DS7A, Digitimer Ltd., Welwyn Garden City, UK) generated mildly painful electric stimuli (50 Hz, 200 ms), which were delivered through two electrodes to the dominant inner forearm and triggered by the software CyberSession (Programmversion VTplus GmbH, Würzburg, Germany; www.cybersession.info). The intensity was individually adjusted by means of two ascending and two descending series of electric stimulations. After each stimulation, participants were asked "how painful is the electric stimulation?" and gave their ratings verbally on a visual analogue scale (VAS) ranging from 0 (no sensation at all) to 10 (very strong pain), with 4 (just noticeable pain) as the anchor for the threshold. The intensity started at 0 mA and was increased or decreased in steps of 0.5 mA. The individual threshold was calculated as the mean of the two first intensities rated as painful in the ascending series and the two last intensities rated as painful in the descending series. The pain threshold was then increased by 30% in order to avoid habituation, which resulted in a mean intensity of 2.58 mA (SD = 1.71) that was rated as painful (M = 5.38, SD = 1.34).
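The threshold rule above can be illustrated with a short sketch. It assumes one reading of the text: the "two first intensities" are the first painful intensity of each of the two ascending series, the "two last intensities" are the last painful intensity of each of the two descending series, and a rating of at least 4 counts as painful. The function names and the example series are hypothetical and are not taken from the study software.

```python
from statistics import mean


def pain_threshold(ascending, descending, pain_rating=4):
    """Estimate the pain threshold (mA) from staircase series.

    Each series is a list of (intensity_mA, rating) pairs in delivery order.
    Assumption: every series contains at least one painful rating.
    """
    # First intensity rated as painful in each ascending series.
    first_painful = [
        next(i for i, r in series if r >= pain_rating) for series in ascending
    ]
    # Last intensity still rated as painful in each descending series.
    last_painful = [
        [i for i, r in series if r >= pain_rating][-1] for series in descending
    ]
    return mean(first_painful + last_painful)


def us_intensity(threshold_ma, increase=0.30):
    """Raise the threshold by 30% to obtain the US intensity."""
    return threshold_ma * (1 + increase)


# Hypothetical ratings for one participant (0.5 mA steps).
asc = [
    [(0.0, 0), (0.5, 1), (1.0, 2), (1.5, 3), (2.0, 4)],
    [(0.0, 0), (0.5, 0), (1.0, 2), (1.5, 4)],
]
desc = [
    [(3.0, 6), (2.5, 5), (2.0, 4), (1.5, 3)],
    [(3.0, 6), (2.5, 4), (2.0, 3)],
]
threshold = pain_threshold(asc, desc)
intensity = us_intensity(threshold)
```

In this hypothetical example the four values entering the mean are 2.0, 1.5, 2.0 and 2.5 mA, so the threshold is 2.0 mA and the US intensity is 2.6 mA.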

Contextual stimuli (CTX)
We used a previously described VR environment. Briefly, it consisted of two virtual offices separated by a corridor (see Fig. 1). The offices had the same floor plan, but differed in the arrangement of the furniture. The aversive US was delivered in one office (fear context, CTX+), but never in the other office (safety context, CTX-). The assignment of the offices to CTX+ and CTX- was counterbalanced across participants. A third virtual office was a mix of the other two and contained 50% of the furniture from one office and 50% of the furniture from the other office, equally distributed in the room.

Cue stimuli (CS)
In the middle of each office, we positioned a lamp. When the lamp was turned on, the office was illuminated by either a blue or a yellow light for 8 s (Fig. 1).

Startle probes
The acoustic startle stimulus was a 103 dB burst of white noise presented binaurally via headphones for 50 ms (see Procedure for the number and timing of the startle probes).

Ratings
Participants verbally rated the virtual offices alone or with the lights after each experimental phase (see Procedure). For this purpose, a screenshot of a room or a screenshot of a room illuminated by one colored light was presented. Participants were instructed to imagine being inside this virtual room and were then asked to rate the valence and arousal of the virtual stimuli, their anxiety or fear as well as their expectancy of the US. Each question was presented for 10 s. The questions were asked in German and referred either to the rooms alone (i.e., four questions for each of the three contexts, altogether 12) or to the lights (i.e., four questions for CS+ and four for CS- in each of the three contexts, altogether 24). We report here the translations of the questions. For the valence ratings, we asked: "how negative versus positive was the office?" or "how negative versus positive was the light?"; for the arousal ratings: "how intense was your arousal in this office?" or "how intense was your arousal by this light?"; for fear ratings: "how strong was your fear in the office?" or "how strong was your fear by this light?"; and for the US-expectancy ratings: "how high is the probability that you received an electric shock in this office?" or "how high is the probability that you received an electric shock by this light?". Below each question, a VAS ranging from 0 to 100 was presented. Zero meant "negative", "calm", "no fear" or "no association" for the valence, arousal, fear, and US-expectancy ratings, respectively, while 100 meant "positive", "intense", "strong fear" and "perfect association", respectively.

Questionnaires
Participants completed the German versions of several questionnaires (Table 1 and Supplementary Material). The Igroup Presence Questionnaire (IPQ, Schubert et al., 2001) measures the presence of participants in the VR, which refers to the immersion in the virtual environment eliciting the conviction that the participant is actually located in the virtual environment (Sanchez-Vives and Slater, 2005). The Anxiety Sensitivity Index (ASI, Alpers and Pauli, 2001) measures the individual anxiety sensitivity, defined as fear of anxiety- or arousal-related sensations such as increased heart rate. Both IPQ and ASI were filled in at the end of the second day. The State-Trait Anxiety Inventory (STAI, Laux et al., 1981) consists of 20 items for the trait part and 20 items for the state part and measures the individual general anxiety. The Positive and Negative Affect Schedule (PANAS, Krohne et al., 1996) was used to determine current positive and negative mood on 10 items.
The STAI trait and state (Laux et al., 1981) as well as PANAS (Krohne et al., 1996) were filled out at the beginning of the experiment. Moreover, STAI state and PANAS were completed for each day both at the beginning and at the end.

Procedure
Participants came to the laboratory on two consecutive days at the same time of day. On Day 1, they sat on a comfortable chair while signing the informed consent and completing a demographic questionnaire as well as the trait and the state parts of the STAI and the PANAS. Afterwards the electrodes were attached (see Data reduction), participants put on the VR glasses, and the pain-threshold workup was conducted as described above while participants saw a black screen through the VR glasses. The experimental protocol for Day 1 consisted of three phases separated by ratings (Fig. 1).
During the exploration phase, participants could freely navigate by means of a joystick through the two to-be-conditioned offices for 2 min each, but not through the generalization office (G-CTX). Each of the two colored lights was turned on once for 8 s in each office. No electric shock or startle probe was delivered.
M. Andreatta, et al. International Journal of Psychophysiology 155 (2020) 140-151
In order to habituate the great initial reactivity of the startle response (Blumenthal et al., 2005), seven startle probes were delivered every 7-14 s. Afterwards, the two identical acquisition phases (Acquisition 1 and Acquisition 2) started. During these phases, participants were passively guided through the virtual offices on one of two prerecorded paths, played alternately. Participants could still move their heads freely. All paths started from the corridor (inter-trial interval, ITI) and entered one virtual room after 20 s for 140 s (one trial). Each acquisition phase consisted of four trials, i.e., two entries to each virtual room. In each office, the blue and the yellow lights were each turned on and off three times. The inter-stimulus interval (ISI, i.e., the interval between one light offset and the onset of the next light) varied randomly between 10 and 20 s (M = 15 s). In one office (fear context, CTX+), participants received the US at every offset of one light (CS+), but never at the offset of the other light (CS-). In the other office (safety context, CTX-), both lights (i.e., CS+ and CS-) were presented but never in association with the US. Lights and contexts were counterbalanced among participants. Additionally, during each trial three startle probes were presented during the cues (5-6 s after CS onset, i.e., startle probes were delivered during three out of six CS presentations), three when the lights were off, i.e., during the context alone (9-15 s after CS offset), and two during the ITIs (7-10 s after the start of the prerecorded path). USs and startle probes were never delivered during the first and the last 7 s of a room visit in order to prevent a specific association between these aversive stimuli and the doors. Moreover, the time intervals between two startle probes, between two USs, or between a startle probe and a US were at least 10 s.
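The two timing restrictions for aversive events within a room visit (no US or startle probe in the first or last 7 s of the 140 s visit, and at least 10 s between any two such events) can be expressed as a simple check. This is only a sketch: the function name is hypothetical, times are assumed to be in seconds relative to room entry, and the actual scheduling in the VR software is not described in the text.

```python
def check_room_schedule(event_times, visit_duration=140.0, edge=7.0, min_gap=10.0):
    """Return True if the aversive events (USs and startle probes) of one
    room visit respect both timing restrictions.

    event_times: onset times in seconds relative to room entry (assumption).
    """
    times = sorted(event_times)
    # No event during the first or the last `edge` seconds of the visit.
    inside = all(edge <= t <= visit_duration - edge for t in times)
    # At least `min_gap` seconds between any two consecutive events.
    spaced = all(later - earlier >= min_gap for earlier, later in zip(times, times[1:]))
    return inside and spaced


ok_schedule = check_room_schedule([10, 25, 60, 120])   # valid
too_early = check_room_schedule([5, 30])               # event in first 7 s
too_close = check_room_schedule([20, 28])              # events only 8 s apart
too_late = check_room_schedule([134])                  # event in last 7 s
```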
Twenty-four hours later, participants returned to the laboratory and, after they had filled in the STAI state and the PANAS again, the electrodes were attached as on Day 1. Moreover, the averseness of the US was retested by delivering one electric shock with the intensity determined on Day 1. If participants reported the US as non-painful (< 4 on the VAS), the intensity was increased in steps of 0.5 mA until it was rated as painful. On Day 2, the mean US intensity was 2.61 mA (SD = 1.69) and, as on Day 1, was rated as painful (M = 4.50, SD = 1.30). After this procedure, we verified participants' contingency awareness regarding both the offices (i.e., CTX+ and CTX-) and the lights (i.e., CS+, CS-) encountered on Day 1. Then, two identical generalization test phases (Generalization 1 and Generalization 2) started. Participants were passively guided into all three virtual rooms (i.e., CTX+, CTX-, and G-CTX) on two different prerecorded paths for each office. In each phase, each context was entered twice (altogether 12 trials, six in each generalization test phase). During each visit of an office, the blue and the yellow light were each presented three times. In order to prevent extinction learning, the US was delivered once at CS+ offset during the last trial in the CTX+ of the first generalization phase. Notably, the CS-US contingency was reduced in order to prevent safety learning in the generalization context. Furthermore, five startle probes were delivered in each virtual room during each generalization phase (i.e., altogether 10 per context) in an unpredictable manner exactly as described for the acquisition phases. Five additional startle probes were presented during the lights in each generalization phase (i.e., startle probes were delivered during five out of six CS presentations), as described above.
Fig. 1. Sketch of the experimental protocol. Participants had to come on two consecutive days. On Day 1, they first explored two virtual offices freely by means of a joystick. Then two identical acquisition phases were conducted in which, in one office (fear context or CTX+, dark grey boxes), mild painful electric shocks (US) were delivered at the offset of one colored light (CS+, blue bars), but never at the offset of another light (CS-, yellow bars). In the other office (safety context or CTX-, light grey boxes), the two lights were presented but not the US. On Day 2, participants re-visited both CTX+ and CTX- as well as an additional context (generalization context or G-CTX, grey bars), which was an equal mix of the fear and safety contexts. In each context, the two lights were alternately turned on and off. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Importantly, the sequence of the rooms and the lights in all four phases (Acquisition 1, Acquisition 2, Generalization 1, Generalization 2) was pseudo-randomized with the restriction that the same office would not be entered more than twice in a row and the same light would not turn on more than twice in a row.
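One simple way to obtain a sequence that satisfies such a restriction is rejection sampling: shuffle the items and keep the first order in which no item repeats more than twice in a row. The sketch below illustrates this idea for the six light presentations of one room visit. It is an assumption about how such a sequence could be generated, not a description of the procedure actually used, and all names are hypothetical.

```python
import random


def max_run(seq):
    """Length of the longest run of identical consecutive items."""
    longest = run = 0
    prev = object()  # sentinel that equals no real item
    for item in seq:
        run = run + 1 if item == prev else 1
        prev = item
        longest = max(longest, run)
    return longest


def pseudo_randomize(items, max_repeats=2, rng=None, max_tries=10_000):
    """Shuffle `items` until no item occurs more than `max_repeats` times in a row."""
    rng = rng or random.Random()
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if max_run(order) <= max_repeats:
            return order
    raise RuntimeError("no valid order found; restriction may be unsatisfiable")


# Three presentations of each light within one room visit.
lights = ["blue"] * 3 + ["yellow"] * 3
order = pseudo_randomize(lights, rng=random.Random(0))
```

The same function could be applied to the sequence of room entries within a phase by passing the context labels instead of the light colors.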

Data reduction
Physiological responses were continuously recorded with a V-Amp 16 amplifier and Vision Recorder Software (Version 1.03.0004, BrainProducts Inc., Munich, Germany). A sampling rate of 1000 Hz and a notch filter at 50 Hz were applied. The offline analyses were conducted with the Brain Vision Analyzer (Version 2.0, BrainProducts Inc., Munich, Germany).
The startle response was measured with electromyogram (EMG) from the M. orbicularis oculi with two 5 mm Ag/AgCl electrodes placed below the left eye following guidelines (Blumenthal et al., 2005). The EMG was filtered offline with a 28 Hz low-cutoff filter and a 400 Hz high-cutoff filter. Then, it was rectified and a moving average of 50 ms was applied to smooth the signal. The signal was then segmented for each phase, virtual room and colored light from 50 ms before to 1 s after startle-probe onset. After the baseline correction (50 ms before probe onset), the startle responses were manually scored and trials with excessive shifts (≥ 5 μV) within the 50 ms baseline were excluded from analysis. The person who scored the startle responses was blind to the conditions. Seven participants had to be excluded from the analysis because no startle response was detected in one of the conditions. Startle amplitude was defined as the maximum peak between 20 ms and 120 ms after probe onset. Participants with a mean startle response < 5 μV were coded as non-responders and excluded from the analysis (N = 10). The raw data were then within-subject transformed to z-scores and then to T-scores, separately for each day (see also Andreatta et al., 2017; Golkar and Öhman, 2012; Klinke et al., 2020). The T-scores were averaged for each condition (i.e., CS+ and CS- in CTX+, CTX- and G-CTX) and each phase (i.e., Acquisition 1, Acquisition 2, Generalization 1, Generalization 2), separately.
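The within-subject standardization can be sketched as follows, using the conventional definition of T-scores (T = 50 + 10 z). The sketch assumes that z-scores are computed over all startle amplitudes of one participant on one day and that the sample standard deviation is used; neither detail is specified in the text, and the function name is hypothetical.

```python
from statistics import mean, stdev


def to_t_scores(amplitudes):
    """Convert one participant's raw startle amplitudes (one day) to T-scores.

    z = (x - participant mean) / participant SD, then T = 50 + 10 * z.
    Assumption: sample SD (n - 1 in the denominator).
    """
    m = mean(amplitudes)
    sd = stdev(amplitudes)
    return [50 + 10 * (a - m) / sd for a in amplitudes]


# Hypothetical raw amplitudes in microvolts.
t_scores = to_t_scores([10.0, 20.0, 30.0])
```

For these hypothetical amplitudes the participant mean is 20 and the SD is 10, so the T-scores are 40, 50 and 60. After this transformation every participant has a mean of 50, which removes between-subject differences in overall startle magnitude before the scores are averaged per condition.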
Electro-dermal activity (EDA) was measured with two 8 mm Ag/AgCl electrodes, one fixed on the thenar and the other one on the hypothenar of the non-dominant hand (Boucsein et al., 2012). A 1 Hz high-cutoff filter was applied offline. Considering that startle probes strongly influence the SCR to the CS onset and considering the elevated number of startle probes during each condition (i.e., in the acquisition phases two out of three CS presentations were associated with a startle probe), we analyzed skin conductance responses (SCR) to the startle probes (Vrana, 1995) in order to have a sufficient number of responses. For SCR amplitude, we calculated the difference (in μS) between response onset (i.e., the first deflection of the EDA between 1 and 4 s after the startle probe) and the first following response peak (Boucsein et al., 2012). Responses lower than 0.02 μS were scored as zero. Responses were then range corrected separately for each day (see also Sperl et al., 2016). As for the startle response, we calculated mean scores for each CS (i.e., CS+ and CS-) within each context (i.e., CTX+, CTX-, G-CTX) separately for each phase (Acquisition 1, Acquisition 2, Generalization 1, Generalization 2). Although the underlying learning mechanisms of the startle response and the SCR may be quite distinct (Hamm and Weike, 2005; Vrana, 1995), we kept the exclusion criteria mentioned above for the statistical analysis of SCR, and excluded two additional participants because they were labelled as non-responders (i.e., mean SCR amplitude < 0.02 μS).
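The SCR scoring steps can be sketched as follows. The exact range-correction formula is not given in the text; the sketch assumes the common variant in which each response is divided by the participant's largest response on that day. The function name and the example amplitudes are hypothetical.

```python
def score_scr(amplitudes_us, min_response=0.02):
    """Score one participant's SCR amplitudes (in microsiemens) for one day.

    1. Responses below `min_response` are scored as zero.
    2. Range correction (assumption): divide by the participant's maximum
       response of that day, so that scores lie between 0 and 1.
    """
    scored = [a if a >= min_response else 0.0 for a in amplitudes_us]
    max_response = max(scored, default=0.0)
    if max_response == 0.0:
        # No response reached the minimum criterion; nothing to scale.
        return scored
    return [a / max_response for a in scored]


# Hypothetical amplitudes: the first one falls below the 0.02 criterion.
corrected = score_scr([0.01, 0.2, 0.4])
```

In this example the corrected scores are 0, 0.5 and 1.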

Statistical analyses
The statistical analysis was performed with the software SPSS (Version 20.0, SPSS Inc.). Startle responses, SCR, valence, arousal, fear and US-expectancy ratings were analyzed with separate ANOVAs for acquisition and generalization phase respectively.
The ANOVAs for the acquisition day had CS (CS+, CS-), context (CTX+, CTX-) and phase (habituation, Acquisition 1, Acquisition 2) as within-subject factors for valence, arousal and fear ratings, while for US-expectancy ratings, SCR and startle responses the within-subject factors were CS (CS+, CS-), context (CTX+, CTX-) and phase (Acquisition 1, Acquisition 2). The ANOVAs for the generalization day were calculated with the within-subject factors CS (CS+, CS-), context (CTX+, CTX-, G-CTX) and phase (Generalization 1, Generalization 2) for all variables. Simple contrasts were calculated as post-hoc tests for significant interactions and were Bonferroni corrected. Additionally, we analyzed responses to the contexts alone and report these results in the Supplementary Material.
Based on a reviewer's suggestion, we examined in an exploratory manner the modulatory role of trait anxiety (Haddad et al., 2012) as well as anxiety sensitivity (Hunt et al., 2019) on context-dependent fear generalization. For this purpose, mean scores across the two generalization test phases were calculated for each dependent variable. CS- mean scores were then subtracted from CS+ mean scores, separately for each context. Pearson correlations (Bonferroni-corrected α = 0.017) were calculated between the differential scores and STAI scores or ASI scores.
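The exploratory analysis can be sketched in a few lines: differential scores (CS+ minus CS-) are computed per participant, correlated with a questionnaire score, and evaluated against a Bonferroni-corrected threshold. The sketch assumes that the corrected threshold results from dividing 0.05 by the three contexts (0.05 / 3 ≈ 0.017), which matches the reported value but is not stated explicitly. All names and data are hypothetical.

```python
from math import sqrt
from statistics import mean

# Assumption: three tests (one per context) per questionnaire.
ALPHA_CORRECTED = 0.05 / 3


def differential_scores(cs_plus, cs_minus):
    """CS+ minus CS- for each participant (lists in the same participant order)."""
    return [p - m for p, m in zip(cs_plus, cs_minus)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical startle T-scores in one context and questionnaire scores.
cs_plus = [55.0, 60.0, 52.0, 58.0]
cs_minus = [50.0, 48.0, 51.0, 49.0]
questionnaire = [18.0, 30.0, 12.0, 25.0]
diff = differential_scores(cs_plus, cs_minus)
r = pearson_r(diff, questionnaire)
```

A correlation would then count as significant only if its p value falls below the corrected threshold.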
Partial η² values are reported as effect sizes. The analysis for the contexts without the cues is reported in the Supplementary Material.
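For a single effect, partial η² can be recovered from the F value and its degrees of freedom, because partial η² = SS_effect / (SS_effect + SS_error) and F = (SS_effect / df_effect) / (SS_error / df_error). The short sketch below (hypothetical function name) applies this identity, which allows readers to check the reported effect sizes against the reported F statistics.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# F(2, 62) = 9.48 and F(1, 31) = 14.21 are values reported in the Results.
eta_arousal = partial_eta_squared(9.48, 2, 62)
eta_startle = partial_eta_squared(14.21, 1, 31)
```

Rounded to three decimals, these evaluate to 0.234 and 0.314, in agreement with the reported values.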

Physiological responses. The interaction Context × CS (Fig. 2) was not significant for the physiological responses (all p values > 0.245).
Lastly, the interaction Context × Phase (Table 3) was significant for arousal (F(2, 62) = 9.48, GG-ε = 0.774, p = 0.001, partial η² = 0.234), fear (F(2, 62) = 9.85, p < 0.001, partial η² = 0.241) and US-expectancy (F(1, 31) = 5.26, p = 0.029, partial η² = 0.145) ratings.
Fig. 2. Bars depict the means (with s.e.m.) of conditioned responses, while the slightly transparent geometric shapes depict the individual responses for (a.) valence, (b.) arousal, (c.) fear and (d.) US-expectancy ratings as well as (e.) startle and (f.) skin conductance responses to CS+ (grey bars and triangles) and CS- (white bars and squares) separately for the fear and safety contexts (CTX) for the acquisition day. Follow-up simple contrasts for the significant CS × Context interaction indicated that discriminative responses to CS+ versus CS- are evident in the fear context, but not in the safety context. **p < 0.01; ***p < 0.001.
[...] US-expectancy ratings as well as (c.) startle responses to CS+ (grey bars and triangles) and CS- (white bars and squares) separately for fear and safety contexts (CTX). The ANOVAs for acquisition day returned significant CS × Context × Phase interactions for fear ratings, US-expectancy and startle responses, demonstrating that discriminative responses to CS+ versus CS- are evident in the fear context, but not in the safety context on the ratings, while on the physiological level such discriminative responses are less strong. + p < 0.05; **p < 0.01; ***p < 0.001.

Physiological responses. The interactions CS × Phase
Interestingly, the main effect of CS was significant for startle responses (F(1, 31) = 14.21, p = 0.001, partial η² = 0.314), but not for SCR (F(1, 29) = 0.63, p = 0.433, partial η² = 0.021), indicating that the CS+ elicited startle potentiation (i.e., stronger physiological fear) independently of the context in which it was presented, which is in line with arousal and fear ratings. The main effect of context was also significant for startle responses (F(2, 62) = 7.03, p = 0.002, partial η² = 0.185), but not for SCR (F(2, 58) = 0.92, p = 0.403, partial η² = 0.031). Post-hoc Bonferroni-corrected (α = 0.025) simple contrasts returned comparable startle responses for the generalization context and the safety context (F(1, 31) = 1.79, p = 0.191, partial η² = 0.055), suggesting a dissociation from the ratings, that is, safety-like physiological responses. On the other hand, and in parallel to the ratings, the fear context elicited slight startle potentiation (F(1, 31) = 5.22, p = 0.029, partial η² = 0.144) as compared to the safety context.
The ASI scores correlated positively with the startle differential score in the safety context (r(31) = 0.42, p = 0.016), but no other correlations were found (all p values > 0.175). Hence, on the physiological level, the more anxiety sensitive participants were, the better they discriminated between threat and safety signals, but only when the context was clearly safe.

Discussion
In this study, we investigated acquisition and generalization processes of conditioned fear mediated by contextual features in a two-day virtual reality paradigm. On Day 1, healthy participants learned an association between a cue (a colored light, CS+) and an aversive event (US) in one virtual office (fear context), but not in a second virtual office (safety context). In both virtual rooms, a second colored light (CS-) was turned on and off as well, but never in association with the US. As expected and in line with previous studies (Alvarez et al., 2007; Hermann et al., 2016; Huff et al., 2011; Kalisch et al., 2006; Muhlberger et al., 2014), discriminative responses between CS+ and CS- indicating successful conditioning were evident in the fear, but not in the safety context. Thus, the CS+ as compared to the CS- was rated as more negative, arousing and frightening, and triggered stronger US expectancies in the fear context only. Importantly, such discriminative responses in the fear context only were observed in the startle responses too (i.e., the CS+ versus the CS- elicited slight startle potentiation, although these comparisons did not survive Bonferroni correction).
The difference between a clear and quick acquisition of context-modulated conditioned fear on the cognitive level (i.e., ratings) versus a less clear and slower context-modulated fear acquisition on the physiological level (i.e., startle responses) may be due to distinct underlying processes. According to the two-level account hypothesis (Hamm and Weike, 2005) and the reflective-impulsive model (Strack and Deutsch, 2004), physiological responses, and in particular the startle reflex, work in an automatic-impulsive manner, while verbal responses depend on cognitive-reflective processes. Although the six learning trials were sufficient to learn the relations between cues, contexts and aversive events on a cognitive-reflective level, they were not enough to have an impact on the automatic system, resulting in non-discriminative physiological arousal (i.e., SCR). Future studies should test the resulting hypothesis that more complex learning conditions such as context-modulated fear conditioning are more easily acquired by the cognitive-reflective than by the automatic-impulsive system, perhaps because only the former requires cortical processes.
[Figure caption fragment: ... startle and (f.) skin conductance responses to CS+ (grey bars and triangles) and CS- (white bars and squares) separately for fear, generalization and safety contexts (CTX). Bars depict the means (with s.e.m.), while both the squares and the triangles represent individual responses. The significant CS × Context interaction for generalization test phases suggests that twenty-four hours later discriminative responses to CS+ versus CS- were still slightly evident in the fear context, but not in the safety context. Participants generalized conditioned fear. + p < 0.05; **p < 0.01.]
The generalization phase partly confirmed our hypothesis that discriminative fear responses to cues emerge in the generalization context. Although significant CS+ versus CS- differences were found for arousal and fear ratings and startle potentiation independently of the context (i.e., equally in all three virtual rooms), we also observed context-dependent differences in responses to the CS+. Specifically, context-dependent generalization was indicated by the finding that the CS+ was rated as more negative and fear eliciting in both the fear and the generalization context compared to the safety context. We speculate that our participants expected to receive the aversive US again on Day 2 as the electrodes for the electric shock were re-applied, and they might have considered the generalization context, which shared half of the furniture with the fear context, as a new fear context. Similar effects were not found for startle responses, indicating that 24 h after learning, these responses may have become less sensitive to contextual information when responding to the cues. Notably, previous studies reported that physiological fear responses to CS+ are modulated by context, as they were observed in a novel context as in a fear context (Huff et al., 2011; Muhlberger et al., 2014; Sjouwerman et al., 2015). Considering the main effect of context on startle responses, we observed startle potentiation in the fear context as compared to the safety context, while startle responses to the generalization and safety contexts were comparable. Although the learning paradigm largely differed, this effect replicates our previous findings (Andreatta et al., 2019; Andreatta et al., 2017). Therefore, this lack of contextual generalization raises the question of why startle responses are potentiated to the CS+ in the generalization context as in the fear context, but are not potentiated to the generalization context (i.e., to the context alone, without any cue) as they are to the fear context.
One possible reason may be the nature of the startle response, which is a very short-lived phasic fear response triggered by distinct stimuli (Davis et al., 2010). Therefore, it seems possible that startle responses are more strongly modulated by discrete stimuli and less strongly by complex contexts, which consist of numerous elements (Rudy, 2009).
Our hypothesis that the CS+ triggers no fear responses in the safety context was not confirmed consistently. Although less pronounced than in the fear context, participants reported stronger arousal and fear to CS+ versus CS- in the safety context. In contrast, US-expectancy ratings assessed in the safety context indicated comparable (no) expectations of the US for both cues and a significant reduction in US-expectancy for the CS+ when presented in the safety context as compared to the fear context. As previously discussed (Muhlberger et al., 2014), in this learning protocol the lights (i.e., the CSs) are foreground and the rooms (i.e., the CTXs) are background; the preeminence of the cues over the contexts might therefore be the reason for these weak cue effects in the safety context. In line with this idea, twenty-four hours later, right after arriving in the laboratory, participants correctly remembered to which light and in which context the aversive US had been delivered the day before. However, they also indicated (although to a lesser degree) higher expectancy of the US for CS+ versus CS- in the safety context. Again, we think that this effect might be related to the preeminence of the lights as the most relevant and informative signals about the threat.
Having asked about US-expectancy at the beginning of Day 2 may have strongly modulated participants' awareness. In fact, already after the first generalization test phase, no association between the lights and the electric shock was reported in any room. In line with previous studies (Ahmed and Lovibond, 2015; Duits et al., 2017; Raes et al., 2014; Vervliet et al., 2010), giving specific instructions about CS-US contingencies may direct attention to the relationship between these stimuli and consequently modulate the conditioned responses. In these studies, participants reported even stronger fear to a CS+ after being instructed that this stimulus predicts an aversive event (Duits et al., 2017; Raes et al., 2014). Furthermore, participants generalized their fear to stimuli sharing a physical property with the CS+, which was verbally instructed to be associated with the US (e.g., the color of the shape; Ahmed and Lovibond, 2015; Vervliet et al., 2010). Considering the low contingency between the CS+ and the US (one time on Day 2 versus 12 times on Day 1), participants may have concluded that the two stimuli were no longer associated. In contrast, the affective ratings (i.e., valence and fear ratings) revealed slightly discriminative responses between CS+ and CS- in the fear, but not in the safety context. As mentioned before, participants still received the US at CS+ offset once during the one CTX+ trial. This single pairing seemed to be enough to delay "affective", but not "cognitive", extinction (Craske et al., 2009; Milad and Quirk, 2012; Quirk and Mueller, 2008).
Of course, this study has some strengths and some limitations. The use of VR allowed us to investigate the interaction between cue- and context-learning in a highly controlled experimental setting, which is otherwise difficult to implement. First, although comparable in size to previous studies (Huff et al., 2011; Muhlberger et al., 2014), our sample may have been small for this complex experimental protocol. Secondly, in this study the generalization context was an equal and perfect mix of the fear and the safety rooms. However, it would be interesting to have rooms sharing properties of the learning contexts to a different degree (e.g., 75% of the fear context). Third, unexpectedly we found no effects on SCR. Although these negative results might be due to a lack of power, there are two other possible explanations. On the one hand, the underlying neuronal and cognitive mechanisms modulating the EDA are distinct from those underlying the startle response (Hamm and Weike, 2005; Vrana, 1995). Therefore, such dissociative effects may reflect the distinct mechanisms. On the other hand, delivering startle probes during a classical conditioning protocol strongly modulates the acquisition and the expression of conditioned fear on the other dependent variables (Sjouwerman et al., 2016). Therefore, it is possible that the lack of observed effects on the SCR in our study is also due to the delivery of the startle probes. Fourth, our study is in line with the guidelines (Blumenthal et al., 2005) and the majority of studies measuring startle responses (e.g., Golkar and Öhman, 2012; Grillon et al., 2008; Lipp et al., 1994), but the additional recording of an electro-oculogram (EOG) would be advisable for a more precise distinction between startle responses and spontaneous blinks (Gehricke et al., 2002). Lastly, our participants were young and healthy individuals showing generalization of conditioned fear to an ambiguous new context. Future studies should apply a similar experimental protocol to a subclinical or even clinical sample of anxiety disorder patients in order to better detect possible altered learning and generalization mechanisms (Craske et al., 2009; Mineka and Oehlberg, 2008).
In summary, we found a clear influence of the context on conditioned fear on both the verbal-reflective level of responses (i.e., ratings) and, to a lesser degree, on the physiological and automatic level of responses (i.e., startle responses). Thus, participants showed stronger discriminative responses between CS+ and CS- in the fear than in the safety virtual office. Moreover, such discriminative learning was maintained 24 h later and extinguished after a few trials in which only one US was delivered. Lastly and most importantly, participants showed context-dependent generalization of conditioned fear. In conclusion, generalization processes depend not only on the physical or semantic properties of the cues, but also on their interaction with contextual information. These findings are particularly interesting as the use of VR in clinical settings may facilitate the translation of real situations (Baas et al., 2004) and consequently make it possible to target altered context-dependent generalization mechanisms in ways not otherwise possible in traditional therapy.

Declaration of competing interest
PP is a shareholder of a commercial company that develops virtual environment research systems for empirical studies in the field of psychology, psychiatry, and psychotherapy. No further potential conflicting interests exist.