The varieties of inner speech questionnaire – Revised (VISQ-R): Replicating and refining links between inner speech and psychopathology

Highlights
• Exploratory factor analysis supports a new five-factor inner speech questionnaire.
• Positive/regulatory features of inner speech separate from evaluation and criticism.
• Inner speech features can be consistently related to psychopathological traits.


Introduction
Inner speech, or the act of talking to yourself in your head, is an experience that is as familiar as it is elusive. While many people will report frequent inner speech (also known as silent speech or verbal thinking), operationalizing inner speech as a measurable process is extremely challenging, with some doubting that it is even possible (Schwitzgebel, 2008). It is, however, necessary for understanding the role of inner speech in psychological processing, whether as the day-to-day narrator of conscious experience, a putative facilitator of abstract thought, or a potential indicator of a developing psychopathology (Alderson-Day & Fernyhough, 2015a).
Inner speech has been extensively studied as a cognitive process in research on executive functioning in children and adults, and has been linked to verbal working memory (via rehearsal; Baddeley, 2012), planning (Williams, Bowler, & Jarrold, 2012), inhibition (Tullett & Inzlicht, 2010), and cognitive flexibility (Emerson & Miyake, 2003). Typically, such research has involved blocking the production of inner speech during cognitive tasks, via distractions such as articulatory rehearsal. In contrast, the day-to-day experience of inner speech has been assessed using self-report methods, such as questionnaires and experience-sampling, which have
https://doi.org/10.1016/j.concog.2018.07.001

Varieties of inner speech questionnaire - Revised (VISQ-R)
Phenomenological properties of inner speech were evaluated with the 35-item Varieties of Inner Speech Questionnaire - Revised (VISQ-R). In addition to the original 18 items on dialogicality, evaluation, condensation, and the presence of other people in inner speech, the extended version contained new items on literal and metaphorical use of language, speaker positioning and address, and regulation of different moods. The new items were derived in an iterative manner by a working group including the original VISQ authors (SMJ & CF), plus members of the research team with expertise in phenomenology, analytic philosophy, and cognitive science (BAD & SW). A large set of items (> 50) was initially generated in response to the aims of the scale, and these were then refined down to a manageable number for exploratory testing.
Responses were made on a seven-point frequency scale where respondents evaluated how frequently the inner speech experiences occurred, ranging from "Never" (1) to "All the time" (7). This differed from the original 6-point rating scale based on agreement with statements (e.g. "Certainly applies to me"), in an attempt to be more precise about how often such experiences occur (cf. Hurlburt et al., 2013). Each subscale of the original VISQ has previously shown high internal reliability (Cronbach's α > .80) and moderate to high test-retest reliability (> .60). See Table 2 for a list of old and new items.

Revised Launay-Slade hallucination scale (LSHS-R; McCarthy-Jones & Fernyhough, 2011; Morrison, Wells, & Nothard, 2000)
This scale included nine items used by McCarthy-Jones and Fernyhough (2011), adapted from Morrison et al.'s (2000) Revised Launay-Slade Hallucination Scale, including five auditory hallucination and four visual hallucination statements (e.g. "I have had the experience of hearing a person's voice and then found that there was no one there"). Ratings are made on a four-point Likert scale ranging from "Never" (1) to "Almost always" (4). Scores can range from 9 to 36, where higher scores indicate greater hallucination-proneness. The auditory and visual subscales of the LSHS-R have been shown to have adequate internal reliability (Cronbach's α > .70, e.g. McCarthy-Jones & Fernyhough, 2011).

Hospital anxiety and depression scale (HADS; Zigmond & Snaith, 1983)
Levels of anxiety and depression were assessed with the 14-item Hospital Anxiety and Depression Scale (HADS; Zigmond & Snaith, 1983). This scale comprises seven items relating to anxiety (e.g. "I get sudden feelings of panic") and seven items relating to depression (e.g. "I have lost interest in my appearance"). Responses are made on a four-point Likert scale, with total scores for each subscale ranging from 0 to 21. Higher scores indicate higher levels of anxiety and depression. This scale has been used extensively and shown to have satisfactory psychometric properties (Zigmond & Snaith, 1983).
2.2.4. Dissociative experiences scale - Second revision (DES-II; Carlson and Putnam, 1993)
Frequency of dissociative experiences was measured with the 28-item self-report Dissociative Experiences Scale. Participants are asked to indicate what percentage of the time they experience dissociative states, such as feelings of derealisation or absorption (e.g. "Some people sometimes have the experience of feeling that their body does not belong to them"). Answers can range from 0% to 100%. The original DES has a mean internal reliability of .93 (van IJzendoorn & Schuengel, 1996).

2.2.5. Rosenberg self-esteem scale (RSES; Rosenberg, 1965)
The 10-item Rosenberg Self-Esteem Scale includes five positive and five negative statements on self-concept and social rank (e.g. "On the whole, I am satisfied with myself."). Responses are made on a four-point scale ranging from "Strongly agree" to "Strongly disagree". The scale has been shown to have high test-retest and internal reliability (e.g. Fleming & Courtney, 1984).

Data analysis
All data were analysed in SPSS 20, with the exception of the confirmatory factor analysis (which was conducted using AMOS 22) and the mediation analysis (for which we used the "medmod" package in jamovi, Version 0.9). Relations between variables were assessed using Pearson's product-moment correlation coefficients and hierarchical linear regression. Alpha values for correlations were Bonferroni corrected to account for multiple comparisons.

Sample 1: Exploratory factor analysis of VISQ-R (35 items, n = 1472)
Missing data constituted less than 1% of total responses for the extended VISQ and were replaced by mean values. Exploratory factor analysis (EFA) was performed using Principal Component Analysis (PCA) with oblique rotation (direct oblimin), allowing for inter-correlation of the VISQ factors. Items were removed based on communalities under 0.4 or if they failed to load > 0.5 onto a single factor (Costello & Osborne, 2005). For all of the below models, KMO statistics and Bartlett's test values were within acceptable ranges.
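The item-retention rule described above can be illustrated with a short filter. This is a minimal sketch rather than the procedure actually run in SPSS: it assumes communalities and rotated loadings have already been exported, and the function name `items_to_drop`, the dictionary layout, and the toy values are hypothetical. It treats an item as failing the loading criterion when no factor loading exceeds 0.5 in absolute value; cross-loadings are not modelled.

```python
def items_to_drop(communalities, loadings, min_communality=0.4, min_loading=0.5):
    """Return the ids of items that fail the retention rule.

    communalities: {item_id: communality}
    loadings:      {item_id: [loading on factor 1, loading on factor 2, ...]}
    An item is dropped if its communality is below `min_communality`, or if
    its largest absolute loading does not exceed `min_loading`.
    """
    dropped = []
    for item, h2 in communalities.items():
        best_loading = max(abs(value) for value in loadings[item])
        if h2 < min_communality or best_loading <= min_loading:
            dropped.append(item)
    return sorted(dropped)


# Toy values for three hypothetical items (not taken from the study data).
toy_communalities = {1: 0.62, 2: 0.35, 3: 0.55}
toy_loadings = {1: [0.78, 0.10], 2: [0.52, 0.20], 3: [0.45, 0.41]}
to_drop = items_to_drop(toy_communalities, toy_loadings)
print(to_drop)
```

In this toy case item 2 fails on communality and item 3 fails on loading, so both would be removed before the model is re-estimated.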
The initial solution returned by the analysis produced 8 factors with eigenvalues over 1 (68% of variance explained). However, items 24, 26, 29, and 33 had either low communalities or low factor loadings, and these were subsequently removed from the analysis. The following model identified 7 factors with eigenvalues over 1 and accounted for a greater amount of variance (72%), but this included one factor with only one item (item 23). Inspection of the scree plot suggested five main factors with eigenvalues over 2, with two more factors barely above 1. Forcing a five-factor solution (62% variance explained) led to five items not loading on a single factor (items 21, 22, 23, 30, and 34), of which three also failed the communalities test (items 22, 23, and 30).
With a total of nine items removed, a stable model was found with five factors (68.43% of variance explained), with all items loading on at least one factor over 0.5 and with all communalities > 0.4 (KMO statistic = 0.895, Bartlett's χ²(325) = 23773.10, p < .001). The five factors, displayed in Table 3, broadly mapped on to the original VISQ structure but with an expanded, 6-item evaluative/critical factor (Items 20, 23 and 24 are new) and a new, 4-item positive/regulatory factor (Items 19, 22, 25 and 26 are new). As for the previous VISQ, internal reliability of each factor was excellent (dialogic: 0.87, evaluative/critical: 0.88, positive/regulatory: 0.80, condensed: 0.87, other people: 0.91). Table 4 shows that a number of the factors were inter-correlated: dialogic inner speech was most closely related to each of the factors, followed by evaluative/critical inner speech.
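Internal reliability figures of this kind are typically Cronbach's α, the statistic named elsewhere in this paper. As a hedged illustration of how such a coefficient is obtained for one subscale, the sketch below applies the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy responses are hypothetical and are not drawn from the study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one subscale.

    rows: list of respondents, each a list of item scores (same length).
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Five hypothetical respondents answering a three-item subscale.
toy_rows = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [6, 5, 6], [3, 3, 4]]
alpha = cronbach_alpha(toy_rows)
print(round(alpha, 3))
```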

Sample 2: Confirmatory factor analysis of VISQ-R (26 items, n = 377)
Confirmatory factor analysis was used to test the five-factor model in sample 2. Less than 0.2% of the data were missing and replaced by the mean response (per item). Frequencies of responses for each item (by subscale) are displayed in Table 5. Maximum likelihood estimation was used for model fitting. Following an initial model containing no covariances between the factors, modification indices indicated improved fit if each factor was allowed to covary with dialogic and evaluative/critical inner speech, with no pairwise covariances among the other three factors. Covariances between error terms were also added (within factor only) where modification indices suggested improved fit.
This resulted in a stable model with a significant χ² value, χ²(260) = 562.53, p < 0.001, but an acceptable CMIN/DF ratio of 2.16. Fit statistics for the model were also in a satisfactory/good range: CFI = 0.925, RMSEA = 0.056 (90% CI = 0.048-0.062), Hoelter index = 200.

Using sample 2, we then attempted to replicate our previous findings using the new VISQ-R. Table 6 displays the descriptive statistics for the five dimensions of inner speech, as well as scores for hallucination-proneness (LSHS-R), anxiety and depression (HADS), self-esteem (RSES), and dissociation (DES). Mean scores were highest for the evaluative inner speech dimension, followed by dialogic, condensed and positive inner speech. The lowest mean scores were recorded for other people in inner speech. Reliability for dialogic, evaluative and other people in inner speech was high (α > 0.8) but lower for positive inner speech (α = 0.6). Cronbach's α was also found to be adequate for visual hallucination-proneness, HADS anxiety, HADS depression, RSES and DES (α > 0.7), but on the boundary of acceptability (0.67) for auditory hallucination-proneness.

Table 7 shows the bivariate correlations between VISQ-R, DES, RSES, HADS and LSHS-R scores. To account for testing the five VISQ-R factors against six psychopathology variables simultaneously, a Bonferroni-corrected alpha value of 0.0017 was applied for significance (i.e. 0.05/30). Auditory hallucination-proneness (LSHS-R) was positively associated with evaluative, dialogic and other people in inner speech, and with visual hallucination-proneness. A similar pattern was observed for visual hallucination-proneness. Evaluative and other people in inner speech also correlated positively with HADS Anxiety scores, HADS Depression scores, and lower self-esteem. Frequency of dissociative experiences (DES) was positively correlated with dialogic and other people in inner speech.
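Two of the figures reported in this section are simple ratios and can be checked directly. The sketch below recomputes the CMIN/DF ratio from the reported χ² and degrees of freedom, and the Bonferroni-corrected threshold from the five-by-six correlation design. The helper names are hypothetical; only the input values come from the text.

```python
def cmin_df_ratio(chi_square, df):
    """Relative chi-square (CMIN/DF): model chi-square divided by its df."""
    return chi_square / df


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Values reported in the text.
ratio = cmin_df_ratio(562.53, 260)
n_tests = 5 * 6  # five VISQ-R factors x six psychopathology variables
corrected_alpha = bonferroni_alpha(0.05, n_tests)

print(round(ratio, 2), round(corrected_alpha, 4))
```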
For exploratory purposes, we also compared male and female participants' scores on the new VISQ-R factors using independent samples t-tests: the only difference observed was for evaluative/critical inner speech (t(372) = −2.84, p = 0.005), with female participants (M = 30.34, SD = 7.13) scoring higher than male participants (M = 27.29, SD = 4.45).

Predicting hallucination-proneness controlling for self-esteem (RSES) and dissociation (DES) (Alderson-Day et al., 2014)
A hierarchical multiple regression was performed to assess the contribution of different types of inner speech (VISQ-R), self-esteem (RSES) and dissociative tendencies (DES) to auditory hallucination-proneness. Age and gender were entered in the first block, followed by the five subscales of the VISQ-R in the second block. Self-esteem (RSES) was entered in the third block, and dissociative tendencies (DES) in the fourth block. The first model, with age and gender as predictors of auditory hallucination-proneness, was not significant (p > 0.05). The addition of the five dimensions of the VISQ-R in Block 2 made a significant change to the model (ΔR² = 0.09, ΔF(5, 369) = 8.16, p < 0.001), where other people in inner speech, β = 0.21, p < 0.001, and evaluative inner speech, β = 0.13, p = 0.023, both significantly predicted auditory hallucination-proneness (R² = 0.103, F(7, 369) = 6.077, p < 0.001). The addition of the self-esteem measure (RSES) in Block 3 did not make a significant change to the model (ΔR² = 0.03, p > 0.05), and self-esteem did not significantly predict hallucination-proneness (p > 0.05). The addition of dissociation scale (DES) scores in Block 4 resulted in a significant change to the model (ΔR² = 0.17, ΔF(1, 367) = 83.88, p < 0.001), with dissociation found to significantly predict auditory hallucination-proneness, β = 0.44, p < 0.001, as well as other people in inner speech, β = 0.17, p < 0.001 (final model: R² = 0.273, F(9, 367) = 15.281, p < 0.001).
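The F-change statistic reported for each block follows from the change in R² and the residual degrees of freedom of the larger model: ΔF = (ΔR² / number of added predictors) / ((1 − R² of the larger model) / residual df). The sketch below applies this standard formula to the rounded Block 4 values; the function name `f_change` is hypothetical, and because the inputs are rounded to two or three decimals the result only approximates the reported 83.88.

```python
def f_change(delta_r2, r2_full, n_added, df_residual):
    """F statistic for the R-squared change when predictors are added.

    delta_r2:    increase in R-squared from the added block
    r2_full:     R-squared of the model including the added block
    n_added:     number of predictors in the added block
    df_residual: residual degrees of freedom of the fuller model
    """
    return (delta_r2 / n_added) / ((1 - r2_full) / df_residual)


# Block 4 (DES added), using the rounded values given in the text.
block4_f = f_change(delta_r2=0.17, r2_full=0.273, n_added=1, df_residual=367)
print(round(block4_f, 2))
```

With unrounded R² values the same function would be expected to match the software output more closely.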
As in Alderson-Day et al. (2014), we then examined what (if any) mediating role dissociation played in the relationship between inner speech and hallucination-proneness. Using the medmod package in jamovi, we tested the direct and indirect effects of other people in inner speech on auditory hallucination-proneness, with DES scores as the mediating variable. Significant effects were apparent for both direct (Z = 4.16, p < 0.001) and indirect (Z = 3.04, p = 0.002) paths, with the latter accounting for 27.9% of the total effect. This suggested that dissociation partially (rather than fully) mediated the effect of other people in inner speech on hallucination-proneness, and that most of this effect was in fact direct.
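For readers unfamiliar with how an indirect effect and its share of the total effect are derived, the following sketch shows one common approach: the indirect effect is the product of the predictor-to-mediator path (a) and the mediator-to-outcome path (b), tested with a first-order (Sobel-style) standard error. This is an assumption for illustration only; the source does not state which estimator medmod applied, and the coefficients below are invented rather than taken from the study.

```python
from math import sqrt


def sobel_z(a, se_a, b, se_b):
    """First-order Sobel z for the indirect effect a * b."""
    standard_error = sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return (a * b) / standard_error


def proportion_mediated(a, b, c_prime):
    """Share of the total effect (a*b + c') carried by the indirect path."""
    indirect = a * b
    return indirect / (indirect + c_prime)


# Hypothetical path coefficients (not the study's estimates).
z_indirect = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
share = proportion_mediated(a=0.5, b=0.4, c_prime=0.3)
print(round(z_indirect, 2), round(share, 2))
```

A share well below 1 alongside a significant direct path corresponds to the partial mediation pattern described above.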

Confounding factors: Item overlap, language status, and psychiatric diagnosis
Finally, we reran our main analysis accounting for some potential confounds in the data. One concern with the original VISQ is the inclusion of items that refer to respondents experiencing "actual voices" of other people in their inner speech, which may be thought to index experiences similar to those captured by the auditory items of the LSHS. To address this, earlier work reran its analyses after removing items 12 and 16 of the VISQ from the other people subscale. When this was done for the present data, the analysis showed the same results as regression model 1 (i.e., McCarthy-Jones & Fernyhough, 2011), in that visual hallucinations were a significant predictor of auditory hallucinations in the first block (β = 0.46, p < 0.001), and visual hallucinations (β = 0.43, p < 0.001) and other people in inner speech (β = 0.12, p = 0.013) significantly predicted auditory hallucinations in the second block. The analysis also showed the same results as regression model 2 (i.e., Alderson-Day et al., 2014), where both evaluative (β = 0.14, p = 0.013) and other people in inner speech (β = 0.20, p < 0.001) were found to be significant predictors of auditory hallucination-proneness. The addition of self-esteem scores did not make a significant change to the model (p > 0.05). In Block 4, both dissociation scores (β = 0.44, p < 0.001) and other people in inner speech (β = 0.15, p = 0.001) were found to be significant predictors of auditory hallucination-proneness.
Other factors which may have affected our data include the presence of non-native speakers and people with a self-declared psychiatric diagnosis. To address these, we reran our main analyses of sample 2, first excluding non-native English speakers, and second excluding those with a psychiatric diagnosis. For the former, the results for the native speakers group (n = 297) matched those reported in regression model 1, in that visual hallucination-proneness (β = 0.48, p < 0.001) and other people in inner speech (β = 0.13, p < 0.05) were significant predictors of auditory hallucination-proneness in the final model. Similarly, for regression model 2, other people in inner speech (β = 0.19, p < 0.001) and dissociation scores (β = 0.45, p < 0.001) were significant predictors of auditory hallucination-proneness in the final model. Identical results were observed when the analyses were rerun only in people without a psychiatric diagnosis.
Finally, we compared those with (n = 43) and without a psychiatric diagnosis (n = 326) on the five subscales of the VISQ-R (8 participants preferred not to report their diagnostic status). The independent-sample t-tests showed that there was a significant difference in the level of reported dialogic inner speech between the two groups (t (367) = 2.08, p = 0.038), with the sample with psychiatric diagnoses reporting higher levels of dialogic inner speech (Dx group M = 23.07, SD = 5.89; no Dx group M = 21.20, SD = 5.47). A similar pattern was evident for evaluative inner speech (t (367) = 5.43, p < 0.001), with lower scores in the sample without a diagnosis (Dx group M = 35.29, SD = 7.47; no Dx group M = 29.12, SD = 6.94). No group differences were found in levels of condensed, positive and other people in inner speech (p > 0.05).
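The group comparisons above can be approximately reconstructed from the reported means, standard deviations, and group sizes. The sketch below assumes an equal-variance (pooled) independent-samples t-test, which is consistent with the reported degrees of freedom (43 + 326 − 2 = 367) but is not stated explicitly in the text; the function name `pooled_t` is hypothetical. Small discrepancies from the published values are expected because the inputs are rounded.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance independent-samples t from summary statistics.

    Returns (t, degrees_of_freedom).
    """
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Summary statistics as reported: diagnosis group first, then no-diagnosis group.
t_dialogic, df_dialogic = pooled_t(23.07, 5.89, 43, 21.20, 5.47, 326)
t_evaluative, df_evaluative = pooled_t(35.29, 7.47, 43, 29.12, 6.94, 326)
print(round(t_dialogic, 2), round(t_evaluative, 2), df_evaluative)
```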

Discussion
Asking people to report on their inner speech is challenging, but with careful methodological considerations it can produce consistent results. Here we tested and confirmed a five-factor model for an expanded VISQ, the VISQ-R; introduced a new variable of positive/regulatory inner speech; and replicated a number of previous findings of relations between inner speech variables and psychopathology.
Compared to the original scale, the VISQ-R captures a broader range of phenomenology associated with the experience of self-directed speech, in line with the primary aim of the study. In particular, new items relating to positive and negative states of inner speech, including the use of inner speech to regulate mood, survived the various stages of scale development. The inclusion of these new items appears to have elaborated the concept of evaluative inner speech captured in the original VISQ (McCarthy-Jones & Fernyhough, 2011). Evaluative states in the present study were clustered with statements about inner speech contributing to feeling anxious or depressed, while motivation and regulation in inner speech clustered around the new positive factor. This is consistent with prior findings of higher evaluative inner speech being associated with a more negative self-concept (Alderson-Day et al., 2014), even though the items included in the original evaluative factor were ostensibly neutral and focused more on deliberative states (e.g., I think in inner speech about what I have done, and whether it was right or not). The inter-correlations of evaluative and positive subscales in sample 1 (r = 0.40) and sample 2 (r = 0.29) also highlight that these are related but ultimately separable factors, rather than positive and negative poles of the same scale.
A secondary aim for the new scale was to capture the potential for inner speech to have positive psychological effects, as this is important for drawing together disparate strands of research on self-directed speech. In sports psychology and similar fields, self-talk has often been associated with improved focus and performance (Hardy, Begley, & Blanchfield, 2015; Hardy et al., 2005). However, such research often elides the distinction between overt and covert self-talk. Research on private speech (overt or out-loud self-talk) shows it to be associated with regulatory strategies in childhood (e.g., Fernyhough & Fradley, 2005) and in adulthood (Duncan & Cheyne, 2001). In contrast, the phenomenology of inner speech and its relations to psychopathological states and processes (such as rumination) have rarely been explored. Through our identification of a positive inner speech factor, one with clear parallels with the overt and motivational self-talk deployed in sports research, we have the potential, in future research, to test predictions about cognitive and behavioural performance. For example, one could hypothesise that participants with more positive/regulatory inner speech are likely to perform better on tasks requiring self-talk (or following instructions to explicitly use such a strategy), but those with more evaluative/critical inner speech may not. While some attempts have been made to link different functions of private speech to aspects of task performance, observational studies of private speech are fraught with difficulty (Winsler, Fernyhough, & Montero, 2009). The VISQ-R thus provides us with a new way of empirically linking aspects of self-talk to aspects of performance.
The cognitive benefits of positive/regulatory inner speech may also be evidenced in the spheres of creativity and imagination. To give one example, the presence of an imaginary companion in childhood has been linked to greater levels of self-talk in general in childhood (Davis, Meins, & Fernyhough, 2013) and adulthood (Brinthaupt & Dove, 2012). Evaluative/critical inner speech, in contrast, would be expected to relate more strongly to rumination, shame, and perfectionism (Flett, Madorsky, Hewitt, & Heisel, 2002; Orth, Berking, & Burkhardt, 2006). The gender difference observed for evaluative/critical inner speech in the present data, with higher scores in female participants, is consistent with greater rates of rumination in women (Nolen-Hoeksema & Jackson, 2001) and with findings from the previous VISQ (Tamayo-Agudelo et al., 2016).
Even with the introduction of the new factor, a number of previous relations to psychopathology were observed in the confirmation sample of university students. In line with the original VISQ study, specific characteristics of inner speech were associated with a greater proneness to auditory hallucinations but not visual hallucinations (hypothesis 1); dissociation appeared to partially mediate this relationship (hypothesis 2); and a subset of inner speech characteristics also correlated with scores for anxiety, depression, and self-esteem (hypothesis 3; McCarthy-Jones & Fernyhough, 2011). One difference from the original study concerns the role of dialogic inner speech. Whereas the dialogic factor predicted auditory hallucination-proneness more strongly than other factors in the original study, in the present study only other people in inner speech did so. It seems unlikely that this is due to the new scale structure: with the original VISQ, pairwise correlations between dialogic inner speech and hallucination-proneness were often evident (e.g., Alderson-Day et al., 2017) but did not always survive exclusion during hierarchical regression analysis. The exploratory analysis of those with and without a psychiatric diagnosis suggested here that dialogic inner speech (and evaluative/critical inner speech) may be greater in those with a self-reported diagnosis (cf. the marginally significant reduction in dialogicality in the inner speech reported by Langdon, Jones, Connaughton, & Fernyhough, 2009). In contrast, a recent study with a clinical sample suggested that condensed and other people in inner speech were related to psychopathology (de Sousa et al., 2016). As a potential explanation of these disparate findings, dialogic inner speech is typically the factor that correlates most with all of the other VISQ factors; this is also the case with the findings reported here.
It may be that dialogicality acts as a "core" feature of inner speech, and the other VISQ factors mediate its relation to other variables. Alternatively, it may be that the concept of dialogicality is liable to varying interpretations, leading to inconsistent performance across samples: for example, in Spanish translations of the original VISQ (Perona-Garcelán et al., 2017) a further dialogic factor needed to be added that explicitly referred to inner speech in terms of position of self in a dialogue (see also Tamayo-Agudelo et al., 2016). In our experience, English-speaking users of the scale have typically interpreted dialogic items to include both dialogues with oneself and dialogues with another (indeed, the focus of a Vygotskian understanding of dialogicality is arguably on structure, rather than the identity of interlocutors), but it is possible that cultural understandings of dialogue will differ considerably depending on the language used. A further point to note is that the idea of hallucinatory experiences existing on a continuum stretching into the general population has been increasingly challenged (e.g., Garrison et al., 2017), and it may need to be acknowledged that relations between hallucination-proneness and inner speech variables may transpire differently in clinical and non-clinical samples.
As in earlier studies (Alderson-Day & Fernyhough, 2015b; Ren et al., 2016), condensed inner speech items were not strongly endorsed, and correlations with other inner speech factors were low. In general, condensed inner speech scores also show few correlations with non-VISQ variables (Ren et al., 2016). As noted above, there is evidence that patients with psychosis endorse this experience more than controls, and that it relates to increased levels of thought disorder (de Sousa et al., 2016). As such, it would seem to be an important factor to retain and may be more informative when used in clinical samples. This may also have been the case for a number of the other new items that did not survive the initial exploratory factor analysis in sample 1. Specifically, questions regarding control, surprise, metaphor, and feelings of passivity in inner speech (such as experiencing it as hearing rather than speaking) did not cluster sufficiently either with themselves or the existing factors to be included in the final scale. While still potentially important experiences to explore, it may be that these characteristics of inner speech, along with condensed inner speech, are too extraordinary for non-clinical respondents and do not become salient until things start to go awry. Such a question bears on contemporary debates in philosophy of mind about how and why we experience our thoughts as our own (Roessler, 2016), questions that are important to consider when probing the distinction between inner speech, auditory verbal hallucination, and other atypical experiences such as thought insertion (Wilkinson & Alderson-Day, 2016).
Some limitations to the present study are important to consider. First, asking people to report on characteristics of their own inner speech via questionnaires raises the perennial concern about how reliably people can report on their own inner experience (see, for example, Hurlburt et al., 2013; Hurlburt & Schwitzgebel, 2007). Notwithstanding the fact that such self-report methods are relied on extensively in personality and individual differences research, it is important when studying inner speech to employ multiple methods. We therefore recommend that future use of the VISQ-R involve it being deployed alongside other tools such as cognitive tasks (Ren et al., 2016) or neuroimaging. Methods of tracking the characteristics of inner speech are developing all the time, with a recent example being provided by Whitford et al. (2017), who used EEG to demonstrate perceptual capture of external sounds during inner speech production.
Second, examining the relations between conceptually similar notions (such as inner speech and hallucination-proneness) raises the risk of measurement overlap. We have taken steps to address this issue: for example, by showing that the relation between other people in inner speech and auditory hallucination-proneness remains even when potentially overlapping items are removed, concerns about direct conceptual overlap are to some degree minimised. Moreover, we have endeavoured to design VISQ-R items that focus on the form and phenomenology of inner speech, rather than its contents (to avoid confounding the content of a thought or mood and its structure). Nevertheless, ideally such issues would be explored more extensively in a procedure that would enable clarification and disambiguation, such as a one-to-one interview. This would seem to be particularly important when dealing with patients who may struggle to report on their own inner experience (de Sousa et al., 2016).
Finally, both of these samples reflect highly-educated populations: in one case respondents to a survey on reading experiences promoted by the Guardian newspaper, and in the other a university sample. Although Alderson-Day et al.'s (2017) sample was an international one, the majority of its respondents were still either from the UK or from other western, English-speaking countries (as was the case for sample 2). We have already discussed some language-specificity issues relating to translations of the VISQ. For these and other reasons, the generalisability of the present findings is limited, and it will be particularly important to test the VISQ-R in more mixed, general population samples and non-English speaking countries.
A related issue that will be important for future research on inner speech is to expand its horizons into more diverse populations. For example, research on inner speech (or equivalent experiences of inner language, such as signing) in those who are blind or deaf is still meagre (for exceptions, see Campbell & Wright, 1990; Zimmermann & Brugger, 2013). Similarly, research on neuro-diverse populations, such as autistic people, has been largely confined to examining "deficits" in self-talk rather than exploring qualitative differences in inner speech or, more broadly, inner experience. Evidence of differential use of verbal strategies on cognitive tasks in autistic adults (Williams et al., 2012) and vivid accounts of inner experience by adults with Asperger Syndrome (Hurlburt, Happé, & Frith, 1994) suggest that inner speech may be radically different for this group, and deserving of a more systematic exploration.
In conclusion, inner speech has been proposed as a key tool for unlocking creative, exploratory, and abstract thought. Our results, with the revision of the VISQ, provide new avenues for probing these relationships while continuing to explore the important connections between the phenomenology of self-talk and psychopathology. By expanding the horizons of inner speech, and picking up a greater range of experiences, a richer narrative will be available to turn Vygotsky's (1987) 'cloud of thought', to use his evocative phrase, into 'a shower of words'.