A data-driven linguistic characterization of hallucinated voices in clinical and non-clinical voice-hearers

Background: Auditory verbal hallucinations (AVHs) are heterogeneous regarding phenomenology and etiology. This has led to the proposal of AVHs subtypes. Distinguishing AVHs subtypes can inform AVHs neurocognitive models and also have implications for clinical practice. A scarcely studied source of heterogeneity relates to the AVHs linguistic characteristics. Therefore, in this study we investigate whether linguistic features distinguish AVHs subtypes, and whether linguistic AVH-subtypes are associated with phenomenology and voice-hearers' clinical status. Methods: Twenty-one clinical and nineteen non-clinical voice-hearers participated in this study. Participants were instructed to repeat verbatim their AVHs just after experiencing them. AVH-repetitions were audio-recorded and transcribed. AVHs phenomenology was assessed using the Auditory Hallucinations Rating Scale of the Psychotic Symptom Rating Scales. Hierarchical clustering analyses without a priori group dichotomization were performed using quantitative measures of sixteen linguistic features to distinguish sets of AVHs. Results: A two-AVHs-cluster solution best partitioned the data. AVHs-clusters significantly differed in linguistic features (p < .001); AVHs phenomenology (p < .001); and distribution of clinical voice-hearers (p < .001). The “expanded-AVHs” cluster was characterized by more determiners, more prepositions, longer utterances (all p < .01), and mainly contained non-clinical voice-hearers. The “compact-AVHs” cluster had fewer determiners and prepositions, shorter utterances (all p < .01), more negative content, higher degree of negativity (both p < .05), and predominantly came from clinical voice-hearers. Discussion: Two voice-speech clusters were recognized, differing in syntactic-grammatical complexity and negative phenomenology. Our results suggest clinical voice-hearers often hear negative, “compact-voices”, understandable under Broca's right hemisphere homologue and memory-based mechanisms. Conversely, non-clinical voice-hearers experience “expanded-voices”, better accounted for by inner speech AVHs models.


Introduction
Auditory verbal hallucinations (AVHs) are understood as the experience of hearing voices in the absence of corresponding stimuli. Nowadays, their presence across psychiatric disorders is well recognized, as is the fact that they also occur in non-clinical populations, with estimated lifetime frequencies in the 5-15% range (Beavan et al., 2011; Maijer et al., 2018). In recent years, the appreciation of heterogeneous features in AVHs has increased, both within and across different populations (Larøi et al., 2012; Waters and Fernyhough, 2017; Woods et al., 2015). For instance, while loudness and number of voices are similar between clinical and non-clinical voice-hearers, these two groups differ in frequency of and control over their AVHs (Daalman et al., 2011). It has been suggested that, to fully understand AVHs and their origin, this heterogeneity may require the study of potential underlying subtypes of AVHs (Jones, 2010; McCarthy-Jones et al., 2014a, 2014b; Sommer et al., 2018). In previous research, some support was found for five AVHs subtypes, namely hypervigilant subtype, autobiographical memory subtype, inner speech subtype, epileptic subtype, and deafferentation subtype (McCarthy-Jones et al., 2014a). So far, data-driven support for this subdivision is lacking. In addition, negative emotional content and form (e.g., commands) were identified as dimensional constructs that vary across subtypes (McCarthy-Jones et al., 2014a). Altogether, distinguishing subtypes of AVHs can have important implications for both clinical practice and research, such as developing treatments for AVHs and informing neurocognitive models of AVHs (David, 2010; McCarthy-Jones et al., 2014a; Sommer et al., 2018).
Until now, studies on AVHs subtypes have mostly relied on phenomenological and etiological features, scarcely including linguistic characteristics (Chang et al., 2015, 2009; McCarthy-Jones et al., 2014a, 2014b; Stephane et al., 2003). Moreover, most studies on AVHs subtypes have focused exclusively on clinical AVHs (Chang et al., 2009; McCarthy-Jones et al., 2014b; Stephane et al., 2003). Therefore, to arrive at a more comprehensive understanding of AVHs subtypes, a linguistic approach to AVH heterogeneity in both clinical and non-clinical voice-hearers is warranted.
The language of AVHs or "voice-speech" from clinical voice-hearers presumably has linguistic characteristics that distinguish it from other registers of speech (Tovar et al., 2019). Specifically, it has been suggested that clinical voice-speech displays unpleasant and recurrent semantic content, short utterances lacking syntactical errors or grammatical connectivity, and a low use of the grammatical first person (Frank et al., 1980; Hoffman et al., 1994; Tovar et al., 2019; Turkington et al., 2019). Importantly, when comparing voice-speech between clinical and non-clinical voice-hearers, both similarities and dissimilarities in linguistic features have been found (de Boer et al., 2016). Dissimilarities include shorter mean length of utterance, lower verb complexity, and more verbal abuses and perseverations in the voice-speech of individuals with a clinical status (de Boer et al., 2016). This raises the question of whether different linguistic subtypes of AVH can be identified, and whether these subtypes are present in both clinical and non-clinical voice-hearers. Therefore, the main aim of the present study was to investigate how linguistic features can be used to distinguish subtypes of AVHs. Specifically, we studied how AVHs linguistic subtypes might be characterized in terms of phenomenology, and whether they are associated with participants' clinical status. To achieve this, we used a data-driven approach to overcome possible limitations of a priori dichotomization of clinical and non-clinical voice-hearers.

Participants
A total of 21 clinical and 19 non-clinical voice-hearers participated in this study. The majority of this sample was previously described in de Boer et al. (2016). Four clinical voice-hearers were added in the current study. Inclusion criteria for all participants were: (a) being a native Dutch speaker, (b) experiencing verbal hallucinations at least once per month, (c) at least three months free of alcohol or drug abuse, and (d) absence of a chronic somatic disorder. Patients were recruited via the University Medical Center Utrecht. Non-clinical voice-hearers were recruited via a website (www.verkenuwgeest.nl), and were required to pass an online screening about hallucinations, a telephone interview about the inclusion criteria, and a face-to-face psychiatric screening. All participants were screened for a psychiatric disorder using the Comprehensive Assessment of Symptoms and History (CASH) (Andreasen et al., 1992) and the Structured Clinical Interview for Personality Disorder (SCID-II) (First et al., 1995). Participants were classified as non-clinical voice-hearers when they did not meet the criteria for a psychiatric disorder, and as clinical voice-hearers when they did.

Procedures
Registrations of the AVHs were collected from all participants using the shadowing procedure. This consisted of instructing each participant to "shadow" (i.e., repeat verbatim) her/his AVHs just after experiencing them (de Boer et al., 2016). The verbatim repetitions were recorded on a voice recording device. This procedure was repeated three times, each recording lasting a minimum of 1 min, resulting in a total of at least 3 min of "shadows" per participant. Similar to previous reports (de Boer et al., 2016; Tovar et al., 2019), the time required for obtaining the sound recordings ranged from a couple of minutes to half an hour, depending on the frequency and duration of the hallucinations. All sound recordings of the "shadows" were orthographically transcribed by consensus rating of three linguistics graduate students who were native speakers of Dutch. They successfully identified blurry sounds as either verbal or non-verbal units, disentangled specific words, and differentiated "shadows" from self-talk (see also Linguistic data preprocessing in Supplementary materials). The Auditory Hallucinations Rating Scale (AHRS) of the Psychotic Symptom Rating Scales (PSYRATS) (Haddock et al., 1999) was used to assess eleven phenomenological features of the AVHs. Procedures were approved by the ethical committee of the University of Utrecht, and participants provided written informed consent before participation. The principles of the Declaration of Helsinki were followed throughout all steps of the research.

Linguistic features
Sixteen features were analyzed: two types of pronouns (nominative first-person singular, and relatives), three verbal time expressions (simple past, present, and future tenses), three content-and-structure measures (mean length of utterance or MLU, mean word length, and moving-average type-token ratio), four function-word classes (definite and indefinite articles, prepositions, and subordinating conjunctions), and four content-word classes (attributive adjectives, locative adverbs, plural and singular nouns) (see definitions and examples in Table 1). This choice was informed by work about spoken Dutch from Grieve et al. (2017). Based on it, and following Biber's (1988) procedure to retain only salient linguistic variables for analysis, these 16 features were identified as the most suited to explore differentiating patterns in spoken Dutch.
Table 1
Linguistic features for analysis in the AVHs.

| Linguistic feature | Description |
| --- | --- |
| Attributive adjective | Word conveying properties of or adding features to a given noun, e.g., "pretty". |
| Locative adverb | Word for details of place or position, e.g., "homeward". |
| Mean length of utterance | The average number of words forming an utterance. |
| Mean word length | The average number of letters forming a word. |
| Moving-Average Type-Token Ratio | A ratio that expresses the number of different word forms relative to the total number of words. |
| Nominative first-person singular pronoun | Word standing for the grammatical first person functioning as singular subject, i.e., "I". |
| Plural noun | Word standing for concrete or abstract entities, e.g., "tables". |
| Preposition | Word usually placed before a noun phrase and typically indicating spatial or temporal relations, e.g., "at" and "in". |
| Relative pronoun | Word that depends on an antecedent and that both introduces and plays a role in a new sentence, e.g., "who". |
| Simple future | Grammatical value of a later time, e.g., "will". |
| Simple past | Grammatical value of a previous time, e.g., "did". |
| Simple present | Grammatical value of a current time, e.g., "have". |
| Singular noun | Word standing for a concrete or abstract entity, e.g., "table". |
| Subordinating conjunction | Word linking an independent clause to a dependent clause, e.g., "because". |

Relative frequencies of the linguistic features were calculated by dividing each absolute frequency by the total number of words of the corresponding shadow file and then multiplying the quotient by 10,000. In order to prevent absolute frequencies of zero from remaining zero in relative frequencies, one unit was added to all relative frequencies. The Moving-Average Type-Token Ratio (MATTR) was computed by means of the Quantitative Analysis of Textual Data tool version 2.0.1 (Benoit et al., 2018) implementing logarithm with base ten and a moving window with a size of ten words as parameters. MATTR (Covington and McFall, 2010) was chosen over Type-Token Ratio (TTR) (Richards, 1987) as it is more robust in dealing with the influence of text length on the ratio calculation (Brezina, 2018; Covington and McFall, 2010).
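The two computations described above, the relative-frequency normalization and the MATTR, can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, which relied on the Quantitative Analysis of Textual Data tool in R. The function names, the dictionary input format, and the fallback for texts shorter than the window are assumptions made for illustration. The sketch follows the plain moving-window definition of Covington and McFall (2010) and does not involve a logarithm parameter.

```python
def relative_frequencies(tag_counts, total_words, per=10_000, offset=1):
    """Convert absolute counts to a rate per `per` words, then add `offset`.

    `tag_counts` maps a feature name to its absolute frequency in one
    shadow file (hypothetical input format).
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return {
        tag: count / total_words * per + offset
        for tag, count in tag_counts.items()
    }


def mattr(tokens, window=10):
    """Mean type-token ratio over all sliding windows of `window` tokens."""
    if not tokens:
        return 0.0
    if len(tokens) < window:
        # Assumption: fall back to the plain TTR for very short texts.
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[start:start + window])) / window
        for start in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)
```

Adding the offset after scaling means that a feature absent from a shadow file receives a relative frequency of one rather than zero, as described in the text.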

Statistical analysis
We used hierarchical clustering analyses to classify the participants into groups based on linguistic characteristics of their AVH. Specifically, different subgroups of AVHs were distinguished by grouping according to standardized linguistic features, using Canberra distance and Ward's method. A two-step procedure was conducted to assess AVHs-cluster validity implementing both relative and internal criteria. First, the number of AVHs-clusters was estimated by means of the R 'NbClust' package (Charrad et al., 2014). Secondly, a Sequential Minimal Optimization algorithm (SMO) (Platt, 1999) was implemented along with polynomial kernel and ten-fold cross-validation in order to evaluate the AVHs-clusters' partitions as labels for classification categories. The Waikato Environment for Knowledge Analysis software (Weka) (Witten et al., 2016) was used only for the classification task.
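As an illustration of the clustering step, the following Python sketch standardizes a feature matrix, computes Canberra distances, and applies Ward linkage with SciPy. It is an approximation under stated assumptions, not the R code used in the study: the function name `cluster_avhs` and the toy data are hypothetical, the number of clusters is fixed at two rather than estimated with NbClust, and SciPy's Ward update formally assumes Euclidean distances, so results need not match an R implementation exactly.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist
from scipy.stats import zscore


def cluster_avhs(features, n_clusters=2):
    """Cluster rows (participants) of a feature matrix.

    Columns are standardized first; a constant column would produce NaN
    z-scores and should be dropped beforehand.
    """
    standardized = zscore(features, axis=0, ddof=1)
    distances = pdist(standardized, metric="canberra")  # condensed matrix
    tree = linkage(distances, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Toy data (hypothetical): two well-separated groups of ten "participants"
# described by four features each.
rng = np.random.default_rng(0)
toy = np.vstack([
    rng.normal(5.0, 1.0, size=(10, 4)),
    rng.normal(-5.0, 1.0, size=(10, 4)),
])
labels = cluster_avhs(toy)
```

Cutting the tree with `criterion="maxclust"` returns one integer cluster label per row, which can then serve as the class label in a cross-validated classifier, as in the second validation step.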
Chi-square tests of independence without continuity correction were carried out to test for differences in the distribution of clinical status and sex. Two-tailed independent t-tests were performed to test for differences in age and years hearing AVHs. One-way non-parametric multivariate analysis of variance (MANOVA) (Burchett et al., 2017) was done independently to test for the presence of significant differences on both the combined linguistic variables and the combined phenomenological features between AVHs-clusters. Non-parametric two-tailed independent Wilcoxon rank sum tests with continuity correction were carried out for analyzing possible differences in both individual linguistic quantitative measures and individual phenomenological features between AVHs-clusters. Non-parametric two-tailed Spearman bivariate correlations were conducted in order to assess possible relations between individual linguistic features and individual phenomenological features. In both Wilcoxon and correlation tests, Holm correction for multiple comparisons was applied to control for false positive results. This method was used because it is suited for exploratory studies (Menyhart et al., 2021).
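The feature-wise comparisons with Holm correction can be sketched as follows. This is a minimal Python illustration, not the R code used in the study. The function names and the dictionary input format are hypothetical, and forcing the asymptotic method (so that the continuity correction applies) is an assumption.

```python
from scipy.stats import mannwhitneyu


def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, etc.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def compare_clusters(cluster_a, cluster_b):
    """Two-sided rank sum test per feature, followed by Holm correction.

    `cluster_a` and `cluster_b` map feature names to lists of values
    (hypothetical input format).
    """
    names = list(cluster_a)
    raw = [
        mannwhitneyu(
            cluster_a[name],
            cluster_b[name],
            alternative="two-sided",
            use_continuity=True,
            method="asymptotic",  # assumption: normal approximation
        ).pvalue
        for name in names
    ]
    return dict(zip(names, holm_adjust(raw)))
```

Holm's procedure is uniformly at least as powerful as the Bonferroni correction while still controlling the family-wise error rate, which is one reason it is often preferred when many features are tested.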
For all analyses, after correcting for multiple comparisons, statistical results with p-values < .05 were considered to be significant. All analyses were performed in RStudio version 1.2.5019 (RStudio Team, 2019) running R version 4.0.2 (R Core Team, 2020).

Results
Clinical voice-hearers were diagnosed either with a schizophrenia-spectrum (95%) or a bipolar (5%) disorder. Clinical and non-clinical voice-hearers were similar in age (ranging from 21 to 75) and sex (both p > .05), but differed in years hearing AVHs (p < .001) (Table 2). General characteristics of their AVHs are presented in Table 3.
The data-driven hierarchical clustering procedure showed that a two-AVHs-cluster solution best partitioned the data. AVHs-clusters' validation showed that the percentage of correctly predicted instances was 92.5% (Cohen's kappa coefficient = 0.85). There were no significant differences between the two AVHs-clusters regarding age, sex, and years hearing AVHs. The AVHs-clusters differed significantly in terms of distribution of participants with and without a psychiatric disorder (see Table 4), with one AVHs-cluster consisting mainly of non-clinical voice-hearers, and the other predominantly of clinical voice-hearers (p < .001).
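To make the two validation figures concrete, the sketch below computes accuracy and Cohen's kappa from a 2 x 2 confusion matrix in Python. The matrix is hypothetical: the text does not report the actual confusion matrix, and these counts were chosen only so that 37 of 40 instances are classified correctly, matching the reported 92.5%. The function name is likewise an assumption.

```python
def accuracy_and_kappa(confusion):
    """Return (observed agreement, Cohen's kappa) for a square matrix.

    Rows are cluster labels, columns are predicted labels.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts, for illustration only.
hypothetical_confusion = [[17, 1], [2, 20]]
accuracy, kappa = accuracy_and_kappa(hypothetical_confusion)
```

Kappa discounts the agreement expected by chance. With two clusters of similar size, chance agreement is close to 50%, so an accuracy of 92.5% corresponds to a kappa of roughly 0.85.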
Multivariate analysis of variance indicated the presence of significant differences on the combined linguistic variables between AVHs-clusters, F(9.05, 340.34) = 7.42, p < .001. Follow-up comparisons showed that the AVHs-clusters differed on four linguistic features. Compared to the other cluster, AVHs in the cluster with mainly clinical voice-hearers were characterized by fewer definite and indefinite articles, fewer prepositions and shorter utterances. Henceforth, the AVHs from this cluster will be called "compact-AVHs". In contrast, the AVHs from the cluster made up mainly of non-clinical voice-hearers, being richer in articles and prepositions and showing longer utterances, will henceforth be called "expanded-AVHs" (see Table 5). Illustrative fragments of compact-AVHs and expanded-AVHs are given in Fig. 1. Furthermore, compact-AVHs and expanded-AVHs differed significantly in terms of phenomenology, F(4.28, 141.19) = 5.34, p < .001. Compact-AVHs showed a larger amount of negative content and a higher degree of negativity than expanded-AVHs (Table 6).
As the possibility exists that AVHs in patients with schizophrenia-spectrum disorders are different from those in another psychiatric disorder, we repeated our analyses with the exclusion of the patient who was diagnosed with bipolar disorder. Compared to the above-mentioned results, this did not lead to substantial changes (see Supplementary Tables S1-S3).

Discussion
In this study we investigated whether linguistic features can be used to recognize different subtypes of AVHs, and whether this links to AVHs phenomenology and presence or absence of a clinical diagnosis. The cluster analysis revealed a two-cluster solution. The expanded-AVHs cluster mainly contained non-clinical voice-hearers' AVHs, while the compact-AVHs cluster mostly had clinical voice-hearers' AVHs. In comparison with expanded-AVHs, compact-AVHs had fewer determiners and prepositions, as well as shorter MLU. Regarding phenomenology, compact-AVHs showed a larger amount of negative content and higher degree of negativity. Linguistic features and phenomenology were not correlated, emphasizing the importance of the role of language in characterizing our sample of AVHs.

Linguistic characterization of the AVHs
Compared to expanded-AVHs, compact-AVHs had fewer determiners overall. Determiners identify referents, either as concrete (e.g., the country you live in) or abstract entities (e.g., the topic of this text) (Giacalone Ramat and Andorno, 2006; Juvonen, 2006). This smaller number of determiners was unlikely due to fewer noun phrases in compact-AVHs, since neither singular nor plural nouns differed significantly between AVHs-clusters. Post-hoc qualitative assessment of our results gave rise to a few possible explanations. First, compact-AVHs were found to have a larger proportion of singular proper nouns (see Supplementary Table S4), which never require a determiner in Dutch (Hanks, 2006; Oosterhoff, 2015). Second, the Dutch indefinite article "een" can form adverbial constructions of degree (e.g., een beetje; "a bit") (Klein, 1998), and the proportion of this type of construction was smaller in compact-AVHs (see Supplementary Table S5). Third, compact-AVHs had a larger proportion of countable indefinite plural nouns (e.g., Ø stemmen; "voices"), which always lack a determiner (Oosterhoff, 2015) (see Supplementary Table S6).
Furthermore, our results suggest that prepositional constructions were more frequent in expanded-AVHs compared to compact-AVHs. Prepositions typically form locational/positional, temporal or directional/movement constructions (Kurzon, 2006; Svenonius, 2007). Importantly, prepositional constructions often specify information in connection with verbs (Svenonius, 2007; Talmy, 2000). For instance, someone can either "live" in a house, "eat" at lunchtime, or "walk" toward a door. Indeed, a post-hoc exploratory evaluation of our data showed that expanded-AVHs had a larger proportion of verbs of location/position, time, and direction/movement (see Supplementary Table S7). MLU was also significantly different between the AVHs-clusters, with compact-AVHs having a shorter MLU than expanded-AVHs. It can be reasonably assumed that fewer determiners and prepositions underlie this shorter MLU of compact-AVHs. However, correlations between MLU and both determiners and prepositions did not remain significant after correction for multiple comparisons, leaving this hypothesis unconfirmed. Notwithstanding, our finding that compact-AVHs had shorter MLU and were more often experienced by individuals with a psychiatric disorder is in line with previous studies that found a shorter MLU in clinical AVHs (de Boer et al., 2016) and reduced grammatical connectivity in clinical voice-hearers' AVHs (Tovar et al., 2019).
Interestingly, our results showed that the occurrence of the nominative first-person singular pronoun was not different between AVHs-clusters. This is intriguing, since this feature has been shown to be characteristic of clinical AVHs (McCarthy-Jones et al., 2014b). When including other linguistic functions and also counting its plural forms, the grammatical first person could still typify clinical AVHs (Tovar et al., 2019). The pronoun "ik" (i.e., "I") represents only one of several forms and functions in which the grammatical first person can be indicated in Dutch. Therefore, our results would suggest that, rather than the nominative first-person singular pronoun, other forms and functions of the grammatical first person could be more important in distinguishing linguistic sets of AVHs across clinical and non-clinical voice-hearers.

Phenomenological characterization of the AVHs
We found that compact-AVHs, which had a higher proportion of AVHs from clinical voice-hearers, were associated with a larger amount and a higher degree of negative content. This is consistent with previous research showing that AVHs are experienced to be more negative by clinical voice-hearers, although non-clinical voice-hearers also report some negative AVHs (Baumeister et al., 2017; Daalman et al., 2011; Larøi et al., 2019; Nayani and David, 1996). It is noteworthy that we failed to find a relation between the amount and degree of negative content on the one hand, and linguistic features on the other hand, even though both differed between the two AVHs-clusters. However, this absence of an association is not surprising, since emotional valence of words is mainly grounded in adjectives, nouns, and verbs, as shown by affective norms for words (e.g., Moors et al., 2013), rather than in the determiners, prepositions, or MLU that emerged from our linguistic analysis.

Neurocognitive models of AVHs
Inner speech/self-monitoring AVH models (Frith and Done, 1988; Jones and Fernyhough, 2007; McGuire et al., 1995) suggest that AVHs arise from inner speech, which is substantiated by neuroimaging studies (Allen et al., 2008, 2007). In comparing characteristics of inner speech to our subtypes of AVHs, a few points are of note. Fernyhough (2004) describes two types of inner speech, namely expanded and condensed inner speech. Expanded inner speech represents an internalization of overt dialogue (Fernyhough, 2004). As such, it might commonly display locational, temporal, and directional information (Yule, 1996), much like expanded-AVHs. In contrast, condensed inner speech retains few of these features (Fernyhough, 2004), similarly to compact-AVHs. However, compact-AVHs differ from condensed inner speech in their negative phenomenology. This is in line with previous research showing that clinical voice-speech contains more unpleasant and more controlling words than inner verbal thoughts (Turkington et al., 2019). While expanded-AVHs might instantiate what Fernyhough (2004) calls "expanded inner speech", our results suggest that this is not the case for all AVHs. Specifically, the inner speech model seems to accommodate some linguistic characteristics, but not the negative valence of compact-AVHs.
Meta-analytic evidence consistently relates activation of the right homologue of Broca's area to experiencing AVHs (Zmigrod et al., 2016). Importantly, it is likely the source of AVHs with negative phenomenology and relatively simple linguistic features (de Boer et al., 2016; Larøi et al., 2019; Sommer and Diederen, 2009). Since compact-AVHs displayed features like these, it can be hypothesized that they are triggered by activation of Broca's homologue in the right hemisphere.
Another possibility is that compact-AVHs are in fact memories of previously heard speech, which are no longer recognized as such (Waters et al., 2006). Deficits in binding contextual cues (Waters et al., 2006) might also underlie the scarce identification through determiners and the little locational, temporal, and directional information shown by compact-AVHs. Moreover, misattributed recalled memories might trigger negative affect (Waters et al., 2006), and this would be reflected in compact-AVHs' negative phenomenology.
Alternatively, compact-AVHs' linguistic features might arise from disruptions in generative circuits of language-related mechanisms (Brown and Kuperberg, 2015), which tallies with the observation that referential systems are disrupted in schizophrenia-spectrum disorders (van Schuppen et al., 2019; Zimmerer et al., 2017). Çokal et al. (2018) and Sevilla et al. (2018) showed that speech of patients with schizophrenia-spectrum disorders and formal thought disorder is characterized by aberrant use of definite, but not of indefinite articles. This suggests that referentiality deficits in formal thought disorder may be linked to linguistic definiteness (Çokal et al., 2018). Our results are surprising in this respect, as both definite and indefinite articles were less frequent in compact-AVHs than in expanded-AVHs. As mentioned earlier, this does not necessarily imply anomalous referentiality in compact-AVHs, since these had more proper nouns, which by definition are referential words (Hanks, 2006). However, to date no direct comparison has been made between linguistic characterizations of spontaneous speech of voice-hearers, on the one hand, and the content of their voices, on the other. For this reason, it remains an open question to what extent our diverging findings may be explained by referentiality differences in spontaneous speech compared to AVHs.

Future directions
It has been pointed out that shared mechanisms may account for different subtypes of AVHs (McCarthy-Jones et al., 2014a). This possibility is partially supported by a recent study showing that, in non-clinical voice-hearers, different AVHs show both common and distinct brain activation (Lin et al., 2020). This is in line with the claim that multiple mechanisms could be at the root of AVHs subtypes (McCarthy-Jones et al., 2014a), which is related to meta-analytic evidence on plurality of mechanisms in AVHs (Rollins et al., 2019). It remains an open question whether expanded-AVHs and compact-AVHs can be explained by a unique neurocognitive model, or whether different models are needed to explain the emergence of these different linguistic subtypes of AVHs. Future research must address this controversy, taking into account that both linguistic features and phenomenology of AVHs subtypes should be consistently integrated with findings on other levels of explanation (Hugdahl and Sommer, 2018).
Our findings on expanded-AVHs and compact-AVHs can benefit personalized psychological therapies for AVHs. For example, expanded-AVHs displayed more identification through determiners, and richer locational, temporal, and directional information. These features are important for conversation (Yule, 1996). Thus, the Voice Dialogue method (Stone and Stone, 1989) and derivatives thereof (e.g., Corstens et al., 2012) may be indicated for voice-hearers with expanded-AVHs. Those features were less present in compact-AVHs, and these AVHs were also shorter. In treating compact-AVHs, dialogue therapies would then face obstacles similar to those of trying to talk to someone who violates communicative principles (Clark, 2004; Grice, 1975). Besides, compact-AVHs had overall negative phenomenology. Considering all this, a psychological treatment for compact-AVHs might rather take advantage of voice-hearers' metalinguistic skills (Basturkmen et al., 2002; Bialystok and Ryan, 1985; Gombert, 1993) in judging linguistic properties of their voice-speech. Just as people display attitudes toward speakers' features of speech (Dragojevic et al., 2020), voice-hearers with compact-AVHs can be taught to do so with their voices. Research shows that perceived power and superiority of the AVHs plays a large role in the resulting distress (Chadwick and Birchwood, 1994). Hence, metalinguistic therapy reflecting on the "poor" linguistic quality of compact-AVHs could help voice-hearers to counter the perceived status of the voices. This might also divert their attention from the negative phenomenology, possibly alleviating distress.
Finally, as some languages do not have determiners (e.g., Finnish) or prepositions (e.g., Pilagá, spoken in Argentina) (Dryer, 2013a, 2013b), part of our results may not be replicable in those languages. In that case, the analysis will have to be focused on elements or mechanisms that may fulfill the corresponding typological linguistic functions of determiners, prepositions and/or referential units.

Limitations
A first limitation of the present study is that we constrained our analyses to 16 linguistic variables, while of course more aspects of language could have been assessed. A related matter is that our sample size was relatively small, and there was no homogeneity in clinical voice-hearers' diagnosis nor in non-clinical voice-hearers' phenotype (Baumeister et al., 2017). This might have reduced statistical power and added variation to the data, increasing the difficulty of finding linguistic patterns in our relatively small sample. On the other hand, this heterogeneity does best reflect clinical practice and could therefore make the generalization of our results easier, since the differences we found between compact-AVHs and expanded-AVHs in both linguistic and phenomenological features had either a medium or a large effect size.
Secondly, there is no way of knowing whether the "shadows" of the participants' AVHs are a perfect reflection of the AVHs they heard. For instance, due to embarrassment about the AVHs content, some participants might have been hesitant in fully repeating their AVHs. Moreover, personal (e.g., mood and cognitive abilities) and situational factors (e.g., distractions in the room) might have influenced participants' performance in repeating their AVHs verbatim. In parallel, AVHs data from clinical voice-hearers might have been constrained by their verbal repetition skills, since results from a simulated shadow-task showed that patients with schizophrenia or schizoaffective disorder have poorer performance when compared to controls (Fuentes-Claramonte et al., 2019). Furthermore, we did not control for medication use. As antipsychotic medication can influence language production (de Boer et al., 2020; Salomé et al., 2000), the possibility that medication effects underlie some characteristics of the clinical voice-hearers' AVHs as repeated by them cannot be ruled out.

Conclusions
To conclude, using a data-driven approach with linguistic features, two clusters of voice-speech could be recognized. Linguistically, these AVHs-clusters, which we named "compact" and "expanded", mainly differed in their use of referential information and syntactic complexity. Phenomenologically, the amount of negative content and degree of negativity were the most important differences between the AVHs-clusters. Our data show that, compared to expanded-AVHs, compact-AVHs were mainly experienced by clinical voice-hearers. These findings can inform neurocognitive models of AVHs, and also be useful in developing treatments for people with a specific subtype of AVHs.

Table 2
Demographic characteristics of participants.
n = sample size, SD = standard deviation. a Data available only for 18 clinical voice-hearers. b Data available only for 18 clinical and 17 non-clinical voice-hearers.

Table 3
Details of the AVHs' shadows per group of voice-hearers.

Table 4
Characteristics of the participants per cluster of AVHs.
a Data available only for 18 clinical voice-hearers. b Data available only for 18 clinical and 17 non-clinical voice-hearers.

Table 5
Linguistic variables across the two clusters of AVHs.
p-Values were adjusted using Holm correction. n = sample size, SD = standard deviation. H. Corona-Hernández et al.

Table 6
Phenomenological features of the two clusters of AVHs.
p-Values were adjusted using Holm correction. n = sample size, SD = standard deviation. a This information was available only for 17 of 18 participants. b This information was available only for 18 of 22 participants.