Prosodic deficits and interpersonal difficulties in patients with schizophrenia

The present study examines receptive emotional and linguistic prosody in patients with schizophrenia; in particular, its aim was to evaluate the type and number of errors made when comprehending the emotions and modes implied by meaningless utterances. Seventy-eight participants were enrolled in the study and formed two groups (patients with schizophrenia and healthy controls) of 39 subjects each. The severity of illness was evaluated with the Positive and Negative Syndrome Scale; comprehension of emotional and linguistic prosody was assessed by the subtests of the Polish Version of the Right Hemisphere Language Battery. Neither emotional nor linguistic prosody comprehension correlated with schizophrenia symptoms. The study group experienced more difficulties in distinguishing between happiness and anger, and was more likely to misunderstand imperative utterances, confusing them with interrogative or affirmative ones. Such impairments are significant as they may affect the ability to form and sustain relationships with other people, achieve success in the work environment, and integrate in the community. They may also be a trait marker of the illness independent of psychotic symptoms. Further research is needed to translate this knowledge into meaningful therapeutic interventions to improve quality of life, both for affected individuals and for their communication partners.


Introduction
The ability to communicate underpins social interactions and allows speakers to cultivate relationships that fulfill their personal needs. As such, successful interpersonal communication is arguably one of the most important capacities of human beings (Carton et al., 1999). It requires not only the correct and appropriate productive use of language, but also accurate interpretation; such interpretation must also take into account non-linguistic signals conveyed by posture, facial expressions, eye gaze, gesture and tone of voice (Balconi, 2010). One important aspect of such non-verbal communication is speech prosody (Green et al., 2008), which enables the listener to identify and interpret intentions, beliefs, attitudes or sentence ambiguities (Dahan, 2015;Frazier et al., 2006;Kurumada et al., 2014).
The term speech prosody refers to the vocal modulations that accompany speech, and comprises variations in pitch, intonation, loudness, tempo or rhythm (Balconi, 2010). These vocal modulations are used to group syllables and words together so as to render some elements more prominent and important than others. Prosody can be differentiated based on intrinsic (grammatical), intellective (pragmatic) and emotional functions (Balconi, 2010;Caletti et al., 2018). The first aspect deals with the intonational profile of sentences and allows interrogative, imperative or affirmative sentences to be distinguished from each other. The second relates to the speaker's intention and marks particular elements of a sentence, allowing the speaker to express specific intentions and clues regarding its correct interpretation. Finally, the emotional function applies to the vocal expression of emotions and enables the distinction of, for example, a happy or angry voice.
Generally, prosody is an important element of the correct interpretation of sentences (Ross, 2000) especially when they are ambiguous. In such situations, the listener usually relies on prosodic elements to correctly interpret the speaker's meaning. What is more, the prosodic profile of an utterance, i.e. its stress distribution and intonation, can change the meanings of words and sentences, thus conveying different intentions and inducing various perlocutionary effects. Finally, the emotional modulations of prosodic profiles can be used to express mental and emotional states, such as happiness, disapproval and sadness (Dahan, 2015). Consequently, deficits and impairments in both prosody production and comprehension may lead to severe problems in interpersonal communication (Green et al., 2008). Such disturbances have been recognized in various psychiatric disorders (Fricchione et al., 1986;Leentjens et al., 1998;Levin et al., 1985;Murphy and Cutting, 1990) including schizophrenia (Bosco and Parola, 2017;Castagna et al., 2013;Gurańska and Gurański, 2013;Pawelczyk et al., 2018;Ross et al., 2001).
Numerous studies suggest the presence of disturbances in emotional and, possibly, linguistic prosody in affected individuals, although little research has been carried out on emotion-specific and language-specific recognition. A limited number of studies suggest that people with schizophrenia may perform worse when recognizing negative emotions than positive ones (Bell et al., 1997), that they encounter greater difficulties in identifying anger (Amminger et al., 2012b), and in recognizing disgust, happiness and fear than anger or sadness (Bonfils et al., 2019) and that those with comorbid depression can recognize sadness more accurately than happiness (Herniman et al., 2017). Furthermore, only a few studies (Caletti et al., 2018;Gold et al., 2012;Murphy and Cutting, 1990) examine difficulties in recognizing specific emotions, e.g. sadness, happiness or anger, or in identifying interrogative, affirmative and imperative sentences (Caletti et al., 2018). Notably, some of these studies group facial and prosody tasks together (Bell et al., 1997;Edwards et al., 2001), analyze only first-episode patients (Amminger et al., 2012a;Herniman et al., 2017), or do not employ tasks with meaningless sentences (excluding language performance) (Caletti et al., 2018;Murphy and Cutting, 1990). Although some papers evaluate the comprehension of emotions or modes of sentences, most do not investigate the characteristics of the errors made. Also, acoustic features are rarely evaluated (Gold et al., 2012). It is worth mentioning that these differences in task complexity, even when small, may influence outcomes across studies and limit the possibility for comparison. Additionally, the body of evidence concerning the relationship between prosody impairments and the symptoms of schizophrenia remains inconsistent (Caletti et al., 2018;Edwards et al., 2001;Parola et al., 2020;Tseng et al., 2013).
To date, although various procedures have been used to study prosody impairments in schizophrenia (Amminger et al., 2012a;Caletti et al., 2018;Edwards et al., 2001;Gold et al., 2012;Herniman et al., 2017) few studies have employed standardized procedures (Edwards et al., 2002).
Emotional and linguistic prosody thus constitute a vital element of language communication, allowing information to be gathered about the speaker, such as emotional state, dominance and confidence, and in real life they appear to coexist (Lucarini et al., 2020). Therefore, the aim of this study was to analyze differences in emotional and linguistic prosody comprehension between patients with schizophrenia and healthy controls using standardized, validated tests (subtests of the RHLB-PL). More precisely, the study evaluates the type and number of errors made when distinguishing emotions (sadness, anger, and happiness) and the grammatical functions of sentences (expressing statement, question and order) when listening to meaningless utterances. A second aim was to determine how the particular emotions and grammatical functions of sentences were comprehended when they were incorrectly recognized. Finally, the study also examines the relationship between prosody comprehension and schizophrenia symptoms.

Participants
Seventy-eight participants were enrolled in this study; they were divided into two groups of 39 subjects each: one group of patients with schizophrenia and another of healthy controls. The patients met the ICD-10 criteria for schizophrenia and were considered clinically stable by their physicians, i.e. they had been on the same oral antipsychotic therapy for the illness and demonstrated a Clinical Global Impression-Severity Scale (CGI-S) (Guy, 1976) score of ≤1 with no change for six or more weeks prior to enrolment. The mean duration of illness was 4.69 years (range: one year and three months to ten years). The severity of illness was evaluated with the PANSS (Positive and Negative Syndrome Scale) (Kay et al., 1987;Rzewuska, 2002), which was administered by a specialist psychiatrist. The background antipsychotic therapy and concomitant medications were chosen and titrated according to the Polish standards of pharmacotherapy of mental disorders (Jarema, 2015). Daily doses of antipsychotics used were converted into chlorpromazine equivalents using an equivalency table provided by Gardner et al. (2010). The mean chlorpromazine equivalent dose for schizophrenia patients is given in Table 1.
The inclusion criteria for healthy controls comprised no psychiatric history and no family history of psychiatric illness. The participants were confirmed as mentally healthy after an evaluation using a Polish version of the Mini International Neuropsychiatric Interview: M.I.N.I. (Sheehan et al., 1998). The exclusion criteria for all participants were as follows: traumatic brain injury, a neurological or chronic somatic disorder, or alcohol and/or substance abuse or dependence. The groups were matched for age, sex and education. All participants were native speakers of Polish, Caucasian and of Polish ethnicity. Demographic and clinical information for all participants can be found in Table 1.
Abbreviations: PANSS = Positive and Negative Syndrome Scale; p = two-tailed asymptotic probability value (p-value); n = number of participants; SD = standard deviation; N/A = not applicable or not assessed; CPZ = chlorpromazine.

Procedure
All the participants were tested during one session with a clinical neuropsychologist. Comprehension of emotional and linguistic prosody was assessed by two subtests of the Polish Version of the Right Hemisphere Language Battery (RHLB-PL) by E. Łojek (Łojek, 2007). In these two tests, meaningless sentences were prepared and read by a professional speaker in a professional recording studio; although the utterances were based on Polish language sounds, they had no semantic associations (e.g. Hime gifa te lara; Wao tala binera). Ten-second breaks were inserted between the sentences. The recording could be stopped and replayed only in the part with examples, if a participant had difficulty in understanding the test procedure. The sentences were randomly organized. The order of the emotional prosody test was as follows: happiness, sadness, sadness, happiness, anger, happiness, anger, sadness, happiness, happiness, anger, anger, sadness, anger, happiness and sadness. The order of the linguistic prosody test was: question, order, question, statement, statement, order, statement, question, question, statement, order, question, statement, order, order and question.
In the emotional prosody test, the examinee listened to 16 meaningless sentences, each conveying one of three emotions, and was asked to choose which emotion was expressed by the speaker. After listening to each sentence, i.e. in the 10-second break, the subject pointed to the written name of the correct emotion. The test was preceded by an additional three examples (all three stimuli: "Inebese ote norpe") which expressed sequentially: sadness, happiness, and anger; these were not repeated in the proper test to avoid any priming effect.
In the linguistic prosody test, the examinee listened to 16 meaningless sentences, each conveying one of three modes, and again was asked to identify which mode was expressed by the speaker. After listening to each sentence, i.e. in the 10-second break, the subject pointed to the written name of the correct mode. The test was also preceded by three examples (again all three stimuli: "Inebese ote norpe") which expressed sequentially: statement, question, order; again, these were not repeated in the proper test to avoid any priming effect. All the answers were classified as correct or incorrect. Each correct answer was awarded one point, and these scores were summed.
Identical nonsensical sentences were given in the examples section of both tests; none were used in the proper tests. Within the body of the test, a total of six nonsensical sentences were used, and they were employed in such a way that the sentences expressing a certain emotion or mode would not be repeated with a different emotion or mode. The test was organized to avoid any priming effect by linking the linguistic layer with prosody and to make it impossible to know how many sentences express a particular emotion or mode.
The types of errors made by examinees were also analyzed: which emotions were incorrectly recognized and how they were recognized, e.g. whether happiness was recognized as anger or sadness, and which modes were incorrectly recognized and how they were recognized, e.g. whether a question was recognized as an order or a statement. All participants denied having a hearing problem at the time of enrollment in the study.
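The scoring and error classification described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the RHLB-PL scoring procedure itself: the stimulus orders are taken from the lists given in this section, while the function name `score_subtest`, the data layout and the example participant are hypothetical.

```python
from collections import Counter

# Stimulus order of the emotional prosody subtest, as listed in the Procedure.
EMOTION_KEY = [
    "happiness", "sadness", "sadness", "happiness", "anger", "happiness",
    "anger", "sadness", "happiness", "happiness", "anger", "anger",
    "sadness", "anger", "happiness", "sadness",
]

# Stimulus order of the linguistic prosody subtest, as listed in the Procedure.
MODE_KEY = [
    "question", "order", "question", "statement", "statement", "order",
    "statement", "question", "question", "statement", "order", "question",
    "statement", "order", "order", "question",
]


def score_subtest(key, responses):
    """Return (total_correct, confusions) for one participant.

    One point is given per correct answer. `confusions` counts wrong
    answers only, keyed by (intended category, chosen category).
    """
    if len(responses) != len(key):
        raise ValueError("one response is expected per stimulus")
    correct = 0
    confusions = Counter()
    for intended, chosen in zip(key, responses):
        if chosen == intended:
            correct += 1
        else:
            confusions[(intended, chosen)] += 1
    return correct, confusions


# Hypothetical participant (not study data): hears the first happy item as
# angry and the first angry item as happy, and answers the rest correctly.
responses = list(EMOTION_KEY)
responses[0] = "anger"
responses[4] = "happiness"
total, errors = score_subtest(EMOTION_KEY, responses)
print(total, dict(errors))
```

Keeping the intended and the chosen category together in one key is what allows "anger recognized as happiness" to be counted separately from "happiness recognized as anger" when the groups are compared.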
Also, the patients with schizophrenia were assessed with the PANSS (Kay et al., 1987;Rzewuska, 2002); this tool evaluates positive and negative symptoms and measures their relationship to one another and to global psychopathology. It comprises four scales measuring positive and negative symptoms, their differential, and the general severity of illness.
All participants gave their written informed consent prior to their inclusion in this study. The study received approval from the Ethical Committee of the Medical University of Lodz and was performed in accordance with ethical standards laid down in the Declaration of Helsinki.

Statistical analysis
Descriptive and inferential approaches were used in the statistical analyses. The Shapiro-Wilk test was used to determine the distributions of continuous variables. Depending on their distribution, comparisons between the study groups for continuous variables were made using the Student's t-test or the Mann-Whitney U test. Depending on the assumptions met, differences in categorical variables were evaluated using a Chi-square test, Chi-square test with Yates' correction or Fisher's exact test. Kendall's tau-b correlation coefficient was used to test the associations between variables. All statistical tests were two-tailed, with alpha = 0.05 set as the level of statistical significance. To assess the differences in the number and type of errors produced by the groups, a conservative Bonferroni correction was used to correct for multiple comparisons in emotional and linguistic prosody errors. Due to the preliminary character of the study and limited sample size, the assumptions of multivariate methods were not met, and hence multivariate methods were not used. The statistical analyses were performed using Dell Statistica software, version 13 (Dell Inc. 2016).
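As an illustration of two of the steps named above, the following Python sketch computes Bonferroni-adjusted significance thresholds and a two-sided Fisher's exact test for a 2x2 table. It is a minimal sketch using only the standard library, not the Statistica procedure used in the study. The numbers of comparisons (6 for emotional and 5 for linguistic prosody) are those reported in the table footnote; the function names and the example counts are hypothetical.

```python
from math import comb


def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under the Bonferroni correction."""
    return alpha / n_comparisons


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x in the top-left cell given the fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


emotional_threshold = bonferroni_threshold(0.05, 6)
linguistic_threshold = bonferroni_threshold(0.05, 5)

# Hypothetical counts (not study data): 10 of 39 patients and 1 of 39
# controls made a given type of error.
p_value = fisher_exact_two_sided(10, 29, 1, 38)
print(emotional_threshold, linguistic_threshold)
print(p_value, p_value < emotional_threshold)
```

Under this correction, a difference in one error type counts as significant only if its p-value falls below the adjusted threshold for its test family rather than below 0.05.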

Demographic and clinical characteristics
Demographic and clinical characteristics of the study groups are presented in Table 1. No significant differences were observed in age, sex, handedness, or years of education between the groups.

Correlation between PANSS and prosody tests
Prosody comprehension, both emotional and linguistic, did not correlate with schizophrenia symptoms measured by the PANSS. Probability values for the tau-b correlations are presented in Table 2. Table 3 displays the performance on the emotional prosody and linguistic prosody tests and shows the types of errors made in these two tests. The results show that the patients with schizophrenia made more errors than the healthy controls in both tests. More specifically, the patient group was more likely to recognize anger instead of happiness and happiness instead of anger. No differences were found in the number of mistakes made in distinguishing sadness from anger, or happiness from sadness: these emotions were not confused. Moreover, the patients with schizophrenia made more mistakes in which an order was recognized as a statement or a question. However, the two groups did not differ regarding the comprehension of statements and questions: these intonations were recognized correctly.

Table 2
Statistics and probability values for tau-b correlations between the prosody comprehension tests and the PANSS in the schizophrenia group. Abbreviations: PANSS = Positive and Negative Syndrome Scale; p = two-tailed asymptotic probability value (p-value) for tau-b correlation coefficients.

Discussion
This study examines the differences in the types of mistakes made in comprehending emotional and linguistic prosody between schizophrenia patients and healthy controls. The results indicate that the patients with schizophrenia performed more poorly than controls in both tasks, making significantly more mistakes. In particular, they were more likely to recognize anger and happiness incorrectly, and to misunderstand the (imperative) intonation of an order. A significant novel finding was that the intonation of anger was recognized as happiness, happiness was recognized as anger and imperative utterances were recognized both as interrogative and affirmative ones. Emotional and linguistic prosody were analyzed together as they appear at the same time, interact with each other and both convey important information about the speaker, such as his or her emotions, intentions, dominance or even confidence (Lucarini et al., 2020). What is more, in the present study, the number of mistakes did not correlate with psychopathological symptoms of schizophrenia measured by the PANSS. Importantly, the results of our study were obtained with a standardized procedure.
Our findings confirm our hypothesis that patients with schizophrenia make more mistakes when trying to comprehend prosodic aspects of meaningless utterances, and this is consistent with previous studies (Caletti et al., 2018;Castagna et al., 2013;Edwards et al., 2001;Gold et al., 2012;Lin et al., 2018;Murphy and Cutting, 1990); however, it is difficult to compare our findings with those of previous studies due to differences in task requirements. Our patients were also more likely to confuse anger with happiness when compared to healthy controls, which is consistent with Caletti et al. (2018) and Kucharska-Pietura et al. (2005) although, unlike the patients in these studies, they correctly recognized sadness. The differences in sadness recognition may be attributed to dissimilarities in methodology: the previous studies employed the Voice Emotion Recognition Test and Montreal Protocol for the Evaluation of Communication (Caletti et al., 2018;Kucharska-Pietura et al., 2005) whereas the present research used the RHLB subtests. One important difference involves the presented material: previous studies used neutral sentences (Caletti et al., 2018;Kucharska-Pietura et al., 2005) while the participants in the present study were presented with meaningless sentences (pseudo-sentences). However, the results of anger and happiness identification were consistent with these studies despite their different methodology. What is more, in the present study, mood was not measured or controlled as in Kucharska-Pietura et al. (2005); in addition, a more recent study by Herniman et al. (2017) which employed facial emotion recognition found that comorbid depression can be associated with better recognition of sadness.
These results could suggest that the lack of any significant differences in sadness comprehension in our study may have arisen from the lack of mood differences between our groups: the correlations between facial recognition tests and acoustic emotion tests range from 0.35 to 0.70 (Trémeau, 2006). However, confirming this would require further studies combining emotional prosody and mood evaluation.
Also, the importance of multimodal presentations could be considered in future studies in patients with schizophrenia (Swerts and Krahmer, 2005), as it is known that the visual modality is important for emotional processing and prosody may be difficult to interpret outside context (Nadeu and Prieto, 2011). It could be hypothesized that the visual cues may offer further information to support listeners with a lower capacity to interpret vocal-only cues, allowing them to identify unclear prosodic performances.
Interestingly, the patients with schizophrenia more frequently faced difficulties in distinguishing a meaningless/pseudo sentence read in an angry voice from one read in a happy voice. This could be due to similar acoustic cues being misidentified by the listeners (Erb et al., 2020). Previous reviews (Banse and Scherer, 1996;Juslin and Laukka, 2003) have shown that a happy utterance shares most of the acoustic characteristics of an angry one (fast speech rate, high voice intensity levels and variability; high pitch level and variability); however, happy utterances are defined by more of a medium-high pitched voice and less variability in intensity. It is possible that patients with schizophrenia may experience difficulty distinguishing between a high and a medium-high voice pitch and changes in intensity, due to possible neurostructural or neurophysiological disturbances (Demenescu et al., 2015;Hagoort, 2019); this could result in misidentification of anger and happiness. Also, Gold et al. (2012) report that patients exhibited difficulties in emotion recognition tasks due to acoustic features, and that emotion recognition appeared to correlate with pitch perception. Nevertheless, this highlights the need for further research associating the acoustic features of stimuli and emotional prosody identification in patients with schizophrenia.
Table footnote: * = Significance level was set using the Bonferroni correction separately for emotional (p' = alpha/n = 0.05/6) and linguistic prosody (p' = alpha/n = 0.05/5). Abbreviations: RFFT = Ruff Figural Fluency Test; HC = healthy controls; SCZ = schizophrenia; p-value = probability for standardized test statistic; a = Fisher's exact test; b = Chi-square test; c = Chi-square test with Yates' correction; n = number of participants in the group.
In the present study, the patients with schizophrenia demonstrated difficulties in linguistic prosody comprehension; while this finding contradicts those of Murphy and Cutting (1990), it is consistent with the general linguistic prosody disturbances described by Caletti et al. (2018). However, some important differences exist between the methods employed by Caletti et al. (2018) and those of the present study. Firstly, while only first-episode individuals were enrolled by Caletti et al., the present group of patients comprised individuals diagnosed with schizophrenia, with a mean duration of illness of 4.69 years. Secondly, while the first-episode patients in the previous study did not correctly recognize the intonation of interrogative sentences, our participants with schizophrenia exhibited disturbances in the comprehension of imperative intonation. Thirdly, unlike our present study, the Caletti et al. study did not assess how the participants actually understood sentence intonation. The patients in our study group identified imperative intonation as either interrogative or affirmative, which might suggest they were unsure what intonation they had just heard, particularly the pitch. Furthermore, unlike those studied by Murphy and Cutting (1990), our group of patients displayed difficulties in comprehending linguistic prosody.
However, in the earlier study, patients listened to sentences with one word stressed and were asked to show the stressed word; our participants listened to pseudo-sentences pronounced with different intonation and were asked to choose the mode of utterance they had heard. Consequently, the two studies employed different evaluation procedures and assessed different types of prosody.
Nevertheless, the fact that patients with schizophrenia were found to experience trouble in distinguishing between various modes of sentences might be a sign of difficulty in differentiating between intonations, i.e. pitch range, tone of voice, loudness, rhythmicality and tempo. The patients might experience problems with decoding and differentiating the cadence of the utterances, fundamental frequency contours and duration cues. What is more, as the patients had trouble correctly recognizing happiness and anger (similar in the arousal dimension), it could be theorized that this inaccuracy could rely on valence or dominance; however, this would need further research. Nevertheless, these disturbances may appear due to possible dysfunctions in the neuronal networks that underpin these processes (Zhang et al., 2010) and that are associated with linguistic prosody comprehension (Dushanova et al., 2020;Hagoort, 2019). However, these hypotheses would need future studies to be verified.
Furthermore, in the present study, no significant correlation was found between schizophrenia symptoms measured by the PANSS and prosody impairment, which is consistent with some previous studies (Caletti et al., 2018;Kucharska-Pietura et al., 2005). However, this is not in line with other findings indicating a negative correlation between nonverbal emotion recognition and total PANSS score (Tseng et al., 2013) or between emotional prosody and disorganization, negative symptoms and reality distortion (Ventura et al., 2013), or studies suggesting a relationship between voice atypicalities, alogia and flat affect (Parola et al., 2020). These discrepancies might have arisen due to differences in prosody evaluation: while our present study employed pseudo-sentences spoken with various intonations, Tseng et al. (2013) used nonverbal emotion recognition consisting of both facial emotion recognition and voice emotion recognition, Ventura et al. (2013) used only facial emotion recognition and the meta-analysis by Parola et al. (2020) considered voice atypicalities of patients with schizophrenia.
Furthermore, the present study enrolled patients with schizophrenia with a mean illness duration of 4.69 years and a current mean age of 25.67 years, while previous studies included older participants, with mean ages of 38.23 (Tseng et al., 2013) and 36.3 years (Ventura et al., 2013), or with a longer illness duration, e.g. of 13.84 years (Tseng et al., 2013). These demographic and methodological differences might have led to the observed dissimilarities in study results. Nevertheless, the lack of correlation between prosody dysfunctions and symptoms of schizophrenia might suggest that difficulties in prosody comprehension could represent an independent trait of the illness, though this should be verified in future studies examining emotional and linguistic prosody in larger groups of patients at different stages of the illness. Also, it is possible that a more complex correlation may exist between schizophrenia symptoms, cognitive and/or executive functions and prosody, although this should be analyzed in future studies.

Clinical implications
As prosody comprehension plays a vital adaptive role (Juslin and Laukka, 2003), being crucial to social relationships, such disturbances in patients with schizophrenia may seriously affect the development and maintenance of relationships such as friendships and intimate partnerships (Cummings, 2017). Also, incoherent comprehension of prosody and spoken language may seriously disorganize communication and result in inappropriate interpretation of utterances. What is more, difficulties in distinguishing between various emotions, e.g. anger and happiness, along with a tendency to negatively interpret social situations may impair adaptive behavior in everyday life and may increase the risk of violent and criminal behavior (De Sanctis et al., 2013). Therefore, our present findings may be useful for the detection of prosody disfunctions in patients with schizophrenia and for guiding new rehabilitation programs (Marsh et al., 2016;Tan et al., 2018) matched to auditory emotional and linguistic processing. These programs might relate both to the comprehension and production of prosody, target differentiation between emotions and modes of sentences, failures of prosodic information restrain and the inclination to under-rate negative information. What is more, such programs could use various methods to direct the participant's attention to prosodic features and to integrate the prosodic speech channels with the semantic ones; this could result in improving prosodic recognition and integration with verbal information. Consequently, they could also improve the social communication and everyday functioning of patients with schizophrenia.

Strengths and limitations
This study has several limitations. As this was a pilot study, the sample sizes were chosen to detect large differences, and the study might have been underpowered to detect smaller differences. No assessment was made of premorbid general intellectual functioning and therefore, it is not clear whether the differences between patients and healthy controls observed in the present study are only attributable to prosody dysfunctions, or partly to a global cognitive deficit. Furthermore, prosody was evaluated without a formal hearing test prior to the current study, and deficits within the sensory modality may interfere with the ability to understand prosodic aspects of other people's speech. However, all participants denied having a hearing problem at the time of enrollment, and no such difficulty was observed during the assessment process. Also, poor effort was not controlled for; however, no clinical signs of intergroup differences were observed in this regard. Also, the duration of untreated psychosis was not measured, which could have confounded the results. The results of the prosody evaluation in the schizophrenia group might also have been influenced by the antipsychotic medication used; however, the dosage of antipsychotics used was relatively low (mean 340.26 mg/d of CPZ equivalents), and there is little risk that the sedating effects adversely influenced the results of the prosody assessment.
In addition, no independent evaluation was made of concurrent mood status, and it might be argued that failure to identify a happy emotion is secondary to depressive mood, which is prevalent in patients with schizophrenia. Finally, it is important to emphasize that the study is based on an experimental condition that is very different from real interpersonal communication, where nonsense phrases are not used, and the prosody test itself needs further analyses to better assess its prosodic and acoustic characteristics.

Conclusions
In summary, our results, although somewhat limited in their generalizability, suggest that patients with schizophrenia demonstrate disturbances in emotional and linguistic prosody comprehension. In particular, they experience difficulties in distinguishing between happiness and anger, and appear likely to misunderstand imperative utterances, confusing them with interrogative or affirmative ones. Such impairments are significant since they may affect the ability to form, manage, and maintain relationships with other people, or achieve success in the work environment.
Also, deficits in prosody processing may represent a trait marker of the illness, being independent of psychotic symptoms. As such, it would be valuable to pursue further studies on prosody comprehension and production exploring the influence of such impairments at different stages of the illness and in healthy first-degree relatives. Much further research is needed, however, to translate this knowledge into meaningful therapeutic interventions that can yield improved quality of life, both for affected individuals and for their communication partners.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of Competing Interest
The authors declare that there is no conflict of interest.