Introduction

The COVID-19 pandemic that began in the spring of 2020 significantly affected many aspects of people’s everyday lives. The measures that individual countries adopted to contain the epidemic differed in both timing and severity. In the Czech Republic (EU), the reaction was relatively fast and strong. Between 10 and 16 March 2020, severe restrictions were adopted, in particular restrictions on free movement, the closure of schools, shops, and restaurants, and bans on sporting, cultural, and other activities. As of 19 March, all persons were obliged to wear a face mask in public. The borders were closed completely, and most Czech nationals staying abroad were repatriated. All these measures were highly progressive and radical within the context of the situation in Europe at the time, although similar restrictions were subsequently adopted by numerous other countries.

The state of emergency had a major impact on people’s behaviour and emotional experience. Studies published so far show an extensive psychological impact. Using the Kessler‐6 scale, Twenge and Joiner (2020) demonstrated that the level of mental distress in the US population was three times higher than in 2019 or 2018. Wang et al. (2020) reported that 53.8% of all respondents in their Chinese study scored as medium to serious psychological impact, out of whom 28.8% scored as medium to high seriousness and 8.8% scored as very serious anxiety level symptoms. The first studies from the Czech Republic also refer to an increased level of anxiety and fear (e.g. 40% of the respondents feared that they or their loved ones would fall ill with COVID-19 with serious symptoms; Rabušic, 2020). The emergency brought significant changes on the level of interpersonal contact as well. Personal meetings were replaced by phone calls and video conferencing. In the media as well as in day-to-day communication, words such as “face mask”, “lockdown”, and “quarantine”, very rare up to that point, resonated strongly. However, experiencing negative emotions need not influence only the explicit content of the communication (language content); it may also influence language style (i.e. lexical and morphological layers of the communication; Chung & Pennebaker, 2007).

The present study aims to analyse selected parameters of communication in the form of thematic verbal utterances which might be significant in relation to the psychological and social aspects of the state of emergency during the COVID-19 epidemic in the Czech Republic. The basic research hypothesis postulates that an unexpected and emotionally disturbing situation will be reflected in the content and form of people’s utterances on the situation (for similar studies, see, e.g. Fiehler, 2002; Peters et al., 2009; Sun et al., 2019; Bernard et al., 2016). This hypothesis is important not only for describing the emotional experiences of speakers at a specific time, but also for verifying the potential of analysing psychological processes from a quantitative linguistic perspective, i.e. through natural language processing.

The hypothesis is followed by three research questions:

  1. Which words resonate most in the thematic utterances, i.e. which lexical–semantic basis do people use to describe the existing situation and the emotional experience thereof?

  2. What are the specifics of the utterances on the lexical–morphological level in terms of respondents’ gender and various age cohorts?

  3. In which manner is the lexical–morphological level of the utterances influenced by the respondents’ current emotional experience?

The study consists of a series of psychological–linguistic analyses of the respondents’ utterances (N = 2552) in response to two questions: “What does the current situation mean to you: Has it changed your life in any way? If so, how?” (this question focused primarily on the interpretation of the situation) and “How do you currently experience the situation: What do you consider the worst? On the other hand, what helps you?” (this question focused primarily on the impact of the situation and the related coping strategies). The data further include demographic descriptors of the respondents and the results of the SEHW (Scales of Emotional Habitual Subjective Well-being) questionnaire, which determines the valence of the experienced emotions.

Verbal Communication Analysis: Psychology of Word Use

A person’s verbal communication is the subject of study in several disciplines and especially a subject of long-term research in psychology (Gray, 1991). The relationships between specific communication patterns and a person’s interpersonal and intrapersonal functioning have been established in a large number of studies, e.g. screening and diagnostics of disorders through the analysis of speech products (Crystal, 1987), revealing the identity of anonymous communication (Matoušková, 2013), the prognosis of an author’s text or communicator’s behaviour (Canter & Youngs, 2009), or automatic extraction of opinions and attitudes from a text (Rodríguez-Puente et al., 2016). Studies inquire into the specific linguistic markers of gender (e.g. Sboev et al., 2016), emotionality (e.g. Brewer & Gardner, 1996), relationships (e.g. Newman et al., 2008), temperament (e.g. Mairesse et al., 2007; Schwartz et al., 2013), or pathological characteristics (e.g. Demjén, 2014).

Psychological analysis of language use usually differentiates between what a person says (language content) and how the person says it (language style; Chung & Pennebaker, 2007). The importance of language variability of a single person (language style) was repeatedly described on the level of general language usage (e.g. Chen & Bond, 2010), but also in specific word manipulations (e.g. Ireland & Mehl, 2014; Newman et al., 2008).

In terms of the relationship between specific linguistic features (language style) and the characteristics of the communicator, it is most often cited that, for example, women more frequently use “involved” parts of speech (e.g. more pronouns, present tense verbs) in comparison with men who prefer “informative” language (e.g. more nouns, long words, numbers, articles, prepositions; Biber, 1991; Newman et al., 2008). Women also more often use words in first person singular (Mehl & Pennebaker, 2003), negations and verbs (Newman et al., 2008), and more frequently express emotions and self-disclosure tendencies in the text (Holtgraves, 2014). In terms of communicators’ age, the documented differences include, for example, a higher ratio of words with positive emotional load and future tense in older people (Pennebaker & Stone, 2003).

If we focus on the connections between language style and the specifics of emotional experience (emotional state), numerous approaches attempting to detect emotions in text have been introduced (see, e.g. Pang & Lee, 2008; Ribeiro et al., 2016; Sun et al., 2019). Most of the studies use traditional quantitative dictionary-based detection of words, most frequently with the LIWC language analysis software (Linguistic Inquiry and Word Count; Pennebaker et al., 2015; see below). This research documented, for example, a close connection between emotional experience and the use of emotionally loaded words (negative emotional experience resulted in higher use of negatively loaded words; Bernard et al., 2016) as well as the occurrence of pronouns (the same relationship holds for first-person personal pronouns; see the meta-analysis by Holtzman, 2017). These conclusions are also supported by studies focusing on the issue of manifestations of depression and anxiety (e.g. Anderson et al., 2008; Arntz et al., 2012).

The aforesaid methods of quantitative natural language processing, in which both lexical–semantic and morphological analyses are employed, are substantial for our research. As in the above-mentioned studies, we also apply computational and statistical methods to search for relationships between language style and descriptors of the emotional experience of respondents (based on their psychological test results). In this case, however, we use an updated set of techniques, designed with respect to the Czech language specifics and higher linguistic variability.

The majority of the published studies were conducted in the English language, which brings certain risks to the transferability of the results to non-English-speaking populations. This paper focuses on the analysis of the Czech language. The Czech language, a member of the West-Slavic language group, is spoken by relatively few native speakers (10.7 million native speakers; cf. 379 million first-language speakers of English), but it is the 20th most frequently used language on the internet (W3Techs, 2020). In terms of psychological research, several studies on the Czech language, associated under the CPACT project (Computational Psycholinguistic Analysis of Czech Text; Kučera, 2018b), have been conducted in the past few years, focusing on the relationship between the morphological and lexical aspects of the text and the Big Five personality traits (Havigerová et al., 2018), dominance (Kučera et al., n.d.), and depressivity (Havigerová et al., 2019). The results of these studies imply that a large part of the findings are comparable with the results of the anglophone studies (see Havigerová et al., 2018). However, the type of the text and the comparability (similarity) of the communication situation (to be more precise, the comparability and variability of the text registers selected for the study) play an important role (see e.g. Biber, 1993; Kučera, 2020).

Method

Data Collection

The data collection was carried out within the JUPSYCOR project, “Research on the Psychological Impacts of the Coronavirus Epidemic in the Czech Republic” (www.jupsycor.cz). The open online interface was promoted through social networks and e-mails sent by cooperating institutions. The interface enabled two types of data collection—one for individual respondents and the other one for assistant interviewers who received online training and collected utterances from other respondents (i.e. the respondents who agreed to participate but would not be able to participate online were interviewed in the very same format through a phone call and their utterances were typed into the web form; see below). The respondents were fully informed about the nature of the survey, asked for informed consent, and participated voluntarily. They provided demographic data, answered two open questions related to their perception and experience of the COVID-19 situation (i.e. utterances), and completed self-reporting questionnaires capturing their emotional states in recent days. The data were collected for 68 days, from 18 March to 25 May 2020. This period was chosen with reference to the development of the pandemic situation in the Czech Republic (from the adoption of major restrictions in March to their decrease in May; Vlada, 2020) (Fig. 1).

Fig. 1 Research period and situation milestones in the Czech Republic (www.jupsycor.cz)

Sample

A total of 2552 respondents participated in the research, men and women aged 18–89 years. This sample was obtained via opportunity sampling (see Data collection). The respondents were divided into six categories based on age (age groups; Table 1). We also included the respondents’ representation in terms of their highest achieved level of education (primary school, secondary school, university) and social classification (student, pensioner/retired, other) (Table 2). For the needs of further analysis, the respondents were divided into six groups according to age and gender (cohort groups, Table 3).

Table 1 Sample: Age groups
Table 2 Sample: Education level and social classification
Table 3 Sample: Cohort groups

Due to the data collection method, the sample could not be balanced with respect to various demographic variables or time of participation (see CSU, 2020). The sample has a significantly increased proportion of females (79%) and young persons under 26 years (42%). The sample also shows an increased proportion of people with a university degree (38% compared to 22% of the population aged 25–64 years as reported by the National Education Fund in 2015; NVF, 2015).

Text Materials (utterances)

The respondents were asked two questions focusing predominantly on (Q1) the interpretation of the COVID-19 situation by the respondent and (Q2) the negative and positive impacts on the respondents and their coping strategies. The respondents could write any utterance in reply to each question (no min./max. length of the text was specified). These utterances were typed into a web form.

  • Q1: What does the current situation mean to you: Has it changed your life in any way? If so, how?

  • Q2: How do you currently experience the situation: What do you consider the worst? On the other hand, what helps you?

As stated above, some utterances are based on a literal transcription of the respondent’s utterance by the assistant interviewers (N = 561); however, most were entered directly by the respondent (N = 1991). The only editing performed on the materials was the correction of typos in the text.

Linguistic Analysis

The basis of the linguistic analysis is the use of three referential dictionaries, SYN2015, SENS, and LIWC2007, and a set of linguistic applications for the analysis of text in the Czech language (PMA).

SYN2015: Representative Corpus of Contemporary Written Czech (Křen et al., 2016) is a 100-million-word corpus. It was created as a representation of printed language from 2010 to 2014, containing a wide range of text types (fiction, professional literature, newspapers, etc.). The corpus is lemmatized, morphologically and syntactically annotated by a combination of stochastic and rule-based methods. In terms of this study, SYN2015 was used for frequency analysis.

SENS: Dictionary of Emotionally Loaded Words was created by the adjustment of Czech SubLex 1.0 dictionary (Veselovská & Bojar, 2013), performed by the Institute of Theoretical and Computational Linguistics (UTKL; Faculty of Arts, Charles University). The adjustment consisted of deleting 94 words without a sufficient and/or completely obvious emotional load. SENS features 928 words (lemmas) altogether, annotated by a positive, negative, or undetermined emotional load.

LIWC2007: Linguistic Inquiry and Word Count (Pennebaker et al., 2007) is a text analysis program based on the closed-vocabulary approach. Its dictionary is composed of circa 4,500 words and word stems. Each word or word stem is assigned to one or more word categories or subdictionaries (see ibid.). For example, the word “cried” is a part of five word categories: sadness (130 Sad), negative emotion (127 Negemo), overall affect (125 Affect), verb (11 Verbs), and past tense verb (13 Past). For this study, synonymous expressions were identified in the dictionary (after translating Czech words into English), specifically in the relevant categories of Psychological Processes (Sects. 121–253) and Personal Concerns (Sects. 354–360). Compared with the LIWC2007 version used here, the newer LIWC2015 version features some modified parts of the dictionary (moreover, certain categories were deleted, e.g. the morphological category of Tense); nevertheless, both versions demonstrate comparable parameters and a high degree of congruity (Pennebaker et al., 2015).

PMA: Prague Morphological Analysis (Hajič, 2001). All obtained texts were further processed via UTKL applications (Jelínek, 2018), collectively termed PMA. These applications represent a Czech variant of the LIWC (see the comparison in Kučera & Haviger, 2019). However, except for one specific category (Emotions), they primarily focus on morphological analysis. The outcome of this process is the allocation of morphological tags to every lexical unit of the text with an average accuracy of 95% and, for the detection of certain linguistic variables (e.g. part of speech), an accuracy as high as 99.5% (Skoumalová, 2011). This study used those linguistic categories that show high compatibility with the anglophone LIWC, i.e. the grammatical categories of Part of speech, Person, Tense, Degree, and Negation, and the semantic category of Emotions, which is based on the implementation of the SENS dictionary (see above). All these categories are processed as relative frequencies of occurrence (i.e. the ratio of the number of occurrences of the given category to the number of words in the utterance). The overview of the categories and subcategories is included in Table 4.
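The relative frequency described above can be illustrated with a minimal sketch. It is not the PMA implementation: the tag labels and the example utterance are hypothetical, and the sketch assumes that every tagged token counts as one word.

```python
from collections import Counter

def relative_frequencies(tags):
    """Return each tag's share of all tagged words in one utterance."""
    total = len(tags)
    if total == 0:
        return {}
    counts = Counter(tags)
    return {tag: n / total for tag, n in counts.items()}

# Hypothetical part-of-speech tags for a six-word utterance
# (illustrative labels only, not output of the PMA tools).
utterance_tags = ["N", "V", "R", "N", "D", "V"]
freqs = relative_frequencies(utterance_tags)
print(freqs)
```

Because each value is a share of the utterance length, utterances of different lengths can be compared directly.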

Table 4 Linguistic categories in the analysis

Psychological Measures

To ascertain the respondents’ emotional experience, the SEHW (Scales of Emotional Habitual Subjective Well-being) questionnaire (Džuka, 2015; Džuka & Dalbert, 2002) was used. The SEHW is a ten-item questionnaire focused on the emotional component of subjective well-being (Diener, 1994), with the Positive Affect Scale consisting of four descriptors (pleasure, happiness, joy, and physical freshness/energy/briskness) and the Negative Affect Scale comprising six descriptors (anger, guilt feelings, shame, fear/anxiety, pain, and sadness/sorrow). Respondents answered the question “How often have you experienced these affects in the past few days?” by rating each affect on a six-point frequency scale ranging from 1 (almost never) to 6 (almost always). The explicitness of the questionnaire statements and the simplicity of answering were the main reasons for choosing the SEHW. The questionnaire has been used in numerous studies on well-being in different populations (e.g. Džuka & Dalbert, 2006; Gurková et al., 2012). In the validation study (Džuka & Dalbert, 2002), the internal consistency estimate (Cronbach’s alpha) was 0.83 for the Positive Affect Scale and 0.67 for the Negative Affect Scale. In the present study, the respective Cronbach’s alphas were 0.85 and 0.74.
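As an illustration of how such scale scores and the internal consistency estimate can be computed, the following sketch averages the item ratings of each scale and applies the standard Cronbach’s alpha formula. The item keys, the example ratings, and the helper names are hypothetical; the sketch assumes that a scale score is the plain mean of its items.

```python
from statistics import mean, variance

# Item keys are shorthand for the descriptors listed above (hypothetical names).
POSITIVE = ["pleasure", "happiness", "joy", "energy"]
NEGATIVE = ["anger", "guilt", "shame", "fear", "pain", "sadness"]

def scale_mean(answers, items):
    """Mean rating (1-6) over the items of one scale."""
    return mean(answers[item] for item in items)

def cronbach_alpha(rows):
    """Cronbach's alpha; rows = one list of item ratings per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up ratings of one respondent (not study data).
answers = {"pleasure": 4, "happiness": 3, "joy": 4, "energy": 3,
           "anger": 2, "guilt": 1, "shame": 1, "fear": 3, "pain": 2, "sadness": 3}
sehw_p = scale_mean(answers, POSITIVE)
sehw_n = scale_mean(answers, NEGATIVE)

# Two perfectly consistent toy items yield alpha = 1.
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```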

Results

Description of Utterances

Table 5 describes the numbers of words, sentences, and tokens (individual occurrences of a linguistic unit) that were recorded within the utterances (texts Q1 and Q2) while employing PMA (Prague Morphological Analysis). All texts which featured at least one word in both utterances (Q1 and Q2) were included in the analyses. The results demonstrate that both men and women wrote utterances of similar length, equivalent to short commentaries, slightly longer in the case of text Q2.

Table 5 Utterances: Q1 and Q2 texts description for the whole sample, females and males

Description of SEHW Questionnaire

Table 6 features an overview of the SEHW (Scales of Emotional Habitual Subjective Well-being; Džuka & Dalbert, 2002) questionnaire results according to the individual items and the average score for negative emotions (SEHW_N) and positive emotions (SEHW_P). Table 7 features a division of respondents into two groups—low scoring and high scoring in SEHW_N negative emotions (high = mean > 2.3) and low scoring and high scoring in SEHW_P positive emotions (high = mean > 3.5), determined according to the sample median.
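The grouping into low and high scorers can be sketched as follows. The sketch assumes that a respondent is labelled "high" when the scale mean exceeds the cut-off, as stated above; the scores are made up and the function name is hypothetical.

```python
from statistics import median

def split_by_cutoff(scores, cutoff):
    """Label each scale mean as 'high' (above the cut-off) or 'low'."""
    return ["high" if s > cutoff else "low" for s in scores]

# Hypothetical SEHW_N scale means for five respondents (not study data).
sehw_n = [1.5, 2.0, 2.3, 3.1, 4.0]
cutoff = median(sehw_n)  # median of this toy sample
groups = split_by_cutoff(sehw_n, cutoff)
print(cutoff, groups)
```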

Table 6 SEHW questionnaire descriptives (SEHW)
Table 7 SEHW questionnaire: High and low scores

In previous studies, the positive affect mean score SEHW_P ranged from 3.00 (elderly people; Džuka & Dalbert, 2006) to 3.85 (high school students; ibid.) and negative affect mean score SEHW_N ranged from 2.28 (nurses; Gurková et al., 2014) to 2.96 (elderly people; Džuka & Dalbert, 2006). The values gathered in the present study fall within these ranges.

Most Distinctive Words: Results of Frequency Lexical Analysis

After performing the frequency analysis, words occurring most frequently in the utterances (Q1 and Q2) of the whole sample (N = 2552) were detected, namely in the Part of speech category: Nouns, Adjectives, Verbs, and Adverbs. These words were then looked up in the frequency dictionary of the SYN2015 corpus to determine the rank at which they occur in this corpus. Subsequently, the S/J ratio was calculated by dividing a word’s rank in SYN2015 (rank 2) by its rank in the JUPSYCOR Q1/Q2 texts (rank 1). For instance, the noun “restriction” has rank 2 = 903 in the SYN2015 corpus but rank 1 = 9 in our Q1 responses. Its S/J ratio is therefore approximately 100, i.e. it was far more prominent in the utterances than in common Czech communication. The overview in Tables 8 and 9 includes words with S/J ratio ≥ 25. This value was determined ad hoc, with regard to the legibility of the list and the visualization (approximately 25 words for all parts of speech related to one question). It was ascertained for each word whether it appears in the SENS dictionary (and if yes, the emotional load of the word was included) and whether the same or a synonymous word appears in the LIWC dictionary (if yes, semantic–psychological connotations of the word were included).
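The S/J ratio and the selection threshold can be sketched in a few lines. The ranks for “restriction” are the ones quoted above; the second entry is a made-up word added only to show the filter, and the function name is hypothetical.

```python
def sj_ratio(rank_syn2015, rank_jupsycor):
    """Rank in the SYN2015 corpus divided by rank in the JUPSYCOR texts."""
    return rank_syn2015 / rank_jupsycor

THRESHOLD = 25  # words with an S/J ratio of at least 25 are reported

# word -> (rank in SYN2015, rank in JUPSYCOR Q1)
ranks = {
    "restriction": (903, 9),          # values quoted in the text
    "hypothetical_word": (120, 40),   # made-up entry, ratio 3.0
}

ratios = {word: sj_ratio(s, j) for word, (s, j) in ranks.items()}
distinctive = {word: r for word, r in ratios.items() if r >= THRESHOLD}
print(distinctive)
```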

Table 8 Question 1 (Q1): Most distinctive words in respondents’ utterances (N = 2552)
Table 9 Question 2 (Q2): Most distinctive words in respondents’ utterances (N = 2552)

For the illustration of significant words (in terms of all parts of speech), a visualization of Q1 (Fig. 2) and Q2 (Fig. 3) was performed in a word cloud form. The font size corresponds to the S/J ratio value.

Fig. 2 Q1: Most distinctive words (word cloud)
Fig. 3 Q2: Most distinctive words (word cloud)

The most distinctive words include, for instance, “as”, “face mask”, “contact”, and “uncertain”. Among the distinctive words that carry an emotional load according to the SENS dictionary, negatively loaded words form the overwhelming majority; the only exceptions are two words with a positive load in Q2 (“calmness” and “manage”). In terms of the psychological connotations of words according to the LIWC dictionary, the categories of Social (9 words), Anxiety (6 words), Inhibition (5 words), and Work (4 words) predominate.

Comparison of Respondent Groups in Terms of Linguistic Categories Usage

Further analyses compared the use of linguistic categories (i.e. how the utterances are phrased) between men and women (Table 2) and among the six cohorts (Table 3). A Mann–Whitney U test and a Kruskal–Wallis H test were run to determine whether there were significant differences in the relative frequencies of linguistic categories between these groups. The effect sizes (Cohen’s d) of the presented results are within the range that Cohen (1988) reports as a small effect (0.1–0.3), as given in Tables 10 and 11.
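For readers unfamiliar with the Mann–Whitney U test, the following sketch computes the U statistic directly from its definition by counting, over all pairs, how often a value from one group exceeds a value from the other (ties count as one half). It does not compute p-values or effect sizes; in practice a statistics package would be used. The data and the function name are hypothetical.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, counting ties as 0.5."""
    u_x = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )
    u_y = len(x) * len(y) - u_x
    return u_x, u_y

# Made-up relative frequencies of verbs in two small groups (not study data).
group_a = [0.10, 0.12, 0.15]
group_b = [0.18, 0.20, 0.22]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```

A U value of 0 for one group means that none of its values exceeds any value in the other group, which is the strongest possible separation for samples of this size.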

Table 10 Gender and linguistic categories for both Q1 and Q2 (Mann–Whitney U test)
Table 11 Cohort groups and linguistic categories for both Q1 and Q2 (Kruskal–Wallis H test for independent samples)

The influence of gender on phrasing the utterances was proven in three linguistic categories (POS–V, POS–R, and Em2.-), in both Q1 and Q2. In their utterances, men used a significantly lower number of verbs, fewer prepositions, and fewer emotionally negatively loaded words (Table 10).

The diversity of the cohort groups was proven in nine linguistic categories for Q1 and Q2 simultaneously (Table 11). The groups’ general tendencies in linguistic categories usage (group means/medians) are highly comparable for both texts (Q1 and Q2). The most distinctive in this regard are categories: prepositions (POS–P), used more frequently by younger people (men and women) in contrast to middle-aged people; verbs (POS–V), which are used to a higher degree by young people (especially women), and, on the other hand, less by especially older men; first person (Per–1), once again used primarily by young people, but also older women.

The Polarity of Respondents’ Emotional Experience in Relation to Usage of Linguistic Categories

Another set of analyses focused on the differences in linguistic categories usage between respondents who scored either high or low on the SEHW_N (negative emotions) and SEHW_P (positive emotions) scales (Table 7). A Mann–Whitney U test was run to determine whether there were significant differences in the relative frequencies of linguistic categories between these groups for both Q1 and Q2. However, the effect sizes (Cohen’s d) of the presented results are within the range that Cohen (1988) reports as a very small effect (0.1 on average), as given in Tables 12 and 13.

Table 12 SEHW_N score (negative emotions) and linguistic categories for both Q1 and Q2 (Mann–Whitney U test)
Table 13 SEHW_P score (positive emotions) and linguistic categories for both Q1 and Q2 (Mann–Whitney U test)

Although the differences between the groups are minor, people reaching a higher score in SEHW_N (negative emotions) exhibit a significantly higher usage of adjectives (POS–A) and future tense (Ten–F) in both texts (Q1 and Q2), and, in contrast, a lower usage of adverbs (POS–D) (Table 12). Regarding SEHW_P (positive emotions), the differences are again very minor; however, there is an obvious difference between the groups in the category of Deg–2 (second degree, comparative), which is used more often by people with a higher ratio of positive emotions (in both texts; Table 13).

Relationships Between Emotional Experience and Linguistic Categories Usage in Various Respondent Groups

A Spearman’s rank-order correlation was run to assess the relationship between 27 linguistic categories (relative frequencies, Table 4) and 12 SEHW scores (questionnaire results, Table 6) for Q1 and Q2. The test was performed on nine different sample groups altogether: on the whole sample (N = 2552), on the six cohorts (Table 3), and on men and women (Table 2).
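Spearman’s coefficient is the Pearson correlation of the ranked values, which the following sketch makes explicit. It is a plain-Python illustration with made-up data and hypothetical function names, not the software used in the study, and it does not compute p-values.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Made-up data: share of negatively loaded words and fear rating (1-6).
neg_word_share = [0.00, 0.05, 0.02, 0.10]
fear_rating = [2, 4, 1, 6]
rho = spearman_rho(neg_word_share, fear_rating)
print(round(rho, 3))
```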

A high number of small but significant correlations (p < 0.05) were found within all groups, usually with rs values between −0.1 and 0.1. Three hundred and fifty-one significant correlations were found in Q1, and 336 significant correlations in Q2. Ninety-four of these correlations were significant in both texts, and 93 of them showed the same correlation direction (see Supplement 1).

After applying Šidák’s adjustment (Šidák, 1967), which set the significance level to padj < 0.0001583, 48 significant relationships were found in Q1 across all groups and 10 significant relationships in Q2. Six of these relationships met the adjusted criterion in both Q1 and Q2 (Table 14). The relationships confirmed after Šidák’s correction are linked primarily to the linguistic category Em2.- (emotionally negatively loaded words), which positively correlates with the scales SEHW_6 (fear) and SEHW_N (negative emotions mean). It is therefore apparent that the respondents experiencing negative emotions use a higher number of negative words. The only confirmed morphological category was Ten-F (future tense), which showed a positive correlation with the scale SEHW_6 (fear) within the whole sample (N = 2552). People experiencing fear therefore use a higher number of words related to the future (Table 14).
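The adjusted threshold can be reproduced with the standard Šidák formula. The sketch below assumes an initial significance level of 0.05 and that the number of tests equals 27 linguistic categories × 12 SEHW scores = 324; this count is an inference from the figures given above, not something the text states explicitly.

```python
def sidak_alpha(alpha, n_tests):
    """Per-test significance level under Šidák's correction."""
    return 1 - (1 - alpha) ** (1 / n_tests)

# Assumption: one test per pair of linguistic category and SEHW score.
n_tests = 27 * 12
alpha_adj = sidak_alpha(0.05, n_tests)
print(f"{alpha_adj:.7f}")
```

Under these assumptions the result rounds to the reported threshold of 0.0001583.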

Table 14 SEHW and linguistic categories for Q1 and Q2 (Spearman’s rank-order correlation; Šidák’s adjustment)

In terms of relationships concerning solely morphological categories, Table 15 features an overview of 11 significant relationships that are related to both Q1 and Q2 and scales SEHW_N (negative emotions mean) and SEHW_P (positive emotions mean). The complete correlation matrix is included in Supplement 2.

Table 15 SEHW_N and SEHW_P scales and morphological categories for Q1 and Q2 (Spearman’s rank-order correlation)

Discussion

The previous text introduced the results of a study focusing on word usage in reflections on the state of emergency during the COVID-19 epidemic in the Czech Republic, and on the connections between these words and the emotional experience of 2552 respondents. The importance of the study lies in two aspects: first, it describes the specifics of thematically focused utterances and their linguistic parameters in different respondent groups, and, second, it documents those linguistic features that refer to the respondents’ emotional experience.

Before interpreting the results as such, it is necessary to point out the specifics of the research sample, of the time frame of the data collection, and of the Czech language. As mentioned above, the sample features a majority of women and young people, predominantly students. It is, therefore, necessary to consider the extent of the influence of selection bias on the results. Nevertheless, in comparison with other COVID-19-themed studies (see, e.g. Özdin & Bayrak Özdin, 2020; Rodríguez-Rey et al., 2020), we appreciate the relatively large representation of older people. This representation was achieved partly because the older respondents were often questioned via assistant interviewers (see above) rather than contacted solely via social media. The time frame selected for the research covered 68 days. It is therefore apparent that the respondents’ utterances might have been (and undoubtedly partially were) influenced by the situation at the given time. From 18 March to 5 April, the situation in the Czech Republic was at its most serious (adoption of major emergency measures, e.g. closure of schools, restaurants, and shops, mandatory face masks, restrictions on free movement, etc.); afterwards, the restrictions were gradually eased, and in late May, the mood in society was relatively optimistic (albeit with milder restrictions still applicable). In this respect, it was difficult to set a clearer limit than the one relating to the adoption of government measures (see Vlada, 2020). The analysis of the Czech language also required certain compromises, connected predominantly to the necessity of translating key words (including the presentation of relevant examples and tracing relationships with English expressions) and the selection of suitable linguistic categories (which were selected especially with regard to their compatibility with English). The central point in this regard was the transparency of the whole process while also striving for maximum transferability of the results to other languages.

If we focus on the first research question (What words resonate the most in the thematic utterances, i.e. which lexical–semantic basis do people use to describe the current situation and their emotional experience thereof?), it is not surprising that across all utterances, the words that resonated most were words connected with the social situation and with negative connotations. Words related to anxiety and inhibition and references to the social environment and work are prevalent. However, in the second utterance (focused, among other things, on coping), words suggesting positively perceived activities appeared as well (e.g. “calmness, nature, walk, chill”). It is also not surprising that the highest-ranking positions among lexically unique words are occupied by words such as “face mask, lockdown, infected”, which were also omnipresent in the Czech media at that time (see, e.g. Trait, 2020). The adverb “as” (“jako” in Czech) is an interesting phenomenon, because it appeared in both utterances 390 times more than in regular communication. This word may have fulfilled several roles in the utterances: the common usage (e.g. She works “as” a teacher), expressing similarity (e.g. He behaves “like” a mad man), connecting (e.g. In winter “as well” as in summer), presenting an example (e.g. Some people, “such as” old persons), and, colloquially, expressing aloofness (e.g. So what?). The word may thus indicate a tendency to refer to another fact or parallel, or the inability to specify the content of the communication. The transition towards unspecified cognitive categories and metaphorical language might mean that the situation is cognitively more complex than is common, or that it is not sufficiently cognitively processed by the respondent (which manifests also on the verbalization level; see e.g. Lupyan & Casasanto, 2015).

In terms of the second research question (What are the specifics of the utterances on the lexical–morphological level in terms of respondents’ gender and various age cohorts?), the analysis of differences in the usage of linguistic categories among the respondent groups yielded several significant results, albeit with relatively low effect sizes. In their utterances on the perception of the COVID-19 situation, men used fewer verbs, fewer prepositions, and fewer emotionally negatively loaded words. However, these findings generally conform to reference research on common text, i.e. communication outside of an exceptional situation (e.g. Biber, 1991; Newman et al., 2008). That the presented findings mostly reflect common gender differences is supported by a comparison with studies on the Czech language carried out within the CPACT project (Kučera, 2018b): the use of verbs and the more frequent use of the first person can generally be considered a relatively reliable gender indicator (Kučera, 2020, p. 84). In the comparison of the six cohorts (based on gender and age), the distinctive differences concern prepositions, which were used more frequently by younger people (men and women) than by middle-aged people; verbs, which were used more frequently by young people (especially women; in contrast, less frequently by older men); and the first person, which was again used predominantly by young people, but also by older women.

The results related to the third research question (In which manner is the lexical–morphological level of the utterances influenced by the respondents’ current emotional experience?), which concentrated on the relationship between the usage of linguistic categories in the utterances and the scores in the SEHW questionnaire (negative and positive emotions), confirm the premises presented in the reference research. Regarding the lexical–semantic basis of the utterances, it is apparent that negative emotional experience positively correlates with the usage of emotionally negatively loaded words (see Bernard et al., 2016). This relationship is even more distinctive for the sixth item of the SEHW questionnaire, which, within the negative emotions group, asks directly about the experience of fear; it stands out in the group of younger women in particular. It should be noted that women, and younger women in particular, generally scored highest on the negative emotions scales in comparison with other groups (although the negative emotions scores generally fell within the SEHW test norms). The congruence between words and emotions is therefore the most pronounced here. The importance of the aforementioned relationships is also supported by a comparison with the results of studies carried out within the CPACT project (see above), which attest to a significantly lower occurrence of emotionally negatively loaded words in common text than in the studied thematized utterances. Higher scores of negative emotional experience in women (especially in younger women) are well documented in many cross-cultural studies (see, e.g. De Bolle et al., 2015; Klimstra et al., 2009). It could therefore be assumed that the general trend is similar even in the state of emergency.
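The kind of relationship described above, between the share of emotionally negatively loaded words in an utterance and a questionnaire score, can be illustrated with a plain correlation coefficient. The sketch below uses Pearson's r purely as an assumption for illustration; the text does not state which coefficient was used, and the function name and the four data points are invented.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sd_x * sd_y)


# Invented toy data: share of negative words per utterance and a
# hypothetical negative-emotion score per respondent.
negative_word_share = [0.01, 0.02, 0.03, 0.05]
negative_emotion_score = [2.0, 3.0, 3.0, 5.0]
r = pearson_r(negative_word_share, negative_emotion_score)
print(f"r = {r:.2f}")
```

A positive r corresponds to the pattern reported in the text: respondents with higher negative-emotion scores use proportionally more negatively loaded words.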

Regarding purely morphological variables, 93 significant correlations (without statistical correction) appeared in both utterances in the same manner, six of which remained significant after Šidák’s p-adjustment. One of the interesting findings is, for instance, the higher usage of the future tense by persons who describe more negative emotions and, in contrast, a higher usage of comparatives by persons who express more positive emotions. Let us add that both these morphological categories appear in the respondents’ utterances to a degree significantly different from common Czech text; they might therefore present a potentially interesting psychological indicator. A higher degree of future tense usage was documented, for example, in research focused on observing respondents’ confusion (see D’Mello & Graesser, 2012). This relationship may also be supported by the reasoning that the worries of respondents scoring higher on the negative emotions scale are directed primarily towards the future (e.g. anticipatory anxiety, Butler & Mathews, 1987), and their utterances therefore refer to the future. As for comparatives, the use of the second degree (comparison or gradation of the meaning) may be an expression of a certain aloofness of the communicator, related to experiencing the situation in a more positive manner. Nevertheless, a more precise interpretation must be verified by further research. It is pertinent to add that, in contrast to anglophone research, no relationship of notable significance between the respondents’ characteristics and pronoun usage was detected. However, this is probably a specific characteristic of the Czech language, which does not require the use of a pronoun in a sentence (the pronoun can be expressed implicitly by the verb form). Additionally, the low frequency of these relationships is confirmed by previous research on Czech texts (see, e.g. Kučera, 2018a).
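The drop from 93 uncorrected to six corrected significant correlations reflects how strongly a multiple-comparison adjustment tightens the criterion. A minimal sketch of the single-step Šidák adjustment follows, in which each raw p-value p is replaced by 1 − (1 − p)^m for m tests. Whether the study used this single-step form or a step-down variant is not stated, so that choice is an assumption; the p-values and the 0.05 threshold below are invented for illustration.

```python
def sidak_adjust(p_values: list[float]) -> list[float]:
    """Return single-step Sidak-adjusted p-values: 1 - (1 - p) ** m,
    where m is the number of tests in the family."""
    m = len(p_values)
    for p in p_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    return [1.0 - (1.0 - p) ** m for p in p_values]


# Invented raw p-values, for illustration only (not the study's results).
raw_p = [0.0001, 0.004, 0.03]
adjusted_p = sidak_adjust(raw_p)

ALPHA = 0.05  # assumed significance threshold
still_significant = [p < ALPHA for p in adjusted_p]

for raw, adj, keep in zip(raw_p, adjusted_p, still_significant):
    print(f"raw={raw:.4f}  adjusted={adj:.4f}  significant={keep}")
```

With many tests in the family, only correlations with very small raw p-values survive the adjustment, which is consistent with the small number of corrected results reported above.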

Several key findings arise from the presented study. The situation related to COVID-19 modified the respondents’ (personal) vocabularies in connection with the description of their emotional experience. The respondents presumably adapted to the general discourse, and a distinctive preference for words with a negative connotation appeared. Emotionally negatively loaded words occurred more frequently in women’s utterances and positively correlated with experienced negative emotions, especially with fear. This relationship was also confirmed within the whole sample. The experience of fear also positively correlated with the morphological category of future tense, where the highest scores were detected in the younger age category of 18–25 years.

The benefits of this study in comparison with big-data analyses of online communication (e.g. Yu et al., 2020; Madria & Kabir, 2020) lie in the emphasis on the psychological level of communication and the usage of standardized psychological measures, and the more precise thematic specification of the analysed texts (the relationship to subjective interpretation, emotional experience, and the respondents’ coping with the situation). Owing to the data collection procedure, it was also possible to ensure a higher representation of older people, who are especially important with regard to the topic of the study. Another valuable aspect of the presented research lies in the use of the combination of lexical–semantic and morphological analyses of the texts, in contrast to, for example, stand-alone sentiment analysis (e.g. Liu, 2015).