Abstract

The emotional impact of the COVID-19 pandemic and ensuing social restrictions has been profound, with widespread negative effects on mental health. We made use of natural language processing and large-scale Twitter data to explore this in depth, identifying emotions in COVID-19 news content and user reactions to it, and how these evolved over the course of the pandemic. We focused on major UK news channels, constructing a dataset of COVID-related news tweets (tweets from news organisations) and user comments made in response to these, covering January 2020 to April 2021. Natural language processing was used to analyse topics and levels of anger, joy, optimism, and sadness. Overall, sadness was the most prevalent emotion in the news tweets, but this was seen to decline over the timeframe under study. In contrast, amongst user tweets, anger was the most prevalent emotion overall. Time epochs were defined according to the time course of the UK social restrictions, and emotion levels in both news and user tweets differed significantly between these epochs. Further, correlation analysis revealed significant positive correlations between the emotions in the news tweets and the emotions expressed amongst the user tweets made in response, across all channels studied. Results provide unique insight into how the dominant emotions present in UK news and user tweets evolved as the pandemic unfolded. Correspondence between news and user tweet emotional content highlights the potential emotional effect of online news on users and points to strategies to combat the negative mental health impact of the pandemic.

1. Introduction

The mental health impact of the coronavirus disease 2019 (COVID-19) pandemic has been profound: almost all sections of society have been adversely affected. The UK’s Centre for Mental Health predicts that an additional 20% of the UK’s population will require mental health support due to the pandemic [1]. There is consistent evidence that the pandemic has triggered widespread stress, anxiety, and depression [2]. The social restrictions imposed have been a significant contributor to this, increasing loneliness levels across much of the population, with consequences for mental illness symptomatology [3]. In response, many people came to rely even more on social media and online news platforms to communicate, interact, and access and share information, accelerating prepandemic trends: UK adults spent 15% more time online in 2020 compared to 2019 [4]. Evidence suggests that loneliness levels in young adults have been particularly impacted by the pandemic, with loneliness amongst young adults linked to a greater increase in both their social media use and social support seeking [5]. Findings such as these, as well as those from prepandemic studies, suggest that social media and online news platforms can impact users’ mental health and wellbeing [6]. Characterising the emotional content of media and user posts on social media allows deeper insights into this. Although these channels could potentially be leveraged to benefit the health and wellbeing of society as we begin to deal with the impact of the pandemic, the complex relationship between online media content and users’ emotional responses and wellbeing needs to be better understood.

With the usage of social media and online news platforms at record levels, it is crucial to develop this understanding. While potential benefits include an expansion of social networks [7], better access to information, and a supportive online community for those facing difficulties [8], significant risks and downsides have been identified. Studies conducted during the pandemic have tended to focus on social media use frequency, usually measured by self-report in cross-sectional designs; higher usage has been linked to higher stress levels [9], anxiety [10], and loneliness [11], alongside a poorer quality of life [11] and wellbeing [12]. However, it is difficult to draw inferences regarding causality from such data, and far fewer studies have investigated online news content and the responses to it on social networks. Analysing such content and responses directly allows more objective insights into the emotional impact on users, as well as general insights into the overall emotional status of the population at particular points in time, as the pandemic situation unfolded.

Previous work focusing on past disasters has identified important effects of exposure to online news content on mental health. In the Boston Marathon bombings of 2013, those who viewed graphic images of the attack online were subsequently at higher risk of showing posttraumatic stress disorder (PTSD) symptoms [13, 14]. Likewise, longitudinal work by Silver et al. [15] in a US sample () linked higher exposure to media coverage of the 9/11 attacks with increased stress and poor health across the 3 years after the event. However, the directionality is not clear-cut: Thompson et al. [16] found via surveys that those showing symptoms of trauma after the Boston bombings, and after the Pulse nightclub attacks of 2016, were more likely to view media coverage of the event, which in turn exacerbated and perpetuated their trauma. Nevertheless, studies such as these point to the direct impact that online news consumption can have in times of crisis. These studies relied on self-report measures of both media exposure and mental health, and the isolated events they examined involved graphic and traumatic imagery, so their findings might not generalise well to pandemic situations. Although not a single isolated incident like 9/11, the pandemic has had a wide and profound global impact and has produced a stream of fear-inducing news reports and content, particularly in the earlier stages of the virus’ emergence and spread. Consistent with these prior findings, emerging evidence points to a connection between online news consumption and mental health during the pandemic. Based on self-reported survey responses from a probability-based US sample (, American Trends Panel, March 2020), Stainback et al. [17] found that those who followed the news “very closely” reported nearly 25% higher distress levels compared to those who followed the news “fairly closely,” “not too closely,” or “not at all closely”; similar cross-sectional studies that focused specifically on online news consumption during the pandemic have likewise found correlations with negative affect, anxiety, and stress [18–20]. However, these studies, relying on subjective data, have limitations including social desirability and sampling bias, and larger sample sizes would be helpful to infer effects across populations.

An alternative approach is to consider the content being generated by online users and draw inferences from the emotions expressed within it. A powerful method for doing this is natural language processing (NLP) implemented using machine learning models: this approach involves analysing large datasets of user comments by automated means, to extract patterns and trends in emotions expressed. One form of NLP is sentiment analysis and emotion recognition; this measures the emotional polarity of the text, deriving measures of negative and positive emotions, and levels of specific emotions, expressed within it. Another useful element of NLP is topic modelling which can classify text into a set of preselected topics or themes; this allows inferences regarding relationships between topics and emotional responses evoked and user reactivity to certain topics. The capacity of these tools to process data on a very large scale allows for very well-powered and generalisable studies to be conducted, although the ability to consider individual-specific variables is limited since the characteristics of the users providing the content are largely unknown.

Several studies have used NLP tools to investigate user responses to the pandemic on Twitter [21–33]; heightened emotions and high levels of negatively coded sentiments were present in the first half of 2020, with the economy, lockdowns, and health as prominent topics. Other studies have tracked sentiment change over time, finding an increase in negativity expressed as the pandemic situation emerged, with increased stress, isolation, and anxiety being identified [27, 34–37], alongside a general increase in individual user engagement with Twitter as the pandemic situation unfolded [37]. Emotion recognition alongside topic classification can be used to identify specific emotions: Basile et al. [38] analysed emotional reactions to events (via Reddit) between January and June 2020; fear was found to predominate across the threads, and negative sentiment increased when national cases of COVID-19 increased, with sentiment also being connected to governmental responses. Likewise, Li et al. [39] found a correlation between stress levels, estimated from Twitter data in major US cities, and the number of local COVID cases. Low et al. [35] focused on Reddit users in support groups, including groups for suicide, schizophrenia, and depression, finding increased negative emotions, loneliness, and risky behaviour including suicidal ideation during the pandemic compared to before. With regard to the emotional content of online news during the pandemic, Aslam et al. [40] analysed nearly 150,000 news headlines from the first half of 2020, across 25 global news organisations, finding that 52% of headlines were negative in polarity. However, Krawczyk et al. [41] looked at 26 million news articles from the front pages of 172 major online news sources across 11 countries between January and October 2020; their sentiment analysis revealed widely heterogeneous emotional polarity, and only 16% of COVID-19 news articles were classified as highly negatively polarized. Nevertheless, there is evidence that COVID-related topics and themes show consistency across both online news content and Twitter posts by users. de Melo and Figueiredo [42] analysed nearly 20 thousand news media articles and over 1.5 million tweets from Brazil. Topics and themes showed high correspondence between news media and social media, pointing to a strong influence of online news content on user postings.

Evidently, the use of NLP techniques to understand the impact of the pandemic on user wellbeing and attitudes has been fruitful, allowing large-scale inferences regarding the emotional responses of populations, as the pandemic continues to unfold. Some studies have also revealed useful insight into emotional content within online media reporting. However, studies using emotion recognition techniques have tended to focus on either user content or news headlines, not the relationship between the two. Although some small, country-specific studies have been conducted (e.g., Tune et al. [43] examined users’ reactions to COVID-related media coverage in Bangladesh in the initial wave of the pandemic), there is limited literature around this, and work focusing on this relationship is needed in order to better understand the degree to which user sentiment is influenced by online news consumption. Therefore, this study analysed the emotions present in both online media content and the associated user responses to this content, to investigate how the emotions recognised in news tweets (tweets from news organisations) relate to the emotions users express in their direct comments on these posts. Studies continue to emerge covering the pandemic, and this study sets out to enhance the understanding of the pandemic’s impact, focusing on temporal patterns of emotions contained in online news and the corresponding emotional responses from users, between January 2020 and April 2021. We looked specifically at UK news sources, considering a range of outlets. This allowed us to conduct analyses within particular time epochs that mapped the time course of the COVID situation in the UK, to investigate how online emotions and content linked to the UK social restrictions and virus spread. Topic modelling is helpful as it points to the key topics within online content at a specific time point. For example, Kit et al. [44] also focused on the UK and used topic modelling to identify how key themes on Twitter (such as mask wearing) tracked the public reaction to current events and government policy. They also found evidence of a relationship between the volume of COVID-related tweets in specific parts of the UK and the local death rates, producing what they describe as a potential “early warning system” for tracking cases.

The studies discussed above demonstrate the ability of NLP tools to provide valuable insights into population mental health and to characterise the impact of the pandemic both over time and within specific user groups. Using large datasets of tweets from multiple major UK news outlets, and user responses to these, we set out to draw inferences regarding patterns of emotional content and prevailing topics in these and the correspondence between user and news tweets. The research aims under study were as follows:
(1) To characterise how the emotions present in UK media postings, and user content, varied across 2020/2021, considering key epochs reflecting the UK pandemic situation and the UK’s response to it, with an increase in negative affect (sadness and/or anger) predicted
(2) To investigate the degree of correlation between emotions (anger, joy, optimism, and sadness) expressed in news organisation tweets and the tweets made by users in response

Topic modelling was used to provide context for the emotions present, in terms of prevailing themes. Characterising the emotional status of users could inform support strategies for mental health, and studying the relationship between the emotions present in user and online media content will provide better awareness of the impact of news on societal wellbeing.

2. Materials and Methods

2.1. Data Collection

We use data from Twitter, an online microblogging and social networking site publicly available and accessible to those with access to the Internet. Registered Twitter users are able to post short messages called “tweets” and interact with other users’ tweets by comments, replies, and “likes.” Recent years have witnessed an increase in the use of Twitter data to understand social phenomena, and it has been shown to be an effective means of tracking the public’s reaction to both large- and small-scale events [45].

The data used in this project is in the format of written tweets collected using snscrape, an open-source web-scraping library for social networks. This library allows researchers to access public tweets via specific searches (e.g., via keywords and location). All tweets collected are shared publicly, and therefore, all users whose tweets are included in the dataset have consented to this data being accessible to the public via Twitter’s privacy policy (https://twitter.com/en/privacy). No demographic information of the individual user of each tweet was collected, only the content and location. Ethical approval was granted via the University of Surrey. The data is organised into news tweets (posts by news organisations on their Twitter accounts) and user tweets (Twitter users’ direct comments on these news tweets). Data used was a subset of that collected by a larger ongoing study (by Global Affects, https://globalaffects.org/); here, data from 19 UK-based news channels is used (Table 1). Tweets were sourced from the preselected news channels, together with the replies to these tweets, over the period from 1 January 2020 to 30 April 2021 using a set of keywords: wuhan, ncov (abbreviation of Novel Corona Virus), coronavirus, covid, sars-cov-2, pandemic, lockdown, quarantine, social distancing, wearing masks, vaccination, vaccine, outbreak, panic buying, remote working, homeschooling.

2.2. Data Cleaning

To clean the data, a set of exclusion criteria was used. The following tweets were excluded: tweets from private accounts, comments on comments (leaving only direct comments to the news tweet), comments already removed by the platform or by the user themselves, tweets that were not classified as English by Twitter, and tweets fewer than 3 words long. A total of 2,868,260 tweets resulted: 174,696 news tweets and 2,693,563 user tweets; this meant that, on average, there were 15 comments per news tweet collected. For the correlation analyses, we used data relating to BBC News (UK), The Guardian, and the Daily Mail Online only (1,114,466 user tweets total). The corpus of tweets was then put through two NLP processes: (1) topic modelling and (2) emotion recognition.
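The exclusion criteria above can be expressed as a single filter over user tweets. The following is a minimal sketch only: the `Tweet` record and its field names (`from_private_account`, `removed`, `reply_to_id`, `lang`) are hypothetical and do not correspond to any particular scraping library's output.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Tweet:
    # All field names here are illustrative assumptions.
    text: str
    lang: str                    # language code assigned by the platform
    from_private_account: bool
    removed: bool                # removed by the platform or by the user
    reply_to_id: Optional[str]   # id of the tweet being replied to, if any


def keep_user_tweet(tweet: Tweet, news_tweet_ids: Set[str]) -> bool:
    """Return True if a user tweet passes every exclusion criterion."""
    if tweet.from_private_account or tweet.removed:
        return False
    # Keep only direct comments on a news tweet (no comments on comments).
    if tweet.reply_to_id not in news_tweet_ids:
        return False
    if tweet.lang != "en":
        return False
    # Drop tweets fewer than 3 words long (whitespace-separated words).
    return len(tweet.text.split()) >= 3
```

Counting words by whitespace is itself an assumption; the text does not state how tweet length was measured.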

2.2.1. Topic Modelling

To identify topics being discussed in tweets we used latent Dirichlet allocation (LDA) [46], an unsupervised machine learning approach that finds latent topics in unlabelled text data. We used the MALLET (machine learning for language toolkit) [47] implementation of LDA. It generates a probabilistic distribution of words over latent topics, with each topic characterized by this distribution. LDA was chosen to identify topics because it is an unsupervised approach that does not require annotated data for training; also, it allows us to specify the number of latent topics to look for and produces results that are easily interpretable.

The input to an LDA model is a text corpus (tweets in this case). Since raw tweets contain many uninformative word tokens (e.g., stopwords, punctuation, special characters, and numbers), they need to be processed before being fed to the LDA model. This was done by first applying a regular expression to keep words and spaces while removing numbers and underscore symbols. Then, each news tweet was converted into a list of tokens (i.e., sequences of characters that form useful semantic units for processing) in lowercase lemma format (i.e., the base or dictionary form of a word), and all tokens with a length of one were removed. All tokens that start with the “@” symbol were also removed, to avoid references to other Twitter accounts. Words classified as geographical entities (i.e., countries, cities, and state names) using a named entity recognizer (NER) were removed. This ensured that the tweets were more reflective of different themes related to the COVID-19 pandemic and not biased towards grouping together tweets originating from similar geographical locations. Lastly, tokens that occurred in fewer than 0.005% or more than 33% of the documents were removed, so that rare and highly common tokens did not bias the topic modelling.
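Part of this preprocessing can be sketched with the standard library alone. The sketch below covers mention removal, the regular-expression clean-up, lowercasing, the length-one filter, and the document-frequency filter. Lemmatisation and NER-based removal of place names are deliberately left out because the text does not name the tools used, and the regular expressions themselves are assumptions (this one also drops non-ASCII letters, a simplification).

```python
import re
from collections import Counter

MENTION = re.compile(r"@\w+")              # tokens starting with "@"
NON_LETTER = re.compile(r"[^A-Za-z\s]")    # digits, underscores, punctuation


def tokenize(tweet):
    """Lowercase tokens with mentions, non-letters and 1-char tokens removed."""
    text = MENTION.sub(" ", tweet)         # must run before "@" is stripped
    text = NON_LETTER.sub(" ", text)
    return [tok for tok in text.lower().split() if len(tok) > 1]


def filter_by_document_frequency(docs, min_share=0.00005, max_share=0.33):
    """Drop tokens found in fewer than 0.005% or more than 33% of documents."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    keep = {tok for tok, df in doc_freq.items()
            if min_share <= df / n_docs <= max_share}
    return [[tok for tok in doc if tok in keep] for doc in docs]
```

Mentions are removed first because stripping punctuation beforehand would turn an account handle into an ordinary-looking word.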

LDA requires the number of latent topics to be specified beforehand. We used coherence, a proxy for topic quality, to identify the optimal number of latent topics in the given tweets. Coherence is based on the distributional hypothesis that words with similar meanings tend to cooccur within similar contexts [48]. We experimented by varying the number of topics between 2 and 40 in steps of 3 and computed the coherence for each. The optimal number of topics for the dataset was found to be 17, which yielded a coherence score of 0.51. These topics were manually merged into 8 main topics/themes based on the similarity of the topics with regard to COVID-19. Table 2 contains the topics identified by LDA for our dataset and the corresponding keywords.
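The search over topic counts can be illustrated as follows. This is a sketch of the selection step only: it assumes a coherence score has already been computed for each candidate model, and the helper names are hypothetical. In the usage lines, only the value 0.51 at 17 topics comes from the text; the neighbouring scores are made-up placeholders.

```python
def candidate_topic_counts(start=2, stop=40, step=3):
    """Topic counts from `start` up to `stop` inclusive, in steps of `step`."""
    return list(range(start, stop + 1, step))


def best_topic_count(coherence_by_k):
    """Return the topic count whose model has the highest coherence."""
    return max(coherence_by_k, key=coherence_by_k.get)


grid = candidate_topic_counts()   # 2, 5, 8, ..., 38; includes 17
# Placeholder scores for illustration; only 0.51 at k=17 is from the text.
example_scores = {14: 0.47, 17: 0.51, 20: 0.49}
chosen = best_topic_count(example_scores)
```

With a start of 2 and a step of 3, the grid stops at 38 rather than 40, and 17 is one of the 13 values evaluated.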

2.2.2. Emotion Recognition

The RoBERTa model [49] was selected as it is retrained on ~58 M tweets to capture Twitter language specifics and is fine-tuned for emotion recognition on the SemEval “Affect in Tweets” dataset [50]. The model was evaluated on TweetEval [51], an evaluation benchmark for Twitter-specific classification tasks, and achieved a state-of-the-art performance of 79.8% macroaveraged F1-score on the task of emotion recognition. A macroaveraged F1-score, also referred to as macro-F1-score, is the arithmetic mean of all the per-class F1-scores. This metric treats all classes equally regardless of their support values [52].
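To make the metric concrete, the following sketch computes a macroaveraged F1-score from gold and predicted labels using only the standard library. The function names are illustrative, and the convention of scoring a class with no true or predicted instances as 0 is an assumption.

```python
def f1_for_class(y_true, y_pred, label):
    """F1-score for one class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so every class counts equally."""
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)
```

Because each class contributes equally to the mean, a rare emotion class affects the score as much as a frequent one, which is the property the paragraph above describes.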

Emotion recognition codes the data into different emotions, providing a score to each emotion for each tweet collected. This model effectively distinguishes between four core emotions: joy, optimism, sadness, and anger. The model generates a softmax score of between 0 and 1 for each emotion, and this score was used to classify each tweet. Softmax is an exponential function that normalizes the output of a model to a probability distribution over predicted classes. We classified each tweet according to the emotion with the highest softmax score, as long as this was above 0.50. Softmax scores also help identify instances that do not fall within any of the four core emotions used: a tweet containing emotions other than the ones our model can recognise usually gets a more homogeneous distribution of confidence scores across the four emotions it can detect; when none of the emotion softmax scores exceeds 0.50, we label such tweets as being “undefined.”

As an example, the sentence “I hate the lockdown, it feels so isolating.” gets scored: anger .230, sadness .746, joy .010, and optimism .013. Since sadness has the highest softmax score and it is above 0.50, we label this tweet as “sad.”
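The labelling rule can be sketched in a few lines. The `softmax` helper shows how raw model outputs would be normalised, and `label_tweet` applies the 0.50 threshold to the resulting scores. Function names are illustrative, and the label is returned as the emotion name (e.g., "sadness") rather than an adjective.

```python
import math


def softmax(logits):
    """Normalise raw scores to a probability distribution."""
    peak = max(logits)                      # subtract max for stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_tweet(scores, threshold=0.50):
    """Return the top-scoring emotion, or "undefined" if none exceeds threshold."""
    emotion, score = max(scores.items(), key=lambda item: item[1])
    return emotion if score > threshold else "undefined"


# Scores from the worked example in the text.
example = {"anger": 0.230, "sadness": 0.746, "joy": 0.010, "optimism": 0.013}
example_label = label_tweet(example)        # "sadness"
```

A tweet whose scores are spread evenly, such as 0.3, 0.3, 0.2, and 0.2, has no score above 0.50 and is therefore labelled "undefined".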

2.3. Analysis

To provide an overall picture of the topic prevalence and emotion levels over the timeframe under study, the entire UK news tweet dataset was considered first (all 19 channels). This allowed us to depict the wider trends in emotion and topic. Next, we focused on three specific channels: BBC News (UK), The Guardian, and the Daily Mail, to formally investigate correlations between news tweets and user tweets. The three channels were selected as they represented three major mainstream UK media channels with large followings but with different audiences and political leanings: according to a YouGov poll in 2017, The Guardian was regarded as the most left-wing UK newspaper included in the survey, and the Daily Mail was the most right-wing [53]. The BBC is officially a politically neutral organisation. Conducting the analyses on each channel separately not only allowed us to infer channel-specific effects but also to check for replicability across channels.

Effect of time. Similar to the methodology of Su et al. [54], to create the time epochs for the analyses, time periods were chosen according to the UK Government data on the COVID “waves” of infections [55] and the corresponding imposition/easing of social distancing measures. The four time epochs were as follows:
(1) January 2020–22 March 2020: early phase, before UK social distancing imposed
(2) 23 March 2020–30 May 2020: “Wave One,” social distancing measures imposed
(3) 31 May 2020–6 September 2020: lower numbers of cases, restrictions eased
(4) 7 September 2020–April 2021: “Wave Two,” increased number of cases

Only the BBC News (UK) news tweet dataset and corresponding user tweet datasets were used for this analysis, as (a) it was the largest of all the media channel user datasets and (b) the BBC is politically neutral.
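Assigning each tweet to one of the four epochs is a simple date lookup, sketched below. The epoch boundaries follow the list above; the overall start and end dates (1 January 2020 and 30 April 2021) are taken from the data collection period, and the function name is illustrative.

```python
from datetime import date

# (first day of epoch, epoch number), in chronological order.
EPOCH_STARTS = [
    (date(2020, 1, 1), 1),    # early phase, before social distancing
    (date(2020, 3, 23), 2),   # "Wave One"
    (date(2020, 5, 31), 3),   # lower case numbers, restrictions eased
    (date(2020, 9, 7), 4),    # "Wave Two"
]
STUDY_END = date(2021, 4, 30)


def epoch_for(day):
    """Return the epoch number (1-4) for a date, or None if out of range."""
    if day < EPOCH_STARTS[0][0] or day > STUDY_END:
        return None
    current = None
    for start, number in EPOCH_STARTS:
        if day >= start:
            current = number
    return current
```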

3. Results

Each tweet represented one unit of data in the study; each tweet had multiple variables derived from it including date, tweet content, topic, prevalent emotion, and emotion scores for anger, joy, optimism, and sadness (ranging between 0 and 1, for each). News tweets were labelled with which organisation they were from, and user tweets were labelled according to which news tweet they were commenting on.

3.1. Emotions and Topics in the News Tweets (Full Dataset)

The full news tweet dataset consisted of 174,696 individual tweets from 19 media channels across the UK, see Table 1 for the number of tweets per channel. The average news tweet emotion scores were calculated across the period January 2020 to April 2021, see Figure 1.

Overall, 50.5% of news tweets were classified as sad, 14.4% as angry, 9.3% were optimistic, 5.9% were joyful, and 19.9% of news tweets were labelled undefined. The average anger score across the news tweet data set was (), for joy (), optimism (), and sadness (). Supplementary Table 1 presents average emotion scores for each of the news channels.

Table 3 shows the overall frequencies of each topic, and Figure 2 depicts how the distribution of topics in the news tweets changed over time. The most common topic was “Virus Spreading” at 20.5%, and the least was “Vaccines And Vaccination” at 4.4%, although the prevalence of “Vaccines And Vaccination” only became apparent from midway through 2020.

The average emotion scores for each topic were then examined, see Table 4 for the mean emotion scores, by topic. The topic with the highest anger score was “Preventative Measures” (, ). The topic with the highest joy and optimism score was “Vaccines And Vaccination” (, and , , respectively). The topic with the highest sadness score was “Cases and Deaths,” (, ).

3.2. Correlation between News and User Emotions

To investigate whether the news tweet emotion levels were correlated with the user tweet emotion levels, correlation analyses were carried out. These analyses were conducted using news and user tweets from three selected news channels: BBC News (UK), the Daily Mail Online, and The Guardian. User tweets were direct comments and replies to the news tweets. In total, there were 1,114,466 user tweets across the timeframe under study (1 January 2020 to 30 April 2021). The average emotion scores for every month were calculated for the news tweets of each chosen channel. The same process was repeated for the corresponding user comments, and then, a correlation matrix was created for each of the three channels (see Tables 5–7). Spearman’s Rho coefficients were used due to the data being nonparametric. Figures 3–8 show emotion levels over time, by channel, for news tweets (Figures 3, 5, and 7), and corresponding user tweets (Figures 4, 6, and 8).
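The procedure described above (monthly averaging followed by a rank correlation) can be sketched with the standard library. This is an illustration under assumptions rather than the analysis code used in the study: the month-key format is hypothetical, and Spearman's Rho is computed here as the Pearson correlation of average ranks, with no significance test and no handling of constant series. In practice a statistics package would typically be used.

```python
from collections import defaultdict
from statistics import mean


def monthly_means(rows):
    """rows: iterable of (month, score) pairs, e.g. ("2020-03", 0.41)."""
    buckets = defaultdict(list)
    for month, score in rows:
        buckets[month].append(score)
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's Rho as the Pearson correlation of the ranked values."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

For one channel and one emotion, `x` would hold the monthly mean scores of the news tweets and `y` the monthly mean scores of the user replies, aligned by month.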

These highlight significant correlations between user emotions and news emotions: significant positive correlations emerged for all emotions and across all three channels. Based on the test statistics (Tables 5–7), anger consistently emerged as having the strongest correlation between user and news tweets (compared to the other emotions studied) across all three datasets (for anger: BBC News: , ; The Guardian: , ; and the Daily Mail Online: , ).

3.3. Changes in News and User Emotions across Specific Time Epochs (BBC News (UK))

We then conducted statistical tests to investigate changes in emotion level by epoch. Four key epochs were distinguished, representing (1) Early phase (before UK social distancing imposed), (2) Wave One (spring 2020), (3) low cases/minimal restrictions (summer 2020), and (4) Wave Two (autumn 2020-spring 2021). One-way ANOVA was conducted to identify differences in news and user emotions between these four epochs, using the BBC News (UK) tweets (news tweets and corresponding user tweets) only (see Table 8 for the descriptive statistics per emotion, by epoch).

For the news tweet dataset, the assumption of homogeneity of variance was violated, as Levene’s test was significant for all emotion categories. Therefore, Welch’s F-statistics are reported. There was a main effect of epoch for all emotion categories: anger: , ; joy: , ; optimism: , ; and sadness: , . Post hoc tests (Games-Howell due to unequal variances) for each emotion showed changes between epochs, with significant increases in anger scores between epochs one and four (), epochs two and four (), and epochs three and four (). Joy levels increased between each epoch, with significant increases between time epochs one and three (), one and four (), two and four (), and epochs three and four (). Optimism levels ended significantly higher with significant increases in optimism between epochs one and four (), two and four (), and three and four (). Lastly, sadness levels decreased across all epochs. Sadness significantly decreased between epochs one and two (), one and three (), and one and four () and also between epochs two and four () and three and four ().

For the corresponding user tweet dataset, the assumption of homogeneity of variance was also violated, as Levene’s test was significant. Therefore, Welch’s F-statistics are reported. As in the news tweets, there was a main effect of epoch for all four mean emotion scores: anger: , ; joy: , ; optimism: , ; and sadness: , . Post hoc tests (Games-Howell) were then conducted. Similar to the news tweets, anger levels increased throughout, with user anger scores significantly increasing across the four epochs ( for all pairwise comparisons). Joy levels were seen to increase across the first 3 epochs and then decline: there was a significant increase from epoch one to epochs two and three (), followed by a significant decrease in joy levels in epoch four, compared to both epochs two and three (). On the other hand, optimism levels significantly decreased in epochs two, three, and four compared to epoch one ( for all), but there was a significant increase in optimism between epochs three and four (). However, as in the news tweets, sadness levels decreased across the epochs, with a significant decrease in sadness across the first 3 epochs (all p values < .001) and between epochs three and four ().
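For readers unfamiliar with Welch’s correction, the sketch below computes Welch’s F-statistic and its degrees of freedom for a one-way design with unequal variances, using the standard textbook formula and the standard library only. It is an illustration, not the software used for the analyses above. It does not compute a p value or the Games-Howell post hoc tests, and the function name is hypothetical.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA.

    groups: list of samples (one list of scores per epoch), each with
    at least two observations and non-zero variance.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n / s^2, so noisier groups count less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    grand = sum(w * m for w, m in zip(weights, means)) / total_w
    between = sum(w * (m - grand) ** 2
                  for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2
```

With four epochs, `groups` would hold the four lists of per-tweet scores for one emotion, and the first degrees of freedom value would be 3.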

4. Discussion

By leveraging NLP techniques, this study has provided a large-scale picture of online emotional responses by Twitter users, to COVID-related news. The aim was to characterise how prevailing emotions evolved between the virus first being publicised at the start of 2020 and April 2021. Topic modelling was used to provide context to the findings. We classified emotions within a large corpus of news tweets from leading UK media channels, the user emotions in response, and the correlations between the two. Finally, focusing just on the largest dataset (BBC News UK), we conducted analyses of effects between four specific time epochs, chosen based on the timeline of social distancing and “waves” of infections, between January 2020 and April 2021.

Consistent emotion patterns in the news and user tweets across the different news channels were observed. Sadness was the most prevalent emotion across channels and throughout the time period under study. Analysing correlations between news and user tweet emotions for three major UK news sources (BBC News (UK), the Daily Mail Online, and The Guardian) revealed significant correlations, across all emotions and for all 3 sources. Regarding the effect of time epoch, ANOVA results analysing the BBC News data revealed significant effects of time epoch on emotion levels for both news and user tweets. In the news tweets, anger levels increased throughout the time epochs, and sadness declined, although sadness remained the most prevalent emotion in the news tweets throughout, with just over 50% being classified as “sad”; this remained relatively consistent despite the significant change in scores between time epochs. Both joy and optimism levels increased across epochs. For user tweets, anger followed the same pattern, significantly increasing across time epochs. Likewise, sadness decreased throughout. Unlike the news tweets, joy levels were seen to increase across the first 3 epochs and then decline in epoch 4 (which represented Wave Two of infections). For optimism, levels were significantly lower in epochs two, three, and four compared to epoch 1 ( for all), but there was a significant increase in optimism between epochs three and four. This could potentially indicate that despite infection rates increasing and lockdown measures being reimposed (impacting levels of joy amongst the user tweets), there was nevertheless growing optimism that vaccine rollout could combat the virus spread, going forward.

Previous research has identified correlations between stress levels and local COVID infection rates [38, 39], with those authors directly attributing this to news reporting. Here, epochs represented key periods in terms of infections and lockdown restrictions. The increasing anger levels over time, observed in both the news and user tweets, are likely a reflection of wider sociopolitical factors including frustration with the lockdown measures and dissatisfaction with governmental responses. The topic with the highest anger score was “Preventative Measures,” supporting this. The strong correlations between news and user tweets in terms of emotional classification are consistent with news content influencing user emotions. However, it should be noted that the direction of causality cannot be inferred based on these data alone. Overall, our prediction that there would be an increase in negative affect was supported, with user anger scores significantly increasing across the four epochs (Table 8); this finding indicates a trend of worsening mental health in terms of anger and stress across 2020/2021, in this sample. However, only anger increased rather than both anger and sadness: encouragingly, sadness was seen to decrease across the epochs. Also, as noted above, there was some indication that despite levels of joy decreasing as Wave Two of infections took hold, optimism increased. This suggests resilience and a sense of hope that virus spread, and associated mortality, could be curtailed. Topic modelling supports this: the topic with the highest joy and optimism score was “Vaccines And Vaccination,” which was not prominent in the news until the end of 2020, since this is when the vaccine rollout began in the UK.

Emotions in the news channels selected were significantly positively correlated with the equivalent emotions in the user reactions. This supports the notion that emotional content in the news triggers and/or exacerbates corresponding emotional responses in users, suggesting that emotions projected by news channels online have an important impact on users.

This study provides unique insight by virtue of its methodology, whereby user comments are analysed directly in relation to the original news tweet to which they respond. The findings thereby add detail to previous literature showing that high levels of news consumption during the pandemic are linked to increased levels of stress, anxiety, and depression symptoms [56]. Emotions in the news, such as sadness and anger, are positively correlated with users’ emotional responses to it, and this could contribute to the higher stress and depression observed in heavy media consumers; this is particularly relevant during the time period under study, where we found that sadness was the most prominent emotion in the news tweets.

Work reviewing research into the phenomenon of “Digital Emotion Contagion” by Goldenberg and Gross [57] highlighted how users’ emotions are influenced by the emotions of content they interact with online, and that these emotions spread across users as a result. There is also evidence that this effect is actively encouraged by media organisations, in that they aim to produce content that generates intense user emotions [58]. In terms of tweet counts and sharing online during the pandemic, Nanath and Joy [59] found that two of the topics most likely to spread amongst users were mental health topics and content with negative emotional impact. They also found that these trends had changed as a result of the pandemic, highlighting the dynamic nature of the online world and the unique impact the pandemic has had on it. The current study adds further insights, showing how topics and emotional patterns evolved over the time period under study, with correspondence between news and user tweets highlighting the role of emotional contagion, and the impact of news reporting, on users’ emotions.

Figures 3–8 highlight the relationships between the emotion levels in the news and the emotion levels in the corresponding user tweets. Although the ANOVA tests revealed significant changes in emotion scores, with both the news and user tweets following a pattern of increased anger and decreased sadness, what remained consistent across all channels and throughout the whole period is that the news tweets showed high sadness levels (above all other emotions), whereas in the user tweets the prevailing emotion was anger. Thus, it seems that news outlets might exaggerate negativity in their headlines: historically, news organisations have reported more negative news in print and on television in order to attract readers [60], and evidence suggests that this practice also applies to online news reporting [58]. However, the data here suggest that, in the context of the pandemic, this has taken the form of a bias towards sadness rather than anger, while user tweets have instead trended towards anger in response. Theories of emotion highlight the differences between anger and sadness regarding their cognitive, physiological, and behavioural responses. Both emotions would be classed as negative in affect, but evidence points to them having different outcomes, with anger being linked to increased physiological activity and a drive to overcome obstacles, and sadness being linked to reduced activity and a focus on goal failure [61]. Anger has been linked to dominance and control, which also impacts how a person confronts risk, leading to more optimistic judgments [62, 63]. In a world of uncertainty, fear, risk, and a lack of control, it is understandable that people are perhaps more inclined to express and share content involving anger rather than sadness, as it is a more proactive response that counteracts feelings of helplessness. However, compared to sadness, anger causes a heightened physiological stress response, activating the autonomic nervous system and triggering increases in blood pressure and heart rate [64]. Chronic stress and anger have a range of negative consequences for health: Yaribeygi et al. [65] highlight effects across the cardiovascular, gastrointestinal, and endocrine systems, as well as on brain and immune system functioning. This indicates the negative effects that prevailing anger could be having on users. To manage this risk, strategies could include targeted messaging on media platforms to protect the wellbeing of users. For example, it could be helpful to remind users to take time away from the screen, or to highlight the importance of self-care and techniques for managing stress.

We identified consistently high anger levels amongst UK Twitter users over the time period studied, and this is in line with some previous work. Li et al. [39] analysed the affective trajectories of users based on Twitter and Weibo data from Jan to May 2020. Their results showed that in China, anger intensity first went up steeply at the initial stage of the outbreak and then had another sharp increase around February 8th: this peak was attributed to the death of Wenliang Li, the ophthalmologist who issued the warnings about the virus. The intensity of anger then gradually decreased, with no further spikes. In the USA, their data showed that anger levels were low initially but showed a rapid rise from the beginning of March 2020; analyses suggested that the topics of “Trump,” “lockdown,” and “government” were the key triggers for this. Basile et al. [38] studied emotional content in Reddit posts amongst users from 5 European countries, and New York City, for the period Jan to June 2020. They found that the UK showed high levels of anger compared to the other European countries, supporting our findings, and noted that expression of anger in both the UK and New York City was the highest amongst the countries included, with values increasing across the time period under study. They found a similar pattern for fear. Moreover, there was a decrease in trust over the time period, and levels of trust were the lowest amongst all the considered countries. Thus, they argue that because both the UK and NYC adopted quite a late and initially soft approach to managing the pandemic, the patterns of emotional responses from users in these locations reflect anger directed towards the late response of their governments to the crisis. As discussed above, emotion contagion amongst users [57], and the ways that online news perpetuates and exacerbates distress and worry [16], also likely contribute to the pattern of findings identified here and in other comparable studies.

This study has identified a lack of balance in the news tweets regarding positive and negative affect. The news channel content contained consistently high levels of sadness, followed by anger, across the entire 16-month period (Figure 1). Levels of joy and optimism remained very low. Of course, during a global health crisis, perhaps these emotion levels reflect the stark reality, and it would be unlikely that the news could have reported the pandemic-related events without sadness and anger prevailing. However, in the UK, the vaccine rollout began at the very end of 2020, which perhaps underlies the significant increases in user optimism seen in the last time epoch. In the news, joy and optimism levels did increase across the epochs; nevertheless, sadness remained far more prominent. Based on the correlation analyses, we provide further evidence around how negative emotional news channel content links to emotions expressed by users in response. With increasing numbers of people using platforms like Twitter to join the online community and to access the news (a trend exacerbated by pandemic-related social restrictions), awareness of how media organisations impact user emotions and mental health is important; follow-up longitudinal work is necessary to ascertain causality behind the correlations identified here and study longer-term effects. Further work is also needed to generalise findings to other countries. Here, user tweets were assumed to reflect UK sentiment due to being comments on UK news; however, we did not collect personal demographic information on the users, so it is likely that a proportion of the user comments were from individuals outside of the UK. Also, while our model is effective at classifying core emotions, follow-up work could take this further to focus on mental health problems associated with sadness and anger. This was seen in the work by Li et al. [39] where the lexicon from the PHQ-9 was used to identify emotional content indicating possible clinical depression. Earlier work on the news and trauma, as outlined in the introduction, found that victims of trauma are more likely to consume the news, and this then perpetuates their trauma [16]. Thus, it should be noted that the current findings around the user comments might have been biased towards individuals more negatively impacted by the pandemic, which then resulted in more engagement with online news and stronger negative sentiment expressed. On the other hand, a strength of this study is that we utilised a large corpus of user comments and considered a range of major UK news outlets, enhancing generalisability. In the UK, 54% of the online population uses Twitter [4]: there are approximately 17 million Twitter users in the UK, representing around a quarter of the UK population. However, Mellon and Prosser [66] found that Twitter users are not representative of the general UK population, in terms of age, political opinions, and education levels; also, it should be noted that those who comment on the posts of news organisations represent a specific subset of Twitter users. Therefore, it is important to recognise that despite analysing over one million user comments, these might not represent the emotions of the general UK population, but rather a specific subset.

During a time of crisis, news organisations are central to how we react, behave, feel, and understand the ever-changing situation around us; they are an important part of how society experiences, and responds to, a crisis. Therefore, it is important to generate understanding and awareness of how emotive language impacts the readership and spreads through emotion contagion. Protecting the mental health of users, particularly during a time of crisis in which the prevalence of mental health issues has increased across all age ranges and populations [67], is a priority. This study adds to the literature and is novel in that we show correlations between emotions in news content and the emotions in the direct comments made by users. It has provided strong evidence of a relationship between the two across various major UK news channels, pointing to an impact of news reporting on how users feel in response. Strategies that consider the content and consumption of online news by users could therefore be useful for protecting or improving the mental health of populations as we move into the later phases of the pandemic. The study furthers understanding of the timelines of, and interrelationships between, news and user emotional content during 2020/2021, which is essential to encourage better practice by users, news organisations, tech companies, and regulatory bodies, and to mitigate the potentially harmful consequences of heavy online news consumption for societal wellbeing.

Data Availability

Data sharing is not applicable to this article as no new data were created or analysed in this study.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This research was supported by grants from the University Global Partnership Network (UGPN; https://ugpn.org/) and the Regional Collaborations Programme COVID-19 Digital Grants (Australian Academy of Science), awarded to DM and SE.

Supplementary Materials

Supplementary Table 1: average emotion scores by news channel.