Effect of daily new cases of COVID-19 on public sentiment and concern: Deep learning-based sentiment classification and semantic network analysis

Che, ShaoPeng; Wang, Xiaoke; Zhang, Shunan; Kim, Jang Hyun

doi:10.1007/s10389-023-01833-4

Original Article
Published: 31 January 2023

Volume 32, pages 509–528 (2024)

Journal of Public Health

ShaoPeng Che¹,
Xiaoke Wang²,
Shunan Zhang³ &
…
Jang Hyun Kim ORCID: orcid.org/0000-0001-7750-2664³

A Correction to this article was published on 13 February 2023

This article has been updated

Abstract

Aim

This study explored how daily new case videos posted by public health agencies (PHAs) on TikTok influenced public sentiment and concerns in the context of COVID-19 normalization. The analysis covered five stages of the 2022 Shanghai lockdown, defined according to the Crisis and Emergency Risk Communication model.

Subject and Methods

First, we divided the duration of the 2022 Shanghai lockdown into five stages. Second, we crawled all the user comments on videos with the theme of daily new cases posted by Healthy China on TikTok during these stages. Third, we built a sentiment classifier on the pre-trained model ERNIE to classify the sentiment of user comments. Finally, we performed semantic network analyses based on the sentiment classification results.

Results

First, the high cost of fighting the epidemic during the 2022 Shanghai lockdown was why ordinary people were reluctant to cooperate with the anti-epidemic policy in the pre-crisis stage. Second, Shanghai’s unilateral revision of the definition of asymptomatic patients led to an escalation of risk levels and control conditions in other regions, ultimately affecting the lives and work of ordinary people in those regions during the initial event stage. Third, the public reported specific details that affected their lives due to the long-term resistance to the epidemic in the maintenance stage. Fourth, the public became bored with videos regarding daily new cases in the resolution stage. Finally, the main reason for the negative public sentiment was that the local government did not follow the central government’s anti-epidemic policy.

Conclusion

Our results suggest that the methodology used in this study is feasible. Furthermore, our findings may help the Chinese government or PHAs correct behaviors in the anti-epidemic process that displease the public.

Introduction

Over the past two years, the release of daily new cases by governments or public health agencies (PHAs) on television and social media has been recognized as an effective communication strategy during the COVID-19 pandemic (Rodríguez-Rey et al. 2020). However, few researchers have focused on how this communication strategy affects public sentiment and concerns, especially within the context of the increasing normalization of COVID-19 (Ma et al. 2022).

In China, from 2022 onward, the communication strategy of daily new cases seemed to function like a “weather forecast.” People may consult the daily new cases to judge whether the outbreak will affect their future travel plans, because the public’s right to travel is heavily influenced by local governments’ classification of cities as low, medium, or high risk based on the number of new cases per day in the area. For example, when someone’s city is classified as a medium- or high-risk area because of an increase in daily new cases, and control conditions tighten accordingly, they can easily feel pessimistic and distressed (Gao et al. 2022).

As of September 2022, only Harman et al. (2021) had explored the impact of public perceptions of daily new cases, highlighting that these perceptions change over time. For example, as the number of daily new cases decreases, individuals’ approval of public health measures reduces. However, there is still a lack of measurable instances to help us understand how public sentiment changes amidst daily new cases, and how the public focus varies over time, especially in the context of COVID-19-based normality.

The Shanghai lockdown in the first half of 2022 provided a suitable case for our research. The 2022 COVID-19 outbreak in Shanghai was the most severe since the Wuhan city closure, and the Shanghai government responded with a city closure (Cheshmehzangi et al. 2022). The outbreak not only had a significant negative economic and social impact on Shanghai, but the delay in city closure also led to ripple effects in other provinces and cities, including Beijing.

Considering these issues, this study analyzed the Shanghai lockdown to explore how daily new cases affected public sentiment and concerns.

TikTok

As of August 2022, most existing studies have focused on traditional text-based social media, like Twitter, whereas few have focused on emerging video-based social media, like TikTok (Chen et al. 2021). The United Nations Educational, Scientific and Cultural Organization reports that 750 million adults worldwide are illiterate. Text-based social media is not friendly to illiterate people, but the emergence of video-based social media has solved this problem to some extent. Although individuals may not be literate, they are still able to access information by watching videos and listening to sounds (Fu et al. 2022). Therefore, when illiterate or semi-literate people use the Internet to acquire knowledge, video media are given higher priority than text media.

Furthermore, the most critical aspect of risk and crisis communication is to reach a broad audience. In 2021, during the COVID-19 pandemic, TikTok became the world’s most visited Internet domain, surpassing Google. TikTok has thus become an indispensable information source and communication channel between governments, organizations, and the public. Accordingly, this study proposed the following research question:

RQ1: Does Healthy China use daily new cases as a communication strategy on TikTok?

Crisis and Emergency Risk Communication (CERC)

Compared with a plain time-series analysis, CERC’s five-stage approach offers a more theoretically grounded way to analyze the public impact of daily new cases. CERC divides a crisis into five stages: pre-crisis, initial event, maintenance, resolution, and evaluation. These represent the transition from risk to outbreak, from cleanup to recovery, and finally to evaluation (Che et al. 2022b).

In the pre-crisis stage, the purpose of official communication is mainly to help build public awareness of the risks.

In the initial event stage, a connection is established with the public to help reduce their inner uncertainty.

In the maintenance stage, officials continue to reduce the uncertainty held by the public.

In the resolution stage, officials provide the public with guidance and information related to recovery and resumption of work.

In the evaluation stage, the problems revealed in the first four stages are evaluated and summarized to improve society’s ability to face the next crisis.

Based on the CERC model, we divided the cycle of Shanghai lockdown into five stages: pre-crisis, initial event, maintenance, resolution, and evaluation. We then explored the impact of daily new cases on the public in these stages. Thus, our second research question was:

RQ2: Is there any variation in the daily new cases posted by Healthy China on TikTok at different stages?

Pre-training model ERNIE

The pre-training model BERT represents the most cutting-edge and popular method for user sentiment classification (Gao et al. 2019), but its results on Chinese tasks have never been satisfactory. However, ERNIE (Large-Scale Knowledge Enhanced Pre-Training for Language Understanding and Generation), a knowledge-enhanced multi-paradigm unified pre-training model released by Baidu, outperformed BERT in most Chinese tasks due to its advantage regarding Chinese corpus richness (Zhang and Shang 2022).

ERNIE has designed a new continuous multi-paradigm unified pre-training framework with the following advantages (Li et al. 2022):

First, in terms of knowledge fusion and mask language modeling tasks (Sun et al. 2022), ERNIE utilizes a phrase- and entity-level mask approach to fuse external knowledge.

Second, regarding the rich Chinese corpus, ERNIE has added Chinese corpora, such as the Baidu encyclopedia and Baidu news, to enhance its effectiveness on Chinese tasks.

Finally, regarding dialog embedding (Hsieh and Zeng 2022), ERNIE’s training corpus introduces knowledge from multiple data sources, such as news and information and web forum conversation data. This kind of learning for conversation data is an essential method of semantic representation.

Therefore, this study used ERNIE to classify sentiments based on the crisis stages. The third research question was as follows:

RQ3: How do the public sentiment tendencies differ across stages?

Semantic Network Analysis (SNA)

Topic modeling is one of the most mainstream methods for exploring textual themes (Koo 2022), but there are two shortcomings in this type of modeling. First, it does not work well for short texts (Valero et al. 2022). Second, it cannot clarify the association between words (Graham and Lu 2022).

SNA overcomes both shortcomings. After representative terms are screened statistically, the relationships between terms are quantified from their pairwise co-occurrence; the structural relationships between terms are then revealed visually. Based on such a semantic network diagram, the hierarchical relationships and affinity of terms can be analyzed intuitively (Che et al. 2022a).

This study used SNA for the content analysis of the text after sentiment classification:

RQ4: How does the public’s concern differ under positive and negative sentiments at different stages?

Methods

Crisis stage

We first identified all the critical events and time points in the 2022 Shanghai city closure, and then used these events as the basis for dividing the stages according to the CERC model. The resulting stage division is shown in Fig. 1. Overall, February 24, 2022, was the starting point of the Shanghai city closure event, and June 30, 2022, was the endpoint.

Data acquisition

We used Healthy China, the account of the National Health Commission of the People’s Republic of China (NHCC) on the video-based social media platform TikTok, as the data source. We then crawled all the videos with the theme of “daily new cases” posted on this account from February 24 to June 30, 2022. In total, we obtained 79 videos, together with their posting times and 1220 user comments. Figure 2 shows the distribution of videos and comments at different stages.

Manual annotation

As shown in Table 1, we first extracted 20% of the data from each of the five stages (244 comments in total) for manual annotation; the annotated comments were used to train the ERNIE-based model.
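The stage-wise 20% extraction can be sketched as follows. This is a minimal illustration, not the authors' script: the `(stage, text)` pair layout, the rounding rule, and the fixed random seed are all assumptions introduced for the example.

```python
import random
from collections import defaultdict


def stratified_sample(comments, fraction=0.2, seed=0):
    """Draw the same fraction of comments from every stage.

    `comments` is a list of (stage, text) pairs. The pair layout and the
    use of round() for the per-stage sample size are assumptions.
    """
    rng = random.Random(seed)

    # Group the comment texts by the stage they belong to.
    by_stage = defaultdict(list)
    for stage, text in comments:
        by_stage[stage].append(text)

    # Sample without replacement inside each stage.
    sample = {}
    for stage, texts in by_stage.items():
        k = round(len(texts) * fraction)
        sample[stage] = rng.sample(texts, k)
    return sample
```

Sampling within each stage, rather than from the pooled comments, keeps every stage represented in the annotated set even when some stages have few comments.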

Table 1 Distribution of stratified sampled data at different stages

Table 2 shows the parameters of the manual annotation process. We classified the emotional tendency of the data into positive and negative, with positive marked as 1 and negative marked as −1. To ensure the reliability of the data annotation, we recruited a Ph.D. student in data science and a Ph.D. student in medicine to complete the annotation. First, we randomly selected 30% of the 244 data points (73) for test annotation; we then calculated the Holsti coefficient to determine the quality of the data annotation. Finally, we repeated the test annotation until the Holsti coefficient reached 0.9 or higher, and then completed the remaining 70% of the annotation.
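For reference, Holsti's coefficient is commonly computed as 2M / (N1 + N2), where M is the number of coding decisions on which the two coders agree and N1 and N2 are the numbers of decisions each coder made. A minimal sketch, assuming both annotators label the same items in the same order (the function name and list layout are illustrative):

```python
def holsti_coefficient(labels_a, labels_b):
    """Holsti's intercoder reliability: 2M / (N1 + N2).

    labels_a and labels_b hold the two annotators' labels (1 or -1)
    for the same comments, in the same order.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    if not labels_a:
        raise ValueError("at least one labeled item is required")

    # M: number of items on which the two annotators agree.
    agreements = sum(a == b for a, b in zip(labels_a, labels_b))
    return 2 * agreements / (len(labels_a) + len(labels_b))
```

When both coders label every item, the coefficient reduces to simple percent agreement, so a threshold of 0.9 means the annotators agreed on at least 90% of the test items.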

Table 2 Parameters of the manual annotation process

Table 3 shows the final annotated data we obtained.

Table 3 Sentiment distribution of the manually annotated data

Training model

We used the 110 positive and 134 negative annotated comments as the dataset for model training. We then used the trained model to classify the sentiment of the remaining 80% of the data. Table 4 shows the parameters of the model.

Table 4 Parameters of the ERNIE model

Figure 3 and Table 5 show the distribution of the 1220 user comments at different stages.

Table 5 Distribution of 1220 user comments at different stages

Semantic network analysis

In the first step of the semantic network analysis (SNA), we preprocessed the data in Python. First, we loaded the Baidu stopword list to help filter out meaningless expressions, symbols, and terms in the comments; we then used Jieba to segment the comment text and inspected the top 50 words to check for incorrect segmentation results and synonyms. For incorrect segmentations, we manually added new words to improve the segmentation, and we merged the synonyms. After several iterations, we created the co-occurrence matrix and the corresponding network from the 50 most frequently used words.
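The co-occurrence step can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' code: it assumes the comments are already segmented and stopword-filtered (for example, token lists produced with Jieba), and it counts a word pair once per comment, which is one common convention that the text does not specify.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tokenized_comments, top_n=50):
    """Count pairwise co-occurrences among the top_n most frequent words.

    tokenized_comments: list of token lists, one list per comment
    (assumed to be segmented and stopword-filtered already).
    Returns (vocabulary, pair_counts), where pair_counts maps an
    alphabetically sorted word pair to the number of comments
    containing both words.
    """
    # Overall word frequency across all comments.
    freq = Counter(w for tokens in tokenized_comments for w in tokens)
    vocabulary = [w for w, _ in freq.most_common(top_n)]
    keep = set(vocabulary)

    pair_counts = Counter()
    for tokens in tokenized_comments:
        # Unique retained words in this comment, sorted so each pair
        # always appears in the same order.
        present = sorted(set(tokens) & keep)
        pair_counts.update(combinations(present, 2))
    return vocabulary, pair_counts
```

The resulting pair counts can be written out as a weighted edge list (source, target, weight), which is a format Gephi can import.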

We used Gephi for community discovery and visual analysis in the second step. First, our network layout was based on the ForceAtlas2 algorithm. The node size depended on the degree; the larger the degree, the larger the nodes. The thickness of edges was based on the number of word co-occurrences; the higher the number, the thicker the edges. In addition, the number of co-occurrences between words also affected the distance between nodes; the more co-occurrences, the closer the nodes were to each other.

In the third step, we used Gephi’s built-in Louvain clustering algorithm to discover communities within the network. This algorithm is efficient and effective, and it is able to discover hierarchical community structures. Modularity measures how strongly a network graph divides into communities; generally, modularity > 0.44 indicates that the network graph has reached a certain level of modularity.
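To make the modularity score concrete, the sketch below computes Newman's modularity Q for a given partition of an undirected weighted graph. It only evaluates a partition; it does not perform the Louvain search itself, and it ignores any resolution setting Gephi may apply. The edge-list layout and function name are assumptions for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected graph.

    edges: iterable of (u, v, weight) tuples, each edge listed once,
           with no self-loops.
    community_of: dict mapping every node to its community label.
    Q = sum over communities c of  L_c / m - (d_c / (2m))^2,
    where m is the total edge weight, L_c the weight of edges inside c,
    and d_c the summed weighted degree of the nodes in c.
    """
    m = 0.0
    internal = defaultdict(float)    # L_c
    degree_sum = defaultdict(float)  # d_c
    for u, v, w in edges:
        m += w
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += w
        degree_sum[cv] += w
        if cu == cv:
            internal[cu] += w
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge (c-d).
toy_edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
             ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
             ("c", "d", 1)]
toy_partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
toy_q = modularity(toy_edges, toy_partition)
```

Q is higher when more edge weight falls inside communities than would be expected from the node degrees alone; placing all nodes in a single community gives Q = 0.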

Results

Pre-crisis

SNA based on positive comments

Table 6 shows the top 38 words, which were calculated based on degree and eigencentrality, and Fig. 4 shows a semantic network of these words.

Table 6 Top 38 terms based on degree and eigencentrality

As shown in Table 7, Fig. 4 consists of 38 nodes and 85 edges. The average degree was 4.474, average weighted degree was 5, density was 0.121, and modularity was 0.5.
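The average degree and density reported for each network follow directly from the node and edge counts of an undirected graph without self-loops: average degree is 2E/N and density is 2E/(N(N−1)). A short check, using the counts for Fig. 4 (the function name is illustrative):

```python
def network_stats(n_nodes, n_edges):
    """Average degree and density of an undirected simple graph."""
    avg_degree = 2 * n_edges / n_nodes
    density = 2 * n_edges / (n_nodes * (n_nodes - 1))
    return round(avg_degree, 3), round(density, 3)


print(network_stats(38, 85))  # (4.474, 0.121)
```

The average weighted degree cannot be recovered this way, because it depends on the co-occurrence weights of the individual edges.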

Table 7 Parameter calculation in Gephi

We identified a total of six themes. Theme 1 (blue, 28.95%) included an appeal to the people of Shanghai to not cause trouble to the country in the fight against the epidemic; its representative words were country, anti-epidemic, love, staff, nucleic acid test, free, vaccine, nurse, salary, frontline staff, and cheap. For example, “I hope that when you come back from abroad, I will take the initiative to report the quarantine and not make trouble for the country and the medical staff and that everyone is responsible for protecting their country.”

Theme 2 (green, 21.05%) involved praying for the end of the worldwide epidemic as soon as possible; it contains words such as epidemic, national people, prosperity and national peace, medical staff, world, garlic, get out, and responsibility. For example, “I hope the epidemic will end soon, and our country will be safe and prosperous.”

Theme 3 (purple, 15.79%) referred to an expression of confidence in the fight against the epidemic. The words comprised ruthless, affectionate, prevention and control, spirit, place, and confidence. For example, “We have the confidence to fight the virus by staying home as much as possible and not going out of the house.”

Theme 4 (red, 13.16%) expressed an active anti-epidemic stance. Its representative words were people, countrywide, universal, affect, and children. For example, “Come on, everyone, it is about our future generations.”

Theme 5 (orange, 10.53%) involved morale boosting; its representative words were sad, contribute, revolution, and comrade. For example, “The war against the epidemic has not yet been won, and we still need to work hard.”

Theme 6 (black, 10.53%) included a call to pay attention to personal protection, and its representative words were mask, healthy, wash hands frequently, and ventilation. For example, “Slogan: Wearing masks, washing hands, being ventilated, and avoiding gatherings.”
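The theme percentages appear to be each community's share of the network's nodes. This reading is an inference from the word lists above rather than something the text states: the six themes list 11, 8, 6, 5, 4, and 4 words, which sum to the 38 nodes of the network. A minimal sketch of that calculation:

```python
def theme_shares(theme_sizes):
    """Percentage of network nodes that fall into each theme.

    theme_sizes maps a theme label to its number of nodes (words).
    """
    total = sum(theme_sizes.values())
    return {theme: round(100 * n / total, 2)
            for theme, n in theme_sizes.items()}


# Word counts taken from the six pre-crisis positive themes above.
pre_crisis_positive = {1: 11, 2: 8, 3: 6, 4: 5, 5: 4, 6: 4}
shares = theme_shares(pre_crisis_positive)
```

Under this reading, the computed shares match the percentages given for the six themes (28.95, 21.05, 15.79, 13.16, 10.53, and 10.53).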

SNA based on negative comments

Table 8 shows the top 22 words extracted, based on degree and eigencentrality. Figure 5 presents these 22 words in the network.

Table 8 Top 22 terms based on degree and eigencentrality

As shown in Table 9, Fig. 5 comprises 22 nodes and 36 edges. The average degree was 3.273, average weighted degree was 3.545, density was 0.156, and modularity was 0.471.

Table 9 Parameter calculation in Gephi

In this network, we identified three themes. Theme 1 (green, 45.45%) was related to the cost of fighting the epidemic being too high for ordinary people; this theme is represented by the words epidemic, medical staff, sad, examination, end, anti-epidemic, car loan, salary, assistance, and mask. For example, “I still have a mortgage and a car loan to pay, and I cannot survive if I am quarantined at home and cannot go to work.”

Theme 2 (blue, 27.27%) included complaints about not being able to get the booster vaccine off-site, represented by the words nonlocal, nucleic acid test, positive, ID card, staff, and case. For example, “Why can’t I get a booster shot in other provinces even if I show my ID card?”

Theme 3 (purple, 27.27%) referred to the lack of care for older adults, with words such as time, vaccine, cell phone, TV, research, and country. For example, “It is strongly recommended to install TVs and other devices in the vaccine waiting area to let the elderly pass the time, rather than let them sit in the corner and watch the young people play on their mobile phones.”

Initial event

SNA based on positive comments

Table 10 shows the top 44 words calculated based on degree and eigencentrality, and Fig. 6 shows these 44 words in the network.

Table 10 Top 44 terms based on degree and eigencentrality

As shown in Table 11, Fig. 6 comprises 44 nodes and 185 edges. The average degree was 8.409, average weighted degree was 8.409, density was 0.196, and modularity was 0.475.

Table 11 Parameter calculation in Gephi

We identified three themes. Theme 1 (purple, 38.64%) included cheering, represented by the words epidemic, rose, warrior, unity, defeat, compatriot, sad, neighborhood, nucleic acid test, free, grassroot staff, medical staff, salary, smile, disinfect, spray, and love. For example, “In the face of the epidemic, we are all soldiers. However, I believe that if we all work together to prevent and control the epidemic scientifically, we will be able to overcome it!”

Theme 2 (green, 36.36%) was a call for personal protection, including words such as science, mask, wash hands frequently, ventilation, crowd, crowded, air, circulation, place, smell, taste, stuffy nose, runny nose, sore throat, conjunctivitis, and myalgia. For example, “To adhere to the science of wearing a mask, not shaking hands, washing hands frequently, ventilating, and trying not to go to crowded places where the air is not circulating.”

Theme 3 (orange, 25%) consisted of words related to the fight against the epidemic being a protracted war, represented by anti-epidemic, countrywide, combat, advice, mutation, virus strain, vaccine, battle, combat power, physical strength, and frontline staff. For example, “The fight against the epidemic is a protracted battle! Therefore, we must fight together with the frontline workers!”

SNA based on negative comments

Table 12 shows the top 43 words calculated based on degree and eigencentrality, and Fig. 7 consists of these 43 words.

Table 12 Top 43 terms based on degree and eigencentrality

As shown in Table 13, Fig. 7 comprises 43 nodes and 53 edges. The average degree was 2.465, average weighted degree was 2.558, density was 0.059, and modularity was 0.793.

Table 13 Parameter calculation in Gephi

We identified 12 themes. Theme 1 (blue, 23.26%) questioned the criteria for assessing the epidemic in Shanghai, represented by the words epidemic, rose, warrior, unity, defeat, compatriot, sad, neighborhood, nucleic acid test, free, grassroot staff, medical staff, salary, smile, disinfect, spray, and love. For example, “There are so many new cases in Shanghai, but there is not even a high-risk area. So what are the evaluation criteria?”

Theme 2 (orange, 11.63%) involved a travel code abnormality, represented by words including nucleic acid test, staff, negative, yellow code, and community. For example, “I have had three negative nucleic acid tests in eight days, but my travel code is still yellow.”

Theme 3 (light green, 9.3%) involved a request for statistics on the number of spillover cases in Shanghai, represented by the words epidemic, data, world, and case. For example, “Can you count the number of spillover cases in Shanghai since the outbreak?”

Theme 4 (light gray, 9.3%) referred to the problem of ventilation in waiting rooms, represented by the words waiting room, window, screw, and steel nail. For example, “The windows in the waiting room do not open, which may lead to indoor infections.”

Theme 5 (pink, 6.98%) included the official non-reporting of Shanghai’s case data, which was represented by universe, center, and scope. For example, “Why did the authorities not report new cases in Shanghai?”

Theme 6 (light blue, 6.98%) involved geographical discrimination, represented by the words place, children, and rhinitis. For example, “We went from Shenzhen to a nearby city to see rhinitis for our children. Unfortunately, the hospital here refused to see us because Shenzhen is a high-risk area.”

Theme 7 (light gray, 6.98%) referred to travel code forgery, whose representative words were ID card, phone number, and authenticity. For example, “One ID card should correspond to one phone number to effectively improve the authenticity of the travel code.”

Theme 8 (light red, 6.98%) questioned the quarantine policy, with words such as asterisk, go home, and self-financed. For example, “I have a green travel code, why do I need to be quarantined when I go home?”

Theme 9 (light green, 4.65%) involved a questioning of Shanghai’s diagnostic criteria, represented by the terms COVID-19 and distinguish. For example, “How does Shanghai distinguish asymptomatic cases from confirmed cases?”

Theme 10 (purple, 4.65%) implied the hope that there were no after-effects, represented by the words long covid and symptom. For example, “Hoping there is no long covid or that the symptoms are mild.”

Theme 11 (dark green, 4.65%) referred to the quarantine policy being unreasonable, with words such as regulation and reasonable. For example, “Is it reasonable to require seven days of centralized quarantine, even from a low-risk area to a low-risk area?”

Theme 12 (dark gray, 4.65%) is related to the inability to secure supplies, which was represented by the words supplies and everyone. For example, “Supplies are scarce, and some people have nothing to eat.”

Maintenance stage

SNA based on positive comments

Table 14 shows the top 20 words calculated based on degree and eigencentrality, and Fig. 8 shows these 20 words in the network.

Table 14 Top 20 terms based on degree and eigencentrality

As shown in Table 15, the network in Fig. 8 consists of 20 nodes and 42 edges. The average degree was 4.2, average weighted degree was 5.2, density was 0.221, and modularity was 0.221.

Table 15 Parameter calculation in Gephi

We identified five themes in this stage. Theme 1 (purple, 45%) was cheering, represented by the words epidemic, volunteer, ruthless, military uniform, responsibility, stay true, medical staff, confidence, and frontline staff. For example, “Please be kind to all the medical personnel and volunteers working hard for us.”

Theme 2 (light green, 20%) involved gratitude toward health care workers and volunteers, expressed by words such as life, unity, family, and COVID-19. For example, “COVID-19 is not terrible, and we are confident to end the epidemic and return to normal life.”

Theme 3 (blue, 15%) referred to insisting on scientific epidemic prevention, which was represented by the words anti-epidemic, universal, and science. For example, “We should insist on the scientific prevention of epidemics.”

Theme 4 (orange, 10%) implied a prayer for an end to the epidemic and was represented by the words people and countrywide. For example, “The people of the whole country work together.”

Theme 5 (dark blue, 10%) implied a prayer for an end to the epidemic and was represented by the words fist and end. For example, “May COVID-19 end soon.”

SNA based on negative comments

Table 16 shows the top 30 words calculated based on degree and eigencentrality, and Fig. 9 consists of these 30 words.

Table 16 Top 30 terms based on degree and eigencentrality

As shown in Table 17, Fig. 9 consists of 30 nodes and 89 edges. The average degree was 5.933, average weighted degree was 5.933, density was 0.205, and modularity was 0.723.

Table 17 Parameter calculation in Gephi

We identified a total of four themes. Theme 1 (purple, 30%) involved a request to buy traditional Chinese medicine in ordinary pharmacies, which was represented by the words COVID-19, feeling, anti-COVID-19, secret recipe, pharmacy, Chinese medicine, infectiousness, end, and epidemic. For example, “We hope the medicine for COVID-19 can be bought directly from the pharmacy.”

Theme 2 (light red, 26.67%) included a travel code anomaly, represented by the words yellow code, self-financed, ride, cross-provincial, destination, red code, hotel, and home. For example, “The travel code is malfunctioning because the travel code turns red after I cross the province.”

Theme 3 (orange, 26.67%) referred to the wearing of masks not being good for breathing, with the words country, scent, mask, fragrance, comfort, smell, disease, and function. For example, “The country could develop masks with herbal scent to improve breathing comfort.”

Theme 4 (green, 16.67%) involved complaints about Shanghai’s attitude toward the epidemic, represented by the words hospitality, world, strength, efficiency, and courage. For example, “Shanghai is neither warm and curious nor solid and efficient, but arrogant and prejudiced in the fight against the epidemic.”

Resolution

SNA based on positive comments

Table 18 shows the top nine words extracted, based on degree and eigencentrality. Figure 10 comprises these nine words in the network.

Table 18 Top nine terms based on degree and eigencentrality

As shown in Table 19, Fig. 10 consists of nine nodes and five edges. The average degree was 1.111, average weighted degree was 1.111, density was 0.139, and modularity was 0.72.

Table 19 Parameter calculation in Gephi

We identified a total of four themes. Theme 1 (purple, 33.33%) involved praying for an early end to the epidemic, represented by epidemic, COVID-19, and anti-epidemic. For example, “Pray for an early end to the epidemic.”

Theme 2 (green, 22.22%) praised the decline in the number of diagnoses, represented by the words love and rose. For example, “Thanks to all the staff and send you rose.”

Theme 3 (orange, 22.22%) included cheering, represented by the words fist and all. For example, “Come on, everybody.”

Theme 4 (blue, 22.22%) implied being confident in the recovery of the economy, represented by the words two-digit and economy. For example, “The number of confirmed cases in Shanghai has finally dropped to double digits, and China’s economy will fully recover.”

SNA based on negative comments

Table 20 shows the top 14 words calculated based on degree and eigencentrality, and Fig. 11 shows these 14 words in the network.

Table 20 Top 14 terms based on degree and eigencentrality

As shown in Table 21, Fig. 11 consists of 14 nodes and 15 edges. The average degree was 2.143, average weighted degree was 2.143, density was 0.165, and modularity was 0.4.

Table 21 Parameter calculation in Gephi

We identified four resolution-stage themes. Theme 1 (blue, 35.71%) questioned the current situation of the epidemic to determine the quarantine policy back home, with words such as epidemic, go home, case, countrywide, and world. For example, “I pray that the epidemic will end soon.”

Theme 2 (green, 28.57%) involved sadness at facing the epidemic, which was represented by the words sad, same, love, and cheat. For example, “Do you want to cry when you see the epidemic as much as I do?”

Theme 3 (purple, 21.43%) referred to questioning the authenticity of information, with words such as place, work, and message. For example, “Why is the official information different from what I know?”

Theme 4 (orange, 14.29%) included being tired of official announcements of new daily cases, represented by the words title and disgusting. For example, “I’m disgusted by the daily new cases.”

Evaluation

SNA based on positive comments

Table 22 shows the top 24 words calculated based on degree and eigencentrality, and Fig. 12 consists of these 24 words.

Table 22 Top 24 terms based on degree and eigencentrality

As shown in Table 23, Fig. 12 consists of 24 nodes and 32 edges. The average degree was 2.667, average weighted degree was 2.667, density was 0.116, and modularity was 0.624.

Table 23 Parameter calculation in Gephi

We identified a total of seven themes. Theme 1 (red, 20.83%) addressed Shanghai bearing too much risk, due to international imports, represented by the words epidemic, flight, risk, divert, and everyone. For example, “This shows that Shanghai was under tremendous pressure in terms of international risk inputs before.”

Theme 2 (dark green, 16.67%) involved the gratitude for the staff who fought against the epidemic, with words such as anti-epidemic, population, case, and staff. For example, “China has the largest population but the fewest cases; salute those who fight the epidemic.”

Theme 3 (blue, 16.67%) referred to the priority level of fighting against the epidemic, which was represented by the words country, trivia, victory, and countrywide. For example, “Fighting the epidemic is an essential thing during the epidemic, and anything else is trivial.”

Theme 4 (purple, 12.5%) expressed a wish for good luck in the college entrance exams; its representative words were high school students, perseverance, and good luck. For example, “Good luck to all college entrance examination candidates.”

Theme 5 (black, 12.5%) included airplane screening being excellent, represented by the words awesome, airplane, and check. For example, “Your flight security checks are outstanding.”

Theme 6 (gray, 12.5%) comprised the expectation of resuming production, represented by the words nucleic acid test, mask, and production. For example, “Please do a good job of personal protection and nucleic acid testing masks so we can resume work and production as soon as possible!”

Theme 7 (light green, 8.33%) involved comments on the epidemic situation being very good; its representative words were situation and resume production. For example, “Everyone is doing well so far, and I hope to keep doing well.”

SNA based on negative comments

Table 24 shows the top 42 words calculated based on degree and eigencentrality, and Fig. 13 consists of these 42 words.

Table 24 Top 42 terms based on degree and eigencentrality

As shown in Table 25, Fig. 13 comprises 42 nodes and 80 edges. The average degree was 3.81, average weighted degree was 4.048, density was 0.093, and modularity was 0.631.

Table 25 Parameter calculation in Gephi

We identified a total of nine themes. Theme 1 (red, 23.81%) involved the local government having a different quarantine policy from the central government, represented by the words nucleic acid test, self-financed, surveillance, out-of-province, all, place, staff, measure body temperature, hotel, and camera. For example, “Although the central government requires that the quarantine policy not be enforced, you still need to quarantine for three days at your own expense when you travel.”

Theme 2 (purple, 21.43%) included comments on how to get vaccinated if one has other diseases, represented by the words vaccine, COVID-19, patient, sad, hospital, first shot, rescue, mother, and rhinitis. For example, “Can patients with other underlying medical conditions be vaccinated?”

Theme 3 (blue, 16.67%) comprised praying for the virus to disappear sooner, with words such as epidemic, virus, country, countrywide, people, world, and disappear. For example, “When will COVID-19 disappear?”

Theme 4 (black, 11.9%) contained words related to missing family members, with words such as work, video, go home, elderly, and care. For example, “After the outbreak of the epidemic in 2020, I have been working in other places, and I have no chance to go home to visit my father.”

Theme 5 (orange, 7.14%) implied a request for Shanghai to divide prevention and control policies according to district level, with words such as district level, policy, and record. For example, “I strongly recommend that Shanghai adjust its epidemic policy according to the district level.”

Theme 6 (dark green, 4.76%) included comments of pessimism about the development of the epidemic, represented by words like end and unable. For example, “It seems that the epidemic is not going to end.”

Theme 7 (light green, 4.76%) referred to the current prevention and control policy being meaningless, represented by the words pointless and give up. For example, “Foreign countries have given up fighting against COVID-19, so there is no need for us to continue.”

Theme 8 (light pink, 4.76%) comprised discouragement, which was represented by the words words and describe. For example, “Year after year of the epidemic, I am speechless.”

Theme 9 (gray, 4.76%) involved the health qualification examination, which was represented by the words health qualification examination and examination. For example, “Has the time for the health qualification examination been set?”