The risk of future waves of COVID-19: modeling and data analysis

Abstract: After a major outbreak of coronavirus disease 2019 (COVID-19) starting in late December 2019, mainland China reported no new cases for the first time on March 18, 2020, and Hong Kong Special Administrative Region (SAR) reported no new cases on April 20, 2020. However, new cases were later reported in these places, and a second wave has emerged since June 11, 2020. Here we develop a stochastic discrete-time epidemic model to evaluate the risk of COVID-19 resurgence by analyzing the data from the beginning of the outbreak to the second wave in three places: Beijing, Xinjiang, and Hong Kong SAR. In the model, we use an input parameter to represent a few potential risks that may cause a second wave, including asymptomatic infection, imported cases from other places, and virus from the environment such as frozen food packages. The effect of physical distancing restrictions imposed at different stages of the outbreak is also included in the model. Model simulations show that the magnitude of the input and the time between the initial entry and subsequent case confirmation significantly affect the probability of a second wave. Although the susceptible population size does not change the probability of resurgence, it can influence the severity of the outbreak when a second wave occurs. Therefore, to prevent a future wave, timely screening and detection are needed to identify infected cases in the early stage of infection. When infected cases appear, measures such as contact tracing and quarantine should be implemented to reduce the size of the susceptible population and thereby mitigate the COVID-19 outbreak.


Introduction
The novel coronavirus SARS-CoV-2 has caused a global pandemic of unprecedented viral pneumonia [1,2]. This infection is known as coronavirus disease 2019 (COVID-19) [3]. Because of its high human-to-human transmissibility, SARS-CoV-2 has spread rapidly around the world [4][5][6][7]. In mainland China, the outbreak started in December 2019 and reached its peak in February 2020, after which the number of new confirmed cases decreased. On March 18, 2020, there were no new cases of infection for the first time, and the economy and daily life gradually returned to normal [8][9][10][11]. In Hong Kong Special Administrative Region (SAR), the first case was reported on January 18, 2020. The outbreak peaked in late March 2020, with no new confirmed cases on April 23, 2020. However, COVID-19 continues to spread worldwide and the outbreak is ongoing globally [12,13]. The infection has been confirmed in about 190 countries so far. As of February 21, 2021, over 110.6 million cases and 2.45 million deaths have been reported globally since the beginning of the pandemic. The European Region has the largest numbers of new cases and new deaths, and the United States accounts for the greatest proportion of cumulative cases and deaths [14]. In China, although the epidemic has been under control, confirmed cases have been found occasionally in different places, which has raised significant concerns about future waves of COVID-19.
If there are no confirmed cases in a region for a long time, then the risk of disease re-emergence comes mainly from imported cases or viruses. The major COVID-19 transmission pathway is from human to human through respiratory droplets [15,16]. In particular, asymptomatic individuals can still spread the virus, and transmission from them poses a significant public health challenge [17][18][19][20][21][22][23]. Cases imported from other high-risk places are another path of viral spread. To reduce the number of imported cases, many countries have issued travel restrictions, for example, reducing the frequency of flights from abroad [24,25]. However, as the infection is still prevalent in many places, imported cases still represent a tremendous risk, which may lead to new local outbreaks [26][27][28][29]. Another possible path of SARS-CoV-2 transmission is through the food supply chain, surfaces and the environment. In China, the coronavirus was detected on frozen foods, including their packaging materials and storage environment, in July 2020. There appear to have been two outbreaks related to transmission via frozen food [30]. In view of this, interventions that reduce foodborne transmission of pathogens need to be considered [32].
Non-pharmaceutical control measures implemented so far are mainly mask wearing, hand washing, social distancing, quarantine and city/region lockdown [31]. These interventions were gradually lifted in consideration of the trade-off between economic sustainability and public health. An agent-based model was developed to evaluate the possibility of a second-wave emergence under different extents and timing of intervention relaxation [32]. Further studies assessed the risk of secondary waves after control measures such as lockdowns were relaxed [33][34][35][36][37][38][39][40]. The study [41] compared the epidemiological patterns of COVID-19 in 53 countries or regions that experienced two waves of the pandemic, and analyzed the differences between the two outbreaks. Their results suggested that there was a shift of infection to younger age groups, which may make it more difficult to control the pandemic.
In this work, we focus on the COVID-19 spread in several places where the epidemic has been under control but new cases have been reported occasionally. To study the impact of imported cases on the dynamics of COVID-19 in China under different scenarios of prevention and control measures, Jia et al. developed an impulsive epidemic model to describe imported cases from abroad [42]. In their model, the time when the exposed cases were imported was fixed. However, the exposed cases who carry the virus without symptoms are usually unknown. When an infected case is identified, the virus has probably been spreading for a period of time. At the beginning of a new wave, the infection might be induced by a small number of infected cases, and disease transmission at this stage can be affected by many random factors. In addition, the data of new and cumulative cases were reported daily. All of these considerations motivate us to develop a stochastic discrete-time compartmental model that accounts for randomness, epidemic data, the impact of input virus/cases, and the initial entry time. By fitting the model to the two waves of outbreaks in two places in mainland China (Beijing and Xinjiang) and in Hong Kong SAR, we evaluate the risk factors that can affect the second or a future wave of COVID-19.

Model
We develop a stochastic discrete-time model based on the classic compartmental model. Individuals who have no clinical manifestations (such as fever, cough, sore throat, or other symptoms that can be self-perceived or clinically recognized) but test positive in a serological or blood test are referred to as asymptomatic infections. This population includes two types of individuals. The first type is asymptomatic infection in the incubation period; these individuals will later develop clinical symptoms or become confirmed cases through a screening test or CT (computed tomography) examination. The second type has no symptoms until the nucleic acid test turns negative.
The total population is divided into five epidemiological classes: susceptible ($S$), exposed ($E$), asymptomatically infected ($A$), symptomatically infected ($I$), and recovered ($R$). Due to quarantine, the susceptible and exposed states are further divided into $S_q$ and $E_q$. With hospitalization, the infected class (both asymptomatic and symptomatic) is further divided into $H_A$ and $H_I$. Because infection and disease progression can be affected by random factors, we assume that the flow between any two compartments is a stochastic process [43][44][45]. For example, $D_{11}(t)$ is the number of susceptible individuals who become newly infected, and this process obeys a binomial distribution. The diagram of the model is shown in Figure 1, and the corresponding stochastic discrete-time model is referred to as system (2.1). In this system, each flow $D_{ij}(t)$ obeys a binomial distribution $\mathrm{Bin}(n, p)$ with parameters $(n, p)$; specifically,
$$D_{11}(t) \sim \mathrm{Bin}(S_t, P_{11}(t)), \quad D_{12}(t) \sim \mathrm{Bin}(S_t, P_{12}(t)), \quad D_{21}(t) \sim \mathrm{Bin}(E_t, P_{21}),$$
$$D_{31}(t) \sim \mathrm{Bin}(A_t, P_{31}(t)), \quad D_{32}(t) \sim \mathrm{Bin}(A_t, P_{32}), \quad D_{33}(t) \sim \mathrm{Bin}(A_t, P_{33}).$$
Here $\exp\left[-\beta c(t)\frac{I+\theta A}{N}h\right]$ is the probability of staying in the compartment $S$. The time step $h$ is chosen to be one, so it is omitted in the expression.
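The binomial flow out of the susceptible class can be illustrated with a minimal sketch. It assumes that the leaving probability is the complement of the staying probability given above, and it draws the binomial variate as a sum of Bernoulli trials. The function names and all numerical values are hypothetical and chosen only for illustration; they are not the fitted values of this study.

```python
import math
import random

def leave_probability(rate, h=1.0):
    """Probability of leaving a compartment within one time step of length h."""
    return 1.0 - math.exp(-rate * h)

def binomial(rng, n, p):
    """Draw from Bin(n, p) as a sum of n Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)

def new_infections(rng, S, I, A, N, beta, c, theta):
    """Sample D_11(t) ~ Bin(S_t, P_11(t)), with P_11 = 1 - exp(-beta*c*(I + theta*A)/N)."""
    p11 = leave_probability(beta * c * (I + theta * A) / N)
    return binomial(rng, S, p11)

# Illustrative values only (assumptions, not estimates from the data).
rng = random.Random(0)
d11 = new_infections(rng, S=5000, I=10, A=5, N=5000, beta=0.2, c=10.0, theta=0.5)
print(d11)
```

Repeating the call with different seeds gives different values of `d11`, which is the source of run-to-run variability when the number of infected individuals is small.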
With $h = 1$, $P_{11}(t) = 1 - \exp\left[-\beta c(t)\frac{I+\theta A}{N}\right]$ is the probability of individuals leaving the susceptible compartment. The other $P$ functions can be explained in a similar way. The meaning of each parameter in the model is summarized in Table 1. Because pharmaceutical interventions are limited, mask wearing and social distancing play a critical role in the control of the COVID-19 pandemic. As the epidemic is gradually brought under control, people's vigilance decreases, and strict intervention measures may have to be lifted for economic reasons. We use a time-varying function for the contact rate to describe this change. When the pandemic began and spread rapidly, control measures such as city lockdown, mask wearing, and reduced social activity greatly reduced the contact between people. We denote the time when strict control was implemented by $T_0$. When the number of infected cases gradually decreases after the peak, the control measures are relaxed and people's lives gradually return to normal. We denote this time by $T_1$. When new cases are reported again, people's vigilance increases, and prevention and control measures are implemented again. We denote this time by $T_2$. A time-varying piecewise function for the contact rate $c(t)$, with switching times $T_0$, $T_1$ and $T_2$, is used to describe the change of human behavior and the effect of control measures during the epidemic.

Table 1 (excerpt). Parameters related to the second wave.

| Parameter | Definition | Beijing | Xinjiang | Hong Kong SAR | Source |
| --- | --- | --- | --- | --- | --- |
| $\tau$ | The time of importation of the first case in the second wave | 17 | 16 | - | Estimated |
| $p_E(T_2 - \tau)$ | The number of exposed cases entered at the time $T_2 - \tau$ in the second wave | 6 | 7.02 | - | Estimated |
| $\alpha$ | Disease-induced death rate | 0 | 0 | 0 | Assumed |

"-" means the parameter is not included in that place.
We define the quarantine rate $q(t)$ in a similar way. The quarantine rate increases as the epidemic gets worse and decreases as it improves. Thus, we assume that the quarantine rate is also a time-dependent piecewise function. The functions $c(t)$ and $q(t)$ are shown in Figure 2(a-b), Figure 3(e-f) and Figure 4(a-b) for the three places.
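A piecewise contact rate with switching times $T_0$, $T_1$ and $T_2$ could be sketched as follows. The exponential transitions between phases, the parameter names (`c0`, `cb`, `r1`, `r2`, `r3`) and the numerical values are assumptions made for illustration only; they are not taken from the text, which specifies only the three switching times and the qualitative behavior in each phase.

```python
import math

def contact_rate(t, T0, T1, T2, c0, cb, r1, r2, r3):
    """Illustrative piecewise contact rate; the exponential shapes are an assumption."""
    if t < T0:
        # Before strict control: baseline contact rate.
        return c0
    if t < T1:
        # Strict control: decay from c0 toward the minimum level cb.
        return cb + (c0 - cb) * math.exp(-r1 * (t - T0))
    c_T1 = cb + (c0 - cb) * math.exp(-r1 * (T1 - T0))
    if t < T2:
        # Relaxation: recover from the level at T1 back toward c0.
        return c0 - (c0 - c_T1) * math.exp(-r2 * (t - T1))
    c_T2 = c0 - (c0 - c_T1) * math.exp(-r2 * (T2 - T1))
    # Renewed control after new cases are reported at T2.
    return cb + (c_T2 - cb) * math.exp(-r3 * (t - T2))

# Hypothetical values chosen only to show the shape of the curve.
params = dict(T0=10, T1=60, T2=150, c0=14.0, cb=3.0, r1=0.2, r2=0.1, r3=0.3)
for day in (0, 30, 100, 170):
    print(day, round(contact_rate(day, **params), 3))
```

Each branch starts from the value reached at the end of the previous phase, so the sketched rate is continuous at the switching times. A quarantine rate $q(t)$ could be built in the same way with the direction of change reversed.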

Data
We collected the data for Beijing and Xinjiang from the local health commissions in mainland China, and the data for Hong Kong SAR from the Centre for Health Protection. The data include time series of confirmed COVID-19 cases, recovered cases, and asymptomatic coronavirus carriers. On December 26, 2019, a respiratory and critical care physician in Wuhan reported pneumonia of unknown cause for the first time. The epidemic then spread rapidly in mainland China, and the number of newly confirmed cases reached its peak on February 4, 2020. On March 18, 2020, the number of newly confirmed cases in mainland China became 0 and the number of confirmed cases fell below 20,000. After that, the reported cases in mainland China were mainly imported cases. A few months later, infected cases began to rise again in some places. On June 11, 2020, a confirmed case was reported in Beijing, with no history of traveling outside Beijing and no close contact with a suspected infection in the previous two weeks. This ended a 56-day streak of no local infection in Beijing. On July 15, 2020, i.e., 149 days after the previous confirmed cases, one confirmed case and three asymptomatic cases were found in Xinjiang. In Hong Kong SAR, there were sporadic confirmed cases after April 20, and a second-wave outbreak emerged on July 5. This paper focuses on the data from these three places to study the risk of the emergence of a future wave of COVID-19. The switching times $T_0$, $T_1$ and $T_2$ in the piecewise functions are determined by the response time in each place.

Cause of resurgence
If there are no cases for a long period of time (e.g., several months) after a wave of COVID-19, then a new infection is likely to be caused by imported cases or exposure to the virus. The possible sources of the virus that causes a second wave can be summarized as follows. (1) Imported cases from abroad. Despite strict regulations on international travel and border inspections, some cases imported from abroad are still reported. There is no guarantee that 100% of the infected or exposed individuals entering the country will be isolated. The incubation period of the infection is not well known and may not be the same for all infected people. Under a fixed-duration quarantine, an infected individual may become a confirmed case only after the quarantine is over. This may be a risk for a second wave in mainland China. (2) Asymptomatic cases. These people carry the virus but cannot be identified unless they take a nucleic acid test, yet they can infect other people. Therefore, asymptomatic carriers represent another risk for the occurrence of a second wave. (3) Virus from the environment. Some studies have shown that low temperature can greatly promote the persistence of coronaviruses, so frozen foods are potential carriers. Transmission occurs via touching contaminated objects, which mediates infection through the mouth, nose, or eyes. This appears to be another transmission risk that has been overlooked.
The potential causes summarized above can be described by adding new exposed individuals to our model at a certain time. The time point when the new confirmed case was reported is $T_2$, but the time when the exposed individuals were introduced remains unknown. Here we assume that the number of input exposed individuals is $p_E(T_2 - \tau)$, where $\tau$ represents the time lag from the entry of the exposed individuals to the later confirmation of infection. Thus, $T_2 - \tau$ is the time point when the exposed individuals entered. The equation for $E(t)$ in model (2.1) is replaced by one that includes this input term at time $T_2 - \tau$. Note that the reported case and the imported case may not be the same person.
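The input of exposed individuals can be sketched as a single pulse added to the daily update of $E$. This is a minimal sketch under stated assumptions: the helper names, the way the other flows are passed in, and the day index chosen for $T_2$ are hypothetical. The values $\tau = 17$ and $p_E = 6$ are the Beijing estimates quoted in this paper.

```python
def exposed_input(t, T2, tau, p_E):
    """Number of exposed individuals injected on day t (a single pulse on day T2 - tau)."""
    return p_E if t == T2 - tau else 0

def step_exposed(E_t, inflow, outflow, t, T2, tau, p_E):
    """One-day update of E: usual inflow and outflow plus the external input.

    `inflow` and `outflow` stand for the sampled binomial flows into and out of E;
    their exact composition in system (2.1) is not reproduced here.
    """
    return E_t + inflow - outflow + exposed_input(t, T2, tau, p_E)

# T2 = 150 is a hypothetical day index; tau and p_E are the Beijing estimates.
T2, tau, p_E = 150, 17, 6
E = 0
for t in range(130, 136):
    E = step_exposed(E, inflow=0, outflow=0, t=t, T2=T2, tau=tau, p_E=p_E)
    print(t, E)
```

With no other flows, $E$ stays at zero until day $T_2 - \tau = 133$ and equals $p_E$ afterwards, which shows that the pulse acts exactly once.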
The increase in the susceptible population due to lifted interventions may also contribute to the second wave. The first wave of COVID-19 emerged in Wuhan in early January 2020, about ten days before the Lunar New Year. As a result, most people stayed at home and took an unusually long vacation, which greatly reduced the probability of contact. In addition, public transportation was suspended, and schools and restaurants were all closed. This series of strict measures reduced the number of susceptible people to a very low level. In our model, we assume that the number of susceptible people in the first wave of the outbreak is $S_{01}$. After the first wave, social activities gradually returned to normal and the size of the susceptible population increases to $S_{02}$ when the second wave emerges. The time of the susceptible population change, denoted by $T_3$, depends on the region. For Beijing and Xinjiang, we let it be the same as $T_1$. For Hong Kong SAR, it is the time when the local restriction policy was relaxed. Thus, the number of susceptible individuals $S_0(t)$ is a piecewise function of time that switches from $S_{01}$ to $S_{02}$ at $T_3$.
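The change in the susceptible population can be written as a step function. The sketch below assumes that the switch is instantaneous and that the new value applies from day $T_3$ onward; the boundary convention and the day index used for $T_3$ are assumptions. The two population sizes are the Beijing estimates reported in the Conclusions.

```python
def susceptible_pool(t, T3, S01, S02):
    """Size of the susceptible population: S01 before T3, S02 from T3 onward (assumed)."""
    return S01 if t < T3 else S02

# T3 = 100 is a hypothetical day index; S01 and S02 are the Beijing estimates.
for day in (99, 100, 101):
    print(day, susceptible_pool(day, T3=100, S01=5001, S02=6012))
```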

Results
We fit the discrete stochastic model (2.1) with the input parameter $p_E(T_2 - \tau)$ to the data of the two waves of outbreaks in Beijing and Xinjiang using the least squares method. The fitted data include the numbers of reported confirmed cases, asymptomatic cases and recovered cases. For the epidemic in Hong Kong SAR, there were still sporadic reports of confirmed cases after the first wave. The second wave in Hong Kong SAR is likely due to the increase in the susceptible population after the restrictions were lifted. We therefore use the model (2.1) without the input parameter $p_E(T_2 - \tau)$ to fit the data in Hong Kong SAR. Parameter values obtained from the fitting are listed in Tables 1 and 2. The size of the susceptible population in each of the three places is smaller than the entire population of that place.
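The least squares criterion can be illustrated with a small sketch that sums the squared errors over several data series and picks the best candidate parameter from a grid. The grid search, the function names and the toy exponential model are assumptions used only for illustration; the optimization procedure and the treatment of stochastic runs used in this study are not described in the text.

```python
import math

def sum_squared_error(simulated, observed):
    """Sum of squared differences between two equally long daily series."""
    if len(simulated) != len(observed):
        raise ValueError("series must have the same length")
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))

def total_error(sim, obs):
    """Add the errors of every fitted series (e.g., confirmed and recovered cases)."""
    return sum(sum_squared_error(sim[name], obs[name]) for name in obs)

def fit_by_grid(model, obs, candidates):
    """Return the candidate parameter value with the smallest total squared error."""
    return min(candidates, key=lambda params: total_error(model(params), obs))

def toy_model(rate, days=10):
    """Hypothetical stand-in for the model output; not the epidemic model (2.1)."""
    confirmed = [5.0 * math.exp(rate * t) for t in range(days)]
    recovered = [0.5 * c for c in confirmed]
    return {"confirmed": confirmed, "recovered": recovered}

# Synthetic "data" generated from the toy model with rate 0.2.
observed = toy_model(0.2)
best = fit_by_grid(toy_model, observed, candidates=[0.1, 0.15, 0.2, 0.25])
print(best)
```

Because the synthetic data are generated with rate 0.2, the search recovers that value exactly. With real data the minimum error would be positive, and a stochastic model would typically be averaged over repeated runs before comparison.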
Here the susceptible population refers to those who may come into contact with infected cases. The stochastic simulations provide good fits to the data in these three places; see Figure 2(d-f), Figure 3(d-f) and Figure 4(d-f). The corresponding contact rate $c(t)$, quarantine rate $q(t)$ and susceptible population change $S_0(t)$ are shown in Figure 2(a-c), Figure 3(e-g) and Figure 4(a-c), respectively.
The emergence of the second wave is influenced by the number of input exposed individuals and by how long the infection has been spreading before confirmed cases are reported. We conduct numerical simulations to study the risk of having a second wave. For each parameter set we run 500 stochastic simulations and record the maximum number of confirmed cases in each run. We denote the average of these maxima by MH and choose a threshold value of 30. If the MH value exceeds 30, the outcome is regarded as a second wave. The results show that not all scenarios lead to a second wave. From the 500 stochastic simulations, we also calculate the probability of the occurrence of a second wave, which is denoted by Prop.
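The two summary quantities can be computed from a set of simulated trajectories as sketched below. The sketch assumes that MH is the mean of the per-run maxima and that Prop is the fraction of runs whose maximum exceeds the threshold; the second definition is an assumption, since the text does not spell out how Prop is obtained. The four short trajectories are made-up numbers for illustration.

```python
def summarize_runs(runs, threshold=30):
    """Return (MH, Prop, is_second_wave) for a list of simulated case trajectories."""
    peaks = [max(run) for run in runs]          # maximum of each run
    mh = sum(peaks) / len(peaks)                # average of the maxima
    prop = sum(1 for p in peaks if p > threshold) / len(peaks)
    return mh, prop, mh > threshold

# Made-up trajectories of daily confirmed cases (not simulation output).
runs = [
    [0, 5, 40, 10],
    [0, 2, 3, 1],
    [1, 50, 20, 0],
    [0, 0, 0, 0],
]
mh, prop, second_wave = summarize_runs(runs)
print(mh, prop, second_wave)
```

In this example half of the runs exceed the threshold, yet the average peak stays below it, which illustrates why MH and Prop are reported separately.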
In Figures 5 and 6, we explore the effect of varying the input parameter on the risk of a second wave in Beijing and Xinjiang, respectively. The range of the parameter $p_E$ is set to [0, 30] at time $T_2 - \tau$, and the time delay parameter $\tau$ is within the range [0, 20]. From Figure 5(a), we find that both the number of input exposed individuals and the time between initial entry and subsequent confirmation affect the severity of the second wave. The average maximum value of the second wave peak can reach 1600 cases in Beijing. Increasing the number of input exposed individuals can expand the scale of the disease spread. A larger time delay $\tau$ implies that the disease had spread for a longer time without any interventions before its detection. This poses a substantial challenge for the subsequent control of the disease.
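A sweep over $p_E$ and $\tau$ of the kind described above can be sketched with a toy stochastic chain. This is not model (2.1): the three-compartment structure, every rate, the population size, the number of runs and the use of the infectious peak in place of confirmed cases are all assumptions chosen to keep the example short. Only the ranges of $p_E$ and $\tau$ and the threshold of 30 follow the text.

```python
import random

def binomial(rng, n, p):
    """Bin(n, p) as a sum of Bernoulli trials; adequate for the small counts used here."""
    return sum(1 for _ in range(n) if rng.random() < p)

def toy_peak(rng, p_E, tau, S0=2000, days=40):
    """Peak number of infectious individuals in one run of a toy S-E-I chain.

    Control (fewer contacts, faster isolation) starts tau days after entry.
    All rates below are illustrative assumptions, not fitted values.
    """
    S, E, I, peak = S0, p_E, 0, 0
    for day in range(days):
        detected = day >= tau
        contacts = 1 if detected else 6        # daily contacts per infectious person
        p_remove = 0.5 if detected else 0.1    # daily isolation/recovery probability
        new_E = min(S, binomial(rng, I * contacts, 0.1 * S / S0))
        new_I = binomial(rng, E, 0.25)         # progression from E to I
        removed = binomial(rng, I, p_remove)
        S, E, I = S - new_E, E + new_E - new_I, I + new_I - removed
        peak = max(peak, I)
    return peak

def sweep(p_E_values, tau_values, runs=40, threshold=30, seed=1):
    """Map each (p_E, tau) pair to (MH, Prop) estimated from `runs` simulations."""
    rng = random.Random(seed)
    table = {}
    for p_E in p_E_values:
        for tau in tau_values:
            peaks = [toy_peak(rng, p_E, tau) for _ in range(runs)]
            mh = sum(peaks) / runs
            prop = sum(1 for p in peaks if p > threshold) / runs
            table[(p_E, tau)] = (mh, prop)
    return table

results = sweep(p_E_values=[0, 3, 15], tau_values=[0, 10, 20])
for key in sorted(results):
    print(key, results[key])
```

The study itself uses 500 runs per parameter pair and a finer grid; the smaller numbers here only keep the sketch fast. With no input ($p_E = 0$) the toy chain never produces an outbreak, so both summary values are zero for every $\tau$.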
We provide the parameter region of second wave occurrence in Figure 5(b). The deep blue points represent the parameter values for which a second wave occurs, while the deep red points represent those for which it does not. The simulation shows that a second outbreak would not take place when fewer than three exposed cases were imported. If the infection induced by the imported cases can be quickly identified, then the chance of having a second wave decreases. Figure 5(c) further shows the probability of the occurrence of the second wave over the same parameter range as in Figure 5(a). Large values of $p_E$ and $\tau$ make a second wave inevitable. We reach a similar conclusion from the simulation for Xinjiang (see Figure 6). The scale of the second wave there is larger than in Beijing over the same parameter range: the average maximum value of the possible second wave peak can reach 5000 cases in the worst scenario.
Figure 9(a) shows the average result of 500 stochastic simulations of the model (2.1) with six different susceptible population sizes in Hong Kong SAR. As the susceptible population increases, the average maximum value of the second wave peak also increases. Interestingly, the probability of the occurrence of the second wave remains almost the same for different susceptible population sizes (see Figure 9(b)). Numerical results on the effect of varying the susceptible population size in Beijing and Xinjiang are shown in Figures 7 and 8, respectively. The simulations in these two places lead to a conclusion similar to that for Hong Kong SAR. This analysis suggests that the susceptible population size plays a minor role in triggering the second wave when the other parameters are fixed.

Conclusions
COVID-19, a highly contagious disease first reported in December 2019, has been spreading globally for more than one year. Some countries/regions have mitigated the outbreak by various measures but are still at risk of recurrence. In this paper, we constructed a stochastic discrete-time compartmental epidemic model to analyze the risk of the occurrence of a second or future wave of outbreak. Compared with a deterministic system, a stochastic model is able to include the random factors in the spread of an infectious disease, particularly when the number of initial infected individuals is small. This is the case when a new wave of outbreak occurs. This discrete model can more intuitively describe the flow between any two compartments. The transition between two compartments is not deterministic and assumed to obey binomial distributions in our model. The change between two compartments in one time step corresponds to the daily data. Thus, using the discrete stochastic model facilitates full use of the data from multiple sources, thereby improving the reliability of the parameter estimation results.
To describe the change in the intensity of control measures in response to the COVID-19 pandemic, we adopt a time-varying contact rate and quarantine rate in the model. There are a few possible factors that may lead to a second wave, including imported exposed cases, asymptomatic cases, and the presence of viruses in the environment such as the frozen food chain. The common characteristic of these factors is that the transmission is silent and difficult to identify. We find that the time between the entry of exposed cases and the confirmation of subsequent infection plays a critical role in the occurrence of the second wave.
The cause of the second-wave outbreak in Beijing and Xinjiang is mainly the imported cases and an increase in the susceptible population due to relaxed interventions. The model provided a good fit to the data of the second wave in Beijing in June 2020. Based on the fitting, the value of input exposed cases is estimated to be 6 and the time from exposed individual entry to the detection of infection is 17 days. The size of the susceptible population increases from $5.001 \times 10^3$ in the first wave to $6.012 \times 10^3$ in the second wave. For Xinjiang, where the second wave of the epidemic occurred in July 2020, the value of input exposed cases is estimated to be 7 and the time from entry to detection is 16 days. The change in the number of susceptible people is greater than in Beijing.
Hong Kong SAR also experienced a second wave in July 2020. Unlike in Beijing and Xinjiang, infected cases continued to be reported occasionally in Hong Kong SAR after the first wave, and the main cause of the second wave is likely to be the increase in the number of susceptible people. Our modeling result suggests that in a region where the infection is not cleared (e.g., in Hong Kong SAR), the number of susceptible people will increase as the control measures are lifted, and this may lead to a second wave. If there is no case for a long time (e.g., in Beijing and Xinjiang), it is necessary to screen for imported cases and viruses (e.g., via the food chain), which may be the major cause of the second wave.
On the basis of the fitting to the data in Beijing and Xinjiang, we further evaluated the possibility of having a future outbreak and its severity. Because there were no confirmed cases for a long time after the first wave in Beijing and Xinjiang, the contact rate returned to the normal level, as shown in Figures 2 and 3. If there are imported exposed cases, the time to detect the infection is shown to be critical in leading to the second wave. The simulations shown in Figures 5 and 6 indicate that the second wave is determined by the number of imported exposed individuals and the time needed to detect them. The results suggest that if there are fewer than three imported exposed cases, then the number of confirmed cases will stay below the threshold of 30 that we set, which would not be considered a second wave. If the number of imported exposed individuals and the time lag in detection are larger (e.g., in the red region in Figures 5 and 6), a second wave will emerge. The more imported exposed cases there are and the longer it takes for the infection to be detected, the more likely a second wave will occur. Once a confirmed case is found, it is imperative to trace the movements of that case and the persons contacted. The information obtained from this study can be used to evaluate the probability (i.e., the probability that infected cases exceed a threshold level) and the scale of a future wave.
By investigating the effect of the susceptible population on the second wave in Beijing, Xinjiang and Hong Kong SAR, we found that the larger the susceptible population size, the more infections if the second wave occurs. However, the susceptible population size itself does not affect the probability of the occurrence of a second wave. This result suggests that imported cases might be an important factor leading to the occurrence of a second wave in a place where the epidemic has been well controlled. Once a case is found, reducing the number of susceptible people will help control the disease spread in the second wave.
Our study cannot predict when a second or future wave of COVID-19 would take place. When a new wave occurs, the model can be used to predict the scale or severity of the outbreak. This is based on the fitting of the model to existing data. If the data are not sufficient for fitting, then the power of the model prediction would be limited. Lastly, the model does not include the influence of vaccination on the disease spread. How the vaccine rollout influences the emergence of future waves remains to be further investigated.
In summary, we established a stochastic modeling framework that incorporates control measures at different stages of the epidemic and potential causes for the second wave that emerged in Beijing, Xinjiang, and Hong Kong SAR. Because infected people without symptoms are contagious and the virus attached to goods is difficult to detect, comprehensive measures are still imperative to curb the COVID-19 pandemic. It is necessary to screen for imported cases on flights and to detect the virus that may be transported by various routes. If a confirmed case is found, the contacts of the case should be thoroughly traced and the close contacts should be quarantined. Finally, it is important to continue protective measures such as wearing masks and avoiding large-scale gatherings to reduce the number of susceptible people. This will make a future wave less severe if it takes place.