Phase-adjusted estimation of the COVID-19 outbreak in South Korea under multi-source data and adjustment measures: a modelling study

1 School of Mathematics and Informational Technology, Yuncheng University, Yuncheng 044000, China
2 Shanxi Applied Mathematics Center, Taiyuan 030006, China
3 Halmos College of Natural Sciences and Oceanography, Nova Southeastern University, FL 33314, USA
4 Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830011, China
5 Complex System Research Center, Shanxi University, Taiyuan 030006, China
6 Shanxi Key Laboratory of Mathematical Techniques and Big Data Analysis on Disease Control and Prevention, Taiyuan 030006, China
7 School of Mathematics, Sichuan University, Chengdu 610064, China
8 School of Mathematics and Information Science, Shaanxi Normal University, Xi’an 710119, China

applied to estimate the epidemic trend in Wuhan, China. It computed the effective reproduction number in four different phases with varying intervention measures. Nishiura [9] estimated the incidence of infection with the novel coronavirus (COVID-19) on the Diamond Princess by employing a backcalculation method. The work [10] studied the COVID-19 outbreak on the Diamond Princess cruise ship and obtained a median R0 of 2.28 (95% CrI 2.06–2.52) together with the size of the cumulative cases in the following days. Few studies have focused on the outbreak of this infectious disease in South Korea. Thanks to sustained effort and practical experience with containment measures and strategies for controlling the spread of COVID-19, South Korea was able to provide timely surveillance and adequate nucleic acid testing in the early phase. Therefore, fitting a dynamical model to the reported data from the outbreak in South Korea gives a better understanding of the epidemic, allows us to predict how it will develop, and lets us assess the effectiveness of public health interventions and measures.

Data Source
The Korea Centers for Disease Control and Prevention (KCDC) reported data twice daily, at 9:00 and 16:00, from January 20, 2020. Since March 2, 2020, the KCDC has updated its COVID-19 statistics once daily, as of 0:00. Here, we collected the reported data on daily infected cases as of 16:00 in South Korea from January 20 to March 9, 2020 [11]. The KCDC conducted contact tracing for the first 46 confirmed cases and released the tracing and mapping information to the public on February 19, 2020. Until mid-February, all cases were found to be related to travel or to contact with confirmed cases. The strategy of strictly screening travelers at airports and of tracing and quarantining the close contacts of confirmed cases contained the virus well. However, a few cases with an unknown source of infection, including case IDs 29, 30, 31, 32, 38 and 46, have been reported since February 16, 2020 (see Figure 1). Moreover, case ID 31, confirmed on February 18, 2020, is considered a super-spreader. Case 31 attended services at a church in Daegu twice and traveled extensively through South Korea. She soon began to develop symptoms but continued her regular routine. Since then, a dramatic spike in cases can be seen in South Korea. Therefore, we chose February 16, 2020 as the starting time for the fitting. We fit the cumulative confirmed cases, the cumulative deaths and the cumulative recoveries in South Korea from February 16 to March 9, 2020 in order to evaluate the effectiveness of the adjusted control measures.

Model
Based on the epidemiological status of individuals and the control interventions in South Korea, hospital treatment and self-isolation after an infectious individual is diagnosed are incorporated into the basic SEIR epidemic model. The total population at time t, denoted by N(t), includes the following epidemiological compartments: susceptible (S), exposed (E), infectious (I), hospital treated (H) and home quarantined (Q). Susceptible individuals move to the latent compartment E after effective contact with a source of infection, which may be an exposed, infectious or self-isolated individual. Here, we consider that self-isolated individuals may still infect their family members. Let β be the probability of successful disease transmission per contact and c the contact rate per unit time per individual. The average incubation period is 1/δ. An infection with symptom onset is diagnosed over the period 1/θ through real-time reverse transcription polymerase chain reaction (RT-PCR) tests. We assume that both exposed and infectious individuals carry the virus and may spread it to others. Meanwhile, a proportion κ of confirmed infectious cases are treated in hospitals, and the rest are advised to self-isolate and wait for possible future health care. The disease-induced death rate is denoted by d, and γ_h is the recovery rate under hospital treatment. The median waiting time for hospitalization is 1/ρ. The scalars 0 < ε < 1 and 0 < η < 1 give the transmission from the exposed and the self-isolated populations, respectively, relative to that from the infectious population. The transmission dynamics are governed by the following system of equations:
Using the next generation matrix [12,13], we obtain the expression for the basic reproduction number or the effective control reproduction number as follows:
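As an illustration of the compartmental structure described above, the following Python sketch integrates one possible form of the model. It is a minimal sketch under stated assumptions, not the system used in the study. The flow structure is assumed: diagnosed cases leave I at rate θ and split κ to H and 1 − κ to Q, Q moves to H at rate ρ, and recovery and death occur only from H. The force of infection βcS(I + εE + ηQ)/N, the closed-form reproduction number derived from that assumed structure, and every numerical value other than c = 14.2, 1/δ = 5.2 days, 1/θ = 3.4 days and κ = 0.56 are likewise hypothetical.

```python
# Assumed SEIHQR-type flows; this is an illustrative reading of the text,
# not the system of equations used in the study.

def derivatives(state, p):
    """Right-hand side of the assumed model for state = [S, E, I, H, Q, R]."""
    S, E, I, H, Q, R = state
    N = S + E + I + H + Q + R
    # Exposed and self-isolated individuals transmit at reduced rates eps and eta.
    new_infections = p["beta"] * p["c"] * S * (I + p["eps"] * E + p["eta"] * Q) / N
    dS = -new_infections
    dE = new_infections - p["delta"] * E
    dI = p["delta"] * E - p["theta"] * I
    dH = p["kappa"] * p["theta"] * I + p["rho"] * Q - (p["gamma_h"] + p["d"]) * H
    dQ = (1.0 - p["kappa"]) * p["theta"] * I - p["rho"] * Q
    dR = p["gamma_h"] * H
    return [dS, dE, dI, dH, dQ, dR]


def rk4_step(state, p, dt):
    """One classical fourth-order Runge-Kutta step of length dt (in days)."""
    def shifted(k, h):
        return [s + h * ki for s, ki in zip(state, k)]
    k1 = derivatives(state, p)
    k2 = derivatives(shifted(k1, dt / 2.0), p)
    k3 = derivatives(shifted(k2, dt / 2.0), p)
    k4 = derivatives(shifted(k3, dt), p)
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate(state, p, days, steps_per_day=10):
    """Return the state at day 0, 1, ..., days."""
    dt = 1.0 / steps_per_day
    daily = [list(state)]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, p, dt)
        daily.append(state)
    return daily


def reproduction_number(p):
    """Reproduction number of the assumed model (next generation matrix at S = N)."""
    return p["beta"] * p["c"] * (
        p["eps"] / p["delta"] + 1.0 / p["theta"]
        + p["eta"] * (1.0 - p["kappa"]) / p["rho"]
    )


params = {
    "c": 14.2, "delta": 1 / 5.2, "theta": 1 / 3.4, "kappa": 0.56,  # from the text
    # Hypothetical values, chosen only so that the sketch runs:
    "beta": 0.05, "eps": 0.5, "eta": 0.3, "rho": 0.5, "gamma_h": 0.05, "d": 0.002,
}
initial_state = [1_000_000.0, 50.0, 20.0, 5.0, 5.0, 0.0]  # hypothetical
trajectory = simulate(initial_state, params, days=30)
print("reproduction number of the assumed model:", reproduction_number(params))
```

In this assumed structure, each new infection spends on average 1/δ in E, 1/θ in I and, with probability 1 − κ, 1/ρ in Q, which gives the three terms inside `reproduction_number`.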

Estimation of model parameters
We estimate the basic reproduction number and the effective control reproduction number in three different phases. The first phase (from February 16, 2020 to February 24, 2020) can be regarded as the early phase of the epidemic, when few prevention and control measures were implemented. On February 25, 2020, the South Korean government began to warn residents to take precautions, and all public libraries, museums, churches, day-care centers and courts were closed in Daegu, the epicenter of the outbreak [14]. The control measures were continuously enhanced in the second phase (from February 25, 2020 to March 2, 2020). On March 3, 2020, the government stated that the detection speed for the novel coronavirus SARS-CoV-2 would be greatly improved [15] and placed all government agencies on 24-hour full alert. Meanwhile, some temporary treatment centers equipped with certain medical facilities began to accept patients with mild symptoms [16].
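The phase boundaries above can be encoded as a small lookup, sketched here in Python. The dates come from the text, with March 9, 2020 taken as the end of the fitting window. The function names and all parameter values except c = 14.2 and 1/θ = 3.4 days are hypothetical placeholders. The carry-over rule assumes that each phase inherits the previous phase's parameters unless they are explicitly re-estimated.

```python
from datetime import date

# Inclusive start date of each phase, as stated in the text.
PHASE_STARTS = [
    (date(2020, 2, 16), 1),
    (date(2020, 2, 25), 2),
    (date(2020, 3, 3), 3),
]
FIT_END = date(2020, 3, 9)  # last day of the fitting window


def phase_of(day):
    """Return the phase (1, 2 or 3) that contains the given calendar day."""
    if day < PHASE_STARTS[0][0] or day > FIT_END:
        raise ValueError("day lies outside the fitting window")
    phase = 1
    for start, label in PHASE_STARTS:
        if day >= start:
            phase = label
    return phase


def parameters_for(day, phase1, phase2_changes, phase3_changes):
    """Carry parameters forward, overriding only those re-estimated in a phase."""
    values = dict(phase1)
    phase = phase_of(day)
    if phase >= 2:
        values.update(phase2_changes)
    if phase == 3:
        values.update(phase3_changes)
    return values


# c and theta in phase 1 follow the text; every other number is a placeholder.
phase1 = {"c": 14.2, "theta": 1 / 3.4, "d": 0.002}
phase2_changes = {"c": 2.0, "d": 0.001}   # hypothetical re-estimates
phase3_changes = {"theta": 0.5}           # hypothetical re-estimate

print(phase_of(date(2020, 2, 25)))
print(parameters_for(date(2020, 3, 5), phase1, phase2_changes, phase3_changes))
```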
For the first 28 confirmed cases in the first phase, we collected the dates of entry to South Korea, symptom onset and diagnosis (for hospitalization and/or self-isolation), as well as the number of close contacts (see Table 1). No imputation was made for missing data. The average contact rate was calculated to be 14.2 per day per person, and the mean duration from symptom onset to diagnosis was 3.4 days. The average incubation period is assumed to be 5.2 days (95% CrI 4.1–7.0) [17,18,19,20], and the proportion κ is taken as 0.56 [21] according to news site reports. The initial values H(0), Q(0) and R(0) are based on the reported data on February 26, 2020. The other parameters and the initial values S(0), E(0) and I(0) are estimated (see Table 2).
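The step from durations to rates can be sketched as follows. The record list is a made-up toy example, not the data in Table 1, and `mean_delay_days` is a hypothetical helper. Records with a missing date are dropped rather than imputed, which is one way to read "no imputation". The final two lines use the durations quoted above.

```python
from datetime import date


def mean_delay_days(records):
    """Mean days from symptom onset to diagnosis, skipping incomplete records."""
    delays = [
        (diagnosed - onset).days
        for onset, diagnosed in records
        if onset is not None and diagnosed is not None
    ]
    if not delays:
        raise ValueError("no complete records")
    return sum(delays) / len(delays)


# Toy records (onset, diagnosis); purely illustrative, not the study data.
toy_records = [
    (date(2020, 2, 1), date(2020, 2, 4)),
    (date(2020, 2, 3), date(2020, 2, 8)),
    (None, date(2020, 2, 9)),              # missing onset: dropped, not imputed
    (date(2020, 2, 6), date(2020, 2, 8)),
]
print(mean_delay_days(toy_records))

# Durations quoted in the text, converted to per-day rates.
theta = 1 / 3.4   # diagnosis rate, about 0.294 per day
delta = 1 / 5.2   # rate of leaving the exposed class, about 0.192 per day
```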
In the second phase, people were encouraged to work from home and to reduce unnecessary gatherings. Therefore, we assume that the parameter values remain the same as in the first phase, except for the contact rate c and the disease-induced death rate d.
From the third phase onward, more than 500 'drive-thru' coronavirus testing stations were launched in South Korea to provide free tests in a less crowded manner. The detection speed was thereby greatly improved, so the parameter values are the same as in the second phase except for the mean duration from symptom onset to diagnosis, 1/θ. We denote by Y_1(t) the cumulative confirmed cases at time t, so that dY_1(t)/dt = θI, and by Y_2(t) the cumulative deaths at time t, so that dY_2(t)/dt = dH. Together with R(t), the cumulative recoveries at time t, we regarded these three groups of reported cases as three random variables, each following a Poisson distribution, and fitted our model to the real data by sampling the posterior distribution of the parameter vector. To carry out the Markov chain Monte Carlo (MCMC) procedure, we used an adaptive Metropolis-Hastings (M-H) algorithm. The algorithm was run for 20,000 iterations, and we discarded the first 10,000 iterations as a burn-in period. The median and 95% credible interval of each estimated parameter are listed in Table 2.
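The fitting idea (Poisson-distributed counts, posterior sampling, 20,000 iterations with the first 10,000 discarded) can be sketched in a few lines of Python. This is a simplified stand-in rather than the procedure used in the study. It uses a plain random-walk Metropolis-Hastings sampler instead of an adaptive one and fits a single synthetic series instead of three. The model mean is a hypothetical exponential curve y0·exp(rt) with one unknown parameter r, and the prior on r is assumed flat on r > 0.

```python
import math
import random


def poisson_log_likelihood(observed, expected):
    """Sum of log Poisson probabilities of the observed counts given the means."""
    return sum(k * math.log(mu) - mu - math.lgamma(k + 1)
               for k, mu in zip(observed, expected))


def expected_counts(r, y0, n_days):
    """Toy model mean: exponential growth at rate r (a stand-in for the ODE output)."""
    return [y0 * math.exp(r * t) for t in range(n_days)]


def metropolis_hastings(observed, y0, r_start, step, n_iter, burn_in, seed=0):
    """Random-walk M-H for r with a flat prior on r > 0 (assumed, not from the text)."""
    rng = random.Random(seed)
    n_days = len(observed)
    r = r_start
    log_post = poisson_log_likelihood(observed, expected_counts(r, y0, n_days))
    samples = []
    for i in range(n_iter):
        proposal = r + rng.gauss(0.0, step)
        if proposal > 0:  # proposals outside the prior support are rejected
            log_post_new = poisson_log_likelihood(
                observed, expected_counts(proposal, y0, n_days))
            # Accept with probability min(1, posterior ratio).
            if rng.random() < math.exp(min(0.0, log_post_new - log_post)):
                r, log_post = proposal, log_post_new
        if i >= burn_in:
            samples.append(r)
    return samples


def summarize(samples):
    """Posterior median and an approximate 95% credible interval."""
    ordered = sorted(samples)
    n = len(ordered)
    return ordered[n // 2], ordered[int(0.025 * n)], ordered[int(0.975 * n)]


# Synthetic counts generated from y0 = 10, r = 0.2 and rounded; not real data.
synthetic = [round(10 * math.exp(0.2 * t)) for t in range(10)]
samples = metropolis_hastings(synthetic, y0=10, r_start=0.1, step=0.01,
                              n_iter=20000, burn_in=10000)
median, lower, upper = summarize(samples)
print(median, lower, upper)
```

To fit several series at once, the log-likelihoods of the individual series would be added before the acceptance step, assuming the series are treated as independent.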

Results
By using the proposed model to fit the reported numbers of confirmed cases, deaths and recovered cases (see Figures 2 and 3), we estimate the basic reproduction number R0 ≈ 4.79 (95% CrI 4.38–5.2) in the early phase, and the control reproduction number Rc ≈ 0.32 (95% CrI 0.19–0.47) and Rc ≈ 0.27 (95% CrI 0.14–0.42) in the second and third phases, respectively, with the implementation of effective countermeasures. Moreover, the simulated number of daily confirmed cases fits the reported data well, and the data from March 10, 2020 to March 17, 2020 are used to verify the model (see Figure 4). The decline of the reproduction number is very fast in the second phase. This indicates that both public awareness of prevention and public cooperation in South Korea were high, and that the control measures were working once special management areas were set up. Using an increasing number of temporary medical treatment centers to isolate patients was very effective in avoiding further spread. The South Korean government launched extensive nucleic acid testing in the epidemic areas and published the movement information of diagnosed patients, which contributed greatly to the rapid containment. In the third phase, the South Korean government announced that the whole country had entered a "war" against COVID-19 and that the detection speed would be significantly improved. The final size of the outbreak depends on the control measures in place. If the control measures of the third phase are maintained, the predicted final size is 9661 (95% CrI 8660–11100). Moreover, we predict that the whole epidemic will be over by late April (see Figures 2 and 4). We plot the new daily infected cases for varying contact rate c and testing rate θ to examine the possible impact of enhanced interventions on COVID-19 infection (see Figure 5).
The sensitivity analysis (see Figure 5a,b) shows that reducing the contact rate and increasing the testing speed not only decrease the peak value but also delay the peak time. Moreover, compared with shortening the time to diagnosis from 4.3 days (0.8θ) to 0.7 days (5θ), reducing the contact rate from 21.4 (1.5c) to 5.7 (0.4c) per day per person has a quicker effect on containing COVID-19. This illustrates that reducing gatherings and staying home are very important and effective. If everyone tried their best to stay home, the spread might be contained even with a shortage of testing kits. This may help explain why the spread of COVID-19 in Europe, especially in Italy, has been so severe. Further, we study the impact on the peak value of new daily infected cases of varying the two parameters together (see Figure 6). The results indicate that the outbreak of COVID-19 in South Korea could be contained rapidly by combining increased detection speed, the habit of wearing masks and reduced gatherings.
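The scaling notation used in the sensitivity analysis can be read as follows: a factor applied to c multiplies the contact rate, while a factor applied to θ divides the mean time to diagnosis 1/θ. The short Python sketch below applies this to the rounded baselines quoted earlier (c = 14.2 per day per person, 1/θ = 3.4 days); the function names are hypothetical. The results come out close to, but not exactly at, the figures reported above (for example 21.3 rather than 21.4). This would be consistent with the reported figures having been computed from unrounded estimates, although that reading is an assumption.

```python
BASELINE_CONTACT_RATE = 14.2  # contacts per day per person (from the text)
BASELINE_DELAY_DAYS = 3.4     # mean days from symptom onset to diagnosis (from the text)


def scaled_contact_rate(factor):
    """Contact rate after multiplying c by the given factor."""
    return factor * BASELINE_CONTACT_RATE


def scaled_delay_days(factor):
    """Mean time to diagnosis after multiplying theta by the given factor."""
    return BASELINE_DELAY_DAYS / factor


for factor in (1.5, 0.4):
    print(f"{factor}c -> {scaled_contact_rate(factor):.2f} contacts per day")
for factor in (0.8, 5):
    print(f"{factor}theta -> {scaled_delay_days(factor):.2f} days to diagnosis")
```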

Conclusions
This study describes the spread and control of COVID-19 in South Korea by formulating a mathematical model, estimating the basic and control reproduction numbers, and evaluating the effectiveness and strength of the control measures. According to the implementation times of the control measures issued by the government, we divide the transmission process from February 16, 2020 to March 9, 2020 into three different transmission stages. We find that the decline of the control reproduction number is rapid in the second stage. If the current testing efforts and control measures are maintained, we predict that the final size of this outbreak in South Korea is 9661 (95% CrI 8660–11100) and that the whole epidemic will be over by late April. It is notable that the duration from symptom onset to diagnosis in South Korea is very short by worldwide standards. This aggressive testing capacity allowed South Korea to identify cases rapidly and isolate them quickly, and it also allowed the government to control the spread of the virus effectively without shutting everything down. Indeed, there had been 10,765 cases of COVID-19 in South Korea as of April 30, 2020, which was the first day since February with no locally infected cases. Signs of a slowdown were clearly observed. The final size of 10,765 falls within our 95% CrI (8660, 11100). One limitation of our study is that the impact of importation is not considered. As a result, our estimate that the outbreak would end in late April is too optimistic, since South Korea is now under pressure from cases imported from regions that have just passed, or are still at, the peak of their outbreaks, such as Europe and the United States.