Phenomenological and mechanistic models for predicting early transmission data of COVID-19

Forecasting future epidemics helps inform policy decisions regarding interventions. During the early coronavirus disease 2019 epidemic period in January–February 2020, limited information was available, and it was too challenging to build detailed mechanistic models reflecting population behavior. This study compared the performance of phenomenological and mechanistic models for forecasting epidemics. For the former, we employed the Richards model and the approximate solution of the susceptible–infected–recovered (SIR) model. For the latter, we examined the exponential growth (with lockdown) model and SIR model with lockdown. The phenomenological models yielded higher root mean square error (RMSE) values than the mechanistic models. When using the numbers from reported data for February 1 and 5, the Richards model had the highest RMSE, whereas when using the February 9 data, the SIR approximation model was the highest. The exponential model with a lockdown effect had the lowest RMSE, except when using the February 9 data. Once interventions or other factors that influence transmission patterns are identified, they should be additionally taken into account to improve forecasting.


Introduction
In December 2019, clusters of atypical pneumonia cases driven by a novel coronavirus (SARS-CoV-2) emerged in Wuhan, China [1]. A rapid surge of coronavirus disease 2019 (COVID- 19) cases was identified initially in Hubei Province, and by January 23, 2020, it involved a total of 25 provinces in China, with 571 cases and 17 deaths [2]. The Chinese government responded by implementing intensive control measures now referred to as a "lockdown." This started with Wuhan on January 23, 2020, and the rest of Hubei Province in the following days, by shutting down domestic and international flights, as well as trains, buses, subways, and ferries across the affected areas [3][4][5]. Newly reported cases in China reduced greatly soon after the lockdown's implementation and it was lifted on April 8, 2020 [6]. By then, the number of cumulative cases and deaths had reached 83,161 and 3,342, respectively [7].
Real-time forecasts of future incidence can provide valuable insights into the scale and control of an epidemic, and can help assess the effects of possible interventions [8,9]. Mechanistic [10,11] and phenomenological [12][13][14][15][16] mathematical models have been used for forecasting epidemics. Phenomenological models can efficiently capture epidemic trajectory by simply fitting the model to the incidence data as a function of time [17][18][19]. A frequently used forecasting methodology is the Richards model [20], which expresses a flexible S-shaped curve with a single inflection point; i.e., the epidemic peak. Forecasting with this model tends to be certain when the data contain the epidemic peak [18]. Mechanistic models such as the susceptible-infected-recovered (SIR) compartment model [21] can capture mechanisms of transmission dynamics, incorporating heterogeneities such as age dependence and spatial variations, and can be useful for evaluating interventions' effectiveness. During the early COVID-19 epidemic period, however, limited information was available; building detailed mechanistic models that reflected population behavior was, thus, too challenging. However, forecasting without an intervention effect could yield extremely different results from those acquired through observation [11].
An important question is whether mechanistic models should proactively be employed for short-time forecasting, during the very early stage of an epidemic, even if the mechanisms considered in the models are limited/simple (i.e., homogeneously mixing population is assumed and mechanistic details of public health countermeasures yet remain completely unknown). For the COVID-19 epidemic, the lockdown policy was a severe and intense countermeasure. In line with this, the present study aimed to compare the forecasting performance of phenomenological models and mechanistic models, the latter of which can incorporate the lockdown effect.

Epidemiological data
We retrieved daily incidence data of confirmed COVID-19 cases in China by reporting date from January 4 (the first case reported through surveillance) through February 18 from the World Health Organization (WHO) website [7]. This time frame encompassed the start of the epidemic to roughly 2 weeks after the epidemic's peak.
From February 13, the definitions in the reporting criteria in China were revised, which was followed by an abrupt increase in the number of cases for February 13 and 14. These numbers were considered outliers, and we therefore excluded these 2 days of data from the analysis. In total, we analyzed incidence data of 53,330 confirmed cases.

Models
We used four different models for forecasting in this study. The first two were phenomenological: the Richards model [20] and the approximate solution of the basic SIR differential equations (SIR approximation model) [22]. The latter two were the exponential growth (exponential with lockdown) model and the SIR (SIR with lockdown) model, incorporating the lockdown effect by changing the growth rate/transmission parameter before and after the lockdown. In these models, we assumed the start of the lockdown was January 28, 2020, the date that all prefecture-, county-level cities, and an autonomous prefecture, except for the Shennongjia Forestry District, in Hubei Province were subject to it [ [18,20] to fit the observed data and predict the epidemic. This model is known for real-time prediction of outbreak and real-time detection of turning points. The model's basic premise is that the incidence curve consists of a single peak of high incidence, resulting in an S-shaped epidemic curve and a single turning point of the outbreak [18]. The cumulative incidence was expressed as the following formula: where t cum C is the cumulative number of infected cases at time t in days; K is the carrying capacity or total number of cases in the outbreak; r is the per capita growth rate of the infected population; and a is the exponent of deviation from the standard logistic curve. ti is the inflection point of the which is generally a symmetrical, bell-shaped curve, where β and γ are the transmission parameter and recovery rate, respectively, and S0 and I0 are the initial number of susceptible and infected individuals, respectively. /    = is the relative recovery rate; thus, the basic reproduction number R0, can be calculated as follows 00 / RS = .

Exponential with lockdown model
We employed the intervention, lockdown, effect in mechanistic models. First, we used an exponential model by dividing newly infected people before and after the lockdown. We modeled the number of newly infected people per day as: where i0 is the initial number of infected people, r1 and r2 are the growth rate before and after implementation of the lockdown, tlockdown. R0, can be calculated using the formula [10,25], where v is the coefficient of variation of the generation time [26], for which we adopted 0.5.

SIR with lockdown model
We then applied the time discrete SIR model for newly infected cases per day. The following equation was used for the model: where St, It, and Rt are, respectively, the numbers of populations in the susceptible, infected, and recovered compartments on day t. γ is the recovery rate. N indicates the population size of China-1.4 billion-and is equal to the total population of the S, I, and R compartments. We expressed newly infected cases (inew) at time t+1 as the following equation: where β1 and β2 are transmission parameters before and after the start of the lockdown. Not all infected carriers were reported, but the ascertainment rate, p, can be seen as about 10% [27], and the carrying capacity of reported cases would be approximately 10% of all infected carriers. We treated the number of inew×p individuals as the observed data, considering the ascertainment rate. In the SIR model, R0, was calculated using the formula 01 / R  = .
Note that for the estimation of R0, we did not differentiate the impact of various interventions, i.e., different effectiveness of lockdown policy and other countermeasures such as contact tracing, isolation and mask waring. Because the estimation of intrinsic R0 without any impact of interventions cannot be attained, here it should be noted that our estimate reflects underlying interventions.

Convolution for the mechanistic models
As the number of reported cases in China were the observed data, the probability density where g(t) and h(t) are the probability density functions of the incubation period (mean: 5.6 days, standard deviation: 3.9 days [28]) and the onset-report delay (mean: 4.9 days, standard deviation: 3.3 days [29]).

Calibration
Using the above formulae, we fitted the observational reported data to Ccum(t) of the Richards model and Cnew(t) of the SIR approximation model and the two mechanistic models (i.e., exponential and SIR with lockdown models) using maximum likelihood estimations with Poisson errors. We performed a bootstrapping method [30] with 10,000 iterations to have 95% confidence intervals (CIs) for the parameters estimated; this produced epidemic curves with 95% CIs. We used R statistical software (R Foundation for Statistical Computing, Vienna, Austria) [31] to perform these analyses.

Prediction assessment
We generated a 7-day forecast from each model for calibrating three different data cutoff points to evaluate the forecasting capability during the course of infection. The cutoff points were: February 1 (before the epidemic peak), 5 (peak), and 9 (after the peak). We compared root mean squared errors (RMSEs) and relative RMSEs for the forecasting among models calibrated using three different data periodseach cutoff datum. We defined RMSE as follows: while we defined relative RMSE as [32]: where ˆt c is the number of newly reported cases forecasted and ct is those observed. As the reporting definition in China was changed after February 13 and the reporting rate changed, 1/α is unknown, and we consequently performed a sensitivity analysis assuming α of 1, 0.9, 0.8, and 0.7. To calculate RMSE and relative RMSE, we used ct multiplied by α for the changed reporting rate period. We omitted the reported number of cases on February 13 and 14 from the analysis, with the assumption they were outliers. Table 1 shows the R0 estimated from the four models above (Richards, SIR approximation, exponential with lockdown, and SIR with lockdown) using three different data periods (cutoff dates of February 1, 5, and 9). The R0s of the phenomenological models (Richards and SIR approximation models) decline, as the data points used for the calibration vary from before the epidemic peak to after (R0: 2.6-7.7), while those for the mechanistic models (exponential with lockdown and SIR with lockdown) are relatively stable (R0: 2.4-3.3), irrespective of the data period used. Table S1 shows all results of the other parameters estimated. 52) The 95% confidence interval derived from profile likelihood is given in parentheses. The estimated incidence represents infection, inclusive of mild and asymptomatic cases. SIR: susceptible-infected-recovered. Figure 1. Estimated number of cases from calibration and 7 days forecasting from the Richards, susceptible-infected-recovered (SIR) approximation, exponential with lockdown, and SIR with lockdown models in China by date of reporting. Calibrations were conducted using three different data cutoff points: February 1 (red), 5 (green), and 9 (blue). Solid lines with shaded areas show medians and 95% confidence intervals for calibration, while dashed lines with light-shaded areas show medians and 95% prediction intervals for forecasting. Gray bars show the number of cases by reporting date, and those on February 13 and 14 were omitted for the forecasting period, with the assumption they were outliers. Figure 1 shows the model fit to the observed data and 7 days of forecasting. Table 2 shows RMSEs and relative RMSEs calculated for the forecasting. When we used the reported data by February 1 (before the epidemic curve peak) and February 5 (peak), the Richards model had the highest RMSE value (15362 and 1375), whereas when we used the data by February 9 (after the peak), the SIR approximation model had the highest RMSE value. This trend was stable throughout the different reporting rates used for the sensitivity analysis (1529, 1402, 1259, and 1121 with α of 1, 0.9, 0.8, and 0.7, respectively). The exponential with lockdown model had the lowest RMSE, except for when we used the data by February 9 without the reporting rate adjustment, in which the SIR with lockdown model had the lowest RMSE (627). The results of relative RMSE had almost the same trend as for RMSE, except when the data by February 9 were used with a reporting rate adjustment of α = 0.9. Throughout the data cutoff points, the mechanistic models tended to have lower RMSE/relative RMSE values than the phenomenological models except for the following: the RMSE of the phenomenological Richards model (652, 500) was lower than that of the mechanistic SIR with lockdown model (882, 1025) when αs were 0.8 and 0.7 from the sensitivity analysis; and the relative RMSE of the Richards model (0.41) was lower than that of the SIR with lockdown model (0.49) when α was 0.7.  February 15, the numbers of reported cases observed were adjusted by multiplying α because of the definition change of reported cases for the sensitivity analysis (i.e., the reporting rate changed was assumed as 1/α). The reported case data for February 13 and 14 were omitted for the RMSE and relative RMSE calculations, with the assumption they were outliers. SIR: susceptible-infected-recovered.

Discussion
In this study, simple mechanistic models that account for the lockdown effect forecasted the short-term future incidence of COVID-19 better than the phenomenological models. This finding was consistent irrespective of the epidemic phases (i.e., data points used for the calibration). In the data-limited setting, such as in the early phase of the epidemic in China, phenomenological models were useful because they required only the incidence data of the infectious disease and they captured the overall epidemic trajectory. Once interventions or other factors that may influence the transmission patterns (e.g., the lockdown), occurred, however, the epidemic dynamics changed considerably and it became essential to account for the change in the forecasting models.
The Richards model forecasted the epidemic better than the SIR approximation (other phenomenological) model when data were available until after the epidemic peak. The Richards model, also known as the generalized logistic model, has been used to predict the spread of infectious diseases during previous epidemics, such as with foot-and-mouth disease in the United Kingdom [33] and the Ebola outbreak in West Africa [15,19,34,35]. It was also applied to forecast the incidence of the ongoing COVID-19 epidemic (now pandemic) [12,13]. The Richards model is well known for its ability to express flexibility of epidemic curves, meaning a deviation from the standard logistic curve can be captured [18] and fluctuations in data at hand before the epidemic peak greatly affect the final size (i.e., data can dramatically alter the final size). In this study, however, once the epidemic passed its peak, the Richards model forecast was comparable with the SIR with lockdown model. The SIR approximation model forecasted the epidemic better than the Richards model when epidemic data used for the calibration preceded the epidemic peak. Although the SIR approximation model has less interpretive ability compared with the Richards model, and generally is simply a symmetrical bell-shaped curve, there is potential merit in having this modeling option to evaluate the forecasting especially in the epidemic's early phase (i.e., before the peak).
In the sensitivity analysis considering the reporting rate change (1/α) from February 13, the trend of forecasting ability was not consistent between the SIR with lockdown and Richards models (see Table 2). The RMSEs/relative RMSEs of the SIR with lockdown model were smaller than those of the Richards model when αs were 0.9/0.9-0.8, but the result was reversed when αs were 0.8-0.7/0.7, although the RMSE/relative RMSE differences between the two models were small. This was because the SIR with lockdown model overestimated the number of reported cases while the Richards model underestimated it. Overestimation from the SIR with lockdown model could be due to the great number of individuals in the susceptible compartment, assuming that S0 is equal to the total population in China and homogeneous mixing. The exponential with lockdown model did not need these assumptions (homogeneous/heterogeneous mixing or the number of susceptible individuals in China's total population), and most effectively forecasted the future incidence. This model had both mechanistic and phenomenological model aspects, as the intervention mechanism was incorporated into the exponential curve. Note that the data used for the analyses were the reported numbers of cases in China, and an under-ascertainment rate may influence data. We assumed the constant under-ascertainment rate, which might be in reality time dependent, although, due to the relatively short time horizon of our study period, the time dependence may not have a large impact. When further epidemiological data (e.g., time-dependent seroprevalence) are available, validation of the observed data and forecasting should be addressed in further research.
When one model overestimates the number of cases and the other underestimates it, relative RMSE maybe suitable for the model validation. This is because the absolute number of the difference between prediction and observation, which is used in RMSE calculation, has a different meaning between the two models, even if they have the same value. In this study, this is evidenced in the RMSE of the SIR with lockdown model being larger than that of the Richards model when the data cutoff point was February 9 and α was 0.8, whereas the relative RMSE of the SIR with lockdown model was smaller than that of the Richards model in the same condition (i.e., data cutoff point: February 9; α: 0.8).
This study involved several limitations. First, as mentioned above, in this study, the mechanistic models generally forecasted better than the phenomenological models. In an epidemic's early phase, however, when there is high uncertainty in the underlying transmission dynamics, forecasting future incidence with an SIR mechanistic model without any intervention effect may produce biased results [11]. The effect of China's lockdown assessed in this study also was clear and had substantial impacts. Note that if unknown effects that influence the disease transmission are not employed into mechanistic models, the better forecasting observed in the mechanistic models may change (i.e., forecasting could be less accurate). Second, short-term forecasting during the very early period of pandemic was conducted in this study. Other published studies explored the heterogeneity of populations and multiple factors that influence the epidemic dynamics [36,37], which we assumed unavailable yet during the process of forecasting. It should be noted that consistent result to ours may not be obtained if the scope is extended to longer period of forecasting. Third, the R0 values estimated from our model might be influenced by interventions other than lockdown measure in China. The R0 from the phenomenological model was obtained when an inflection point was properly identified. Due to the uncertainty of the incidence data and the extent of interventions implemented in the early phase of the epidemic, the estimate has fluctuated substantially. Therefore, the R0 values estimated were comparable between phenomenological and mechanistic models only when data after the epidemic peak was used; however, this was not the case when the dataset before epidemic peak was used. The R0 estimation from the phenomenological model was influenced by the above-mentioned issue (i.e., data uncertainty and the extent of interventions implemented at the time in the early phase), while the R0 from the mechanistic model reflected overall intervention effect before lockdown measure (thus, more stable regardless of the data cutoff points). We did not account for this time-varying intervention effect [38,39] in the mechanistic model, because our exercise assumed a shortage of information in the very beginning of an epidemic.

Conclusions
In conclusion, this study investigated the forecasting ability of phenomenological and simple mechanistic models. The mechanistic models considering a lockdown effect forecasted better than the phenomenological models. To effectively capture disease dynamics, integrated interpretations from both phenomenological and mechanistic models are required as factors such as the epidemic's phase, interventions implemented, and population behavior influence the results of forecasting.