Introduction

Background

After the emergence of COVID-19 in December 2019, the disease spread rapidly across the globe, becoming a serious public health event that endangers humans (Zhu et al. 2020). On January 30, 2020, the World Health Organization announced that the novel coronavirus outbreak was being classified as a “public health emergency of international concern (PHEIC)” (Lu et al. 2020).

Up to March 31, 2020, a total of 82,631 confirmed cases of COVID-19, including 3321 deaths, were reported in China (Health Emergency Office 2020). As of 2020, more than 200 countries worldwide have been adversely affected by COVID-19 (Lonergan and Chalmers 2020). The first wave of COVID-19 transmission was effectively controlled in China owing to strict public policies and measures (Tian et al. 2020a, b). Because the outbreak was unprecedented, an accurate assessment of the severity of the disease was, and remains, necessary, especially in the early stages of the pandemic.

The case fatality rate (CFR) is an index of disease severity that measures the percentage of deaths among all patients diagnosed with the disease during a certain period. For chronic diseases with long courses, “a certain period” can be as long as one year. For infectious diseases of short duration and rapid progression, the CFR is usually estimated after the disease has been brought under control or has disappeared at the end of the pandemic season, so that it reflects the true disease severity. A CFR calculated before the end of the pandemic, that is, the number of deaths divided by the total number of confirmed cases, is called the naive CFR and does not represent the true CFR (Kucharski and Edmunds 2014). The naive CFR ignores patients who are still hospitalized, each of whom will eventually either recover or die. The length of stay for patients with COVID-19 in China usually exceeds 10 days (Rees et al. 2020). After COVID-19 emerged, various media reports and published studies (The Novel Coronavirus Pneumonia Emergency Response Epidemiology Team 2020; Tian et al. 2020a, b) used the CFR to measure the severity of the disease. As of February 10, 2020, 42,708 total confirmed cases, 1017 deaths, 3998 recovered cases, and 37,693 hospitalized cases of COVID-19 had been reported in China. According to these data, the CFR of COVID-19 was 2.38% (Special Expert Group for Control of the Epidemic of Novel Coronavirus Pneumonia of the Chinese Preventive Medicine Association 2020). However, 37,693 (88.26%) of the cases in the denominator of this calculation were still hospitalized and still at risk of death, a factor that the calculation does not take into account. Estimating the true CFR with this approach is therefore limited, because the denominator includes a large number of confirmed cases that have not yet developed an outcome of interest.

However, the discharged case fatality rate (DCFR) can be calculated while the epidemic is still evolving, which makes it possible to compensate for the shortcomings of the naive CFR, even though the DCFR does not reflect the actual status of the epidemic as truly and accurately as the true CFR. The calculation of the DCFR uses only the number of cases with clinical outcomes and is not influenced by the number of cases without clinical outcomes. Therefore, we developed a new epidemiological method to assess the overall situation of COVID-19 in a timely manner. In our previous study, the DCFR was introduced as a new evaluation index instead of the CFR to analyze the severity of the COVID-19 pandemic in Italy (Yan et al. 2022). This is a valid approach, since the epidemiology of infectious disease focuses not only on severity but also on trend (Adiga et al. 2020; Wu and Cowling 2011).

Objective

The purpose of this study was therefore to explore the value of DCFR in estimating the severity of COVID-19 and assessing the epidemic trend of COVID-19 in China.

Methods

Data source

Epidemiological data on COVID-19 in China, and Hubei Province in particular, were obtained from the National Health Commission of the People’s Republic of China from January 20, 2020, to March 31, 2020 (Health Emergency Office 2020). Data for countries other than China for the same period were obtained from Johns Hopkins University (Johns Hopkins University 2020).

Statistical analysis

The CFR is the proportion of people who die from a certain disease among the people with that disease in a certain period. For a disease with a long course, this period can be a year, whereas for a disease with a short course it can be months or even days. The DCFR is the proportion of deaths among discharged cases, where discharged cases comprise both deaths and recovered cases. The DCFR includes the total discharged case fatality rate (tDCFR), daily discharged case fatality rate (dDCFR), and stage discharged case fatality rate (sDCFR). The tDCFR is the proportion of deaths among discharged cases over the entire pandemic, the dDCFR is the proportion of deaths among discharged cases on any one day, and the sDCFR is the proportion of deaths among discharged cases at any particular stage (Yan et al. 2022).

The case fatality rate (CFR), total discharged case fatality rate (tDCFR), daily discharged case fatality rate (dDCFR), and stage discharged case fatality rate (sDCFR) were calculated and analyzed as follows.

$$\mathrm{CFR}=\frac{\text{\# of total deaths attributed to COVID-19}}{\text{\# of total confirmed cases of COVID-19}}\times 100\%$$
$$\mathrm{tDCFR}=\frac{\text{\# of total discharged deaths attributed to COVID-19}}{\text{\# of total discharged deaths attributed to COVID-19}+\text{\# of total discharged recovered cases}}\times 100\%$$
$$\mathrm{dDCFR}=\frac{\text{\# of daily discharged deaths attributed to COVID-19}}{\text{\# of daily discharged deaths attributed to COVID-19}+\text{\# of daily discharged recovered cases}}\times 100\%$$
$$\mathrm{sDCFR}=\frac{\text{\# of total discharged deaths at each stage attributed to COVID-19}}{\text{\# of total discharged deaths at each stage attributed to COVID-19}+\text{\# of total discharged recovered cases at each stage}}\times 100\%$$
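
As an illustration of how the CFR and DCFR definitions differ in practice, the following minimal Python sketch applies them to the counts reported for China on February 10, 2020 and March 31, 2020 in the Background. The function names `cfr` and `dcfr` are ours and purely illustrative, and the DCFR value computed for February 10 is derived here from the reported counts rather than quoted from any cited source.

```python
def cfr(deaths, confirmed):
    """Naive case fatality rate (%): deaths over all confirmed cases."""
    return 100.0 * deaths / confirmed


def dcfr(deaths, recovered):
    """Discharged case fatality rate (%): deaths over cases with an outcome."""
    return 100.0 * deaths / (deaths + recovered)


# Counts reported for China as of February 10, 2020 (see Background).
feb10 = {"confirmed": 42708, "deaths": 1017, "recovered": 3998}

naive_cfr = cfr(feb10["deaths"], feb10["confirmed"])
t_dcfr = dcfr(feb10["deaths"], feb10["recovered"])
print(f"naive CFR = {naive_cfr:.2f}%, tDCFR = {t_dcfr:.2f}%")
```

With these counts the naive CFR is about 2.38%, whereas the tDCFR is about 20.28%, because the 37,693 cases still in hospital appear in the CFR denominator but not in the DCFR denominator. The same `dcfr` function gives the dDCFR or sDCFR when it is passed daily or stage-specific counts.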

The CFR, tDCFR, dDCFR, and sDCFR were estimated with 95% confidence intervals (CIs). A CI is a range, constructed at a given confidence level, that is expected to contain the population parameter, and it is widely used to estimate the range of population parameters (Nakagawa and Cuthill 2007; Sim and Reid 1999). We calculated the 95% CIs using a normal approximation method and the following formula (Moreno-Kustner et al. 2019; Simon 1986):

$$95\%\ \mathrm{CI}=\left(p-Z_{\alpha/2}S_p,\ p+Z_{\alpha/2}S_p\right)$$

where \(p=n/N\) (n being the number of deaths and N the corresponding denominator), \(Z_{\alpha/2}=1.96\) for a 95% CI, and \(S_p=\sqrt{p\left(1-p\right)/N}\).
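
The normal-approximation interval can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `proportion_ci` is an assumption of ours, and the example uses the cumulative counts for China up to March 31, 2020 given in the Background (3321 deaths among 82,631 confirmed cases).

```python
import math


def proportion_ci(n_events, n_total, z=1.96):
    """Normal-approximation CI for a proportion p = n_events / n_total."""
    p = n_events / n_total
    s_p = math.sqrt(p * (1 - p) / n_total)  # standard error of p
    return p - z * s_p, p + z * s_p


# CFR for China up to March 31, 2020: 3321 deaths among 82,631 confirmed cases.
low, high = proportion_ci(3321, 82631)
print(f"95% CI: {100 * low:.2f}% to {100 * high:.2f}%")
```

For this example the interval runs from about 3.89% to 4.15%. The approximation is least reliable when N is small, as in the first days of an outbreak, when the lower bound can even fall below zero.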

To estimate pandemic stages from the dDCFR, we applied the pruned exact linear time (PELT) approach (Killick et al. 2012) to search for changes in the mean and variance of the dDCFR, as implemented in the changepoint R package (version 2.2.3). All statistical analyses were performed in R (version 3.6.3, R Core Team). The specific parameters were:

cpt.meanvar(Data, penalty = "Asymptotic", pen.value = 0.01, method = "PELT", class = TRUE, Q = 5, test.stat = "Normal")

Statistical significance was set at p < .05.
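
To make the change-point search more concrete, the following is a minimal pure-Python sketch of PELT with a normal mean-and-variance cost. It is not the changepoint R package and does not reproduce its "Asymptotic" penalty. The function names (`segment_cost`, `pelt`), the BIC-style penalty `3 * log(n)`, the minimum segment length of 2, and the toy series are all assumptions made for illustration and are not study data.

```python
import math


def segment_cost(csum, csum_sq, start, end):
    """Twice the negative maximised normal log-likelihood of values[start:end]."""
    n = end - start
    mean = (csum[end] - csum[start]) / n
    var = (csum_sq[end] - csum_sq[start]) / n - mean ** 2
    var = max(var, 1e-8)  # guard against log(0) on constant segments
    return n * (math.log(2 * math.pi) + math.log(var) + 1)


def pelt(values, penalty, min_len=2):
    """Return the start index of every segment after the first."""
    n = len(values)
    csum, csum_sq = [0.0], [0.0]
    for v in values:
        csum.append(csum[-1] + v)
        csum_sq.append(csum_sq[-1] + v * v)

    best = [math.inf] * (n + 1)  # best[t]: optimal penalised cost of values[:t]
    best[0] = -penalty
    last = [0] * (n + 1)         # last[t]: start of the final segment in that optimum
    candidates = [0]
    for t in range(min_len, n + 1):
        scores = {s: best[s] + segment_cost(csum, csum_sq, s, t)
                  for s in candidates}
        s_star = min(scores, key=scores.get)
        best[t] = scores[s_star] + penalty
        last[t] = s_star
        # Pruning step: drop starts that can no longer be optimal later on.
        candidates = [s for s in candidates if scores[s] <= best[t]]
        nxt = t - min_len + 1    # becomes a legal segment start at time t + 1
        if nxt >= min_len:
            candidates.append(nxt)

    # Walk back through the stored segment starts to recover the change points.
    change_points, t = [], n
    while last[t] > 0:
        change_points.append(last[t])
        t = last[t]
    return sorted(change_points)


# Toy series (hypothetical values): a high level followed by a low level.
toy = [50, 52, 48, 51, 49, 50, 52, 48, 10, 11, 9, 10, 12, 8, 10, 11]
print(pelt(toy, penalty=3 * math.log(len(toy))))
```

On the toy series the sketch reports a single change at index 8, where the level drops. The number of change points found depends strongly on the penalty, so results from this sketch should not be expected to match those obtained with the package settings above.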

Results

CFR, tDCFR, and dDCFR

As of March 31, 2020, the CFRs in China overall, inside Hubei Province, and outside Hubei Province were 4.02%, 4.71%, and 0.86%, and the tDCFRs were 4.16%, 4.80%, and 0.97%, respectively. Figure 1 presents the trends of the CFR, tDCFR, and dDCFR of COVID-19 in different regions from January 20, 2020, to March 31, 2020. The CFR, tDCFR, and dDCFR of COVID-19 fluctuated greatly at high levels in the first week, from January 20 to January 26. The CFR of COVID-19 decreased between January 26 and February 10 and increased again after February 11. The overall CFR trend was very clear, with an initial decrease followed by an increase to a stable level. In China, for example, the CFR was lowest on February 5 at 2.01% and highest on March 31 at 4.02%. The tDCFR and dDCFR were high at first and then gradually decreased after January 26. Their overall trend was also very clear, with an initial increase followed by a decrease to a stable level. In China, the highest tDCFR was 63.86% on January 27 and the lowest was 4.16% on March 31.

Fig. 1 COVID-19 trends of CFR, dDCFR, and tDCFR in different regions

Estimating pandemic stages from dDCFR and calculating sDCFR

We identified potential changes in the dDCFR by applying PELT (Killick et al. 2012). The pandemic period in China, and inside and outside Hubei Province, could be classified into four potential stages based on three identified change points, namely February 3, February 15, and February 23 (Fig. 2). The first stage, named the spread stage, lasted 14 days, from January 20 to February 2; the second stage, named the epidemic stage, lasted 12 days, from February 3 to February 14; the third stage, named the decline stage, lasted eight days, from February 15 to February 22; and the fourth stage, named the sporadic stage, lasted 38 days, from February 23 to March 31. Moreover, we calculated the sDCFR at each stage based on the number of deaths and recovered cases in the different regions (Table 1, Fig. 3). The sDCFR of the earlier stages was significantly greater than that of the later stages in all regions, p < .001 (Table 2, Fig. 3).
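
The stage boundaries and durations follow directly from the study period and the three change points. The short Python sketch below derives them with date arithmetic. The helper name `stage_table` is ours and illustrative, and each change point is assumed to be the first day of a new stage, as in the text above.

```python
from datetime import date, timedelta

study_start, study_end = date(2020, 1, 20), date(2020, 3, 31)
change_points = [date(2020, 2, 3), date(2020, 2, 15), date(2020, 2, 23)]
stage_names = ["spread", "epidemic", "decline", "sporadic"]


def stage_table(start, end, change_points, names):
    """Return (name, first day, last day, length in days) for each stage."""
    starts = [start] + change_points
    ends = [cp - timedelta(days=1) for cp in change_points] + [end]
    return [(name, s, e, (e - s).days + 1)
            for name, s, e in zip(names, starts, ends)]


stages = stage_table(study_start, study_end, change_points, stage_names)
for name, first, last_day, length in stages:
    print(f"{name:9s} {first} to {last_day} ({length} days)")
```

The four lengths sum to the 72 days of the study period. The sDCFR of a stage is then the DCFR computed from the deaths and recovered cases reported between that stage's first and last day.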

Fig. 2 Estimation of pandemic stages using dDCFR of China. Each red line marks a specific stage period

Table 1 COVID-19 sDCFR (%) of different stages in different regions of China
Fig. 3 COVID-19 sDCFR of four stages in different regions

Table 2 Chi-square test for sDCFR (%) of different stages of COVID-19 in China

The sDCFR decreased from stage to stage in all regions. For example, in China the sDCFR was 43.18% in the spread stage, 13.23% in the epidemic stage, 5.86% in the decline stage, and 1.61% in the sporadic stage.

The DCFR could be used to assess pandemic trends and future deaths. In the different regions, the sDCFR of the fourth stage was closer to the CFR of the cases under treatment than were the sDCFRs of the first three stages, and it could therefore be used to estimate the number of deaths among discharged cases.

COVID-19 pandemic in China and other countries

We selected COVID-19 data from Johns Hopkins University for the ten countries with the highest total number of confirmed cases as of March 31, 2020 (Table 3, Fig. 4). The United States had the highest number of total confirmed cases (192,079), followed by Italy (105,792), Spain (95,923), China (82,631), Germany (71,808), France (51,579), Iran (44,605), the United Kingdom (38,484), Switzerland (16,605), and Belgium (12,775). The United Kingdom had the highest tDCFR (94.78%), China had the lowest tDCFR (4.16%), and the tDCFRs of the other countries fell between these two values. In every country except China and Germany, the tDCFR was much higher than the CFR.

Table 3 COVID-19 situation outside China
Fig. 4 Epidemic situation in the top 10 countries based on total number of confirmed cases

Discussion

Principal findings

We found that the CFR, tDCFR, and dDCFR of COVID-19 fluctuated greatly at high levels in the first week, from January 20 to January 26. This was because the early discharged cases were dominated by deaths and the number of discharged cases was small. The CFR of COVID-19 decreased between January 26 and February 10, owing to the increasing number of recovered cases. The CFR then increased after February 11, owing to the increasing number of deaths and the decreasing number of confirmed cases. In short, the overall CFR trend was an initial decrease followed by an increase to a stable level. Both the tDCFR and dDCFR were high at first owing to a large number of deaths, and then gradually decreased after January 26 owing to a large number of recovered cases. The overall DCFR trend was an initial increase followed by a decrease to a stable level.

We found that the trends of the two indicators, CFR and DCFR, were opposite. This is because the denominator of the DCFR includes only discharged cases, which makes it a proportion indicator, whereas the denominator of the overall CFR includes all cases, which makes it a rate indicator. Despite these differences, the DCFR is a more rapid indicator of the severity of COVID-19 than the CFR, especially in the middle and late stages of the epidemic, when the DCFR is very close to the CFR.

The DCFR, which is determined by two factors, namely the number of deaths and the number of recovered cases, is useful to estimate the severity and control effect of COVID-19 at an early stage, and DCFR can reflect the actual situation of the current epidemic dynamics to some extent. Our study found that DCFR could be used to reflect the epidemic trend of COVID-19, classified as a spread stage, epidemic stage, decline stage, and sporadic stage according to the PELT, thus providing a reference for the rational allocation of medical resources.

In the current study, DCFR is introduced as a new and crucial concept. Our DCFR results indicate an overall decreasing trend in all four stages of the pandemic. Possible reasons for this are the characteristics of early cases (Li et al. 2020), improvements in diagnostic and treatment measures (General Office of the National Health Commission 2020), government intervention (Ali et al. 2020; Yan and Zhao 2020; Zhao et al. 2020), and an increase in the proportion of less virulent viruses (Tang et al. 2020). This means that it is reasonable to use the DCFR to estimate pandemic trends.

The numerator in the true CFR formula is a part of the denominator, so the results are not very accurate when used for dynamic data, especially in the early stages of an epidemic. Another way to calculate CFR is to follow up on daily confirmed cases to obtain their clinical outcomes, but these data are not easily available and the calculation method is more cumbersome.

Moreover, because there are many confirmed cases without clinical outcomes, there may be a delay in the emergence and reporting of deaths (Battegay et al. 2020), so that using the CFR underestimates disease severity. Therefore, we listed the top 10 countries by total number of confirmed cases and calculated the tDCFR for each country to compare the DCFR with the CFR. The United Kingdom had the highest tDCFR (94.78%) because it was in the outbreak stage, China had the lowest tDCFR (4.16%) because it was in the sporadic stage, and the tDCFRs of the other countries fell between the two because they were in the epidemic or decline stages. In addition, the tDCFR is much higher than the CFR in the spread and epidemic stages and close to the CFR in the decline and sporadic stages. This means that DCFRs from regions or countries at different stages are not comparable.

We also found that tDCFR was much higher than CFR in all countries at the early stages. Such results suggest that the use of either DCFR or CFR alone is inaccurate when assessing the severity of early-stage COVID-19. DCFR would overestimate its severity and CFR would underestimate it. The DCFR may have a large margin of error at the beginning of the pandemic when the number of discharged cases is extremely low, but the error decreases as the number of discharged cases increases.

Although DCFR does not accurately reflect the true mortality rate and is not comparable at different stages, our findings suggest that the DCFR can be used to analyze the true severity of COVID-19 in combination with CFR, which is important for evaluating pandemic trend and drawing more realistic conclusions. Until now, the number of total confirmed cases and CFR have been used to evaluate the control effect (Flaxman et al. 2020; Hsiang et al. 2020; Ram et al. 2020; Team 2020).

Due to the unavailability of full data access in the earlier stage, we plan to conduct further critical analysis. In the next step, we will perform DCFR analysis for detailed categories, such as age and sex, in a relatively large database and over a longer period.

Limitations

Inevitably, our study has several limitations. First, because of the uncertainty surrounding whether officially reported deaths were clinically attributed to COVID-19, the DCFR does not solve some known epidemiological problems of COVID-19. This should be considered when applying the index to other countries. Second, our data did not include demographic characteristics such as age and sex, or local risk factors such as pollution, climate, temperature, wind, relative humidity, and local management capacity, so adjusted DCFRs could not be calculated.

Conclusions

In conclusion, the DCFR is an indicator that we first proposed to compensate for the fact that the CFR severely underestimates disease severity and the mortality rate during the outbreak phase. Because the DCFR may overestimate the mortality rate at that stage, combining the DCFR with the CFR can more accurately assess the severity of emerging infectious diseases and analyze their dynamic trends.