Evaluations of heterogeneous epidemic models with exponential and non-exponential distributions for latent period: the Case of COVID-19

Abstract: Most heterogeneous epidemic models assume exponentially distributed sojourn times in the infectious states, which may not be realistic and could affect the predicted dynamics of an epidemic. This paper investigates the potential discrepancies between exponential and non-exponential distribution models when they are used to analyze the transmission patterns of infectious diseases and to evaluate control measures. Two SEIHR models with multiple subgroups are established under different assumptions for latency: Model I, the exponential distribution model (EDM), assumes an exponential distribution of latency, while Model II, the gamma distribution model (GDM), assumes a gamma distribution. To overcome the challenges associated with the high dimensionality of GDM, we derive the basic reproduction number (R_0) of the model theoretically and apply numerical simulations to evaluate the effect of different interventions on EDM and GDM. Our results show that considering a more realistic gamma distribution of latency can change the peak number of infections and the timescale of an epidemic, and that GDM may underestimate the infection eradication time and overestimate the peak value compared to EDM. Additionally, the two models can produce inconsistent predictions of the time to reach the peak. Our study contributes to a more accurate understanding of disease transmission patterns, which is crucial for effective disease control and prevention.


Introduction
Throughout history, infectious diseases have posed a threat to human existence and have even influenced the course of history [1]. For example, the Black Death epidemic in Europe in the 14th century killed more than 24 million people [2]. Now, with the outbreak of COVID-19, which has already caused unprecedented public health challenges worldwide [3], it is more important than ever to study the prevalence and spread of infectious diseases.
Mathematical modeling is an important approach to understanding the transmission patterns of epidemics [4]. Establishing and analyzing epidemic models plays a significant role in revealing the epidemic law of diseases and predicting their development trends [5]. Furthermore, it helps to identify the causes and key factors of disease epidemics and provides theoretical guidance in the search for appropriate control measures [5].
Epidemic models generally assume that the population is homogeneous or randomly mixed, which allows the model to ignore population heterogeneity such as gender differences, regional variations and age differences [6]. However, because homogeneous models are oversimplified and idealized, their results sometimes deviate from the actual situation [4]. Especially for emerging diseases such as MERS, SARS, and COVID-19, population heterogeneity has an important impact on the epidemic law of diseases and on the assessment of prevention and control measures. Therefore, it is crucial to consider heterogeneity in epidemiological models in order to solve practical problems [4].
In recent years, much progress has been made in investigating how heterogeneity influences the dynamic behavior of infectious diseases. Lajmanovich et al. [7] described the effect of heterogeneity on infectious diseases by establishing a model of infectious diseases in n groups. On this basis, Nold [8] discussed the dynamic behavior of a class of disease models by considering the heterogeneity of infectious disease transmission in the population. Wang et al. [9] considered age heterogeneity in a COVID-19 model and further evaluated the effectiveness of control measures. Cui et al. [4] developed a meta-population model that considered latent and asymptomatic individuals separately and investigated the impact of heterogeneity factors. Dimarco et al. [10] analyzed the effect of heterogeneity on the transmission pattern of infectious diseases by combining social contacts and epidemic dynamics. Other studies related to heterogeneity can be found in the literature [6,11-16].
Despite this extensive literature on heterogeneous epidemic models, most of these references assume an exponential distribution of latency, which is not realistic for many infectious diseases. Indeed, the distribution of latency may not be exponential [17], and models with an exponential distribution may lead to bias in understanding the dynamic behavior of diseases and in evaluating control measures [18]. Moreover, a number of studies support a non-exponential latency distribution for COVID-19 [19-23]: in [23], fitting the latency data of 109 COVID-19 cases suggests that the gamma distribution may be the optimal distribution of COVID-19 latency. Therefore, it is meaningful to consider a non-exponentially distributed latency in the study of infectious diseases.
In recent years, several studies have considered latency that follows a non-exponential distribution. Feng et al. [24] developed a model with disease stages following a general distribution and investigated how exponential and non-exponential stage distributions affect the evaluation of the effectiveness of interventions. Feng et al. [25] further derived the basic reproduction number for disease stages following a non-exponential distribution, providing threshold conditions for outbreak control and prevention. Capistran et al. [26] presented a COVID-19 mathematical model with a non-exponential distribution of disease stages in order to forecast hospital demand in urban regions throughout the COVID-19 pandemic and to estimate lockdown-induced second waves. These studies provide an in-depth analysis of epidemic models in which the latent duration admits a non-exponential distribution, but they do not consider the influence of heterogeneous factors on the model.
On the other hand, Blyuss et al. [27] used an SEIR-type heterogeneous mathematical model assuming non-exponentially distributed disease stages in order to explore the dynamics and control of COVID-19. Chen et al. [28] investigated COVID-19 transmission dynamics using mathematical models with heterogeneous susceptible populations and a non-exponentially distributed infectious period. None of the above studies has considered the potential discrepancies between exponentially and non-exponentially distributed heterogeneous models when the models are applied to reveal patterns of transmission of infectious diseases, predict their trends, and evaluate control measures. How a non-exponential distribution of latency in heterogeneous models affects outbreak transmission dynamics and interventions, compared to an exponential distribution, remains a worthwhile question. To our knowledge, this problem has not been well researched.
In this paper, motivated by the above issues, we investigate the potential differences between exponentially distributed and non-exponentially distributed heterogeneous models when they are used to understand patterns of transmission of infectious diseases and to evaluate control measures. To the best of our knowledge, the main innovation of this paper compared to the existing literature is that it considers for the first time the differences between latent periods following exponential and non-exponential distributions in heterogeneous models. We believe that this research may bring a new viewpoint to the study and control of diseases.
The paper is organized as follows. Section 2 establishes the heterogeneous model with m subpopulations whose disease stages follow a general distribution. We then reduce the general distribution model to the exponential distribution model (EDM) and the gamma distribution model (GDM), and give a detailed theoretical derivation of the basic reproduction number R_0. In Section 3, the partial rank correlation coefficient (PRCC) is used to examine how the sensitivity of parameters changes during disease transmission, and possible control strategies are evaluated by sensitivity analysis. Next, in order to investigate how control parameters affect the dynamic behavior of infectious diseases and to identify differences between the assessments given by the models, we use numerical simulations to compare EDM and GDM. Section 4 contains the discussion and concluding remarks.

In 2007, Feng et al. proposed a homogeneous model with generally distributed disease stages using a probabilistic approach [24]. Following this method, and in order to investigate the effects of non-exponential stage distributions in heterogeneous epidemic models, we establish an SEIHR model with m subgroups and arbitrarily distributed disease stages.

The model with general distribution of disease stages
Hypotheses:
1) The total population is divided into m groups, each of which includes the compartments S_i, E_i, I_i, H_i, R_i.
2) The initial conditions:
See Table 1 for more detailed definitions of the variables and parameters.
The SEIHR model with m subpopulations and arbitrarily distributed disease stages, referred to as system (2.1), consists of the following integral and differential equations:
The latency in system (2.1) follows a general distribution. Next, we replace the general distribution with an exponential distribution and with a gamma distribution to derive two ODE models with different distributions of the latency period.

Model with exponential distribution of latency (EDM)
Following Feng et al. [18,25,30], system (2.1) can be simplified to the ODE model (2.2) (see Figure 1 for the flowchart), which is obtained from model (2.1) when the latency period follows an exponential distribution (i, j denote different subpopulations):
Following [4,29], the basic reproduction number is obtained by the next generation matrix method. The rate of appearance of secondary infections and the rate of disease progression in disease compartment i are denoted by F_i and V_i (i = 1, 2, ..., m), respectively.
The disease-free equilibrium of system (2.2) is
The Jacobian matrices of F_i and V_i are evaluated at E_a. F and V are defined as the rate of appearance of secondary infections and the rate of disease progression in the total population, respectively:
The next generation matrix is FV^{-1}, and the basic reproduction number is defined as R_0 = ρ(FV^{-1}), where ρ denotes the spectral radius of a matrix.
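As a rough numerical illustration of the next generation matrix construction described above, the sketch below builds F and V for an m-group SEIHR structure and returns ρ(FV^{-1}). The explicit matrices are not reproduced in this text, so the compartment flows used here (E_i → I_i at rate α, I_i → H_i at rate γ, recovery from I_i and H_i at rate µ, hospitalized individuals infectious with relative weight κ), the function name and all numerical values are assumptions made for illustration only, not the fitted model.

```python
import numpy as np

def basic_reproduction_number(beta, S0, N, alpha, gamma, mu, kappa):
    """R0 = spectral radius of F V^{-1} for an assumed m-group SEIHR structure.

    Infected compartments are ordered (E_1..E_m, I_1..I_m, H_1..H_m).
    beta[i, j] is the assumed transmission rate from group j to group i.
    """
    beta = np.asarray(beta, dtype=float)
    S0 = np.asarray(S0, dtype=float)
    N = np.asarray(N, dtype=float)
    m = beta.shape[0]
    eye = np.eye(m)

    # F: new infections enter E_i, caused by I_j and (with weight kappa) H_j.
    new_inf = beta * S0[:, None] / N[None, :]
    F = np.zeros((3 * m, 3 * m))
    F[:m, m:2 * m] = new_inf
    F[:m, 2 * m:] = kappa * new_inf

    # V: transitions between and out of the infected compartments.
    V = np.zeros((3 * m, 3 * m))
    V[:m, :m] = alpha * eye                   # E_i leaves latency
    V[m:2 * m, :m] = -alpha * eye             # ... and enters I_i
    V[m:2 * m, m:2 * m] = (gamma + mu) * eye  # I_i is hospitalized or recovers
    V[2 * m:, m:2 * m] = -gamma * eye         # hospitalized I_i enters H_i
    V[2 * m:, 2 * m:] = mu * eye              # H_i recovers

    K = F @ np.linalg.inv(V)                  # next generation matrix
    return float(np.max(np.abs(np.linalg.eigvals(K))))

# Hypothetical two-group example (values are illustrative, not fitted).
beta = [[0.30, 0.10],
        [0.10, 0.40]]
N = [4.0e5, 6.0e5]
r0 = basic_reproduction_number(beta, S0=N, N=N, alpha=0.2, gamma=0.2,
                               mu=0.1, kappa=0.3)
print(f"R0 = {r0:.3f}")
```

Under these assumed flows and for a single group with S_0 = N, the spectral radius reduces to β(1/(γ+µ) + κγ/((γ+µ)µ)), which gives a quick sanity check of the matrix construction.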

Model with gamma distribution of latency (GDM)
Following Feng et al. [18,25,30], system (2.1) can be simplified to the ODE model (2.3) (see Figure 2 for the flowchart), which is obtained from model (2.1) when the latency period follows a gamma distribution (i, j denote different subpopulations):
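A common way to turn a gamma-distributed stage into ODEs is to split it into n consecutive exponential sub-stages. The sketch below illustrates why this works, under the assumption (not stated explicitly in this text) that each of the n latent sub-stages has rate nα, so that the mean latency stays at 1/α while the variance shrinks to 1/(nα²). The value of α and the function name are hypothetical.

```python
import random
import statistics

def sample_latency(alpha, n, rng):
    """Latent period drawn as the sum of n exponential stages of rate n*alpha."""
    return sum(rng.expovariate(n * alpha) for _ in range(n))

rng = random.Random(2024)
alpha = 0.2            # hypothetical rate, i.e. a mean latency of 5 days
summary = {}
for n in (1, 2, 3):    # n = 1 is the exponential case; n = 2, 3 are gamma cases
    draws = [sample_latency(alpha, n, rng) for _ in range(50_000)]
    summary[n] = (statistics.fmean(draws), statistics.variance(draws))
    print(f"n = {n}: mean = {summary[n][0]:.2f}, variance = {summary[n][1]:.2f}")
```

The sample means should stay close to 1/α for every n, while the sample variances should fall roughly as 1/(nα²), so a larger n concentrates the latent period around its mean.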
Following [4,29], the basic reproduction number is obtained by the next generation matrix method. The rate of appearance of secondary infections and the rate of disease progression in disease compartment i are denoted by F_i and V_i (i = 1, 2, ..., m), respectively.
The disease-free equilibrium of system (2.3) is
F and V are defined as the rate of appearance of secondary infections and the rate of disease progression in the total population, respectively:
The next generation matrix is FV^{-1}, and the basic reproduction number is defined as R_0 = ρ(FV^{-1}), where ρ denotes the spectral radius of a matrix.

Comparison of EDM and GDM based on real data
In this section, we divide the population of Guangdong Province into four age groups based on the transmission characteristics of COVID-19: group 1, those ≤ 5 years old; group 2, those 6-19 years old; group 3, those 20-64 years old; and group 4, those ≥ 65 years old. We conduct a global uncertainty and sensitivity analysis through Latin hypercube sampling (LHS) and partial rank correlation coefficients (PRCC). Then, using the parameter values in Table 2, we use numerical simulations to explore the discrepancies between EDM and GDM in revealing disease transmission and assessing control strategies.

Uncertainty and sensitivity analysis
Sensitivity analysis provides important information on how uncertainty and variability in the model parameters may affect the model outcomes and on which parameters are most influential. In this subsection, we perform an uncertainty analysis of the input data for all parameters of the model using the LHS method and then use PRCC to investigate the global sensitivity of the related parameters [31-34]. In this way, we determine which factors are important in the epidemic, which supports a better selection of control measures.
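The LHS and PRCC steps described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used for Figure 3: the parameter ranges are hypothetical, and the output is a simple stand-in function of (β, κ, γ, µ) instead of the simulated total number of infected persons.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng):
    """One stratified draw per equal-probability bin for every parameter."""
    bounds = np.asarray(bounds, dtype=float)
    k = bounds.shape[0]
    # A random point inside each of the n_samples bins, per parameter.
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, k))) / n_samples
    for j in range(k):
        u[:, j] = rng.permutation(u[:, j])   # decouple the columns
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

def ranks(a):
    """Rank-transform each column (continuous samples, so no ties expected)."""
    return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)

def prcc(X, y):
    """Partial rank correlation of each column of X with the output y."""
    R = ranks(np.column_stack([X, y]))
    out = []
    for j in range(X.shape[1]):
        others = np.delete(R[:, :-1], j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])
        # Remove the linear effect of the other ranked parameters.
        res_x = R[:, j] - A @ np.linalg.lstsq(A, R[:, j], rcond=None)[0]
        res_y = R[:, -1] - A @ np.linalg.lstsq(A, R[:, -1], rcond=None)[0]
        out.append(np.corrcoef(res_x, res_y)[0, 1])
    return np.array(out)

rng = np.random.default_rng(0)
names = ["beta", "kappa", "gamma", "mu"]
bounds = [(0.2, 0.6), (0.05, 0.5), (0.1, 0.5), (0.05, 0.2)]  # hypothetical ranges
X = latin_hypercube(500, bounds, rng)
# Stand-in output (an assumption): increases with beta and kappa,
# decreases with gamma and mu.
y = X[:, 0] * (1.0 + X[:, 1]) / (X[:, 2] + X[:, 3])
coeffs = dict(zip(names, prcc(X, y)))
for name, value in coeffs.items():
    print(f"{name:>5s}: PRCC = {value:+.3f}")
```

To obtain time-varying PRCCs of the kind shown in Figure 3, the same `prcc` call would be repeated at each output time, with `y` replaced by the simulated total number of infected persons at that time for every sampled parameter set.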
Figure 3 depicts the correlation between the parameters and the total number of infected persons over time; the gray area in the figure represents PRCCs that are not significantly different from zero. Note that PRCC values range from −1 to 1. Negative (positive) values represent a negative (positive) correlation of the parameter with the model output: for a negatively correlated parameter, increasing the parameter decreases the output, and for a positively correlated parameter, increasing the parameter increases the output. From Figure 3, we clearly notice that EDM and GDM give consistent correlations between the parameters and the total number of infections: β_33 and κ have the most significant positive effect on the total number of infected persons. That is, when the transmission rate β_33 within the third age group and the hospitalization infection rate κ increase, the total number of infected individuals increases rapidly. Moreover, the correlation between the transmission rate and the total number of infected persons varies by age group, and an increase in the transmission rate of the third age group has a greater impact on the total number of infections than an increase in that of the other three age groups.
In addition, the correlation of a parameter may change over time. In particular, the hospitalization rate γ has a positive correlation with the target output, the total number of infected persons, at the beginning of the epidemic. As the disease progresses and the total number of infected individuals increases, this parameter comes to affect the target output negatively. In fact, because the characteristics and mode of infection of the disease are initially poorly understood, increasing hospitalization rates promote the movement of infected people, which increases the number of people who contract the disease. However, as time passes, progress in understanding how the disease is transmitted and cured results in fewer people being infected. The parameters α and µ show a significant negative impact on the total number of infected persons over time. A higher progression rate of exposed individuals to infectives (α) and a higher recovery rate (µ) directly imply shorter latency and recovery periods, which leaves less time to make effective contacts and thereby reduces the spread of infection and the number of infections.

Therefore, the results of the sensitivity analysis indicate that decreasing the transmission rate and the hospitalization infection rate, or increasing the hospitalization rate, could effectively reduce the number of infected individuals. Accordingly, the following control measures can be taken: reducing contact between populations by isolating at home and wearing masks when traveling; reducing the hospitalization infection rate through measures such as sterilization and isolation in hospitals; and increasing the hospitalization rate by shortening the period of diagnosis and improving the medical system.

Effectiveness of different interventions on EDM and GDM
In this subsection, we perform numerical simulations for three models, EDM and GDM (n = 2, 3), to compare their results when they are used to assess interventions specified by γ (hospitalization rate) and/or κ (hospitalization infection rate), and then investigate how the models may give different assessments of the effectiveness of various control measures. The control strategies are presented separately for Guangdong Province, and the key indicators used to evaluate their effectiveness are the peak value, the peak time, the final size of an outbreak and the infection eradication time [36]. The simulation results are shown in Figures 4-6.
Figure 4 shows the numerical simulations of EDM, GDM (n = 2) and GDM (n = 3) based on the parameters of Guangdong Province. It plots the active infectious individuals (red curve) and the cumulative infections (blue curve). The four rows compare scenarios based on interventions such as performing hospital sterilization to reduce the hospitalization infection rate (κ) and shortening the diagnosis period to increase the hospitalization rate (γ): a baseline scenario and three strategies (see Table 3 for the strategies). We observe from Figure 4 that when the hospitalization rate (γ) increases (baseline scenario versus Strategy I) or the hospitalization infection rate (κ) decreases (baseline scenario versus Strategy II), the final size and peak size are reduced and the peak time is advanced in all three models (see rows 1-3). Comparing Strategy III with Strategies I and II (see rows 2-4) shows that the most effective way to control the disease is to combine the two measures (Strategy III).
A more detailed comparison of the infection eradication time, the peak value and the peak time for the three control strategies is shown in Figure 5. Compared to the baseline scenario, Strategies I, II and III all lead to an earlier end of infection, a lower peak value and an earlier peak time. By comparing Figure 5(a) with Figure 5(b),(c), we find that decreasing the hospitalization infection rate has a more pronounced effect on the end time of infection than on the peak value and peak time, whereas increasing the hospitalization rate has a similar effect on the infection eradication time, the peak value and the peak time. Moreover, all models provide consistent assessments of the control strategies, but EDM has a later infection eradication time, a smaller peak value and an earlier peak time, while GDM, specifically GDM (n = 3), has an earlier infection eradication time, a larger peak value and a later peak time.
It is clear from Figure 5(a) and Figure 6 that EDM, GDM (n = 2) and GDM (n = 3) appear to have the same final size under the same control strategy, but the infection eradication time (i.e., the time to reach the final size) can be significantly different. Comparing the three strategies with the baseline scenario shows that decreasing the hospitalization infection rate or increasing the hospitalization rate is effective in reducing the final size of the epidemic as well as in shortening the time to the end of the outbreak.
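The indicators used above (peak value, peak time, final size and infection eradication time) can be computed from a simulated trajectory as in the following sketch. It is a deliberately simplified, single-group stand-in for EDM (n = 1) and GDM (n = 2, 3): the model equations, the parameter values, the eradication threshold of one infected individual and all function names are assumptions made for illustration, not the Guangdong model or its fitted parameters.

```python
def derivatives(y, n, p):
    """Assumed single-group SEIHR flows with n latent sub-stages of rate n*alpha."""
    S, E, I, H = y[0], y[1:1 + n], y[1 + n], y[2 + n]
    lam = p["beta"] * (I + p["kappa"] * H) / p["N"]   # force of infection
    rate = n * p["alpha"]
    dE = [lam * S - rate * E[0]]
    dE += [rate * (E[k - 1] - E[k]) for k in range(1, n)]
    dI = rate * E[-1] - (p["gamma"] + p["mu"]) * I
    dH = p["gamma"] * I - p["mu"] * H
    dR = p["mu"] * (I + H)
    return [-lam * S] + dE + [dI, dH, dR]

def rk4_step(y, dt, n, p):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return [a + h * b for a, b in zip(y, k)]
    k1 = derivatives(y, n, p)
    k2 = derivatives(shifted(k1, dt / 2), n, p)
    k3 = derivatives(shifted(k2, dt / 2), n, p)
    k4 = derivatives(shifted(k3, dt), n, p)
    return [a + dt / 6 * (b + 2 * c + 2 * d + e)
            for a, b, c, d, e in zip(y, k1, k2, k3, k4)]

def indicators(n, p, t_end=600.0, dt=0.1, threshold=1.0):
    """Peak value and time of I + H, eradication time and final size."""
    y = [p["N"] - 1.0] + [0.0] * n + [1.0, 0.0, 0.0]  # one initial infectious case
    peak, peak_time, eradication = 1.0, 0.0, None
    for step in range(1, int(round(t_end / dt)) + 1):
        y = rk4_step(y, dt, n, p)
        t = step * dt
        active = y[1 + n] + y[2 + n]      # I + H
        infected = sum(y[1:3 + n])        # all E sub-stages + I + H
        if active > peak:
            peak, peak_time, eradication = active, t, None
        elif eradication is None and infected < threshold:
            eradication = t               # first time below threshold after the peak
    return {"peak": peak, "peak_time": peak_time,
            "eradication": eradication, "final_size": p["N"] - y[0]}

# Hypothetical parameter values, for illustration only.
params = {"N": 1.0e5, "beta": 0.5, "alpha": 0.2,
          "gamma": 0.2, "mu": 0.1, "kappa": 0.3}
results = {n: indicators(n, params) for n in (1, 2, 3)}
for n, r in results.items():
    print(n, {key: round(value, 1) for key, value in r.items()})
```

Changing `gamma` or `kappa` in `params` and rerunning gives a simple way to mimic the comparison between a baseline scenario and the control strategies; how the resulting numbers compare with Figures 4-6 depends on the assumed model structure and parameters.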

A more detailed comparison of EDM and GDM on interventions
To further investigate how the difference between EDM and GDM changes with the control parameters, we produce contour plots of the infection eradication time, peak value and peak time with respect to the hospitalization infection rate (κ) and the hospitalization rate (γ) in three different models, EDM and GDM (n = 2, 3). Figures 7-9 further confirm the conclusions drawn from Figures 4 and 5: in both EDM and GDM, as γ increases and κ decreases, the peak time and the infection eradication time are advanced, and the peak size is lowered. This suggests that control measures such as disinfection and isolation in hospitals and a shorter diagnosis period during the outbreak stage would reduce the severity of the epidemic. Figures 7 and 8 indicate that EDM overestimates the infection end time and underestimates the infection peak value when compared to GDM. Figure 9 shows that EDM and GDM are inconsistent in their estimates of the peak time: (a) if γ < 0.24, EDM overestimates the peak time compared to GDM; (b) if 0.24 ≤ γ ≤ 0.2625, when the hospitalization infection rate is higher and the hospitalization rate is lower (i.e., top left corner in Figure 9(b)), EDM overestimates the peak time compared to GDM; when the hospitalization infection rate is low and the hospitalization rate is higher (i.e., bottom right

Conclusions and discussion
In this paper, we have developed and investigated two SEIHR epidemic models with exponential and gamma distributions of latency, taking into account the effect of population heterogeneity and of infectious hospitalized individuals. First, we have theoretically derived the formula for the basic reproduction number R_0. Second, we have performed a sensitivity analysis of the model parameters using the partial rank correlation coefficient (PRCC) approach and identified the key parameters affecting disease transmission in order to derive reasonable control measures. Next, we have evaluated the impact of EDM and GDM on the assessment of interventions based on age heterogeneity. Finally, using the parameters of COVID-19 in Guangdong Province, we have identified the similarities and differences between EDM and GDM in revealing disease transmission and assessing control strategies through numerical simulations.
Our results suggest that EDM and GDM evaluate the different strategies consistently. However, under the same control strategy, using GDM instead of EDM does not appear to change the overall size of an epidemic, but it does affect the course of infection and changes the overall duration of an outbreak. EDM may result in a later infection eradication time and a lower peak value compared to GDM. Moreover, another finding of our study is that EDM and GDM produce inconsistent predictions of the peak time, and the outcome of the prediction may depend on the strength of the control strategy.
When a mathematical model with population heterogeneity is used to describe disease transmission, assuming that the latent period follows a gamma distribution when it actually follows an exponential distribution leads us to underestimate the infection eradication time and overestimate the peak value. In contrast, assuming that the latent period follows an exponential distribution when it actually follows a gamma distribution leads us to overestimate the infection eradication time and underestimate the peak value. Furthermore, the strength of the control strategy (especially the effect of the hospitalization rate) may also affect the predicted peak time when the exponential distribution of latency is replaced by a gamma distribution. Specifically, if the control is weak, the peak time may be underestimated, and if the control is strong, the peak time may be overestimated. Therefore, when applying mathematical models to describe the spread of diseases, assumptions about the distribution of the latency period must be made with care. If unreasonable assumptions are made, we may overestimate or underestimate the peak value, peak time, and infection eradication time. In addition, estimates of the peak time need to take into account the effect of control strategies when different distributions of the latency period are considered.
In contrast to the existing literature, the main innovation of this paper is that it is perhaps the first to consider the difference between latency following exponential and non-exponential distributions in a heterogeneous model. The model comparisons in Blyuss et al. [27] focus on the differences between exponential and non-exponential distributions in homogeneous models. Our results for the end time of infection are consistent with those obtained by Blyuss et al. (i.e., increasing the number of stages in the latency period leads to an earlier overall end time), but our results for the time to peak are not, since the time to peak may depend on the intensity of control measures. Moreover, our results suggest that in the heterogeneous model, increasing the number of stages of latency leads to an increase in the number of infections. These results expand the understanding of the dynamics of heterogeneous infectious diseases.

The model proposed in this paper ignores demographic dynamics. However, for long-term epidemics such as AIDS and tuberculosis, the influence of demographic dynamics should not be ignored, and models of heterogeneous infectious diseases on long time scales need to be considered. Moreover, although the comparison of the exponential and gamma distributions in this paper helps to understand the disease transmission pattern more accurately, the latent period may follow distributions other than the exponential and gamma distributions [38]. Hence, it is necessary to consider models with other distributions and to analyze their differences in understanding disease transmission. We leave these questions to future work.

Figure 3. Time-varying PRCC sensitivity indices of the total number of infectives in the three different models.

Figure 4. Comparison of the epidemic sizes generated by the models EDM, GDM (n = 2) and GDM (n = 3) under the baseline scenario (top row) and strategies I, II and III (rows 2, 3 and 4, respectively).

Figure 5. Comparison of infection eradication time (a), peak value (b) and peak time (c) generated by the three models under the baseline scenario and the three control strategies I, II and III.

Figure 6. Comparison of the final size generated by the three models under the baseline scenario and the three control strategies I, II and III, in Guangdong Province.

Figure 7. Contour of the infection eradication time with respect to κ (hospitalization infection rate) and γ (hospitalization rate) in three different models.

Figure 8.

Figure 9.
Figures 7-9 further confirm the conclusions in Figures 4 and 5: in both EDM and GDM, as γ increases and κ decreases, the peak time and infection elimination time are advanced, and the peak size is lowered. This suggests that control measures such as disinfection and isolation in hospitals and a shorter diagnosis period during the stage of the disease outbreak would reduce the severity of the epidemic. Figures 7 and 8 indicate that EDM overestimates the infection end time and underestimates the infection peak value when compared to GDM. Figure 9 shows that EDM and GDM are inconsistent in their estimates of the peak time: (a) if γ < 0.24, EDM overestimates peak time compared to GDM; (b) if 0.24 ≤ γ ≤ 0.2625, when the hospitalization infection rate is higher and the hospitalization rate is lower (i.e., top left corner in Figure 9(b)), EDM overestimates the peak time compared to GDM; when the hospitalization infection rate is low and the hospitalization rate is higher (i.e., bottom right

Table 1. Explanation of symbols.

Table 2. Parameter values in Guangdong Province.

Table 3. Three control strategies in Guangdong Province.