Analysis of the current status of TB transmission in China based on an age heterogeneity model

Tuberculosis (TB) is an infectious disease transmitted through the respiratory system. China is one of the countries with a high burden of TB. Since 2004, an average of more than 800,000 cases of active TB has been reported each year in China. Analyzing the case data from 2004 to 2018, we found significant differences in TB incidence by age group. A model of TB is put forward to explore the effect of age heterogeneity on TB transmission. The nonlinear least squares method is used to obtain the key parameters in the model, the basic reproduction number $R_v = 0.8017$ is calculated and a sensitivity analysis of $R_v$ with respect to the parameters is given. The simulation results show that reducing the number of new infections in the elderly population and increasing the recovery rate of elderly patients could significantly reduce the transmission of TB. Furthermore, the feasibility of achieving the goals of the World Health Organization (WHO) End TB Strategy in China is assessed. With existing TB control measures, it will take China nearly another 30 years, until 2049, to reach the WHO goal of reducing the number of new cases by 90%. However, in theory it is feasible to reach the WHO strategic goal of ending TB by 2035 if the contact rate within the elderly population can be reduced, though reducing the contact rate is difficult.


Introduction
Tuberculosis (TB) is an ancient disease with a worldwide distribution and is the leading cause of death from bacterial infections. Mycobacterium tuberculosis is the causative agent of TB. It was discovered and shown to be the cause of human TB by the German bacteriologist Koch in 1882; it can invade all organs of the body, but most commonly causes pulmonary TB. TB is mainly transmitted through the respiratory tract, and the source of infection is contact with TB patients who excrete the bacteria [1]. In the 19th century, TB became a major epidemic in Europe and elsewhere, spreading to all levels of society and causing one out of every seven deaths; it was known as the "Great White Plague" [2]. Due to the low effectiveness of the drugs used to treat TB, the disease has remained difficult to control and is still widespread worldwide. Between 1993 and 1996, the number of TB cases worldwide increased by 13% and TB killed more people than AIDS and malaria combined. In late 1995, the World Health Organization (WHO) established March 24 as World TB Day to further promote global awareness of TB prevention and control [3]. Approximately 80% of new TB cases worldwide occur in 22 high-burden countries, with India (26%) and China (12%) accounting for the largest shares of global cases [4]. To this day, TB remains the leading cause of disease and death in most high-incidence countries [3]. To end the global TB epidemic, WHO proposed in 2014 a post-2015 global end-TB strategy with the targets of a 50% reduction in TB incidence by 2025 (compared to 2015) and a 90% reduction in new cases by 2035.
Mathematical models have become a powerful tool for analyzing epidemiological characteristics [5]. Many scholars have developed mathematical models of TB based on its transmission mechanism, principles of biology, seasonal characteristics and social influences. In 1962, Waaler developed the first model of TB transmission dynamics [6]. In 1967, Brogger further refined Waaler's model. He not only introduced heterogeneity but also changed the method of calculating the incidence rate; however, he did not give the relationship between infection and incidence rates [7]. ReVelle developed the first nonlinear differential equation model for TB using the models of Brogger and Waaler as a template [8]. E. Ziv et al. studied the effect of early treatment on the incidence of TB and found that early treatment reduced the incidence of TB if the treatment rate for active TB was increased from 50% to 60%. Carlos Castillo-Chavez et al. studied the role of mobility and health disparities in the transmission dynamics of TB [9]. In addition, medical studies have shown that anti-TB drugs can reduce the length of treatment for TB [10]. Meanwhile, many studies have considered the effects of drug-resistant cases [11], time delay [12] and age structure [13]. However, few studies use models of TB with different age groupings. Therefore, based on the collected data and the observed data characteristics (Figure 2), a susceptible-exposed-infectious-recovered (SEIR) model with different age groups is developed, and heterogeneity is included in the model to assess the effect of age on TB transmission.
The remainder of this paper is organized as follows. In section two, the collected data are presented and analyzed, a transmission dynamics model is developed, the main parameters in the model are fitted and the value of the basic reproduction number $R_v$ is calculated. In section three, a sensitivity analysis of the basic reproduction number $R_v$ is performed, considering the effect of the proportion of preferential contact within the group on $R_v$. Finally, a feasibility assessment of the WHO strategic objective of ending TB is presented. Section four contains the discussion.

Data analysis
China is one of the high TB burden countries and faces a serious TB epidemic. The burden of TB in China has increased in the last two decades due to the emergence of drug-resistant strains of Mycobacterium tuberculosis [14], with an average of more than 800,000 new infections per year. The number of new cases from 2004 to 2021 is presented in Figure 1(A), which shows that the number of new infections is decreasing year by year. A three-dimensional plot of the incidence by age group shows that the incidence of TB differs significantly between age groups. The incidence rate was lowest in the 0-15 age group and much higher in the 60+ age group than in the other age groups. Overall, the number of TB cases was highest in the 20-25 and 60-65 age groups, as shown in Figure 1(B),(C). Statistical data on TB morbidity in each age group are shown in Figure 1(D). According to United Nations data [15], the aging of the Chinese population is gradually accelerating. For each age group, we computed the mean number of TB cases, the mean incidence rate and its 95% confidence interval (CI) over the past fifteen years; the results are shown in Table 1, with the incidence rate in units of 1/100,000. The results show that, with increasing age, the number of TB cases first rises and then falls, while the incidence rate of TB rises. Based on the statistical data from 2004 to 2018, a cluster analysis of the incidence of TB in the different age groups was performed. The complete classification process is shown in Figure 2 and the complete clustering results are shown in Table 2, where distance refers to the distance between a member of a class and the center of that class, and center refers to a concept similar to the within-group mean. The cluster column in the table shows the final clustering results.
We found that the age group 0-15 years is the category with the lowest incidence of TB, the age group 15-60 years is the category with higher incidence and the group over 60 years is the category with the highest incidence. Figure 3 analyzes the influence of age on TB infection and shows the important role that age plays in the transmission of TB. Therefore, in this paper we develop a TB epidemic model that includes age-group heterogeneity, and through qualitative analysis and numerical simulation we predict the future incidence of TB in China. We assess whether China could meet the WHO strategic target by 2035 with the current control measures, and explore measures to prevent and control TB outbreaks more effectively.

Data collection
The annual number of TB cases reported in mainland China from 2004 to 2021 was obtained from the Public Health Sciences Data Center [1]. Over this period of nearly 20 years, the number of reported cases exceeded 16 million. The number of cases is highest among those aged 15-60 years, with an average of 680,000 new infections per year, or 69%. The number of cases is lowest among those aged 0-15 years, with an average of 12,000 new infections per year, or 1.3%, as shown in Table A1 (Appendix A). Overall, the number of new infections per year in China gradually decreased from 970,279 cases in 2004 to 639,548 cases in 2021.

Model building
In this section we develop a model of TB dynamics that includes age heterogeneity and vaccination. The entire population is divided into three groups according to TB incidence, with 0-15 years as the first group, 15-60 years as the second group and over 60 years as the third group. Each group is further divided into susceptible (S), exposed (E), infectious (I) and recovered (R) compartments. The number of natural deaths and the natural death rate differ between age groups. The age transition from each age group to the next is considered. Exposed TB cases are individuals who have been infected with TB bacteria but are asymptomatic; reinfection of exposed patients and of recovered individuals is not considered in the model. The dynamic process of TB transmission is shown in Figure 4.

In model (1), $A$ is the annual number of births in the population, and the effectiveness of the Bacillus Calmette-Guerin (BCG) vaccine enters as a parameter. $\lambda_i$ is the hazard rate of infection of susceptible persons by infected persons among members of group $i$. It depends on the average number of contacts $a_i$ of group $i$ members and on $\beta_i$, the probability that a member of group $i$ is infected after each contact with an infected person. $c_{ij}$ was proposed by Jacquez et al. [16] in 1988 to represent the proportion of the contacts of members of group $i$ that are made with members of group $j$, as shown in expression (3); contacts are made preferentially with members of the same group [16]. The Kronecker function $\delta_{ij}$ takes the value one when $i = j$ and zero otherwise. $f_j$ is the proportional mixing fraction, as in expression (4).
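The preferred-mixing structure described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the commonly used form of the formulation of Jacquez et al. [16], $c_{ij} = \varepsilon_i \delta_{ij} + (1-\varepsilon_i) f_j$ with $f_j = (1-\varepsilon_j) a_j N_j / \sum_k (1-\varepsilon_k) a_k N_k$, and a hazard rate of the form $\lambda_i = a_i \beta_i \sum_j c_{ij} I_j / N_j$. The symbol $\varepsilon_i$ for the preferential contact proportion, the function names and all numbers in the usage lines are illustrative assumptions, not the fitted values of Table 3.

```python
def mixing_matrix(eps, a, N):
    """Contact proportions c[i][j] under preferred mixing (assumed form).

    eps[i]: fraction of group i's contacts reserved for its own group
    a[i]:   average number of contacts per member of group i
    N[i]:   size of group i
    Requires at least one eps[i] < 1 so the shared pool is non-empty.
    """
    # Contacts that each group contributes to the shared (proportional) pool.
    pool = [(1.0 - e) * ai * Ni for e, ai, Ni in zip(eps, a, N)]
    total = sum(pool)
    f = [p / total for p in pool]  # proportional mixing fractions f_j
    n = len(eps)
    return [[(eps[i] if i == j else 0.0) + (1.0 - eps[i]) * f[j]
             for j in range(n)] for i in range(n)]


def hazard_rates(eps, a, beta, N, I):
    """Hazard rate of infection for each group (assumed form)."""
    c = mixing_matrix(eps, a, N)
    n = len(N)
    return [a[i] * beta[i] * sum(c[i][j] * I[j] / N[j] for j in range(n))
            for i in range(n)]


# Illustrative values only (hypothetical, not taken from the paper).
eps = [0.3, 0.4, 0.5]
a = [4000.0, 3500.0, 3000.0]
beta = [1e-4, 1e-4, 2e-4]
N = [2.0e8, 9.0e8, 2.0e8]
I = [1.0e4, 6.0e5, 2.0e5]

c = mixing_matrix(eps, a, N)
print([round(sum(row), 12) for row in c])  # each row of c sums to 1
print(hazard_rates(eps, a, beta, N, I))
```

Each row of the matrix sums to one because every contact of group $i$ is either reserved for its own group or drawn from the shared pool, which is a quick sanity check on any implementation of expressions (3) and (4).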

Numerical simulation and sensitivity analysis
The outbreak of severe acute respiratory syndrome (SARS) in 2003 posed a challenge to the public health system in China. In an effort to better address public health issues, the government increased public health funding, revised laws regarding infectious disease control, implemented an internet-based disease reporting system and initiated a program to rebuild local public health facilities. These measures have facilitated TB control [17]. Complete data on TB cases are available on the official website of the Chinese Center for Disease Control and Prevention from 2004 onward. For the numerical fitting, data from 2004 were used to calculate the initial values and data from 2005-2021 were used for parameter fitting.

Determination of parameters and initial values in the model
We first estimate the parameters in the model, and the results are shown in Table 3.
(a) From the data published in the China Population Statistical Yearbook 2005-2018 [18], the annual number of births in the population is $A$ = 16,440,000/year. The natural mortality rates of the three age groups are $\mu_1$ = 0.0017/year, $\mu_2$ = 0.0023/year and $\mu_3$ = 0.0367/year [19]. China started its immunization planning policy in 1992, and newborn infants must be vaccinated with the BCG vaccine within 24 hours of birth [20].
(b) The average incubation period of TB is about two months [21], so the conversion rate of patients with latent TB is taken as 6/year in each group. The mortality rate is $d_i$ = 0.0025/year according to the WHO Global TB Report 2013 [4]. The recovery rate of infected cases is $\gamma_i$ = 0.496/year [22]. Each person has contact with an average of 10-12 people per day [23], from which the annual number of contacts $a_i$ is calculated.
China has taken various measures to control TB transmission, including a five-year national plan in the 1980s, a 10-year national plan in the 1990s and the modern TB control strategy (Directly Observed Treatment, Short-course) introduced in the 20th century [24]. After the implementation of these short-term and long-term plans, TB was effectively controlled in China. The model takes into account the heterogeneity of age subgroups and exposure between age groups. After 2004, China began to provide detailed data on TB cases, so we selected the data from 2005 to 2021 and used the model to fit the number of TB cases in China from 2005 to 2018 [25]. The values of the parameters obtained from the fitting are shown in Table 3 and the fitting effect is shown in Figure 5.
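The unit conversions behind item (b) can be written out explicitly. The sketch below assumes that a rate is the reciprocal of the mean duration (so a two-month incubation period gives 6 per year) and that daily contacts are scaled by 365 days per year; the function names are hypothetical.

```python
MONTHS_PER_YEAR = 12
DAYS_PER_YEAR = 365  # assumption: contacts are scaled over a 365-day year


def progression_rate_per_year(incubation_months):
    """Rate of leaving the latent class, taken as 1 / mean incubation period."""
    return MONTHS_PER_YEAR / incubation_months


def contacts_per_year(contacts_per_day):
    """Annual number of contacts for a given daily average."""
    return contacts_per_day * DAYS_PER_YEAR


print(progression_rate_per_year(2))                   # 6.0
print(contacts_per_year(10), contacts_per_year(12))   # 3650 4380
```

Under these assumptions, the reported range of 10-12 contacts per day corresponds to roughly 3650-4380 contacts per year.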

Data fitting results
Figure 5 shows that the fit is very good: the gap between the fitted curve and the actual data is very small. We performed a goodness-of-fit test on the fit results and the goodness-of-fit coefficient was 0.954 (the calculation of the goodness-of-fit coefficient can be found in Appendix B). The good fit indicates that the model is reliable and reflects the variation of the actual data well, even when more realistic situations are considered.
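As an illustration of how such a coefficient can be obtained, the sketch below computes the coefficient of determination between observed and fitted values. This is an assumption about the kind of statistic involved; the definition actually used is the one given in Appendix B, and the series in the usage lines are made-up numbers, not the reported case counts.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical series for illustration only.
observed = [100.0, 95.0, 91.0, 86.0, 80.0]
fitted = [101.0, 95.5, 90.0, 85.5, 81.0]
print(r_squared(observed, fitted))
```

A value close to one means that the fitted curve explains most of the variation in the observed series.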

Sensitivity analysis of $R_v$
In epidemiological studies, the reproduction number (denoted as $R_v$) indicates the average number of infections caused by one infected person during the infectious period [26], and is one of the most important indicators for assessing the risk of an infectious disease. The reproduction number $R_v$ is also considered a key epidemiological parameter in determining whether the disease can spread in an area, with $R_v > 1$ often implying that the disease will persist and $R_v < 1$ implying that the disease will die out. The reproduction number for model (1) was calculated using the next generation matrix method [27], and the complete calculation process is presented in Appendix C. Based on the values of the parameters obtained from the fit, the basic reproduction number $R_v = 0.8017$ is calculated for model (1). The strength of the correlation between each parameter in the model and the basic reproduction number $R_v$ is used to find the most sensitive epidemiological parameters, which should be prioritized when controlling infectious diseases [28]. The sensitivity analysis of $R_v$ is performed using the partial rank correlation coefficient (PRCC) [29] of each parameter in the model, and the results are shown in Figure 6.
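In the next generation matrix method, the reproduction number is the spectral radius of a nonnegative matrix. The following sketch estimates a spectral radius by power iteration, assuming the matrix is nonnegative with a single dominant eigenvalue; the 2-by-2 matrix is a hypothetical example chosen so that the answer is known, not the next generation matrix of model (1).

```python
def spectral_radius(K, iterations=1000, tol=1e-12):
    """Estimate the dominant eigenvalue of a nonnegative matrix K."""
    n = len(K)
    v = [1.0] * n          # positive start vector
    radius = 0.0
    for _ in range(iterations):
        # One multiplication by K, then rescale so the largest entry is 1.
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        new_radius = max(abs(x) for x in w)
        if new_radius == 0.0:
            return 0.0
        v = [x / new_radius for x in w]
        if abs(new_radius - radius) < tol:
            return new_radius
        radius = new_radius
    return radius


# Hypothetical matrix with eigenvalues 0.6 and 0.3.
K = [[0.5, 0.2],
     [0.1, 0.4]]
R = spectral_radius(K)
print(R, "disease dies out" if R < 1 else "disease persists")
```

For a small system such as a three-group model, the same quantity can also be obtained from any standard eigenvalue routine; power iteration is used here only to keep the example free of dependencies.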

Effect of preferential contact proportion $\varepsilon_i$ on reproduction number $R_v$
Transmission of TB occurs mainly through close human-to-human contact, and the concept of contact is quantified in the model. The hazard rate of infection $\lambda_i$ is a function of the average number of contacts $a_i$, the proportion of preferential contacts within the group $\varepsilon_i$ and the probability of infection $\beta_i$. Among these parameters, the most important is the proportion of preferential contacts: $\varepsilon_3$ has the greatest effect on $R_v$ and $\varepsilon_1$ has the least effect on $R_v$. Frequent contact among members of the oldest group can increase the spread of TB. Therefore, efforts to protect the elderly population should be strengthened by calling on them to improve their nutrition, exercise and have regular health checks. We also appeal to young people to give more care to the elderly. It is believed that TB in China will be better controlled if there are fewer TB infections in the elderly group, though this is not easy to achieve.

Assessing the feasibility of achieving the WHO End TB Strategy in China
In the nearly two decades between 2004 and 2021, China has taken various control measures against TB and achieved very significant results, with the number of new TB cases in China decreasing from a peak of 1.25 million in 2005 to 630,000 in 2021. However, TB still remains a huge challenge. In order to end the TB epidemic globally, WHO proposed a post-2015 global TB endgame strategy in 2014 with the strategic goal of reducing TB incidence by 50% by 2025 (compared to 2015) and new cases by 90% by 2035 [30]. The number of new TB cases in China in 2015 was 864,015, so the WHO target expects China to reduce the number of new TB cases to 86,402 in 2035.
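The 2035 target figure follows directly from the 2015 baseline. A minimal check, using integer arithmetic with round-half-up (the function name is hypothetical):

```python
def reduced_target(baseline_cases, reduction_percent):
    """Cases remaining after a percentage reduction, rounded half up."""
    remaining_times_100 = baseline_cases * (100 - reduction_percent)
    return (remaining_times_100 + 50) // 100


baseline_2015 = 864_015  # new TB cases reported for China in 2015
print(reduced_target(baseline_2015, 90))  # 86402
```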
In the previous analysis, we calculated that the reproduction number of TB is $R_v = 0.8017$. Although $R_v$ is less than one, without additional control measures China will still have nearly 300,000 new infections of TB in 2035, which falls short of the WHO target. According to the simulation results shown in Figure 8, China will reach the WHO target in 2049; that is, it will take nearly 30 years.

Study of feasible control strategies
The effectiveness of the BCG vaccine, the average number of contacts $a_i$ and the proportion of preferential contacts $\varepsilon_i$ are the most important factors in the TB control process. To assess the feasibility of achieving WHO's end-TB strategy goals, we investigated the impact of these three parameters on the number of new cases of TB, starting from the current control strategies. The parameter values listed in Table 3 are used as the baseline for comparing the control effects.

Effect of BCG vaccine effectiveness
First, we consider an intervention scenario that increases the effectiveness of the BCG vaccine. The results are shown in Figure 9(A), and the number of TB cases for different values of BCG effectiveness is given in Figure 9(B). Increasing the effectiveness of BCG was found to be of limited help in reducing the number of TB cases in the short term. Increasing the effectiveness by 10% would reduce the number of TB cases by an average of 3000 per year (Appendix D). Even if the effectiveness of BCG were increased from 72.8% to 95%, China would still have more than 230,000 new infections in 2035 (Appendix D), although the number of new cases per year would already be significantly reduced. Therefore, increasing the effectiveness of the vaccine would be a good option, although it would not meet the WHO strategic goal for 2035. The reason is that BCG is more effective in younger children but less effective in the older age groups. The number of new cases among younger people has been low, so the impact of increasing the effectiveness of BCG is not significant.
In summary, for all intervention scenarios it will be difficult for China to reach the ultimate WHO target by 2035 using the existing TB control measures. However, if the average contact rate $a_3$ can be reduced by 25% or the preferential contact rate $\varepsilon_3$ within the group can be reduced by 50%, the WHO strategic target for TB control by 2035 can be reached. China still has a long way to go on the road to eliminating TB and needs to strengthen the implementation of control measures, achieve early detection and treatment and improve the effectiveness of anti-TB drugs.

Discussion
With TB control measures, the number of TB cases reported each year in China is gradually decreasing. However, there is still a long way to go to eliminate TB, and the TB epidemic may remain a serious problem in the future. According to the data of the China Population Statistical Yearbook [18] and China's population structure, the proportion of elderly people is increasing year by year. As China considers its post-2015 end-TB strategy [31], an aging population poses a great challenge to TB control. More importantly, our study shows in Figure 3 that there are significant differences in TB incidence between age groups. Using TB data reported in China from 2005 to 2021, we developed a SEIR infectious disease model that includes three age groups, juvenile (0-15 years), middle-aged (15-60 years) and elderly (60 years or older), to investigate the role of age in the transmission of TB in China. The parameters in the model were fitted using the least squares method and numerical simulations were performed using the fitted parameters. The fitted values agree closely with the annual data reported for TB in China. All of our fits were obtained from the number of reported TB cases in China, but we have no way of knowing whether the number of reported cases equals the actual number of cases. If the number of reported cases is less than the actual number of cases, this will affect the estimation of parameters and the model predictions. On this basis, the current basic reproduction number of TB transmission in China is $R_v = 0.8017$. Even though $R_v$ is less than one, it would still take 45 years for China to eliminate TB (to reduce the number of new infections to less than 10,000 per year). The feasibility of achieving the WHO strategic goal of ending TB by 2035 under the current TB control initiatives adopted in China was assessed. With the current control measures, it would take nearly 30 years for China to reach the expected WHO goal. How to shorten this process is one of the issues to be considered in China.
Using the model, we evaluated the effect of different intervention options. Increasing the effectiveness of the BCG vaccine has an effect, but it is limited. Even if BCG effectiveness were increased to 95%, China would not reach the WHO strategic goal of ending TB by 2035. However, if there were a 25% reduction in the average number of contacts $a_3$, a 50% reduction in the preferential contact proportion $\varepsilon_3$ or a reduction in overall contact in the elderly population, China could reach the WHO strategic goal by 2035. In order to eliminate TB as soon as possible, China needs to continue to strengthen the implementation of TB control measures, improve the effectiveness of TB drugs and further explore other effective control measures.
It is reasonable to consider age grouping and contact heterogeneity in the TB model, which is more realistic and helps to improve control strategies for TB in China [19]. Interventions such as improved nutrition for the elderly and early detection and treatment for specific groups of the elderly can be very effective epidemic control measures [32]. Thus, our age grouping model provides valuable foresight. For example, BCG is highly effective in young adults but less effective in middle-aged and older adults, with effectiveness in the second and third age groups being only about 50% [33]. However, with the increased aging of the Chinese population and the high incidence of TB in the elderly population, a similar BCG control strategy should be implemented for the potentially high-risk elderly sub-population, which may significantly reduce the incidence in this group. According to reports on TB [34], approximately 0.5% to 7.2% of TB cases in developed countries were caused by Mycobacterium bovis, while in many developing countries the severity of human infection with bovine TB was much higher than in developed countries [35]. In China, there are several regions that depend on animal husbandry, such as the pastoral areas of Xinjiang, Tibet and Inner Mongolia, where cows are mostly raised on a small scale or free-range, which can greatly facilitate the transmission of bovine TB between humans and cattle [36]. Therefore, it is reasonable to believe that a proportion of the TB patients in China are bovine TB patients, and that bovine TB patients are more capable of transmitting the infection. If measures can be taken to control the number of infections in this group, it is believed that this will help to reduce the overall number of TB cases in China. This is an issue that we intend to continue to study. Studying the role of age in TB transmission may help to predict long-term health risks and thus suggest targeted TB control strategies, more rational programming and more efficient use of limited resources [37], which is of great significance for TB control. Also, when focusing on the elderly population, improving their healthy living standards, improving their nutrition and calling for their greater participation in exercise all have a positive impact on TB control.
The limitation of this study is that only the yearly TB case data for China were fitted, with less analysis and simulation of the individual age groups. We will address this in future studies.

Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Taking the partial derivatives with respect to $E_i$ and $I_i$ and evaluating them at the disease-free equilibrium point $P_0$ gives the new-infection matrix $F$ and the transition matrix $V$ of the next generation matrix method [27].
The basic reproduction number is the spectral radius of the matrix $FV^{-1}$.

Figure 1. Confirmed cases of TB in mainland China in the past two decades. (A) Number of new infections per year. (B) Three-dimensional plot of TB data at different ages. (C) Contour plot of TB data at different ages. (D) Three-dimensional plot of TB morbidity data at different ages.

Figure 2. Clustered analysis of incidence data for different age groups.

Figure 4. Flow chart of TB transmission with age-structure.

Figure 5. Fitting results on annual new cases of TB, 2005-2021. (A) Actual data and fitted optimal curves. (B) Absolute error between fitting curve and actual data.

Figure 6. Correlation coefficients between basic reproduction number and model parameters.
For the basic reproduction number $R_v$, $A$ (annual population births), $\beta_3$ (probability of infection after exposure in people over 60 years of age) and $\gamma_3$ (recovery rate in elderly people over 60 years of age) are the most sensitive parameters. Reducing new infections in the elderly population and improving the recovery rate of elderly patients can therefore significantly reduce the transmission of TB.

The proportion of preferential contacts within the group is $\varepsilon_i$, which indicates the extent to which each individual prioritizes contacts with members of the same group. A larger $\varepsilon_i$ means that an individual has more frequent contacts with members of the same group. We examined the effect of $\varepsilon_i$ ($i = 1, 2, 3$) on the basic reproduction number $R_v$ by fixing one of the $\varepsilon_i$ and varying the others, as seen in Figure 7. The results show that increasing the values of $\varepsilon_1$, $\varepsilon_2$ and $\varepsilon_3$, respectively, will cause an increase in the basic reproduction number $R_v$.

Figure 7. Comparison of the effects of $\varepsilon_i$ on $R_v$. (A) The effect of $\varepsilon_1$ and $\varepsilon_2$ on $R_v$.

Figure 8. Predicted number of TB cases after 2035 according to model (1).

Figure 9. The effect of BCG vaccine effectiveness. (A) Actual data and model simulation results. (B) The number of TB cases for different values of BCG effectiveness.

Reducing $\varepsilon_1$ and $\varepsilon_2$ by 25% and 33%, respectively, would still leave China with more than 200,000 new infections in 2035 and would not meet the WHO TB strategy target. However, reducing the proportion of preferential contact $\varepsilon_3$ of group three would have a dramatic effect. When $\varepsilon_3$ is reduced by 50%, the number of TB infections in China decreases rapidly. With only 70,000 new TB infections per year by 2035, the WHO strategic target for 2035 can be reached.

Figure 11. The effect of preferential contacts proportion $\varepsilon_i$ on the cases. (A) Actual data and model simulation results. (B) The effect of lowering $\varepsilon_1$ and $\varepsilon_2$ on the cases.

Table 1. Number and incidence of TB by age group and their confidence intervals.

Table 2. Clustering results by age groups based on the incidence of TB.

Table 3. Fitting results of dynamics parameters in the model (1).