Potential effect of household contact management on childhood tuberculosis: a mathematical modelling study

Summary
Background Tuberculosis is recognised as a major cause of morbidity and mortality in children, with most cases in children going undiagnosed and resulting in poor outcomes. Household contact management, which aims to identify children with active tuberculosis and to provide preventive therapy for those with HIV or those younger than 5 years, has long been recommended but has very poor coverage globally. New guidelines include widespread provision of preventive therapy to children with a positive tuberculin skin test (TST) who are older than 5 years.
Methods In this mathematical modelling study, we provide the first global and national estimates of the impact of moving from zero to full coverage of household contact management (with and without preventive therapy for TST-positive children older than 5 years). We assembled data on tuberculosis notifications, household structure, household contact co-prevalence of tuberculosis disease and infection, the efficacy of preventive therapy, and the natural history of childhood tuberculosis. We used a model to estimate households visited, children screened, and treatment courses given for active and latent tuberculosis. We calculated the numbers of tuberculosis cases, deaths, and life-years lost because of tuberculosis for each intervention scenario and country.
Findings We estimated that full implementation of household contact management would prevent 159 500 (75% uncertainty interval [UI] 147 000–170 900) cases of tuberculosis and 108 400 (75% UI 98 800–116 700) deaths in children younger than 15 years (representing the loss of 7 305 000 [75% UI 6 663 000–7 874 000] life-years). We estimated that preventing one child death from tuberculosis would require visiting 48 households, screening 77 children, giving 48 preventive therapy courses, and giving two tuberculosis treatments versus no household contact management.
Interpretation Household contact management could substantially reduce childhood disease and death caused by tuberculosis globally. Funding and research to optimise its implementation should be prioritised.
Funding UK Medical Research Council, US National Institutes of Health, Fulbright Commission, Janssen Global Public Health.


Number of children in households
The number of children aged [0,5) and [5,15) in each household was computed. For each survey, the mean number (and variance) of children in each age category cohabiting with individuals of each age category and sex was computed. The survey design was accounted for using the R package survey, specifying the weights and the cluster and strata ids as per the DHS manual.
The mean number of cohabiting children in age groups [0,5) and [5,15) for individuals of given age and sex is shown in Figures 1 and 2 respectively.
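
As a minimal illustration of the weighted mean described above, the following Python sketch computes the survey-weighted mean number of cohabiting children for one age and sex group. It is not the analysis code (which used the R package survey); the function name and input layout are hypothetical. It uses only the sampling weights, which is enough for the point estimate; cluster and strata ids affect the variance and are not handled here.

```python
def weighted_mean_children(records):
    """Survey-weighted mean number of cohabiting children.

    records: iterable of (weight, n_children) pairs, one per surveyed
    individual in a given age/sex group (hypothetical input layout).
    """
    records = list(records)
    total_weight = sum(w for w, _ in records)
    if total_weight <= 0:
        raise ValueError("total sampling weight must be positive")
    return sum(w * n for w, n in records) / total_weight


# Illustrative values only: three respondents with unequal weights.
example_mean = weighted_mean_children([(1.0, 2), (3.0, 0), (1.0, 3)])
```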

Are households of TB patients different?
We are ultimately interested in predicting the number of children found in households of TB patients of a given age and sex, and it is possible that TB predicts household size and composition independently of age, sex and other variables. Often TB has different epidemiology in rural vs urban populations, so differences by rural/urban classification are also of interest.
Data to explore these patterns are generally limited. However, the large Indian DHS dataset included a self-reported TB prevalence question. We used this to briefly explore whether the household composition of TB patients is systematically different to that of non-TB patients. We graphed the number of cohabiting children, smoothed using a generalized additive model (GAM) (age [0,5) in Figure 3, age [5,15) in Figure 4), together with uncertainty bounds, additionally stratifying by rural/urban classification. There were small differences for children age [5,15), and other settings may be different, but we were generally reassured in proceeding under the assumption that the numbers of children cohabiting with TB patients are not substantially different to those cohabiting with non-TB patients of a given age and sex.

Regression analysis of number of children in households
We used multivariate linear regression to model the log-transformed expected number of children aged [0,5) and [5,15) that individuals of given age and sex share households with in each country. To account for the uncertainty arising from sampling error in the DHS survey, we used a Bayesian hierarchical approach with a Normal measurement error. The underlying pattern of means by index case age and sex (e.g. as in Figure 1) was flexibly modelled as multivariate normal (in log space), and the country-level covariates described in the previous section were used.
More formally, we used a multivariate regression model with measurement error, and Ψ = 1/20 was used as prior for the covariance matrix Σ. A Gibbs sampling scheme implemented in the R package mvregerr ( https://github.com/petedodd/mvregerr ) was employed to generate 1000 samples (individual parameter chains suggested this was sufficient for convergence). Prediction samples representing both parameter and prediction uncertainty were simultaneously gathered for countries without DHS surveys, and, using the most recent World Bank data, for updated predictions in countries with DHS surveys. This prediction dataset was reduced to predictions (and their uncertainty) for the mean number of children in ages [0,5) and [5,15) years cohabiting with a person of a given age, sex and country.
The model fit for children [0,5) years can be seen in Figure 5. The prediction errors from applying prediction to the years and countries of the DHS surveys for children [0,5) years can be seen in Figure 6. The countries with the worst predictions are outliers to the general pattern (labelled in Figure 7). Fits for children age [5,15) years were comparable.

Mathematical modelling
Additional input data
We obtained the WHO TB notification and burden estimate data available from http://www.who.int/tb/country/data/download/en/ (downloaded 11/01/2018). We also obtained the WHO/UNICEF Joint Report Form data on BCG coverage by country and year from http://www.who.int/immunization/monitoring_surveillance/data/en/ (downloaded 11/01/2018). Missing data were filled with coverages from the nearest year. BCG vaccination coverage by year was converted into BCG coverage in children at each age at the present. World Bank country income classifications were from http://databank.worldbank.org/data/download/site-content/CLASS.xls (downloaded 15/03/2018).
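
The conversion of BCG coverage by year into coverage by current age can be sketched as follows. This is an illustrative Python sketch, not the code used for the analysis. It assumes that BCG is given at birth, so that a child aged a in the current year is assigned the coverage reported for the year (current year minus a), and that ties between equally near years are broken in favour of the earlier year; both assumptions and all function names are ours.

```python
def fill_nearest_year(coverage_by_year, years):
    """Return coverage for each requested year, using the nearest reported
    year when a year is missing (ties go to the earlier year: an assumption)."""
    known = sorted(coverage_by_year)
    if not known:
        raise ValueError("no coverage data reported")
    filled = {}
    for year in years:
        nearest = min(known, key=lambda k: (abs(k - year), k))
        filled[year] = coverage_by_year[nearest]
    return filled


def coverage_by_age(coverage_by_year, current_year, max_age=14):
    """Map each current age 0..max_age to the coverage in the birth year
    (assumes vaccination at birth)."""
    birth_years = [current_year - age for age in range(max_age + 1)]
    filled = fill_nearest_year(coverage_by_year, birth_years)
    return {age: filled[current_year - age] for age in range(max_age + 1)}


# Illustrative values only: coverage reported for two years.
example = coverage_by_age({2010: 0.8, 2015: 0.9}, current_year=2017)
```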

Pulmonary vs extrapulmonary disease
WHO notification data up to and including 2012 included age- and sex-stratified data on whether new notifications were smear-positive, smear-negative or extrapulmonary. We aggregated adult data over all complete years by country, classifying each notification as either pulmonary or extrapulmonary (see Figure 8), and restricted to countries with more than 200 cases included in their data. Other countries were assigned regional means of proportion pulmonary for each age and sex (see Figure 9), and these proportions were applied to notifications by age and sex to arrive at the number of pulmonary notifications to be followed up with HHCM in a given country.

Decision tree model structure
The model was implemented as a decision tree using the open source R package HEdtree ( https://github.com/petedodd/HEdtree ). The logic of the decision tree is shown in Figure 10: boxes show intermediate and final outcomes beginning with a child household contact of a TB case of a given age and HIV/ART status. The labels on the arrows denote probabilities of following a particular branch in the tree; these probabilities may depend on the age and HIV/ART status of children. The model is specified within this framework in full on the repository at https://github.com/petedodd/PINT . The model object automatically generated Figure 10 (which therefore serves as a way to assess the specified logic), and also generates functions that evaluate the means of quantities (e.g. incidence, prevalence, deaths, life-expectancy etc) over the tree for a given set of input parameters (sampled from the distributions specified below).
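
To illustrate how the mean of a quantity is evaluated over a decision tree, the following Python sketch computes an expected value recursively. It does not reproduce the HEdtree interface; the node layout, function name, branch labels and probabilities are hypothetical and chosen only to show the principle that each branch contributes its probability multiplied by the expected value of its subtree.

```python
def expected_value(node, outcome):
    """Expected value of `outcome` over a decision tree.

    node: dict with optional keys
      'values'   - {outcome_name: amount accrued on reaching this node}
      'children' - list of (probability, child_node) pairs
    """
    total = node.get("values", {}).get(outcome, 0.0)
    children = node.get("children", [])
    if children:
        prob_sum = sum(p for p, _ in children)
        if abs(prob_sum - 1.0) > 1e-9:
            raise ValueError("branch probabilities must sum to 1")
        total += sum(p * expected_value(child, outcome) for p, child in children)
    return total


# Toy tree with made-up probabilities: a contact either has co-prevalent TB
# (and may die) or does not.
death = {"values": {"deaths": 1.0}}
survive = {}
tb = {"values": {"prevalent_tb": 1.0}, "children": [(0.2, death), (0.8, survive)]}
no_tb = {}
root = {"children": [(0.1, tb), (0.9, no_tb)]}

mean_deaths = expected_value(root, "deaths")
mean_prevalent = expected_value(root, "prevalent_tb")
```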

Modelling approach and evidence
In this section we describe the modelling logic and choices for particular aspects of the model in more detail.

Number of household contacts
The statistical models described in the previous section were used to generate 1,000 predictions for the mean number of cohabiting children age [0,5) and [5,15) years. To extend this to the 217 countries in the WHO TB notification data, we used WHO regional averages of quantities from the 180 countries for the missing 47 countries.
For each of these countries, the WHO TB notification data stratified by age and sex were first adjusted by the proportion pulmonary for each group, then merged with the child household contact predictions (by age, sex and country), and the predicted number of total child TB household contacts in each country was computed for each prediction. The results of this analysis were summarised by modelling the distribution of child household TB contacts in each country using separate log-normal distributions.
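
The calculation above can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis code: the function names and input layout are hypothetical, and the log-normal parameters are obtained here from the mean and standard deviation of the logged predictions, which is one simple way of fitting a log-normal that the text does not specify.

```python
import math
import statistics


def total_child_contacts(notifications, prop_pulmonary, mean_children):
    """Sum over index-case (age group, sex) strata of
    notifications x proportion pulmonary x mean cohabiting children."""
    return sum(
        notifications[k] * prop_pulmonary[k] * mean_children[k]
        for k in notifications
    )


def lognormal_parameters(samples):
    """Log-scale mean and standard deviation of positive samples
    (moment matching on the log scale: an assumption)."""
    logs = [math.log(x) for x in samples]
    return statistics.mean(logs), statistics.stdev(logs)


# Illustrative values only, for a single prediction in one country.
notifications = {("25-34", "M"): 100, ("25-34", "F"): 50}
prop_pulmonary = {("25-34", "M"): 0.8, ("25-34", "F"): 0.9}
mean_children = {("25-34", "M"): 1.5, ("25-34", "F"): 2.0}
contacts = total_child_contacts(notifications, prop_pulmonary, mean_children)
```

Repeating the first function over all predictions for a country gives the sample passed to the second.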

Prevalence and incidence in household contacts
The prevalences of active TB and latent TB infection (LTBI) in child household contacts of TB patients were based on the meta-analysis of Fox et al., 1 sampling separately for children in age groups [0,5) and [5,15) years, using the data for low and middle-income countries and high-income countries separately. Co-prevalent children were excluded from LTBI and the resulting number of infections taken as the at risk group for progression to incident TB disease within one year. This model of progression to disease was that used in Dodd et al. 2 Since progression risks following infection in children have been based predominantly on TST status, 3 we used estimates of infection risk determined by TST to estimate the number at risk of progressing to disease.
The prevalence of disease is represented by the decision tree probability 'coprev' in Figure 10, which is a function of age and country income group, and depends on parameters 'coprev04', 'coprev514', 'coprev04hi' & 'coprev514hi' in Table 4 below. The prevalence of infection (LTBI) is represented by the decision tree probability 'ltbi.prev', which is a function of age and country income group, and depends on parameters 'LTBI04', 'LTBI514', 'LTBI04hi' & 'LTBI514hi' in Table 4 below.

Case detection
The overall case detection ratio (CDR) for children age [0,5) and [5,15) for each country can be calculated as the ratio of notifications to estimated incidence reported by WHO. We estimated country-specific beta distribution parameters to model this parameter in each age group from the WHO data. However, we are interested in children cohabiting with notified TB cases, for whom the CDR might reasonably be expected to be higher. At one extreme, it could be the case that every child notified with TB is from a household with a notified adult TB patient. Previous modelling work for high-burden countries 2 suggested around 70% of TB incidence in children might occur in households with adult TB cases, implying the upper-bound CDR for cohabiting children would be higher by a factor of 1/0.7 ≈ 1.4 (corresponding to assuming all notifications were from households with an index case [numerator], but only 70% of total child incidence occurs in this group [denominator]). To account for this, we therefore scaled the mean of each beta distribution by a uniform variable lying between 1 and 2 (i.e. with mean 1.5), truncating the mean at 1.
This probability, which depends on age and country, is denoted 'CDR' ('CDRp' is identical in this work) in the decision tree model shown in Figure 10.
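
The scaling of the CDR can be sketched in Python as below. This is an illustration only: the text specifies that the mean of the beta distribution is multiplied by a uniform variable on [1, 2] and truncated at 1, but not how the rest of the distribution is handled. Here we assume that the sum of the two beta parameters is held fixed when the mean is shifted, and that a truncated mean of 1 gives a CDR of exactly 1; the function names are hypothetical.

```python
import random


def scaled_mean(mean, factor):
    """Scale a CDR mean by `factor`, truncating at 1."""
    return min(1.0, mean * factor)


def sample_household_cdr(alpha, beta, rng):
    """Draw one household-contact CDR from a beta distribution whose mean has
    been scaled by a Uniform(1, 2) variable.

    Assumption: alpha + beta is kept fixed when the mean is shifted.
    """
    concentration = alpha + beta
    mean = alpha / concentration
    new_mean = scaled_mean(mean, rng.uniform(1.0, 2.0))
    if new_mean >= 1.0:
        return 1.0  # degenerate case: every case is detected
    return rng.betavariate(new_mean * concentration, (1.0 - new_mean) * concentration)


# Illustrative parameters only: a country/age-group CDR with mean 0.4.
rng = random.Random(1)
draws = [sample_household_cdr(4.0, 6.0, rng) for _ in range(1000)]
```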

HIV
The prevalence of HIV among notified TB cases and the coverage of ART in TB/HIV cases were determined for each country using the WHO notification data (the denominator being those with test results). There are limited data to inform the detailed relationship between the HIV/ART status of adult TB cases and that of cohabiting children. We used data from Martinez et al. 4 to parametrize the probability of a child of age [0,5) or [5,15) years cohabiting with a notified HIV-positive TB case being HIV-positive. We assumed the same ART coverage among HIV-positive cohabiting children as among HIV-positive adult TB notifications. We modelled the impact of HIV/ART status on individual TB risk using the rate ratios in Dodd et al. 5 HIV prevalence in contacts enters the decision tree model in Figure 10 via the HIV & ART prevalence in the entry cohort (at the root compartment). The parameters governing household child HIV prevalence are 'HHhivprev04' & 'HHhivprev514' in Table 4 below.
To bound the potential for household contact management (HHCM), we considered three idealized interventions with perfect coverage:
A. a base case where no HHCM occurs;
B. HHCM following WHO guidelines with complete coverage: all prevalent TB in children is found, and all children under 5 years and all HIV-positive children under 15 years are given preventive therapy (PT);
C. HHCM as in B, but additionally giving PT to all tuberculin skin test (TST) positive children age [5,15) years.
In the decision tree model in Figure 10, these interventions define the age and HIV-status dependent preventive therapy coverages 'PTcovN' & 'PTcovP' for LTBI test-negative and positive, respectively.
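
The preventive therapy coverage implied by the three interventions can be written as a simple rule. The Python sketch below is our own illustration (the function name and argument layout are hypothetical, not taken from the model code); it returns a coverage of 0 or 1 because the interventions are idealized, and assumes the child is younger than 15 years.

```python
def pt_coverage(intervention, age, hiv_positive, tst_positive):
    """Preventive therapy coverage (0 or 1) for a child household contact.

    intervention: 'A' (no HHCM), 'B' (WHO guidelines) or
                  'C' (B plus PT for TST-positive children age [5,15)).
    age: age in years, assumed to be in [0, 15).
    """
    if intervention not in ("A", "B", "C"):
        raise ValueError("intervention must be 'A', 'B' or 'C'")
    if not 0 <= age < 15:
        raise ValueError("age must be in [0, 15)")
    if intervention == "A":
        return 0.0
    if age < 5 or hiv_positive:
        return 1.0  # eligible under both B and C
    if intervention == "C" and tst_positive:
        return 1.0  # extra group covered only under C
    return 0.0
```

Evaluating this rule with tst_positive set to False and True corresponds to the coverages 'PTcovN' and 'PTcovP' respectively.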
For each intervention, we calculated the number of households visited, the number of children screened for TB, the number of children identified with co-prevalent TB, the number of anti-tuberculosis treatments dispensed, the number of PT courses dispensed, the number of children developing incident TB, the number of deaths due to TB, and the expected number of life-years lived by children cohabiting with notified TB cases. We also calculated incremental measures of effort and effect between interventions B and C and the base case A.
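
As an illustration of the incremental measures, the following Python sketch expresses the additional effort of an intervention relative to the base case per death averted. The function name, dictionary keys and numbers are hypothetical and are not model outputs.

```python
def effort_per_death_averted(intervention, base, effort_keys):
    """Incremental effort (intervention minus base) divided by deaths averted
    (base minus intervention), for each effort measure in `effort_keys`."""
    deaths_averted = base["deaths"] - intervention["deaths"]
    if deaths_averted <= 0:
        raise ValueError("intervention averts no deaths relative to base")
    return {
        key: (intervention[key] - base[key]) / deaths_averted
        for key in effort_keys
    }


# Made-up totals for a base case (A) and an intervention (B).
base_case = {"households": 0, "pt_courses": 0, "deaths": 100}
hhcm = {"households": 480, "pt_courses": 450, "deaths": 90}
ratios = effort_per_death_averted(hhcm, base_case, ["households", "pt_courses"])
```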

Model parameters, distributions and evidence
The parameters in Table 2 pertain to the incidence model described in detail in Dodd et al 2014 2 and Dodd et al 2017 6 and their Appendices. The parameters in Table 3 pertain to the mortality model described in detail in Dodd et al 2017 6 . The parameters in Table 4 are specific to modelling HHCM. See above for discussion of parameters. Figure 11 shows the probability density functions for all input parameters. Individual graphs of all distributions and a table of distribution means and quartiles are available at https://github.com/petedodd/PINT/test

Probabilistic sensitivity analysis
The distributions characterising the model input parameters were sampled to generate 1,000 parameter sets. This dataset was replicated for each intervention, and the parameters representing interventions were modified before the copies were combined. The combined dataset was then merged with a dataset of countries (including parameters characterising the mean and variance of the number of child household TB contacts for each country and in each age group). Model outputs for this joint dataset were calculated using the outcome functions determined by the HEdtree package, and summaries of the results were produced by intervention, and by country, region and globally.
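
The overall structure of the probabilistic sensitivity analysis can be sketched in Python as follows. This is a simplified illustration rather than the analysis code: all function and variable names are hypothetical, the outcome function is a toy stand-in for the decision tree outputs, and the 75% uncertainty interval is taken as the 12.5th and 87.5th percentiles of the sampled totals, which is our assumption about how the intervals are defined.

```python
import random
import statistics


def summarise(values):
    """Mean and 75% interval (12.5th to 87.5th percentile) of a sample."""
    cuts = statistics.quantiles(values, n=8)  # cut points at 1/8, ..., 7/8
    return {"mean": statistics.mean(values), "lo": cuts[0], "hi": cuts[-1]}


def run_psa(n_samples, countries, interventions, outcome, sample_params, seed=0):
    """Evaluate `outcome` for every parameter set, country and intervention,
    then summarise the totals over countries for each intervention."""
    rng = random.Random(seed)
    param_sets = [sample_params(rng) for _ in range(n_samples)]
    results = {}
    for intervention in interventions:
        # The same parameter sets are reused for every intervention.
        totals = [
            sum(outcome(params, country, intervention) for country in countries)
            for params in param_sets
        ]
        results[intervention] = summarise(totals)
    return results


# Toy inputs (made up): child contacts per country and a single risk parameter.
toy_contacts = {"X": 1000.0, "Y": 500.0}
toy_effect = {"A": 0.0, "B": 0.5}


def toy_sample_params(rng):
    return {"risk": rng.uniform(0.1, 0.2)}


def toy_outcome(params, country, intervention):
    return toy_contacts[country] * params["risk"] * (1.0 - toy_effect[intervention])


toy_results = run_psa(200, toy_contacts, ["A", "B"], toy_outcome, toy_sample_params, seed=1)
```

Reusing the same parameter sets across interventions means that differences between interventions reflect the interventions themselves and not sampling noise.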

Supplementary Figure
The regional and country share of deaths preventable by HHCM is shown in Figure 12.