Disease progression and mortality with untreated HIV infection: evidence synthesis of HIV seroconverter cohorts, antiretroviral treatment clinical cohorts and population‐based survey data

Abstract Introduction Model‐based estimates of key HIV indicators depend on past epidemic trends that are derived based on assumptions about HIV disease progression and mortality in the absence of antiretroviral treatment (ART). Population‐based HIV Impact Assessment (PHIA) household surveys conducted between 2015 and 2018 found substantial numbers of respondents living with untreated HIV infection. CD4 cell counts measured in these individuals provide novel information to estimate HIV disease progression and mortality rates off ART. Methods We used Bayesian multi‐parameter evidence synthesis to combine data on (1) cross‐sectional CD4 cell counts among untreated adults living with HIV from 10 PHIA surveys, (2) survival after HIV seroconversion in East African seroconverter cohorts, (3) post‐seroconversion CD4 counts and (4) mortality rates by CD4 count predominantly from European, North American and Australian seroconverter cohorts. We used incremental mixture importance sampling to estimate HIV natural history and ART uptake parameters used in the Spectrum software. We validated modelled trends in CD4 count at ART initiation against ART initiator cohorts in sub‐Saharan Africa. Results Median untreated HIV survival decreased with increasing age at seroconversion, from 12.5 years [95% credible interval (CrI): 12.1–12.7] at ages 15–24 to 7.2 years (95% CrI: 7.1–7.7) at ages 45–54. Older age was associated with lower initial CD4 counts, faster CD4 count decline and higher HIV‐related mortality rates. Our estimates suggested a weaker association between ART uptake and HIV‐related mortality rates than previously assumed in Spectrum. Modelled CD4 counts in untreated people living with HIV matched recent household survey data well, though some intercountry variation in frequencies of CD4 counts above 500 cells/mm3 was not explained. Trends in CD4 counts at ART initiation were comparable to data from ART initiator cohorts. 
An alternate model that stratified progression and mortality rates by sex did not improve model fit appreciably. Conclusions Synthesis of multiple data sources results in similar overall survival as previous Spectrum parameter assumptions but implies more rapid progression and longer survival in lower CD4 categories. New natural history parameter values improve consistency of model estimates with recent cross‐sectional CD4 data and trends in CD4 counts at ART initiation.


I N T R O D U C T I O N
HIV epidemic models, including the Joint United Nations Programme on HIV/AIDS (UNAIDS)-supported Spectrum model [1], use data on HIV prevalence, HIV diagnoses or HIV-related deaths to estimate HIV incidence trends [2][3][4][5] and extrapolate estimates of HIV burden. Relationships between HIV incidence, prevalence and mortality are governed by natural history parameters of HIV infection, consisting of HIV disease progression and mortality rates among untreated individuals. The assumptions of the Spectrum model are regularly updated to make the best possible use of new data. Previous disease progression and mortality rates implemented in Spectrum were estimated using data from observational cohort studies conducted before antiretroviral therapy (ART) was widely available [6]. Survival times after HIV seroconversion observed in East African seroconverter cohorts were a key data source for this estimation [7,8]. These data inform average survival times but may not uniquely identify specific parameters, since similar survival times can result from faster disease progression and lower mortality rates or vice-versa. Evidence from European and African seroconverter cohorts suggested that CD4 counts at seroconversion may be lower, and disease progression faster, than currently assumed in Spectrum [9][10][11].
Population-based HIV Impact Assessment (PHIA) surveys [12][13][14][15][16][17][18][19][20][21][22] conducted in 10 sub-Saharan African countries between 2015 and 2018 measured CD4 counts among nationally representative samples of untreated HIV-positive adults. These data may improve our understanding of HIV natural history. Unlike data from the pre-ART era, CD4 counts observed in untreated PHIA respondents are affected by ART uptake, which may informatively censor people who seek ART due to symptomatic HIV infection. Nevertheless, these data could help identify HIV disease progression and mortality rates among people living with HIV (PLHIV) who are not on ART and ensure that model-based estimates reflect empirical CD4 distributions among untreated adults and at treatment initiation.

M E T H O D S
We conducted a Bayesian evidence synthesis [23][24][25] to update HIV natural history inputs to Spectrum. We included cross-sectional CD4 counts from ten PHIAs alongside previously analysed HIV progression and survival data. We distinguished "training data" included in our synthesis from data withheld for validation. We developed natural history equations to specify inputs for Spectrum, then evaluated the likelihood of training data by direct comparison to these inputs when possible. When direct comparison was not possible, we used models derived from Spectrum to calculate comparators. We describe these components below and in detail in Appendix S1.

Spectrum natural history and ART dynamics
Spectrum and the EPP-ASM sub-model used to estimate HIV incidence [2,26] share common demographic and epidemiologic structure and inputs [27]. Adult HIV disease progression is modelled as seven infection stages defined by CD4 cell count thresholds (CD4 >500, 350 to 500, 250 to 349, 200 to 249, 100 to 199, 50 to 99 and <50 cells/mm3; Figure 1) that reflect historic ART initiation criteria and other clinically relevant values [27]. in this supplement) and resume pre-treatment disease progression and mortality rates while off ART. Treatment initiation rates are determined by country-specific input data about numbers of adult ART patients by sex and year. Spectrum combines two approaches to allocate new ART patients among treatment-eligible infection stages (per contemporary national guidelines): treatment initiation proportional to numbers eligible, or proportional to expected HIV-related deaths in each stage, representing that those at higher mortality risk are more likely to have AIDS symptoms and seek care. Spectrum assigns a weight $\omega$ to expected deaths and $(1 - \omega)$ to numbers eligible to average these approaches. Country analysts can modify $\omega$, but since this parameter is not observable, most assume the default $\omega = 0.5$ to give equal weight to both approaches.
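The weighted allocation described above can be sketched as follows. This is a minimal illustration, not Spectrum's implementation: the function name, the three-stage example and all numbers are hypothetical, and we assume the weight is applied to each stage's share of expected deaths and share of numbers eligible.

```python
def allocate_art(new_patients, eligible, mortality, weight=0.5):
    """Split new ART patients across treatment-eligible CD4 stages.

    eligible  : untreated, treatment-eligible people in each stage
    mortality : HIV-related mortality rate in each stage (per person-year)
    weight    : share of the allocation driven by expected deaths;
                (1 - weight) is driven by numbers eligible
    """
    expected_deaths = [n * m for n, m in zip(eligible, mortality)]
    total_eligible = sum(eligible)
    total_deaths = sum(expected_deaths)
    shares = [
        weight * d / total_deaths + (1.0 - weight) * n / total_eligible
        for n, d in zip(eligible, expected_deaths)
    ]
    return [new_patients * s for s in shares]


# Hypothetical inputs for three eligible stages, highest CD4 stage first.
eligible = [6000.0, 3000.0, 1000.0]
mortality = [0.02, 0.10, 0.50]
for w in (0.0, 0.5, 1.0):
    print(w, [round(x) for x in allocate_art(1000.0, eligible, mortality, w)])
```

With a weight of 0 the allocation simply mirrors the numbers eligible; as the weight approaches 1, initiations shift towards the stages with the most expected deaths.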

Natural history model parameterization
We specified parametric models for the distribution of CD4 count following seroconversion, CD4 count decline rates and HIV-related mortality by CD4 category, resulting in 11 parameters that determine the 80 natural history inputs in Figure 1. Parameters varied by four age groups a = 0, 1, 2 or 3, denoting 15-24, 25-34, 35-44 or 45+, respectively.

Initial CD4 cell counts
We modelled initial CD4 counts using a log-logistic distribution with age-specific median $m_a$ and shape $\alpha_1$. We specified the median directly for ages 15-24 ($m_0 = \alpha_2$) then assumed medians decreased linearly with increasing age ($\alpha_3$). The proportion of seroconverters with initial CD4 counts of $c$ or less is as follows:
$$P_a(c) = \frac{1}{1 + (c / m_a)^{-\alpha_1}} \qquad (1)$$
We derived the Spectrum initial CD4 distribution inputs, indexed by infection stage $h$ and age group $a$, directly from $P_a$.
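As an illustration of how stage proportions follow from a log-logistic distribution, the sketch below evaluates the cumulative distribution at the Spectrum CD4 thresholds and takes differences. The function names are ours, the stage boundaries are treated as continuous, and the shape value in the example is hypothetical; only the median of 579 cells/mm3 is taken from the results reported for ages 15-24.

```python
def loglogistic_cdf(c, median, shape):
    """Proportion with initial CD4 count <= c under a log-logistic distribution."""
    if c <= 0:
        return 0.0
    return 1.0 / (1.0 + (c / median) ** (-shape))


# Lower CD4 bounds of the seven Spectrum infection stages, highest stage first.
STAGE_LOWER = [500, 350, 250, 200, 100, 50, 0]


def initial_stage_distribution(median, shape):
    """Proportion of seroconverters starting in each infection stage."""
    uppers = [float("inf")] + STAGE_LOWER[:-1]
    dist = []
    for lo, hi in zip(STAGE_LOWER, uppers):
        upper_p = 1.0 if hi == float("inf") else loglogistic_cdf(hi, median, shape)
        dist.append(upper_p - loglogistic_cdf(lo, median, shape))
    return dist


# Median from the reported estimate for ages 15-24; shape is a made-up value.
print([round(p, 3) for p in initial_stage_distribution(579.0, 3.0)])
```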

CD4 decline after seroconversion
We modelled CD4 counts as decreasing continuously over years $t$ since seroconversion according to polynomial curves. We specified these curves $c_a(t)$ in terms of a baseline CD4 count $c_a(0)$, an age-specific "depletion time" $\tau_a$ when CD4 counts reach zero cells/mm3 and shape parameter $\beta_1$. We specified depletion times relative to ages 15-24 ($\tau_0 = \beta_2$) that decreased linearly at older ages ($\beta_3$).
We defined baseline CD4 counts $c_a(0)$ as the median among seroconverters with initial CD4 counts over 500 cells/mm3 (Equation 1). We calculated progression rates for each Spectrum infection stage $h$ and age group $a$ as the reciprocal of the time between crossing the upper and lower CD4 count thresholds of each stage.
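The reciprocal-of-transit-time calculation can be sketched as below. The decline curve used here, c(t) = baseline * (1 - (t / depletion_time) ** shape), is a hypothetical stand-in chosen because it is easy to invert; it should not be read as the curve fitted in this analysis, and the example values are illustrative.

```python
def time_to_reach(cd4, baseline, depletion_time, shape):
    """Years until CD4 falls to `cd4` under the hypothetical curve
    c(t) = baseline * (1 - (t / depletion_time) ** shape)."""
    return depletion_time * (1.0 - cd4 / baseline) ** (1.0 / shape)


def progression_rates(baseline, depletion_time, shape,
                      thresholds=(500, 350, 250, 200, 100, 50)):
    """Rate of leaving each stage = 1 / (time spent between its thresholds).

    The first stage is entered at the baseline count (time zero); the last
    stage (<50 cells/mm3) has no onward progression, so six rates result.
    """
    rates = []
    entry_time = 0.0
    for lower in thresholds:
        exit_time = time_to_reach(lower, baseline, depletion_time, shape)
        rates.append(1.0 / (exit_time - entry_time))
        entry_time = exit_time
    return rates


# Hypothetical inputs: baseline 700 cells/mm3, depletion after 23 years, linear decline.
print([round(r, 3) for r in progression_rates(700.0, 23.0, 1.0)])
```

The baseline must exceed the highest threshold (500 cells/mm3) for the first transit time to be positive.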

Mortality by CD4 cell count
We modelled HIV-related mortality rates $\mu_a(c)$ that increase rapidly as CD4 counts $c$ decline [5].
The shape parameter $\gamma_1$ controls how rapidly mortality increases as CD4 counts fall. Other parameters specify mortality rates among 15- to 24-year-olds with zero CD4 cells/mm3 ($\gamma_2$), a linear age effect ($\gamma_3$) and a mortality rate ratio ($\gamma_4$) that adjusts for potential bias in all-cause mortality training data. We derived mortality rates for each Spectrum infection stage $h$ and age group $a$ as the average rate across CD4 counts in each stage.

Treatment initiation
We assumed a non-informative Uniform(0, 1) prior on Spectrum's treatment allocation weight.
Prior distributions on parameters are provided in Table 2 and described in Appendix S1.

Training data
Our synthesis incorporated four data sources that each inform one or more natural history components. We formulated the joint likelihood as the product of likelihoods of each data source.
To inform initial CD4 counts, we included data on CD4 counts measured within 1 year of seroconversion compiled from 13,176 seroconverters in Europe, North America, Australia and sub-Saharan Africa between 1982 and 2012 [9]. Data consisted of seroconverter counts tabulated by sex, CD4 category and 5-year age group at seroconversion. We assumed that CD4 category frequencies were multinomially distributed conditional on age.
To inform HIV-related mortality rates by CD4 category, we included all-cause mortality data from the Concerted Action on SeroConversion to AIDS and Death in Europe (CASCADE) Collaboration [28] consisting of 12,679 person-years of follow-up and deaths observed in 1997-2004 among untreated HIV seroconverters by age and infection stage. Seroconversion date was identified from negative and first positive HIV tests done at most 3 years apart. We extracted data from the published report, which excluded people exposed to HIV via injection drug use. We assumed that observed deaths were Poisson-distributed conditional on person-years of follow-up and all-cause mortality rates, calculated as unadjusted HIV-related mortality (i.e. with the mortality rate ratio set to 1) plus background mortality. We used age-specific central mortality rate estimates for 1985-1990 Europe [29] weighted by sex (14% female) to approximate background mortality.
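A Poisson likelihood of this kind can be written in a few lines. The sketch below is illustrative only: the function name and the example cells (deaths, person-years and rates) are hypothetical and are not CASCADE data.

```python
import math


def poisson_loglik(deaths, person_years, hiv_rate, background_rate):
    """Log-likelihood of observed deaths in one age-by-stage cell.

    The expected count is person-years times the all-cause rate, taken as
    HIV-related mortality plus background mortality.
    """
    expected = person_years * (hiv_rate + background_rate)
    return deaths * math.log(expected) - expected - math.lgamma(deaths + 1)


# Hypothetical cells: (deaths, person-years, HIV-related rate, background rate).
cells = [
    (3, 1500.0, 0.002, 0.001),
    (12, 400.0, 0.030, 0.001),
]
total = sum(poisson_loglik(*cell) for cell in cells)
print(round(total, 3))
```

Summing the cell-level log-likelihoods corresponds to treating the cells as conditionally independent given the rates.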
For overall HIV survival, we used individual-level data from four population-based cohort studies of HIV seroconverters from East Africa [7]. Individuals were followed during 1994-2004. Seroconversion dates were identified as the midpoint between last negative and first positive HIV tests done at most 4 years apart. Individuals exited at death or were censored at end of study or when last known alive. We included data from 1421 participants aged 15-59 at seroconversion. Overall survival depends on initial CD4 counts, disease progression, HIV-related mortality and underlying non-HIV mortality by age. Therefore, we calculated survival probabilities and mortality densities using a compartmental cohort model that implemented aging, background mortality and HIV natural history (Figure 1) as in Spectrum. Since most (70%) seroconverters were from Ugandan cohorts, we incorporated contemporary age- and sex-specific Ugandan background mortality rates [8,29].
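To show how survival follows from the three natural history components, here is a deliberately simplified staged cohort model. It uses Euler time steps, a constant background mortality rate and no aging, so it is a sketch of the idea rather than the model used in this analysis; all names and example inputs are hypothetical.

```python
def survival_curve(init_dist, prog, hiv_mort, background=0.0, years=30, dt=0.01):
    """Yearly survival probabilities after seroconversion.

    init_dist : proportion starting in each stage (sums to 1)
    prog      : progression rate out of each stage except the last
    hiv_mort  : HIV-related mortality rate in each stage
    """
    state = list(init_dist)
    steps_per_year = round(1.0 / dt)
    curve = [sum(state)]  # survival at time zero
    for step in range(1, years * steps_per_year + 1):
        updated = state[:]
        for h, alive in enumerate(state):
            deaths = (hiv_mort[h] + background) * alive * dt
            moved = prog[h] * alive * dt if h < len(prog) else 0.0
            updated[h] -= deaths + moved
            if h < len(prog):
                updated[h + 1] += moved
        state = updated
        if step % steps_per_year == 0:
            curve.append(sum(state))
    return curve


def median_survival(curve):
    """First whole year at which survival is 50% or lower (None if never)."""
    for year, alive in enumerate(curve):
        if alive <= 0.5:
            return year
    return None


# Hypothetical three-stage example.
curve = survival_curve([0.7, 0.2, 0.1], [0.15, 0.25], [0.005, 0.05, 0.4], background=0.005)
print(median_survival(curve))
```

The example makes the identifiability problem concrete: different combinations of progression and mortality rates can yield similar survival curves.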
We included data from 10 Africa-based PHIAs conducted in 2015-2018 (Table 1). Data consisted of survey-based estimates of nationally representative CD4 category distributions in untreated PLHIV by 5-year age group (15-19 to 45-49). HIV-positive survey respondents were classified as on ART if they reported ART use or had detectable antiretroviral drugs in blood, and off ART otherwise. These data inform initial CD4 counts, disease progression and HIV-related mortality; they also inform ART uptake since some PLHIV may be "missing" because they started treatment. Since past HIV incidence, treatment scale-up and demographic dynamics also affect these data, we used EPP-ASM to calculate cross-sectional CD4 count distributions from our natural history inputs. Remaining EPP-ASM inputs were taken from national Spectrum files derived from the 2019 HIV estimates round [30]. We assumed that observed CD4 counts were multinomially distributed conditional on numbers of untreated HIV-positive survey respondents and modelled age-specific CD4 count distributions.
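The multinomial assumption used for CD4 category counts, and the product form of the joint likelihood, can be sketched as follows. The counts and modelled proportions are hypothetical, and the function name is ours.

```python
import math


def multinomial_loglik(counts, probs):
    """Log-likelihood of observed category counts given modelled proportions."""
    n = sum(counts)
    ll = math.lgamma(n + 1)
    for k, p in zip(counts, probs):
        ll -= math.lgamma(k + 1)
        if k > 0:
            ll += k * math.log(p)
    return ll


# Hypothetical counts of untreated respondents by CD4 category in two age groups,
# with modelled proportions for the same categories.
observed = [[40, 30, 20, 10], [25, 30, 25, 20]]
modelled = [[0.42, 0.28, 0.20, 0.10], [0.30, 0.30, 0.22, 0.18]]

# A product of independent likelihoods becomes a sum on the log scale.
joint = sum(multinomial_loglik(c, p) for c, p in zip(observed, modelled))
print(round(joint, 3))
```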

Validation data
We compared estimated CD4 counts at ART initiation, also calculated using EPP-ASM, to data from ART initiators in IeDEA (International epidemiology Databases to Evaluate AIDS) collaborating clinical cohorts [31]. CD4 counts were measured within 3 months before ART initiation in previously ART-naïve patients. We compared EPP-ASM outputs to data from 471,220 ART initiators aged 16+ in seven countries (Côte d'Ivoire, Malawi, Nigeria, Tanzania, Uganda, Zambia and Zimbabwe), where PHIA data were also available. Since IeDEA sites may not be nationally representative, we aggregated ART initiators across countries. We weighted EPP-ASM estimates by numbers of ART initiators at sites in each country. We restricted this comparison to data from 2005 to 2014, as data were not consistently available from these countries outside this period.
To determine whether natural history input estimates were appropriate outside sub-Saharan Africa, we compared estimated survival times to published estimates in European, Australian and North American seroconverter cohorts [32]. These survival estimates included CASCADE cohort members and overlap with some of the training data about initial CD4 distribution and mortality by CD4 category.

Analyses
We used incremental mixture importance sampling (IMIS) [33] to sample posterior distributions on model parameters. We reported posterior mode point estimates and 95% central credible intervals about model parameters and outputs.
We used the IMIS sample with maximal posterior density to approximate the posterior mode. We evaluated goodness-of-fit to training data graphically. We compared training and validation data to posterior predictive distributions on model outputs. In sensitivity analysis, we considered an alternate model that incorporated sex differences in disease progression and mortality via a rate ratio parameter subject to a Gamma(2,1) prior.
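Given a posterior sample, the two summaries described above can be computed as in this sketch. It assumes an equally weighted sample of a single parameter with the log posterior density stored for each draw; it does not implement IMIS itself, and the example values are synthetic.

```python
def summarize_posterior(samples, log_posteriors):
    """Return (approximate mode, 95% central credible interval).

    The mode is approximated by the sampled value with the highest log
    posterior density; the interval uses the 2.5% and 97.5% sample quantiles
    with linear interpolation.
    """
    best = max(range(len(samples)), key=lambda i: log_posteriors[i])
    ordered = sorted(samples)

    def quantile(q):
        pos = q * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return samples[best], (quantile(0.025), quantile(0.975))


# Synthetic example: draws 0..100 with a log density peaked at 40.
draws = [float(x) for x in range(101)]
log_dens = [-(x - 40.0) ** 2 for x in draws]
mode, (lower, upper) = summarize_posterior(draws, log_dens)
print(mode, lower, upper)
```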

Ethics statement
Local ethics committees in each country and Institutional Review Boards at the US Centers for Disease Control and Prevention and Columbia University approved PHIA surveys. All participants provided written informed consent. Anonymized PHIA data were used for statistical analyses. Each programme participating in the IeDEA collaboration obtained ethical approval from relevant local institutions to collect and share patient data; in addition, each regional data centre obtained ethical approval to collate and analyse the de-identified data. All collaborating cohorts in CASCADE received approval from their regulatory or national ethics review boards (Appendix S3). Separate ethical approval was not required for this study for data from ALPHA Network or Rwanda Maternal Child and Health studies, as these were secondary analyses of fully anonymized existing datasets [7].

R E S U L T S
Parameter estimates
The median CD4 count at seroconversion for persons aged 15-24 was estimated to be 579 cells/mm3 [95% credible interval (CrI): 570-589], and CD4 counts in this age group were estimated to decline to zero cells/mm3 23 years (95% CrI: 21.5-24.4) after infection (Table 2). CD4 counts declined approximately linearly, but slopes decreased gradually as infection duration increased (Figure 2c). The ART allocation weight was 0.21 (95% CrI: 0.16-0.25), indicating more weight to numbers of people in each eligible infection stage and less weight to expected deaths in those stages, when calculating ART uptake. Table 3 reports corresponding Spectrum inputs; Appendix S2 compares our estimates to previous Spectrum assumptions.

Model fit to training data
Initial CD4 counts were available from 13,176 individuals. The log-logistic distribution fitted these data closely (Appendix S2). The estimated proportion with 200-349 CD4 cells/mm3 was slightly higher than observed and the proportion with 350-500 cells/mm3 slightly lower, but the log-logistic distribution provided a higher-likelihood fit than the alternate distributions considered (lognormal or gamma). CASCADE data on mortality by CD4 category consisted of 665 deaths during 12,679 person-years. Fitted unadjusted mortality rates were close to observed rates in most age groups and CD4 categories (Figure 3). The large credible intervals about the adjusted rates reflect substantial uncertainty in mortality rate and rate ratio estimates.
We included survival data for 1421 seroconverters from Tanzania

Model validation
Compared to published median survival times in European, North American and Australian seroconverter cohorts, adjusted survival time estimates after seroconversion agreed well at ages 15-34 (Figure 6). Fitted median survival for the adjusted inputs was shorter than observed survival in 35- to 54-year-old seroconverters from these regions, while median survival for the unadjusted inputs was longer than observed survival in 15- to 34-year-old seroconverters despite overlapping patient populations in the survival time and mortality rate studies [28,32]. We compared CD4 count distributions at ART initiation to fitted model trends (Figure 7). Cohort data were available from 471,220 ART-naïve patients in seven countries (65% from Zambia; Malawi, 10%; Uganda, 9%; Nigeria, 5%; Côte d'Ivoire, 4%; Zimbabwe, 4%; and Tanzania, 3%). Observed initiation at CD4 counts below 50 cells/mm3 declined from 24% in 2005 to 8% in 2014. The modelled proportion was comparable but declined more slowly, from 19% (95% CrI: 13-22%) to 14% (95% CrI: 11-16%). Modelled proportions initiating ART at 50-200 and 200-350 CD4 cells/mm3 were similar to observed trends. Spectrum imposes national treatment guidelines, which limited treatment to PLHIV with lower CD4 counts (or members of specific eligible populations like pregnant women or tuberculosis patients) before the World Health Organization raised recommended CD4 count thresholds for ART initiation in 2013 and 2015 [34,35]. Consequently, modelled proportions starting ART at CD4 counts above 350 cells/mm3 (<3%) were substantially lower than observed proportions (5-14%) before 2013.

Sensitivity analysis
Sex-disaggregated data were available from two training data sources (initial CD4 counts and survival after seroconversion).
We considered an alternate model that scaled progression and mortality rates for females by a common rate ratio relative to males. The posterior rate ratio estimate (0.99, 95% CrI: 0.94-1.02) suggested that sex differences were negligible (Table S1).

D I S C U S S I O N
Since 2015, the World Health Organization has recommended offering ART to adults living with HIV regardless of CD4 cell counts [35]. While the role of CD4 count in managing HIV patient care is diminished, it remains an important prognostic marker for modelling disease progression, AIDS-related death and care seeking. We used CD4 counts measured in PHIA household survey respondents living with HIV and among HIV seroconverter cohorts to revise the natural history parameters in Spectrum. These data, in conjunction with data collected on HIV survival before widespread ART availability, allowed us to estimate rates of disease progression and HIV-related mortality more precisely than with the latter data alone.
This revision was motivated by comparison of Spectrum outputs to PHIA data. Previous natural history assumptions in Spectrum [36] generally resulted in overestimating CD4 counts in untreated adults compared to new survey data. Our updated estimates suggest that newly infected people living with HIV are more likely to have lower initial CD4 counts, and progress through infection stages more rapidly, but experience lower HIV-related mortality rates in most stages, consistent with other analyses of CD4 progression in HIV seroconverter cohort studies [9][10][11].
Our natural history parameter estimates may increase HIV-related deaths in Spectrum. While our mortality rates are generally lower than previously assumed, mortality remains high in advanced infection stages, and people reach these later stages more quickly. This decreased survival times with untreated HIV slightly compared to previous Spectrum inputs. Due to faster progression, people living with HIV may also initiate ART at lower CD4 counts than previously estimated, which will likely increase modelled estimates of HIV-related deaths since immune status at ART initiation is a key determinant of mortality on ART [37].
Our synthesis included survey data from the current era of widespread ART availability, which posed important inferential challenges. Our parameter estimates depend on modelled ART uptake and retention dynamics [9] in Spectrum. This sensitivity is both a strength and a limitation. Sensitivity to ART uptake provided leverage to estimate an allocation parameter used to model ART initiation, which had little empirical basis previously. Sensitivity to modelled treatment interruption is a limitation, however. Country analysts may enter national treatment retention rates into Spectrum, but these data are often sparse and confounded by silent transfers between care facilities [37][38][39][40].
Our natural history parameter estimates produced CD4 count distributions that generally matched PHIA data well. However, our model did not explain variation in proportions with CD4 counts above 500 cells/mm3. Our model was equally likely to underestimate or overestimate proportions in this stage, which could reflect geographic heterogeneity in CD4 count distributions among HIV-negative adults [41] and in HIV progression, which may owe to environmental, host or viral factors, such as differences in HIV subtype [9,10,42-46]. We have not investigated these here because estimation of country-level effects risks overfitting.
Our survival estimates were comparable to previously published survival distributions in East African cohorts and cohorts in Europe, North America and Australia [7,32]. Previous analyses of longitudinal individual-level data found that CD4 counts at seroconversion were lower, and progression and mortality rates were higher in southeastern and eastern Asia compared to Europe and North America [9], consistent with relatively short survival times estimated in Thai cohorts [7,47] and comparative analyses of CD4 count trends between seroconverters in Beijing and the CASCADE Collaboration [48]. Regional variations in natural history assumptions have not been implemented in previous Spectrum versions or our analysis, but could be considered in future multi-parameter evidence synthesis exercises.
Spectrum previously assumed that disease progression and mortality rates did not vary by sex. We considered relaxing this assumption, but our estimated sex differences were negligible. This is consistent with previous analyses, which suggest that sex differences may be small [7,9,11,46] after adjusting for age and immune status. Nevertheless, we regard our finding with caution. Our study had limited power to identify sex differences because only two data sources were disaggregated by sex.
Our analysis has several limitations. Our synthesis used data from sub-Saharan Africa, Europe, North America and Australia, and may not generalize to other epidemic contexts. Our mortality rate estimates by CD4 category were about twice as large as observed among seroconverters in the CASCADE Collaboration [28]. This may indicate bias in the data on mortality by CD4 category, potentially due to better management of opportunistic infections, but we were unable to determine specific mechanisms for that apparent bias. We used national HIV epidemic models to calculate CD4 count distributions for comparison to survey and ART initiator data; thus, our natural history parameter estimates may be sensitive to other structural or input assumptions in those models.

C O N C L U S I O N S
Our update to natural history parameter estimates formally synthesizes information from multiple, important sources of information on HIV disease progression and mortality. New data on immune status of HIV-positive PHIA respondents provide the power necessary to identify specific parameters, adjust for potential biases and strengthen the empirical basis for the timing of ART initiation in Spectrum.

A C K N O W L E D G E M E N T S
We thank Jim Todd for sharing data on survival after HIV seroconversion and Nina Anderegg for contributing ART initiator cohort data. We appreciate the UNAIDS Reference Group on Estimates, Modelling and Projections for input on this work. We thank the PHIA participants for their contribution to this work, and the PHIA principal investigators for making survey data available.

D I S C L A I M E R
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the funding agencies.

C O M P E T I N G I N T E R E S T S
The authors declare that they have no competing interests.

A U T H O R S ' C O N T R I B U T I O N S
RG, JS and JWE conceptualized this study. SJ, NK, SB, LFJ, TM, NP and JWE contributed to data collection. RG, SJ, NK and JS analysed the data. RG, SGM, LFJ, TB, JS and JWE interpreted the results. RG wrote the first draft of the manuscript, which was revised by all authors. All authors have reviewed and approved the final version of the manuscript.

S U P P O R T I N G I N F O R M A T I O N
Additional information may be found under the Supporting Information tab for this article: Table S1. Model sensitivity to sex differences in progression and mortality.