PARAMETERISATION OF EXPECTED RESIDUAL LIFETIME AFTER SEROCONVERSION IN A UGANDAN SAMPLE POPULATION

The consequences of the HIV/AIDS epidemic in Africa are expected to be considerable. A great deal of what is presently known about the HIV/AIDS epidemic has originated from studies in Europe or the Americas, or from urban African contexts. In contrast, the majority of people in sub-Saharan Africa infected with the virus will live in rural communities. It is important that demographic information be obtained for this rural African environment. This contribution is aimed at investigation and verification of HIV-1 mortality patterns in a rural African context. The authors examine eight years of incident HIV mortality data arising from the Masaka district in Uganda. These data are bivariate and of a type termed doubly censored. The authors explore the non-parametric procedures required for estimation of self-consistent patterns using doubly censored data. Information on age and gender is also incorporated into the analyses. The results are parameterised using a range of survival probability models. The Gompertz probability model provides a reasonable emulation of eight years of empirical underlying hazard, over all age classes, and can be considered as a reasonable first approximation to the HIV/AIDS mortality model for the phenomenon and context of this study. The results also indicate that expected time to death at seroconversion is appreciably affected by age at seroconversion. Extrapolation of these mortality patterns into the future will, however, necessarily remain unsubstantiated until further data become available and the ability to generalise these findings to other contexts remains a subject worthy of further study.


INTRODUCTION
1.1 It has been estimated that over 25,3 million people in Africa have already become infected with the human immunodeficiency virus (HIV), most in sub-Saharan countries (UNAIDS, 2000). As HIV/AIDS spreads relentlessly within Africa, the societal impacts are being felt. For example, dramatic changes are occurring in the age distribution of populations as a result of AIDS-related mortality (Caraël et al, 1998). Unlike other diseases, HIV usually infects adults in the most productive years of their lives, and the economic and social influences of the epidemic are predicted to be dire. Apart from immense suffering for infected individuals and their survivors, e.g. orphans, the HIV/AIDS epidemic is expected to result in productivity losses, to exacerbate poverty and inequality and to strain already overburdened social welfare and health agencies within Africa. In order to take decisive steps to attempt to alleviate the impact of the HIV/AIDS epidemic and to make provision for its expected effect, we need to better understand the demographic consequences of HIV/AIDS. This present contribution is aimed at evaluating aspects of individual mortality patterns of the epidemic within a rural African context.
1.2 The AIDS epidemic is now over a decade old and research in the developed world has achieved some success in understanding the natural history of the virus and its epidemiological consequences in certain population subgroups. Long-term mortality patterns have been studied in many developed countries in cohorts of homosexual and bisexual men (Goedert et al, 1986, Moss et al, 1988, Munoz et al, 1989, Rutherford et al, 1990), transfusion patients (Ward et al, 1989), haemophiliacs (Jason et al, 1989, Kim et al, 1993, Darby et al, 1996), vertically infected (perinatal) patients (Auger et al, 1988) and parenteral drug users (Rezza et al, 1989). The influence of antiretroviral therapy on mortality is also beginning to surface. A decline in AIDS-related deaths and incidence in the USA and western Europe suggests the impact of new antiretroviral combination therapies. These therapies are thought to delay the progression to AIDS and prolong the survival of AIDS patients (Fleming et al, 1998, Hamers et al, 1998, Pezzotti et al, 1999).
1.3 Conversely, reliable HIV/AIDS data is inherently scarce from less-developed countries (LDCs), and especially so from Africa. It is clear that the HIV epidemic, even within Africa, consists of a multiplicity of smaller epidemics each with its own pattern of transmission routes and each with its own dynamics (Udjo, 1998). Researchers do not fully understand the factors underlying the differences in the dynamics and spread of HIV, but there is evidence that the rate of spread of HIV is determined by a complex interplay of behavioural and biological factors (Buvé & Rogers, 1998). Variation in treatment facilities, mode of HIV infection and behavioural and socio-economic characteristics of any studied population are likely to play an important role in influencing documented mortality patterns.
1.4 Where data on HIV mortality patterns are available from Africa, many of the studies have been conducted within an urban or hospital context (Mann et al, 1986, Nelson et al, 1987, De Cock et al, 1990, Lindan et al, 1992). This predominance continues despite the knowledge that most people in sub-Saharan Africa live in rural communities. The vast majority of people with HIV live in LDCs where access to curative treatment is limited or simply non-existent. There is thus an urgent need to describe more explicitly the patterns of expected AIDS-related mortality in a rural context.
1.5 That said, empirical studies on the mortality impact of the disease from African community studies are beginning to emerge (Caldwell, 1997, Nunn et al, 1997). Boerma et al (1998) documented a two- to threefold increase in total adult mortality levels in rural African communities with adult prevalence levels below 10%. Mortality amongst adults ranged between 5 and 11% per year, and more than half of all adult deaths could be attributed to HIV/AIDS. Early studies from Masaka, a rural population in south-west Uganda, have documented the seriously high prevalence levels of HIV amongst the adult population (Mulder et al, 1994). Later follow-up studies on the same sample population show that HIV-1 infection was associated with high death rates and a reduction in life expectancy (Nunn et al, 1997).
1.6 In this research, the authors focus on quantifying mortality patterns of HIV-positive individuals from a rural African population. They examine eight years of mortality data for incident HIV/AIDS individuals from the continuing Masaka study. Here, heterosexual transmission is the principal mode of HIV-1 infection. The incidence of tropical and other infectious diseases, such as tuberculosis, is high. The availability of prophylaxis is limited and antiretroviral therapy is non-existent. This research has two primary foci:
- The first is to explicitly model the probability of death from time of seroconversion in a rural African context. Often simpler statistics are utilised to describe the probability of dying (e.g. ${}_{45}q_{15}$: Murray et al, 1992; median term to death: Boerma et al, 1998). It is of considerable heuristic worth to examine a likely family of probability density functions that mirrors the empirical mortality curve well. It is also useful to attempt to assess the impact of gender and age on the underlying survivorship process. The non-parametric method of self-consistency (sensu Turnbull 1976) is utilised in estimating a mortality curve from doubly censored (see below) survival data.
- The resulting curves are then parameterised using a probability model that satisfactorily emulates the self-consistent mortality estimates. For this purpose, the Gompertz model is suggested as a useful initial approximation to the HIV/AIDS mortality phenomenon. The implications of model imputation and the need for further mortality data are discussed.
1.7 It is worthwhile justifying here the complex fitting of self-consistent estimates to the doubly censored survival data and the subsequent parameterisation of these self-consistent estimates, and not the raw data themselves, using a suitable probability model. The reasons for so doing are twofold. Firstly, the data is censored according to a complex mechanism. While it may be possible to represent censored mortality data plainly by, for example, assuming that seroconversion takes place at the mid-point of some interval, this simplification comes at the cost of estimation accuracy. The estimates from self-consistency patterns and those of simplified data are not likely to be the same.
Secondly, the study must be repeatable as more data becomes available. The same methodology of self-consistency needs to be applied, and the results in the future may well look different. It is sensible to explicate the methodology in this contribution for this very purpose.

2.1 In epidemiological studies of the human immunodeficiency virus (HIV) disease, interest is focused on the distribution of the length of time between two events. Figure 1 shows a schematic diagram of the natural progression and classified states of the HIV disease. The incubation (or induction) period commonly refers to the time period between infection with HIV (the initiating event) and the time of diagnosis of AIDS (the subsequent event). Conversely, the latency period refers to the time period between infection with the HIV virus and seroconversion. Individuals with HIV are thought to be infective from the onset of infection up until death. The level of infectivity is thought to be highest with seroconversion and the onset of symptoms. The diagnosis of AIDS corresponds to the development of certain specified medical conditions. It is common for other symptoms and infections to occur well before formal AIDS is diagnosed. In Figure 1 the stages of disease progression have not been drawn to scale (i.e. the latency period and the time from diagnosis of AIDS to death are likely to be very much shorter intervals than the time from seroconversion to diagnosis of AIDS). The latency period has been estimated to be as short as six weeks to three months depending on the specific assay used (Gurtler, 1996, Hashida et al, 1997). Studies and estimates of incubation period are plentiful (reviewed in Foulkes, 1998). In other examples, the focus has been on the interval between infection and death (often 'survival time' or 'term to death'). In this research, we focus on the time from seroconversion to the time of death (termed 'residual life post-seroconversion'). It should be noted that the survival time of an individual is a sum of the jointly distributed variables: latency period and residual life post-seroconversion. These variables are often assumed to be independent, within a given specific group, as a simplification that allows mathematical tractability, but which may demand further enquiry as more data emerges.
2.2 In the study of intervals between two events of interest, as is the case in many of the examples mentioned above, statistical estimation procedures are complicated by the fact that the initiating event and the subsequent event are not observed exactly. For example, in a study of a French population of haemophiliacs at risk of HIV infection from contaminated blood, infection was only known to have taken place between two points in time (infusions were periodic, and screening for HIV was retrospective from stored blood samples) (Kim et al, 1993). Univariate data of this type is said to arise from interval-censoring. In the study of Kim et al, as is the case with many HIV/AIDS studies, the subsequent event (e.g. development of AIDS) did not always materialise within the observational window. We say that the time to AIDS is censored on the right-hand side (Nelson, 1982). If all that is known is that the subsequent event (AIDS) has already occurred for an observation, we term the observation 'left-censored'. Porter et al (1999) illustrate the potential biases that ignoring right-censoring may introduce to the estimation of the incubation period.
2.3 When censoring impacts on both the initiating and subsequent event, we term the bivariate data 'doubly censored'. Jewell (1994) reviews doubly censored data within the context of HIV epidemiological studies and the circumstances in which it is expected to arise. Double censoring, as implied here, should not be confused with the univariate situation (also termed double censoring) in which a single event is observed within a window for some subjects and left- or right-censored for others (e.g. Samuelsen, 1989).
2.4 With censored data, the statistical problem is one of making inferences about a stochastic process for which the realisations are incomplete in time. A range of statistical methods have been developed to deal with different types of censoring (see Foulkes, 1998). In the authors' Masaka mortality data set, the time of individual seroconversion is known only as the interval between two points in time. Furthermore, not all individuals in the study had died by the time that the present study window had terminated. The bivariate data is thus clearly doubly censored. The authors review and implement a non-parametric model that may be used to derive a parsimonious estimate for the probability of death from time of seroconversion from this doubly censored data.

SAMPLE DESCRIPTION
3.1 Data on survivorship was gleaned from HIV/AIDS incident cases from the rural district of Masaka in south-west Uganda (see Nunn et al, 1997 for a description of the sample population). Between 1990 and 1998, 154 individuals were identified as seroconverting. These cases constitute incident enrolment data in the Ugandan AIDS natural history cohort study (NHC) (Morgan et al, 1997, Morgan et al, 1999). The age and gender of each incident case was recorded at time of seroconversion, and mortality status (i.e. alive or dead) for each individual tracked (in the current data set) for up to eight years after seroconversion. Serostatus was assessed by periodic screening. All incident subjects had an HIV negative test followed by an HIV positive test result. The period in which an individual seroconverted was limited to a window of two years or less in all cases. The authors assume an upper bound to the window of two years in the resulting analysis since data was not available to them on precisely which individuals were diagnosed as HIV positive in a period shorter than this two-year time frame. It has been estimated that some 69% of deaths in the Masaka population can be directly attributed to HIV/AIDS (Boerma et al, 1998). (In that study the author investigated a "population attributable fraction" defined as the ratio of excess mortality to total mortality.)
3.2 Individuals were initially categorised into three age groups: 13 to 24 years old, 25 to 44 years old and 45 years and older at time of seroconversion. Preliminary statistical analysis revealed that gender has no statistically significant effect on survival (Log-rank test: P = 0,42). However, Cox's proportional hazards method applied to the three age groups showed a highly significant effect of age on survival (P = 0,0003). The relative risk was computed for each age group. With the under 25s as a baseline reference group, the middle age group has 1,6 times the reference hazard of death and the oldest age group has 8,7 times the reference hazard of death. This hazard of death is greater than expected given information on the age group alone. No difference in the strength of the association was reported when survival was adjusted for gender by age group (Whitworth, personal observation). There is thus no statistically significant interaction between age group and gender known to the authors, nor accessible to review in the data format to which they currently have access.
3.3 At this stage of the ongoing collection of data from the Masaka study, it therefore appears that there are no statistically significant gender differences in mortality. This finding is consistent with what has been reported from industrialised countries (Melnick et al, 1994, Von Overbeck et al, 1994, Philips et al, 1994, Lepri et al, 1994). However, it should also be noted that, while the evidence here does not indicate significant differences between male and female mortality, several sources do point to the fact that females in some parts of Africa are seroconverting at earlier ages than males, and that the rise in mortality amongst Ugandan females specifically is higher than that evident in their male counterparts (Timaeus, 1998).
3.4 We focus on four groups (three age groups and a pooled group). A summary of the mortality data is presented in Table 1 for the pooled group, ignoring age, and for three subgroups categorised by different age classes. In that table, mortality profiles are tracked for eight years from time of first-observed seroconversion from 154 incident cases. The time of seroconversion has been assumed, for the sake of presentation only, to be the mid-point between the last negative and first positive HIV test assays. The percentage alive and the number of individuals dying at different times from seroconversion are presented for all ages, and for three age subgroups. The data was not available at this stage of analysis in any other format or with any additional augmenting information.

[…]), and let the complete set of possible observable values of T be denoted $t_1 < t_2 < \dots < t_s$ and let $f_k = \Pr(T = t_k)$. The admissible values may or may not be spaced at equal intervals. The mathematical problem is to estimate the unknown probabilities $w = (w_1, w_2, \dots, w_r)$ and $f = (f_1, f_2, \dots, f_s)$.

[…] (De Gruttola & Lagakos, 1989). The intervals between successive points on the axes need not be equal, nor equal in number. For the observation that C is interval-censored and D is known exactly, the admissible (c,t) region is a diagonal line. When C is interval-censored and D is right-censored, the admissible region is a parallelogram extending to infinity on the t axis. Otherwise we obtain a simple parallelogram. However, the continuous admissible sets are handled by a discretised framework using a discrete $(r \times s)$ reference grid for each $(C_L, C_R, D_L, D_R)$.

4.3 We define a matrix $\alpha^{i}$ of indicators of whether $(c_j, t_k)$ is an admissible value of an observation of $(C, T)$ for subject $i$:

$$\alpha^{i}_{jk} = \begin{cases} 1 & \text{if } (c_j, t_k) \text{ is admissible for subject } i, \\ 0 & \text{otherwise.} \end{cases}$$

The likelihood function for N subjects together with the non-parametric self-consistent solution is presented in Appendix A. The algorithm for deriving self-consistent estimates from doubly censored data has been coded in S-PLUS (Mathsoft, 1998). This code is available on request from the authors.
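The indicator matrix can be illustrated with a minimal Python sketch. It is not the authors' S-PLUS code. The function name `admissible_matrix`, the closed-interval convention, the encoding of right-censoring as an infinite upper bound and the toy grids are all assumptions made for illustration; the only element taken from the text is that each observation is summarised by $(C_L, C_R, D_L, D_R)$ and that T is the time from seroconversion C to death D.

```python
import math

def admissible_matrix(obs, c_grid, t_grid):
    """Return the r x s 0/1 matrix of admissible (c_j, t_k) cells for one subject.

    obs = (c_lo, c_hi, d_lo, d_hi): seroconversion lies in [c_lo, c_hi] and
    death in [d_lo, d_hi]; d_hi = math.inf encodes right-censoring (assumed
    convention). A cell is admissible when c_j falls in the seroconversion
    window and the implied death time c_j + t_k falls in the death window.
    """
    c_lo, c_hi, d_lo, d_hi = obs
    return [
        [1 if (c_lo <= c <= c_hi and d_lo <= c + t <= d_hi) else 0 for t in t_grid]
        for c in c_grid
    ]

# Hypothetical grids (in years) and two hypothetical subjects.
c_grid = [0, 1, 2]
t_grid = [1, 2, 3, 4]
died_exactly = (0, 1, 3, 3)          # seroconverted in [0, 1], died at year 3
still_alive = (1, 2, 4, math.inf)    # seroconverted in [1, 2], alive at year 4

for row in admissible_matrix(died_exactly, c_grid, t_grid):
    print(row)
```

Under these assumptions the subject with an exactly known death time yields admissible cells along a diagonal, and the right-censored subject yields a block that extends to the largest grid value of t, in line with the geometric description of the admissible regions.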

4.4 Self-consistent estimates $(\hat{w}, \hat{f})$ are estimated via a two-step process: first the marginal values of $\hat{w}$ are derived, and then the conditional values of $\hat{f} \mid \hat{w}$ computed. We iterate between these two steps until self-consistency is achieved. This formulation and procedure follows the workings of Kim et al (1993), and obviates the problem of non-identifiability where $(\hat{w}, \hat{f})$ are estimated simultaneously (see Section 5 below).
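To make the idea of self-consistency concrete, the following Python sketch iterates a simplified update in which each subject's unit of probability is shared over its admissible cells in proportion to the current $w_j f_k$. This is an illustration only: it updates w and f simultaneously (the variant that the text notes can suffer from non-identifiability), it assumes that C and T are independent, and it is neither the two-step procedure of Kim et al (1993) nor the authors' S-PLUS code. The function name, starting values and toy data are hypothetical.

```python
def self_consistent(alphas, r, s, tol=1e-10, max_iter=10000):
    """Iterate a simultaneous self-consistency update for (w, f).

    alphas: list of r x s 0/1 admissibility matrices, one per subject.
    Returns (w, f), the estimated probability masses on the c and t grids.
    """
    n = len(alphas)
    w = [1.0 / r] * r   # uniform starting values (assumption)
    f = [1.0 / s] * s
    for _ in range(max_iter):
        new_w = [0.0] * r
        new_f = [0.0] * s
        for a in alphas:
            # Mass that the current (w, f) assigns to this subject's admissible cells.
            total = sum(a[j][k] * w[j] * f[k] for j in range(r) for k in range(s))
            for j in range(r):
                for k in range(s):
                    share = a[j][k] * w[j] * f[k] / total
                    new_w[j] += share / n
                    new_f[k] += share / n
        change = max(abs(x - y) for x, y in zip(new_w + new_f, w + f))
        w, f = new_w, new_f
        if change < tol:
            break
    return w, f

# Hypothetical toy data: three subjects on a 2 x 2 grid.
alphas = [
    [[1, 0], [0, 0]],   # exactly observed: (c_1, t_1)
    [[0, 1], [0, 0]],   # exactly observed: (c_1, t_2)
    [[0, 1], [0, 1]],   # c unknown, t = t_2
]
w_hat, f_hat = self_consistent(alphas, r=2, s=2)
print(w_hat, f_hat)
```

Each update preserves total probability, so both estimated vectors sum to one at every iteration; convergence is declared when no element changes by more than `tol`.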
4.5 First, the self-consistent estimates $(\hat{w}, \hat{f})$ were computed for each of the four groups. We denote estimates for the pooled group as $\hat{f}_{all}$, and for the three different aged groups as $\hat{f}_{13-24}$, $\hat{f}_{25-44}$ and $\hat{f}_{45+}$. Results are presented in Table 2. Upper and lower 95% confidence bounds to the self-consistent estimates were constructed about the $\hat{f}$'s for each group. The methods for deriving the variance estimates for the abovementioned solutions are presented in Appendix A. The empirical survivorship curves $1 - F(t)$ are graphically contrasted with the self-consistent survivorship estimates $1 - \hat{F}(t)$ in Figure 3; 95% confidence bounds about $1 - \hat{F}(t)$ for each of the four age groups are illustrated by the vertical lines. The empirical survivorship curves have been taken, for convenience, from the mid-point between the last negative and the first positive HIV test assays. The self-consistent method of analysis fully subsumes the nature of the interval-censoring, however.
4.7 Secondly, we aim to parameterise $\hat{f}$ by fitting an explicit probability distribution model to the cumulative form of the self-consistent estimates $\hat{F}(t)$. Several survival-probability models (notably: exponential, Weibull, Gompertz and linear-hazard) were fitted to the self-consistent estimates $\hat{F}(t)$ using maximum-likelihood (MLE) methods, and then tested for goodness-of-fit. In this way, we use the MLE methods as if the self-consistent estimates are the raw uncensored data themselves.

[…] self-consistent estimates. Hazard rates, in general, tended to increase over time, albeit often with appreciable variation. Figure 4 shows empirical hazard-rate plots of the self-consistent estimates. The hazard rate $h(t)$ is the instantaneous probability of death over time […]. The non-continuity of points is due to the unavailability of estimable hazard in those years in which no individuals died.
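As an illustration of how yearly hazards can be read off a set of discrete probability masses, the sketch below uses the standard discrete-time definition: the probability of dying in year k divided by the probability of surviving to the start of year k. Whether this is the exact form plotted in Figure 4 is an assumption, as is the choice to return no value for years without deaths (which would, for example, leave gaps on a logarithmic scale). The input values are hypothetical and are not taken from Table 2.

```python
def discrete_hazard(f):
    """Yearly hazards h_k = f_k / (1 - F_{k-1}) from probability masses f_k.

    Years with no probability mass are returned as None, mirroring the gaps
    described for years in which no individuals died (assumed treatment).
    """
    hazards = []
    surviving = 1.0                      # 1 - F_{k-1}: still alive at start of year k
    for mass in f:
        hazards.append(mass / surviving if mass > 0 else None)
        surviving -= mass
    return hazards

# Hypothetical masses for years 1 to 4 after seroconversion.
example_f = [0.05, 0.0, 0.10, 0.15]
print(discrete_hazard(example_f))
```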
4.9 For the purposes of this research, we nominate the Gompertz model as it appears to best approximate the common underlying changing probability of death over time since seroconversion (hazard being an exponential function of time since seroconversion). It is worthwhile noting here that other probability models, notably the two-parameter Weibull model, can also be seen to mirror the patterns of hazard reasonably well (hazard being a power function of time since seroconversion in a Weibull context). The Gompertz model tends to provide a better goodness-of-fit to the empirical data than does the Weibull, although the magnitude of the differences is not statistically significant at the alpha = 0,05 level (personal observation). The important point to acknowledge is that different imputed models will yield considerably different mortality projections in later years (see Section 5).
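The divergence between an exponential (Gompertz) hazard and a power-function (Weibull-type) hazard outside the observed window can be sketched numerically. In the Python example below, a power-function hazard is forced through the Gompertz hazard at years 2 and 8 and both are then extrapolated to year 15. The Gompertz slope 0.254 is the value reported later in the paper for the youngest group; the intercept of -3.5, the matching years and the extrapolation year are hypothetical choices made only to illustrate the point, and no fitted Weibull parameters from the study are used.

```python
import math

def gompertz_hazard(t, lam, xi):
    """Gompertz hazard: exponential in time, exp(lam + xi * t)."""
    return math.exp(lam + xi * t)

def power_hazard_through(t1, h1, t2, h2):
    """Return a Weibull-type hazard a * t**b passing through (t1, h1) and (t2, h2)."""
    b = math.log(h2 / h1) / math.log(t2 / t1)
    a = h1 / t1 ** b
    return lambda t: a * t ** b

LAM, XI = -3.5, 0.254                    # LAM is a hypothetical intercept
h2 = gompertz_hazard(2.0, LAM, XI)
h8 = gompertz_hazard(8.0, LAM, XI)
weibull_type = power_hazard_through(2.0, h2, 8.0, h8)

# Identical at the matching years, increasingly different beyond them.
for year in (2, 8, 15):
    print(year, round(gompertz_hazard(year, LAM, XI), 3), round(weibull_type(year), 3))
```

Because the logarithm of the Gompertz hazard is linear in time while that of the power-function hazard is concave, two curves that agree over the observed years must separate on extrapolation, with the Gompertz hazard the larger of the two beyond the second matching point.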
4.10 The Gompertz model was parameterised for all four groups. The two-parameter Gompertz model invokes the probability density function (pdf) of a random variable T:

$$g(t) = \exp(\lambda + \xi t)\,\exp\!\left[-\frac{e^{\lambda}}{\xi}\left(e^{\xi t} - 1\right)\right], \quad t \ge 0. \qquad (1)$$

The cumulative distribution of T is G(t) where

$$G(t) = 1 - \exp\!\left[-\frac{e^{\lambda}}{\xi}\left(e^{\xi t} - 1\right)\right]. \qquad (2)$$

Hazard is an exponentially increasing function of t, i.e. $\exp(\lambda + \xi t)$. The mean of the Gompertz distribution is $A(e^{\lambda}/\xi)/\xi$, where

$$A(q) = e^{q} \int_{q}^{\infty} y^{-1} e^{-y}\,dy. \qquad (3)$$

The Gompertz model was fitted to each of $\hat{f}_{all}$, $\hat{f}_{13-24}$, $\hat{f}_{25-44}$ and $\hat{f}_{45+}$. The resulting parameterisations and standard errors about the estimated parameters are presented in Table 3 and the corresponding self-consistent estimates $\hat{F}(t)$ and parameterised Gompertz model $\hat{G}(t)$ portrayed graphically in Figure 5. In that figure upper and lower 95% confidence bounds about $\hat{F}(t)$ have been included. […] Table 4(b) the same summary measures, but now conditioned on the total expected lifetimes.

4.11 $\hat{F}(t)$ was tested against the imputed Gompertz model structures within the same age groupings for goodness-of-fit using the Kolmogorov-Smirnov two-sample test. The results are shown in Table 5. In that table $D_{max}$ corresponds to the maximum vertical distance between the two cumulative probability curves $\hat{F}(t)$ and $\hat{G}(t)$. The P-value relates to the null hypothesis that the two samples were drawn from the same distribution. None of the fitted models displays statistically significant lack-of-fit. However, it should be noted that this apparent fit is due only to the fact that we are contrasting so few observations (a maximum of eight and a minimum of five). Only a maximum difference $D_{max}$ of 0,75 would result in statistical significance (at P < 0,05) with a sample size of eight, and a $D_{max}$ of 0,90 would result in statistical significance (at P < 0,05) with a sample size of five. This fact makes rigorous alternative model selection difficult until further data becomes available. We also tested the observed and predicted numbers of individuals dying in each year after seroconversion with a chi-square test. Results once again showed that no statistically significant differences were apparent.

FIGURE 5. Self-consistent and Gompertz estimates of the cumulative probability of death (panels: all ages; ages 13 to 24; ages 25 to 44; ages 45 and over; horizontal axis: years post-seroconversion; vertical axis: cumulative probability of death)

4.12 Thirdly, we contrast our derived mortality estimates with several other proposed models and estimates. One noteworthy application of HIV mortality modelling in Southern Africa is within the healthcare, individual life and employee benefit insurance industries. Here, the model developed by the Actuarial Society of South Africa (ASSA) has been serving in many instances as a first approximation to the development of the epidemic and as a description of the expected trends in mortality of those adults contracting HIV. The ASSA600 model was proposed for the South African population at large, and not simply the historically disadvantaged and rural-dwelling black individuals of the population (Dorrington, 1999). However, the ASSA600 model is understood to encompass the mortality dynamics of such individuals as well, and therefore provides a useful benchmark for comparison with the empirical patterns obtained in this study.
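The small-sample behaviour of the Kolmogorov-Smirnov two-sample test noted in 4.11 can be examined with an exact calculation. The Python sketch below counts lattice paths to obtain the exact two-sided null probability that the maximum difference reaches k/n for two samples of equal size n without ties. Equal sample sizes, the absence of ties and the function name are assumptions made for illustration; this is not the authors' test code.

```python
from math import comb

def ks_exact_p(n, k):
    """Exact two-sided P(D_max >= k/n) under H0 for two untied samples of size n.

    Every ordering of the pooled sample is a lattice path from (0, 0) to (n, n);
    D_max < k/n exactly when the path keeps |i - j| < k at every point.
    """
    ways = [[0] * (n + 1) for _ in range(n + 1)]
    ways[0][0] = 1
    for i in range(n + 1):
        for j in range(n + 1):
            if (i == 0 and j == 0) or abs(i - j) >= k:
                continue                  # start cell, or cell outside the band
            from_left = ways[i - 1][j] if i > 0 else 0
            from_below = ways[i][j - 1] if j > 0 else 0
            ways[i][j] = from_left + from_below
    return 1.0 - ways[n][n] / comb(2 * n, n)

for n, k in ((8, 5), (8, 6), (5, 4), (5, 5)):
    print(n, k / n, ks_exact_p(n, k))
```

Under these assumptions a difference of 6/8 = 0,75 is the smallest that is significant at the 5% level with eight points per curve, while with five points per curve a difference of 4/5 is not significant and only complete separation of the two curves is, which is consistent with the thresholds quoted in 4.11.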

4.13 The ASSA600 mortality model is determined by one parameter, f, the median time to death of HIV-infected adults, which is set at f = 10. The proportion P(t) of individuals surviving for less than t years after infection is described by the cumulative distribution function: […] (4) where f = 10 with corresponding density p(t). The mean survival time under (4) is […].
4.14 We may not expect the mortality dynamics in Southern Africa to mirror those in rural Uganda for a number of reasons, although the socio-economic conditions are likely to be reasonably comparable between rural areas in both countries. Subtypes of HIV-1 are thought to differ between Southern Africa (mostly subtype C) and Uganda (subtypes A and D) (Janssens et al, 1997, Brennan et al, 1997, Van Harmelen, 1999). Recent work has suggested that different subtypes may have different rates of disease progression (Kanki et al, 1999) but this view is not well supported at the moment (Alaeus et al, 1999). It is of interest to examine how the shape and moment-estimates of the ASSA600 formulation differ from the mortality patterns estimated in this study. We assess the differences between $\hat{f}(t)$, $\hat{g}(t)$, and p(t).
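One of the moment estimates of a fitted Gompertz density, its mean, can be checked numerically. The Python sketch below assumes the Gompertz hazard $\exp(\lambda + \xi t)$ given in 4.10 and computes the mean in two ways: by integrating the survival function directly, and through the exponential-integral form $e^{q}\int_{q}^{\infty} y^{-1} e^{-y}\,dy / \xi$ with $q = e^{\lambda}/\xi$. The slope 0.254 is the value reported for the youngest group; the intercept -3.5, the integration limits and the use of Simpson's rule are hypothetical choices for illustration and do not reproduce Table 3 or Table 4.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson's rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def gompertz_survival(t, lam, xi):
    """Survival function implied by the hazard exp(lam + xi * t)."""
    return math.exp(-(math.exp(lam) / xi) * (math.exp(xi * t) - 1.0))

def mean_by_survival(lam, xi, t_max=100.0):
    """Mean residual lifetime as the integral of the survival function."""
    return simpson(lambda t: gompertz_survival(t, lam, xi), 0.0, t_max)

def mean_by_formula(lam, xi, y_max=60.0):
    """Mean via e^q * integral_q^inf exp(-y)/y dy, divided by xi."""
    q = math.exp(lam) / xi
    a_q = math.exp(q) * simpson(lambda y: math.exp(-y) / y, q, y_max)
    return a_q / xi

LAM, XI = -3.5, 0.254                    # LAM is a hypothetical intercept
print(mean_by_survival(LAM, XI), mean_by_formula(LAM, XI))
```

The two routes agree because the substitution $y = q e^{\xi t}$ turns the integral of the survival function into the exponential-integral form.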

4.15 The density and cumulative distributions of the ASSA model are contrasted graphically with those of the fitted models in Figure 6. The first graph in that figure represents the imputed probability density functions for the Gompertz $\hat{g}(t)$ and ASSA600 model formulation p(t), and the second represents the cumulative probability model structure $\hat{G}(t)$ and the ASSA600 mortality formulation P(t). In the latter, the horizontal line at p = 0,50 transects the fitted curves at their median values. This pictorial representation shows that the ASSA600 model may be too lenient initially in gauging mortality rates, and only too severe in the young age groups after year six from seroconversion. However, this inference is based on the assumption that mortality trends in rural Uganda are accurately representative of the South African population at large. This assumption warrants further scrutiny and enquiry.

4.16 Lastly, it is of interest to compare our estimates for mortality post-seroconversion with those arising from the Americas and Europe (mostly subtype B infection), in order to assess whether any qualitative differences exist in disease progression between Northern Hemisphere and Ugandan subtypes. This contrast must be qualitative as there are few published studies of survival-time estimates with supporting statistical rigour. The benchmark estimate of the mean combined incubation period and survival times by the United Nations Joint Program on HIV/AIDS (UNAIDS) is 10 years, whereas that of the US Census Bureau is approximately 7,5 years (Stover & Way, 1998). The results of this study suggest that the UNAIDS estimate of a mean of 10 years may be an overestimate when applied to the rural Ugandan context. The results also suggest that survival time may need to be stratified by age at infection for more reasonable approximation. However, the reason why AIDS mortality in rural Uganda tends to be slightly more severe than in America and Europe does not necessarily relate to HIV-1 subtype. Socio-economic differences are likely to play an appreciable role in affecting mortality dissimilarities as well.

DISCUSSION
5.1 De Gruttola & Lagakos (1989) first proposed a method to analyse survival data when both the time of the originating event and the subsequent event were either right-censored or interval-censored. They generalised Turnbull's (1976) self-consistency algorithm to accommodate doubly censored data. A weakness in the original algorithm is that it is occasionally difficult to fit as some starting values will converge to saddlepoints of the likelihood function instead of the maximum. Non-identifiability problems can also arise in smaller data sets. Subsequently, work has focused on rendering an alternative non-parametric estimate that is notationally simpler, computationally faster and free from problems of non-identifiability (e.g. Gómez & Lagakos, 1994).

FIGURE 6. Gompertz estimates compared with the ASSA600 model (vertical axis: cumulative probability of death)

5.2 Other semi-parametric and parametric methods exist for dealing with doubly censored data (reviewed in Jewell, 1994). For example, Brookmeyer and Goedert (1989) and Kim et al (1993) have proposed parametric approaches to include the influence of a covariate. Sun (1995) has proposed a method to deal with truncated and doubly interval-censored data. All methods for doubly censored data, while conventionally applied to estimation of the interval between infection and onset of AIDS, may be applied to several of the intervals depicted in Figure 1, including the interval between infection and death. The methodology adopted in this chapter is essentially a modified version of the algorithm proposed by Kim et al (1993). (See Appendix A.)
5.3 Studies of survival after HIV infection from North American and European haemophiliac and homosexual populations suggest a median survival time (from infection to death) of nine to eleven years (Hendricks et al, 1993, Veugelers et al, 1994, Schechter et al, 1995). Other studies suggest median survival times ranging from 6,5 to 16,1 years, with most estimates at nine to ten years (reviewed in Stover & Way, 1998). Caldwell (1997) […] 4). However, the results do highlight the obvious necessity of simultaneously considering more than one distributional summary measure in order to better depict and appreciate the underlying mortality patterns.
5.5 Greater age at infection has been associated with a shorter time to the development of AIDS in several industrialised countries (e.g. Rosenberg et al, 1994, Martin et al, 1995, Sabin et al, 1996). Darby et al (1996) documented the fact that age at infection has a substantial effect on survival (time from infection to death) in the UK population of haemophiliacs. In their study, 86% of patients who seroconverted before the age of 15 survived for 10 years after infection, compared with only 12% of patients who seroconverted at or after age 55. The mortality rates of uninfected haemophiliacs were also known in their study. Studies on the Masaka population have recorded the relatively low survival rates of older individuals after HIV infection (Nunn et al, 1997, Morgan et al, 1997, see Section 3 above). These conclusions are strongly supported in this paper (Figure 5).

5.6 One valuable contribution arising from the present study is the insight into the possible shape of the survivorship models, and into how the model parameterisations are likely to be affected by the covariate age. In the Masaka population, the tentative conclusion is that the instantaneous probability of death is likely to be a monotonically increasing function of time since seroconversion (exponential in all Gompertz models) rather than a constant hazard, as implied, for example, in the model of Blacker and Zaba (1997), where a simple accelerated exponential model for survival of HIV-infected persons was utilised. Even amongst young seroconverters, the hazard rate is non-constant for the first several years after seroconversion ($\hat{\xi}$ = 0,254). The hazard rate of young individuals is expected to increase further as the young group ages. Corroborating evidence should be sought from other regions as to whether the generalisations concerning the hazard rates observed in this investigation are applicable to other localities or to the natural history of HIV-1 within Africa.
5.7 Given that patterns for the phenomenon of HIV mortality are gleaned from a relatively short period (five to eight years after seroconversion), the models fitted here are necessarily uncertain when extrapolated into an unknown future. The estimates of HIV mortality produced by other probability models (e.g. Weibull, linear-hazard) may be similar to the self-consistent estimates, but the predicted rates of HIV mortality in subsequent years would differ from those of the Gompertz model. While the Gompertz formulation appears to be a model of choice for the moment, any attempt at forecasting beyond the observed data will remain a conceptual leap of faith, motivated by appeals to sensible extrapolation. Until further data emerge from the Masaka study, the strength of extrapolation will remain tenuous. Furthermore, until researchers understand more of the underlying regional covariates of mortality within Africa, and can confidently ascribe a common process to the phenomenon of HIV-1 mortality in Africa, the strength of extrapolation beyond the Masaka study will remain unsubstantiated.
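The contrast drawn in paragraphs 5.6 and 5.7 between a constant hazard and an exponentially increasing hazard can be illustrated with a minimal Python sketch. The parameterisation h(t) = a·exp(b·t), the function names and the numerical values of `A`, `B` and `RATE` are assumptions chosen for illustration only; they are not the fitted Masaka estimates.

```python
import math


def gompertz_hazard(t, a, b):
    """Gompertz hazard h(t) = a * exp(b * t), with a, b > 0 (assumed form)."""
    return a * math.exp(b * t)


def gompertz_survival(t, a, b):
    """Survival S(t) = exp(-(a / b) * (exp(b * t) - 1)) implied by the hazard above."""
    return math.exp(-(a / b) * (math.exp(b * t) - 1.0))


def exponential_survival(t, rate):
    """Survival under a constant hazard equal to `rate`."""
    return math.exp(-rate * t)


def expected_residual_lifetime(survival, horizon=100.0, step=0.01):
    """Approximate E[T] as the trapezoidal integral of S(t) over [0, horizon]."""
    n = int(round(horizon / step))
    total = 0.5 * (survival(0.0) + survival(horizon))
    for i in range(1, n):
        total += survival(i * step)
    return total * step


if __name__ == "__main__":
    A, B = 0.02, 0.25   # hypothetical Gompertz parameters
    RATE = 0.10         # hypothetical constant hazard
    for years in (0, 4, 8):
        print(years,
              round(gompertz_hazard(years, A, B), 4),
              round(gompertz_survival(years, A, B), 4),
              round(exponential_survival(years, RATE), 4))
    print(expected_residual_lifetime(lambda t: gompertz_survival(t, A, B)))
```

Two models of this kind can agree closely over a short observation window and still diverge in later years, which is why extrapolation beyond the observed period is sensitive to the choice of model family.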
5.8 Nonetheless, the findings of this analysis point to three pertinent facts:
- First, the phenomenon of HIV mortality in one rural African setting is likely to be best described by a probability model that appropriately caters for increasing hazard over time from seroconversion. In this paper, we have endorsed a model in which the hazard rate is an exponentially increasing function of time since seroconversion. It is hoped that further work will clarify which family of probability models best describes HIV/AIDS deaths over time in rural Africa, and which principal factors and circumstances affect marginally different populations.
- Secondly, hazard differs substantially between individuals seroconverting at different ages. Mortality differences by age are supported empirically from non-African localities (Darby et al, 1996).
- Lastly, there is an appreciable degree of variation about expected mortality trends in one rural African setting.
Any plausible probabilistic model for HIV mortality may need to include these hazard, age and variational nuances. It may be useful to conceptualise the mortality phenomenon as a 'mixture' (sensu Titterington 1985) of plausible probability models rather than a single unvarying stochastic process over time. Certainly, the degree of variation that is manifest about the self-consistent estimates (Figures 3 and 5) is expected to play an appreciable role in influencing the dynamics of the HIV epidemic. Rather than invoke one most likely model, it may be more reasonable to appeal to a range of equally plausible models, each one potentially fitting the established patterns of self-consistency. It is feasible to couple the variance estimates with the model parameters to provide a more defensible non-finite mixture of unimodal distributions. These parameter-perturbed models would play a useful role in accounting for some degree of the expected variation about the annual predicted trends in HIV mortality. It would also be of interest to assess how sensitive pre-existing epidemiological models are to perturbed mortality values.
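The notion of parameter-perturbed models can be sketched in Python as follows. This is a minimal illustration, assuming a Gompertz hazard h(t) = a·exp(b·t) and independent normal perturbations of the two parameters about their point estimates; the function names, the point estimates and the standard errors are all hypothetical, and the continuous mixture is only approximated by a finite number of random draws.

```python
import math
import random


def gompertz_survival(t, a, b):
    """Survival S(t) = exp(-(a / b) * (exp(b * t) - 1)) for hazard a * exp(b * t)."""
    return math.exp(-(a / b) * (math.exp(b * t) - 1.0))


def perturbed_survival_band(a_hat, b_hat, se_a, se_b, times, n_draws=1000, seed=0):
    """Return (lower, mean, upper) survival at each time over perturbed models.

    Parameters are drawn independently from normal distributions centred on
    the point estimates (an assumption); non-positive draws are rejected.
    """
    rng = random.Random(seed)
    curves = []
    while len(curves) < n_draws:
        a = rng.gauss(a_hat, se_a)
        b = rng.gauss(b_hat, se_b)
        if a <= 0.0 or b <= 0.0:
            continue  # outside the Gompertz parameter space
        curves.append([gompertz_survival(t, a, b) for t in times])

    band = []
    for idx in range(len(times)):
        column = sorted(curve[idx] for curve in curves)
        lower = column[int(0.025 * (n_draws - 1))]   # approximate 2.5% quantile
        upper = column[int(0.975 * (n_draws - 1))]   # approximate 97.5% quantile
        mean = sum(column) / n_draws                 # equally weighted mixture
        band.append((lower, mean, upper))
    return band


if __name__ == "__main__":
    # Hypothetical point estimates and standard errors.
    for t, row in zip([0, 2, 5, 8],
                      perturbed_survival_band(0.02, 0.25, 0.005, 0.03, [0, 2, 5, 8])):
        print(t, [round(x, 4) for x in row])
```

The mean column is the survival function of the equally weighted mixture, while the lower and upper columns give a rough indication of the spread that an epidemiological model would need to tolerate.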
5.9 Should the model parameterisations from the Masaka mortality data be illustrative of the fundamental mortality phenomenon within Southern Africa, the differences evident in Figure 6 are disconcerting, and would point to a developing crisis in those healthcare and insurance industries that are dependent on the ASSA600 or similar models. As mentioned, the mortality patterns evidenced in Masaka are not likely to be representative of Southern African mortality rates at large. However, the fact that such discrepancies exist between empirically established patterns (the Masaka models here) and an essentially theoretical construct (ASSA600) should serve as a useful caution to modellers of the epidemic. It is worth noting that the ASSA2000 model now utilises a Weibull distribution for the mortality phenomenon (Dorrington, personal communication 2001).
5.10 There will often be urgent justification for modifying notional mortality models that contribute to our expectations of the development of the AIDS epidemic and the consequent surfeit of human suffering within Africa. It is hoped that the Gompertz model proposed in this paper will serve as a useful first approximation or simple model structure for the HIV mortality phenomenon.
5.11 It is unfortunate that historical African epidemiological data are too limited to produce useful estimates for particular countries of how much mortality has risen, at what ages, and who has been affected most. Currently, the focus of attention for life and health institutions and governmental bodies is to assess the expected extent of the HIV/AIDS epidemic and to make provision for the anticipated effects of a deepening crisis. To this end, it is necessary to better explicate the nature of the stochastic process of survival of HIV-infected persons. More than a decade after the impact of HIV on adult mortality in Africa was forecast (Anderson et al, 1988), the Ugandan AIDS natural-history cohort study is likely to yield some of the most reliable data on patterns of adult mortality on the continent. The present research is largely an attempt to model the distributional nuances of the mortality curve from the Masaka area. Further research emerging from AIDS investigations within Africa is needed to confirm the external validity of the findings presented here.

4.1 We consider a population of $N$ individual seroconverters. Let $C$ and $D$ denote the times from infection to seroconversion and mortality respectively, with $C < D$. The residual life at seroconversion is $T = D - C$. Assume that the random variables $C$ and $T$ are discrete and independent, and let $W(c)$ and $F(t)$ denote the cumulative distribution functions of $C$ and $T$ respectively. We define w

4.2 An observation from any one individual is of the form $(C_L, C_R, D_L, D_R)$, where $C_L \le C \le C_R$ and $D_L \le D \le D_R$. Implicitly we take $C_L < D_L$ and $C_R < D_R$. In our data, $C$ is known to be between 0 and 2 years; thus $0 \le C_L \le 2$ and $0 \le C_R \le 2$. Also, $D$ may be known exactly for patients who died ($D_L = D_R$), or $D$ is unbounded if the patient still survived when the observation period terminated in 1998 ($D_R = \infty$). The observation $(C_L, C_R, D_L, D_R)$ determines a unique set of admissible $(C, T)$ values. Figure 2 gives a diagrammatical representation of sets of admissible points $(c_j, t_k)$ for bivariate intervals $(C_L, C_R, D_L, D_R)$ with $D = C + T$ (adapted from De

FIGURE 4. Hazard rates

reports an average survival-time estimate for Africa of less than nine years. Nunn et al (1997) document median survival times in the Masaka population as being less than three years in subjects aged 55 years or more when infected with HIV-1, and over five years if infected when younger than 55 years of age. Based on evidence from the prospective population-based follow-up study in Masaka, Boerma et al (1998) note a median survival time of eight to eleven years.
5.4 Results from the Gompertz modelling of the 1990-98 Masaka mortality data in this investigation do not contradict the abovementioned Masaka estimates to any large degree (Table

APPENDIX A SELF-CONSISTENCY ESTIMATION EQUATIONS AND CRITICAL VALUES
A.1 The following formulation is constructed from the derivation of De Gruttola & Lagakos (1989) and the work of Kim et al (1993). Essentially, the same two-step procedural computations as in Kim et al have been implemented. The methodology developed here differs from the work of Kim et al (1993) in two ways: we have explicitly ignored the additional presence of a covariate and have avoided phrasing the model within the context of a proportional-hazards model.
A.2 Let $\alpha^i = \{\alpha^i_{jk} : 1 \le j \le r, 1 \le k \le s\}$ and $(C_L^i, C_R^i, D_L^i, D_R^i)$ denote the values of $\alpha$ and $(C_L, C_R, D_L, D_R)$ for the $i$th of $N$ subjects. The likelihood function may be expressed as equation (A1). It is worth noting that implicit in (A1) is the assumption that $(C_L, C_R)$ and $(D_L, D_R)$ do not 'predict' the occurrences of $C$ and $C + T$ respectively. In other words, the intervals [...] conditions apply for $T$.
A.4 De Gruttola & Lagakos (1989) found the maximising solution of (A1) by generalising the self-consistency algorithm of Turnbull (1976), in which self-consistency was applied to the analysis of singly censored univariate data.
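The admissible sets of paragraph 4.2 and the likelihood referred to as (A1) can be illustrated with a small Python sketch. It assumes that the indicator $\alpha^i_{jk}$ equals 1 when the grid point $(c_j, t_k)$ is compatible with the $i$th observation, that is $C_L \le c_j \le C_R$ and $D_L \le c_j + t_k \le D_R$, and that the likelihood takes the product-of-sums form $\prod_i \sum_j \sum_k \alpha^i_{jk} w_j f_k$ associated with De Gruttola & Lagakos (1989). The grids, function names and example intervals are hypothetical.

```python
import math


def admissible_indicator(obs, c_grid, t_grid):
    """Return alpha[j][k] = 1 if (c_j, t_k) is admissible for the observation.

    `obs` is (C_L, C_R, D_L, D_R); a point is admissible when
    C_L <= c_j <= C_R and D_L <= c_j + t_k <= D_R (assumed definition).
    """
    c_lo, c_hi, d_lo, d_hi = obs
    return [[1 if (c_lo <= c <= c_hi and d_lo <= c + t <= d_hi) else 0
             for t in t_grid]
            for c in c_grid]


def log_likelihood(alphas, w, f):
    """Log of prod_i sum_j sum_k alpha^i_jk * w_j * f_k (assumed form of A1).

    `w` and `f` are the probability masses of C and T on their grids.
    """
    total = 0.0
    for alpha in alphas:
        contribution = sum(alpha[j][k] * w[j] * f[k]
                           for j in range(len(w))
                           for k in range(len(f)))
        total += math.log(contribution)
    return total


if __name__ == "__main__":
    c_grid = [0, 1, 2]                  # hypothetical seroconversion times (years)
    t_grid = [1, 2, 3, 4, 5, 6, 7, 8]   # hypothetical residual lifetimes (years)
    died = (0, 1, 5, 5)                 # death observed exactly at year 5
    censored = (1, 2, 7, math.inf)      # still alive at year 7
    alphas = [admissible_indicator(o, c_grid, t_grid) for o in (died, censored)]
    w = [1.0 / len(c_grid)] * len(c_grid)
    f = [1.0 / len(t_grid)] * len(t_grid)
    print(log_likelihood(alphas, w, f))
```

An exactly observed death restricts the admissible set to a diagonal of the grid, whereas a right-censored observation leaves an open-ended region, which is what makes the data doubly censored.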

A.5 We define $I^i_{jk}$ to equal 1 if the true (but unobserved) value of $(C, T)$ for the $i$th individual equals the discrete value $(c_j, t_k)$, and to equal 0 otherwise. The conditional expectation of $I^i_{jk}$ given $\alpha^i$ is [...] initial starting values for $f \mid w^0$ (denoted $f^0$) are then acquired by computing [...] $w^0$, one can find a self-consistent estimate (denoted $w^*$) by iterating between (A2) and (A3) until convergence is attained. A maximising solution $f \mid w^*$ is then obtained by substitution.
A.10 We express $f_k$ in terms of the conditional survival probabilities $p_k$ where [...] restrictions of the probabilities $p_k$ are removed by reparameterising $p_k$ as $\gamma_k$, where [...] (A7)
A.12 The first-order derivative of $\log L$ (denoted $l$) with respect to $\gamma_k$ is given by $\partial$
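The idea of iterating to a self-consistent estimate can be sketched in Python as follows. This is a simplified joint iteration in the spirit of De Gruttola & Lagakos (1989), not the two-step procedure with reparameterisation described in this appendix. Each pass computes the conditional expectation of the indicator for every subject and grid point given the current masses, then averages those expectations to update the masses of $C$ and $T$. The function name, its arguments and the toy data are hypothetical.

```python
def self_consistent_estimate(alphas, r, s, tol=1e-10, max_iter=10000):
    """Iterate to self-consistent masses w (for C) and f (for T).

    `alphas[i][j][k]` is 1 if grid point (c_j, t_k) is admissible for
    subject i and 0 otherwise; every subject needs at least one admissible
    point. Starting values are uniform (an assumption).
    """
    n = len(alphas)
    w = [1.0 / r] * r
    f = [1.0 / s] * s
    for _ in range(max_iter):
        new_w = [0.0] * r
        new_f = [0.0] * s
        for alpha in alphas:
            # Probability of the observed region under the current masses.
            denom = sum(alpha[j][k] * w[j] * f[k]
                        for j in range(r) for k in range(s))
            for j in range(r):
                for k in range(s):
                    # Conditional expectation of the indicator I^i_jk.
                    mu = alpha[j][k] * w[j] * f[k] / denom
                    new_w[j] += mu / n
                    new_f[k] += mu / n
        change = max(max(abs(a - b) for a, b in zip(new_w, w)),
                     max(abs(a - b) for a, b in zip(new_f, f)))
        w, f = new_w, new_f
        if change < tol:
            break
    return w, f


if __name__ == "__main__":
    # Toy data: one seroconversion time, three residual lifetimes.
    # Three exact observations and one that is censored after the first lifetime.
    toy = [[[1, 0, 0]], [[1, 0, 0]], [[0, 1, 0]], [[0, 1, 1]]]
    print(self_consistent_estimate(toy, r=1, s=3))
```

When every observation is exact, the iteration reduces to the empirical frequencies after a single pass; censoring is what makes repeated passes necessary.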

TABLE 1. Summary of data

TABLE 2. Self-consistent estimator vectors

Table 4(a) and (b) show moments and descriptive summary features of the self-consistent estimates, the imputed Gompertz model structure and the ASSA600 mortality formulation p(t); in Table 4(a), conditioned upon the length of the observation window for each of the four groups, and in

TABLE 3. Feasible model parameterisations for each of the four groups of interest.