On the relationship between time-series studies, dynamic population studies, and estimating loss of life due to short-term exposure to environmental risks.

There is a growing concern that short-term exposure to combustion-related air pollution is associated with increased risk of death. This finding is based largely on time-series studies that estimate associations between daily variations in ambient air pollution concentrations and in the number of nonaccidental deaths within a community. Because these results are not based on cohort or dynamic population designs, where individuals are followed in time, it has been suggested that estimates of effect from these time-series studies cannot be used to determine the amount of life lost because of short-term exposures. We show that results from time-series studies are equivalent to estimates obtained from a dynamic population when each individual's survival experience can be summarized as the daily number of deaths. This occurs when the following conditions are satisfied: a) the environmental covariates vary in time and not between individuals; b) on any given day, the probability of death is small; c) on any given day and after adjusting for known risk factors for mortality such as age, sex, smoking habits, and environmental exposures, each subject of the at-risk population has the same probability of death; d) environmental covariates have a common effect on mortality of all members of the at-risk population; and e) the averages of individual risk factors, such as smoking habits, over the at-risk population vary smoothly with time. Under these conditions, the association between temporal variation in the environmental covariates and the survival experience of members of the dynamic population can be estimated by regressing the daily number of deaths on the daily value of the environmental covariates, as is done in time-series mortality studies. Issues in extrapolating risk estimates based on time-series studies in one population to estimate the amount of life lost in another population are also discussed.

Short-term exposure to ambient concentrations of combustion-related pollution has become a public health concern over the last decade largely because of numerous studies linking fluctuations in the daily number of deaths with daily variations in ambient air pollutants, such as particulate matter and ground-level ozone (Katsouyanni et al. 1997; Moolgavkar 2002; Schwartz and Lee 1999; Stieb et al. 2002). However, there has been considerable debate in interpreting the results from these studies in terms of estimating loss of life expectancy (McMichael et al. 1998). It has been argued that because no identifiable population of subjects was followed over time, the regression parameters obtained from the time-series studies could not be applied to population life tables to determine the amount of life lost because of short-term pollution exposure (Kunzili et al. 2001; Rabl 2003).
Associations between daily values of pollution and mortality have been extensively examined using time-series study designs. Here, the daily variations in the number of deaths in a community are related to daily variations of ambient air pollution levels using generalized additive models (Hastie and Tibshirani 1990), controlling for temporal trends, seasonality, and weather variables (Kelsall et al. 1999; Samet et al. 2000; Schwartz 1999). The time scale of exposure is on the order of a few days and therefore represents short-term exposure. Cumulative exposures of up to 2 months have also been examined using distributed lag models (Zanobetti et al. 2001), frequency-domain (Kelsall et al. 1999; Zeger et al. 1999), and time-domain regression techniques (Dominici et al. 2003). Time-series approaches have limitations in examining the effects of pollution exposure on mortality at longer time scales because these longer temporal averages of daily pollution values tend to be highly correlated with the seasonal component of the mortality time series and weather variables, making inference highly unstable.
In time-series studies, a common value of daily exposure is assigned to all subjects residing in a single community, and variations in exposure between subjects are omitted. Recently, the time-series study design has been extended to include multiple communities (Katsouyanni et al. 1997, 2001), where community-specific estimates of effect are obtained from data in each community separately and then pooled among communities using random effects models (Burnett et al. 1995; Dominici et al. 2002; Katsouyanni et al. 1997, 2001). The effects of longer-term exposure scenarios on mortality can be examined by recording the survival experience of subjects in multiple communities adjusted for subject-specific risk factors. This design generates exposure variation between subjects that live in different communities and has been used in several studies (Dockery et al. 1993; Pope et al. 1995). Unlike time-series studies, a time-invariant multiyear population average exposure is assigned to all subjects residing in a single community. Thus, exposure is measured across communities even though individuals are followed in time.
Estimates of relative risk can be obtained from multicommunity cohort studies of air pollution and mortality (Dockery et al. 1993; Krewski et al. 2000; Pope et al. 1995, 2002), in which spatial variations in ambient air pollution are related to spatial variations in survival, after adjusting for covariates obtained at the individual level, such as smoking habits. Estimates of relative risk from such studies have been incorporated into population life tables to determine the amount of life lost because of long-term exposure to ambient pollution (e.g., Brunekreef 1997). A common risk of death due to air pollution exposure and a common probability of death for those not exposed are implicitly assumed at any given age in this approach to estimating loss of life expectancy.
However, to place the effect of air pollution episodes in a population health perspective, it is also of interest to assess the impacts on longevity from short-term pollution exposures on the order of days to weeks.
In this article, we propose a survival model that jointly examines the effects of short- and long-term exposure to environmental risk factors on mortality. We also identify the conditions under which the effect estimates on survival associated with short-term exposures are equivalent to effect estimates from time-series studies. We identify sampling design characteristics under which daily time-series studies can be used to estimate the amount of life lost because of short-term exposure to environmental risk factors.

Dynamic Population Studies
Consider a dynamic population design with the time to response defined as calendar time. Subjects are followed as long as they reside in a given community. Subjects can enter the study population either at birth or by immigration and leave the study population through death, emigration, or termination of follow-up. Further, suppose information is available on a subject's age at entry, sex, and race. In addition, subjects are interviewed periodically to obtain information on smoking habits, diet, occupation, education, and any other risk factors related to mortality. Measurements on several other environmental risk factors may also be available, such as ambient air pollution, aerobiologics, and weather.
The value of the environmental covariates recorded on day t for T days for the ith individual in the sth of S communities, $z_{is}(t)$, can be decomposed into three factors of the form

\[ z_{is}(t) = [z_{is}(t) - \bar{z}_{s}(t)] + [\bar{z}_{s}(t) - \bar{z}_{s}(\bar{t})] + \bar{z}_{s}(\bar{t}). \qquad [1] \]

Here, $\bar{z}_{s}(t)$ is the average of $z_{is}(t)$ over the at-risk population in community s, and $\bar{z}_{s}(\bar{t})$ is the average of $\bar{z}_{s}(t)$ over a long period of time, such as years to decades, for the sth community on day t. The first term on the right side of Equation 1, $P_{is}(t) = z_{is}(t) - \bar{z}_{s}(t)$, is the difference between the exposure value for the ith individual in the sth community and the average value on that day for all members of the community and represents the within-community variation in personal exposure. The second term, $A_{s}(t) = \bar{z}_{s}(t) - \bar{z}_{s}(\bar{t})$, is the difference between the community average of personal exposures on day t and the long-term average for community s. Temporally varying exposure measures such as $A_{s}(t)$ have been used in time-series studies with the population-average personal exposure values replaced by spatial averages of available fixed-site monitoring data. The spatially variable exposure measures such as $\bar{z}_{s}(\bar{t})$ have been used in cohort studies of air pollution and mortality.
Further temporal decompositions of exposure can be made; for example, one could consider variation in daily averages within a month, monthly averages within a year, yearly averages within a decade, and so on (Dominici et al. 2002;Schwartz 1999;Zeger et al. 1999). For the sake of simplicity of presentation, we restrict our discussion to a three-factor temporal decomposition given by Equation 1.
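The three-factor decomposition in Equation 1 can be illustrated with a minimal Python sketch. The function name, the toy exposure values, and the simplifications (every subject is at risk on every day, and the long-term average is taken over the short window supplied) are assumptions made for illustration only and are not part of the study design described here.

```python
from statistics import mean

def decompose_exposure(z):
    """Split personal exposures z[i][t] for one community into the three
    components of Equation 1 (hypothetical helper, illustration only).

    Returns (P, A, long_term) where
      P[i][t]   = z[i][t] - community mean on day t      (within-community)
      A[t]      = community mean on day t - long_term    (temporal)
      long_term = mean of the daily community means      (between-community)
    """
    n_days = len(z[0])
    # Community-average exposure on each day (all subjects assumed at risk)
    daily_mean = [mean(subject[t] for subject in z) for t in range(n_days)]
    # Long-term average, here simply the mean over the supplied window
    long_term = mean(daily_mean)
    P = [[subject[t] - daily_mean[t] for t in range(n_days)] for subject in z]
    A = [m - long_term for m in daily_mean]
    return P, A, long_term

# Hypothetical exposures for two subjects over three days
z = [[10.0, 14.0, 9.0], [12.0, 18.0, 9.0]]
P, A, long_term = decompose_exposure(z)
```

By construction the three components add back to each subject's daily exposure, and the temporal component A sums to zero over the averaging window.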
The relationship between the risk factors and survival is modeled by the hazard function $\lambda_{is}^{(l)}(t)$ for the ith subject in the lth stratum in the sth community:

\[ \lambda_{is}^{(l)}(t) = \lambda_{0}^{(l)}(t) \exp[\phi' P_{is}^{(l)}(t) + \beta' A_{s}(t) + \theta' \bar{z}_{s}(\bar{t}) + \delta_{l}' x_{is}^{(l)}(t)], \qquad [2] \]

where $\lambda_{0}^{(l)}(t)$ is the baseline hazard function for the lth stratum, $x_{is}^{(l)}(t)$ is a time-dependent vector of individual risk factors for the ith subject in the lth stratum in the sth community, and φ, β, θ are vectors of unknown parameters representing the logarithm of the relative risks associated with a unit change in the within-community between-individual variation in personal exposure [$P_{is}^{(l)}(t)$], within-community temporal variation in population average exposure [$A_{s}(t)$], and variation in the long-term average exposure between communities [$\bar{z}_{s}(\bar{t})$], respectively. We assume that φ, β, and θ are constant for all strata. The $\delta_{l}$ values are the log-relative risks for the individual risk factors, which can vary by stratum.
Strata could be defined by groupings of age at entry, sex, and race, thereby allowing the baseline hazard function to depend on these risk factors. As a result, the effect of these risk factors on survival cannot be estimated. Note that we have indexed the personal exposure values by strata in order to uniquely identify subjects, $P_{is}^{(l)}(t)$. These values are incorporated into the hazard function in the same manner as the other individual-level risk factors, $x_{is}^{(l)}(t)$. We have assumed that each subject within a stratum has the same baseline hazard function, $\lambda_{0}^{(l)}(t)$, and that the association between exposure and survival is identical for all subjects. We therefore cannot distinguish subjects in terms of their susceptibility to death after adjusting for the known risk factors and strata. This model is referred to as a homogeneous survival model. We explore implications of this assumption on the ability to estimate loss in life expectancy, and some extensions to a heterogeneous survival model, in the "Discussion."
Our parameter of interest is β, which estimates the effect of temporal variation in exposure within a community, $A_{s}(t)$, on survival. The study design considered here can be described as following individuals' survival experience over time within each community. Estimates of effects of the environmental covariates on survival can be made within each community separately, and a summary estimate of effect is given by pooling the community-specific estimates among communities. The longer-term average exposure values, $\bar{z}_{s}(\bar{t})$, can then be absorbed into the baseline hazard function to form a community-specific baseline hazard of the form $\lambda_{0s}^{(l)}(t) = \lambda_{0}^{(l)}(t) \exp[\theta' \bar{z}_{s}(\bar{t})]$. This is a reasonable assumption because both $\lambda_{0}^{(l)}(t)$ and $\exp[\theta' \bar{z}_{s}(\bar{t})]$ are expected to be smooth or slowly changing functions of time.
We consider estimation of β by using information from a single community, and we therefore omit the index s, for the sake of simplifying the notation. Of course, data from across communities can be combined for a pooled estimate of risk (Burnett et al. 1995; Dominici et al. 2002; Katsouyanni et al. 1997, 2002). Estimates of the baseline hazard function and regression parameters may be obtained by maximum likelihood methods. Under Equation 2, the log-likelihood function of β, l(β), is given by Cox and Oakes (1984)

\[ l(\beta) = \sum_{l=1}^{L} \sum_{t=t_{0}}^{T} \left\{ \sum_{i \in D^{(l)}(t)} \log \lambda_{i}^{(l)}(t) - \sum_{i \in D^{(l)}(t) \cup C^{(l)}(t)} \int_{t_{i0}^{(l)}}^{t} \lambda_{i}^{(l)}(u)\,du \right\}, \qquad [3] \]

where $t_{0}$ is the starting date of the study, T is the end date, $t_{i0}^{(l)}$ is the time the ith subject in the lth stratum entered the community, and $D^{(l)}(t)$ and $C^{(l)}(t)$ are the sets of individuals who died or were lost to follow-up on day t in the lth stratum, respectively. We define the population at risk on day t, $R^{(l)}(t)$, as the subjects in the lth stratum who entered the community on or before day t and who died or were lost to follow-up on day t or later, that is, the members of $D^{(l)}(u) \cup C^{(l)}(u)$ for $u \ge t$.
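Assuming that the covariate values are constant within a day and writing the limits of integration in Equation 3 as a sum of integrals between consecutive days, we can rewrite the log-likelihood (Equation 3) as

\[ l(\beta) = \sum_{l=1}^{L} \sum_{t=t_{0}}^{T} \left\{ y^{(l)}(t)[\log \lambda_{0}^{(l)}(t) + \beta' A(t)] + \sum_{i \in D^{(l)}(t)} [\phi' P_{i}^{(l)}(t) + \delta_{l}' x_{i}^{(l)}(t)] - \Lambda_{0}^{(l)}(t) \exp[\beta' A(t)]\, \Re^{(l)}(t) \right\}, \qquad [4] \]

where $y^{(l)}(t)$ is the number of subjects who died on day t in the lth stratum, $\Lambda_{0}^{(l)}(t) = \int_{t-1}^{t} \lambda_{0}^{(l)}(u)\,du$ is the cumulative baseline hazard function over day t, and

\[ \Re^{(l)}(t) = \sum_{i \in R^{(l)}(t)} \exp[\phi' P_{i}^{(l)}(t) + \delta_{l}' x_{i}^{(l)}(t)] \]

is the effect of individual covariates on survival summed over the population at risk. The log-likelihood depends on β; the stratum-specific baseline hazard functions, l = 1, . . . , L; and the effects of the individual covariates, $\delta_{l}$, and personal exposure, φ, on survival. The maximum likelihood estimate of β can be obtained as the solution to the score function equation

\[ S(\beta) = \sum_{t=t_{0}}^{T} A(t) \{ y(t) - \Theta(t) \exp[\beta' A(t)] \} = 0, \qquad [5] \]

where

\[ y(t) = \sum_{l=1}^{L} y^{(l)}(t) \quad \text{and} \quad \Theta(t) = \sum_{l=1}^{L} \Lambda_{0}^{(l)}(t)\, \Re^{(l)}(t). \]

The function Θ(t) represents the daily baseline hazard function multiplied by the effects of time-varying individual covariates summed over the at-risk population. First, it is reasonable to model $\Re^{(l)}(t)$ as a smooth function in time because it is a summation of the individual covariate effects over the at-risk population in the lth stratum. For example, the effect of the number of smoked cigarettes on survival may vary markedly from day to day for any single individual, but the average effect of smoking on survival in the population at risk should vary relatively smoothly over time. Second, we would also expect to model the cumulative baseline hazard function, $\Lambda_{0}^{(l)}(t)$, as a smooth function of time. Therefore, it is reasonable to assume that Θ(t) is a smooth function of time.
Finally, $\Theta(t) \exp[\beta' A(t)]$ can be interpreted as the expected number of deaths on day t. To show this, we note that the conditional probability of dying on day t for the ith subject in the lth stratum is given by

\[ 1 - \exp\{ -\Lambda_{0}^{(l)}(t) \exp[\phi' P_{i}^{(l)}(t) + \beta' A(t) + \delta_{l}' x_{i}^{(l)}(t)] \} \approx \Lambda_{0}^{(l)}(t) \exp[\phi' P_{i}^{(l)}(t) + \beta' A(t) + \delta_{l}' x_{i}^{(l)}(t)] \qquad [6] \]

(Cox and Oakes 1984), with the approximation being reasonable because of the small daily death rate in North American and European cities ($\approx 10^{-5}$). The expected number of deaths on day t is given by the probability of death summed over all individuals at risk in the community on day t, or

\[ \sum_{l=1}^{L} \sum_{i \in R^{(l)}(t)} \Lambda_{0}^{(l)}(t) \exp[\phi' P_{i}^{(l)}(t) + \beta' A(t) + \delta_{l}' x_{i}^{(l)}(t)] = \Theta(t) \exp[\beta' A(t)]. \qquad [7] \]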
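The size of the error introduced by the small-probability approximation in Equation 6 can be checked numerically. The sketch below is illustrative only: the function names and the input values (a daily cumulative baseline hazard of 10^-5 and a linear predictor of 0.05) are assumptions chosen to match the order of magnitude quoted above.

```python
import math

def exact_daily_death_probability(cum_baseline_hazard, linear_predictor):
    """Exact conditional probability of death within one day under a
    proportional hazards model with covariates constant within the day."""
    # -expm1(-x) equals 1 - exp(-x) but avoids cancellation for small x
    return -math.expm1(-cum_baseline_hazard * math.exp(linear_predictor))

def approximate_daily_death_probability(cum_baseline_hazard, linear_predictor):
    """First-order approximation used when the daily probability is small."""
    return cum_baseline_hazard * math.exp(linear_predictor)

# Hypothetical inputs: daily baseline hazard near 1e-5, modest covariate effect
exact = exact_daily_death_probability(1e-5, 0.05)
approx = approximate_daily_death_probability(1e-5, 0.05)
relative_error = (approx - exact) / exact
```

For inputs of this size the relative error is roughly half the daily hazard, which is why the approximation is harmless at daily death rates near 10^-5 and would deteriorate only if the daily probability of death were much larger.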

Time-Series Model
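Several investigators have estimated associations between daily variations in population average exposure such as ambient air pollution and mortality using a time-series approach (Katsouyanni et al. 1997; Samet et al. 2000). This approach assumes that the number of daily deaths in the lth stratum, $y^{(l)}(t)$, follows a Poisson distribution with the expected number of deaths on day t given by

\[ E[y^{(l)}(t)] = \Phi^{(l)}(t) \exp[\gamma' A(t)], \qquad [8] \]

where $\Phi^{(l)}(t)$ is the daily baseline number of deaths for the nonexposed population in stratum l, and γ is the logarithm of the relative rate of daily mortality for a unit change in A(t), assumed identical for all strata. Assuming that the counts are independent among strata, the log-likelihood function of γ is, up to an additive constant,

\[ l(\gamma) = \sum_{l=1}^{L} \sum_{t=t_{0}}^{T} \left\{ y^{(l)}(t)[\log \Phi^{(l)}(t) + \gamma' A(t)] - \Phi^{(l)}(t) \exp[\gamma' A(t)] \right\}. \qquad [9] \]

The score function for γ, S(γ), is given by

\[ S(\gamma) = \sum_{t=t_{0}}^{T} A(t) \{ y(t) - \Phi(t) \exp[\gamma' A(t)] \}, \qquad [10] \]

where $y(t) = \sum_{l=1}^{L} y^{(l)}(t)$ and $\Phi(t) = \sum_{l=1}^{L} \Phi^{(l)}(t)$ is the daily baseline mortality aggregated across strata.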

Comparison of the Two Models
Score Equations 5 and 10 suggest that if the population-average baseline hazard function times the average effect of individual-level covariates on survival for the population at risk, Θ(t), and the population average baseline mortality, Φ(t), are both modeled using the same unknown function of time, then our modeling approaches to the dynamic cohort and time-series designs provide identical estimates of the effects of environmental covariates. This analytical approach is reasonable because both quantities represent the expected number of deaths on day t when Α(t) = 0 in their respective designs.
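The equivalence can be illustrated with a small numerical sketch. It assumes a homogeneous at-risk population of fixed size, a single scalar exposure, and a baseline that is constant in time, so that Θ(t) and Φ(t) reduce to the same unknown constant; these simplifications, the function names, and all input values are hypothetical. Expected daily deaths are generated from the exact individual-level probability of death, and the log-relative rate is then recovered by solving the Poisson score equations with Newton-Raphson iterations.

```python
import math

def expected_daily_deaths(n_at_risk, cum_baseline_hazard, beta, exposure):
    """Expected deaths per day when every subject dies with the exact
    individual-level probability 1 - exp(-Lambda0 * exp(beta * A(t)))."""
    return [n_at_risk * -math.expm1(-cum_baseline_hazard * math.exp(beta * a))
            for a in exposure]

def fit_poisson_loglinear(y, a, n_iter=25):
    """Solve the Poisson score equations for E[y_t] = exp(alpha + gamma * a_t)
    by Newton-Raphson; alpha plays the role of a constant log baseline."""
    alpha, gamma = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(alpha + gamma * at) for at in a]
        # Score vector
        s0 = sum(yt - m for yt, m in zip(y, mu))
        s1 = sum(at * (yt - m) for yt, m, at in zip(y, mu, a))
        # Information matrix entries
        i00 = sum(mu)
        i01 = sum(at * m for at, m in zip(a, mu))
        i11 = sum(at * at * m for at, m in zip(a, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: inverse information times score
        alpha += (i11 * s0 - i01 * s1) / det
        gamma += (i00 * s1 - i01 * s0) / det
    return alpha, gamma

# Hypothetical setting: one million subjects, centered exposure deviations
beta_true = 0.01
exposure = [((7 * t) % 11) - 5.0 for t in range(33)]
deaths = expected_daily_deaths(1_000_000, 2e-5, beta_true, exposure)
alpha_hat, gamma_hat = fit_poisson_loglinear(deaths, exposure)
```

Under these assumptions the fitted gamma agrees with the individual-level beta apart from a small discrepancy attributable to the approximation in Equation 6. With real data the baseline would be modeled as a smooth function of time rather than a constant.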
Commonly, in the analysis of time series, Α(t) is estimated by the daily average of the concentrations observed at fixed-site ambient monitoring stations. For some air pollutants such as fine particulate matter, aggregated measures provide reasonable estimates of the population average of personal exposure values. If these aggregate measures provide poor estimates of the average of the personal exposures of the at-risk group, measurement error will bias the estimates of the effects of the environmental exposures, with the amount of bias dependent on the amount of error in measuring exposure (Zidek et al. 1996).

Discussion
We have demonstrated that dynamic population study and time-series designs provide the same relative rate estimates of mortality associated with exposure to air pollution under the following conditions: a) the environmental covariates vary in time and not between individuals; b) on any given day, the probability of death is small; c) each subject of the at-risk population has the same probability of death after adjusting for known risk factors; d) all members of the at-risk population share a common effect of environmental covariates on mortality; and e) the population-average baseline hazard function and association between risk factors and death can be approximated adequately by smooth functions of time. In other words, if conditions a-e hold, then each individual's survival experience can be summarized as the daily number of deaths.
In addition, the equivalency of the estimates of β and γ, obtained from S(β) and S(γ), respectively, depends on whether Θ(t) and Φ(t) are modeled as the same nonstochastic function, possibly involving some unknown parameters. A challenge of time-series studies is the lack of a clear-cut method to choose the smooth time function to eliminate long-term and seasonal trends in the data, and different estimation methods can lead to different results. For example, we have suggested previously that the smooth function be selected so that the residual time series is consistent with a white noise process. It seems clear now that estimates of the air pollution effect are sensitive to the method of modeling time and weather, although this sensitivity can vary by location and season depending on how these variables are correlated.
Although we have estimates of the effects of long-term exposure to ambient air pollution on survival (Dockery et al. 1993; Pope et al. 1995) based on variations in exposure between communities and estimates of shorter-term exposure on mortality based on daily variations in exposure within a community, we as yet have no direct estimates of the total effect of exposure to ambient air pollution from all time scales based on the same study. The sum of these effects gives estimates for two of the three components described in Equation 1. A few studies have attempted to estimate the effects of personal exposure to air pollution on mortality. Variations in personal exposure estimates are generated as a function of a subject's residence within a community or geographic region (Abbey et al. 1999). One could sum the estimates of effect from these time-series mortality and cohort studies to obtain a total estimate of effect. However, they are based on different populations and exposure data. The time-series studies use mortality data covering the entire population, whereas the cohort studies are not necessarily representative of the target population. It is desirable to obtain joint estimates of risk based on personal variation of exposure within a community, temporal variation within a community, and spatial variation between communities obtained from a multicommunity dynamic population study using a unified survival model. Model formulations for a joint analysis of time-series and cohort studies have been recently discussed (Zeger et al. In press).
We have considered values only of the environmental covariates defined on a single day. However, the estimates of effect between the two epidemiologic designs are equivalent if values of environmental covariates are defined as several-day averages or distributed lag models (Zanobetti et al. 2001). This model formulation is also resistant to mortality displacement by a few days or weeks, a phenomenon in which air pollution plays a role in advancing the time of death by a relatively short period. However, the day-to-day variation in the temporal summary estimate of exposure will decrease as the number of days included in its calculation increases, thus decreasing the ability to detect effects on mortality. Furthermore, this summary measure of exposure could become confounded in time with the baseline hazard function if a large number of time lags are used, resulting in unstable parameter estimates (Dominici et al. 2003; Zeger et al. In press). Consequently, time-series studies have limitations in investigating the association between long-term exposure to environmental covariates, such as air pollution, and mortality. Studies in which individual exposures vary, either within a community or between communities, are required to estimate the effects of longer-term exposure on mortality.
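The loss of day-to-day variation under multiday averaging can be seen in a short sketch. The synthetic exposure series below is independent Gaussian noise generated from a fixed seed, which is an assumption made purely for illustration; real pollution series are autocorrelated, so their variance would fall more slowly with window length. The function name and window lengths are likewise hypothetical.

```python
import random
from statistics import pvariance

def trailing_average(series, window):
    """Average of the current day and the previous window - 1 days."""
    return [sum(series[t - window + 1:t + 1]) / window
            for t in range(window - 1, len(series))]

# Hypothetical daily exposure deviations: two years of independent noise
rng = random.Random(0)
daily = [rng.gauss(0.0, 10.0) for _ in range(730)]

# Variance of the exposure summary for 1-, 3-, 7-, and 30-day windows
variances = {w: pvariance(trailing_average(daily, w)) for w in (1, 3, 7, 30)}
```

As the window lengthens, the summary measure varies less from day to day, leaving less exposure contrast from which to estimate an effect on mortality.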
In the absence of any other information in addition to the daily count data, the baseline hazard functions, $\lambda_{0}^{(l)}(t)$, and the regression parameters for the individual covariates, $\delta_{l}$, cannot be estimated. Estimation of the $\delta_{l}$ values requires information on the individual covariates, $x_{i}^{(l)}(t)$, which in turn is required to estimate $\lambda_{0}^{(l)}(t)$. Estimates of all these quantities are needed to estimate the amount of life lost because of exposure to the environmental covariates in this study population. The exposure effect estimate from a time-series study is therefore not sufficient to determine the amount of life lost.
However, our results show that relative risks due to exposure to the environmental covariates estimated from studies employing either a dynamic population or time-series design can then be applied to the hazard function to determine the amount of life lost under specific exposure scenarios assuming a homogeneous survival model. Age- and sex-dependent number of deaths and number of persons surviving specific ages are required to construct population-based life tables. These quantities are used to determine the baseline hazard function for specific populations (normally for entire countries). Here, age is the time variable for the hazard function. Survival probabilities also vary with age, sex, and race, and therefore separate estimates of the effects of environmental covariates on survival should be made by these categories.
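A life-table calculation of this kind can be sketched as follows. The example assumes a homogeneous survival model, a piecewise-constant hazard within one-year age intervals, and a relative risk applied uniformly at every age for a sustained exposure scenario. The Gompertz-like hazard schedule, the relative risk of 1.01, and the function name are hypothetical values chosen for illustration and are not estimates from any study.

```python
import math

def life_expectancy(annual_hazard, relative_risk=1.0):
    """Expected years lived from the first age in the table, assuming a
    constant hazard within each one-year age interval and no survival
    beyond the last interval. All hazards must be positive."""
    survival = 1.0      # probability of surviving to the start of the interval
    expectancy = 0.0
    for h in annual_hazard:
        h_exposed = h * relative_risk
        # Person-years lived in the interval: S * integral of exp(-h u), u in [0, 1]
        expectancy += survival * (1.0 - math.exp(-h_exposed)) / h_exposed
        survival *= math.exp(-h_exposed)
    return expectancy

# Hypothetical Gompertz-like baseline hazards for ages 0 through 109
hazards = [0.0001 * math.exp(0.085 * age) for age in range(110)]

baseline = life_expectancy(hazards)
exposed = life_expectancy(hazards, relative_risk=1.01)
years_lost = baseline - exposed
```

In practice the baseline hazards would come from a population life table stratified by sex and other characteristics, and the relative risk would correspond to a specified exposure scenario.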
A fundamental assumption in these calculations is that the relative risks estimated in the study population can be applied uniformly to all members of the population used in deriving the life tables (viz., there is no effect modification between individual characteristics and ambient air pollution). This assumption may not be valid, as evidenced, for example, from the findings of a reanalysis of the Harvard Six Cities and the American Cancer Society studies (Krewski et al. 2000), in which an interaction was found between attained education (a measure of socioeconomic status) and level of air pollution. In addition, the effects of short-term exposure to several environmental covariates such as ambient air pollution, weather, and aerobiologics on survival may be modified by host conditions. For example, Goldberg et al. (2000, 2001) have shown that persons with certain medical conditions, such as congestive heart failure, are more susceptible to air-pollution-related death than is the general population. Their survival experience may also be different from that of the average person in that their disease condition reduces their life expectancy. Information on disease status can be incorporated into the survival model by defining an individual-level covariate as an indicator function of the presence/absence of a disease, which would vary with time. The interaction between the disease state indicator and air pollution would provide a means of assessing the effect modification of host conditions on air-pollution-related deaths. Incorporating the influence of disease condition on the relative risks of environmental covariates into estimates of the amount of life lost would require disease-specific life tables. Such life tables could be determined from national longitudinal population health surveys linked to mortality (Tambay and Catlin 1995). These life tables provide estimates of the expected life span of an individual with a disease condition by age.
Incorporation of individual covariates (which is not possible in time-series study designs) is therefore important to capture this difference in susceptibility.
The use of time-series mortality studies to estimate the amount of life lost because of short-term ambient air pollution exposures has been criticized (Kunzili et al. 2001; McMichael et al. 1998; Rabl 2003). Those authors suggest, however, that it is appropriate to estimate the amount of life lost from cohort studies. We have shown that under certain conditions time-series studies can be viewed as dynamic population studies and that estimates of life lost can be obtained from time-series studies in a manner similar to that used in cohort studies. However, we did have to assume a homogeneous survival model. It is likely that people dying from short-term exposures to environmental covariates such as ambient air pollution are more vulnerable to dying and therefore do not have the same expected residual lifetime as an average person their age. Similar concerns arise with the cohort studies in that long-term exposure to air pollution could be affecting only those persons with preexisting diseases or some other vulnerabilities (e.g., low education). It is therefore important to develop heterogeneous survival models for both short- and long-term exposure and to conduct epidemiologic studies to identify both vulnerable populations and subgroups sensitive to environmental exposures. Estimates of the heterogeneity of survival and effect of environmental exposures on mortality coupled with disease-specific life tables will enable us to determine reasonable estimates of the amount of life lost because of environmental exposures.
Environmental Health Perspectives • VOLUME 111 | NUMBER 9 | July 2003