Epidemiologic study design for investigating respiratory health effects of complex air pollution mixtures.

Epidemiologic studies of the respiratory health effects of air pollution are intrinsically difficult because exposure is common, expected effects at concentrations found in developed countries are weak, random misclassification of exposure is common, and the respiratory health indicators have multiple etiologies. Exposures to air pollutants also are multidimensional, generally consisting of a mixture of gases and particles. In this paper, epidemiologic study designs are described, and their potential for evaluating effects of complex pollutant mixtures is discussed. Power to detect the independent effects of individual pollutants in a complex pollutant mixture or to measure their interactions is in general very weak unless the study is specifically designed to test such hypotheses. However, with innovative and creative design, the independent and joint effects of multiple pollutants should be estimable in epidemiologic studies.


Introduction
From the earliest roots of their discipline, epidemiologists have recognized air pollution as a potentially important determinant of increased morbidity and mortality. In the classic analysis of the Bills of Mortality in 1662 (1), Graunt attributed the high week-to-week variability in mortality to changes in the "airs" of London. Modern air pollution epidemiologists have attempted to attribute health effects to specific constituents of these "airs." However, it has become clear that these airs are in fact a complex mixture of contaminant gases and particles.
Methods for epidemiologic studies of the health effects of air pollution have been reviewed comprehensively by the National Research Council Committee on the Epidemiology of Air Pollutants (2). This paper builds on that state-of-the-art report, plus discussions by Samet and Lambert (3), to consider epidemiologic study designs for assessing health effects of complex air pollution mixtures.

Difficulties in Air Pollution Epidemiology
Epidemiologic studies of air pollution are particularly challenging. Air pollution exposures are universal and, as Rose (4) has pointed out: the more widespread is a particular environmental hazard, the less it explains the distribution of cases. The cause that is universally present has no influence at all on the distribution of disease, and it may be quite unfindable by the traditional methods of clinical impression and case-control and cohort studies, for all of these depend on heterogeneity of exposure.
This manuscript was prepared as part of the Environmental Epidemiology Planning Project of the Health Effects Institute, September 1990-September 1992. This paper was prepared under a contract from the Health Effects Institute. The National Institute of Environmental Health Sciences grant ES-00002 provided additional support. The author is grateful to Jonathan Samet and William Lambert for their comments and to Grace La for preparing the manuscript.
The challenge, therefore, is to develop study designs that provide contrasting exposures in natural settings. Given that environmental exposures are generally to multiple pollutants, studies that differentiate response to the air pollution mixture will require careful and innovative designs.
A second problem is that while exposures are common, the risks tend to be low. Environmental controls that have been put in place in the United States have generally reduced exposures to levels below the National Ambient Air Quality Standards. The standards, established by the EPA and based on the best available scientific data, were set to prevent any adverse health effects, even among the most sensitive members of the general population. Thus, health effects of air pollution at concentrations currently observed in the United States should be expected to be weak, that is, with relative risks less than 2 and often less than 1.5 for typical exposures.
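The consequence of weak relative risks for study size can be illustrated with a rough calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions. The 10% baseline risk, the two-sided 5% significance level, and the 80% power are illustrative assumptions rather than values from any study discussed here, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def subjects_per_group(baseline_risk, relative_risk, alpha=0.05, power=0.80):
    """Approximate number of subjects per exposure group needed to detect
    a given relative risk with a two-sided test of two proportions."""
    p0 = baseline_risk
    p1 = baseline_risk * relative_risk
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


if __name__ == "__main__":
    # Assumed baseline risk of 10% in the less exposed group.
    for rr in (2.0, 1.5, 1.2):
        print(rr, subjects_per_group(0.10, rr))
```

Under these assumptions the required group size grows rapidly as the relative risk falls from 2 toward 1, which is the practical meaning of "weak" effects in this setting.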
At the present time, it is not sufficient to demonstrate that a certain air pollutant, or mix of air pollutants, is associated with an adverse health effect. Adequate information is available to demonstrate adverse health effects at high concentration. Regulators now require quantitative estimates of the exposure-response associations at concentrations below the National Ambient Air Quality Standards to evaluate adequacy of the standards and for risk and cost-benefit analyses as required under the most recent amendments to the Clean Air Act (5).
Misclassification of exposure is a particular problem in air pollution studies. Personal exposures to air pollution may differ substantially from ambient air data. Innovative methods have been developed for measuring personal exposures, but these methods are labor intensive and often very intrusive on the participants. Thus, the investigator should expect substantial random misclassification of exposure in designing an epidemiologic study. This means that statistical associations will be weakened and larger sample sizes required. Particular attention should be given to the potential for information bias associated with exposure misclassification.
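The weakening of associations by random exposure error can be illustrated with a small simulation. The sketch below assumes a continuous exposure measured with classical (additive, independent) error, which is the continuous analog of random misclassification; the effect size, error variance, and sample size are arbitrary illustrative choices, and the function names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(beta=0.5, error_sd=1.0, n=20000, seed=0):
    """Return (slope using true exposure, slope using error-prone exposure)."""
    rng = random.Random(seed)
    true_exposure = [rng.gauss(0, 1) for _ in range(n)]
    # Measured exposure = true exposure + independent random error.
    measured = [x + rng.gauss(0, error_sd) for x in true_exposure]
    # Health outcome depends on the true exposure only.
    outcome = [beta * x + rng.gauss(0, 1) for x in true_exposure]
    return ols_slope(true_exposure, outcome), ols_slope(measured, outcome)


if __name__ == "__main__":
    slope_true, slope_measured = simulate_attenuation()
    print(slope_true, slope_measured)
```

With equal variances for the true exposure and the measurement error, the slope estimated from the error-prone measure is expected to be about half of the true slope, which is why larger samples are needed to detect the same underlying effect.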
Adverse health effects of environmental pollutants, and air pollution in particular, are generally nonspecific. For example, the development of chronic obstructive pulmonary disease is a cumulative process in which air pollution is only one of many factors that produce irreversible loss of lung function. Likewise, reversible changes in lung function, as in asthma, may be triggered by many environmental exposures, including allergens (e.g., house dust mites, pollens, mold spores, fungi, and animals), infections, medication, exercise, heat, cold, and air pollution (SO2 and O3). This implies that respiratory health end points are often common in the study populations. However, this also implies that studies to evaluate the health effects of air pollution must carefully consider such covariates in the design.
Just as the cause of the respiratory health end points is likely to be multifactorial, exposures to air pollution are, in general, multidimensional. It is the purpose of this paper to address methods of designing and analyzing epidemiologic data to evaluate the health effects of such complex air pollution mixtures.
As an example, consider the association between environmental tobacco smoke (ETS) and lung cancer. It is clear that there is a strong association between active smoking and lung cancer. If these risks are extrapolated down to the exposures expected for a never-smoker exposed to ETS, the estimated risk ratios would be of the order 1.4 for men and somewhat lower for women (6). Estimates combining results from case-control and cohort studies of lung cancers among nonsmoking women married to smokers in the United States (6) produce a summary relative-risk estimate of 1.14. At such low relative risks, alternative environmental causes, such as indoor radon, must be considered. Estimates from population-based studies may be biased toward the null because exposure to ETS is so common that it is impossible to identify a truly nonexposed control population. Thus, risk estimates in ETS epidemiologic studies are based on comparisons to controls with low, rather than no, exposure.

Respiratory Health Effects of Concern
For most air pollutants, indoor or outdoor, singly or in complex mixtures, the respiratory system is the sole or predominant portal of entry into the body and the principal locus of injury. The definition of what constitutes an adverse health effect has been addressed by a committee of the American Thoracic Society (7). Health effects generally are divided into acute and chronic effects. Acute effects are characterized by sudden onset; are usually short-lived, that is, lasting minutes to days; and may be reversible. Chronic effects are characterized by conditions that persist over extended periods of time, possibly years. Although there may be recovery from chronic effects, they may be irreversible and may lead to early mortality. Examples of acute respiratory effects of air pollution include triggering or aggravation of asthmatic attacks, exacerbation of symptoms of chronic obstructive disease, increased upper or lower respiratory infections, transient changes in pulmonary function, increased respiratory symptom reporting, increased respiratory hospital admissions or doctor visits, and increased daily mortality.
Examples of chronic respiratory effects of air pollution include promotion of the development of asthma, increase in nonspecific airway responsiveness, reduced level of lung function, increased rate of lung-function decline, decreased rate of lung growth, development of chronic obstructive pulmonary disease, increased reporting of persistent respiratory symptoms, lung cancer, and increased mortality.

Epidemiologic Study Designs
Epidemiologic methods applied in air pollution research can be described by a small number of study designs. Some study designs are not appropriate or have not been applied to air pollution. Discussing these designs provides a structure for evaluating the potential for investigating the health effects of complex air pollution mixtures.

Cross-Sectional Studies
In cross-sectional studies, health and exposure information are determined at a single point in time. These studies are often described as surveys. This approach is most appropriate for acute rather than chronic effects, that is, health effects that are temporally close to the exposures. They also are appropriate for exposures that have been stable over time. Cross-sectional studies are readily feasible with manageable costs. In such study designs, it is possible to perform intensive monitoring of exposures to complex mixtures.
Cross-sectional studies are not appropriate for studying the effects of exposures (or mixtures) that are changing over time or health effects that occur only after a long latency period. In particular, cross-sectional data cannot describe the longitudinal relation between exposure and the health end point. The potential for selection and information bias in such studies must be considered carefully.
Ecologic studies are a class of cross-sectional studies in which a group rather than an individual is the unit of comparison. Aggregate information rather than individual information is used to describe both exposure and effect. Ecologic studies are straightforward, easily undertaken, and low in cost. However, confounding can be a severe problem in these studies. In air pollution epidemiology in particular, semiecologic studies are common, in which individual health-status data are collected but exposure is determined from a single ambient-air pollution monitor.
In designing cross-sectional studies, it is often possible to select study populations such that exposures are limited to only one pollutant, or the range of exposures to one pollutant is very limited. For example, exposure to ETS could be limited in a study of NO2 or radon by restricting the population to households with no smokers, as in the Albuquerque study of respiratory illness and NO2 exposures in infants (8). In studies of oxidants, exposures to acid aerosols could be limited by considering only communities with low sulfur emissions (e.g., west coast communities). By such restrictions, the effects of individual pollutants that usually are found in mixtures can be assessed.
Alternatively, a factorial design can be implemented in which groups of participants having similar proportions of exposure are chosen based on prior knowledge of exposure or some marker of exposure. A factorial design allows estimation of the separate effects of each pollutant, as well as estimation of the effect of interaction.
In the Six Cities Study of indoor ETS and NO2 (9), participating households were selected randomly from strata defined by previously obtained reports of smoking in the home and the presence of an unvented combustion appliance. The correlation between annual mean concentration of respirable particles (PM2.5) and NO2 measured in these homes was only 0.1, so that the effect of PM2.5 and of NO2 could each be estimated without strong confounding by the other pollutant. In the Harvard 24 Cities Study of the health effects of acid aerosols and ozone, study communities were selected to provide a contrast in the two pollutants (10). Existing ozone measurements for each community were examined along with measured sulfate and other indicators of the potential for acid-aerosol exposure. The purpose of this design was to optimize the power of this study to estimate the separate effects of acid aerosols and ozone. Similar selection criteria could be used in selecting households for inclusion in a study of ETS and radon.
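The value of a low correlation between two pollutants can be made concrete with the variance inflation factor. In a linear model with two exposures, the variance of each estimated coefficient is multiplied by 1/(1 - r^2) relative to a design in which the exposures are uncorrelated, where r is the correlation between them. The sketch below is a minimal illustration under that two-exposure linear-model assumption; the function name is hypothetical, and only the correlation of 0.1 comes from the text.

```python
def variance_inflation(correlation):
    """Variance inflation factor for either coefficient in a linear model
    with two exposures whose correlation is `correlation`."""
    if not -1.0 < correlation < 1.0:
        raise ValueError("correlation must lie strictly between -1 and 1")
    return 1.0 / (1.0 - correlation ** 2)


if __name__ == "__main__":
    # 0.1 is the correlation reported in the text; 0.9 is an assumed
    # value standing in for highly correlated ambient pollutants.
    for r in (0.1, 0.5, 0.9):
        print(r, variance_inflation(r))
```

At a correlation of 0.1 the inflation is about 1%, whereas at an assumed correlation of 0.9 the variance is more than five times larger, so a study would need a correspondingly larger sample to estimate the separate effects with the same precision.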
Populations also can be studied cross-sectionally in time. For example, rates of diseases can be compared temporally within a community with time-varying air pollution. Chronic effects can be estimated by comparison of annual disease rates with changing concentrations of air pollution. For example, can communities be identified in which sulfate concentrations, a marker of maximum aerosol acidity, have dropped while ozone concentration has risen? Acute effects, such as daily mortality or hospital admissions, can be compared with daily air pollution measurements. These acute health-effects studies are usually described as time-series analyses. For a complex mixture, if the pollutants are not correlated perfectly, it is possible that the separate and joint effects can be estimated. In studies of ozone and acid aerosols, there is generally high correlation between the two exposures. An alternative strategy might be to perform a time-series study in separate communities with contrasting mixtures of these pollutants. For example, a community with both ozone and acid aerosols versus a community with ozone alone might be studied. Optimally, we would also want to study a community with acid aerosols but no ozone.
In this sense, point sources of pollution may offer unique opportunities to investigate individual effects of pollutants that are usually found in complex mixtures. For example, NO2 usually is found in photochemical smog along with CO and O3. Shy et al. (11) examined the effects of NO2 produced by a TNT plant in Chattanooga, Tennessee. Similarly, a study of a community adjacent to a sulfuric acid plant could provide unique information on the health effects of acid aerosols in the absence of oxidants.
Populations in developing countries are exposed routinely to air pollution concentrations and mixtures that are no longer seen in the United States or elsewhere in the developed world. Unique opportunities exist in such communities for studying mixtures of air pollution at extreme concentrations or in mixtures of pollutants not generally observed in the United States.

Cohort Studies
In cohort studies, subjects are selected based on exposure status and are followed to monitor the development of a specific health end point. Cohort studies can be conducted prospectively or retrospectively. In a prospective cohort study, exposure status is determined from current or historical records and the subjects are followed to monitor the development of disease. This design is not appropriate for rare diseases but works well for common end points. Many disease end points can be considered simultaneously with little increase in cost.
For prospective cohort studies, extensive exposure assessment can be undertaken. Prospective cohort studies are especially efficient for assessing acute associations of air pollution exposures and respiratory health end points that vary over time.
The disadvantages of this design are the potential difficulty and high cost of implementation. The follow-up of study populations over extended periods of time is difficult. Large numbers of subjects are required if rare diseases are to be considered. This study design generally has weak power to measure interactions.
As in the cross-sectional study, interaction between pollutants in a complex mixture can be limited by restriction criteria on the sample cohort such that one pollutant is missing or its range is limited. Factorial designs also can be implemented to ensure adequate sample sizes for each pollutant individually and for the joint distribution.
For a two-pollutant mixture, a factorial design allows the separate and joint effects of each pollutant to be estimated. In such a design, study subjects are selected such that there are equal numbers (or constant proportions) in each of the four cells defined by dichotomized exposure (high versus low) to one pollutant crossed with dichotomized exposure to the second pollutant. As an example, in a study of indoor radon and ETS exposures, never-smoking subjects could be selected based on radon levels in their homes (e.g., above or below 4 picocuries/m3) and having a spouse who is a smoker (yes or no). A cohort with equal numbers of subjects in each of the four exposure groups would allow estimation of separate effects of radon and smoking, as well as their interaction. However, as has been noted earlier, for a rare event or an end point with a long latency, such as cancer, such a factorial cohort study would require extremely large sample sizes.
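The estimation that a two-by-two factorial cohort permits can be sketched as follows. The code computes relative risks for each exposure alone and for joint exposure, each relative to the doubly unexposed cell, and then summarizes interaction on the multiplicative scale (ratio of the joint relative risk to the product of the separate ones) and on the additive scale (relative excess risk due to interaction). All counts are invented for illustration, and the function and key names are hypothetical.

```python
def factorial_effects(cells):
    """Summarize a 2x2 factorial cohort.

    `cells` maps (radon_high, spouse_smokes) -> (events, subjects).
    Relative risks are computed against the (False, False) cell.
    """
    ref_events, ref_subjects = cells[(False, False)]

    def relative_risk(cell):
        events, subjects = cells[cell]
        return (events * ref_subjects) / (ref_events * subjects)

    rr_radon = relative_risk((True, False))
    rr_ets = relative_risk((False, True))
    rr_both = relative_risk((True, True))
    return {
        "rr_radon_only": rr_radon,
        "rr_ets_only": rr_ets,
        "rr_both": rr_both,
        # Equals 1 when the joint effect is exactly multiplicative.
        "multiplicative_interaction": rr_both / (rr_radon * rr_ets),
        # Equals 0 when the joint effect is exactly additive.
        "additive_interaction_reri": rr_both - rr_radon - rr_ets + 1.0,
    }


if __name__ == "__main__":
    # Hypothetical cohort with equal numbers in each of the four cells.
    example = {
        (False, False): (10, 1000),
        (True, False): (20, 1000),
        (False, True): (15, 1000),
        (True, True): (30, 1000),
    }
    for name, value in factorial_effects(example).items():
        print(name, value)
```

Balanced cells matter because the joint-exposure and single-exposure cells each contribute to the interaction contrast; in an unselected population one or more of these cells is typically sparse, which is what limits power.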
Prospective cohort studies have been used successfully to evaluate the acute effects of time-varying exposures to single air pollutants on daily reports of symptoms and changes in pulmonary function. For example, Pope et al. (12) studied a panel of school children and asthma patients in a location with pollution from particles only. Symptom reporting, peak flows, and medication for asthma were each associated with PM10. Clinical studies have suggested that exposure to one pollutant may potentiate the subsequent effect of exposure to a second pollutant. For example, Koenig et al. (13) found that exposure to ozone potentiates the subsequent response to sulfur dioxide among adolescent asthmatics. In the ambient environment, however, exposures to complex mixtures usually are highly correlated temporally such that differentiating associations may be impossible. Study populations with unique characteristics may allow the investigation of serial exposure to multiple pollutants. For example, the acute effects of ETS and NO2 may be different among subjects exposed to both pollutants simultaneously, as opposed to subjects exposed only to ETS at work and NO2 at home.

Case-Control Studies
In a case-control study, subjects with a specific outcome of interest, the cases, are identified. A control series also is identified consisting of persons without the disease who potentially would be selected as cases if they were to develop the disease. Exposure histories of both cases and controls are determined and compared to estimate the risk of disease associated with exposure.
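The comparison of exposure histories between cases and controls is commonly summarized as an exposure odds ratio. The sketch below computes the odds ratio from a two-by-two table together with an approximate 95% confidence interval using the logit (Woolf) method; the counts are invented for illustration and the function name is hypothetical.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Exposure odds ratio with an approximate 95% confidence interval
    (logit method). All four counts must be positive."""
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


if __name__ == "__main__":
    # Hypothetical study: 100 cases and 100 controls.
    print(odds_ratio(40, 60, 20, 80))
```

Random misclassification of exposure would move exposed and unexposed subjects between cells in both the case and control series, pulling the estimated odds ratio toward 1.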
Case-control studies are particularly efficient for assessing risks associated with infrequent diseases and diseases with long latency periods. Generally, only one health end point can be considered, but multiple exposures can be evaluated with little additional cost.
Exposure is ascertained retrospectively or estimated from current measurements. Thus, there is potential for substantial random misclassification of exposure. Information bias is possible if there is not careful blinding of disease status of the participants. Selection bias is possible if cases and controls are not drawn from comparable populations. Case-control studies of the effects of air pollution have been infrequent, perhaps because of the difficulty of reconstructing past exposures with acceptable precision (1).
Nested case-control studies are a hybrid design in which cases and controls are selected from within a larger cohort of subjects being followed historically or prospectively. The disease outcome is determined for all subjects in the cohort, but exposure information is determined only for the subset of subjects who develop the disease, that is, all cases, and a subset of subjects selected as controls. Nested case-control studies have been particularly efficient in cohort studies in which blood or other biological samples have been obtained and stored as part of regular evaluations of the study cohort. This approach makes efficient use of the measurement of biomarkers when the costs of the measurement are high. If biologic indicators of exposures to air pollutants can be identified, this design could be especially efficient.
The case-control design has been used widely to investigate the associations of lung cancer with exposure to ETS and to indoor radon. However, because exposures are estimated retrospectively, it is not clear that such a study can be designed to assess interaction of pollutants. Lubin et al. (14) have shown that testing for the interaction of active smoking and indoor radon exposure will require substantial numbers of subjects, possibly more than would be feasible in a single study. Evaluating interactions of indoor radon with ETS will be even more difficult. The off-diagonal exposures, that is, subjects with exposure to one but not both pollutants, can be enriched by selecting cases from populations with limited exposure to one of the pollutants. Restriction can improve the power of the study to estimate separate effects of pollutant mixtures. For example, cases and controls could be identified in areas with low smoking rates to reduce exposure to ETS but with high potential for radon exposures, or in areas with low radon potential to investigate the univariate associations with ETS.

Intervention Studies
In intervention studies, the investigator adds or reduces exposures to a cohort and then follows the cohort, assessing the impact of the intervention. In medical interventions, this approach, in which patients are assigned randomly to a treatment regimen (the randomized clinical trial), is considered the standard for inference and tests of causality. Studies in which air pollution is increased for specific subjects may be unethical. However, studies of subjects with reduced exposures would be acceptable.
In particular, Goldstein et al. (15) have described a cohort study of the acute effects of NO2 in which lung function of women was measured before and after cooking a meal on a gas range. Lung function was also measured before and after cooking an equivalent meal with a portable electric stove replacing the gas stove. A larger scale intervention could be considered in which a cohort of subjects was evaluated for acute effects before and after changing their stove from gas to electric or electric to gas. A less intrusive intervention might be possible through installing an air-cleaning device specifically to remove ETS or through venting the exhaust of the cooking stove.
Special opportunities can sometimes be found in which specific pollutants are controlled unexpectedly. For example, Pope (16) performed an elegant analysis of the effects of particulate air pollution on hospital admissions based on a strike at a steel mill in Utah Valley. The steel mill was the primary source of particulate pollution in the valley. During the winter, when inversions develop, particulate levels build up to concentrations above the standards. Concentrations of other pollutants usually associated with particulates, that is, SO2, NO2, and O3, were low. During the winter of 1986-1987, the steel mill was closed because of a strike, and particulate concentrations were reduced substantially. Comparison of respiratory hospital admissions in the strike year with those in the years before and after showed a 2-fold decrease in admissions among children. This study of opportunity has provided the clearest information yet on the effects of particulates alone, a pollutant usually observed in a mixture with other pollutants.

Occupational Studies
Occupational cohorts are valuable resources for epidemiologic studies of environmental risks. Cohorts are assembled easily and exposure estimation methods are well developed. The range of exposures can be large, facilitating the detection of associations. On the other hand, exposures often are much greater than those relevant for air pollution studies.
Occupational studies may provide opportunities to study exposures to single pollutants that are not possible in the ambient environment. For example, ozone exposures can be found in occupational settings without acid aerosols or nitrogen oxides.
Occupational studies also have furnished information on interactions that provide guidance for environmental studies. The interaction of active smoking and radon exposures has been demonstrated in uranium miners. Direct tests of interaction may be possible only at such extremes of exposure.

Migrant Studies
Epidemiologic studies of migrant workers have been particularly useful in disentangling the effects of heredity and environment. It is possible that studies of families moving into or out of areas of high pollution could provide insights into the relative contribution of individual components of multipollutant mixtures. For example, families moving from southern California, where oxidant concentrations are high but where acid aerosol concentrations are very low, to the Northeast, where both oxidants and acid aerosols can be elevated, could provide information on the modification of the ozone effect by acid aerosols. However, there are many other environmental changes that would be associated with such a move that also must be considered. Selection bias is also possible if the families have moved, at least in part, for health reasons.

Summary
Epidemiologic studies of the respiratory health effects of air pollution are difficult for the following reasons: a) Exposures are common, so developing contrasts is challenging. Maximum exposures have been reduced in the United States by control strategies. Populations free of exposure to air pollution cannot be found. b) There may be substantial misclassification of exposure. Ambient monitors do not reflect the range of exposures experienced by individuals. Personal monitors provide only a short sample of an individual's time-varying exposure. c) Exposures are multifactorial. Air pollution exposures are universally to multiple pollutants. In addition, other environmental insults, such as temperature and aero-allergens, may be correlated with air pollution exposures. d) Respiratory health end points are multifactorial, with air pollution being only one, and possibly only a minor, etiologic factor. e) Effects are weak and, therefore, difficult to detect. Nevertheless, information is needed to quantify health effects to the lowest observed concentrations.
If the investigation of mixtures of pollutants is not considered in the design of an epidemiologic study, then it is unlikely that the study will have sufficient power to detect interactions or even the separate effects of the individual pollutants in the analysis. Nevertheless, with innovative and well-thought-out study designs, it should be possible to measure the separate and joint effects of multiple pollutants in a mixture. No particular study design stands out as offering the most potential for disentangling the separate and joint effects. Creative epidemiologic designs and studies of opportunity can provide insights into these issues. If epidemiology were simply a matter of analyzing health and exposure data, we could set a computer to work regressing the vast stores of national health data against the immense amount of air pollution data that has been gathered. Fortunately for the epidemiologist, an elegant study design is more compelling than an elegant analysis.