Methods to assess respiratory effects of complex mixtures.

This paper evaluates the influence of exposures on acute and chronic airway obstruction. Clinical, physiological, and immunological aspects are important in evaluating the effects of the pollutant exposures. Aspects of the exposure-response relationships important enough to record are those factors interactive with the pollutants (e.g., smoking and other personal/behavioral factors) and precursor conditions. To determine baseline status and study chronic effects, one uses standardized and modified health questionnaires and standardized pulmonary function. Confirmatory studies of responsive airways, potentially assessed first by diurnal peak flow, can be done using post-bronchodilator maximum expiratory flow volume curves and methacholine challenges. Immunoglobulin determinations for immunological status (a predisposing/susceptibility factor), allergy skin tests (for immediate hypersensitivity status), and blood counts (mostly for eosinophils) are also important. Other tests that could be performed include expired carbon monoxide and/or carboxyhemoglobin and methemoglobin (for smoking and combustion exposures). Measures of acute effects are symptomatic responses (by questionnaires and diaries), responses of the airways (as measured by spirometry and peak flows), and changes in medication usage or associated medical care (in diaries). Methodologies should also include discussions of protocols and analysis.


Introduction
Quantitative assessments of the adverse health effects of complex mixtures of air pollution in populations usually require the use of difficult and costly epidemiological and exposure assessment techniques to evaluate interactions among pollutants in producing measurable health responses (1-4). This is a difficult task, as the wide variety of contaminants have varying temporal-spatial concentrations, but it is necessary because the contributions from the indoor environment are thought to be major (e.g., from combustion products, volatile organic compounds [VOCs], and environmental tobacco smoke [ETS]). Designing effective assessments of total exposures will require knowledge of the concentration and distribution of each class of contaminants, joint distributions of contaminants, and the various associated impacts on health. Estimates of contributing factors, their independent and collinear distributions, and the number of people exposed to different combinations of these factors are required, even for selecting the study populations. The expected time interval between exposure(s) and the eventual health effect must be taken into account. Further, patterns of response (5) and the interaction between pollutants and other factors such as socioeconomic characteristics of the home, smoking, and occupational exposure can only be shown in population studies (2-4). Designs and methods for such exposure-response studies are necessary, especially newer and more cost-effective techniques.
The old approach of evaluating the relationship of a single factor to a single effect (one-at-a-time approach) has limitations in both experimental and observational epidemiological studies. The ability to relate experimental results to noncontrolled situations (where all other variables are not constant) has been limited by an overly simple design that excludes important covariates from the design and analysis. In epidemiological studies, a univariate approach will restrict attempts to rule out or control for confounding factors, which might produce similar effects, and may provide unreliable evaluations of diseases or physiological responses when these are related to several factors (1,4,6).
Multifactorial experimental designs can permit more realistic evaluations of the relationships between variables. Because of the special characteristics of the exposure-response relationships, such studies will require designs different from those of traditional epidemiological studies (7).
On the other hand, a possible disadvantage of a study of complex mixtures is that either larger sample sizes or longer periods of follow-up are required to provide adequate numbers of subjects or person-days of observation for multivariate or time-series analyses. However, effective specification of study objectives should imply what combinations of exposures are of interest and reduce the need for completely blocked designs (looking at all possible combinations). By focusing the exposure/health assessments on subpopulations where there are likely to be exposures or responses of interest, the number of cases can be reduced. Then the study design must provide precise measurements of the exposure as well as health effects. This is necessary to identify the effects of the contaminants, as distinguished from any other factors related to health. Existing combinations of exposures are chosen for detailed evaluations of plausible causal relationships while controlling for confounding factors. Because total human exposure to air pollutants occurs in a variety of environments, with contributions from several sources and with mixtures of pollutants, the problem of uniquely identifying cause-effect relationships for a single pollutant is complex.
Further, the selection of methods to assess specific health effects depends on the characteristics of the exposures and of the anticipated responses. These methods (and their quality control) have been widely discussed (4-7). The methods include questionnaire surveys, tests of physiological function, bioassays, and biological monitoring. Also, applications of clinical and psychophysical methods in studies of indoor pollutants have proved successful recently (5-8). For multipollutant studies, a combination of instruments is required to determine both exposures and responses.
A typical study might include health evaluations using standardized health questionnaires and baseline pulmonary function measurements. For acute changes that can be recorded, one uses daily diaries (symptoms, medications, doctor visits, general time allocation); diaries were developed clinically, then epidemiologically, over the past few decades (6,9-12). For objective pulmonary changes, one can use peak expiratory flow measurements (at least two times daily), which are used clinically to assess short-term changes and variations in responses (see below). Subjects are given bronchodilator challenges and allergen skin tests, and samples are taken for CO, NO, and ETS. Serum, which can be frozen, is used for biochemical and immunochemical determinations. Such evaluations help to classify subjects for distinct studies. Although used occupationally and clinically, radiographs, especially for airway obstructive diseases (AOD), have not proved to be especially useful in community populations in general or in studying effects of complex mixtures indoors.

Questionnaires
The most commonly used tool for deriving health indicators of chronic disease is the questionnaire. The questionnaire includes demographic information, anthropometric information, and information on acute and chronic disease history. It also often includes information on family history, occupational exposure histories, and residential histories. Age and sex are the most important covariables. Height, race, and weight are critical for use with function studies. Marital status, socioeconomic status, and other demographic variables may be important intervening or risk variables.
The chest-allergy history questions should include questions on productive cough, wheeze, exertional dyspnea, attacks of wheezing dyspnea, angina, history of acute respiratory illnesses, history of childhood illnesses, presence and physician confirmation of cardiovascular, restrictive lung, and airway obstructive diseases (emphysema, chronic bronchitis, bronchiectasis, asthma), history of other cardiopulmonary diseases, history of allergy and of sinusitis, history of any abnormal chest X-rays, and hospitalizations for chest problems or thorax surgery. An occupational history should be complete in terms of intensity and duration of exposures. There should be an extensive smoking history. The residential history is important in terms of potential exposures in the environment.
Much of this information has been developed and used in standardized questionnaires. The best-known respiratory questionnaire is the British Medical Research Council questionnaire, which has undergone revisions since around 1954 (13-15). The pneumoconiosis unit of the Medical Research Council (MRC) also has its own questionnaire, which compares favorably to the regular MRC questionnaire (16). Some variability in questionnaire responses has been demonstrated with these questionnaires (17,18). A cardiovascular questionnaire has also been developed (19).
The Division of Lung Diseases, National Heart and Lung Institute (NHLI), created a standard respiratory questionnaire predominantly based on the MRC questionnaire (20), which has been used widely in studies of chronic respiratory disease in the United States. A comparison of the MRC, the NHLI, and a new self-completion questionnaire (21) has shown few differences with minor changes in wording, order, type of administration, or mode of administration. Recently, the American Thoracic Society (ATS), under contract with the Division of Lung Diseases (NHLBI), developed new standard adult and pediatric respiratory questionnaires, which included the information mentioned previously (22). Reported comparisons of the new questionnaire with the old NHLI and with the MRC questionnaire also have shown few differences in regard to minor word changes or mode of administration (23). Thus, all the questionnaires discussed can be administered or self-completed and can be done by mail or by phone.
All questionnaires should be either standard or pretested. All questionnaires should require complete documentation of modes of response. All interviewers who administer or check questionnaires should be extremely well trained to minimize interviewer bias. These techniques have been developed at length by various survey research centers and by the committee that devised the previously described standardized questionnaires. Randomness of administration and similar issues have also been discussed before (4,21-23).

Pulmonary Function Tests
Pulmonary function tests of various types are used clinically and epidemiologically to measure functional status, to assess disease, and to measure changes that occur in function related to treatment, a laboratory challenge, or environmental exposures. The forced vital capacity (FVC) maneuver, usually obtained with a spirometer or pneumotachograph, provides the absolute values of measurements, usually adjusted to a reference population, that have meaning in terms of disease status. Long-term changes in these measurements provide indications of growth and decline of function and of the effects of disease and exposures. The FVC maneuver tests normally measure the FVC, the forced expiratory volume in 1 sec (FEV1), and flow rates. When one derives a full flow-volume curve or loop, one can measure the flows at 50 and 75% of expired vital capacity. These are somewhat more sensitive measures of function and changes in function and are thus useful additions to the standard spirometric measures (22,24).
The criteria for these tests have been described in detail (22,25). These use consistent (reliable) methods. All the instruments require extensive testing and calibration. They all require excellent, well-trained technicians, appropriate instructions to the subject, and careful evaluation of the tests. Present guidelines require that at least three good tests be performed and that the two best tests be within 5% or 100 mL of one another. The maximum performance or the average of the best two performances can be used (22,25). The background of the test and its usefulness have been extremely well documented by the ATS report, and there are normal standards using these tests (26).
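The repeatability rule described above can be expressed as a short check. The following is a minimal sketch only, not the ATS procedure itself: the function name, the dictionary keys, and the reading that the 5% tolerance is taken relative to the larger value (with either tolerance being sufficient) are assumptions made for illustration.

```python
def summarize_fvc_trials(trials_ml, rel_tol=0.05, abs_tol_ml=100.0):
    """Check technically good FVC trials (in mL) against the repeatability rule.

    Hypothetical helper: assumes the two best trials agree if they differ by
    no more than 100 mL OR by no more than 5% of the larger value.
    """
    if len(trials_ml) < 3:
        return {"acceptable": False, "reason": "fewer than three good tests"}

    # Two largest values among the good trials
    best, second = sorted(trials_ml, reverse=True)[:2]
    diff = best - second
    reproducible = diff <= abs_tol_ml or diff <= rel_tol * best

    return {
        "acceptable": reproducible,
        "best": best,                          # maximum performance
        "mean_best_two": (best + second) / 2,  # average of the best two
        "difference_ml": diff,
    }


if __name__ == "__main__":
    print(summarize_fvc_trials([4200, 4150, 3900]))
    print(summarize_fvc_trials([4200, 3900, 3800]))
```

Either the `best` value or the `mean_best_two` value could then be carried forward as the subject's result, matching the two options the guidelines allow.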
One condition that influences pulmonary function temporally is bronchial responsiveness, considered to be a key risk factor in the development of AOD (27-31). Short-term changes outside the laboratory can be assessed using related measurements such as the peak expiratory flow rate (PEFR); the PEFR can be obtained in absolute terms from the FVC maneuver as well (9,22,32,33). It is recommended that the PEFR from the FVC maneuver be used for quality assurance, but not as an absolute measure of function (22,34).
The mini-Wright peak flow meter is a simple and inexpensive device that has proved useful in many studies, especially of patients with reactive airway diseases, and preliminary studies have indicated its reliability and validity (40,42-44,55). Analysis of day-to-day variation in peak expiratory flow rates has been used to help determine the short-term effects of environmental factors on respiratory function (9,29,32,35,39,43-45,50,56,57).
The ATS and ACCP (American College of Chest Physicians) have stated that an increased responsiveness of the airways to various stimuli, manifested by slowed forced expiration that changes in severity either spontaneously or with treatment, is a characteristic of asthma. The definition includes reference to bronchial reactivity though linked to airway changes; airway response occurs in patients with asthma and with rhinitis as well and can occur in those with "bronchitic" (mucous hypersecretion) syndromes (58; C. J. Holberg, personal communication). It occurs in some percentage of normal subjects as well (27-29,32,33,37,39,43,45,46,51,56,57,59,61). There is a fair amount of evidence that bronchial responsiveness (or lability) is greatest in early childhood and decreases thereafter (32,33,57,62). Davies et al. (43) have argued that the PEFR may be of similar value to histamine/methacholine provocation tests to distinguish asthmatics, based on a UK delineation of asthma as a disease characterized by wide variations over short periods of time in resistance to airway flow (63). "Traditionally, the diagnosis has rested on the demonstration of this phenomenon either by measuring airflow obstruction (FEV1) before and after a bronchodilator or more recently recording PEFRs using a mini-Wright peak flow meter 3 or 4 times a day" (40). The advantage of measuring PEFR extends beyond the diagnosis of asthma, and it provides an objective assessment. This type of test has many advantages over bronchial provocation testing: peak flows are recorded by the patient or subject, and the test is entirely safe; peak flows can be measured irrespective of the severity of the airflow obstruction.
Example. Because most air pollutants are inhaled and have their major impact through that route of entry, we have empirically determined differential responses to air pollutant exposures in the general population. We have worked on a framework for using intraindividual variability in population subgroups. This approach is used to define criteria for responsiveness and for predicting responses to indoor and outdoor (i.e., total exposure to) pollutants, aero-allergens, weather, and interactions of these environmental factors. The use of peak flow rates or other measures of pulmonary function to detect abnormality or significant change must incorporate the normal range of biological variability and the normal degree of intrasubject variability in each test (32-34). Criteria for significant intraindividual changes were derived from healthy subjects (the "normal" reference population), identified as follows: a) absence of reported respiratory symptoms (diary and health questionnaire); b) absence of smoking history, current or past; and c) absence of acute respiratory illness symptoms, including sore throat, cough, wheezing, or attacks of shortness of breath with wheezing. Diurnal changes in PEFR were evaluated using the ratio of the maximum of the noon PEFR (N) or evening PEFR (E) to the minimum of the morning PEFR (M) or bedtime PEFR (B), and the amplitude (maximum-minimum) as a percentage of the day's mean PEFR. The 95th percentiles of the distributions of these scores and individual means were used to identify criteria to screen each day's scores for excessive changes and to classify the overall study population for subsequent testing. The amplitude/mean was less than 30% for 95% of 1201 person-days in the 5- to 15-year-old group. The corresponding max (N,E)/min (M,B) ratio was 128%. These were about twice the limits of the 15 to 35 age group (amp/mean, 16.9%; max/min, 115.6%), although the 35 to 65 year olds were intermediate (amp/mean, 21.2%; max/min, 119.8%).
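The two diurnal scores and the age-specific screening limits quoted above can be sketched as follows. This is an illustrative reading, not the authors' program: the function and key names are hypothetical, the amplitude is assumed to be taken over all four daily readings, and the 5 to 15 year amplitude limit (reported only as "less than 30%") is approximated as 30.0.

```python
from statistics import mean

# 95th-percentile limits quoted in the text, in percent.
# Assumption: 30.0 stands in for the "less than 30%" limit of the 5-15 group.
LIMITS = {
    "5-15": {"amp_mean": 30.0, "max_min": 128.0},
    "15-35": {"amp_mean": 16.9, "max_min": 115.6},
    "35-65": {"amp_mean": 21.2, "max_min": 119.8},
}


def diurnal_scores(m, n, e, b):
    """Return (amplitude/mean %, max(N,E)/min(M,B) %) for one day's PEFRs.

    m, n, e, b are the morning, noon, evening, and bedtime readings.
    """
    readings = [m, n, e, b]
    # Assumption: amplitude = highest minus lowest of the four readings
    amp_mean = 100.0 * (max(readings) - min(readings)) / mean(readings)
    max_min = 100.0 * max(n, e) / min(m, b)
    return amp_mean, max_min


def day_exceeds(m, n, e, b, age_group):
    """True if either score exceeds the limit for the subject's age group."""
    amp_mean, max_min = diurnal_scores(m, n, e, b)
    limit = LIMITS[age_group]
    return amp_mean > limit["amp_mean"] or max_min > limit["max_min"]


if __name__ == "__main__":
    print(diurnal_scores(400, 440, 460, 420))
    print(day_exceeds(300, 400, 420, 320, "15-35"))
```

Applying `day_exceeds` to every diary day yields one flag per person-day, which is the form of input a per-subject classification would need.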
Individuals having more than 5% of their test days exceeding either of these limits were classified as having excessive diurnal variation. More than 30% of the subjects exceeded these limits; this classification was able to detect about 60% of the currently diagnosed asthmatics, with a specificity near 70%. The sensitivity was highest for the youngest age group (75%) and lowest for the oldest age group (55%). Diurnal variability was not appreciably changed by smoking status, but was related to a number of the reported symptoms and grouping. Individuals exceeding diurnal limits were more likely to report allergic-irritation-type symptoms (62.2/53.1%, p < 0.05), acute respiratory illness symptoms (48.5/36.1%, p < 0.01), and, specifically, cough and sore throat, wheezing or whistling in the chest (16.6/7.2%, ratio = 2.3, p < 0.001), and shortness of breath with wheezing (11.2/4.0%, ratio = 2.8, p < 0.001). The ratio for chest colds was 1.8 (p < 0.01), as expected from previous work. However, these more susceptible subjects did not report more of the nonspecific symptoms (e.g., headache, dizziness, nausea), indicating specificity of the criteria.
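The subject-level classification and its comparison with diagnosed asthma can be sketched in a few lines. This is a hedged illustration under stated assumptions: the function names are hypothetical, each subject is represented by a list of per-day flags (True when a day exceeded either limit), and both diagnosed and undiagnosed subjects are assumed to be present so that the two proportions are defined.

```python
def has_excessive_variation(day_flags, max_fraction=0.05):
    """Classify one subject: True if more than 5% of test days exceeded a limit."""
    if not day_flags:
        raise ValueError("no test days recorded")
    return sum(day_flags) / len(day_flags) > max_fraction


def sensitivity_specificity(classified, diagnosed):
    """Compare the classification with diagnosis (parallel lists of booleans).

    Assumes at least one diagnosed and one undiagnosed subject.
    """
    pairs = list(zip(classified, diagnosed))
    tp = sum(1 for c, d in pairs if c and d)          # flagged, asthmatic
    fn = sum(1 for c, d in pairs if not c and d)      # missed asthmatic
    tn = sum(1 for c, d in pairs if not c and not d)  # correctly not flagged
    fp = sum(1 for c, d in pairs if c and not d)      # flagged, not asthmatic
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Exactly 5% of days over the limit does not count as "more than 5%"
    print(has_excessive_variation([True] + [False] * 19))
    print(has_excessive_variation([True, True] + [False] * 18))
    print(sensitivity_specificity([True, False, True, False],
                                  [True, True, False, False]))
```

The threshold is applied as a strict inequality to follow the wording "more than 5%"; a study using "5% or more" would need `>=` instead.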

Other Pulmonary Function Tests
The other highly recommended test in studies of restrictive lung diseases is the carbon monoxide diffusing capacity. It has been extensively reviewed and described (22). It is a difficult test to perform in field conditions. It has a large degree of variability and is the most semiquantitative in nature. However, there are standard methods of performing the test (22). There is evidence that in some patients with interstitial lung disease, the disease manifests itself first and primarily by restriction and reduced FVC, whereas in other persons it is the diffusing capacity that becomes abnormal first; this trend continues in longitudinal studies up to 10 years (22). Recent epidemiological studies of AOD have also shown the usefulness of the carbon monoxide diffusing capacity test (64).
Because a major operating feature of the lung is its elastic recoil, studies of pressure-volume curves using plethysmography are of great physiological value. However, they are extremely difficult to do in field settings and are both time consuming and uncomfortable. Plethysmography can also be used to obtain the total lung capacity and residual volume of the lung, whose ratio varies with both obstructive and restrictive disease. Available standard methods and their usefulness have been described in detail elsewhere (22). Other methods, such as helium dilution and nitrogen washout techniques, are also available to determine these values, but are not as accurate. None of the methods for obtaining accurate total lung capacity or residual volumes are readily available for field studies, and they require extensive knowledge and equipment. Measures of airway resistance are also useful tests in a laboratory setting. They present a great deal of difficulty and variability and have not been shown to be critical result indicators of chronic respiratory disease (22).

Other Tests
Carboxyhemoglobin measures are important in many conditions in which carbon monoxide exposure is pertinent, as is methemoglobin for NO. Immunologic tests of atopy and/or alveolitis, including tests of type I and type III hypersensitivity and the role of circulating immunoglobulins, are useful in any studies of respiratory disease in which an immunologic, atopic, or allergic mechanism is either latent or manifest (59). Other immunologic tests (e.g., complement, lymphocyte stimulation) also appear promising.
Genetic predisposing factors to chronic respiratory diseases are very important. However, no specific genetic factors have been so clearly indicated as to be valuable in screening studies. The roles of α1-antitrypsin protease inhibitor phenotypes (65) and other possible genetic factors, such as HLA histocompatibility antigen, GM, ABO blood group, ABH secretor status, and ceruloplasmin, have not been proven. Several autoantibodies, such as ANA, RF, and DNA antibodies (66), play a role in respiratory disease but cannot be used epidemiologically. Bronchoalveolar lavage and mucociliary clearance studies have been shown to be important in general, but not as health outcomes. Collagen, elastin, and surfactant are highly critical in the functioning and the structure of the lungs, but are also impractical and uncertain as health indicators. Many clinical techniques and other biochemical studies and studies of lung defense mechanisms are likely to be developed further and may prove someday to be highly useful.

Discussion
More complex study designs, involving stratified or nested sampling, require the use of appropriate analytic procedures (e.g., multivariate analysis of variance) to incorporate restrictions on randomization that are specified in the sampling design (e.g., mixed model or repeated measures ANOVA); these require caution in their application and in interpreting results. Statistical analyses, no matter how complex, relate only to the strength of association, not to the determination of causation. Other considerations include (67) the consistency of the relationship observed within a study and by other studies; the specificity of a pollutant related to a particular response (although this is less important when a pollutant produces several responses or when the same response can result from multiple causes) (68); the temporal relationship of response following exposure; the existence of a biological gradient or exposure/dose-response relationship (linear or nonlinear); the plausibility of this relationship or the coherence between biological mechanisms of response that are implied by it with "generally known facts of the natural history and biology of the disease" (e.g., toxicological or experimental studies); and whether experimental or semi-experimental modification of exposure affects the frequency of responses.
In population-based studies a considerable amount of effort is directed at characterizing, selecting, and recruiting a suitable study population. This includes the initial identification and screening of the total population, sampling to obtain the stratified cluster allocations for the study population, and various follow-ups (e.g., on nonrespondents for characterization). The initial screening includes obtaining estimates of concentrations of different contaminants in the different exposure settings. Applying the direct exposure- and health-assessment program to the selected population groups involves additional efforts to enlist their support for the duration of the study (4). These activities are common to both single pollutant and complex mixture studies. The major difference is in what information is collected by the direct assessments: how efficiently the study makes use of its study population, and what questions the study will be able to answer effectively because it contains adequate numbers of subjects who are exposed and nonexposed to both particular contaminants and mixtures of contaminants. Analytic adjustments will not compensate for inadequate designs. The additional effort needed to collect several types of samples when already making the measurements is usually small relative to the initial efforts required to obtain the subject's support. A considerable amount of additional information can be gained by identifying mixtures or combinations of exposures that are of interest and evaluating their direct and interactive effects on health. This provides for direct control of other, possibly confounding, factors in the design by selecting subpopulations and/or by having measurements of the other factors for indirect (covariance) adjustments.
When determining what factors to measure and selecting techniques, researchers should consider the demands (time and effort) and inconveniences (e.g., noise and appearance of monitors) placed on the study subjects relative to the benefits (direct and indirect) they will receive. Against the temptation to add more questionnaires or monitors is the risk of overwhelming the subjects and losing their support. (One benefit of offering both monitoring and health evaluations to subjects is that they receive medical and diagnostic test results that are of interest to them.) In some studies, where clinic visits were necessitated and where blood has been required, refusal rates went above 50% (69). In well-motivated populations, in which neither blood nor other painful techniques are required, refusal rates are generally less than 25% (21). Surveys using vans cannot use highly sensitive instruments. If properly managed, refusal rates will run between 20 and 40% (70).
Field studies require the most hardy equipment. Whether studies are performed in the home, motels, or places of work, field studies usually rely on the simplest and most hardy spirometric techniques and on questionnaires. Occasionally, blood samples can be drawn and skin testing can be performed. In addition, a few ancillary measurements, such as blood pressure, can be obtained. If appropriately performed, field studies should have the lowest refusal rate because the investigator is conducting the measurements in the usual place of residence or work of the subject. Although many other techniques appear promising as health indicators, only a small battery of tests is currently available. Thus, techniques that should be required of all studies are questionnaires and spirometric studies of pulmonary function.
This work was supported by EPA Cooperative Agreement no. CR811806 and EPRI contract no. RP2822. Although the research described in this article has been funded wholly or in part by the U.S. Environmental Protection Agency, it has not been subjected to the Agency's required peer and policy review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred.