Estimating Error in Using Residential Outdoor PM2.5 Concentrations as Proxies for Personal Exposures: A Meta-analysis

Background: Studies examining the health effects of particulate matter ≤ 2.5 μm in aerodynamic diameter (PM2.5) commonly use ambient PM2.5 concentrations measured at distal monitoring sites as proxies for personal exposure and assume spatial homogeneity of ambient PM2.5. An alternative proxy, the residential outdoor PM2.5 concentration measured adjacent to participant homes, has few advantages under this assumption. Objectives: We systematically reviewed the correlation between residential outdoor PM2.5 and personal PM2.5 (r̄j) as a means of comparing the magnitude and sources of measurement error associated with their use as exposure surrogates. Methods: We searched seven electronic reference databases for studies of the within-participant residential outdoor-personal PM2.5 correlation. Results: The search identified 567 candidate studies; nine, published between 1996 and 2008, were abstracted in duplicate. They represented 329 nonsmoking participants 6–93 years of age in eight U.S. cities, among whom r̄j was estimated (median, 0.53; range, 0.25–0.79) based on a median of seven residential outdoor-personal PM2.5 pairs per participant. We found little evidence of publication bias (symmetric funnel plot; pBegg = 0.4; pEgger = 0.2); however, we identified evidence of heterogeneity (Cochran’s Q-test p = 0.05). Of the 20 characteristics examined, earlier study midpoints, eastern longitudes, older mean age, higher outdoor temperatures, and lower personal-residential outdoor PM2.5 differences were associated with increased within-participant residential outdoor-personal PM2.5 correlations. Conclusions: These findings were similar to those from a contemporaneous meta-analysis that examined ambient-personal PM2.5 correlations (r̄j = median, 0.54; range, 0.09–0.83). 
Collectively, the meta-analyses suggest that residential outdoor-personal and ambient-personal PM2.5 correlations merit greater consideration when evaluating the potential for bias in studies of PM2.5-mediated health effects.

Numerous epidemiologic and toxicologic studies have linked particulate matter (PM) air pollution with adverse health outcomes, including mortality (Burnett et al. 2000; Dominici et al. 2003; Katsouyanni et al. 2003), hospital admissions (Burnett et al. 1995; Linn et al. 2000; Oftedal et al. 2003), and subclinical disease (Diez Roux et al. 2008; Liao et al. 2009; Whitsel et al. 2009). A common feature of such studies is their reliance on ambient PM concentrations measured at distal monitoring sites as proxies for personal exposure to PM of ambient origin. The reliance is consistent with regulatory policies developed under the Clean Air Act (1970), which have been informed by studies of the correlation between personal exposures to PM originating outdoors and residential outdoor PM concentrations (Wallace 2000). However, ambient PM may not adequately represent total PM exposure, because human activity pattern surveys suggest that, on average, individuals spend > 85% of their time inside (Klepeis et al. 2001), where they are exposed to numerous sources of indoor PM, the physicochemical properties and toxicities of which often differ from those of ambient PM (Monn and Becker 1999; Wainman et al. 2000).
Available exposure studies, although small in number, have suggested that several factors may influence the relationship between ambient and total PM exposure, including home ventilation, indoor PM sources, and time-activity patterns (Rodes et al. 2001; Sarnat et al. 2006; Williams et al. 2003b). Because these factors are not well quantified (Janssen et al. 1998), we previously reviewed the literature that examined the within-participant ambient-personal PM2.5 correlation to determine the magnitude and sources of measurement error inherent in using ambient PM2.5 as a surrogate for personal exposure (Avery et al. 2010). We found that characteristics of participants, studies, and the environments in which they were conducted affect the accuracy of ambient PM2.5 as a proxy for personal exposure and that the potential for exposure misclassification may be substantial.
Although the residential outdoor PM2.5 concentration measured adjacent to participant homes may be equally prone to misclassification under the assumption of spatial homogeneity, use of this measure as an alternative proxy for personal exposure may have some advantages if this assumption is not uniformly applicable. Studies of spatial variability in ambient PM2.5 concentrations among 27 U.S. urban areas (Pinto et al. 2004) suggest that this may be the case. The fact that PM2.5 varies at the microenvironmental level as a function of, for example, topography, proximity to PM2.5 point sources, adjacency to major traffic arterials, and prevailing winds [U.S. Environmental Protection Agency (EPA) 2009; Zhu et al. 2002] also is consistent with this suggestion. Nonetheless, how spatial variability and outdoor microenvironments affect the use of ambient PM2.5 concentrations as a proxy for personal PM2.5 exposure remains unclear. Thus, we performed a meta-analysis using the literature that examined the within-participant residential outdoor-personal PM2.5 correlation and contrasted these findings with those from the review of the within-participant ambient-personal PM2.5 correlation (Avery et al. 2010). Findings from the two meta-analyses will facilitate the quantification of bias that resulted from the use of surrogates for personal PM2.5 exposure in studies that relied on outdoor PM2.5 measurements.

Methods
We downloaded citations to an electronic reference manager (EndNote X1; Thomson Reuters, New York, NY), de-duplicated them, and supplemented them with secondary references cited in articles identified in the primary search. The citations were independently reviewed with respect to three inclusion criteria: measurement of residential outdoor PM2.5, measurement of personal PM2.5, and estimation of the within-participant residential outdoor-personal PM2.5 correlation.
Study, participant, and environment characteristics were extracted from all articles meeting the inclusion criteria. The study characteristics were journal of publication, publication date, setting, study dates, sample size, duration of study, timing (consecutive, nonconsecutive), lower limit of PM2.5 detection, number (minimum, mean) of paired PM2.5 measures, and correlation metric (Pearson, Spearman). Participant characteristics included age (mean, minimum, maximum), percent female, and the presence of comorbidities (pulmonary, cardiovascular, multiple, neither). Environmental characteristics included the mean, median, and standard deviation of PM2.5 concentrations (residential outdoor, personal), the within-participant residential outdoor-personal PM2.5 correlation coefficients and corresponding number of paired measurements, season, distance to monitor, monitor type, air exchange rate, percentage of time using air conditioning, and percentage of time with windows open. Discrepant exclusions and extractions were adjudicated by consensus. Supplemental data were requested from authors by electronic mail as needed. City-specific longitudes and latitudes were obtained from the GEOnet Names Server (National Geospatial-Intelligence Agency 2009). Meteorologic data were obtained from the National Climatic Data Center (2009).

Statistical analysis. Summary correlation and variance estimates for the jth study were estimated from the residential outdoor-personal PM2.5 correlations measured for each of the ith participants. Each within-participant correlation coefficient ($r_i$) was converted to its variance-stabilizing Fisher's z-transform, $Z_i = \tfrac{1}{2}\ln[(1 + r_i)/(1 - r_i)]$ (Fisher 1925).
Estimates of the within-participant variance [$v_i = 1/(n_i - 3)$] and between-participant variance ($\tau_j^2 = [Q_j - (k_j - 1)]/c$) for the jth study were estimated from the number of paired personal-residential outdoor PM2.5 measurements for each participant ($n_i$), the number of participants per study ($k_j$), the weighted sum of squared errors ($Q_j$), and a constant ($c$). The transformed effect size for the jth study is given by the weighted mean of the Fisher's z-transformed correlations $Z_i$, $\bar{Z}_j = \sum_i w_i Z_i / \sum_i w_i$, with weights $w_i = 1/(v_i + \tau_j^2)$. Negative $\tau^2$ estimates were set to 0 (Field 2001).
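As an illustration of the pooling steps described above, the following minimal Python sketch applies Fisher's z-transform, estimates the between-participant variance by the method of moments, and back-transforms the summary estimate. The expressions used for Q and c are the standard DerSimonian-Laird definitions and are assumed here, as are the function names and the example values, none of which come from the source studies.

```python
import math

def fisher_z(r):
    """Fisher's variance-stabilizing transform of a correlation coefficient."""
    return 0.5 * math.log((1 + r) / (1 - r))

def pool_study(rs, ns):
    """Random-effects summary of within-participant correlations for one study.

    rs: within-participant correlations r_i (at least two participants)
    ns: number of paired measurements n_i behind each r_i (each must exceed 3)
    Returns (z_bar, s, tau2, r_bar).
    """
    zs = [fisher_z(r) for r in rs]
    vs = [1.0 / (n - 3) for n in ns]            # within-participant variances v_i
    ws = [1.0 / v for v in vs]                  # inverse-variance weights
    z_fixed = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    # Assumed standard definitions of Q and c (method of moments).
    q = sum(w * (z - z_fixed) ** 2 for w, z in zip(ws, zs))
    c = sum(ws) - sum(w * w for w in ws) / sum(ws)
    k = len(rs)
    tau2 = max(0.0, (q - (k - 1)) / c)          # negative estimates set to 0
    ws_star = [1.0 / (v + tau2) for v in vs]    # random-effects weights
    z_bar = sum(w * z for w, z in zip(ws_star, zs)) / sum(ws_star)
    s = math.sqrt(1.0 / sum(ws_star))           # standard error of z_bar
    return z_bar, s, tau2, math.tanh(z_bar)     # tanh inverts Fisher's z

# Hypothetical participants: correlations and numbers of paired measurements.
z_bar, s, tau2, r_bar = pool_study([0.35, 0.62, 0.48, 0.71], [7, 8, 7, 10])
print(f"summary r = {r_bar:.2f}, tau^2 = {tau2:.3f}")
```

Back-transforming with the hyperbolic tangent returns the summary estimate to the correlation scale, which is how the summary values of r̄ reported in this article are expressed.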
We assessed publication bias, which is present when study results influence the chance or timing of publication (Begg and Berlin 1989), using a "funnel plot" of $W_j$ versus $\bar{Z}_j$. In the absence of publication bias, plots usually resemble a symmetrical funnel, with the more precise estimates forming the spout and the less precise estimates forming the cone. We also evaluated the adjusted rank correlation (Begg and Mazumdar 1994) and regression asymmetry tests (Egger et al. 1997) as well as a nonparametric "trim-and-fill" method that imputes hypothetically missing results due to publication bias (Duval and Tweedie 2000). Low p-values associated with the former tests ($p_{\mathrm{Begg}}$, $p_{\mathrm{Egger}}$) give evidence of asymmetry.
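The regression asymmetry test can be sketched as follows, assuming its usual formulation in which the standardized effect is regressed on precision and an intercept far from zero suggests funnel-plot asymmetry. The function name and inputs are hypothetical, and the p-value (from a t distribution with n - 2 degrees of freedom) is omitted to keep the example free of external dependencies.

```python
import math

def egger_intercept(effects, ses):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    effects: study-level summary estimates (e.g., Fisher's z)
    ses:     their standard errors (at least three, not all equal)
    Returns (intercept, intercept_se, t_stat).
    """
    y = [z / s for z, s in zip(effects, ses)]   # standardized effects
    x = [1.0 / s for s in ses]                  # precisions
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(e * e for e in resid) / (n - 2)        # residual variance
    se_int = math.sqrt(sigma2 * (1.0 / n + mx * mx / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t_stat

# Hypothetical study-level estimates and standard errors.
b0, se_b0, t = egger_intercept([0.45, 0.60, 0.52, 0.70, 0.58],
                               [0.10, 0.20, 0.15, 0.30, 0.25])
print(f"Egger intercept = {b0:.2f} (SE {se_b0:.2f})")
```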
Interstudy heterogeneity was evaluated using a plot of $\bar{Z}_j / S_j$ versus $1 / S_j$ (Galbraith 1988) and with Cochran's Q-test (Cochran 1954). The plot and test are related in that the position of the jth study along the vertical axis illustrates its contribution to the Q-test statistic. In the absence of appreciable evidence of heterogeneity, all studies fall within the 95% confidence interval (CI) and $p_{\mathrm{Cochran}} > 0.1$.
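A minimal sketch of the heterogeneity assessment is given below, assuming that S_j denotes the standard error of each study-level estimate and that Q is referred to a chi-square distribution with one fewer degree of freedom than the number of studies. The helper names and example values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        p, v = math.exp(-x / 2), 2                  # closed form for df = 2
    else:
        p, v = math.erfc(math.sqrt(x / 2)), 1       # closed form for df = 1
    # Recurrence: Q(x; v + 2) = Q(x; v) + (x/2)^(v/2) e^(-x/2) / Gamma(v/2 + 1)
    while v < df:
        p += (x / 2) ** (v / 2) * math.exp(-x / 2) / math.gamma(v / 2 + 1)
        v += 2
    return p

def cochran_q(effects, ses):
    """Cochran's Q, its degrees of freedom, and the chi-square p-value."""
    ws = [1.0 / (s * s) for s in ses]
    z_fixed = sum(w * z for w, z in zip(ws, effects)) / sum(ws)
    q = sum(w * (z - z_fixed) ** 2 for w, z in zip(ws, effects))
    df = len(effects) - 1
    return q, df, chi2_sf(q, df)

def galbraith_points(effects, ses):
    """(precision, standardized effect) pairs for a Galbraith plot."""
    return [(1.0 / s, z / s) for z, s in zip(effects, ses)]

# Hypothetical study-level estimates and standard errors.
q, df, p = cochran_q([0.45, 0.60, 0.52, 0.90], [0.10, 0.20, 0.15, 0.12])
print(f"Q = {q:.2f} on {df} df, p = {p:.3f}")
```

Studies whose standardized effects lie far from the fitted line on the Galbraith plot are those that contribute most to Q.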
We first assessed variation in the strength and precision of $\bar{Z}_j$ across levels of the study, environment, and participant characteristics with a summary random-effects estimate of $\bar{Z}$ within each study, environment, and participant category (Berkey et al. 1995). We also constructed a series of univariable random-effects meta-regression models to relate each study, environment, and participant characteristic to differences in $\bar{Z}_j$. Lastly, a multivariable random-effects meta-regression model and a backward elimination strategy were used to evaluate eight study, participant, and environment characteristics routinely available in epidemiologic studies of PM2.5 health effects: latitude, longitude, mean age, percent female, relative humidity, sea level pressure, mean temperature, and mean residential outdoor PM2.5 (measured in this setting or spatially interpolated in other studies). Interval-scale characteristics were analyzed before and after dichotomization at their medians unless noted otherwise. We used Stata (version 9; StataCorp LP, College Station, TX) to perform all the analyses. To facilitate interpretation, summary estimates (i.e., $\bar{Z}$) were back-transformed to their original metric, $\bar{r}$, after data analysis.
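A univariable random-effects meta-regression of this kind can be sketched as a weighted least-squares fit, shown below under the simplifying assumption that the between-study variance (tau2) is supplied rather than estimated. The function name, arguments, and example values are hypothetical and are not taken from the analysis reported here.

```python
import math

def meta_regression(effects, ses, covariate, tau2=0.0):
    """Weighted least-squares fit of effect_j = b0 + b1 * covariate_j.

    Weights are 1 / (SE_j^2 + tau2); tau2 is assumed known in this sketch.
    Returns (b0, b1, se_b1), all on the scale of the supplied effects.
    """
    ws = [1.0 / (s * s + tau2) for s in ses]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, covariate)) / sw     # weighted means
    my = sum(w * z for w, z in zip(ws, effects)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, covariate))
    sxy = sum(w * (x - mx) * (z - my)
              for w, x, z in zip(ws, covariate, effects))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    se_b1 = math.sqrt(1.0 / sxx)    # valid when weights are inverse variances
    return b0, b1, se_b1

# Hypothetical Fisher's z estimates, standard errors, and mean temperatures (°C).
b0, b1, se_b1 = meta_regression([0.45, 0.55, 0.62, 0.70],
                                [0.15, 0.20, 0.18, 0.25],
                                [5.0, 10.0, 15.0, 20.0], tau2=0.01)
print(f"slope per °C on the z scale = {b1:.3f} (SE {se_b1:.3f})")
```

Because the fit is on the Fisher's z scale, fitted values would need to be back-transformed (for example, with `math.tanh`) before being read as correlations.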

Results
The systematic review identified 567 candidate studies for screening. Of these studies, nine (2%) met the criteria for critical appraisal and were abstracted (Brown et al. 2008; Liu et al. 2003; Reid 2003; Rodes et al. 2001; Rojas-Bracho et al. 2000; Suh et al. 2003; Wallace 1996; Williams et al. 2000a, 2000b, 2003a). Abstracted studies were published between 1996 and 2008 (Table 1), were set in eight cities in six U.S. states, and were conducted between 1989 and 2001. The median study duration was 1.9 months (range, 0.2-15.2 months), a period in which 70% of the studies collected PM2.5 data over consecutive days. During data collection, the investigators recorded a median of seven (range, 5-20) pairs of residential outdoor and personal PM2.5 concentrations per participant, on which the within-participant Pearson (63%) and Spearman (37%) correlation coefficients were based (Table 1).
Several study, participant, and environmental characteristics were suggestively associated with moderate increases in the within-participant residential outdoor-personal PM2.5 correlation coefficient in univariable meta-regression models (Figure 4), including earlier study midpoints, eastern longitudes, older mean age, lower personal-residential outdoor PM2.5 differences (and ratios), and higher mean temperatures (Figure 5). For example, every 5°C increase in mean temperature was associated with a 0.10-unit difference in r̄ (95% CI, -0.02 to 0.21). The direct association between mean temperature and r̄j also was apparent when evaluating mean temperature dichotomized at the median: In studies with a mean temperature ≥ 13.43°C, r̄ was 0.59 (range, 0.40-0.74), and in those with a mean temperature < 13.43°C, r̄ was 0.50 (range, 0.44-0.56).
When evaluating multivariable meta-regression models, only higher mean ages and eastern longitudes were associated with an increased within-participant residential outdoor-personal PM2.5 correlation coefficient (p < 0.05).

Discussion
Epidemiologic studies of the health effects of PM2.5 typically estimate PM2.5 exposures using daily mean concentrations either obtained from a single ambient PM2.5 monitoring site or averaged across several sites (U.S. EPA 1996). Although rapid dispersion and secondary formation of atmospheric PM2.5 via chemical reactions of such gases as sulfur dioxide, nitrogen oxides, and ammonia ensure some geographic uniformity of the monitored concentrations, primary sources of anthropogenic PM2.5, including traffic, construction, and industry (Samet and Krewski 2007), can increase the spatial variability of PM2.5. Additional factors that influence the relationship between ambient PM2.5 concentrations and PM2.5 exposures include home ventilation, indoor activities associated with generation or resuspension of PM2.5 like cooking or cleaning, and time-activity patterns (Liu et al. 2003; Williams et al. 2000b). Thus, estimates of PM2.5 exposure based on ambient PM2.5 concentrations are associated with an acknowledged degree of uncertainty (Janssen et al. 1998).
To further characterize this uncertainty, in the present study we extended a prior meta-analysis of the within-participant ambient-personal PM2.5 correlation (Avery et al. 2010) by examining the within-participant residential outdoor-personal PM2.5 correlation using analogous meta-analytic methods. In both cases, the examination generated little evidence for publication bias of Fisher's z-transformed r̄j but strong evidence of heterogeneity. Several study, participant, and environment characteristics were associated with an increased r̄j, including earlier study midpoints, eastern longitudes, lower personal-residential outdoor PM2.5 differences (and ratios), higher mean ages, and higher mean temperatures. Moreover, the direct association between eastern longitudes and increased r̄j was consistent with the prior meta-analysis of the within-participant ambient-personal PM2.5 correlation.
The direct association between eastern longitudes and increased r̄j may reflect several regional factors, including higher urban PM2.5 concentrations (Rom and Markowitz 2006) or a greater influence of secondary PM2.5 sources in eastern locales (Pinto et al. 2004). The inverse association of the personal-residential outdoor PM2.5 difference (or ratio) with r̄j and the direct association of mean temperature with r̄j may also suggest lower microenvironmental variation in PM2.5 or an increased contribution of residential outdoor to personal PM2.5 exposure, through either time-activity patterns or increased air exchange. We were unable to fully evaluate the influence of these factors given the limited number of published studies and their inconsistent reporting of other geographic, household, and personal factors potentially responsible for the above associations. However, higher mean ages and eastern longitudes were associated with increased r̄j in the multivariable prediction model that included study, participant, and environment characteristics routinely available in epidemiologic studies of PM2.5 health effects.

Although the meta-analyses of the ambient-personal and residential outdoor-personal PM2.5 correlations summarized a wide range of published correlation coefficients, both of them estimated a median r̄j of approximately 0.5, which suggests that attempting to account for spatial variability and outdoor microenvironments does not appreciably affect the use of outdoor PM2.5 concentrations as proxies for personal PM2.5 exposure in the settings examined by the source studies. Nonetheless, these simple measures of central tendency have potentially important implications for studies using PM2.5 concentrations measured at distal or proximal monitoring sites. For example, an r̄ of 0.5 implies that, on average, only r̄², or one-fourth, of the variation in personal PM2.5 is explained by ambient or residential outdoor PM2.5 concentrations. Under a simple measurement error model, it also implies that the variances of ambient or residential outdoor PM2.5 concentrations are 1/r̄², or four times, as large as the variance of the true, but often unmeasured, personal PM2.5 exposure. Moreover, r̄ values of 0.5 in diseased and nondiseased subpopulations (i.e., nondifferential exposure measurement error) imply that a) sample sizes needed to detect between-group differences in mean ambient or residential outdoor PM2.5 concentrations are 1/r̄², or 4-fold, as large as those needed to detect the same differences in personal PM2.5 exposures, and b) effect estimates expressed as microgram per cubic meter increases in ambient or residential outdoor PM2.5 concentrations are equal to those associated with the same microgram per cubic meter increases in personal PM2.5 exposure, albeit attenuated toward the null by the power r̄², or 0.25. The latter form of attenuation is capable of obscuring weak to modest health effects of PM2.5 (White et al. 2003), yet it cannot be adequately controlled by methods commonly used to account for confounding (Greenland and Robins 1985).
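The arithmetic behind these implications can be reproduced with a short Python sketch. It assumes a simple (classical) measurement error model and reads "attenuated by the power r²" as applying to a ratio measure such as a relative risk, so that the observed estimate equals the true estimate raised to the power r². That reading, the function name, and the example relative risk of 1.10 are assumptions made for illustration only.

```python
def measurement_error_implications(r, true_rr=None):
    """Consequences of a proxy-exposure correlation r under a simple error model.

    r:       correlation between the proxy and personal exposure (0 < r <= 1)
    true_rr: optional relative risk per unit of personal exposure (assumed)
    """
    r2 = r * r
    out = {
        "variance_explained": r2,           # share of personal variation explained
        "variance_inflation": 1.0 / r2,     # proxy variance relative to true variance
        "sample_size_inflation": 1.0 / r2,  # extra sample needed with the proxy
    }
    if true_rr is not None:
        # Assumed interpretation: ratio measures attenuate as RR ** (r ** 2).
        out["observed_rr"] = true_rr ** r2
    return out

result = measurement_error_implications(0.5, true_rr=1.10)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With r = 0.5, a hypothetical true relative risk of 1.10 would be observed as approximately 1.024, which illustrates how readily modest effects can be obscured.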
Given the above considerations, it is tempting to assume that all health effect estimates based on ambient or residential outdoor PM2.5 concentrations would be considerably larger if they were instead based on personal PM2.5 exposures, but to do so would yield more biased estimates if the original PM2.5-disease associations were spurious due to chance or confounding (Armstrong 1998). This justifies limiting the application of the present findings to the PM2.5-disease associations that are the most precise and least biased according to criteria used to judge epidemiologic evidence (Hill 1965; Poole 2001; U.S. EPA 2009). Furthermore, factors associated with r̄, such as mean age and eastern longitudes, may differ among participants and the studies in which they are enrolled. It is therefore difficult to predict the degree to which PM2.5 health effects estimates may be biased by exposure measurement error. Nonetheless, the above examples clearly illustrate that the impact of r̄ on the interpretation of findings from studies of PM2.5 health effects may be substantial.
Although in the present study we attempted to quantify the error associated with using residential outdoor and ambient PM2.5 concentrations as proxies for total personal exposure, the approach adopted here has several limitations. First, residential outdoor and ambient PM2.5 concentrations are likely to be poor proxies for exposure to nonambient PM because PM originating indoors has different [...] (Long et al. 2001). Although the relative toxicity of outdoor and indoor PM remains under investigation, a panel study of 16 chronic obstructive pulmonary disease patients in Vancouver, British Columbia, reported that only the PM originating outdoors was associated with adverse cardiopulmonary effects (Ebelt et al. 2005). Moreover, in the present study we did not evaluate the correlation between concentrations of PM originating almost exclusively outdoors (e.g., sulfate or elemental carbon) and personal PM2.5 exposure, despite reports that their associations with ambient PM2.5 are particularly strong (Ebelt et al. 2000; Sarnat et al. 2006). Further work examining the relative contributions of PM2.5 constituents to PM-mediated health effects is clearly needed.

In summary, the results presented here and in the previous meta-analysis of the within-participant ambient-personal PM2.5 correlation (Avery et al. 2010) suggest that greater scrutiny of the effects of exposure measurement error is warranted. Further inquiry should involve quantifying the impact of using ambient or residential outdoor PM2.5 concentrations as proxies for personal PM2.5 exposure, as well as the development of methodologies to apply such findings. A comprehensive understanding of the degree to which these proxies influence PM2.5-disease associations is especially important in air pollution epidemiology because the health effects of PM2.5 exposure may be subtle. Such subclinical effects are particularly difficult to detect in the presence of measurement error because sensitivity of detection varies inversely with the degree of misclassification (Rom and Markowitz 2006).

Figure 4. Unadjusted summary correlations (95% CIs) and differences (95% CIs) by study, participant, and environment characteristics for nine studies examining the within-participant residential outdoor-personal PM2.5 correlation. Summary correlations represent stratum-specific estimates of r̄. Increases in r̄ per unit change of study, participant, and environment characteristics are provided by r̄ difference estimates. SLP, sea level pressure.

Figure 5. Plot for 16 estimates of the within-participant residential outdoor-personal PM2.5 correlation (95% CI) versus mean outdoor temperature, including the univariate random-effects meta-regression line.