Daily mortality and air pollution in Santa Clara County, California: 1989-1996.

Since the last revision of the national particulate standards, there has been a profusion of epidemiologic research showing associations between particulates and health effects, mortality in particular. Supported by this research, the U.S. Environmental Protection Agency promulgated a national standard for particulate matter ≤ 2.5 µm in aerodynamic diameter (PM2.5). Nevertheless, the San Francisco Bay Area of California may meet this new standard. This study investigates the relationship between daily mortality and air pollution in Santa Clara County (a Bay Area county) using techniques similar to those utilized in earlier epidemiologic studies. Statistically significant associations persist in the early 1990s, when the Bay Area met national air pollution standards for every criteria pollutant. Of the various pollutants, the strongest associations occur with particulates, especially ammonium nitrate and PM2.5. The continuing presence of associations between mortality and air pollutants calls into question the adequacy of national standards for protecting public health.

Key words: air pollution, ammonium nitrate, carbon monoxide, epidemiology, National Ambient Air Quality Standards (NAAQS), ozone, PM2.5, Poisson regression. Environ Health Perspect 107:637-641 (1999). [Online 25 June 1999] http://ehpnet1.niehs.nih.gov/docs/1999/107p637-641fairley/abstract.html
The past decade has seen a burgeoning of epidemiologic research investigating the relationship between air pollution and health effects. Dozens of these studies have analyzed the relationship of daily mortality to various air pollutants, especially particulates. The U.S. EPA analyzed many of these studies in Chapter 12 of Air Quality Criteria for Particulate Matter (1). This criteria document and the later staff report (2) concluded that the preponderance of evidence supports a causal connection between fine particulate levels and various health effects, including mortality. This led to the establishment of national standards for particulate matter ≤ 2.5 µm in aerodynamic diameter (PM2.5).
A previous study (3) showed that an association existed between particulates [measured as coefficient of haze (COH)] and mortality in Santa Clara County (SCC), California, during the years 1980-1986. Since that time, the Bay Area Air Quality Management District (BAAQMD) has monitored particulate matter ≤ 10 µm in aerodynamic diameter (PM10), and since 1990, the California Air Resources Board has operated PM2.5 monitors, including one in SCC. An analysis of SCC PM2.5 data shows that SCC would have met the PM2.5 standard between 1991 and 1996. The present study is motivated by the concern that, although SCC may attain the new PM2.5 standard, particulates there may still cause substantial health effects.
Air quality in SCC. Most of the studies of mortality and air quality have been based on eastern or midwestern U.S. cities, whose air quality dynamics differ markedly from those of the San Francisco Bay Area. Among the gaseous pollutants, ozone and carbon monoxide levels are similar, but Bay Area sulfur dioxide levels are an order of magnitude lower than in the eastern United States. In fact, sulfur dioxide is so low that it is no longer measured in SCC, but nearby San Francisco's 24-hr design value is < 0.01 ppm, compared with typical design values of approximately 0.05 ppm in many eastern cities (4).
SCC's particulate composition, dynamics, and sources also differ markedly from those of eastern cities. In eastern cities, ammonium sulfate represents approximately 45% of PM2.5 (1), whereas in SCC it represents 5%.
For many eastern and midwestern cities, particulate levels peak in the summer months (1). For SCC, however, particulates (especially fine particulates) are higher in winter.
Specifically, mean San Jose, California, PM2.5 levels in November, December, and January averaged 25 µg/m3 in 1990-1996, but < 10 µg/m3 during the rest of the year.
Wood-burning and ammonium nitrate each contribute approximately 40% of SCC's wintertime PM2.5 (5). These sources, combined with wintertime stagnation periods, are the main causes of SCC's elevated wintertime particulate levels. As a result of this seasonality, for SCC the new 15 µg/m3 annual standard appears no more stringent than the 65 µg/m3 24-hr standard (6). This is in spite of the EPA's stated intention to make the annual average the more stringent controlling standard (4). Particle size also varies by season. During the winter, SCC PM2.5 averages approximately 70% of PM10 compared with 50% for the year as a whole. Wintertime PM10 is dominated by combustion sources, with approximately 10% coming from geological dust. During the rest of the year, geological dust makes up a larger fraction, marine sea salt becomes significant, and the amount of ammonium nitrate decreases by half.
For several years during the early 1990s, SCC and, in fact, the entire Bay Area, had air quality that complied with air quality standards for all criteria pollutants. Moreover

Methodology
This study attempts to draw on the extensive experience of previous studies to determine a modeling approach. The sensitivity of conclusions to model choice, meteorological adjustment, and covariates has been extensively investigated [e.g., (1,8-10)]. These studies have reached similar conclusions, namely that the choices of (reasonable) model and (reasonable) meteorological adjustment do not appear to greatly affect conclusions on the relationship between mortality and particulates, but the inclusion of other air contaminants often causes a substantial increase in the standard error of the particulate regression coefficient and sometimes a drop in the level of the coefficient. In other words, there can be substantial confounding of these variables.
Based on these considerations, various models were tried, including Poisson regression with either linear predictors or generalized additive models (GAMs) for temporal and weather variables, and models with an overdispersion fit using quasi likelihood.
The disadvantage of the GAM approach is that it does not provide simple coefficients. Because the focus is on pollutant variables, however, this lack is not of great concern. The advantage is that the GAM approach is less likely to induce lack of fit. Thus, we will use the GAM approach. Models with an overdispersion parameter are useful for certain deviations from the Poisson model. However, if the Poisson model appeared adequate, it would be used.
The modeling strategy follows that of Samet et al. (10), first fitting terms for season and trend, then adding terms for meteorology, and finally adding pollutant terms, with the number of seasonal, trend, and meteorology terms determined by optimizing Akaike's information criterion (AIC).
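The stopping rule in this strategy can be sketched in a few lines of Python. This is only an illustration, not the S-Plus code used in the study. It assumes that AIC is computed as the deviance plus twice the number of fitted parameters, and `deviance_for_df` is a hypothetical callable that returns the deviance of a model refit with a given number of degrees of freedom. Under that assumption, the deviance (3,038.5) and AIC (3,078.5) reported in the Results for the season and trend model are consistent with 7 + 12 degrees of freedom plus an intercept, that is, 20 parameters.

```python
def aic(deviance, n_params):
    """AIC under the assumed convention: deviance + 2 * (number of fitted parameters)."""
    return deviance + 2.0 * n_params


def select_df_by_aic(deviance_for_df, max_df):
    """Add degrees of freedom one at a time until AIC stops improving.

    deviance_for_df is a hypothetical callable: df -> deviance of the refit model.
    One extra parameter is counted for the intercept.
    """
    best_df = 1
    best_aic = aic(deviance_for_df(best_df), best_df + 1)
    for df in range(2, max_df + 1):
        candidate = aic(deviance_for_df(df), df + 1)
        if candidate >= best_aic:  # no improvement, so stop adding terms
            break
        best_df, best_aic = df, candidate
    return best_df, best_aic


# Made-up deviances, for illustration only.
toy_deviances = {1: 3089.5, 2: 3079.0, 3: 3068.5, 4: 3058.0, 5: 3057.5}
chosen_df, chosen_aic = select_df_by_aic(toy_deviances.__getitem__, max_df=5)

# Consistency check against the deviance and df reported in the Results.
reported_check = aic(3038.5, 7 + 12 + 1)
```

The search stops at the first degree of freedom that fails to lower AIC, which mirrors the sequential rule described in the text but may miss a later improvement.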
Tests of goodness of fit. A goodness-of-fit test of the Poisson model was performed based on deviance. Under the null hypothesis that the data derive from this model, the deviance has an approximately χ2 distribution with the residual degrees of freedom. Specifically, the χ2 test is a likelihood ratio test versus a saturated model, where each day is fitted with a different mean. Serious lack of fit would result in unusually large values of the deviance.
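As a rough illustration of this test, the sketch below computes the Poisson deviance from observed counts and fitted means and converts it to an approximate upper-tail probability. The function names and the toy data are hypothetical, and the Wilson-Hilferty normal approximation is swapped in for an exact χ2 tail so that the example needs only the standard library; it is not the procedure used in the study.

```python
import math


def poisson_deviance(observed, fitted):
    """Deviance of a Poisson fit: 2 * sum(y * log(y / mu) - (y - mu))."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0  # y*log(y) -> 0 as y -> 0
        total += log_term - (y - mu)
    return 2.0 * total


def chi2_upper_tail(statistic, df):
    """Approximate P(chi-square with df degrees of freedom > statistic).

    Uses the Wilson-Hilferty cube-root normal approximation (an assumption made
    here for simplicity; it is most accurate for large df).
    """
    spread = 2.0 / (9.0 * df)
    z = ((statistic / df) ** (1.0 / 3.0) - (1.0 - spread)) / math.sqrt(spread)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Toy data: five days of counts against a constant fitted mean.
observed = [18, 22, 25, 15, 20]
fitted = [20.0, 20.0, 20.0, 20.0, 20.0]
deviance = poisson_deviance(observed, fitted)
```

A deviance close to its residual degrees of freedom gives a tail probability near 0.5, whereas a deviance well above the degrees of freedom gives a small probability and signals lack of fit.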
Residuals were checked for extreme values. The GAM approach minimizes problems with any nonlinearity between the response and the temporal and weather variables.
To test the sensitivity of the results to the use of GAM, a parallel modeling approach was performed using sine and cosine terms for time and day of year and polynomials in minimum and maximum temperature.
A simulation of the model-fitting process. The statistical significance level for testing a parameter in a model is based on the assumption that the selection of the model was made before the data were gathered. In practice, this is rarely the case, so the de facto assumption is that the process of model-building has a minimal effect on the significance level.
An approach to finding more realistic significance levels is to simulate the model-building process itself. To that end, an S-Plus function was developed to simulate the following approximation of model-building. The idea was to simulate data from a true model that contains no pollutant term, then simulate the building up of the model and the fitting of a pollutant variable. The set of pollutant variable coefficients thus obtained should form a more realistic distribution than the simple one where model-building is ignored.
The steps of the simulation were as follows. Initially, a vector of Poisson means was generated by fitting daily mortality data to the seasonal, trend, and weather variables using the GAM approach. An S-Plus function that performed the following steps was then invoked repeatedly with different random seeds:
1. The function simulates a vector of Poisson variates from the initial mean vector.
2. It fits this simulated variate vector to GAM terms for time and day of year in a Poisson regression, increasing the degrees of freedom until there is no improvement in AIC from the addition of another degree of freedom in either GAM term.
3. It uses the optimal number of degrees of freedom for time and day of year from step 2, and adds GAM terms for minimum and maximum temperature, again adding terms until there is no improvement in AIC.
4. It fits the simulated variates to PM10 in addition to the optimal number of time, day of year, and minimum and maximum temperature GAM terms found in steps 2 and 3, and outputs the fitted PM10 coefficient.
The coefficient found from the actual data is then compared to the resulting distribution of simulated coefficients, providing what may be a more realistic p-value.
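The comparison of an observed coefficient with simulated null coefficients can be sketched as follows. This is a deliberately simplified stand-in for the procedure described above, not a reproduction of the S-Plus function: the baseline means are held fixed as an offset rather than re-selecting GAM degrees of freedom by AIC on every replicate, only a single pollutant coefficient is fitted, and all names and data are hypothetical.

```python
import math
import random


def poisson_variate(mean, rng):
    """Draw one Poisson variate by Knuth's multiplication method (fine for modest means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def fit_pollutant_coef(deaths, baseline, pollutant, iterations=25):
    """Newton-Raphson for b in log(mu_i) = log(baseline_i) + b * pollutant_i."""
    b = 0.0
    for _ in range(iterations):
        mu = [m * math.exp(b * x) for m, x in zip(baseline, pollutant)]
        score = sum(x * (y - u) for x, y, u in zip(pollutant, deaths, mu))
        info = sum(x * x * u for x, u in zip(pollutant, mu))
        b += score / info
    return b


def simulated_p_value(observed_coef, baseline, pollutant, n_sims, seed=0):
    """Fraction of null simulations whose fitted coefficient is at least the observed one."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_sims):
        # Data are generated from a model with no pollutant effect.
        fake_deaths = [poisson_variate(m, rng) for m in baseline]
        if fit_pollutant_coef(fake_deaths, baseline, pollutant) >= observed_coef:
            exceed += 1
    return exceed / n_sims


# Toy inputs: 60 days with a mean of 20 deaths and a centered pollutant series.
baseline = [20.0] * 60
pollutant = [-1.0, 0.0, 1.0] * 20
p_value = simulated_p_value(0.05, baseline, pollutant, n_sims=200)
```

Because each replicate is generated with no pollutant effect, the fitted coefficients form a reference distribution for the coefficient estimated from the real data.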
The data. California mortality data were obtained from the California Department of Health Services (Sacramento, CA) for the years 1989-1996. Counts of daily total nonaccidental mortality (henceforth described as mortality), respiratory mortality, and cardiovascular mortality were extracted for SCC residents who died in-county, using the same International Classification of Diseases, Ninth Revision (11) codes as in the previous study (3).
Pollutant data were obtained from the BAAQMD pollutant database. Long-term PM10 data were available for only one SCC site, San Jose 4th Street. These data cover the full period on an every-6-day schedule, with an every-other-day schedule during the first 3 years. This site also provided PM10 constituents nitrate and sulfate on the 6-day schedule and daily COH values. PM2.5 and PM10-2.5 were also available from a research model dichotomous sampler that operated at this site from 1990 through 1996 on the same 6-day schedule.
Ozone, carbon monoxide, and nitrogen dioxide data were also obtained for the 4th Street site. Although data for ozone were available from some other SCC sites, these were not included in the interests of simplicity. Because national standards are health-based, it seemed reasonable to include variables with averaging times as defined in the standards, namely maximum 8-hr ozone, maximum 8-hr CO, and 24-hr NO2. Nevertheless, 24-hr CO and ozone were also considered.
Comparisons of 4th Street ozone with other SCC sites consistently show correlations above 0.8 in seasonally adjusted ozone concentrations. Thus, 4th Street ozone concentrations represent a reasonably good surrogate for outdoor ozone exposure in SCC. Based on data from the late 1970s, when the district operated a number of COH monitors in SCC, correlations with the 4th Street site were quite high. The correlation between season- and trend-adjusted PM10 for Fremont and 4th Street was 0.86.
Weather data were obtained from the BAAQMD meteorological database for San Jose Airport. Previous studies have found nonlinear relationships between mortality and weather variables. Because mortality can be affected by both hot and cold weather, it seemed reasonable to consider both minimum and maximum temperature as variables. Therefore, both daily maximum and minimum temperature as well as 24-hr average relative humidity data (rh) were obtained. Missing values were filled in by regressing against temperature and rh values at other nearby BAAQMD meteorological sites, Alviso and Union City.
Comparison with previous results. To compare the results for 1989-1996 with the previous 1980-1986 results, it was necessary to reanalyze the earlier results, paralleling the new analysis as closely as possible.
For the 1980-1986 reanalysis, PM2.5 and PM10 and its species were not available. COH was used along with NO3 from the TSP filter. The other pollutants (NO2, O3, and CO) were measured, although the results were read from strip charts and recorded with one less significant digit. San Jose Airport data were not available for this time period; therefore, San Jose city temperatures were used.
Season and trend fits. Rather than predicting mortality from a single temporal GAM term, separate GAM terms in time and day of year were fit because a good-fitting model could be obtained using many fewer degrees of freedom. Terms were added sequentially until there was no further improvement in AIC.

Results
The best model contained a GAM term for time with 7 degrees of freedom (df) and a day of year term with 12 df. The resulting deviance was 3,038.5, with AIC 3,078.5.
Meteorological variables. GAM terms were fit for minimum and maximum temperature in addition to a 7-degree term for time and a 12-degree term for day of year, yielding an optimum AIC with 3 df for minimum temperature and 2 df for maximum. (Subsequently, this set of GAM terms will be referred to as the optimal GAM terms.) The inclusion of a minimum-maximum crossproduct term did not improve the AIC, nor did the inclusion of relative humidity.
[Table footnotes] a Values are 24-hr averages unless otherwise noted. b The fine and coarse fractions of PM10 do not add to total PM10 because they derive from the dichotomous sampler, whereas total PM10 was collected with a separate high-volume sampler.
Pollutant variables. Table 3 presents partial correlations between mortality and the pollutant variables. Specifically, mortality and each pollutant variable were regressed against the optimal GAM terms, and the residuals saved. Several of the Poisson regressions did not converge, so least squares regressions were used. The table presents the correlations among these residuals.
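The partial correlation described here (regress both series on the same adjustment terms, then correlate the residuals) can be illustrated with a minimal sketch. As a simplifying assumption, a single adjusting covariate with an ordinary least squares line stands in for the optimal GAM terms; the function names and data are hypothetical.

```python
import math


def residuals(response, covariate):
    """Residuals from a least squares line of response on one covariate."""
    n = len(response)
    mean_x = sum(covariate) / n
    mean_y = sum(response) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, response))
    slope = sxy / sxx
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(covariate, response)]


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(mortality, pollutant, covariate):
    """Correlate what is left of each series after removing the covariate."""
    return correlation(residuals(mortality, covariate), residuals(pollutant, covariate))


# Toy data: opposite trends in the covariate, but a shared day-to-day signal.
covariate = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
signal = [1.0, -1.0, 2.0, -2.0, 1.0, -1.0]
mortality = [2.0 * c + s for c, s in zip(covariate, signal)]
pollutant = [-3.0 * c + 2.0 * s for c, s in zip(covariate, signal)]
shared = partial_correlation(mortality, pollutant, covariate)
```

In the toy data the raw series move in opposite directions because of the covariate, yet the adjusted residuals are perfectly correlated, which is the kind of association the partial correlation is meant to isolate.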
Of the pollutant measures, PM2.5 and NO3 have the highest partial correlations with mortality. There are also reasonably high correlations with PM10 and SO4. Interestingly, in contrast to other studies, there are actually negative correlations between mortality and the lags (previous day) of these variables. Another change is that COH is only weakly correlated with mortality, although there is a statistically significant correlation with lagged COH. The relationship with 24-hr CO is similar to that of COH. NO2 is also highly correlated with COH, but lag NO2 has a lower partial correlation with mortality than unlagged NO2.
The correlation between ozone and mortality is weak, although the correlation with 8-hr ozone is borderline significant.
Except for ozone, there are positive correlations between the other pollutant variables, with high correlations between some of the particulate measurements (PM2.5 and PM10, PM10 and COH, and PM2.5 and NO3).
There are also high correlations between NO2 and PM10, and NO2 and COH.
Various combinations of pollutants were tried in Poisson regressions that also included the optimal GAM terms. The results are shown in Table 4. As with the partial correlations, both NO3 and PM10 were highly significant. PM2.5, SO4, and 8-hr ozone were also marginally statistically significant. Among lagged variables, COH and CO were highly significant and NO2 was marginally so. PM10-2.5 was not significant, nor was its lag. Because NO3 and PM2.5 had the highest partial correlations with mortality, these were included in regressions with other pollutants.
The raw mortality values have an autocorrelation of 0.18, but the residuals from the multiple regression of mortality on trend, season, and weather terms have an autocorrelation of only 0.04. Thus, autocorrelation of residuals is not a significant issue.
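An autocorrelation check of this kind can be sketched as below. The text does not state the lag, so lag 1 (each day against the next) is an assumption here, and the function name and inputs are hypothetical.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation: sum of adjacent centered products over the sum of squares."""
    n = len(series)
    mean = sum(series) / n
    centered = [value - mean for value in series]
    numerator = sum(a * b for a, b in zip(centered, centered[1:]))
    denominator = sum(c * c for c in centered)
    return numerator / denominator


# Toy series: an alternating series is negatively autocorrelated,
# a steadily rising one is positively autocorrelated.
alternating = lag1_autocorrelation([1.0, -1.0] * 4)
trending = lag1_autocorrelation([float(day) for day in range(1, 11)])
```

Applied to raw daily counts and then to the regression residuals, a drop toward zero (as in the 0.18 and 0.04 reported above) indicates that the trend, season, and weather terms have absorbed most of the serial dependence.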
The fact that the deviance is approximately equal to the value expected under the null hypothesis suggests that it would be difficult to improve the fit substantially. The lack of y-outliers, the lack of influential x-values, and the lack of autocorrelation suggest that the Poisson model fits reasonably well.
Analysis using a parametric approach. To check the adequacy of the GAM approach, a parallel analysis was performed using sine/cosine functions for season and trend, and polynomials for weather variables. The results were similar both qualitatively and quantitatively to those found in Table 4.
A simulation of the model-fitting process. The simulation described in "A simulation of the model-fitting process" in "Methodology" was repeated 1,000 times. It yielded four fitted coefficients greater than that observed so that, based on the simulation, the p-value is approximately 0.004. This p-value is, if anything, smaller than that found using statistical theory, where the p-value was 0.012.
Comparison with 1980-1986 results. Table 5 presents a reanalysis of the 1980-1986 data using methods paralleling those of Table 4. In particular, the same variables for season, trend, and weather were used (although they were refit with 1980-1986 data). To make coefficients comparable, the same deltas are used; that is, 50 × SD(p)/SD(PM10), where the SDs are from the 1989-1996 data.
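The scaling of increments can be made concrete with a short sketch. It uses the standard deviations quoted in the table footnotes (13 for PM2.5 and 23 for PM10) and assumes the relative risk for an increment Δp is exp(b × Δp), where b is a Poisson regression coefficient. The coefficient below is hypothetical: it is back-calculated from the PM10 relative risk of 1.08 quoted in the Discussion purely for illustration.

```python
import math


def increment(sd_pollutant, sd_pm10, pm10_increment=50.0):
    """Pollutant increment equivalent to a 50-unit PM10 increment, scaled by the SD ratio."""
    return pm10_increment * sd_pollutant / sd_pm10


def relative_risk(coef, delta):
    """Relative risk for an increase of delta, assuming a log-linear Poisson model."""
    return math.exp(coef * delta)


# SDs as quoted in the table footnotes: SD(PM2.5) = 13, SD(PM10) = 23.
delta_pm25 = increment(13.0, 23.0)

# Hypothetical coefficient, back-calculated so that 50 units of PM10 give RR = 1.08.
hypothetical_b = math.log(1.08) / 50.0
rr_pm10 = relative_risk(hypothetical_b, 50.0)
excess_risk_pm10 = rr_pm10 - 1.0  # the same result expressed as excess risk
```

Scaling by the ratio of standard deviations puts each pollutant's coefficient on a scale of comparable typical variation, so the relative risks in the two periods can be read side by side.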
Generally, the results for the 1980-1986 period are similar to those of 1989-1996. In particular, with the exception of ozone, the coefficient for every pollutant or the lagged pollutant is statistically significant. In pairwise models with lagged COH, the other pollutants are no longer statistically significant. Lagged COH remains highly significant in combination with NO2 and ozone; with CO, it is not statistically significant, but its regression coefficient is little changed. NO3 is borderline significant (p = 0.06) with ozone, but not significant with CO or NO2. Oddly, in combination with NO3, NO2 was significant. Note that the sample size for NO3 is only 354, compared with over 2,000 for the other pollutants. The small sample size makes it more difficult to detect an effect.
[Table footnotes] Abbreviations: SD, standard deviation; COH, coefficient of haze; PM2.5, particulate matter ≤ 2.5 µm in aerodynamic diameter; PM10, particulate matter ≤ 10 µm in aerodynamic diameter. a Relative risks calculated by exp(b × Δp) - 1, where b is the pollutant coefficient from the Poisson regression, and Δp = 50 for PM10, corresponding to the increment used in the criteria document (1). For other pollutants, p, the increment was 50 × SD(p)/SD(PM10); e.g., SD(PM2.5) = 13, SD(PM10) = 23, so for PM2.5, Δp = 50 × 13/23 = 28. b All models include 7 generalized additive model terms for trend, 12 for season, 3 for minimum temperature, and 2 for maximum temperature. c Lagged variables were used if they appeared to fit better lagged than unlagged. Thus, lagged CO and lagged COH were used when fitting jointly with other pollutants. d Pollutants are lagged CO, lagged NO2, 8-hr ozone, and either PM2.5 or NO3. *Statistical significance at the 0.05 level. **Statistical significance at the 0.01 level.
When both lag COH and NO3 are in the model, the COH coefficient is smaller and no longer statistically significant, whereas the NO3 coefficient changes only slightly and is borderline significant (p = 0.09).
One difference from the 1989-1996 results is that the 1980-1986 COH coefficient is highly significant, with a relative risk of 1.06, compared with 1.03 for 1989-1996. A comparison of the two coefficients, taking their difference and dividing by the square root of the sum of their sample variances, yields a value of z = 1.36, which is not statistically significant. Pooling the two periods, fitting the same coefficients for season, trend, and weather but different COH slopes and intercepts, did not result in a statistically significant difference in COH coefficients.
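The comparison of two independently estimated coefficients can be written out as a short sketch. The function names are hypothetical, and the study's actual coefficients and standard errors are not reproduced here; only the reported z value is used. Under a standard normal reference distribution, z = 1.36 corresponds to a two-sided p-value of roughly 0.17, consistent with the conclusion that the difference is not statistically significant.

```python
import math


def z_for_difference(coef_a, se_a, coef_b, se_b):
    """Difference of two independent estimates over the root of the sum of their variances."""
    return (coef_a - coef_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Two-sided p-value for the z value reported in the text.
p_reported = two_sided_p(1.36)
```

The test treats the two period-specific estimates as independent, which is reasonable here because they come from non-overlapping years.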
One possible reason that the COH coefficient might have changed is that COH has diminished from the early 1980s to the 1990s. Thus, if the effect of COH is not linear, this could result in different coefficients.
However, neither a quadratic nor a hockey-stick function of COH was significant in the pooled regressions for either period.
Respiratory and cardiovascular regressions. Table 6 shows relative risks from Poisson regressions using each pollutant or their lags (depending on which had the greater risk based on Table 4). SO4 and CO were significantly associated with respiratory mortality, and PM2.5, NO3, and CO were associated with cardiovascular mortality. For PM10, PM2.5, NO3, and CO the point estimates for risk were higher than those in Table 4.
Analyses by season. Analyses were performed by season for pollutants with the highest partial correlations with mortality (Table 7). In most cases, the change in relative risk is not statistically significant. Based on Tukey's studentized range distribution, the risks differ significantly from season to season for NO3. For the other pollutants, the differences in risk between seasons are not statistically significant.

Discussion
One striking result of the analysis is that although the Bay Area met every air quality standard in the early 1990s (and would have met the new 8-hr ozone and PM2.5 standards had they been in effect), there is a statistically significant correlation between each pollutant considered (except coarse fraction PM10) and mortality. Second, the regression coefficients of other pollutants that are correlated with particulates (CO and NO2) drop to nonsignificance in a regression that also includes some measure of fine particulates (either PM2.5 or NO3), whereas there is little change in the fine particulate coefficients. This suggests that fine particulates (or what fine particulates may be a surrogate for) may be the real culprits. The result that NO3 had the strongest association with mortality is clearly of practical importance and worth investigating for other areas.
The level of PM10 effect found, a relative risk of 1.08 for an increase of 50 µg/m3 PM10, is larger than that found in many other studies [see the EPA's Table 12-37 and Figure 12-43 (1)]. This may reflect a better correlation between monitored values and exposure in SCC. Part of the explanation may be that buildings in SCC are not as tight because of its mild climate, which could lead to a higher correlation of indoor and outdoor particulate levels. A second point is that the correlation between particulate values measured at the San Jose 4th Street monitor and other SCC monitors is high. Particulate levels at the 4th Street monitor exceed those of other SCC monitors (12); therefore, the relative risk as a function of SCC average levels could be higher than 1.08.
No evidence for a threshold was found. Although the COH coefficient was substantially lower for the 1989-1996 period than for 1980-1986, the result did not appear to be due to the lower particulate levels in the later period. One point that is important to keep in mind is the role of chance in these comparisons; because there is marginal power to detect effects of this magnitude, some data sets may yield nonsignificant results whereas others yield highly significant results.
Although the results for respiratory and cardiovascular mortality showed fewer significant results than for mortality as a whole, the level of effects appeared somewhat higher. The number of cardiovascular and respiratory deaths is considerably smaller than all deaths (Table 2), so the power to detect an effect is less unless the effect is much larger.
The results by season were ambiguous. The lack of statistical significance of most of the coefficients can be attributed to lack of power. The criteria document (1) found that a minimum sample size of 400 was necessary to achieve reasonable power in epidemiologic studies such as this. For PM2.5, NO3, and SO4, there were approximately 100 observations per season, far below the 400 observations necessary to achieve reasonable statistical power. Nevertheless, a statistically significant difference in effect was found for NO3, with positive effects for winter, spring, and summer, and a negative effect for fall; although only the winter coefficient was statistically significant, the range of coefficients was larger than expected by chance.
This analysis has found associations between air pollution variables and mortality, especially with fine particulate variables, similar to the levels of associations found in the studies that were used to justify the new PM2.5 standards. Yet the Bay Area probably meets these new standards. The