Childhood cancer incidence rates and hazardous air pollutants in California: an exploratory analysis.

Hazardous air pollutants (HAPs) are compounds shown to cause cancer or other adverse health effects. We analyzed population-based childhood cancer incidence rates in California (USA) from 1988 to 1994, by HAP exposure scores, for all California census tracts. For each census tract, we calculated exposure scores by combining cancer potency factors with outdoor HAP concentrations modeled by the U.S. Environmental Protection Agency. We evaluated the relationship between childhood cancer rates and exposure scores for 25 potentially carcinogenic HAPs emitted from mobile, area, and point sources and from all sources combined. During the study period, 7,143 childhood cancer cases were newly diagnosed in California; of these, 6,989 (97.8%) could be assigned to census tracts and included in our analysis. Using Poisson regression, we estimated rate ratios (RRs) adjusted for age, race/ethnicity, and sex. We found little evidence for elevated cancer RRs for all sites or for gliomas among children living in high-ranking combined-source exposure areas. We found elevated RRs and a significant trend with increasing exposure level for childhood leukemia in tracts ranked highest for exposure to the combined group of 25 HAPs (RR = 1.21; 95% confidence interval, 1.03-1.42) and in tracts ranked highest for point-source HAP exposure (RR = 1.32; 95% confidence interval, 1.11-1.57). Our findings suggest an association between increased childhood leukemia rates and high HAP exposure, but studies involving more comprehensive exposure assessment and individual-level exposure data will be important for elucidating this relationship.

Research to date has failed to firmly establish risk factors for childhood cancer other than ionizing radiation (Ross et al. 1994; Stewart et al. 1958), chemotherapy agents (Ross et al. 1994), and certain inherited genetic disorders (Cowell 1991; Li et al. 1988). Air pollution from motor vehicle exhaust has been one of the more frequently studied environmental factors. Several studies have shown an association between elevated childhood cancer risks and surrogate measures of exposure to vehicle exhaust, including traffic density, car density, and estimated nitrogen dioxide concentration in outdoor air (Feychting et al. 1998; Nordlinder and Jarvholm 1997; Pearson et al. 2000; Savitz and Feingold 1989). However, a study examining estimated nitrogen dioxide and benzene concentrations found no increase in childhood cancer risk (Raaschou-Nielsen et al. 2001). Two case-control studies conducted in urban areas of California (Langholz et al. 2002; Reynolds et al. 2001) and a statewide ecologic study in California (Reynolds et al. 2002) failed to show an association between increased rates and traffic density measures.
One British study identified childhood cancer excesses near industrial facilities considered major volatile organic compound emitters and near sources of exhaust from internal combustion engines (Knox and Gilman 1997). However, a similarly designed study, also in Britain, of lymphohematopoietic malignancies around oil refineries did not find an increased cancer risk among children (Wilkinson et al. 1999). Previous epidemiologic studies have also found associations between childhood cancer and parental occupational exposure to petroleum hydrocarbons (Colt and Blair 1998; Savitz and Chen 1990), but occupational exposure levels are generally much higher than those found in outdoor air. The risk of childhood cancer associated with exposure to ambient air pollution levels is unknown.
Limitations in available air monitoring data make it difficult to predict variations in the concentration of potentially carcinogenic pollutants at the neighborhood level (Kelly et al. 1994). Carcinogenic air pollutants have been measured routinely at only 20 sites throughout California, mostly urban areas where the highest concentrations would be expected (California ARB 1999). Amendments made to the U.S. Clean Air Act in 1990 (Clean Air Act 1990) identified 189 hazardous air pollutants (HAPs) (Stern 1992). HAPs are compounds found in ambient air known to cause cancer or other adverse health effects in laboratory animals or in occupational health studies. Because of the limited amount of air monitoring data available, the U.S. Environmental Protection Agency (U.S. EPA) developed a database with modeled outdoor concentrations of 148 HAPs, at the census tract level, for 1990. Comparison of these modeled HAP concentrations with available air monitoring data indicated good agreement (Rosenbaum et al. 1999).
The U.S. EPA's modeled HAP concentrations provide researchers the opportunity to study relationships between adverse health effects and estimated exposure to multiple chemicals emitted from various sources. Others have used the HAPs database to estimate excess lifetime cancer risk by census tract (EDF 1999; Morello-Frosch et al. 2002; Pratt et al. 2000; Woodruff et al. 1998, 2000). In this study, we analyzed population-based childhood cancer incidence rates in California census tracts by HAP exposure scores that were based on the estimated excess lifetime cancer risk. Our goal was to evaluate, at the census tract level, whether childhood cancer rates are elevated in areas estimated to have high exposure to potentially carcinogenic HAPs.

Materials and Methods
We used California's population-based cancer registry (the California Cancer Registry, Sacramento, CA) to obtain information on all invasive cancer cases diagnosed between 1988 and 1994 in children younger than 15 years. The statewide registry routinely records information on age, race/ethnicity, sex, and residence at the time of diagnosis. Because this project involved use of human subjects data, it was reviewed and approved by the California Health and Human Services Agency Committee for the Protection of Human Subjects. We used a geographic information system to automatically match case addresses with a road network and determine the corresponding census tract of each child's residence at the time of diagnosis (ESRI 2000). If an address could not be automatically linked, we manually located it whenever possible.
We obtained 1990 U.S. Census population data for each census tract in California (U.S. Census Bureau 1992). During this period, population growth varied by age and race/ethnicity. To create population estimates for our study's 7-year pericensal time period, we obtained annual statewide population growth estimates for 1988 through 1994, by age group, race/ethnicity, and sex (California DOF 1998). We multiplied the 1990 population for each census tract by seven and the applicable growth factor for each age, race/ethnicity, and sex group.
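The person-year calculation described above can be illustrated with a minimal Python sketch. The stratum keys, population counts, and growth factors below are hypothetical placeholders, and the sketch assumes that each growth factor expresses the average 1988-1994 population of a stratum relative to its 1990 census count; it is not the exact procedure or data used in the study.

```python
def person_years(pop_1990, growth_factor, n_years=7):
    """Approximate person-years for one census tract and one demographic stratum.

    Mirrors the description in the text: 1990 population x 7 years x growth factor.
    """
    return pop_1990 * n_years * growth_factor


# Hypothetical 1990 counts for one tract, keyed by (age group, race/ethnicity, sex).
tract_pop_1990 = {
    ("0-4", "Hispanic", "F"): 120,
    ("5-14", "Non-Hispanic white", "M"): 200,
}

# Hypothetical statewide growth factors for the same strata (assumed definition:
# average population over 1988-1994 divided by the 1990 population).
growth = {
    ("0-4", "Hispanic", "F"): 1.05,
    ("5-14", "Non-Hispanic white", "M"): 0.98,
}

tract_person_years = {
    stratum: person_years(count, growth[stratum])
    for stratum, count in tract_pop_1990.items()
}
```

Summing such stratum-specific person-years over all tracts in an exposure category would give the denominators for the incidence rates.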
The U.S. EPA has combined 1990 emission inventories with meteorologic data into a dispersion model that estimates the annual average HAP concentrations for each census tract in the contiguous United States (Rosenbaum et al. 1999). The U.S. EPA estimated its HAP concentrations separately using emission inventories for mobile, area, and point sources. Mobile sources include motor vehicles, airplanes, trains, and ships. Area sources include emissions from many smaller stationary sources such as dry cleaners, gas stations, and residential use of solvents in consumer products (automotive, household, and personal care products). Pesticide field applications and forest fires also contribute to the area-source estimates. Point sources are large industrial manufacturing facilities. Unlike point-source emission estimates, area-source emission estimates were allocated by the U.S. EPA from the county level to individual census tracts based primarily on population and land use data. The U.S. EPA estimated total HAP concentrations for each census tract by adding concentrations resulting from mobile, area, and point sources.
For our analysis, we calculated the exposure scores for each census tract by multiplying the modeled air concentration by the corresponding inhalation unit risk factor for each potentially carcinogenic HAP. The inhalation unit risk factor combines the cancer potency for each compound with standard assumptions for body weight and breathing rate (Cal/EPA OEHHA 1997; U.S. EPA 1999). Our methodology follows established risk assessment guidelines for estimating theoretical lifetime excess cancer risk (U.S. EPA 1986). We multiplied exposure scores by 1 million to normalize the resulting range of values. Figure 1 shows our formula and a sample calculation for the benzene exposure score in a census tract.
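As an illustration of the exposure score described above (modeled concentration times inhalation unit risk factor, scaled by 1 million), a minimal Python sketch follows. The function name and the numeric inputs are assumptions chosen for illustration only; they are not the values shown in Figure 1 or Table 1.

```python
def exposure_score(concentration_ug_m3, unit_risk_per_ug_m3):
    """Exposure score = concentration x inhalation unit risk x 1,000,000.

    The product of the first two terms is a theoretical lifetime excess cancer
    risk; the scaling expresses it per million people.
    """
    return concentration_ug_m3 * unit_risk_per_ug_m3 * 1_000_000


# Illustrative inputs only (hypothetical tract concentration and unit risk).
benzene_concentration = 2.0   # micrograms per cubic meter, hypothetical
benzene_unit_risk = 2.9e-5    # per (microgram per cubic meter), illustrative

benzene_score = exposure_score(benzene_concentration, benzene_unit_risk)
```

With these placeholder inputs the score is about 58, which can be read as roughly 58 theoretical excess cancers per million people exposed over a lifetime.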
We started with 35 HAPs previously identified as the compounds with the best available information on their potential to cause cancer via inhalation. We excluded 10 of these compounds from further analysis because their maximum estimated cancer risk for any census tract was less than 1 in 10 million. We calculated a combined source exposure score for each census tract in California by summing the exposure scores for the remaining 25 potentially carcinogenic HAPs. Table 1 lists the HAPs included in the combined source group, the cancer classification for each compound, and the inhalation unit risk factors used in this study.
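The screening and summation steps described above can be sketched in Python as follows. The tract identifiers, compound names, and risk values are hypothetical, and the data layout (a dictionary of per-tract risk estimates) is an assumption made for this sketch rather than the structure of the U.S. EPA database.

```python
def max_risk_by_compound(risk_by_tract):
    """Return the largest estimated risk observed in any tract, per compound."""
    maxima = {}
    for tract_risks in risk_by_tract.values():
        for compound, risk in tract_risks.items():
            maxima[compound] = max(risk, maxima.get(compound, 0.0))
    return maxima


def combined_scores(risk_by_tract, min_max_risk=1e-7):
    """Drop compounds whose maximum risk is below 1 in 10 million, then sum.

    Returns the set of retained compounds and a per-tract combined score
    (summed risk x 1,000,000).
    """
    maxima = max_risk_by_compound(risk_by_tract)
    kept = {c for c, m in maxima.items() if m >= min_max_risk}
    scores = {
        tract: sum(r for c, r in risks.items() if c in kept) * 1_000_000
        for tract, risks in risk_by_tract.items()
    }
    return kept, scores


# Hypothetical risk estimates (concentration x unit risk) for two tracts.
risk_by_tract = {
    "tract_A": {"benzene": 5.8e-5, "formaldehyde": 1.2e-5, "compound_x": 4.0e-9},
    "tract_B": {"benzene": 2.0e-5, "formaldehyde": 6.0e-6, "compound_x": 9.0e-8},
}

kept, scores = combined_scores(risk_by_tract)
```

In this toy example "compound_x" never reaches a risk of 1 in 10 million in any tract, so it is excluded before the per-tract sums are taken.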
We used the Spearman rank correlation coefficient to evaluate the degree of covariability among individual HAPs with the same primary emission source (Snedecor and Cochran 1989). Most of the correlation coefficients between four chemicals (acetaldehyde, benzene, butadiene, and formaldehyde), which are emitted primarily from mobile sources, exceeded 0.9 (Table 2). Many of the HAPs emitted primarily from area sources were also highly correlated. For this reason, we decided to further analyze the HAP data using mobile, area, and point emission source groups instead of individual compounds. For each emission source group, we calculated exposure scores by summing the concentration multiplied by the inhalation unit risk factor for any of the 25 potentially carcinogenic HAPs listed in Table 1. Exposure scores based on emission source also correlated with each other, although not as highly. The resulting Spearman correlation coefficients were between 0.6 and 0.7.
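For readers unfamiliar with the Spearman rank correlation used above, a minimal standard-library Python sketch follows: it ranks each variable (averaging ranks for ties) and computes the Pearson correlation of the ranks. The concentration values are hypothetical, and this is a generic implementation of the statistic, not the software used in the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical modeled concentrations of two mobile-source HAPs in four tracts.
benzene = [0.5, 1.2, 2.0, 3.1]
butadiene = [0.05, 0.11, 0.30, 0.28]
rho = spearman(benzene, butadiene)
```

Because the statistic depends only on ranks, it is insensitive to the very different concentration scales of individual compounds.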
We performed a multivariate Poisson regression analysis (Frome and Checkoway 1985) using SAS PROC GENMOD (SAS 2000) and the Stata procedure "glm" (StataCorp 2001), adjusting for age, race/ethnicity, and sex. We used exposure scores for each emission group in several ways. Treating the score as a continuous variable and assessing the resulting regression coefficient provided us a test for trend. To obtain rate ratio (RR) estimates for various exposure levels, we divided the continuous exposures into four categories: low or reference exposure (< 25th percentile), medium exposure (25th-74th percentiles), high exposure (75th-89th percentiles), and very high exposure (≥ 90th percentile). We represented these categories in the regression as indicator terms. Finally, we constructed cubic spline curves (Sasieni 1995) to permit a more flexible representation of exposure (Greenland 1995). We analyzed all types of childhood cancer together and the two most common cancer types, leukemias and gliomas (brain cancer), separately.

Results
Table 3 shows the number of cases included in our analysis by age, race/ethnicity, and sex for all sites combined as well as separately for the leukemias and gliomas. Of the 7,143 childhood cancer cases diagnosed in California during this time, we were unable to assign the residence of 155 cases (2.2%) to census tracts because of incomplete or missing address information. Approximately one-third of the study cases were leukemias (n = 2,443), including 1,938 cases of acute lymphocytic leukemia (ALL) and 368 cases of acute nonlymphocytic leukemia (ANLL). Gliomas accounted for 19% of the total cancers (n = 1,351). Thirty-six percent of the cases occurred among Hispanic children, 47% among non-Hispanic whites, and 7% among African Americans. The study period encompassed 46 million person-years among children in California.
[...] source groups, we found the exposure scores for census tracts at the highest level (≥ 90th percentile) to be at least three times greater than the exposure scores for tracts at the lowest or reference level (< 25th percentile). Figure 2 shows the spatial distribution of the combined source exposure score by census tract in California. As expected, the most densely populated areas of the state, including Los Angeles and the San Francisco Bay area, have the highest combined exposure scores. Figure 3 is an enlarged map of the combined exposure score by census tract in Los Angeles and Orange Counties; local variations in exposure scores between census tracts are more apparent at this scale.
For the combined source group, the cancer incidence rates in the highest HAP exposure census tracts were a modest 6% higher than those in the lowest HAP exposure areas [RR = 1.06; 95% confidence interval (CI), 0.97-1.16]. For the leukemias, the RR for the highest combined source group exposure category was 1.21 (95% CI, 1.03-1.42). The trend from the lowest to highest exposure levels for the combined source group and leukemias was statistically significant (p < 0.05).
We also ran initial models for each of the emission source groups, adjusting only for age, race/ethnicity, and sex. For the point-source group, the RR for leukemia at the highest exposure level (≥ 90th percentile) was 1.32 (95% CI, 1.11-1.57). Again, the trends from the lowest to highest exposure levels were statistically significant (p < 0.05). We saw some suggestion of a stronger effect in younger children. Among children diagnosed with leukemia younger than 5 years, the RR for the highest point-source exposure category was 1.45 (95% CI, 1.17-1.79), somewhat higher than the RR for children 5-14 years old, for whom the RR was 1.18 (95% CI, 0.88-1.58; data not shown).
Poisson regression using a spline (with six join points) for point-source exposure score conveys the same impression of an increasing leukemia rate with increasing score (Figure 4), although with a leveling off above a score of 100, which includes only seven census tracts (0.1%). A likelihood-ratio test for the spline terms (adjusting for age, race/ethnicity, and sex) has a p-value of 0.02. In contrast, the spline regression for point-source exposure and glioma shows no distinct trend (Figure 5), consistent with the glioma RRs for exposure categories in Table 4.
Because childhood leukemias were the most common cancer type, we further examined these malignancies by specific subtypes. Among the leukemias, 1,938 were ALL and 368 were ANLL. The RR for ALL was 1.19 (95% CI, 1.00-1.43) for the highest exposure level of the combined source group. For ANLL, the RR was 1.46 (95% CI, 0.97-2.19) at the highest exposure level of the combined source group (data not shown).
When we restricted the analysis to urban block groups, the resulting point estimates were similar (data not shown). The confidence intervals were wider, because of the smaller number of census tracts included in the analysis, but the findings remained consistent. Similarly, when we included a measure of socioeconomic status (quartile of census tract median family income) in the original models, we observed no substantial differences in point estimates (data not shown).
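To make the exposure categories and rate ratios reported above more concrete, the following Python sketch assigns tracts to the four percentile-based categories and computes a crude (unadjusted) rate ratio with a Wald 95% confidence interval. This is a simplification: the study estimated RRs by Poisson regression adjusted for age, race/ethnicity, and sex, whereas this sketch uses a nearest-rank percentile rule and a two-group comparison, both of which are assumptions, and all counts are hypothetical.

```python
import math
from bisect import bisect_right

CATEGORY_LABELS = ("low", "medium", "high", "very high")


def percentile_cutoffs(scores, percents=(25, 75, 90)):
    """Nearest-rank percentile cutoffs (integer arithmetic avoids float rounding)."""
    ordered = sorted(scores)
    n = len(ordered)
    return [ordered[max(0, (p * n + 99) // 100 - 1)] for p in percents]


def categorize(score, cutoffs):
    """Map a score to low (< 25th), medium, high, or very high (>= 90th)."""
    return CATEGORY_LABELS[bisect_right(cutoffs, score)]


def crude_rate_ratio(cases_exposed, py_exposed, cases_ref, py_ref, z=1.96):
    """Crude rate ratio with a Wald confidence interval on the log scale."""
    rr = (cases_exposed / py_exposed) / (cases_ref / py_ref)
    se_log = math.sqrt(1 / cases_exposed + 1 / cases_ref)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical tract scores 1..100 and hypothetical case and person-year counts.
tract_scores = list(range(1, 101))
cutoffs = percentile_cutoffs(tract_scores)
rr, lower, upper = crude_rate_ratio(120, 1_000_000, 100, 1_000_000)
```

A regression-based analysis would instead enter the three non-reference categories as indicator terms alongside the adjustment variables, with log person-years as an offset.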

Discussion
For all sites combined, we found no significant childhood cancer excess within census tracts in the highest exposure category for all HAPs combined. However, RRs for the leukemias appeared to be elevated in these tracts. The combined HAP exposure score provides an overall ambient air quality estimate because it includes emissions from mobile, area, and point sources. When calculating the exposure scores separately by emission source, we observed the most dramatically elevated childhood cancer and leukemia incidence rates within census tracts in the highest exposure category for point sources. Census tracts in the highest exposure categories for area- and mobile-source HAPs also had slightly elevated leukemia rates. Although these elevated rates were not statistically significant, the point estimates were similar in magnitude to those we observed for the combined score and point sources.
Environmental Health Perspectives • VOLUME 111 | NUMBER 4 | April 2003
Given the highly correlated nature of the exposure data, we cannot ascertain clearly whether any one source or any one chemical is driving the observed associations. However, it is interesting to note that we observed the highest RRs for leukemia in census tracts ranked highest in exposure to HAPs emitted from point sources, for which the main compounds contributing to the exposure scores were benzene and perchloroethylene. This stronger association between leukemia and point-source emissions is consistent with the known benzene exposure risk for leukemia in adults (IARC 1982; Rinsky et al. 1987). However, mobile (benzene) and area (perchloroethylene) sources also emit these same compounds in large quantities. We noted some, but not extensive, overlap between tracts categorized in the highest exposure groups for mobile, area, and point sources. For example, 36% of the highest point-source census tracts were also classified in the highest exposure category for area sources. However, only 19% of the highest point-source tracts fell into the highest exposure category for mobile sources. The census tracts in the highest exposure category for point sources were concentrated in heavily industrialized, urban counties; 70% of the tracts were in Los Angeles County alone.
In this study, the magnitude of the theoretical cancer risk, as estimated by the exposure scores used in our analysis, did not predict the resulting RRs for childhood malignancies. On the basis of the exposure scores alone, we would have expected the highest RRs to be for mobile sources and all sources combined, rather than for point sources as we observed. It is important to keep in mind that these theoretical risks are predicated primarily on cancer potency values derived from animal data and limited human health studies of adult cancers, which include tumor types quite different from those most common in children. Because of this and because of the uncertainties involved in extrapolating from high exposure levels in animal and occupational studies to lower exposure levels in ambient air, we are uncertain to what degree these estimates might apply to childhood cancers.
Only a handful of previous studies have assessed the potential risk of exposure to HAP sources and cancer in children. Most of these studies focused on indirect indices of potential exposure to mobile sources (traffic), with mixed results. Savitz and Feingold (1989) reported a strong association between case status and high traffic density (> 500 cars a day) on the street of residence among cases and controls in Denver, Colorado (USA). They found that the association was strongest for childhood leukemia, with an odds ratio (OR) of 2.1 (95% CI, 1.1-4.0), especially among children younger than 5 years (OR = 5.6; 95% CI, 1.9-16.7). Pearson et al. (2000) reanalyzed the data using a methodology that counted all surrounding streets and reported an OR for leukemia of 8.3 for the highest exposure category. A small study (39 leukemia cases) in Sweden observed elevated risks of childhood leukemia associated with estimated exposure to motor vehicle exhaust (Feychting et al. 1998).
In contrast, a large case-control study conducted in Denmark, using a validated model to estimate lifetime exposure to benzene and nitrogen dioxide from motor vehicle exhaust, did not find an association with childhood cancer (Raaschou-Nielsen et al. 2001). Similarly, we failed to find an association in a small case-control study of traffic density and early childhood leukemia in San Diego County, California (Reynolds et al. 2001). In that study, we estimated traffic density using varied methods, including information on average daily traffic counts and other road characteristics; none of the traffic intensity measures were associated with case status. In a later statewide ecologic study, we analyzed childhood cancer rates with respect to traffic density at the census block group level and, consistent with other more recent studies focused on traffic measures (Langholz et al. 2002; Raaschou-Nielsen et al. 2001), we failed to find a significant association (Reynolds et al. 2002). However, the RR for leukemia in that study, at the highest exposure level for traffic density (> 90th percentile), was 1.15 (95% CI, 0.97-1.37), which is very similar to the RR (1.18) observed for mobile-source HAPs in this study.
Because childhood cancers are likely to have shorter latency periods than tumors found in adults, they lend themselves more easily to this type of epidemiologic analysis. Our study included a large, diverse population of children in a geographically varied area. It also has the advantage of being population based and therefore is not subject to participation bias. But there are still a number of limitations to consider. Childhood cancer rates depend on population estimates, which may not be completely accurate. A child's address at time of cancer diagnosis may not represent the relevant exposure time or place, a particular concern for highly residentially mobile populations such as those in California (Reynolds et al. 2001). Additionally, some children spend a significant amount of time away from home, at school, in day care, or riding in vehicles, and may be exposed to pollutants in these locations. It is interesting to note that children 0-4 years old had higher point estimates for leukemia than children 5-14 years old. This finding needs to be studied in more detail but could be because the address at diagnosis is a more relevant measure of exposure for younger children because they generally spend more time at home and are less likely to have lived in other locations.
Our study employed an ecologic design, using group-level (census-tract) exposure data and cancer incidence rates. Because significant concentration variations could exist within a census tract, group exposure levels do not necessarily correspond to individual exposure levels. Likewise, a study of this nature cannot account for variations in susceptibility among individuals with comparable exposure.
The value of the U.S. EPA's HAP data lies in its breadth of geographic coverage, its inclusion of a large number of compounds, and its use of an atmospheric dispersion model to estimate concentrations. Previous epidemiologic studies used data on proximity to facilities, roadways, or emissions as a surrogate for exposure. Use of the U.S. EPA's modeled HAP concentrations offers a major improvement in exposure assessment because the concentration can be combined with the carcinogenic potency of each compound to estimate a theoretical cancer risk. In evaluating the combined risk from multiple chemicals, the sum of cancer risk estimates provides a more meaningful exposure measure than does the sum of emissions or concentrations.
The HAP data represent only the average outdoor concentration of pollutants, not the total exposure from all possible sources and pathways. Although indoor HAP sources may contribute more to personal exposure than do ambient outdoor concentrations (Wallace 1991), our study did not include potentially important indoor HAP sources such as environmental tobacco smoke. We also limited our analysis to HAP compounds with available modeled concentrations and with fairly strong evidence of inhalation-route carcinogenicity. Many of the HAPs have not been evaluated for carcinogenicity. For those that have tested positive in laboratory studies for carcinogenicity, many lack reliable inhalation cancer potency values (Kyle et al. 2001). Another limitation of the HAP data is that relatively little meteorologic information was available for input into the U.S. EPA dispersion model; this could decrease the reliability of modeled concentrations at the census tract level. Evaluation of individual HAPs was not possible because several compounds are usually emitted simultaneously from a given source; the resulting concentration estimates were therefore too highly correlated.
Our initial evaluation suggests that background air quality, as estimated by HAPs, may be associated with incidence of childhood leukemia. The modeled HAP concentrations developed by U.S. EPA provide a valuable resource for studies, such as ours, designed to take an initial look at health outcome differences associated with identified pollutants of interest. To our knowledge, this is the first epidemiologic study using modeled concentrations of multiple chemicals, at the census-tract level, to evaluate cancer incidence. Future HAP modeling efforts could be improved by incorporating more accurate and geographically specific emission inventories. In southern California, a large monitoring and modeling effort produced more accurate estimates of ambient HAP concentrations (South Coast AQMD 1999). Further field validation studies must also be conducted to examine ambient HAP concentrations and actual personal exposure of children. Building on a large collaborative case-control study with investigators at the University of California at Berkeley, we are beginning a follow-up analysis using individual lifetime residential history data to assess the relationship between cumulative exposure to HAPs and childhood leukemia. This follow-up study will also combine data on outdoor HAP concentrations with information, obtained from questionnaires, on indoor pollution sources and personal activity patterns to further evaluate the relationship between childhood leukemia and exposure to HAPs.