Lung Cancer Risk after Exposure to Polycyclic Aromatic Hydrocarbons: A Review and Meta-Analysis

Typical polycyclic aromatic hydrocarbon mixtures are established lung carcinogens, but the quantitative exposure–response relationship is less clear. To clarify this relationship we conducted a review and meta-analysis of published reports of occupational epidemiologic studies. Thirty-nine cohorts were included. The average estimated unit relative risk (URR) at 100 μg/m3 years benzo[a]pyrene was 1.20 [95% confidence interval (CI), 1.11–1.29] and was not sensitive to particular studies or analytic methods. However, the URR varied by industry. The estimated means in coke ovens, gasworks, and aluminum production works were similar (1.15–1.17). Average URRs in other industries were higher but imprecisely estimated, with those for asphalt (17.5; CI, 4.21–72.78) and chimney sweeps (16.2; CI, 1.64–160.7) significantly higher than the three above. There was no statistically significant variation of URRs within industry or in relation to study design (including whether adjusted for smoking), or source of exposure information. Limited information on total dust exposure did not suggest that dust exposure was an important confounder or modified the effect. These results provide a more secure basis for risk assessment than was previously available.

VOLUME 112 | NUMBER 9 | June 2004 • Environmental Health Perspectives Research | Review Airborne polycyclic aromatic hydrocarbons (PAHs), which are emitted when organic matter is burned, are ubiquitous in the occupational and general environment. It has long been known that several PAHs can produce cancers in experimental animals, and epidemiologic studies of exposed workers, especially in coke ovens and aluminum smelters, have shown clear excesses of lung cancer and highly suggestive excesses of bladder cancer [Boffetta et al. 1997; International Agency for Research on Cancer (IARC) 1984(IARC) , 1985(IARC) , 1987Mastrangelo et al. 1996;Negri and La Vecchia 2001]. The animal experiments have included some using airborne exposure and have been mixtures and individual compounds, including particularly benzo[a]pyrene (BaP). Although the existence of a cancer risk is beyond reasonable doubt, considerable uncertainty exists as to the exposure-response relationship, and hence as to the risks posed at today's levels in the workplace and general environment. Information on this relationship is clearly important for setting of occupational and environmental standards.
Estimating exposure-response relationships by extrapolation from animal studies is possible [Collins et al. 1991; U.S. Environmental Protection Agency (U.S. EPA) 1984], but the limitation of this approach, particularly species differences, makes sole reliance on it problematic. Data from a large cohort of coke oven workers in the United States, which has been followed since the 1960s (Costantino et al. 1995;Lloyd 1971), have been used to estimate risk per unit residential exposure [Nisbet and LaGoy 1992;World Health Organization (WHO) 1987]. However, many other studies provide information that has not yet been systematically used to quantitatively assess risk.
The fact that PAHs comprise a mixture, several components of which are animal carcinogens, adds to the complexity of the task. One issue is whether a single index of exposure, such as BaP or total benzene soluble matter (BSM) or cyclohexane soluble matter (CSM) is adequate to determine risk. If such an index is used, risk per unit exposure may differ between studies (and unstudied exposures) because of differences in the ratio of this index to the total carcinogenic potential of the mixture. It is possible that such variation, if present, can be adequately described by classifying exposures in broad categories (e.g., by source). However, this approach remains untested.
We conducted a review and meta-analysis that aimed to use all relevant published evidence from epidemiologic studies to obtain an estimate or estimates of the relationship of PAH exposure with lung and bladder cancer and to identify sources of variation in this relationship. Here we report the results for lung cancer.

Methods
The methods summarized here are described at greater length in a technical report on this work (Armstrong et al. 2002).
We excluded the following: • Studies of workplaces where PAHs were considered unlikely to be the predominant lung or bladder carcinogen, to reduce potential for confounded results. Such workplaces included those in the rubber industry; those where primary exposure was from diesel exhaust; foundries; and steel works (because of co-exposure to silica), unless there were separate analyses specifically of coke oven workers. • Studies for which it was not possible to quantify exposure to PAHs. Hospital-and population-based case-control and registry studies were excluded for this reason. • Studies of occupational exposure other than by inhalation. • Superceded publications. Where repeated follow-ups of the same workforce were reported in several articles, only the most recent was included. • Biomarker studies because it was difficult to deduce relationships of exposure concentration to cancer incidence. • Proportional cancer analyses. • Articles not written in English.
After these exclusions, 34 articles remained, of which 5 reported two distinct cohorts for which results were presented separately. Thus, there were 39 cohorts. For each included cohort we systematically extracted general descriptive information, information on potential modifiers of risk associated with PAHs, and information from which we estimated unit relative risk (URR) increments (see next two subsections).

Exposure Estimation
We distinguished studies according to author reports of the following: • exposures to PAHs measured as BaP (10 cohorts) • exposures to PAHs measured by a proxy that we could convert to BaP: benzene soluble matter (BSM), total PAHs, carbon black (6 cohorts) • no measures of exposure (n = 23 cohorts) For those studies with no exposure measures, we estimated exposure to PAHs for each workgroup for which cancer risk estimates were presented (e.g., for top-oven workers, side-oven workers). These exposure estimates were based on published exposure estimates in the same industries (IARC 1984;Lindstedt and Sollenberg 1982) and other published epidemiologic studies. Principle estimates thus derived are presented in Table 1.
Cumulative exposure. We sought to relate cancer risk to mean cumulative exposures to BaP (duration × time-weighted mean concentration). Where risk by cumulative exposure was not published, it was derived as the product of mean estimated concentration of exposure in each group for which risk was reported and the mean duration of exposure in that group. In the absence of information on duration of exposure, 20 years was assumed, representing the average found in studies for which duration was reported.
Dust exposure. At the inception of this study, we had not planned to seek information on potentially confounding exposures beyond those noted by the authors. However, interest has sharply increased recently in the hypothesis that inhaled dust carries a risk of lung cancer regardless of composition. We therefore sought to add information that we could find on dust exposure. Because few publications reported such estimates, we relied entirely on supplementary data and the judgment of the hygienists on the research team. We were aware that this would be a very rough assessment and chose a simple scale [low (< 1 mg/m 3 total dust); moderate (1-5 mg/m 3 ); high (5-10 mg/m 3 ); very high (10-25 mg/m 3 )] and broad job groups or, in some cases, entire industries. Assessments are listed in the final column of Table 1. A list of references on which assessments were based is included in Appendix B4 of the full research report (Armstrong et al. 2002).

Estimation of Unit Risks from Studies
We estimated relative risks (RRs) per 100 µg/m 3 years cumulative BaP for each study, using a log-linear model, RR = exp(bx), where RR is relative risk, x is cumulative exposure in micrograms per cubic meter years and b is the slope of the exposure-response relationship. (For this model, RR = 1 if x = 0.) Thus, relative risk represents the risk of lung cancer at a specified exposure (x) relative to that at zero exposure. For example, RR = 1.30 at x = 100 µg/m 3 years BaP exposure implies that at this exposure, lung cancer risk is 1.3 times that of an unexposed person-a 30% excess.
Slopes b were estimated by Poisson regression, using data from each study from published tables of risk [usually standard mortality ratios (SMRs) or internal RRs] by cumulative exposure, duration of exposure, or job group. For 13 cohorts with only one published SMR, these estimates depended on assuming that at zero exposure those cohorts would have experienced the same rates as those of the general population (allowing for age and calendar time). For the remaining 23 cohorts, rates at zero exposure were inferred from the cohort itself (i.e., exposureresponse curves were not constrained to SMR = 1 at zero exposure). Standard errors were from a scale-overdispersion model, reflecting variation in observed deaths above the Poisson expected, and estimated jointly across studies.
Unit relative risks were defined as those predicted by the models at 100 µg/m 3 years BaP [i.e., exp(100b)]. The exposure of 100 µg/m 3 years BaP is close to the mean of the maximum exposures in included studies and corresponds to a concentration of 2.5 µg/m 3 BaP over 40 years.
Many studies reported more than one contrast of risk in differently exposed subgroups, so several URRs could be estimated. For example, there may be tables of risk by duration of service (sometimes subdivided by job group) and by job group (e.g., coke oven top-worker, side-worker, and distillation products), as well as on overall SMR. In these instances we selected the following, in order of importance: • internal comparisons (risk in groups of different exposure in the same study) over external (a single SMR) • large contrasts of exposure across exposure groups • mortality outcomes over morbidity • confounder-controlled contrasts over uncontrolled (e.g., smoking-adjusted vs. unadjusted) • estimates without latency or lag restrictions over those with such restrictions (to maximize comparability in primary analyses).

Meta-Analysis and Meta-Regression
We sought to describe the distribution of URRs across studies and to identify determinants, allowing for sampling uncertainty of each estimate and additional random variation between studies (i.e., random effects) if present (Sutton 2000). We proceeded on the assumption that sampling variation of the logged URRs and any additional random variation were reasonably approximated by a normal distribution. Cochran's test was used to determine significance of variation in URRs between studies. Meta-regression, using a log-linear random effects model with restricted maximum likelihood, was used to clarify patterns in URRs (e.g., a tendency for different URRs for each industry) and to identify whether such patterns could have occurred by chance (Sutton 2000).

Study Characteristics
Characteristics of the 39 cohorts are shown in Table 2, and the frequencies of selected study characteristics are given in Table 3. All were essentially cohort studies, but three used nested case-control samples, and one (Armstrong et al. 1994) used case-cohort sampling from within the cohort. For 13 of the cohorts, only single SMRs were reported; for the remainder there were risk comparisons (contrasts) across 2 or more exposure groups (maximum 7). Of these, the contrasts selected according to our criteria were by cumulative Review | Lung cancer and PAH meta-analysis Environmental Health Perspectives • VOLUME 112 | NUMBER 9 | June 2004 971  Table 2 lists the cumulative exposure in the highest exposure group in each study, which ranged across three orders of magnitude from 0.75 to 805 µg/m 3 years BaP. This corresponds approximately to concentrations in air of 0.04-40 µg/m 3 . This large range was the predominant reason for the large range in the precision with which the URR was estimated.

Unit Relative Risks
Relative risks predicted at 100 µg/m 3 years BaP from the log-linear model are shown to the right of the cohort characteristics in Table 2. They ranged from 0 to > 1,000. The precision with which these relative risks were estimated also varied substantially, with standard errors (log scale) ranging from 0.02 to > 1,000. Most of the variation in precision was due to variation in the degree of exposure contrast in the studies. Many of the studies at the bottom of the table (power and carbon black industries) have low exposures. This limits the range of exposures being compared in the studies, which causes imprecision in estimated URRs, shown as wide confidence limits. (URRs are essentially regression coefficientsa narrow range in the x variable leaves uncertainty in the slope of the line.) Some variation in precision was also due to variation in size of cohort populations and duration of follow-up, which is reflected in the number of cases.
For cohorts without any exposure groups with mean higher than 100 µg/m 3 BaP, the estimate of relative risk at this value (the URR) is an extrapolation. The extreme values of URRs found in such cohorts are thus theoretical. To give an indication of the actual relative risks found in the cohorts, we also show for each cohort (Table 2, last column) the relative risks for the group with the highest exposure in that cohort, as predicted by the (log-linear) model.
Twenty-eight (72%) of the URRs were > 1, with the lower confidence limit > 1 (p < 0.05) in 14 of these URRs. The mean (estimated by random effects meta-analysis), overall and in subgroups, is shown in Table 3. A graph of all results loses definition catastrophically in the more precise studies. Limiting the graph to studies with standard errors < 10 ( Figure 1A) and 1 ( Figure 1B) allows focus on the more precise and consequently influential cohorts.
The overall mean URR was 1.20 and significantly > 1 (p < 0.001). There was no one cohort dominating this estimate, and it was little changed on removal of the less precise cohorts. However, there was significant heterogeneity of URRs across cohorts (p < 0.001).
Meta-regression revealed that much of the heterogeneity was explained by variation in URRs across industries (p = 0.002), although Review | Armstrong et al. 972 VOLUME 112 | NUMBER 9 | June 2004 • Environmental Health Perspectives coke ovens, gasworks, and aluminum smelters exposed to coal tar volatiles at similar levels had similar mean URRs. There was no significant heterogeneity of URRs within industry groups. We therefore examined variation in URRs according to other factors after allowing for the differences across industries by including industry in the meta-regression. After doing so, there was no difference more than could easily be explained by chance (p > 0.20) when studies were grouped according to source of exposure information, continent, whether the outcome of studies was mortality or morbidity, or exposure contrast (cumulative exposure, duration, etc). Neither did maximum exposure explain variation. The higher mean URR in the three nested case-control studies (p = 0.10) and that in the four smoking-adjusted studies (p = 0.05) are not independent. Both reflect high URRs in two case-control studies also adjusted for smoking.

Publication Bias
There was little evidence that the URR was related to its standard error or to number of cases (p > 0.20), factors that might relate to publication. It is evident in Table 2 that although the very high URRs derive from the smaller studies with lower exposures, some of the extremely low estimated URRs do also. Further, neither Egger's test nor Begg's test (p > 0.20) gave evidence for publication bias (Sutton 2000). Applying a trim-and-fill analysis (designed to correct for publication bias, if any) made negligible difference to the mean.

Dust
Because our information on dust exposure was for each cohort or sometimes for broad job group within studies (Table 1), we could not use conventional methods for controlling for confounding (stratification or inclusion of dust in multiple regression analyses). We adopted an ad hoc approach to use the data we had in order to shed what light we could on this issue: We compared relative risks estimated at 100 µg/m 3 years BaP in cohorts in which we had identified substantial dust exposure with those in which there was less. If generic dust were an important cause of lung cancer in these cohorts, one would expect greater apparent risks per unit PAH (BaP) where it was accompanied by dust. Results are shown at the bottom of Table 3. There was no significant association between estimated relative risk per unit PAH (BaP) exposure and dust exposure in the industry. This gives some reassurance that dust is not the predominant cause of the association seen in this cohort between PAH and lung cancer.

Sensitivity Analysis
By investigating dependence of URRs on study characteristics (Table 3), we have already implicitly examined sensitivity of results to these characteristics (study design, smoking adjustment, exposure information, etc) and found little such sensitivity. Here we report investigations of sensitivity of our results to three statistical modeling assumptions.
First, we repeated analyses using the linear model (RR = 1 + bx). We found very similar rankings of URRs (Spearman's correlation = 0.99). Fitted relative risks at the maximum exposure found in each plant were also similar. However, there was some variation in URRs of individual cohorts; those with lower exposures typically had lower URRs with the linear model, and those with higher exposures higher URRs. For example, the URR for Swaen's 1997 study of asphalt workers was 15.23 with the exponential model but 3.13 with the linear model; the relative risk predicted at the actual mean exposure in this cohort of 10 µg/m 3 years, however, was 1.31 for both models. Because methods are not available to rigorously allow for the highly nonregular sampling error in the linear estimates in meta-analyses, we view means and the assessment of heterogeneity of URRs estimated under this model cautiously. Nevertheless, it is reassuring that the mean estimated relative risk at 100 µg/m 3 years BaP was similar (1.19 compared with 1.20, both highly significant). The patterns of variation of risk across industries were broadly similar, although with some important differences (e.g., means for coke, gas, aluminum, and other were 1.22, 2.25, 1.04, and 4.41, respectively, in linear model vs 1.17, 1.15,1.16, and 10.9, respectively, in log-linear model).
Second, we repeated analyses using alternative criteria for choice of contrast: • Minimum standard error • Minimum standard error but using internal comparisons instead of single SMRs whenever available.

Review | Lung cancer and PAH meta-analysis
Environmental Health Perspectives • VOLUME 112 | NUMBER 9 | June 2004 973 In either case, the mean URR and the basic pattern of URRs between industries changed little, although estimates for individual studies changed, sometimes substantially.
Finally, we investigated dependence of our results on extrapolation of risks from very high exposures, by repeating analyses three times, excluding exposure groups with means more than 80 µg/m 3 years BaP (40 years at 2 µg/m 3 ), 40 µg/m 3 years BaP (1 µg/m 3 ), and 20 µg/m 3 years BaP (0.5 µg/m 3 ). For example, the large U.S. coke ovens study (Costantino et al. 1995) had seven groups with means 0.0, 14.8, 73.7, 162.4, 251.2, 339.9, and 805.4 but contributed only the first three groups to the first reestimated URR (means ≤ 80) and only the first two to the second and third reestimated URRs (means ≤ 40 and ≤ 20). Overall mean URRs and mean URRs for coke ovens, gasworks, and aluminum smelters are given in Table 4. The mean URR increases substantially on removal of higher exposure groups. This is partly explained by the greater weight given by URRs from industries with lower exposures, most of which have higher URRs. However, looking at the results for coke ovens, gasworks, and aluminum smelters only (right side of Table 4), we see that even within these industries restricting analyses to groups with lower cumulative exposures led to higher mean URRs, suggestive of an exposureresponse curve steeper at lower exposures than at higher exposures. However, for all these analyses except those excluding all exposures above 20 µg/m 3 years BaP, which was imprecise, there was significant heterogeneity between studies. These results should therefore be interpreted with caution.

Discussion
That our meta-analysis supports the conclusions of previous reviews that lung cancer is associated with PAH exposure is reassuring but not surprising. Our attention to quantification of this relationship in a comprehensive review is novel. Although other reviews have cited unit risk estimates from single studies, and one (Gibbs 1997) calculated such estimates from eight studies, no meta-analyses of unit risk estimates have been published.
Our results for coke ovens, gasworks, and aluminum production are relatively well supported by evidence from multiple studies, although biases should be considered. Our findings of higher URRs for other industries are more tentative. In the following sections, we discuss biases and possible explanations for patterns of variation in URRs.

Possible Biases
Each study included in this meta-analysis is subject to the usual range of potential biases in epidemiologic studies, in particular, confounding and information bias (exposure error).
Our first concern is potential confounding by smoking, which was uncontrolled in most studies. However, for two reasons, this seems unlikely to have caused major bias: a) Although only four studies controlled for smoking, two were large studies with substantial exposure allowing precise estimates of URRs. The mean URR in smoking-adjusted studies was statistically compatible with but somewhat higher than that for the studies uncontrolled for smoking and was statistically significant. b) Several methodological articles (Axelson and Steenland 1988;Blair et al. 1988;Siemiatycki et al. 1988) have explored mathematically the potential for confounding by smoking. One common conclusion was that because comparisons are generally between groups with only moderately differing smoking habits (particularly different groups of manual workers, as in most studies in this analysis), substantial confounding is unlikely.
Confounding by other occupational exposure is also possible, but we limited that potential by excluding cohorts in which PAHs were Review | Armstrong et al.

974
VOLUME 112 | NUMBER 9 | June 2004 • Environmental Health Perspectives  Chau 1993Costantino 1995Franco 1993Hurley 1983Hurley 1983Reid 1956Sakabe 1975Swaen 1991Xu 1996Berger 1992Doll 1972Doll 1972Armstrong 1994Milham 1979Moulin 2000Mur 1987Rockette 1983Rockette 1983Romundstad 2000Spinelli 1991Donato 2000Liu 1997Moulin 1989Hammond 1976Hansen 1991Swaen 1997Hansen 1989Maclaren 1987Swaen 1997Evanhoff 1993Hansen 1983 Combined Chau 1993Costantino 1995Franco 1993Hurley 1983Hurley 1983Reid 1956Sakabe 1975Swaen 1991Xu 1996Berger 1992Doll 1972Doll 1972 Armstrong 1994   judged unlikely to be the predominant carcinogen. We did not exclude subjects exposed to high levels of total dust, however, because the hypothesis that dust may cause lung cancer regardless of composition has gained credence only recently (Pope et al. 2002) and because dust is a universal co-exposure of PAHs. The analysis that we conducted addressing the possibility of confounding by dust gave no support to the hypothesis that dust plays a major confounding role. Other ad hoc investigations of confounding potential, in particular noting the absence of lung cancer excess in prebake aluminum workers (exposed to dust but little PAH), came to similar conclusions (Armstrong et al. 2002). However, none of our analyses could rule out confounding completely because of the limited information on total dust exposure available to us and the lack of control for this exposure in the published studies. Further evaluation will be possible when assessments of the dust hypothesis are carried out, which was not possible in this study.
Exposure is likely to have been inaccurately estimated in many studies, in particular those for which no exposure data were published in the report of the epidemiologic study itself, so we made estimates. Random exposure error tends to bias exposure-response slopes toward the null value (Armstrong 1998). However, if our estimates were systematically too high or too low, exposure response slopes would be underestimated or overestimated, respectively. We included estimates if we believed them to be within 4 times the true exposure, so considerable margin for uncertainty remains. It is somewhat reassuring that the mean URR in those studies for which we estimated exposure was not much different from the mean URR in those studies with author-provided exposure information (Table 3). However, errors in exposure estimation might explain particularly high or low URRs in specific studies or industries. In those industries (tar distillation, chimney sweeping, power) with no studies reporting investigators' own exposure estimates, interpretation should be particularly cautious.
Finally, could bias be introduced by selection of cohorts or contrasts for inclusion? Our sensitivity analyses suggest no strong sensitivity, and standard tests for publication bias were negative. However, the overall mean URR is strongly influenced by the predominance of coke ovens and aluminum smelters in the sample.

Explanations for Variation in Unit Relative Risks
Unit relative risks may vary between industries and cohorts for three reasons: a) chance, b) biases, or c) because risk per unit BaP really varies. We established that variation in URRs between industries cannot be explained by chance (particularly coke ovens and aluminum production vs. asphalt and chimney sweeping), but variation within industry can be. Therefore, it seems sensible to focus attention on explaining variation between industries. We discussed biases and confounding in the preceding section. Biases, in particular from inevitably inaccurate exposure estimation, could explain some variation. Confounding by other occupational exposures, perhaps dust, could also play a part, although we found no evidence for this.
Two reasons might account for true variation in URRs. a) A factor that modifies the effect is present to varying degrees in different industries. An example is smoking. Even if different PAH exposure groups in each cohort smoke to the same extent (so there is no confounding), a heavily smoking cohort might exhibit greater or lesser effect on relative risk per unit occupational PAH than a lightly smoking cohort. Unfortunately, we did not have the information to address this aspect. Generally, we can assume that our cohorts were mixed smokers and nonsmokers, so the exposure-response relationships are most likely to predict risk well in similarly mixed groups. Other occupational exposures might also modify the effect per unit PAH by promoting or inhibiting the action of PAHs. Such a hypothesis is too general to evaluate without making it more specific. Finally, cumulative exposure may not be the right metric. If another metric (e.g., early adult exposure, lagged exposure, or another timeweighted exposure) were the relevant one, modification would arise if the time pattern of exposure differed across cohorts. Information on timing of exposure was insufficient for us to evaluate such hypotheses, but in any case we expect that timing of exposure would be too similar across cohorts for informative results to emerge. A few studies reported risk by lagged cumulative exposure, but these generally differed little from tables of risk by overall cumulative exposure. b) The carcinogenic potency of the PAH mixture varies across industries. As we noted earlier, many PAHs aside from BaP are carcinogenic in animals (IARC 1985). BaP is used as an indicator of the total risk, not because it is the sole causal agent but because at least in some industries it correlates well with other agents (Expert Panel on Air Quality Standards 1999). To the extent that PAH mixtures in different industries have different relative concentrations of the various carcinogenic PAHs (their profiles), this could thus explain differences in risk per unit BaP. Krewski et al. (1989) have proposed an approach that derives a risk metric by combining information on PAH profiles with information on relative carcinogenic potency from animal studies. To apply that approach to this meta-analysis, however, would require estimates of PAH profiles for each study or at least each industry. We did not have such information, which is not readily available, for this study. However, PAH profiles are slowly being ascertained and some are published (Appendix B3, Armstrong et al. 2002), so this approach could probably be applied in the future.
Some specific studies with URRs quite different from the mean for their industry deserve specific mention: • The two cohorts of gasworks worker studies (Doll et al. 1972) have high URRs. The estimate of exposure for retort workers in these plants was 3 µg/m 3 BaP and was heavily influenced by measurements reported in 1965 from mask samples. These measurements may have been underestimates. • The very high URR estimated from one study of carbon anode plant workers (Liu et al. 1997) was based on exposure estimates reported by Liu for just one of seven plants, which may not have been representative. • The low and precisely estimated URR from the study of several Norwegian aluminum production plants  has no obvious explanation. Exposure estimation was based on substantial hygiene data for most plants. The nonsignificance of the test for heterogeneity in URRs among studies of aluminum production workers indicates that the absence of excess risk in this study could have been due to chance, but the result remains noteworthy.

Comparisons with Other Unit Risk Assessments
The most directly comparable study estimated lifetime risks of lung cancer per 100,000 men from 50 years of continuous exposure to 1 ng/m 3 BaP [unit lifetime risk (ULR)] from nine studies, using a linear no-threshold model (Gibbs 1997). Eight of the nine studies were occupational (four from coke ovens, two gasworks, one aluminum production, and one asphalt), and they or their updates were included in our analysis. continuous exposure to relative risk estimates from occupational exposure (per 100 µg/m 3 BaP), we used the conversion factors used by Gibbs to do the reverse: where ULR is the lifetime risk per nanogram per cubic meter continuous (23 m 3 /day vs. 10 occupational) exposure (365 days vs. 230 occupational-our assumption) over 50 years, assuming a 9% baseline lifetime risk. Gibbs' finding of ULR 0.3,4.2,4.4,5.8,6.6,7.2,7.8,and 9.5 translates to URRs 1. 02, 1.26, 1.27, 1.35, 1.40, 1.44, 1.48, and 1.58, which are somewhat higher on average than the estimates for the same studies in this analysis but not grossly different. It is also possible to compare our metaanalytic estimates with those published from the U.S. coke oven cohort (latest update reported by Costantino 1995): • The U.S. EPA (1984), cited by WHO (1987) estimated a lifetime risk from continuous exposure per nanogram per cubic meter BaP of 8.7/100,000 using the linearized multistage model. Following the translation we used for the Gibbs study, this corresponds to a URR (relative risk from 100 µg/m 3 years BaP) of about 1.53. • Moolgavkar et al. (1998) estimated a unit absolute risk from continuous 1 µg/m 3 BSM of 15/100,000 using the two-stage clonal expansion model. Roughly re-expressing this using the Gibbs study translation gives a URR = 1.13. The second of these alternative estimates of URR is very similar to our estimate from Costantino (1995) using the exponential model (Table 2; URR = 1.15).
A review of 10 studies with risk estimates published for two or more exposure groups (Mastrangelo et al. 1996) emphasized the unit risk estimate published in one aluminum smelter study (Armstrong et al. 1994). The unit risk estimate cited by Mastrangelo for the Armstrong study used the linear model and was somewhat higher (1.39 translated into the units we have used) than the log-linear URR for this study (1.22) that we used in our meta-analysis.
Our findings of a larger URR in the asphalt industry than in coke ovens or aluminum smelters was tentative. Recent publication of a very large European study of mortality in the asphalt industry (Boffetta and Burstyn 2003) will add important information on this question. The study was not published in time for formal inclusion in this meta-analysis. It found an association of lung cancer with exposure to bitumen fumes in some but not other analyses. Estimates of exposure to PAHs as BaP were made, allowing as far as possible for knowledge of the extent to which coal tar was used as an additive, time trends in exposure levels, and type of asphalt paving. In the asphalt industry PAH exposure originates from bitumen, coal tar (now banned in Western Europe), and diesel exhaust. Contribution of diesel exhaust to PAH exposure was not incorporated into quantitative PAH exposure metric because available data did not permit the investigators to identify groups of asphalt pavers within the cohort with different diesel exhaust exposure. The technical report of this study (Boffetta et al. 2001) includes a table (8.9.4) of lung cancer rate ratios in relation to cumulative exposure to PAHs (as BaP). From this table, it was possible to estimate a URR in the method that was standard for our meta-analysis. The estimate [44.9; 95% confidence interval (CI), 25.0-64.8] is similar to the that of other asphalt worker studies included in this review, adding support to the hypothesis that risk per unit BaP is higher in this industry than in coke ovens or aluminum production. However, analysis of risk by quantitative estimate of PAH exposure was possible only for workers employed in paving (including mastic paving). It may be that other groups in the study (e.g., roofers) showed different patterns.

Interpretation for Risk Assessment
We have used a benchmark of 100 µg/m 3 years exposure to provide a scale for presenting the URR, but risk predictions at other exposures (x) can be made using the formula URR cum.exp=x = [URR cum.exp=100 ] (x/100) . [2] For example, relative risk consequent on exposure to 1 µg/m 3 for 40 years (40 µg/m 3 years) according to the mean estimate for coke ovens is 1.17 (40/100) = 1.06. (At these moderate to low relative risks, log-linear interpolation is close to linear interpolation.) Risk estimates calculated this way for a range of URRs and exposure concentrations are given in Table 5.
Overall or industry-specific means? The URRs overall had significant and substantial heterogeneity. There was evidence that risk per unit BaP varied across cohorts. The mean in the presence of this heterogeneity is a rather artificial one, reflecting those industries and cohorts that happen to have been studied. Within industries there was no significant heterogeneity, so that the industry-specific means could be interpreted as representative of each industry. These considerations favor use of industry-specific means. Means for coke ovens, gas works, and aluminum production are consistent and relatively precisely estimated. The combined mean URR for these industries was 1.17 (95% CI, 1.12-1.22) and might reasonably be used for all these industries. However, means for other industries are imprecise. Risk assessment for these industries will inevitably be uncertain, whether the imprecise industry-specific mean or the overall mean was used.
Model choice. Risk assessment depends on the form of the model, in particular for extrapolation of risk to exposure ranges far from those observed. We adopted the loglinear model because the linear model is not amenable to rigorous statistical evaluation; estimates and confidence intervals for means, and p-values for heterogeneity are unreliable. However, evidence suggests (Appendix C, Armstrong et al. 2002) that the linear model fits the data and arguments on mechanism better than the log-linear model. That the overall mean and broad pattern of URRs under the linear and log-linear models were similar is reassuring, but having model choice forced by statistical tractability is not ideal. The development of methods to allow better meta-analysis of linear relative risk models would be useful.
Apart from the log-linear and linear models, models with very different assumptions about increments at low exposures, such as threshold models, could predict very different risks at these levels. However, information was insufficient to fit these or other more elaborate models (e.g., two-stage, multistage) with the information published. In particular, lack of information precluded our investigating dependence of risk on timing or exposure beyond the cumulative exposure model, for example, risk eventually declining after exposure. The sensitivity analysis ( Table 4) investigating dependence of results on high exposures was suggestive of an exposure-response curve steeper at lower exposure than at higher exposure.
Attributable burden of disease. The number of cancers caused by occupational exposure to PAHs depends on three factors beyond the exposure-response relationship: a) the number of persons exposed; b) the levels at which they are exposed, and c) the background rate of lung cancer on which relative 3.14 risks will act. As an example, we have made an estimate of cases that would be caused in U.K. coke oven workers by PAH exposures continuing at current levels, ignoring probably higher past exposures. There are currently about a thousand coke oven workers in the United Kingdom, with mean exposure about 1.5 µg/m 3 BaP (Unwin J, personal communication). General population lifetime risk of lung cancer in U.K. males, using 1997 rates, is 8% (Office for National Statistics 2000).
Using the mean URR of 1.17 for coke ovens, 1 year of exposure will therefore lead eventually to a lifetime excess risk of 0.08 × (1.17 (1.5/100) -1) = 1.9 × 10 -4 , which among 1,000 workers will lead to 0.2 cases. Forty years of such exposure would lead to 40 × 0.2 = 8 cases.
Assessing risk in the general environment. Included cohorts were all occupationally exposed, and our study was aimed primarily at informing risk assessment in an occupational setting. However, given the limited number and informativeness of direct studies of risks from PAHs in the general ambient exposure, these data also provide a possible basis for estimating these risks. A full discussion is beyond the scope of this article, but we note that for this purpose our (occupational) ULRs would have to be converted to apply to continuous (24-hr, 365-day) exposure, such as with the assumptions of Gibbs discussed above.
Uncertainty. We have acknowledged many sources of uncertainty in risk estimates made from a summary URR. Many such sources, notably model choice and exposure uncertainty, are not incorporated in the confidence intervals, which should be regarded as lower bounds of uncertainty.

Methodological Lessons
Compared with their widespread use in clinical trials, meta-analyses are relatively new to occupational epidemiology, and even more rare in investigations of exposure-response relationships. In entering this poorly charted territory, this study presented several methodological challenges for which we found reasonable but ad hoc solutions. It might be useful to future similar meta-analyses for us to draw attention to the principle issues: • We needed to choose one contrast from each study from which to estimate an exposure-response relationship. To be objective, we selected a simple choice algorithm and explored sensitivity of results to it, but it may be that this procedure could be improved. • We needed to estimate mean exposure in upper-exposure groups for which only a lower limit was published. • A priori considerations and data in the meta-analysis studies suggested use of linear rather than log-linear models, but estimates of URRs from linear models proved intractable in meta-analysis, so we worked with log-linear models. It would be preferable not to have to compromise. One possibility would be to apply a random effects linear relative risk model to semiaggregated data (see below), but to our knowledge, such models have not been discussed in the statistical literature, nor can they be fitted with standard software. • We proceeded in this meta-analysis to estimate first a single effect measure (URR) from each study, then analyze these measures using standard meta-analytic methods.
However, it appears to us that once semiaggregated data have been assembled for cases, exposures, and relative risks in each exposure group in each study (Appendix E, Armstrong et al. 2002), it would be possible to use methods developed more generally for hierarchical data (multilevel models).

Conclusion
Considerable independent data are now available that allow us to conclude that occupational exposure to PAHs by inhalation is associated with a risk of lung cancer. For exposures in the coke ovens, gasworks, and aluminum industries, the risk can be estimated and is equivalent to a relative risk of 1.06 for a working lifetime at 1 µg/m 3 exposure to BaP. Exposures in other industries with PAH exposure, in particular carbon anode plants, asphalt use, and tar distilleries, suggest higher risks at equivalent BaP exposure, but the risk estimates are much less precise.