Models of interaction between metabolic genes and environmental exposure in cancer susceptibility.

Polymorphic metabolic genes that confer enhanced genetic susceptibility to the carcinogenic effects of certain environmental carcinogens act according to a type 2 interaction between genetic and environmental risk factors. This type of interaction, for which the gene has no effect on disease outcome by itself but only modifies the risk associated with exposure, must be treated differently from other types of gene-environment interaction. We present a method to analyze different dose effects often seen in studies involving these genes. We define a low exposure-gene effect, when a greater degree of gene-environment interaction appears at lower doses of exposure (the interaction follows an inverse dose function), and a converse high exposure-gene effect, when the interaction increases as a function of dose. Using a standard logistic regression model, we define a new term, alpha, that can be determined as a function of exposure dose in order to analyze epidemiological studies for the type of exposure-gene effect. These models are illustrated by the use of hypothetical case-control data as well as examples from the literature.

Key words: biomarkers, case-control, epidemiology, logistic regression, polymorphisms. Environ Health Perspect 106:67-70 (1998). [Online 21 January 1998] http://ehpnet1.niehs.nih.gov/docs/1998/106p67-70taioli/abstract.html
Both environmental and genetic factors have been identified as playing roles in the development of many chronic diseases including cancer (1-6). Despite the importance of environmental exposure in human cancer, the evidence for some form of genetic influence on almost all cancer etiology is very strong (7,8). Human subjects show not only a wide variation of genetic polymorphisms but also an extremely broad range of phenotypic responses to environmental stimuli. In this scenario, it is important to understand the type of interactions that occur between genetic and environmental factors.
With the development of new biomarkers of genetic cancer susceptibility, new paradigms for the classical terms of interaction and confounding that are appropriate for the several different mechanisms of gene-environment interaction are needed (9).
A number of genes involved in metabolism of carcinogens have been shown to play a role in cancer risk. In most cases, the putative biochemical mechanism by which such genetic factors exert their effects is fairly straightforward and is related to the actual dose of the active carcinogenic metabolite that reaches the genome in the target cell. By definition, these genes play a role in cancer risk only in the context of interaction with the environment because the substrates of their gene products are xenobiotic chemicals or their metabolites. This follows the type 2 form of gene-environment interaction (GEI) as previously described by Khoury (10,11) and Ottman (12). In this model of interaction, the cancer is caused by exposure to an environmental agent. If there is no exposure, the presence or absence of the genetic risk factor is irrelevant for disease causation. This model of GEI is the most suitable to explain human carcinogenesis related to metabolic susceptibility genes such as cytochrome P450 1A1 (CYP1A1), N-acetyltransferase (NAT), glutathione S-transferase (GST), etc.
When the dose of environmental exposure (such as smoking) is analyzed with respect to genotype of a metabolic susceptibility gene, two apparently divergent patterns are seen. The first instance could be termed the low exposure-gene (LEG) effect. This is seen when a decreasing degree of interaction occurs as a function of exposure. For example, in a case-control study, cases with the genetic risk factor (GRF) or polymorphism would tend to have lower exposure than cases without the GRF (13,14). If the endpoint is not cancer but, for example, some marker of exposure such as adducts, subjects with polymorphisms should tend to have higher relative levels of adducts at lower doses of exposure; at high exposures, no difference in endpoint would be observed between those with and without the GRF (15). The LEG effect is very often observed with cases of type 2 GEI such as cancer susceptibility metabolic genes and has been discussed in the context of these genes (16-18).
A high exposure-gene (HEG) effect is observed when there is an increased degree of interaction as a function of exposure dose. In a case-control study, cases with the GRF would have higher environmental exposure doses than cases without the GRF; in other words, the higher the dose, the greater the effect of having the GRF on any other endpoint (such as disease, adducts, etc.). This high exposure-gene effect has been seen with GSTM1 and lung cancer (19,20). However, a study of GSTM1 and asbestosis (21) showed a LEG effect, suggesting that this phenomenon is not simply gene specific, but must be related to the mechanism of action of the gene product leading to the endpoint being measured. It is important to understand that, for both types of dose-related gene-exposure interaction defined here, having the GRF is not protective, and persons with the GRF are at either equal or higher risk than those without the GRF. The difference between the HEG and LEG effects is whether the effect of increasing exposure dose magnifies or diminishes the relative risk associated with the GRF.
We present a method to identify these two patterns of dose effects and illustrate this method with data from a hypothetical case-control study of a metabolic gene polymorphism showing type 2 GEI, as well as with examples from the literature. The method is also able to distinguish between these effects and the presence of a true protective effect of the genetic variant.

Results and Discussion
A common way to describe interactions between the effects of an environmental agent and a genetic risk factor is to use both terms in a multiple regression model and to include a term that multiplies the GRF by the environmental agent. The coefficient of this interactive term then determines whether interaction is present:
$$G(Y) = a + b_e E + b_g G + b_{eg} EG, \tag{1}$$
where $Y$ is the odds of disease, $a$ is a constant, $E$ is the environmental exposure, $G$ is the GRF, and $EG$ is the interaction term. The coefficients $b_e$, $b_g$, and $b_{eg}$ are determined by regression analysis using an appropriate computer program. If we assume that in the absence of environmental exposure the presence of the GRF by itself has no effect on disease outcome (which is the definition of type 2 GEI), then $b_g$ in the regression model of Equation 1 is defined as 0, which leads to Equation 2:
$$G(Y) = a + b_e E + b_{eg} EG, \tag{2}$$
which can also be written
$$G(Y) = a + (b_e + b_{eg} G) E. \tag{3}$$
This expression corresponds to the assumption that the risk of disease is due to the action only of the environmental exposure and the only effect of the GRF is to modify the coefficient of the exposure term. Now we can say that the coefficient of $E$ is
$$b^{*} = b_e + b_{eg} G = b_e (1 + \alpha G), \tag{4}$$
where $\alpha = b_{eg}/b_e$.
In a case-control study, $\alpha$ is the ratio of two log odds ratios (ORs).
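As a worked illustration of this ratio, assume that $G(Y)$ in Equation 2 is the log odds of disease (the usual logistic link; the text does not state the link explicitly) and that $E$ and $G$ are coded 0/1. The symbols $\mathrm{OR}_{E}$ (exposed subjects without the GRF) and $\mathrm{OR}_{EG}$ (exposed subjects with the GRF), both taken relative to unexposed subjects, are introduced here only for illustration. Under those assumptions the term could be obtained from two odds ratios roughly as follows:

```latex
\begin{align*}
\ln \mathrm{OR}_{E}  &= b_e, \\
\ln \mathrm{OR}_{EG} &= b_e + b_{eg}, \\
\alpha = \frac{b_{eg}}{b_e}
  &= \frac{\ln\left(\mathrm{OR}_{EG}/\mathrm{OR}_{E}\right)}{\ln \mathrm{OR}_{E}}
   = \frac{\ln \mathrm{OR}_{EG}}{\ln \mathrm{OR}_{E}} - 1 .
\end{align*}
```

In this sketch, $\alpha > 0$ whenever $\mathrm{OR}_{EG} > \mathrm{OR}_{E} > 1$, and $\alpha = 0$ when the two odds ratios coincide.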
For example, let us assume that the effect of a GRF such as the isoleucine (Ile) to valine (Val) polymorphism in the CYP1A1 gene is to increase the enzymatic activity of the gene product, as has been shown (22-25). The result of having this GRF is an increased level of metabolism, presumably leading to an increased concentration of the ultimate carcinogen, given a particular exposure dose.
While this scenario may represent an oversimplification, it can be seen that the effect of the GRF is to quantitatively modify the effect of the exposure term. This would be reflected in a value greater than 0 for the term $\alpha$. If $\alpha$ is negative, the genetic factor should be protective. If the GRF has no effect on the exposure (e.g., if the exposure is to an agent that is not a substrate for the gene product), then $b_{eg} = 0$ and $\alpha = 0$. If the GRF is absent, $G = 0$. In either case, the risk is a function of exposure only, with no contribution from the gene.
We can rewrite the regression model of Equation 2 if there are data for the effects of multiple ($n$) levels of exposure (doses):
$$G(Y) = a + \sum_{i=1}^{n} \left( b_{e_i} E_i + b_{(eg)_i} E_i G \right), \tag{5}$$
where $b_{e_i}$ is $b_e$ for exposure level $E_i$, $b_{(eg)_i}$ is $b_{eg}$ for exposure level $E_i$, and $G$ is the GRF. Using this notation, $E_i$ is the dose and $b_g$ is defined as 0 and does not appear. For each dose level (apart from the reference), from Equation 4:
$$\alpha_i = b_{(eg)_i} / b_{e_i}. \tag{6}$$
If values of $\alpha$ are plotted against dose, several outcomes are possible. If the slope of this plot is positive, the gene-environment interaction follows a HEG effect; if the slope is negative, a LEG effect is operative. If $\alpha_i < 0$ and $b_{e_i} > 0$ at any particular dose level, the genetic factor is protective at that level. Such a scenario, whereby a particular genetic polymorphism may be a risk factor at one level of exposure but protective at a different level, is possible given the highly complex web of interconnecting metabolic pathways that usually operate in carcinogenic mechanisms.
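The decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of an ordinary least-squares slope to summarize the plot of alpha against dose, and the example coefficients are all hypothetical and are not taken from the paper or from any study.

```python
from statistics import mean


def alpha_by_level(b_e, b_eg):
    """Equation 6: alpha_i = b_(eg)i / b_ei for each non-reference dose level."""
    if len(b_e) != len(b_eg):
        raise ValueError("need one b_e and one b_eg per dose level")
    # A zero b_e at any level makes alpha undefined (ZeroDivisionError).
    return [beg / be for be, beg in zip(b_e, b_eg)]


def slope(x, y):
    """Ordinary least-squares slope of y against x (an assumed summary)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def classify(doses, b_e, b_eg):
    """Label the dose pattern from the sign of the slope of alpha vs. dose."""
    alphas = alpha_by_level(b_e, b_eg)
    s = slope(doses, alphas)
    pattern = "HEG" if s > 0 else "LEG" if s < 0 else "flat"
    # The GRF is protective at a level when alpha_i < 0 and b_ei > 0.
    protective = [a < 0 and be > 0 for a, be in zip(alphas, b_e)]
    return alphas, pattern, protective


# Hypothetical coefficients for three dose levels (not from any study).
doses = [1, 2, 3]
b_e = [0.5, 1.0, 1.5]
b_eg = [0.5, 0.6, 0.3]
alphas, pattern, protective = classify(doses, b_e, b_eg)
print(alphas, pattern, protective)
```

With only three or four dose levels the fitted slope is a coarse summary; in practice the uncertainty of each coefficient would also need to be considered before labeling a pattern.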
The risk of disease or any other endpoint is always a function of dose. Figure 1 illustrates a hypothetical dose-response relationship for a type 2 GEI. At low exposure levels (Area A in the figure), the curves with and without the GRF diverge as a function of increasing dose, showing an increased interaction (HEG effect). On the other hand, at high doses, near a putative saturation value (Area C), the curves converge toward saturation, showing a decreased interaction (LEG effect). The area in between, Area B, exhibits a mixed pattern. This model may or may not explain the observed LEG and HEG effects in type 2 GEI.
Since $\alpha$ is dependent on two odds ratios (or rate ratios) and each ratio is a function of the risk of disease at a particular dose, we can assume that $\alpha$ is some function of dose. The particular function is likely to vary for different specific chemical exposures. If the function of $\alpha$ with respect to dose is known, we can determine the form of the exposure-gene effect by the sign of the first derivative of $\alpha$ with respect to exposure, $d\alpha/dE$. Although several feasible dose-response models could be used to determine the function of $\alpha$ with respect to exposure level, there is no clear reason to choose any particular such model, given the current understanding. Therefore, rather than attempt to more precisely define $d\alpha/dE$, we have used hypothetical case-control studies representative of the various scenarios presented above.
Figure 1. Relationship between levels of exposure and endpoint, showing three regions with different exposure-gene effects. A, high exposure-gene effect; B, mixed effect; C, low exposure-gene effect; dotted line, genetic risk factor present; solid line, genetic risk factor absent. (Horizontal axis: exposure level.)
Hypothetical data. We have created a dataset using the EGRET package (Serc and Cytel, Seattle, WA), including 5,000 cases and 5,000 controls, in which the frequency of the genetic polymorphism among the controls was arbitrarily defined as 20% and the frequency of the levels of exposure (no exposure, low, medium, high) was arbitrarily set to 90%, 5%, 3%, and 2%, respectively. To show the HEG effect, we set the ORs of the three levels of exposure as 2.0, 3.0, and 4.0, the OR for the presence of the genetic polymorphism as 1.0 (as from the definition of GEI type 2), and the OR for the presence of the gene and the three levels of environmental exposure to 3.0, 6.0, and 12.0. To show the LEG response, we kept the same ORs of the three levels of exposure and for the presence of the genetic polymorphism, while the OR for the gene and the three levels of environmental exposure were set to 4.0, 5.0, and 6.0.
The results are presented in Tables 1 and 2. In Table 1 the value of $\alpha$ rose from 1.1 to 1.7, confirming a HEG effect; in Table 2, $\alpha$ decreased from 2.1 to 1.3, as hypothesized in the LEG effect.
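The direction of these trends can be checked against the design odds ratios themselves. The sketch below is an illustration, not the authors' computation: it assumes a logistic model with the GRF main effect fixed at 0, and it assumes that "the OR for the presence of the gene and the three levels of environmental exposure" is a joint OR relative to unexposed subjects without the GRF. The function name is hypothetical. Because it works from design values rather than from the simulated sample, and because the parameterization behind the tables is not given, its numbers are not expected to reproduce the values in Tables 1 and 2; only the direction of the trend is comparable.

```python
import math


def alphas_from_ors(or_exposure, or_joint):
    """alpha_i = ln(OR_joint_i / OR_exposure_i) / ln(OR_exposure_i).

    Assumes a logistic model with b_g = 0 (type 2 GEI) and joint ORs taken
    relative to unexposed subjects without the GRF.
    """
    return [math.log(j / e) / math.log(e) for e, j in zip(or_exposure, or_joint)]


# Design odds ratios quoted in the text for low, medium, and high exposure.
OR_EXPOSURE = [2.0, 3.0, 4.0]
OR_JOINT_HEG = [3.0, 6.0, 12.0]
OR_JOINT_LEG = [4.0, 5.0, 6.0]

heg = alphas_from_ors(OR_EXPOSURE, OR_JOINT_HEG)
leg = alphas_from_ors(OR_EXPOSURE, OR_JOINT_LEG)

print("HEG design alphas:", [round(a, 3) for a in heg])
print("LEG design alphas:", [round(a, 3) for a in leg])
```

Under these assumptions the HEG design yields values that rise with dose and the LEG design yields values that fall with dose, which agrees in direction with the trends reported for the two tables.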
Examples of type 2 GEI analyses. Both types of exposure-gene effects have been observed in several studies of genetic susceptibility genes that include information on exposure dose. Kihara et al. (20) illustrate a high exposure-gene effect. Table 3 shows the ORs for each category of smoking exposure and genotype. The first observation is that when the GRF is present but there is no exposure, the OR is equal to the reference (absence of both gene and exposure); this defines a type 2 interaction. As the exposure level increases, the risk of disease increases; the increase is higher when the GRF is present for each category of exposure level. Table 4 shows the coefficients ($b_e$, $b_{eg}$) obtained from the multiple logistic regression model using the SAS statistical package Genmod (SAS Institute, Cary, NC). Also shown in Table 4 are the values for $\alpha$, the interaction term, which increase directly as a function of dose. Thus, in this case there is a HEG effect. Similar results can be obtained using the data from other sources for this gene and smoking-related lung cancer (19).
An example of a LEG effect is seen for the CYP1A1 Ile to Val polymorphism in exon 7 of the gene as a GRF for smoking-induced lung cancer. The data from Nakachi et al. (26), shown in Tables 3 and 4, show that although the OR for cancer increases for both genotypes as a function of dose, the ratio of the risks decreases at higher doses.
The decrease of $\alpha$ with increasing exposures illustrates this. We have also observed a low exposure-gene effect for the association of the African-American-specific polymorphism in CYP1A1 with lung adenocarcinoma in smokers (14), and other groups have reported similar findings using adducts as an endpoint for NAT2 and for CYP1A2 [Landi et al., personal communication; (15)].
Metabolic genes that modify cancer susceptibility play no role in carcinogenesis in the absence of a relevant carcinogenic exposure, assuming that endogenous substrates are not involved. If they are, then the term exposure must be modified to include such agents. In the analysis of this GEI, defined as type 2, it is important to consider the exposure dose and two different forms of exposure-gene interaction. We have shown that the two forms of dose effect, the low exposure-gene effect and the high exposure-gene effect, may be analyzed and distinguished from each other and from other types of effects (such as protection). In our example, we have not addressed the question of whether multiplicative versus additive models of interaction should be used (27). For Areas A and C of Figure 1, a multiplicative model was the hypothesis underlying our analysis, and we have observed that use of an additive model has an effect only on the magnitude of the coefficients and $\alpha$ values but not on the direction of the dose-dependent effect of the gene (not shown). For Area B, this is not the case.
While at this point the biological mechanisms responsible for the two types of exposure effects are not known, it is possible to speculate that these effects may simply be a reflection of the shape of the dose-response curves for individuals with and without the GRF. If we assume that the effect of a genetic susceptibility factor is to increase the carcinogenic response at any particular dose (e.g., by causing increased enzymatic activity or by altering the metabolic profile of an agent), the dose-response curve will be shifted to the left. An important assumption is that the GRF has no effect on the maximal response. At low dose levels, the presence of the GRF will lead to an increase in the slope of the dose-response curve, and subjects with the GRF should respond more to higher doses of environmental agents than subjects without the GRF, leading to the observation of a HEG effect. It may be safely assumed that for any toxicological endpoint (including carcinogenesis) a saturating dose must always exist at which no further effect can be seen with increasing dose. At doses close to this maximum saturating level, the dose-response curve for subjects with the GRF will exhibit a decreased slope so that, although the overall response in GRF-positive individuals is higher than in GRF-negative individuals, the increase of the effect with increasing dose is lower in the positives than in the negatives. Therefore, subjects with the GRF may exhibit disease (or high adduct levels, etc.) at lower doses of environmental agent than those without the GRF (the LEG effect). This assumes that both exposure and the GRF have a positive influence on the endpoint. For situations in which either the exposure (such as a therapeutic or preventive agent) or the gene has a negative effect on the endpoint, the reverse would be observed.
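The left-shift argument can be made concrete with a toy calculation. The sketch below assumes a simple saturating curve, response = dose / (dose + K), with the same maximal response for both groups and a smaller K when the GRF is present. The curve shape, the values of K, and the function names are all hypothetical choices made for illustration; the paper does not commit to any particular dose-response model. The quantity tracked is the gap between the two curves on the response scale, which is not the same as $\alpha$ on the log-odds scale.

```python
import math


def response(dose, half_max_dose):
    """Hypothetical saturating dose-response curve with maximal response 1."""
    return dose / (dose + half_max_dose)


# Assumed half-maximal doses: the GRF shifts the curve to the left (lower K)
# without changing the maximal response.
K_NO_GRF = 10.0
K_GRF = 2.5


def gap(dose):
    """Excess response in GRF-positive over GRF-negative subjects at one dose."""
    return response(dose, K_GRF) - response(dose, K_NO_GRF)


doses = [0.5, 1, 2, 5, 10, 20, 50, 100]
for d in doses:
    print(f"dose={d:>5}: no GRF={response(d, K_NO_GRF):.3f}  "
          f"GRF={response(d, K_GRF):.3f}  gap={gap(d):.3f}")

# For this particular curve the gap is largest at sqrt(K_GRF * K_NO_GRF).
peak_dose = math.sqrt(K_GRF * K_NO_GRF)
print("gap peaks near dose", peak_dose)
```

In this toy model the gap between the curves widens with dose below the peak (the divergent region, analogous to Area A of Figure 1) and narrows above it as both curves approach saturation (the convergent region, analogous to Area C).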
According to this model, genes such as CYP1A1 show a low exposure-gene effect with respect to lung cancer because the doses of carcinogen present in cigarette smoke are so high. A low exposure-gene effect implies that for GRF-positive individuals, even a low carcinogen dose is highly risky; people carrying the polymorphism are at higher risk of cancer in comparison to members of the general population who are exposed to a carcinogen such as tobacco smoke. Only complete smoking cessation, as well as the avoidance of other relevant exposures, can lead to cancer prevention in the susceptible group.
This analytical approach may be used to determine whether a HEG or LEG effect is operative. Type 2 GEI is the correct model for a particular study if the odds ratio for the unexposed group with the GRF is close to 1. However, when the data are of insufficient statistical power to allow for an accurate determination of this odds ratio, an alternative is to apply knowledge regarding the mechanism of action of the gene related to the exposure to decide whether a type 2 interaction is logical within a mechanistic context. Comparison of $\alpha$ values between studies and meta-analysis of $\alpha$ for specific gene-exposure combinations may prove valuable in the future.
The detection of cancer genetic susceptibility has profound positive public health implications for cancer prevention. Detailed study of the interactions of these genes with environmental carcinogens holds the promise of allowing the definition of subsets of individuals of varying sensitivity and responsiveness. While the entire issue of genetic susceptibility differences among people raises important ethical, legal, and political questions, increased knowledge in this area (such as the specific form of exposure-gene effect discussed here) will provide benefits in helping to resolve them.