Computer-assisted mechanistic structure-activity studies: application to diverse classes of chemical carcinogens.

In the first part of this paper we have indicated how the techniques and capabilities of theoretical chemistry, together with experimental results, can be used in a mechanistic approach to structure-activity studies of toxicity. In the second part, we have illustrated how this computer-assisted approach has been used to identify and calculate causally related molecular indicators of relative carcinogenic activity in five classes of chemical carcinogens: polycyclic aromatic hydrocarbons and their methyl derivatives, aromatic amines, chloroethanes, chloroalkenes and dialkyl nitrosamines. In each class of chemicals studied, candidate molecular indicators have been identified that could be useful in predictive screening of unknown compounds. In addition, further insights into some mechanistic aspects of chemical carcinogenesis have been obtained. Finally, experiments have been suggested to both verify the usefulness of the indicators and test their mechanistic implications.


Overview
The evaluation of large numbers of diverse chemicals for potential toxic effects in an industrial or environmental setting is an enormous task. In conjunction with in vivo and in vitro testing, computational structure-activity studies can be very useful in such evaluations. Since all computational methods require at least an initial data set to test the reliability of selected molecular descriptors prior to their use in predictive screening, the question is how to optimize the potential symbiosis between experimental and computational risk assessment procedures.
As shown in Figure 1, a five-step protocol is suggested. The first step is to divide the chemicals into subgroups according to structure and functional groups. The next suggested step is to develop a systematic and self-consistent data base of toxic effects for representative members of each class of compounds. The more specific the biological activity tested, the more useful the tests will be in relating structure to activity.
As a third step, studies of what causes the observed toxic effects, i.e., mechanistic studies for a limited number of representative compounds of different classes of chemicals, would be of value. Mechanistic studies can be done using both experimental methods and the techniques of theoretical chemistry. Information coming from both types of studies can be very useful in selecting mechanistically relevant molecular descriptors for computation. While in-depth studies might appear to be relatively time-consuming, they should ultimately lead to more effective and rapidly converging risk assessment schemes.
*Life Sciences Division, SRI International, 333 Ravenswood Ave., Menlo Park, CA 94025.
The fourth step entails selection of molecular descriptors for evaluation. Differing guidelines for this selection distinguish classical QSAR (quantitative structure-activity relationships) from causally related approaches. In the classical approach, any molecular property that might correlate with activity is used. In the "causal" approach, molecular indicators are selected that might have a causal relationship, however tenuous, to the observed toxicity, based on prior theoretical or experimental studies.
In the fifth step suggested, each set of chosen candidate molecular descriptors is evaluated by some method that can include both experimental and theoretical determination, and their reliability as determinants of toxicity is tested. If successful, their use for large-scale predictive screening is indicated. If not, the choice of molecular indicators and/or the method of evaluation must be re-examined and refined, leading, it is hoped, ultimately to a useful set of descriptors.

[Figure 1. Suggested protocol. Box text recovered from the figure:]
PRELIMINARY DATA BASE: Select a limited number of compounds from each class for extensive series of tests; e.g., acute toxicities, mutagenicity/carcinogenicity, resistance to chemical and biological degradation.
MECHANISTIC STUDIES: Select a limited number of compounds which have been screened for different types of adverse effects to perform in-depth mechanistic studies relevant to each risk; e.g., by mechanisms of degradation.
SELECTION OF MOLECULAR DESCRIPTORS: Use the insights obtained from in-depth studies of limited numbers of compounds to identify and select a set of appropriate "molecular descriptors" with mechanistic relevance to each risk.
SCREENING: "Calculate" these descriptors for many compounds with known activity in each class by a variety of methods to test the ability of chosen descriptors to account for known behavior.

The techniques of theoretical chemistry can be particularly useful in three different steps of the process outlined in Figure 1: (1) elucidation of mechanisms (step 3); (2) selection of molecular descriptors mechanistically relevant to a specific risk (step 4); (3) calculation of selected molecular descriptors of potential use in screening large numbers of molecules (step 5). Figure 2 schematically indicates molecular events that could lead to a toxic response. Either parent compounds themselves or their chemically transformed products can be responsible for an adverse effect by interaction with tissue macromolecules. As further shown in Figure 2, such interactions, whether "physical," leading to reversible complex formation, or chemical, leading to adduct formation or transformation of tissue macromolecules, could play a key role in eliciting adverse effects.
In a parallel fashion, the capabilities of theoretical chemistry, embodied in a hierarchy of computer programs and methods, can be used to characterize these molecular events. In particular, they can be used to describe parent compounds or transformed products implicated in toxic responses, physical interactions (i.e., complex formation) between putative toxic agents and target molecules, and chemical reactions involved both in the formation of toxic species and in their interaction with tissue nucleophiles. Thus, computers are the "laboratory" where the properties of individual molecular systems and their physical and chemical interactions with biological targets can be modeled and calculated.
As shown in Table 1, the techniques used are not limited to one particular method. In fact, an extensive library of diverse methods is currently available with programs of varying degrees of sophistication. The particular theoretical method selected should depend on the property to be calculated and the degree of accuracy needed for its use as a reliable molecular indicator.

Table 1. Capabilities of specific methods of theoretical chemistry.

Empirical energy methods
  Rapid calculation of individual molecular conformations with or without geometry optimization
  Rapid characterization of energies and geometries of intermolecular complexes typical of physical interactions between toxic agent and tissue macromolecule
Semiempirical quantum chemical methods
  Calculate energy-conformation profiles of individual compounds to investigate conformational flexibility
  Calculate electronic properties of individual compounds
  Calculate chemical/biochemical reactivity parameters of individual compounds related to formation of toxic products or formation of adducts between toxic agents and tissue nucleophiles
  Model chemical/biochemical reactions involved in transformation to toxic species or in their adduct formation with tissue nucleophiles
  Model intermolecular complex formation, i.e., physical interactions with tissue components that could lead to adverse effects

Table 2 describes in more detail the specific capabilities of theoretical chemistry that can be used to calculate properties of putative toxic species related to their ability to physically and chemically interact with target tissue molecules. Table 3 summarizes the techniques of theoretical chemistry that can be used to characterize reversible "physical" interactions explicitly, i.e., complex formation with tissue macromolecules that can play a role in eliciting adverse effects. Table 4 further lists methods of theoretical chemistry that can be used to characterize chemical reactions involved in eliciting an adverse response in either of two ways: transformation of parent compounds to toxic forms and transformation of target tissue molecules by toxic species. From such information, chemical reactivity parameters for isolated reactants can be extracted and calculated for a large number of compounds.
In addition, intermediates can be identified and characterized that might be too transient to be examined experimentally but which could be implicated in toxicity or other adverse environmental effects.

[Table fragments recovered from the original: group or atomic electrophilicities and/or nucleophilicities; selectivity in transformation to specific intermediates and products; estimate of the extent and specificity of covalent adduct formation with tissue macromolecules.]

This section summarizes how the techniques of theoretical chemistry can be used in a mechanistic approach to predictive toxicology. As indicated, this mechanistic approach does not eliminate the need for experiments but, if successful, invests experimental data with increased usefulness and allows the development of a rapid screening procedure for toxic effects in unknown compounds. In subsequent sections, we review work on a variety of classes of chemical carcinogens as an example of this approach.

Applications: Mechanistic Structure-Activity Studies of Chemical Carcinogens

Introduction and Background
As shown in Table 5, a wide variety of chemical classes have been shown to exhibit mutagenic and carcinogenic activity by a number of assays, including in vivo animal testing, bacterial mutagenicity, cell culture transformations, and sedimentation analysis of DNA. In our laboratory, we have embarked on a systematic program of computer-assisted mechanistic structure-activity studies of a number of different classes of chemical carcinogens, divided as suggested in step 1 of the protocol into different chemical types.
In particular, we have studied five different classes: polycyclic aromatic hydrocarbons and their methyl derivatives, aromatic amines, aflatoxins, halohydrocarbons, and nitrosamines. We report here a summary of some examples of these studies.
Following step 2 of the protocol, an initial data base was selected for each of these classes of chemical carcinogens. Unfortunately, quantitative, self-consistent, standardized results of in vivo animal tests for carcinogenic activity were not available. Moreover, short-term in vitro assays were not suitable in some instances; for halohydrocarbons, for example, there is no correlation between bacterial mutagenic and mammalian carcinogenic activity. Thus, we chose, for this initial study, results of qualitative in vivo carcinogenic studies that allowed either a rank order of analogs to be established or, in the worst case, a bimodal active/inactive evaluation. More quantitative and consistent biological data would have made our search for reliable molecular indicators more effective. However, while the absence of such data did not allow a quantitative relationship to be established, it was possible, at least for some classes of chemical carcinogens, to identify useful qualitative molecular indicators of relative carcinogenic activity. Moreover, results obtained for each class led to additional mechanistic insights and suggestions for further experiments.
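One way to check a candidate descriptor against a qualitative rank order of analogs is a rank correlation. The sketch below is illustrative only: the descriptor values and potency ranks are hypothetical, and Spearman's coefficient is an assumed choice of statistic, not one named in this paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values: one calculated descriptor and an observed potency
# rank (5 = most potent) for five analogs of a single chemical class.
descriptor = [0.82, 0.61, 0.55, 0.40, 0.12]
potency_rank = [5, 4, 3, 2, 1]
rho = spearman(descriptor, potency_rank)
```

With only a bimodal active/inactive classification, the same descriptor could instead be compared between the two groups; a rank correlation needs at least a partial ordering of the analogs.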

Polycyclic Aromatic Hydrocarbons and Their Methyl Derivatives
Polycyclic aromatic hydrocarbons (PAHs) were selected for initial studies, since they have long been implicated as mutagens and carcinogens (1). They are widely distributed throughout the environment, being found in urban air (2), cigarette smoke (3) and foodstuffs (4), and pose an important health threat. As a result, much experimental and theoretical effort has been aimed at understanding the mechanisms by which they initiate carcinogenesis. Early work established that reactive intermediates capable of binding to DNA and other tissue nucleophiles are formed in PAH metabolism (5). Indirect evidence accumulated that PAHs require metabolic activation to become active carcinogens (6), but it was not until very recently that the complete sequence of intermediates between a parent PAH and DNA adduct was demonstrated (7). Based on this work and subsequent studies of benzo(a)pyrene (8), benz(a)anthracene (9), 7-methylbenz(a)anthracene (10), chrysene (11), dibenz(a,h)anthracene (12), and 3-methylcholanthrene (13,14), the concept of bay-region diol epoxides as proximate carcinogens was advanced. As shown in Figure 3, activation of a parent PAH to a bay-region diol epoxide involves three enzymatic reactions. The first is a cytochrome P-450-mediated epoxidation at the bond adjacent to the bay region [e.g., bond 3,4 in benz(a)anthracene], which can be called a distal bay-region bond. The second is an epoxide hydrase-catalyzed hydrolysis of the distal bay-region epoxide to a trans-dihydrodiol. The third is another P-450-mediated epoxidation yielding the bay-region tetrahydrodiol epoxide.
For the parent hydrocarbon, as well as the intermediate species, there are detoxification pathways competing with this activation pathway. The sum of the detoxification and activation processes determines the net amount of available diol epoxide. These species are thought to attack critical nucleophilic sites in DNA, either directly in an SN2 reaction (15) or, after forming a carbocation, in an SN1 reaction (16).
While every step involved in PAH activation and detoxification has not yet been totally elucidated, the significant progress outlined above makes it possible to use a mechanistic approach to structure-activity correlations for this class of carcinogens.
In addition to the widely accepted bay-region diol epoxide (BRDE) hypothesis (7-14,17-19), as shown in Figure 4, two other hypotheses were considered. One is the hypothesis that differential rates of transformation of the PAH to K-region oxides might relate to carcinogenic potencies (8). The other hypothesis investigated is based on an alternative activation scheme suggested by Cavalieri and co-workers (20,21). They envision PAH as undergoing one-electron oxidation, yielding a reactive radical-cation intermediate which acts as an ultimate carcinogen. Using each of these three hypotheses, we have identified and calculated relevant molecular properties for a series of 17 PAHs (Fig. 5). In addition, we have studied the effect of methyl groups on the carcinogenic activity of PAHs (Tables 6 and 7), considering the three hypotheses summarized in Figure 4, in studies of 14 methyl derivatives of benz(a)anthracene and 13 methyl and fluoro derivatives of chrysene. At least qualitative carcinogenic potency data exist for these PAHs (22), methylbenz(a)anthracenes (23,24), and chrysenes (25-28), making them suitable for an initial data set.
[Table footnotes recovered from the original: b, methylchrysenes; c, dimethylchrysenes; d, fluoro-5-methylchrysenes.]
In the studies reported here, three types of properties were calculated related to the ability of parent compounds to form each type of toxic species proposed: radical cations, K-region epoxides, and bay-region diol epoxides. In addition, properties related to the electrophilicity of each toxic species, such as charge distributions and stabilities of carbocations, were also calculated. The major results of these studies (29-33) are summarized below. Additional support was provided for the bay-region diol epoxide as the active carcinogen, in contrast to the K-region epoxide or radical cation of the parent PAH. A dominant mechanism was determined by which methyl substituents appear to alter the carcinogenic activity of parent PAH compounds such as benz[a]anthracene and chrysene. Activation by CH3 groups appears to occur mainly by the stabilization of the bay-region diol epoxide carbocations. A set of reactivity parameters of PAHs and their methyl derivatives was identified and calculated which can serve as predictors of their major metabolites and the extent of metabolism by the enzymes (cytochrome P-450) implicated in their transformations to carcinogenically active forms. A single molecular descriptor which, as shown in Table 8, is a reliable indicator of the relative carcinogenic activity of a series of 44 PAHs and their methyl derivatives was identified and calculated. This property is the calculated ease of formation of the carbocation of the bay-region diol epoxide. Having proven reliable for these known compounds, it can now be used in screening procedures to predict at least the presence or absence of carcinogenic activity in untested compounds. The major exceptions to the reliability of this indicator can be explained in terms of the steric effect of a methyl group adjacent to the bay region, the 5-position in benz[a]anthracene (BA) and the 12-position in chrysene.
Called the "peri" effect (19), it has been proposed to inactivate PAH by blocking distal bay-region diol formation, the second activating step which we have not modeled. However, the results of our studies offer an alternative explanation of the peri effect.
Energy/conformation calculations performed on the trans-3,4-diol of 5,12-dimethylbenz[a]anthracene (DMBA) to determine its minimum energy conformation indicate that the diol substituents have a different conformation from that in any other methylbenzanthracene. Although it appears minor, this change in conformation could prevent the diol from being further epoxidized at the 1,2-position; or, if the diol epoxide is formed, the altered conformation could sterically inhibit interaction of its carbocation with tissue nucleophiles.
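The carbocation indicator and the peri-methyl exception discussed above can be combined into a simple screening rule. The sketch below is a hypothetical illustration, not the procedure used in the studies: it assumes that "ease of formation" is measured as an energy difference between the carbocation and its diol epoxide, and the threshold, energies, units and compound names are all invented.

```python
from dataclasses import dataclass


@dataclass
class MethylPAH:
    name: str
    e_diol_epoxide: float  # calculated energy of the bay-region diol epoxide
    e_carbocation: float   # calculated energy of its carbocation
    peri_methyl: bool      # methyl group adjacent to the bay region?


def carbocation_formation_energy(compound):
    """Assumed indicator: a smaller energy difference means easier formation."""
    return compound.e_carbocation - compound.e_diol_epoxide


def predict_active(compound, threshold):
    """Flag a compound as predicted active.

    A peri-methyl group overrides the electronic indicator, reflecting the
    steric exception described in the text. The threshold is hypothetical
    and would have to be calibrated on compounds of known activity.
    """
    if compound.peri_methyl:
        return False
    return carbocation_formation_energy(compound) < threshold


# Hypothetical compounds and energies in arbitrary units.
candidates = [
    MethylPAH("compound_1", -100.0, -92.0, peri_methyl=False),
    MethylPAH("compound_2", -100.0, -88.0, peri_methyl=False),
    MethylPAH("compound_3", -100.0, -93.0, peri_methyl=True),
]
predictions = {c.name: predict_active(c, threshold=10.0) for c in candidates}
```

Treating the peri effect as an absolute override is a simplification; a graded steric correction to the carbocation stability would be an equally plausible reading of the text.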
In terms of predictive indicators, the steric effect of a peri-methyl group should be added to the calculated values of carbocation stabilities in screening unknown methyl PAH compounds for carcinogenic activity.

While neither ester precursors nor arylnitrenium ions of aromatic amines have been detected in vivo, these latter species have been proposed as the electrophilic ultimate carcinogen in amine carcinogenesis, primarily because of the detection of two types of covalently bound adducts of esters of polycyclic aromatic amines to nucleophilic sites in DNA and mononucleotides (34,35,38-40,56,57) and to tissue macromolecules (36,42,58). In both types of adducts (I,II) for guanine, nitrenium ions could be the precursor electrophile.
Thus the hypothesized mode of action of polycyclic aromatic amine (PAA) carcinogens involves their initial transformation to hydroxylamines by cytochrome P-450s and their ultimate conversion to electrophilic arylnitrenium ions which interact with key tissue nucleophiles. In addition, in common with PAHs, ring epoxidation could contribute other active forms, while ring C-hydroxylation leading to phenols could be a detoxification pathway.
With these hypotheses as a guide, we have used the techniques of theoretical chemistry to identify and calculate mechanistically relevant molecular properties which could be reliable indicators of the extent of carcinogenic activity in two different series of related PAA: substituted anilines (59) and the aromatic amines of different polycyclic hydrocarbons shown in Figure 7 (60,61).
The eight PAAs were a small, but particularly challenging, set of compounds to study. As shown by (+) and (-) in Figure 7, one of each pair of isomeric amines is an active carcinogen, while the other is inactive or of doubtful activity. Mutagenic potency data are also available for these eight compounds, but not all data were obtained with the same bacterial strain (62). For isomeric pairs, however, their qualitative carcinogenic potency obtained from animal testing (43) is consistent with mutagenic potency: the weaker mutagen is the inactive or more marginally active carcinogen. These subgroups make ideal tests of the ability of calculated electronic parameters alone to predict relative mutagenic activity, since effects such as transport and elimination should be more nearly the same for both isomers of a given pair than for the group as a whole.
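The isomer-pair test described above can be expressed as a concordance check: for each pair, does the more active isomer have the larger value of a given calculated parameter? The sketch below uses hypothetical compound labels and descriptor values, and assumes that a larger value is expected to go with higher activity.

```python
def pair_concordance(pairs, descriptor):
    """Fraction of isomer pairs in which the more active isomer has the
    larger descriptor value.

    pairs: list of (more_active, less_active) compound labels.
    descriptor: dict mapping compound label to a calculated value.
    """
    hits = 0
    for more_active, less_active in pairs:
        if descriptor[more_active] > descriptor[less_active]:
            hits += 1
    return hits / len(pairs)


# Hypothetical labels and values for four isomeric pairs.
pairs = [
    ("amine_A_active", "amine_A_inactive"),
    ("amine_B_active", "amine_B_inactive"),
    ("amine_C_active", "amine_C_inactive"),
    ("amine_D_active", "amine_D_inactive"),
]
descriptor = {
    "amine_A_active": 1.30, "amine_A_inactive": 1.10,
    "amine_B_active": 1.25, "amine_B_inactive": 1.20,
    "amine_C_active": 0.95, "amine_C_inactive": 1.05,
    "amine_D_active": 1.40, "amine_D_inactive": 1.15,
}
concordance = pair_concordance(pairs, descriptor)
```

Comparing within pairs rather than across the whole set follows the reasoning in the text: transport and elimination should be more nearly equal for two isomers than for the group as a whole.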
Electronic reactivity parameters relevant to the relative ease of metabolic transformation of each parent compound to hydroxylamines by cytochrome P-450, and also to other competing metabolic products involving ring epoxidation and hydroxylation, were calculated. The reactivity parameters were selected from the known aspects of the mechanisms of cytochrome P-450 oxidations as summarized in Figure 8. All oxidations by this enzyme system involve transfer of an electrophilic oxygen atom to the substrate, and three major metabolic transformations are possible for the PAAs: N-hydroxylation to form hydroxylamines, ring hydroxylation to form phenols and ring oxidation to form epoxides. Plausible mechanisms for these oxidations are also indicated in Figure 8. N-Hydroxylation could proceed by addition of the electrophilic oxygen to the lone pair of the nitrogen with rearrangement of the hydrogen to form the hydroxylamine. Direct phenol formation could occur by addition of the electrophilic oxygen to the π-orbital of the ring carbon to form a tetrahedral intermediate or transition state, followed by rearrangement of the hydrogen atom to form the phenol. Ring epoxidation could occur by direct addition of the electrophilic oxygen across the ring C=C bond and perpendicular to it. Consistent with these mechanisms, the nucleophilicity of the nitrogen atom is an appropriate biochemical reactivity parameter to monitor the extent of formation of hydroxylamine, and the nucleophilicity of the ring carbon atoms and ring π bonds should indicate the extent of formation and preferred sites of formation of phenol and epoxides. An appropriate measure of atom and π-bond nucleophilicity used was their superdelocalizability, which is an energy-weighted average of the electron density centered on a given atom (S_A) or bond [R_AB(π)].
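As a concrete illustration of an energy-weighted electron density, the sketch below evaluates a Fukui-type atomic superdelocalizability, S_r = 2 Σ_j c_jr² / λ_j summed over occupied orbitals, for the Hückel π system of butadiene. This is an assumed textbook form: the exact expression and the semiempirical wave functions used in the studies are not given here, and butadiene is chosen only because its Hückel orbitals are known in closed form.

```python
import math


def superdelocalizability(coeffs, lambdas, n_occupied):
    """Atomic superdelocalizability toward electrophilic attack.

    coeffs[j][r]: coefficient of atom r in molecular orbital j.
    lambdas[j]:   energy of orbital j written as alpha + lambdas[j] * beta,
                  so occupied bonding orbitals have lambda > 0.
    Returns S_r = 2 * sum_j c_jr**2 / lambda_j over the occupied orbitals.
    """
    n_atoms = len(coeffs[0])
    return [
        2.0 * sum(coeffs[j][r] ** 2 / lambdas[j] for j in range(n_occupied))
        for r in range(n_atoms)
    ]


# Occupied Hueckel pi orbitals of butadiene (closed-form textbook values).
s5 = math.sqrt(5.0)
a = math.sqrt((5.0 - s5) / 20.0)  # about 0.3717
b = math.sqrt((5.0 + s5) / 20.0)  # about 0.6015
coeffs = [
    [a, b, b, a],    # lowest pi orbital
    [b, a, -a, -b],  # highest occupied pi orbital
]
lambdas = [(1.0 + s5) / 2.0, (s5 - 1.0) / 2.0]  # about 1.618 and 0.618

S = superdelocalizability(coeffs, lambdas, n_occupied=2)
```

In this toy case the terminal carbons have the larger value, because the weakly bound highest occupied orbital, which is weighted most heavily, is concentrated on them.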
Parameters were also calculated relevant to the stability and electrophilicity of each arylnitrenium ion, the postulated ultimate carcinogen of PAA. Stabilities of arylnitrenium ions relative to sulfate esters (ΔE_NH+) were calculated. The relative electrophilicities of arylnitrenium ions were estimated using two electrophilic indices. One is a measure of incipient covalent adduct formation: the calculated electron density distribution on the nitrogen atom and the carbon (C_β) adjacent to it in the lowest empty molecular orbital (LEMO). These atoms act as electron acceptors in an incipient reaction with target nucleophiles to form adducts of type I and II, respectively. The other measure of electrophilicity used was the net calculated charge on each atom, a measure of the extent of electrostatic interactions.
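The orbital-density index can be sketched as a sum of squared LEMO coefficients over the basis functions centered on each atom. The example below is a toy three-center Hückel system with two electrons, not an arylnitrenium ion; it assumes a closed-shell molecule and an orthonormal basis, and the atom labels and function name are hypothetical.

```python
import math


def lemo_density_by_atom(energies, coeffs, n_electrons, atom_of_basis):
    """Electron-acceptor density per atom in the lowest empty orbital.

    energies[j]:       energy of molecular orbital j.
    coeffs[j][mu]:     coefficient of basis function mu in orbital j.
    n_electrons:       number of electrons (closed shell assumed).
    atom_of_basis[mu]: label of the atom carrying basis function mu.
    """
    order = sorted(range(len(energies)), key=lambda j: energies[j])
    lemo = order[n_electrons // 2]  # first orbital above the doubly occupied set
    density = {}
    for mu, c in enumerate(coeffs[lemo]):
        atom = atom_of_basis[mu]
        density[atom] = density.get(atom, 0.0) + c * c
    return density


# Toy three-center pi system; energies in units of |beta| relative to alpha.
r = 1.0 / math.sqrt(2.0)
energies = [-math.sqrt(2.0), 0.0, math.sqrt(2.0)]
coeffs = [
    [0.5, r, 0.5],   # bonding
    [r, 0.0, -r],    # nonbonding: the LEMO when only two electrons are present
    [0.5, -r, 0.5],  # antibonding
]
atom_of_basis = ["atom_1", "atom_2", "atom_3"]
density = lemo_density_by_atom(energies, coeffs, 2, atom_of_basis)
```

Here the empty orbital is localized on the two end atoms, which would therefore be the preferred acceptor sites by this overlap criterion.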
Finally, we have explicitly characterized the energetics and geometries of intermolecular complexes that could be precursors to adducts of type I and II for adenine and guanine in order to help understand why type I adducts are favored and why guanine is a preferred site of attack over adenine as a target nucleophile (38,39,56,63-68).
In characterizing carcinogen-base complexes, the nitrenium ion of 2-aminonaphthalene was used as a prototype ultimate carcinogen. Coplanar complexes and stacked complexes corresponding to intercalation, which could lead to adducts of type I and II, respectively, were considered in detail. Intermolecular energy calculations by both quantum mechanical (69) and empirical energy methods (70) were made of arylnitrenium complexes with each base, and the steric feasibilities of forming the most energetically favored complexes were examined graphically.
Molecular Indicators of Relative Mutagenicity/Carcinogenicity. Table 9 summarizes the calculated parameters which should be most relevant as molecular indicators of observed carcinogenic and mutagenic potencies of the eight PAAs studied. Comparing the results for pairs of isomers, in each case the value of S_N(π), chosen as an indicator of the extent of formation of hydroxylamine from parent compounds, is larger for the more potent mutagen/carcinogen. Also, the less potent isomer in each case has the ring carbon which is most reactive to direct phenol formation, i.e., the larger value of S_C,max(π). Thus, direct phenol formation appears to be an effective detoxification pathway. Ring epoxidation, R_AB(π), on the other hand, appears to be more activating than detoxifying, since some direct relationship is observed between mutagenicity and π-bond reactivity. Taken together, the results imply that competition between ring hydroxylation (detoxification) and N-hydroxylation (activation) might be a crucial factor in determining mutagenicity/carcinogenicity. The importance of such metabolic factors is suggested by the fact that 1-hydroxylnaphthylamine is a more potent mutagen than the 2-hydroxylamine, while the parent compound activities are reversed. Thus, the inactivity of the parent compound could be due to competing ring phenol formation, particularly in the 4-position, thereby preventing formation of significant amounts of the 1-hydroxylamine.

[Table 9 footnotes recovered from the original; the table body is not reproduced here:]
d Units of all nucleophilic superdelocalizabilities, S_A,B,C,N(π) and R_AB(π), are in millielectron charge/V.
e The maximum π superdelocalizability on any carbon atom in the molecule, neglecting the carbons ortho to the amine group.
f Reactivity of the most reactive bond in the corresponding aromatic hydrocarbon.
g Stability of arylnitrenium ion relative to the sulfate ester of the N-hydroxylamine. The larger the number, the less stable the ion.
h Electron density on nitrogen atom and C_β in lowest energy empty molecular orbital of the arylnitrenium ion. This quantity is a measure of the electrophilicity of the nitrogen in incipient covalent interactions.
Considering the properties of the postulated intermediates, neither hydroxylamines nor sulfate esters appear to be electrophilic by either a charge or an overlap criterion. This is in contrast to results for the postulated ultimate carcinogen, the arylnitrenium ion, where the calculated quantities ΔE_NH+, P_N and P_C revealed electrophilic activity.
For all the amines studied, the nitrogen atom and the ring carbon atom adjacent to it (C_β) were the most electrophilic sites by the "overlap criterion" of having the largest electron density (P_N and P_C) in the lowest unoccupied molecular orbital (LUMO). This result is consistent with the identification of adducts of guanine involving the nitrogen and the β carbon atom of aromatic amines. However, the more stable the arylnitrenium ion (ΔE_NH+), the less electrophilic these two atoms are in terms of either their net charge or their extent of participation in the lowest empty molecular orbital. Such an inverse correlation is reasonable, since the delocalization which leads to stabilization of the cation also leads to less charge or electron density localization on the nitrogen atom and the ring carbon atoms adjacent to it. This balance of effects between ΔE_NH+ and P_N, P_C can explain why the three-ring aromatic system (2-aminoanthracene), with intermediate values of both arylnitrenium ion stability and electrophilicity, is the most potent mutagen. Between isomers, the more electrophilic the arylnitrenium ion, the more potent the parent amine activity. However, all arylnitrenium ions might be electrophilic enough to interact with DNA if formed to an appreciable extent. In summary then, five calculated properties can be used as reliable indicators of relative carcinogenic/mutagenic potencies of polycyclic aromatic amines, to be further tested in predictive screening. Three of these are measures of metabolic transformations: two to active arylnitrenium metabolites [S_N(π) and ΔE_NH+] and one to inactive hydroxyl metabolites [S_C(π)]. The other two are measures of covalent adduct formation (electrophilicity) of the arylnitrenium ions (P_N and P_C in the LEMO).
In a similar study of a series of aniline derivatives (60), two mechanistically relevant properties of substituted anilines were identified and calculated that correlated with their activity and hence could be used as predictors in screening untested aniline derivatives for the presence or absence of activity. One property is a measure of the extent of metabolism to hydroxylamines and the other of the ease of formation of the arylnitrenium ion.
Mechanistic Inferences of Results of Studies of Aromatic Amines. Further verification was obtained that arylnitrenium ions formed by N-hydroxylation and active ester formation are the likely ultimate carcinogens of this class of compounds. The results indicate that the carcinogenic activity of parent PAA could be due to a balance of the stability and electrophilicity of the arylnitrenium ions formed from them. They also indicate, for the first time, that detoxification pathways such as ring hydroxylations might play a major role in determining relative carcinogenic potencies of aromatic amines and could account for the reversal of relative activities between parent amines and their synthetically produced hydroxylamines. Features of the interactions with DNA that determine the observed specificity of adduct formation of aromatic amines were also calculated and identified: guanine is attacked preferentially to adenine, and in guanine the C8 position is preferred over the exocyclic amine group at C2.
Guanine prefers to form coplanar complexes which can be direct precursors to covalent bond formation between C8 and the nitrogen atoms of the nitrenium ion, while adenine prefers stacked complexes which could lead less readily to this type of adduct. If a coplanar approach in the major groove of the DNA helix is involved in the formation of the major adduct, the intermediate formed by guanine is more stable than that formed by adenine, implying a lower energy transition state. Our results strongly imply that it is this difference in preferred behavior, coplanar complexes with guanine versus stacked complexes with adenine, that accounts for the observed preference for C8-N adduct formation with guanine. This prediction could be tested by experimentally measuring the extent of intercalation of arylnitrenium ions into poly-A and poly-G by measuring unwinding angle and association constants relative, for example, to the known behavior of ethidium bromide (71). Such measurements would further test the methodology employed here and could corroborate the implications of these calculations for the mechanism of arylamine carcinogenesis/mutagenesis.

Ethylene Chlorides and Chloroethanes
As suggested in steps 2 and 3 of the proposed protocol, the related classes of halohydrocarbons shown in Figure 9, ethylene chlorides and chloroethanes, were selected for study because they are widely used in industry, and because a number of analogs such as vinyl chloride (72-74) and 1,2-dichloroethane (75,76) have been shown to cause cancer in laboratory animals and are implicated in increased tumor incidence and mortality in humans as well (77,78). In addition, mechanistic studies were available. Specifically, six chloroethanes have been shown to produce tumors in experimental animals following both long-term feeding and inhalation routes of exposure (75,76). From the results obtained from feeding studies for hepatocellular carcinomas in mice, the rank order of carcinogenic activity (as a percentage of female mice with tumors from low dose) is: 1,1,1,2,2 > 1,1,2,2 > 1,1,2 > 1,1 > 1,2 > 1,1,1.
While much less extensive and consistent data exist for the ethylene chlorides, available data (79-82) allow a tentative rank order of carcinogenic potency for four of these: vinyl chloride (chloroethylene) > vinylidene chloride (1,1-dichloroethylene) > tetrachloroethylene > trichloroethylene, with no studies reported for cis- and trans-1,2-dichloroethylene.
Also of significance is that, while some aliphatic halohydrocarbons have been shown to be weakly active as bacterial mutagens, in general this short-term assay fails to correlate with carcinogenic activity of this class of compounds in susceptible mammalian species (83,84). It is thus particularly important to explore other means of evaluating halohydrocarbons for possible carcinogenic activity as an alternative to costly and time-consuming animal tests.
The studies we report here use semiempirical molecular orbital methods embodied in large-scale computer programs, together with postulated mechanisms of action for these two classes of compounds, to identify and calculate molecular properties for each class that could be reliable indicators of their relative carcinogenic activity in susceptible species and hence be useful in computer-assisted predictive screening of the behavior of unknown haloalkanes and haloalkenes.
Choice of Molecular Indicators of Relative Carcinogenic Activity of Chloroethanes and Chloroethylenes. Following step 4 of the suggested protocol (Fig. 1), fundamental molecular properties that might be reliable indicators of the relative carcinogenic activity of these two classes of compounds were selected on the basis of theoretical (85,86) and experimental (87-89) studies of model cytochrome P-450 oxidations and of known experimental studies of the metabolism (90-108) and adduct formation (109-118) of ethylene chlorides and chloroethanes. Based on this knowledge, the following assumptions were made and used to choose causally related properties: (1) P-450 oxidation is the initial step in enzymatic transformations leading to the active carcinogenic forms. (2) P-450 oxidation occurs in two steps by a radical mechanism or, less likely, in one step by a nonradical mechanism. (3) Subsequent nonenzymatic transformations occur as outlined in Figures 10 and 11 for ethylene chlorides and in Figures 12 and 13 for chloroethanes. (4) Acyl chlorides, chloroaldehydes and, less likely, chloroepoxides are the active carcinogenic forms of both classes of halohydrocarbons and act as electrophiles in forming the adducts shown in Figure 14 with nucleophilic sites of DNA bases. (5) Formation of such adducts is an important initiating step in tumor formation. (6) Properties related to the extent of metabolism of the parent compound to the active form and to the electrophilicity of the putative ultimate carcinogen could be reliable indicators of their relative carcinogenic behavior.
Properties Relevant to Extent of Transformation of Chloroethylenes and Chloroethanes to Putative Active Forms. In order to select molecular properties of parent compounds and intermediates that could serve as indicators of the extent of transformation of chloroethylenes and chloroethanes to active forms, i.e., acyl chlorides, chloroaldehydes, and chloroepoxides, we have used the main features of oxidative metabolism postulated for these two classes of compounds as summarized in Figures 10-13. The properties chosen to monitor the extent of metabolism of the chloroethanes, shown in Figure 13, are thermodynamic criteria for model reactions of their transformation to active forms. Differences in transport or steric factors at the enzyme active site that influence this transformation are not included.
Specifically, both theoretical and experimental evidence indicates that aliphatic hydroxylation occurs by a radical two-step process (steps A1 and A2 in Fig. 13) involving H atom abstraction (85-89). Thus, we have calculated and compared the enthalpies of reactions (1) and (2), ΔHR(1) and ΔHR(2), for the eight parent chloroethanes as a possible indicator of their relative extent of hydroxylation by cytochrome P-450s.
There is, however, still the possibility that P-450-mediated hydroxylation proceeds by a nonradical, closed-shell mechanism involving a singlet oxygen. We have therefore calculated the enthalpy of this reaction, ΔHR(3) (Fig. 13), as another possible measure of the relative extent of hydroxylation of the parent compounds by cytochrome P-450 to form chloroethanols.
Nonenzymatic loss of HCl by the chloroethanols to form aldehydes and acyl chlorides is the postulated transformation to these putative ultimate carcinogens. Thus, differences in the calculated heats of formation, ΔHR(4), of the alcohols and aldehyde products were used as a measure of the relative ease of formation of the aldehydes. These quantities were determined for each possible chloroalcohol and aldehyde formed from each parent compound.
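The thermodynamic criteria described above all reduce to differences in calculated heats of formation between the products and reactants of a model reaction. A minimal sketch of that bookkeeping is given below. The function names, species labels, and numerical values are hypothetical placeholders introduced only for illustration; they are not MNDO results from this study, and the actual model reactions are those defined in Figure 13.

```python
from typing import Dict, List


def reaction_enthalpy(heats_of_formation: Dict[str, float],
                      reactants: List[str],
                      products: List[str]) -> float:
    """Return sum(Hf of products) - sum(Hf of reactants), in the input units."""
    h_products = sum(heats_of_formation[name] for name in products)
    h_reactants = sum(heats_of_formation[name] for name in reactants)
    return h_products - h_reactants


def relative_enthalpies(values: Dict[str, float]) -> Dict[str, float]:
    """Express each enthalpy relative to the most favorable (lowest) one."""
    best = min(values.values())
    return {name: value - best for name, value in values.items()}


if __name__ == "__main__":
    # Placeholder heats of formation (kcal/mol); NOT values from the paper.
    hf = {"RH": -10.0, "O": 60.0, "R": 20.0, "OH": 9.0}
    # Assumed form of an H-abstraction step: RH + O -> R + OH.
    dh = reaction_enthalpy(hf, reactants=["RH", "O"], products=["R", "OH"])
    print("model reaction enthalpy:", dh)
    # Compare two hypothetical parent compounds on a relative scale.
    print(relative_enthalpies({"compound_a": dh, "compound_b": -18.5}))
```

Reporting values relative to the most favorable compound mirrors the way the tabulated quantities are described in the table footnotes.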
The fully chlorinated parent compounds were not included in these studies, since evidence indicates they are transformed by reductive rather than oxidative P-450 metabolism (147,148).
Using vinyl chloride as an example, Figure 11A shows the proposed pathway (85-89) to direct formation of acyl chlorides and chloroaldehydes from parent chloroethylene compounds via P-450 oxidation. For all chloroethylenes, differences in heats of formation were calculated between all proposed species in each step in this pathway (Fig. 11A).
In addition to aldehyde formation, initial oxidation of the parent chloroethylenes can lead to epoxide formation (Fig. 10). Figure 11B indicates the assumed pathway for this transformation involving triplet/singlet crossing and ring closing of the initial triplet radical. Again, differences in heats of formation were calculated for each step in this pathway for each parent compound.
Chloroepoxides are readily transformed to acyl chlorides and chloroaldehydes in aqueous media (109). Figure 11C shows a plausible mechanism for such a transformation involving proton-assisted ring opening and isomerization of the carbocation by 1,2 H or Cl ion shifts. Again, relative heats of reaction for all of these steps in the transformation of epoxides to aldehydes were calculated as an alternative route to direct formation of such putative ultimate carcinogens. Calculated values of all thermodynamic quantities were compared with the relative extent of metabolism and rank order of carcinogenic activity for all parent compounds.
Electrophilic Properties of Putative Active Carcinogens. Both the chloroepoxides and aldehydes are electrophilic enough to form adducts with DNA, and evidence for such adducts comes from extensive in vitro and in vivo studies with vinyl chloride (114-117). Two types of DNA adducts (Fig. 14) have been reported: adducts formed with vinyl chloride and DNA from a variety of species (113-115), and an oxoethyldeoxyguanosine most recently found in animals exposed to vinyl chloride (116). As shown in Figure 14, the nature of these adducts provides direct evidence that chloroaldehyde is the adduct-forming metabolite. However, as also shown in Figure 14, the N7-adduct of deoxyguanosine could also be formed by a carbocation of the epoxide which attacks N7 with loss of HCl from the adjacent carbon. If formation of the adducts shown in Figure 14 is important in tumor formation, then acyl chlorides and chloroaldehydes are likely active forms of ethylene chlorides, with or without the intermediacy of epoxides. They are also very plausible candidate ultimate carcinogens for saturated halohydrocarbons. Thus, both classes of halohydrocarbons appear to be transformed to the same possible active carcinogenic forms, acyl chlorides and chloroaldehydes, via initial oxidation by cytochrome P-450, but by different subsequent steps.
Two types of properties were chosen as indicators of electrophilic activity of acyl chlorides and chloroaldehydes, one relevant to electrophilicity in incipient covalent bond formation and the other to more long-range electrostatic interactions (150-153).
In incipient covalent bond formation, the lowest unoccupied molecular orbital (LUMO) of the electrophile functions as the acceptor of electron density from the attacking nucleophile centers of the DNA bases. The energy of this orbital was thus chosen as a good indicator of electron accepting ability. The lower the energy, the more facile the electron transfer. In addition, the more electron density the two carbon atoms have in this orbital, the more they function as localized electrophilic centers in covalent bond formation.
The net charges on the two carbon atoms were chosen as monitors of their electrostatic electrophilicity. The larger the positive charge on these atoms, the greater their long-range attraction to the electron-rich nitrogen atoms of DNA bases.
An additional requirement could be that the Cα carbon must have at least one halogen atom, since in an SN2 displacement a good leaving group like Cl- facilitates attachment of the incoming nucleophile.
To examine the possibility that carbocations of epoxides could also be active carcinogenic forms of the ethylene chlorides, the energy of LUMO and the net charge on the cationic carbon atom of these species were also tabulated.
These calculated electrophilic properties of the putative ultimate carcinogens, together with those that indicate the extent of metabolism of the parent compounds, provided a set of molecular properties which could successfully be used as reliable indicators of rank order of carcinogenic activity of the chloroethanes and chloroethylenes.
All calculations made in this study were performed using an all-valence semiempirical molecular orbital method called MNDO (modified neglect of diatomic overlap), introduced in 1977 by Dewar and Thiel (154-156).
Summary of Main Conclusions of Chloroethane Study. Molecular determinants of the relative metabolism of chloroethanes are given in Table 10. Within the uncertainties of both experimental and theoretical quantities, then, these quantities are promising predictors of the extent of overall metabolism of aliphatic chlorinated hydrocarbons. For example, from calculated values for two compounds which have not been studied, as shown in Table 10, we would predict the metabolism of 1,1-dichloroethane to be at least as extensive as that of pentachloroethane and the metabolism of monochloroethane to be comparable to or somewhat less than that of 1,1,1-trichloroethane.
The combined results shown in Table 10 support the hypothesis that aliphatic hydroxylation by cytochrome P-450s occurs by a radical (rather than singlet) oxene mechanism through the intermediacy of aliphatic hydrocarbon radical formation. Ease of aldehyde formation by loss of HCl from the alcohol is an additional factor in determining the overall extent of metabolism of the parent compound. Table 11 lists all the parent chloroethanes in order of the energy of the lowest empty orbital of the putative ultimate carcinogen (aldehyde) they form. As seen in Table 11, within the uncertainties of the experimental data, this order corresponds closely to the rank order of carcinogenic activity of the six known parent compounds. Another good indicator of carcinogenic activity is the net charge on the Cα carbon of the aldehyde.
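The statement that one ordering "corresponds closely" to another can be made quantitative with a rank correlation coefficient. The sketch below uses the Spearman coefficient, which is our choice for illustration and not a statistic reported in this study. The implementation assumes no tied values, and the example numbers are hypothetical placeholders rather than entries from Table 11.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation, 1 - 6*sum(d^2) / (n*(n^2 - 1)), no ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    # Placeholder data: LUMO energies (eV) and tumor incidence (%).
    lumo_energy = [-1.0, -0.7, -0.4, -0.1]
    tumor_incidence = [80.0, 60.0, 30.0, 5.0]
    # A value near -1 means lower LUMO energy goes with higher activity.
    print(spearman_rho(lumo_energy, tumor_incidence))
```

With only six tested compounds, such a coefficient would carry large uncertainty and is best read as a summary of the ordering, not as a significance test.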
Finally, while electrophilic properties of the putative ultimate carcinogen alone appear to be able to predict relative order of carcinogenicity, the relative stability of the aldehyde with respect to the alcohols, Δ[ΔHR(4)], appears to be a contributing factor in determining rank order of carcinogenic activity.
Table footnotes: For definitions see Fig. 13. Eπ*LUMO = energy of the lowest unoccupied molecular orbital in eV; the LUMO is a π* orbital. qCα = Mulliken net charge on the α-carbon. Δ(ΔHiep) = isomerization energy of carbocations of epoxides. All quantities are relative to the most favorable enthalpy differences.
The calculated values of Eπ*LUMO could at least be a useful binary screening parameter for the carcinogenic activity of saturated halohydrocarbons. Of the tested compounds, those with Eπ*LUMO less than -0.8 eV are potent carcinogens, while those with values above -0.15 eV have little or no activity. Thus there may be a threshold value of Eπ*LUMO below which a compound can be considered a carcinogen. On this basis, 1,1,1,2-tetrachloroethane would be predicted to be an active carcinogen similar in potency to 1,1,2,2-tetrachloroethane, while monochloroethane is predicted to be inactive.
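The binary screen suggested above can be written as a simple rule. The sketch below takes the two bounds quoted in the text (-0.8 eV and -0.15 eV) and, as an assumption of ours, labels values falling between them as indeterminate, since the text does not place the threshold within that interval. The function and constant names are hypothetical.

```python
# Bounds quoted in the text for the aldehyde pi* LUMO energy, in eV.
POTENT_BELOW_EV = -0.8
INACTIVE_ABOVE_EV = -0.15


def screen_by_lumo(e_lumo_ev: float) -> str:
    """Classify a compound from the LUMO energy of its putative aldehyde."""
    if e_lumo_ev < POTENT_BELOW_EV:
        return "predicted active"
    if e_lumo_ev > INACTIVE_ABOVE_EV:
        return "predicted inactive"
    # Assumption: the text leaves the interval between the bounds open.
    return "indeterminate"


if __name__ == "__main__":
    # Placeholder energies, not calculated values from Table 11.
    for energy in (-1.0, -0.5, -0.1):
        print(energy, screen_by_lumo(energy))
```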
While there is no direct evidence for the nature of the active carcinogenic form of chloroethanes, our results strongly suggest that the acyl chlorides and chloroaldehydes assumed by us to be important are indeed the activated forms of these parent saturated halohydrocarbons.
Summary of Main Conclusions of Chloroethylene Study. Of all the thermodynamic quantities calculated for these complex transformations of chloroethylenes (Fig. 11), only the three given in Table 12 correlate with their rank order of metabolism. Given in Table 12 are: the relative enthalpies of isomerization of the triplet biradicals [Δ(ΔHir)], leading directly to carbonyl products from the initial products of oxidation of parent compounds by cytochrome P-450s; the stabilities of epoxide intermediates relative to each parent compound [Δ(ΔHep)]; and the relative enthalpies of isomerization of these epoxides [Δ(ΔHiep)], an alternative pathway leading to formation of carbonyl products. Consistently, the energetics of these steps also appear to be most involved in determining product specificities.
The results obtained are consistent with evidence from observed secondary isotope effects (89) that P-450 epoxidations occur by a radical two-step mechanism and that the second step, ring closing, is rate-determining. Our results suggest further that isomerization to aldehyde-type products is a determining step in extent of metabolism of ethylene chlorides, both with and without the intermediacy of epoxide formation, and that different products can be selectively formed by these two alternative routes.
As shown in Table 13, if carbonyl compounds are ultimate carcinogens, the enthalpies of isomerization of both the initial triplet radical of P-450 oxidation [Δ(ΔHir)] and of epoxide intermediates [Δ(ΔHiep)] to form the carbonyl products were found to be promising indicators of carcinogenic activity. As also shown in Table 11, no relationship was obtained between any measure of the electrophilicity of the carbonyl compounds that could be formed from the parent compound and their carcinogenic activity.
If epoxides can be involved as intermediates in the formation of carbonyl compounds, it is also possible that they can be active as ultimate carcinogens. As shown in Table 14, there is some relationship between the relative carcinogenic activity of the parent compounds and both the calculated net charge on the cationic carbon of the ion of each epoxide and the relative stability of the epoxide (ΔHep) from which these carbocations are formed.
In screening of unknown compounds with haloethylene functional groups, all four promising indicators of carcinogenic activity identified here (i.e., isomerization energies with and without epoxide intermediates, stability of epoxides relative to parent compound, and the net charge on the epoxide carbocation) should be calculated to further sort out their relative importance and predictive capabilities.
For example, using the criteria corresponding to all three modes of activation, we predict that 1,2-dichloroethylene, which has not yet been studied, will be a carcinogen with an activity intermediate between vinylidene chloride and tetrachloroethylene. This prediction remains to be verified.
For the chloroethylenes, three possible modes of transformation to active carcinogens could be important: formation of carbonyl products directly or via epoxide intermediates, and formation of carbocations from the epoxides. If carbonyl products are the ultimate carcinogens, then our results imply that their extent of formation by isomerization from primary products of P-450 oxidation, rather than their electrophilicity, is the discriminating factor in determining the relative carcinogenic potency of the parent compounds.
On the other hand, if epoxides themselves can act as ultimate carcinogens without isomerization to carbonyl compounds, then the electrophilicity of their carbocation is a good indicator of relative parent compound activity.

Alkylnitrosamines
Alkylnitrosamines (general structure RR'N-N=O) are a ubiquitous class of chemical carcinogens that can be formed in situ by the in vivo reaction of nitrites with amines (157). In common with the chemical carcinogens described above (polycyclic aromatic hydrocarbons, aromatic amines, chloroethanes, and chloroethylenes), the alkylnitrosamines require multiple transformations to their active form. Evidence supporting this conclusion, particularly for dimethylnitrosamine (DMN), the most studied analog, is substantial (158-163). While studies of other longer chain dialkylnitrosamines are less extensive, additional evidence exists that they also require transformation to active carcinogenic species (162,164). Three types of active species have been implicated in DNA adduct formation: alkyl diazohydroxides, diazocarbonium ions and alkyl carbocations. While evidence of covalent adduct formation is extensive (165-168), the electrophilic species forming such adducts have not been definitively identified. Our studies of nitrosamines are of two types: further elucidation of mechanisms of formation of putative ultimate carcinogens (step 3 of the protocol) and identification and calculation of reliable indicators of relative carcinogenicity of related dialkylnitrosamines, considering all three types of active species suggested.
Elucidation of Mechanism of Formation of Putative Ultimate Carcinogens. Although many steps leading to nitrosamine activation are unknown, the most accepted pathway (169) from parent compounds whose principal target organ is the liver to active carcinogens is shown schematically in Figure 15. The first step is thought to be hydroxylation of the (α-ω) alkyl group by the active oxygen atom species of the cytochrome P-450, with the α-hydroxylation leading directly to active carcinogens. Both our mechanistic studies of model systems (85,170) and experimental studies of alkyl hydroxylation (87) indicate that a radical mechanism for enzymatic hydroxylation is likely.
As shown schematically in Figure 15 and in more detail in Figure 16, once formed, the Cα-hydroxynitrosamine (3a), a particularly unstable intermediate, is thought to undergo a sequence of nonenzymatic transformations leading to several species suggested as ultimate carcinogens: alkyldiazohydroxides, RNNOH (5); diazocarbonium ions, RNN+ (6); or alkylcarbocations, R+ (7). However, neither theoretical (171,172) ... carcinogen. The most widely assumed mechanism for this transformation is a two-step pathway (pathway I) via a monoalkylnitrosamine (4) which tautomerizes to form RNNOH (173). However, an alternative mechanism, pathway II, a concerted pathway which leads directly from the Cα-hydroxydialkylnitrosamine (3) to the RNNOH species (5) by a six-membered transition state, TS4 (174), is also possible. In systematic mechanistic studies (175), we have used the semiempirical molecular orbital method MNDO (154-156) and the ab initio program Gaussian 80 (176), together with identification of stationary points and transition states, to compare the energetics of pathways I and II to the formation of RNNOH. The results indicate that the concerted mechanism for the formation of RNNOH is kinetically favored, i.e., the free energy of activation for transition state TS4 is much less than that for transition states TS2 and TS3. Thus the most probable mechanism for transformation of a parent dialkylnitrosamine to an alkyldiazohydroxide species appears to be: initial abstraction of an H radical from the α-carbon atom of the parent compound by an electrophilic oxygen radical of cytochrome P-450, leading to formation of an OH radical and a nitrosamine radical 2, R'(RCH)NNO; rapid recombination to form species 3, a hydroxynitrosamine; and concerted rearrangement (pathway II) of the hydroxynitrosamine to an alkyldiazohydroxide.
Molecular Indicators of Carcinogenic Activity. In the second part of the study described here, we have used the postulated transformation pathway described above as a guide to the identification and calculation of molecular properties which could be reliable indicators of the relative metabolism and carcinogenic potency of a series of five symmetric dialkylnitrosamines (R = methyl, ethyl, n-propyl, n-butyl, and n-pentyl) for which rat liver carcinogenicity has been measured (164). In particular, we have considered the possibilities that RNNOH species (5), RNN+ species (6), and R+ species (7) can all be the electrophilic species which interact with tissue nucleophiles and that they are formed sequentially from the α-hydroxy intermediate (3a). Thus, we have selected and, using MNDO (154-156), have calculated the following candidate molecular indicators of relative carcinogenic activity of these five compounds: stability of radical species leading to Cα-hydroxy intermediates (3a) and of those that do not (3b, 3c); the relative enthalpies of activation (rate) of formation of RNNOH species (5) from the species 3a; the relative stabilities of the RNNOH (5), RNN+ (6), and R+ (7) species; and electrophilic properties of the putative carcinogens RNNOH (5), RNN+ (6), and R+ (7) which could be indicators of either their electrostatic or covalent electrophilicities. Among the electrostatic indicators of electrophilicity calculated were net atomic and group charges and the orbital electron density on the α-carbon, which is the alkylation site for DNA bases. Among the covalent indicators of electrophilicity calculated were the extent of participation of the α-carbon in the lowest empty molecular orbitals and the energy of these orbitals.

Promising Molecular Indicators
Of all the calculated properties, only the relative stabilities of the radical intermediates (3a,b,c) formed in the postulated first step of P-450 hydroxylation of alkyl amines could be used as reliable molecular indicators of relative carcinogenicity of this class of very similar symmetric dialkyl nitrosamines.
These promising molecular indicators, the radical stabilities, are shown together with the carcinogenic activities of the parent compounds in Table 15. We see from Table 15 that formation of radicals at the secondary α-carbon positions is favored over the β-positions, in keeping with the known product distribution of dipropylnitrosamine (177). In general, using radical stability as a criterion, hydroxylation at all secondary carbon positions could occur and is greatly favored over hydroxylation at primary carbon terminal positions (ω). Of all the primary terminal (ω) carbon positions, that of dimethylnitrosamine, which is also the α-carbon position, forms the most stable primary carbon radical.
As shown in Table 15, the enhanced carcinogenic activity of diethylnitrosamine relative to dimethylnitrosamine could be associated with its increased α-carbon radical stability and hence enhanced hydroxylation at this site, together with no significant competing hydroxylations at the terminal carbon that could lead to detoxification. For all the longer alkyl-chain compounds, α-carbon hydroxylation is as favored as in the diethyl compound. However, as the alkyl chain lengthens, there are more favorable secondary carbon sites for hydroxylation. Hydroxylation of the alkyl chain at these sites forms products that are more stable than those of α-carbon hydroxylation and have increased water solubility, so that these routes can act as detoxifying pathways, since the products have an enhanced likelihood of being excreted. In addition, such sites are favorable for conjugation with detoxifying agents such as glucuronic acid. These additional hydroxylation sites could then account for the diminished carcinogenicity with increasing alkyl chain length for R > C2H5.
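The chain-length argument above rests on simple counting of carbon types in each n-alkyl group. The sketch below, with hypothetical function and key names, tabulates for one chain of a symmetric di-n-alkylnitrosamine whether the activating α-carbon is primary or secondary and how many non-α secondary and primary sites compete with it. It only counts sites; it does not estimate radical stabilities, which in this study come from the MNDO calculations summarized in Table 15.

```python
from typing import Dict, Union


def hydroxylation_sites(n_carbons: int) -> Dict[str, Union[str, int]]:
    """Count carbon site types in one n-alkyl chain of R2N-N=O."""
    if n_carbons < 1:
        raise ValueError("an alkyl chain needs at least one carbon")
    if n_carbons == 1:
        # Methyl: the single carbon is both the alpha and the terminal site.
        return {"alpha_class": "primary",
                "competing_secondary": 0,
                "competing_primary": 0}
    # Longer n-alkyl chains: alpha is CH2, the terminal carbon is CH3,
    # and every carbon in between is a competing secondary (CH2) site.
    return {"alpha_class": "secondary",
            "competing_secondary": n_carbons - 2,
            "competing_primary": 1}


if __name__ == "__main__":
    names = ["methyl", "ethyl", "n-propyl", "n-butyl", "n-pentyl"]
    for n, name in enumerate(names, start=1):
        print(name, hydroxylation_sites(n))
```

On this count the ethyl compound is the only one with a secondary α-carbon and no competing secondary site, which is consistent with the qualitative argument given for its enhanced activity.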
Experiments with dipropylnitrosamine (DPN), however, indicate that hydroxylation of β to ω carbon atom sites might not lead exclusively to detoxification but to new parent compounds, as illustrated in Figure 15, which could recycle through cytochrome P-450 and ultimately form carcinogenic products (178,179). Thus, possible recycling of products through P-450 oxidations also appears to be a factor in the ultimate carcinogenicity of the parent compound.

Other Molecular Properties Examined
The extent of further proposed nonenzymatic transformations of Cα-hydroxynitrosamines to alkyldiazohydroxides (RNNOH), diazonium ions (RNN+), and alkylcations (R+) was examined as a possible indicator of relative carcinogenic activity. As illustrated in Table 16, no direct correlation was found between carcinogenic potencies and either enthalpies of formation or of activation for transformation of hydroxynitrosamines 3 to alkyldiazohydroxides 5. Nor was there any correlation between relative potencies and the stabilities of putative ultimate carcinogenic species 5, 6, or 7 relative to their parent compounds (Table 17). Rather, as shown, more of an inverse relationship was obtained. Selected electrophilic properties of the putative carcinogenic species 5, 6, and 7 were also calculated and are given in Tables 18-20. From Table 18, we see that RNNOH does not appear to be a particularly electrophilic species by either simple electrostatic (qCα) or covalent criteria.
The most widely assumed active form of nitrosamides and nitrosamines is their alkyldiazonium ions, RNN+. Several recent calculations modeling the interaction of this species with model nucleophiles (180) and nucleic acid bases (181) have been reported. We have calculated electrophilic properties of the RNN+ species that could be indicators of their relative ease and extent of adduct formation. These results are summarized in Table 19; no relationship to relative carcinogenicity was obtained. The species R+ is formally an alkylating agent only in an SN1 reaction with DNA bases. As shown in Table 20, alkyl cations R+ are indeed electrophilic, with substantial positive charge on the Cα atoms. However, delocalization of this charge as the chain length of R increases does not account for the observed variation in carcinogenicity. Moreover, as shown in Table 17, the stability of this R+ carbocation does not correlate with parent compound carcinogenicity. This result is in contrast to that found for carbocations of polycyclic aromatic hydrocarbon bay region diolepoxides, where the stability of the carbocation was a good indicator of parent compound carcinogenicity.
Table footnotes: LD50 in female rat liver. qCα = Mulliken net atomic charge of the α-carbon atom. pπCα = total electron density in the π orbital of the α-C atom. ELEMO = energy of the lowest empty molecular orbital; ELEMO+1 and ELEMO+2 = energy of the lowest unoccupied molecular orbital with significant α-carbon atom character. LEMO is a π*(NNO) orbital. pCα(LEMO+2) = electron density in the Cα π atomic orbital of the (LEMO+2) molecular orbital. pCα-N = Mulliken bond overlap density between the α-carbon and the nitrogen atom.
In summary, of all the properties examined, the initial hydroxylations by cytochrome P-450 at activating (α-carbon) and inactivating (β- to ω-carbon) positions appear to be the major modulators of parent compound activity as rat liver carcinogens. Neither the extent of further proposed nonenzymatic transformation of Cα-hydroxynitrosamines to alkyldiazohydroxides (RNNOH), diazonium ions (RNN+), or alkylcations (R+), nor the electrophilicity of these species appeared to be significant factors in differentiating parent compound carcinogenic activity for this class of very similar symmetric dialkylnitrosamines. It is possible that these additional factors will be important for comparisons among more diverse types of nitrosamines. Calculations for such a data set would be the most reasonable next step in studies of nitrosamines.
Support for this work from NCI Contract CP 15730 is gratefully acknowledged.