A mechanism-mediated model for carcinogenicity: model content and prediction of the outcome of rodent carcinogenicity bioassays currently being conducted on 25 organic chemicals.

A hierarchical model consisting of quantitative structure-activity relationships (QSARs) based mainly on chemical reactivity was developed to predict the carcinogenicity of organic chemicals to rodents. The QSARs are based on hypothesized mechanisms of action, metabolism, and partitioning. Predictors included octanol/water partition coefficient, molecular size, atomic partial charge, bond angle strain, atomic acceptor delocalizibility, atomic radical superdelocalizibility, the lowest unoccupied molecular orbital (LUMO) energy of the hypothesized intermediate nitrenium ion of primary aromatic amines, difference in charge of ionized and unionized carbon-chlorine bonds, substituent size and pattern on polynuclear aromatic hydrocarbons, the distance between lone electron pairs over a rigid structure, and the presence of functionalities such as nitroso and hydrazine. The model correctly classified 96% of the carcinogens in the training set of 306 chemicals, and 90% of the carcinogens in the test set of 301 chemicals. The test set by chance contained 84% of the positive thio-containing chemicals, so a QSAR for these chemicals was developed after the test. The model modified in this way correctly predicted 94% of the carcinogens in the test set. This model was used to predict the carcinogenicity of the 25 organic chemicals the U.S. National Toxicology Program was testing at the writing of this article.


Introduction
As part of a program to obtain and develop models for predicting properties needed to perform ecological and human health risk assessment prior to chemical synthesis, we attempted to model rodent carcinogenicity.
Toxicity due to electrophilic reactivity had been found to correlate with the computed property delocalizibility (3; also, unpublished models). We decided to see if this parameter would be a good predictor of carcinogenicity. Preliminary evaluations using the data of Benigni et al. (1) indicated that both the experimental parameter they used (Ke) and carcinogenicity could be predicted with delocalizibility. It was noted that the positives not predicted by delocalizibility mostly belonged to generally accepted classes of carcinogens such as aromatic amines and estrogens. They probably reacted with or catalyzed other chemicals to react with DNA by another mechanism, and some could require metabolic activation.
This paper is part of the NIEHS Predictive-Toxicology Evaluation Project. Manuscript received 10 January 1996; manuscript accepted 1 May 1996.
Much thanks to J. Stevens for his help and suggestions. His practical and theoretical chemistry counsel were invaluable in identifying predictors and understanding the nature of individual chemicals and classes.
Address correspondence to R. Purdy 778-5379.
Based on this background, we then decided to attempt to develop a model composed of quantitative structure-activity relationships (QSARs) for each class. Classes would be based on a chemical reactivity mechanism. The predictive parameters and classes would be hypothesized and evaluated until appropriate ones were found. The emphasis was put on chemical reactivity. Hypotheses based on enzymatic activity were tried only after organic chemical mechanisms were found to be inappropriate. In addition expert judgment would be used in developing the QSARs, the criteria, and the hierarchy of the model. This model is based on generally accepted science, so it is not extensively referenced.
Because judgment would be used to decide on predictive parameters, numerical criteria, and the identification of classes, the model could easily be biased to fit the data and not be broadly applicable. Therefore the data set was split into training and test sets of about equal size. The test set was held back to test the predictability and broad applicability of the model. The 25 organic chemicals in testing by the U.S. National Toxicology Program (NTP) provide an additional, blind test set for the model. The carcinogenicity of these chemicals was predicted after evaluating the predictability with the original test set.

Data Selection
An overall set of positive organic carcinogens was assembled by combining the "confirmed carcinogens" listed by Lewis (4), the positives in tables 1-3 of Zhang et al. (5), and the organic chemicals for which the NTP found "clear evidence" of carcinogenicity (6)(7)(8). This set was randomly halved to give a training set and a test set. Some chemicals were selected from the test set to be transferred to the training set because their chemical class was poorly represented in the training set.
The set of negative organic carcinogens was assembled from the negative studies found in the U.S. Food and Drug Administration data base, the negatives in Zhang et al.'s tables (5), and the "clear negatives" reported by the NTP through 1994 (6)(7)(8) and not listed by Lewis (4) as "confirmed," "suspected," or "questionable" carcinogens. The set was randomly split in half and combined with the positives into a training set and a test set.
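The random halving described above can be sketched as follows. This is only an illustration: the function names, the seed handling, and the placeholder identifiers are assumptions, and the manual transfer of poorly represented classes from the test set to the training set is not modeled.

```python
import random

def split_half(items, seed):
    """Shuffle a copy of `items` and cut it into two halves."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    mid = (len(shuffled) + 1) // 2
    return shuffled[:mid], shuffled[mid:]

def build_sets(positives, negatives, seed=0):
    """Halve positives and negatives separately, then pool each half."""
    pos_train, pos_test = split_half(positives, seed)
    neg_train, neg_test = split_half(negatives, seed + 1)
    return pos_train + neg_train, pos_test + neg_test

# Hypothetical placeholder identifiers, not chemicals from the paper.
positives = [f"pos_{i}" for i in range(10)]
negatives = [f"neg_{i}" for i in range(9)]
training, test = build_sets(positives, negatives)
```

Splitting positives and negatives separately keeps the proportion of carcinogens roughly equal in the two sets, which is what halving each list before combining achieves.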

Computations
Computation of most predictors was done with Project Leader and associated software from CAChe Scientific, Beaverton, OR. Structures were drawn with this software, and the standard molecular mechanics were run before geometry optimization with MOPAC using PM3 parameters. Angles, atomic distances, lowest unoccupied molecular orbitals (LUMOs), and partial atomic charges were extracted or tabulated from the MOPAC results using the CAChe Project Leader software. The delocalizibility/superdelocalizibility calculation algorithm in the CAChe software was replaced with a program written by John Blair of 3M. It was based on a more classical interpretation of the original reference (9). The term delocalizibility is used when the property is determined for nonaromatic atoms, and superdelocalizibility is used for aromatic atoms. In calculating radical superdelocalizibilities, the average of the highest occupied molecular orbital (HOMO) and LUMO energies was added to all molecular orbital energies, and 10 eV was added to the virtual molecular orbitals for calculating acceptor delocalizibilities.
Calculation of pKa values was done with SPARC provided by the U.S. Environmental Protection Agency and the University of Georgia, Athens. SPARC does not provide pKa values for halogen acids. These were estimated to be 3.5, 1.7, 1.3, and 1.0 for fluorine, chlorine, bromine, and iodine, respectively. The halogen values were calculated by using their reactivity reported by Noyce and Virgilio (10) in a QSAR of the organic reactivity data from table 2 of that paper versus pKa values calculated with SPARC (11). Calculation of ester hydrolysis rate was done with the EPIwin software from Syracuse Research Corporation (Syracuse, NY). The logarithms of the octanol/water partition coefficients (log P) were calculated using MacLogP 1.0 from BioByte Corp (Claremont, CA).

Model Development
The mechanisms of reactivity of carcinogens were hypothesized. Then parameters that should be good predictors were hypothesized. These predictors were calculated. Predictor values were inspected for ability to discriminate between positive and negative carcinogens. When a good discriminating predictor was found, a criterion was set so that the optimum number of positive carcinogens were correctly predicted. In developing the individual QSARs, often chemicals that appeared to belong to one class actually belonged to another. Until the full model was constructed these chemicals could not be identified, so they gave rise to noise in the criteria identification process when they were put in the wrong mechanism class. For this reason and a desire to predict as many positive carcinogens as possible, more incorrect classifications of negative chemicals were allowed than with positives.
After the various QSARs and rules were identified, they were assembled into a hierarchical scheme that made physiological sense and created a procedure that was time and computationally efficient.

Model Based on Training Set
A chemical is evaluated with the model by proceeding through the following QSARs that make up the model. The QSARs are used to classify a chemical as a positive or negative carcinogen. If none of the criteria classify the chemical, it is classified as negative.
* Negative if log P < -1.6 or > 8.8

Model Rationale
The QSARs that make up the model were organized in a way that made some physiological sense and made for the most efficient use of computation time. It did not seem worthwhile to do time-consuming calculations on a molecule if it belonged to a class that required less or no computation time to identify as positive or negative.
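The ordering idea, cheap rules first and a default of negative when nothing fires, can be sketched as a short-circuiting rule cascade. This is a minimal illustration and not the authors' implementation: the `Rule` type, the dictionary keys, and the convention that a rule returns `None` when it does not decide are all assumptions, and the upper log P bound is read here as 8.8. Only two of the model's rules are shown.

```python
from typing import Callable, Optional, Sequence

# A rule returns True (positive), False (negative), or None (no decision).
Rule = Callable[[dict], Optional[bool]]

def log_p_rule(chem: dict) -> Optional[bool]:
    """Cell-penetration screen: negative outside the log P window."""
    log_p = chem["log_p"]
    if log_p < -1.6 or log_p > 8.8:  # upper bound assumed to be 8.8
        return False
    return None

def functional_group_rule(chem: dict) -> Optional[bool]:
    """Positive if a hydrazine, nitroso, or isocyanate group is present."""
    flagged = {"hydrazine", "nitroso", "isocyanate"}
    if chem.get("groups", set()) & flagged:
        return True
    return None

# Cheapest rules first; costlier QSARs would be appended after these.
RULES = [log_p_rule, functional_group_rule]

def classify(chem: dict, rules: Sequence[Rule] = RULES) -> bool:
    """Return the first decisive verdict; default to negative."""
    for rule in rules:
        verdict = rule(chem)
        if verdict is not None:
            return verdict
    return False
```

Because evaluation stops at the first decisive rule, a chemical screened out by log P never triggers the more expensive quantum-chemical descriptors that later rules would need.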
The first QSAR determines whether a chemical is likely to penetrate a cell. It was assumed that a chemical had to penetrate a cell before it could be a carcinogen. Chemicals with low octanol/water partition coefficients are not likely to passively penetrate a cell, and chemicals with high octanol/water partition coefficients probably are not soluble enough to be in solution. In addition, very large chemicals are not likely to penetrate. The criteria for each of these were estimated by listing the chemicals that should be carcinogenic but were not and then deciding on a criterion that correctly classified most chemicals. The criteria listed in the model above were found. The data around these criteria are limited, as for most criteria in the model, and the criteria will likely be modified as more data are generated.
Environmental Health Perspectives * Vol 104, Supplement 5 * October 1996
It was noted in developing the model that many ester- and azo-containing chemicals were predicted incorrectly. Azo compounds are thought to be hydrolyzed in the intestine, and esters are likely to be hydrolyzed by one of many cellular and extracellular esterases. The data did not support a criterion for any azo compounds not being hydrolyzed, but there were slowly hydrolyzed esters that were incorrectly predicted. Since esterases speed up chemical hydrolyses of esters, it was thought that the chemical rate of hydrolysis could be used as an index for the rate of esterase hydrolysis. A model is available to estimate the hydrolysis of esters at pH 7. It was used to calculate hydrolysis rates. These rates were examined for a criterion to determine when to consider the hydrolysis products and when to consider the parent as possible carcinogens. A half-life of 100 years or less was chosen as the criterion.
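The half-life criterion can be illustrated with a small sketch. It assumes the hydrolysis model reports a pseudo-first-order rate constant in s^-1 at pH 7 (the units of the software output are not stated in the text), and it reads the criterion as "evaluate the hydrolysis products when the half-life is 100 years or less, otherwise evaluate the parent." The function names are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def half_life_years(rate_constant_per_s: float) -> float:
    """First-order half-life, t1/2 = ln(2) / k, converted to years."""
    return math.log(2) / rate_constant_per_s / SECONDS_PER_YEAR

def species_to_evaluate(rate_constant_per_s: float,
                        max_half_life_years: float = 100.0) -> str:
    """Pick which species the later QSARs should be applied to."""
    if half_life_years(rate_constant_per_s) <= max_half_life_years:
        return "hydrolysis products"
    return "parent"

# Hypothetical rate constants, chosen only to fall on either side of the cut.
fast_ester = species_to_evaluate(1e-8)   # half-life of a few years
slow_ester = species_to_evaluate(1e-11)  # half-life of thousands of years
```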
These first two QSARs for cellular penetration and hydrolysis were placed at the beginning of the model architecture because these phenomena happen prior to interactions with DNA and the calculations are fast and simple.
It was noted that all chemicals containing a hydrazine, nitroso or isocyanate group were positive, and no other QSAR appeared to apply to any of these classes. Therefore it was decided to consider all chemicals that contain one of these functionalities to be positive. No computations would have to be done on these classes of chemicals, so this QSAR was placed early in the model procedure.
There were chemicals such as 2-ethylhexanol in the data sets that were reported to be positive but were not predicted so by any QSARs in the model. In addition, no mode of reactivity could be imagined. But these types of chemicals are peroxisome proliferators, which appear to increase the incidence of cancer in rodents. A simple QSAR that fit the data but not necessarily all peroxisome proliferators was hypothesized and evaluated with the training set. The QSAR is based on the hypothesis that nonalkylating peroxisome proliferators competitively bind to a n-oxidation enzyme and are not oxidized. This causes an increase in synthesis of the peroxisome enzymes. This QSAR requires only inspection of the structure for the connection of atoms, so it was placed early in the model procedure for computing efficiency.
Carcinogenic estrogen-like chemicals also were not predicted to be positive by any of the QSARs based on reactivity. A review of the estrogens listed in the Merck Index, 11th edition, revealed that they are all rigid between two hydroxyl groups and the hydroxyls are about 11 angstroms apart. This appeared to be a potentially good QSAR, and so it was applied to the training set. The range of distance between hydroxyl groups was widened to 9.8 to 12.5 angstroms after considering the training set. Also, the part of the criteria that said the groups should be hydroxyls was made more encompassing by requiring only that the atoms contain lone pairs of electrons. This was done because there were some diamino compounds to which the QSAR applied and various binding studies indicate that halogens might also be appropriate. The computations to optimize structure for this QSAR are slightly less demanding than those that follow and more so than those that precede it in the model, so it was placed fifth in the sequence.
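A minimal sketch of the distance test follows. It assumes that optimized Cartesian coordinates (in angstroms) are available for the two lone-pair-bearing atoms and that rigidity has already been judged by inspection; the function name, the inclusive bounds, and the coordinates are illustrative assumptions. The distances 10.4 and 9.6 angstroms are the values reported in the Predictions section for oxymetholone and for the phenolphthalein hydrolysis product.

```python
import math

def estrogen_like(atom_a, atom_b, rigid, low=9.8, high=12.5):
    """True if two lone-pair atoms on a rigid scaffold lie in the window."""
    separation = math.dist(atom_a, atom_b)  # Euclidean distance, Python 3.8+
    return rigid and low <= separation <= high

# Hypothetical coordinates placed on one axis to reproduce the distances.
oxymetholone_like = estrogen_like((0.0, 0.0, 0.0), (10.4, 0.0, 0.0), rigid=True)
phenolphthalein_product_like = estrogen_like((0.0, 0.0, 0.0), (9.6, 0.0, 0.0), rigid=True)
```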
The QSAR for polynuclear aromatics was developed after review of the chemicals that would generally be included in this class. It was noted that chemicals of this class with log octanol/water partition coefficient less than 4.0 were negative, so that was made part of the criteria. The bay region for all positives was seen to be unsubstituted, and steric bulk at either the adjacent L or K region resulted in the chemicals' being negative. Attempts to develop reactivity criteria failed. This QSAR was placed sixth in the model procedure because it requires no time-consuming computations.
It is generally assumed that carcinogenic primary aromatic amines are metabolically activated to an ultimate carcinogen. The first step in this transformation is the hydroxylation of the amine. A nitrenium ion resulting from the dehydration of the hydroxylamine is generally thought to be the DNA reactive agent (12). The electron-accepting ability of the nitrenium ion was hypothesized to control the reactivity or indicate the likelihood of the original oxidation. The acceptor superdelocalizibility on the nitrenium ion nitrogen and the LUMO energy of the whole molecule were evaluated as predictors. Both were good. The LUMO energy was slightly better. The numeric part of the criteria was chosen by inspection to maximize the number of correct predictions. There were too few secondary aromatic amines in the training set to see if this QSAR or another applied to them. This QSAR was placed 7th because it is easy to identify chemicals that belong to the class, and slightly less computational time is required than for the QSARs that follow.
The SN2 set of QSARs is the backbone of the model. The rudiment of these QSARs was found first. The inability of the first crude SN2 QSAR to predict the carcinogenicity of many classes indicated the need for the development of the other QSARs. The term SN2 is used broadly here and includes two-step aromatic nucleophilic substitution. The QSARs are based on the principles of nucleophilic substitution taught in first-year undergraduate organic chemistry: the rate is controlled by the electrophilicity (electron-accepting property) and/or the increase in stability of the resulting products. In the first of the three QSARs for this mechanistic class, the release of steric strain and the stability of the leaving group are the predictors. This QSAR applies to chemicals like epoxides and lactones. The second QSAR depends on the electrophilicity of the carbon and the stability of the leaving group, which are estimated by the calculated pKa of the leaving group. The third QSAR contains two properties that reflect the electrophilicity of the reactive carbon. It seems reasonable that these three QSARs could be combined into a single one, but attempts to do so failed. This set of QSARs was placed 8th because it required more computation time than those before it and provided part of the computations required by the QSARs that follow it in the model.
It was found that not all positive chlorinated carcinogens could be correctly predicted with the SN2 QSARs or any others. It was found in previous unpublished studies that the toxicity and reactivity data of Hermans et al. (2) could be predicted with a QSAR based on the hypothesis that the toxicants reacted with a model nucleophile or biochemicals by the SN1 mechanism. It was hypothesized and confirmed that the difference in charge on the chlorinated carbon before and after the hypothesized ionization of the chlorine was a good predictor. It was hypothesized that this mechanism was appropriate to carcinogenicity also, and this was found to be true. The numerical part of the criteria (< 0.595) differed by only about 1% between the criteria for reactivity to a model nucleophile, fish acute reactive toxicity, and carcinogenicity. The QSAR was placed after the SN2 QSARs because it depends partly on the same calculations.
It is generally accepted that many aromatic chemicals such as benzene are enzymatically epoxidized and the epoxide reacts with DNA. It was hypothesized that two adjacent aromatic carbons had to be unsubstituted before epoxidation would take place, and that the radical superdelocalizibility of the carbons might be a good predictor. It was found to be so. It was also observed that the criteria were applicable to conjugated carbons in addition to aromatic ones. The numerical criterion was found to be a range rather than one value. Apparently chemicals with too high a radical superdelocalizibility react too fast or the epoxides are too unstable to reach DNA. This QSAR was placed after the primary aromatic, SN2, and SN1 QSARs because many chemicals have structures for which two or more of these QSARs apply, and the structure optimization for the SN2 QSARs is the basis for the calculations done for this QSAR.
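The range criterion can be sketched as a window test that also reports how close a value sits to the nearer bound, since values near a limit are treated as less certain. The bounds 0.151 and 0.180 are the criteria range quoted in the Predictions section; the function name, the inclusive bounds, and the margin measure are assumptions made for illustration.

```python
LOW, HIGH = 0.151, 0.180  # criteria range quoted in the Predictions section

def epoxidation_call(radical_superdelocalizibility):
    """Return (positive?, distance to the nearer bound of the window)."""
    s = radical_superdelocalizibility
    positive = LOW <= s <= HIGH
    margin = min(abs(s - LOW), abs(s - HIGH))
    return positive, round(margin, 3)

# Values reported in the Predictions section.
furfuryl_alcohol = epoxidation_call(0.163)
pyridine = epoxidation_call(0.155)
ethylbenzene = epoxidation_call(0.150)
```

A small margin, as for pyridine (inside the window) or ethylbenzene (just outside it), flags a call that a modest shift in the computed descriptor would reverse.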
The last QSAR in the model applies to nitroaromatic chemicals. The QSAR was found by trial and error and rationalization of why it works is incomplete. This lowers confidence in the predictions of this QSAR. For this reason and because many chemicals to which this QSAR is applicable are predicted to be positive by another QSAR, this QSAR was placed last.

Model Performance
The model correctly predicted the carcinogenicity of 96% of the carcinogens in the training set. This high level of predictability could be due to selecting parameters and criteria that are only appropriate or ideal for this data set and not broadly applicable. The results of applying a model to a test set of chemicals usually indicate whether it has broader or universal applicability.
The model correctly predicted carcinogenicity for 90% of the carcinogens and 88% of both the positives and negatives in the test set. This is substantially less than for the training set.
A review of the incorrectly predicted chemicals in the test set suggests two possible reasons for this lower predictability: First, three chemicals (phenol, methyl salicylate, and benzoic acid) were incorrectly predicted to be positive because they would form reactive epoxides. This appears to be a large percentage when compared to the total two correct positive predictions by this QSAR. But the QSAR was applied to 25 negative chemicals. So for the total number of test chemicals to which this rule was applied, 89% were correctly predicted. Nonetheless this QSAR might be deficient for some reason and will be reexamined.
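The 89% figure can be reproduced with a short calculation, assuming that the epoxidation QSAR was applied to the 2 correctly predicted positives plus the 25 negatives and that the 3 false positives were its only errors.

```python
correct_positives = 2       # positives correctly predicted by this QSAR
negatives_evaluated = 25    # negatives to which the QSAR was applied
false_positives = 3         # phenol, methyl salicylate, benzoic acid

total = correct_positives + negatives_evaluated
correct = total - false_positives
percent_correct = 100 * correct / total

print(f"{correct}/{total} = {percent_correct:.0f}% correct")  # 24/27 = 89% correct
```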
The second reason for the lower predictability is that five thio-containing chemicals (sulfallate, 2-imidazolidinethione, thiourea, 6-propyl-2-thiouracil, and thioacetamide) were falsely predicted to be negative. No QSAR was developed for thio-containing chemicals because only one was incorrectly predicted in the training set (1,3-diethylthiourea). Thus there were insufficient data to develop a QSAR. A QSAR for this class was developed after we saw that the model did not have an appropriate QSAR. It was hypothesized that the acceptor delocalizibility on the sulfur of a thio group would be a good predictor. It was found that such values greater than 0.165 indicated a chemical was a positive carcinogen. This criterion discriminated between the five positive and three negative thio-containing chemicals in the test set. The inclusion of this QSAR into the model resulted in correctly predicting 94% of the carcinogens and 90% of the chemicals overall (Table 1). The training and test set chemicals, and their carcinogenicity and predicted carcinogenicity, used to calculate the percentages in Table 1 are presented in Tables 2 and 3, respectively.

Predictions
Furfuryl Alcohol, 98-00-0. Predicted to be positive because it would likely be epoxidized to a reactive epoxide. The radical superdelocalizibility of an unsubstituted carbon adjacent to another in a conjugated system was 0.163, which is within the criteria range of 0.151 and 0.180.
Codeine, 76-57-3. Predicted to be positive because it is likely to be an alkylator. The sum of the acceptor delocalizibility plus the partial charge on a carbon (the sp3 cyclic ether carbon) with a leaving group is 0.437, which is greater than the criterion of 0.395.
Diethanolamine, 111-42-2. Predicted to be negative because no properties were consistent with its being positive.
Pyridine, 110-86-1. Predicted to be positive because it would likely be epoxidized to a reactive epoxide. The radical superdelocalizibility of an unsubstituted carbon adjacent to another in a conjugated system was 0.155, which is within the criteria range of 0.151 and 0.180, but close to the lower limit, which weakens the likelihood of being correct.
Tetrahydrofuran, 109-99-9. Predicted to be negative because no properties were consistent with its being positive.
Predicted to be negative because no properties were consistent with its being positive. Methyleugenol, 93-15-2. Predicted to be negative because no properties were consistent with its being positive, but chemicals of this class do react as indicated by their property of skin sensitization. There were not enough chemicals of this class in the training set to even consider if a QSAR was needed. In developing a skin sensitization model a QSAR for this type of chemical was developed. Such a QSAR might be useful for this carcinogenicity model. This negative prediction is suspect, then, because of the paucity of chemicals of this type in the model's training set.
Cinnamaldehyde, 104-55-2. Predicted to be positive because it would likely be epoxidized to a reactive epoxide. The radical superdelocalizibility of an unsubstituted carbon adjacent to another in a conjugated system was 0.154, which is within the criteria range of 0.151 and 0.180, but close to the lower limit, which weakens the likelihood of being correct.
Predicted to be negative because no properties were consistent with its being positive. Ethylbenzene, 100-41-4. Predicted to be negative because no properties were consistent with its being positive. In the case of the QSAR for predicting epoxidation to a reactive epoxide the radical superdelocalizibility of the orthocarbon to the ethyl group was 0.150, which is just under the lower criterion of the range of 0.151 and 0.180. The close proximity of the radical superdelocalizibility to the criterion lowers confidence in the prediction.
Chloroprene, 126-99-8. Predicted to be positive because it would likely be epoxidized to a reactive epoxide. The radical superdelocalizibility of an unsubstituted carbon adjacent to another in a conjugated system was 0.161, which is within the criteria range of 0.151 and 0.180. This was true for both the cis and trans isomers.
Table footnotes: n = 305. +, positive; -, negative. (a) The reason is a listing of one QSAR of the model that was used to make the prediction. For some positives more than one QSAR applied. (b) Presumed unreacted monomer leaches from polymer.
Ethylene Glycol Monobutyl Ether, 111-76-2. Predicted to be negative because no properties were consistent with its being positive.
Citral, 5392-40-5. Predicted to be positive because it would likely be epoxidized to a reactive epoxide. The radical superdelocalizibility of an unsubstituted carbon adjacent to another in a conjugated system was 0.166, which is within the criteria range of 0.151 and 0.180.
Primaclone, 125-33-7. Predicted to be negative because no properties were consistent with its being positive.
Predicted to be negative because no properties were consistent with its being positive. Oxymetholone, 434-07-1. Predicted to be positive because the structure is rigid between two oxygens that are 10.4 angstroms apart, which is within the criteria range of 9.8-12.5 angstroms for the estrogen-like QSAR.
Anthraquinone, 84-65-1. Predicted to be negative because no properties were consistent with its being positive.
Nitromethane, 75-52-5. Predicted to be negative because no properties were consistent with its being positive. This is another chemical for which there are not enough of the class to develop a QSAR.
Nitropropane is a positive carcinogen, but there were no other nitroalkanes in the training set for developing a QSAR. So even though this chemical is predicted to be negative by the model, it would not be surprising if the test results come out positive.
Phenolphthalein, 77-09-8. Predicted to be negative. The ester hydrolysis model is not for cyclic esters, but since the hydrolysis would relieve steric strain, it was assumed to be likely. The hydrolysis product resembles estradiol in that two lone pairs of electrons are about the right distance apart over a rigid structure, but the distance between the oxygens is 9.6 angstroms, which is slightly less than 9.8 angstroms, the lower limit of the range in which positive chemicals fall.
Scopolamine Hydrobromide Trihydrate, 6533-68-2. Predicted to be positive. The likely mechanism is SN2 alkylation because the angle of deviation from ideal at a carbon with a leaving group divided by the pKa of the leaving group is 2.6, which is greater than the criterion of 1.8.
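Two of the SN2 criteria quoted in the predictions above (for codeine and for scopolamine) can be written as simple threshold tests. The sketch below is illustrative only: the function names are hypothetical, the units of the angle deviation are not stated in the text, and the component values in the usage lines are invented so that they reproduce the reported composite values (0.437 and 2.6). They are not the computed descriptors for these chemicals.

```python
def sn2_strain_positive(angle_deviation, leaving_group_pka, criterion=1.8):
    """Strain-release rule: angle deviation divided by leaving-group pKa.

    Assumes a positive pKa; the ratio is undefined at pKa = 0.
    """
    return angle_deviation / leaving_group_pka > criterion

def sn2_electrophilicity_positive(acceptor_delocalizibility, partial_charge,
                                  criterion=0.395):
    """Electrophilicity rule: delocalizibility plus partial charge on carbon."""
    return acceptor_delocalizibility + partial_charge > criterion

# Hypothetical components chosen only to match the reported composites.
scopolamine_like = sn2_strain_positive(13.0, 5.0)            # ratio 2.6
codeine_like = sn2_electrophilicity_positive(0.300, 0.137)   # sum 0.437
```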

Discussion
Most of the chemicals predicted to be positive in the newest NTP test set were predicted to be epoxidized to a reactive epoxide. This is different from the training and first test set of the model, in which the SN2 and PNA QSARs were dominant. The difference reflects the testing of classes for which there are fewer data. Overall this set of organic chemicals does not represent the broad spectrum of mechanisms that cause or do not cause carcinogenesis. Therefore it is not a good set for evaluating the broad applicability of models. Still it provides a test for subparts of models and eventually data for improving those parts. These data will be particularly useful to the model used here because of its possible weakness in predicting carcinogenicity of chemicals with conjugated carbons.
The presence of nitromethane in this set brings up the question of reactivity and carcinogenicity of nitroalkanes in general. The training set for the model (2) contains nitropropane, which was found to be positive in an NTP bioassay. One chemical is not enough to permit development of a QSAR, so the literature was reviewed for the reactivity of nitroalkanes. It appears that not all nitroalkanes are as reactive as nitropropane or react by the same mechanism, so a rule or QSAR based on chemical reactivity and one bioassay result is not appropriate. A positive bioassay for nitromethane will indicate the need for a QSAR for this class or a QSAR that includes the nitroalkanes in a broader reactivity class.