Empirical modeling of an in vitro activity of polychlorinated biphenyl congeners and mixtures.

The goal of this research is to predict an in vitro activity of polychlorinated biphenyl (PCB) congeners and their mixtures and to describe the relationship between this activity and chemical structure. The test system used multiple PCB concentrations on each cell culture plate in a repeated measures design, which improved precision for comparing between concentration levels. A weighted regression that accounted for this experimental design feature was used in fitting a nonlinear dose-response exponential model to the PCB concentration-activity data from an in vitro test system in which 3H-phorbol ester binding was measured in cerebellar granule cells exposed to different PCB congeners to test for their effects on protein kinase C translocation. The model allowed for the minimum level to be less than control, a common slope, and the estimation of the log of the concentration that produces an activity 50% above the control activity (E50) for 36 congeners and 3 commercial mixtures. Next, a weighted logistic regression using a second order response model in the variables Clortho, Clpara, and Clmeta was used to relate the estimated log E50s to indicators of chemical structure. This model was preferred over models that might seem more mechanistically based because in internal validation, it attained a smaller PRESS statistic (the sum of squares between all observed and predicted observations) than other models. Evidently, this second order model makes more efficient use of parameters than other models considered. Plots of the predictions of the logistic second order response model versus log Kow confirm the usual pattern that congeners with intermediate levels of log Kow are the more active. The data of three commercial mixtures were included in this regression by assuming a common combination index (ratio of observed E50 to predicted E50, assuming dose addition). 
The logistic model suggests that congeners with one, two, or three chlorine substitutions at the ortho position are more active than other congeners. Also, congeners with log Kow between 5.2 and 6.6 are generally more active. The estimated combination index indicated that the joint action of PCB congeners in the three commercial mixtures was less than dose additive. The error sum of squares was significantly large, which may indicate a lack of fit of the logistic model. Empirical Bayes estimates (EBE) are weighted averages of model predictions and observations of E50s and can be better estimates than the fitted model when there is a lack of fit. The PRESS statistic for the EBE indicated larger prediction error than for the logistic model, but the EBE provided better estimates of commercial mixture E50s based on dose addition. This may indicate that the logistic model is not incorporating all the information in the single congener data needed to predict mixtures.

Key words: empirical model, in vitro activity, polychlorinated biphenyls, repeated measures design.
Environ Health Perspect 105:1106-1115 (1997). http://ehis.niehs.nih.gov
Polychlorinated biphenyls (PCBs) are industrial compounds detected in air, water, sediments, fish, wildlife, and humans (1,2). PCBs are prepared by the chlorination of biphenyls, and the result is a mixture of possibly as many as 209 congeners. Although the manufacture of PCBs has been banned in the United States, these compounds remain a serious environmental pollutant due to ongoing release from hazardous waste sites, the accidental breakdown of electric transformers, and their high resistance to degradation. Risk assessment of PCBs currently involves usage of toxic equivalency factors (TEFs), which are predicated on the assumption that this class of chemicals elicits its toxic responses through a common receptor-mediated mechanism. Structure-activity relationship studies, for example, have found that PCB-induced body weight loss, thymic atrophy, immunotoxicity, endocrine and reproductive toxicity, and carcinogenicity are associated with high affinity for the aryl hydrocarbon (Ah) receptor (3). Recent studies, however, have shown that environmentally relevant ortho-substituted PCB congeners with weak or no Ah-receptor activity have effects on brain dopamine concentrations in vivo (4) and in vitro in PC12 cells (5). Kodavanti et al. (6) have also found that ortho-substituted PCB congeners have significant effects on calcium homeostasis mechanisms in vitro, while non-ortho PCBs, having a more coplanar structural configuration, are relatively inactive in vitro.
This paper builds on the previous work of Kodavanti et al. (6) by developing an empirical model for predicting the in vitro activity of PCB congeners. This information may be useful in developing a risk assessment strategy for nondioxinlike PCBs and their mixtures.
The activity was measured in an in vitro test system in which different PCB congeners and PCB mixtures were tested for their effects on protein kinase C (PKC) translocation by measuring [3H]-phorbol ester binding in cerebellar granule cells (6,7). The prediction of the activity of PCB mixtures is based upon knowledge of the chemical composition and the in vitro activity of the components of the mixture.
We considered using the logistic dose-response model to determine the effective concentration that produces an activity 50% above the control activity (E50s) for 36 tested congeners and the 3 commercial mixtures. The logistic model involves the estimation of a minimum and a maximum activity.
However, estimates of the maximum were quite variable because not all congeners attained maximum activity. The exponential model is similar to the logistic model but has no maximum value, so this model was used instead. Logs of the mean activity relative to control activity were fit to an exponential model using the data from 36 congeners and the 3 commercial mixtures; the model had a common slope parameter, a common minimum value, and a separate log E50 for each of the congeners and the commercial mixtures. There is a correlation between the observed activity values from the same congener due to variability between culture plates. The use of weighted regression based on variance components estimated from the repeated measures design permitted data from all congeners to be estimated simultaneously while addressing this correlation. The weights formed a block diagonal matrix. The minimum value of the dose-response curve was allowed to be negative (less than control) in view of the recent interest in U-shaped dose-response curves (8).
Next, a structure-activity relationship needed to be developed so that the E50s of the untested congeners could be estimated. After considering various models involving log Kow and indicators of chemical structure, we decided to approximate this relationship by a logistic second order response surface model in the variables Clortho, Clpara, and Clmeta. This model was fit to the logs of the E50s using weighted nonlinear regression. The logs of the E50s of the three commercial mixtures were simultaneously fitted in this regression through the use of the dose-addition equation. For these mixtures, a single combination index was also estimated. The combination index is the ratio of the E50 of the mixture to the E50 predicted by the dose-addition assumption. The reason for simultaneously estimating the structure-activity relationship and the mixture E50s was to constrain the estimated structural relationship to one capable of predicting the activity of mixtures based upon dose additivity.
It is ideal to develop a model with one set of data and validate the model with a second set of data; this validation process is called external validation. In this case, all the data were needed to develop the model due to the small sample size; therefore, the model was internally validated. Internal validation involves excluding one data point at a time from the model fitting process and comparing the resultant prediction of the excluded point with the observation of that point. After each observation has been excluded and predicted, the sum of squares of the prediction differences is called the PRESS statistic. The lower the PRESS statistic the better the model.
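The leave-one-out procedure described above can be sketched in a few lines. This is a minimal illustration only: it assumes an ordinary straight-line least-squares fit as a stand-in for the weighted nonlinear models used in this paper, and the function names `fit_line` and `press` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x (stand-in model, an assumption)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def press(xs, ys):
    """Sum of squared leave-one-out prediction errors."""
    total = 0.0
    for i in range(len(xs)):
        # Refit without observation i, then predict the held-out point.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total


# Hypothetical data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1]
print(press(x, y))
```

Because each point is predicted by a model that never saw it, PRESS penalizes models that spend parameters on fitting noise, which is why it can favor a smaller model over a more detailed one.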
The PRESS statistic was calculated for five models, and most of them were more biologically based than the second order model. Hansch, the father of quantitative structure-activity relationships (QSAR), derived a quadratic polynomial to relate biological activity to log P (9). Therefore, one model we considered was a quadratic model in log Kow. The quadratic coefficient was fixed and the linear coefficients were allowed to vary with Clortho. The intercepts varied with the main effects of Clortho and Clpara.
There were biological reasons for believing that Clortho is an important variable, but the reason for considering Clpara was more empirically based. A main effects model was also fit using dummy variables that considered Clortho, Clpara, and Clmeta to be factors.
The second order logistic model attained a smaller PRESS value than other models. In choosing the model we were guided by the PRESS statistic and how well the E50s of the mixtures were predicted. The data contain more information about the activity and chemical structure relationship than we have modeled. This state of affairs is due to our search for the most detailed model that could be validated with this sample size. Also, the PRESS statistic seems to favor models that use continuous independent variables rather than dummy variables such as would be used to describe a main effect model in the factors Clortho, Clpara, and Clmeta. These efforts are preliminary in some ways. The use of physicochemical variables as independent variables may further explain these activity data. However, considering the prevalence of mixtures in the environment, our focus is on the prediction of the activity of mixtures. This work is a step in that direction, and it gives a framework for more detailed structure-activity research to build on.
Our avoidance of physicochemical variables at this stage of the research is based upon three reasons. First, the selection of congeners to test was based mainly on their presence in environmental mixtures; therefore, the sample of congeners is unlikely to be ideal for the determination of a mechanistically based model. Second, there is measurement error in these variables. Sabljic (10) has described the measurement error in log Kow and has concluded that the use of the n-octanol/water partition should be avoided in environmental research. Approaches using physicochemical variables as independent variables in regression routines need to address this error. The practice of combining several physicochemical variables into a summary variable using principal component analysis does address the measurement error problem. Third, most biologists do not think in terms of physicochemical variables and describe their results in terms of indices of structure, such as we use here. For example, Kodavanti et al. (11) have found that noncoplanar congeners are usually more active. Noncoplanarity is related to the configuration of the ortho substitutions. Our descriptive approach sheds light on the importance of such theories in explaining the variation of the E50 estimates.
The success of our approach depends on an assumption of smoothness of the E50 response surface. For example, the logistic second order model can estimate at most one peak in a response surface that might not be very smooth. Our approach should be judged on its clarification of important aspects of the E50 response surface, on its ability to predict mixtures, and on its ability to indicate how future experiments can be designed to improve mixture prediction.
We also present some results based upon empirical Bayesian analysis. This approach can fit response surfaces that are less smooth than the logistic second order model can fit.

Materials and Methods
The in vitro test system. The in vitro test system of Kodavanti et al. (6,7) measures increases in [3H]-phorbol ester (PDBu) binding, which suggests increased activation/translocation of PKC from cytosol to the membrane. Translocation of PKC is dependent on the concentrations of intracellular free Ca2+ and/or diacylglycerol (12,13). PKC has been reported to play a key role in a number of physiological and toxicological phenomena (14,15). Cerebellar granule cells grown on 12-well culture plates were tested after 7 days in culture for [3H]-PDBu binding. Each replicate consisted of a control and usually six different concentrations placed in the wells of the cell culture plates. Generally, four replicates were used and, other than control, the concentrations of PCBs were 1, 3, 10, 30, 50, and 100 µM. Table 1 shows the details of the experimental designs.
Chemicals and terminology. Figure 1 shows the chemical structure of a PCB congener and explains how the congeners are denoted. The log of the octanol-water partition coefficient is denoted by log Kow and was obtained from Hawker and Connell (16). The tested congeners are listed in Table 1. The percentages in the commercial mixtures were the averages across lots of values from Frame et al. (17).
The concentration-activity model. The log10 of activity ([3H]-PDBu binding) relative to control and averaged across culture plates was fit to an exponential concentration-activity model. Specifically, the model for the jth congener was
$$y = m_0 + (k - m_0)\, e^{s[(\log_{10} C) - \log E50_j]} \tag{1}$$
where y is the average across plates of the log10 of [3H]-PDBu binding relative to the control activity, C is the concentration (µM), and k = log10 1.5. The parameter $m_0$ is the log of the minimum of the concentration-activity model, and s is a slope parameter. This parameterization allows the standard error of the log E50 to be obtained directly from the nonlinear routine, which is based on the delta method. This model was fit simultaneously to all the nonzero dose data from the 36 congeners and 3 commercial mixtures using weighted regression. The weights were determined as follows. A repeated measures analysis was done on the log of the activity relative to control for each congener. This resulted in estimates for the variability due to plates, variability due to doses, and residual variability.
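The behavior of the exponential model can be checked numerically. The sketch below assumes Equation 1 has the form y = m0 + (k - m0) exp{s[log10 C - log E50]}; the function name `log_activity` and the log E50 value are hypothetical, while the slope and minimum are the estimates reported in the Results.

```python
import math

K = math.log10(1.5)  # log10 of an activity 50% above control


def log_activity(conc, log_e50, slope, m0):
    """Assumed form of Equation 1: predicted log10 activity relative to control."""
    return m0 + (K - m0) * math.exp(slope * (math.log10(conc) - log_e50))


# Slope and minimum as reported in the Results; log E50 = 1.5 is hypothetical.
at_e50 = log_activity(10 ** 1.5, 1.5, 0.494, -0.0168)
print(10 ** at_e50)  # activity relative to control at C = E50
```

At C = E50 the exponent is zero, so the predicted log activity equals k and the activity is 1.5 times control; at very low concentrations the prediction approaches the minimum m0.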
For the data from each congener (or mixture), a covariance matrix with plate variability at the off-diagonal elements and plate plus residual variability on the diagonal was formed. The weights for all the 36 congeners and the 3 commercial mixtures formed a block diagonal matrix. See Draper and Smith (18) for an explanation of why it is valid to use linear regression with such a weight matrix.
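The covariance structure described above can be assembled as follows. This is a minimal sketch using plain Python lists; the function names and variance components are hypothetical, and in generalized least squares the weight matrix would be the inverse of this covariance matrix.

```python
def congener_covariance(n_doses, plate_var, resid_var):
    """Block for one congener: plate variance off the diagonal,
    plate plus residual variance on the diagonal."""
    return [
        [plate_var + resid_var if i == j else plate_var for j in range(n_doses)]
        for i in range(n_doses)
    ]


def block_diagonal(blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    size = sum(len(b) for b in blocks)
    full = [[0.0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                full[offset + i][offset + j] = value
        offset += len(block)
    return full


# Hypothetical variance components for two congeners.
cov = block_diagonal([
    congener_covariance(3, plate_var=0.004, resid_var=0.001),
    congener_covariance(2, plate_var=0.002, resid_var=0.003),
])
for row in cov:
    print(row)
```

The zeros outside the blocks express the assumption that observations from different congeners are uncorrelated, while observations within a congener share the plate-to-plate variability.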
The dose-addition method of predicting the mixture E50. The E50 of a mixture satisfies the following equation when the joint action is dose addition:
$$\frac{1}{E50_k} = \sum_{j=1}^{209} \frac{\pi_{jk}}{E50_j} \tag{2}$$
where $\pi_{jk}$ is the fraction of the jth congener in the kth mixture. The combination index (19) is the ratio of the actual E50 of the mixture to the E50 predicted in Equation 2. Combination indices that are >1 indicate antagonism and those <1 indicate synergism.
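A small worked example may help here. The sketch assumes the dose-addition relation is 1/E50(mixture) = sum over components of (fraction / E50 of the component); the function names, mixture composition, and E50 values are hypothetical.

```python
def dose_addition_e50(fractions, e50s):
    """Dose addition: 1/E50_mix = sum_j (fraction_j / E50_j)."""
    return 1.0 / sum(f / e for f, e in zip(fractions, e50s))


def combination_index(observed_e50, fractions, e50s):
    """Ratio of the observed mixture E50 to the dose-addition prediction."""
    return observed_e50 / dose_addition_e50(fractions, e50s)


# Hypothetical three-component mixture.
fractions = [0.5, 0.3, 0.2]
e50s = [10.0, 20.0, 40.0]
predicted = dose_addition_e50(fractions, e50s)
ci = combination_index(25.0, fractions, e50s)
print(predicted, ci)
```

In this made-up case the observed E50 is larger than the predicted one, so the combination index exceeds 1: more of the mixture is needed than dose addition predicts, which is the less-than-additive (antagonistic) direction.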
The structure-activity model. Weighted nonlinear regression analysis was used to determine the relationship between the log of the E50 of the concentration-activity model and the chemical structure of the PCB congener. Also, the logs of the E50 for the three commercial mixtures were included in this regression by assuming that these mixtures all had the same combination index. The weights used were the reciprocal of the square of the estimated standard errors of the log E50s. The model used to relate the log E50 of each congener to the structure was the second order logistic model. The logistic model becomes flat as the value of the expression within the square brackets of the equation becomes either positively or negatively large. This feature restricts the predicted values to lie in the range from m to M (minimum to maximum), thus improving the stability of the predictions.
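The bounding behavior of a logistic response surface can be illustrated with a generic form. The exact equation of the fitted model is not reproduced in this text, so the sketch below assumes a standard construction: a full second order polynomial in Clortho, Clpara, and Clmeta passed through a logistic function scaled between m and M. The function names and all coefficient values are hypothetical.

```python
import math


def second_order_terms(ortho, para, meta):
    """Intercept, linear, squared, and cross-product terms (10 in total)."""
    return [
        1.0,
        ortho, para, meta,
        ortho ** 2, para ** 2, meta ** 2,
        ortho * para, ortho * meta, para * meta,
    ]


def logistic_log_e50(ortho, para, meta, coefs, lower, upper):
    """Bounded prediction: lower + (upper - lower) / (1 + exp(-eta))."""
    eta = sum(c * t for c, t in zip(coefs, second_order_terms(ortho, para, meta)))
    return lower + (upper - lower) / (1.0 + math.exp(-eta))


# Hypothetical coefficients and bounds, not the fitted values of Table 3.
coefs = [0.5, -1.0, 0.3, 0.2, 0.25, -0.1, 0.0, 0.05, 0.0, -0.02]
prediction = logistic_log_e50(2, 1, 1, coefs, lower=1.0, upper=3.0)
print(prediction)
```

However large the polynomial becomes, the prediction stays between the two bounds. Under this assumed form, the 10 polynomial coefficients plus the two bounds and one log combination index would give 13 parameters, which matches the parameter count mentioned in the Discussion.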
The prediction of mixture E50s. Using the combination index and Equation 2, the log of the E50 of the mth mixture can be expressed as
$$\log E50_m = \log CI - \log \sum_{j=1}^{209} \pi_{jm}\, 10^{-\log E50_j} \tag{4}$$
where all logs are to the base 10 and CI is the combination index for the mth mixture. In fact, when one views each congener as a ...
Volume 105, Number 10, October 1997 * Environmental Health Perspectives Articles * Svendsgaard et al.
Figure caption (fragment): The color of the activity-concentration line represents the level of Clmeta: blue is 0, green is 1, brown is 2, orange is 3, and purple is 4. The wider the activity-concentration line, the less chlorine at the para position.
Empirical Bayesian analysis is discussed elsewhere (20). Briefly, when data are analyzed by using a simple weighted linear regression with one continuous independent variable, the error sum of squares has a chi-square distribution, with degrees of freedom equal to the residual degrees of freedom, if several conditions are met. These are 1) the elements of the dependent variable are normally distributed with a known variance, 2) the reciprocal of this variance is used as the weight, and 3) the dependent variable is linearly related to the independent variable. When this is the case, the empirical Bayes estimates (EBEs) of the true model can be used; they are points lying between the elements of the dependent variable and the straight line fitted by the regression. The smaller the error sum of squares, the closer the EBE is to the straight line.
We used the EBE in the case of weighted nonlinear regression with multiple independent variables. The weights must be estimated because the standard errors are unknown. Therefore, we expect the error sum of squares to be somewhat more variable than a chi-square distribution.
We tested 36 congeners so we could estimate their E50s directly. The E50s of the 173 untested congeners could be estimated from the fitted second order logistic model. Assuming there were indications of lack of fit of the model, we could use the EBEs instead. However, EBEs are known only for the 36 tested congeners. To estimate the EBEs for the untested congeners, linear splines were connected to the EBE points of the tested congeners. Those untested congeners with log Kow values less (or greater) than the smallest (or largest) tested congener were assigned the same EBE as the smallest (or largest) tested congener. This was done within each of six groups: the coplanar congeners with Clpara = 0; the coplanar congeners with Clpara = 4 or 2; the congeners with Clortho = 1; the congeners with Clortho = 2; the congeners with Clortho = 3; and the congeners with Clortho = 4.
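The interpolation rule can be sketched as below, assuming log Kow is the interpolation variable (as the text suggests) and that the tested log Kow values within a group are strictly increasing. The function name and the numbers are hypothetical.

```python
def interpolate_ebe(log_kow, tested_log_kow, tested_ebe):
    """Piecewise-linear interpolation with flat extrapolation at both ends.

    tested_log_kow must be strictly increasing.
    """
    if log_kow <= tested_log_kow[0]:
        return tested_ebe[0]
    if log_kow >= tested_log_kow[-1]:
        return tested_ebe[-1]
    for k in range(len(tested_log_kow) - 1):
        x0, x1 = tested_log_kow[k], tested_log_kow[k + 1]
        if x0 <= log_kow <= x1:
            y0, y1 = tested_ebe[k], tested_ebe[k + 1]
            return y0 + (y1 - y0) * (log_kow - x0) / (x1 - x0)


# Hypothetical tested congeners in one group: log Kow and EBE of log E50.
tested_log_kow = [5.0, 6.0, 7.0]
tested_ebe = [2.0, 1.0, 3.0]
print(interpolate_ebe(5.5, tested_log_kow, tested_ebe))
print(interpolate_ebe(7.8, tested_log_kow, tested_ebe))
```

Holding the end values constant outside the tested range avoids extrapolating a trend beyond the data, at the cost of ignoring any real change in activity there.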
The EBE for the ith observation (observed_i) was obtained as follows:
$$EBE_i = \lambda_i\, \mathrm{observed}_i + (1 - \lambda_i)\, \mathrm{fitted}_i \tag{5}$$
where $\lambda_i = \tau/(\tau + v_i)$, $\tau$ is the unexplained variance between log E50s, $v_i$ is the estimated variance of the log E50 of the ith congener, and fitted_i is the predicted value for the ith observation. The variance $\tau$ was estimated by adding a constant to the reciprocal of the weights. This constant was increased until the error sum of squares was less than the 90th percentile of a chi-square with the residual degrees of freedom. This constant was used as the estimate of $\tau$.
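The shrinkage calculation can be sketched as follows. Two assumptions are made here and are not taken from the paper: the weight on each observation is written as tau / (tau + variance), and the fitted values are held fixed while the constant is increased (the paper does not say whether the model was refit at each step). The function names, step size, cutoff, and data are hypothetical.

```python
def error_sum_of_squares(observed, fitted, variances, tau):
    """Weighted error sum of squares with weights 1 / (variance + tau)."""
    return sum((o - f) ** 2 / (v + tau)
               for o, f, v in zip(observed, fitted, variances))


def estimate_tau(observed, fitted, variances, chi2_cutoff, step=1e-3):
    """Increase tau from zero until the weighted error sum of squares
    falls below the chi-square cutoff (fitted values held fixed)."""
    tau = 0.0
    while error_sum_of_squares(observed, fitted, variances, tau) >= chi2_cutoff:
        tau += step
    return tau


def empirical_bayes(observed, fitted, variances, tau):
    """Shrink each observation toward its fitted value."""
    estimates = []
    for o, f, v in zip(observed, fitted, variances):
        lam = tau / (tau + v)  # weight given to the observation
        estimates.append(lam * o + (1.0 - lam) * f)
    return estimates


# Hypothetical log E50 data; the cutoff is supplied by the user.
observed = [1.0, 2.0, 3.0]
fitted = [1.5, 1.8, 2.4]
variances = [0.01, 0.02, 0.01]
tau = estimate_tau(observed, fitted, variances, chi2_cutoff=4.6)
print(tau, empirical_bayes(observed, fitted, variances, tau))
```

When the model already fits (tau stays at zero), every estimate collapses onto the fitted value; the larger the unexplained variance, the more each estimate moves toward its own observation.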

Results
Results of fitting an exponential concentration-activity model to 36 congeners and 3 commercial mixtures. Figure 2 shows the mean activity versus concentration over the region from 1 to 100 µM for the 36 congeners and 3 commercial mixtures. Note that 22 of the congeners have mean activities exceeding 50% of control. About 6 of the congeners barely exceed 10% of control. A difficulty was encountered when fitting the exponential concentration-activity model (Equation 1) to the data. A very large log E50 estimate for congener 54 resulted, and its estimated standard error (SE) was also very large. Examining the plot of the activity versus concentration for congener 54 indicated that the activity decreased with concentration at concentrations of 30 µM and greater. Once the data from these dose levels were excluded, the nonlinear regression converged. Some large log E50 standard errors were also noted for several other congeners. Their concentration-activity relationships were also plotted, and decreasing activity at increasing concentration was also noted for those congeners. These data points were also excluded. Altogether, 9 of the 212 data points were excluded. These were congeners 54 and 77 at concentrations of 30, 50, and 100 µM; congener 126 at concentrations of 50 and 100 µM; and congener 169 at a concentration of 100 µM. With these points deleted, the nonlinear regression was stable and converged to the estimates of log E50 in Table 2. It should be noted that congeners 77, 126, and 169 are coplanar congeners with Clpara = 2. There were 4 such congeners tested, and each was inactive.
The resultant slope estimate was 0.494 ± 0.033 (mean ± SE), and the log of the minimum was $m_0$ = -0.0168 ± 0.0079. The model for the ith congener is of the form log activity = $-0.017 + S_i\, d^{0.494}$, where the parameter $S_i$ can vary among the congeners and mixtures. On a log dose scale, the slope (0.494) is constant for all congeners and mixtures, and $S_i$ is an intercept term.
Results of structure-activity modeling. Table 3 shows the results of modeling the structure-activity relationship. The Q2 of 0.641 indicates that the model described by Equation 3 and Equation 4 is valid. The R2 of 0.866 is also very good. Figure 3 shows the residuals of the model versus log Kow. Figure 4 shows plots of the log E50 versus log Kow. Generally, we expect to see activity decrease and then increase as log Kow increases. The plots show that the model generally reflects this shape. Also, congeners with Clpara = 2 are generally less active than other comparable congeners. The error sum of squares was 68.32. When weighted regression is used and the weights are the reciprocal of known variances, we expect under normal theory that this error sum of squares would have a chi-square distribution with 26 degrees of freedom. Since our weights are estimated rather than known, this sum of squares could be inflated slightly, but not to this amount. A simulation study indicated that using estimated weights based on t-distributions with 21 degrees of freedom would increase the 90th percentile about 10%. The 90th percentile of a chi-square with 26 degrees of freedom is 35.6. Factors that tend to increase this sum of squares are nonnormality, the presence of outliers, model lack of fit, and between-experiment variability.
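The comparison behind the lack-of-fit statement can be written out. Treating the 10% inflation from the simulation study as a multiplicative allowance on the 90th percentile (an interpretation, not a calculation given in the paper), the adjusted cutoff is still far below the observed error sum of squares:

```latex
1.10 \times \chi^2_{0.90}(26) \approx 1.10 \times 35.6 \approx 39.2 \;<\; 68.32
```

On this reading, estimated weights alone would not account for the size of the error sum of squares, which is consistent with the other factors listed above contributing to it.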
The EBEs are calculated so there will be no significant lack of fit. Figure 5 shows the EBE plotted versus log Kow for comparison to Figure 4.
The ratio of the estimate to its standard error should follow a t-distribution with 26 degrees of freedom for the estimates in Table 3. Because the ratio corresponding to the log of the combination index exceeds 2, ...
Table 4 shows the model predictions of the log base 10 E50s for all congeners together with the 95% confidence limits. Also shown are the log Kow. Only three congeners were tested whose log Kow exceeded 7. The Bayes estimates of log E50 are also shown.
Abbreviations: IUPAC, International Union of Pure and Applied Chemistry; E50, the concentration producing an activity 50% above the control activity; CL, 95% confidence limit.

Discussion
Adequacy of the concentration-activity model. A logistic model was fit to the activity-concentration data; the estimation of the maximum was unstable so an exponential model was used instead. This model is similar to the logistic, but it has no maximum. In fitting the exponential, we deleted data at high concentrations when the activity decreased with increasing concentration. This reversal in slope might indicate that the concentration-activity curve has reached a maximum and, for some congeners, the activity might never attain 50% above control. However, the congener could metabolize to a more soluble metabolite, justifying this deletion of data.
On the other hand, congeners whose activity does not attain 50% above control might be considered inert with respect to the E50, but, for example, active with respect to the E10. Whether the congener is inert or weakly active is difficult to discern; however, this distinction is of less concern in mixture prediction.
Adequacy and validation of the model relating activity to structure. One objective of this study was to determine how reliable the predictions were for the untested congeners. The internal validation was very good with a Q2 = 0.641. The use of a logistic model to describe the relationship between activity and chemical structure was very successful. The graph of the logistic model is a sigmoidal-shaped curve, so the range of predictions from this model is between two fixed parameters. Therefore, the prediction of extreme E50 values is avoided. The use of the logistic model causes the smallest E50 prediction to be near the average of other small E50s, rather than possibly being a value lower than any calculated E50. Because our model is empirical, this reluctance to extrapolate is reasonable. Logistic regression in this instance is very similar to discriminant analysis with two classes: active and inactive congeners.
There were indications of lack of fit of the model. In this case, one might predict the mixtures with the estimated E50s rather than the predictions from the logistic model. This strategy only works for tested congeners, and we still need a way to ... There are various ways this process could be improved. We want to evaluate both prediction methods using various types of PCB environmental mixtures before we decide which prediction method is best.
Comparison to other approaches for determining structure-activity relationships.
It is enlightening to compare this approach with structure-activity modeling using Hansch analysis as described by Martin (21). Martin believes "the Hansch method is most suitable for a data set that has the following characteristics: 1) the compounds should be structural analogs that are identical in the structure of the pharmacophore, 2) all analogs should produce their biological effect by interacting with the same biological receptor(s), 3) it should be possible to derive quantitative measures of the physical properties of the analogs, 4) there should be enough compounds in the set that one can statistically examine a number of properties, . . . 5) the variation in potency between different analogs should be substantially larger than the error in measuring potency, and 6) the relevant physicochemical properties should be varied properly within the series." It is not clear that the 209 PCB congeners satisfy the Hansch assumptions. They do not all interact with the same receptor. Assumption 2 generally requires the dose-response slopes to be equal. We have fit the activity curves with the same slope, and this is a good approximation at this stage. More congeners should be tested to determine whether the slopes vary with subsets of the congeners. Thus, there is a lack of evidence to conclude that the slopes differ or that the shapes of the concentration-activity curves are not similar. On the other hand, there is increasing evidence that not all congeners interact with the same receptors. Therefore, PCB congeners may not satisfy this assumption, and Hansch analysis may not be appropriate. However, Hansch analysis may be appropriately applied to subsets of the 209 congeners once those subsets that satisfy the assumptions have been identified.
We have previously stated why we did not use mechanistically based variables in our model. However, Verhaar et al. (22) have shown an advantage of using mechanistically based variables in risk assessments of those complex mixtures in the environment whose chemical components are difficult to measure. Those interested in such approaches may find our predicted E50s useful to relate to mechanistically based variables.
Because this is an empirical approach, there are limitations. One cannot be as comfortable in the predictability from an empirical approach as from an approach that uses mechanistically based variables. However, this approach has achieved a prediction error that seems acceptable; a mechanistically based approach that includes this target population of congeners and a similar activity endpoint has not yet been completed.
Mixture risk assessment. A very practical approach to predict PCB mixture activity is to use the dose-addition assumption. Future studies of the joint action of PCB mixtures should provide a better evaluation of this issue and may suggest some modification to dose additivity is required. Perhaps it will be discovered that the joint action of environmental mixtures tends to be more antagonistic on average. A simple correction could be introduced to account for this deviation from additivity. The use of this test system to study mixtures is particularly sensitive to nonadditivity because all mixture combinations (up to 12) can be completed in one replicate using the 12-well culture plate.
A drawback to the dose-addition method is that the components of the mixture need to be known, and this information can be difficult to obtain for some complex mixtures in the environment. However, if one knows the percent of chlorine in a PCB mixture, this and other knowledge could be used to calculate a probability interval for the E50 of an environmental PCB mixture that would be useful in determining risk. This calculation would be based on predicted E50s as we have provided and might require the assumption of a probability distribution for PCB congeners.
Congener selection. Tysklind et al. (23) discussed a procedure for selecting PCB congeners for use in quantitative structure-activity modeling. They restricted the congeners to the 154 tetra- through hepta-chlorinated congeners. Some restriction is necessary in view of the solubility problems encountered with most of the test systems. The congeners were described using 47 physicochemical variables, and principal component analysis reduced the number of variables to four orthogonal summary variables. The full 2^4 factorial design plus four center points required 20 congeners to be tested. These 20 congeners were divided into two groups. One group of 10 congeners was labeled the training set to be used for model development. The second group was labeled the validation set. The division occurred so that both sets of congeners typified the whole chemical domain of the four summary variables.
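The size of such a design follows from simple counting: two levels for each of four summary variables give 16 combinations, and four center points bring the total to 20. The enumeration below is a minimal sketch; coding the levels as -1, +1, and 0 is a common convention assumed here, not a detail taken from Tysklind et al.

```python
from itertools import product

# Two levels (-1, +1) for each of four summary variables: 2**4 = 16 runs.
factorial_points = list(product([-1, 1], repeat=4))

# Four center points at the middle of the design space.
center_points = [(0, 0, 0, 0)] * 4

design = factorial_points + center_points
print(len(factorial_points), len(design))
```

In practice each design point would be matched to the real congener whose summary-variable scores lie closest to it, since arbitrary combinations of the four scores need not correspond to an existing compound.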
Given our goal of mixture prediction and the problems encountered with solubility, the approach considered by Tysklind et al. (23) does not seem appropriate. It is efficient to restrict testing to those congeners that are soluble, as their approach does. However, some sampling needs to be done to determine the limits of that region. These limits could well vary with the test system and other experimental factors, such as the use of metabolites if the parent congener is not soluble.
In addition, it seems that the choice of congeners depends on the balancing of several objectives. For example, we can test those congeners found in high percentages in environmental mixture samples and reserve for prediction those congeners that are less prevalent in the environment. A design with characteristics similar to the Tysklind design could be used so that good estimates of the untested congeners can be obtained.
Do more congeners need to be tested? Martin (21) has suggested that a 3 to 1 ratio of chemicals to parameters be used for Hansch analysis. Our estimation of the logistic second order model required the estimation of 13 parameters using 39 observations (congeners and mixtures). Therefore, our ratio of chemicals to parameters meets the minimum requirements suggested by Martin (21). In the area of general medical research, which is probably subject to more sources of prediction error than QSAR, Neter et al. (24) have suggested the use of a 6 to 1 ratio as a rule of thumb. Our more modest goal of prediction and description rather than variable selection may not require as large a ratio. Also, our use of the logistic model reduces the range of prediction. This reduction in range should limit the prediction error, which should translate into a less stringent requirement for precise estimates of the regression coefficients. The precision of these estimates is the reason for requiring a high ratio of chemicals to variables.
We think we have tested enough congeners. We have tried to externally validate various models based upon data from 17, 24, and 28 congeners. As the number of congeners fit to the model increased, the relative number of prediction errors decreased. The width of the confidence intervals as shown in Table 4 depends upon the fit of the model and how well the tested congeners span the space of the untested target congeners. Because of insolubility limitations, testing more congeners to better span the space of the target congeners is not likely to yield smaller confidence intervals. Testing more congeners to justify the use of a more biologically based model such as a main effect model does not seem productive. Therefore, no further single congener testing using this test system is planned.
Further testing of mixtures is planned, however. We have tested a few other mixtures and expect to learn some new information about mixtures. When the model is fit to E50s arising mainly from mixture testing rather than from single congener testing, we initially expect a worsening of the fit of the model. This is due to information that the present model does not consider. For example, solubility might not have a dose-addition joint action and would interfere less in data consisting mainly of mixtures. After this phase has passed and enough mixtures have been tested, we expect the model to provide satisfactory predictions of mixtures.
Risk assessment implications. Although further research is needed to better understand whether this test system (PKC activation/translocation) has any role in the neurotoxicity of PCBs, data from this test system eventually may be used to assess the biological activity of PCB mixtures in a risk assessment. It is believed that this analysis of the data could provide benchmark estimates for mixtures following the method used by Crump (25) with some modifications. These modifications may include the method used to calculate confidence limits, the need for a lower bound on the slope estimate, and the use of a concentration causing a percentage of subjects to be affected rather than the E50.

Conclusion
This approach toward prediction and relating structure to activity is novel in several ways. The laboratory effort is reduced because our exponential model does not have the data requirement of estimating a maximum. The model has been validated, but there were indications of lack of fit. Also, the predictions of E50s for the three commercial mixtures based on dose addition were lower than the estimated E50s. Future mixture studies should indicate how the dose-additivity equation needs to be modified to predict various environmental mixtures. Our empirical approach is not limited to the assumptions of Hansch analysis. This approach, suitably modified, should be useful when applied to other test systems and should improve the ability to evaluate PCB congeners for hazard identification.