Population-Based in Vitro Hazard and Concentration–Response Assessment of Chemicals: The 1000 Genomes High-Throughput Screening Study

Background: Understanding of human variation in toxicity to environmental chemicals remains limited, so human health risk assessments still largely rely on a generic 10-fold factor (10½ each for toxicokinetics and toxicodynamics) to account for sensitive individuals or subpopulations. Objectives: We tested a hypothesis that population-wide in vitro cytotoxicity screening can rapidly inform both the magnitude of and molecular causes for interindividual toxicodynamic variability. Methods: We used 1,086 lymphoblastoid cell lines from the 1000 Genomes Project, representing nine populations from five continents, to assess variation in cytotoxic response to 179 chemicals. Analysis included assessments of population variation and heritability, and genome-wide association mapping, with attention to phenotypic relevance to human exposures. Results: For about half the tested compounds, cytotoxic response in the 1% most “sensitive” individual occurred at concentrations within a factor of 10½ (i.e., approximately 3) of that in the median individual; however, for some compounds, this factor was > 10. Genetic mapping suggested important roles for variation in membrane and transmembrane genes, with a number of chemicals showing association with SNP rs13120371 in the solute carrier SLC7A11, previously implicated in chemoresistance. Conclusions: This experimental approach fills critical gaps unaddressed by recent large-scale toxicity testing programs, providing quantitative, experimentally based estimates of human toxicodynamic variability, and also testable hypotheses about mechanisms contributing to interindividual variation. Citation: Abdo N, Xia M, Brown CC, Kosyk O, Huang R, Sakamuru S, Zhou YH, Jack JR, Gallins P, Xia K, Li Y, Chiu WA, Motsinger-Reif AA, Austin CP, Tice RR, Rusyn I, Wright FA. 2015. Population-based in vitro hazard and concentration–response assessment of chemicals: the 1000 Genomes high-throughput screening study. Environ Health Perspect 123:458–466; http://dx.doi.org/10.1289/ehp.1408775


Introduction
During the past decade, considerable progress has been made in high-throughput approaches for toxicity testing to address challenges posed by a) expense and ethical constraints in animal testing, b) uncertainties in applicability of animal models to human susceptibility, and c) a large and increasing number of chemicals, many of which have never been subjected to adequate toxicity testing. A vision for screening by highthroughput biochemical and cell-based assays to improve understanding of toxicity response and modes of action was articulated by Collins et al. (2008). In vitro testing of human cell lines meets human relevance standards (Collins et al. 2008) and serves as a bridge to in vivo assessment. Beyond characterizing an "average" response to chemi cals, next-generation toxicity testing may improve understanding of population variability, identify vulnerable subpopulations, and refine uncertainty factors used in risk assessment (Zeise et al. 2013).
The Tox21 initiative (Tice et al. 2013) is systematically screening thousands of chemicals against hundreds of molecular and cellular toxicity phenotypes. Cell-based viability assays are an established approach to prioritize chemicals or classify them into hypothe sized modes of action ). However, for environmental chemicals, the number of cell lines has typically been limited to dozens (Lock et al. 2012;O'Shea et al. 2011), sometimes representing multiple species . Thus, an understanding of human population variability and the role of constitutional genetic variation remains elusive. Epidemiological approaches have been limited to a few chemicals with high occupational or other exposure (Zeise et al. 2013), or have quantified poly morphic toxico kinetic variation mainly in drugmetabolizing enzymes (Ginsberg et al. 2009). Epidemiological studies provide little basis to compare chemicals, including new chemicals with little or no data, and risk assessments still typically assume that more sensitive individuals or subpopulations are adequately protected by applying an "uncertainty" factor of 10, the product of factors of 10 ½ each for toxico kinetics and toxico dynamics (Zeise et al. 2013).
Screening of lymphoblastoid cell lines (LCLs) is an established approach to identify genetic variants that influence cyto toxic response to pharmaceuticals, especially chemotherapeutic agents (Wheeler and Dolan 2012). Choy et al. (2008) challenged the value of these approaches, primarily because of the effects of growth rates and technical factors. However, enrichment of human blood expression quantitative trait loci has been established among weakly significant chemo therapeutic drug-susceptibility loci . With the advent of statistical methods that are purpose-built for cyto toxicity profiling, several robust associations have been identified (Brown et al. 2014).
For environmental chemicals, the extent of population variation in in vitro cytotoxicity may serve as a surrogate for cellular variation in the toxico dynamic relationship between systemically available concentrations and toxic responses (Zeise et al. 2013). Such data could inform a chemical-specific adjustment factor for human toxico dynamic variability, replacing the usual factor of 10 ½ [International Programme on Chemical Safety (IPCS) 2005]. Direct connections to human risk assessment must consider genetic variation at low concentrations relevant to human exposure. This goal may conflict somewhat with maximization of power to identify specific genotype-susceptibility associations because the effects of genetic variation may be apparent only at higher concentrations. Furthermore, for both these goals, the sample sizes in studies of environ mental chemical cyto toxicity has often been inadequate to establish population variation or to assess genetic association for these complex traits with small effect.
Here, we describe profiling 1,086 LCLs for cyto toxic response to 179 chemicals, each assayed over a range of eight concentrations spanning six orders of magnitude. The compounds were primarily chemicals of environ mental concern, cover a wide range of in vivo toxicity hazards, and were drawn from a larger set of 1,408 compounds used for high-throughput screening (Lock et al. 2012;O'Shea et al. 2011;Xia et al. 2008). We selected the LCLs from the 1000 Genomes Project (1000Genomes Project Consortium et al. 2012, spanning a variety of ancestral populations. We assessed cyto toxic response using an EC 10 (effective concentration, 10th percentile) and performed genome-wide association mapping using both EC 10 and the entire eight-concentration profile as a multivariate vector.

Materials and Methods
Chemicals and cytotoxicity profiling. The chemicals evaluated were a subset of the National Toxicology Program's 1,408 chemical library as described by Xia et al. (2008). We dissolved chemicals with dimethyl sulfoxide (DMSO) into eight stock concentrations transferred into 1,536-well plate format via a pin tool station (Kalypsys Inc.). The final concentrations ranged from 0.33 nM to 92 μM. The negative control was DMSO at 0.46% vol/vol, and the positive control was tetra-octyl-ammonium bromide (46 μM). We used the CellTiter-Glo Luminescent Cell Viability (Promega) assay to assess intra cellular ATP concentration, a marker for viability/ cytotoxicity, 40 hr after treatment. We used a ViewLux plate reader (PerkinElmer) to detect luminescent intensity.
Cell lines. We acquired 1,104 immortalized lymphoblastoid cell lines from the Coriell Institute. We randomly divided cell lines into screening batches, equally distributed by population and sex in each batch without regard to family structure. We cultured cells at 37°C with 5% CO 2 in RPMI 1640 media (Invitrogen) supplemented with 15% fetal bovine serum (HyClone) and 100 U/mL penicillin/100 mg/mL streptomycin (Invitrogen), replacing media every 3 days. We plated cells with viability of > 85% into tissue culture-treated 1,536-well white/ solid bottom plates (Greiner Bio-One) at 2,000 cells/5 μL/well using a flying reagent dispenser (BioRAPTR, Beckman Coulter). We seeded each cell line on multiple plates (1-2 plates within or between batches). We fit all chemicals to a single plate.
Genotypes. The primary genotypes were the Illumina HumanOmni2.5 platform (ftp:// ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/ working/20120131_omni_genotypes_and_ intensities) and available for 1,086 lines, excluding SNPs with call rate < 95%, minor allele frequency (MAF) < 0.01, or HWE p-value < 1 × 10 -6 . We chose a maximal subset of 884 samples to remove first-degree relatives ("unrelated" set) using geno types and sample annotation. Of the 884 samples, genotyped SNPs from the platform were available for 761. The remaining 123 samples were genotyped by HapMap (http://hapmap. ncbi.nlm.nih.gov/downloads/genotypes/ hapmap3_r3/plink_format), and we imputed for the filtered Illumina SNPs using MaCH (Li et al. 2010). We used a set of 875 samples from the 1000 Genomes set (not restricted to these cell lines) as an imputation reference, producing 1.3 million SNPs for primary analysis. A further subset of 690 unrelated individuals from 1000 Genomes Phase I had more complete sequencing data, with a total of 12 million filtered SNPs.
Cytotoxicity EC 10 estimation, outlier detection, and variability characterization. We normalized cyto toxicity data (see Supplemental Material, Figure S1) relative to positive/negative controls. Although the primary association mapping method was a multi variate treatment of cyto toxocity response across all concentrations for each chemical, we also used a single cyto toxicity dose summary per chemical and cell line. We devised an EC 10 using the logistic model where ε ~ N(0, σ 2 ), y is the observed normalized signal representing the proportion of surviving cells (the "cyto toxicity value"), d is log(concentration), and θ max is the response value at zero concentration. We set θ min = 0 to avoid estimation difficulties for chemicals with low cyto toxicity. We made an exception for a very few chemicals for which the cyto toxicity value at the highest concentration was > 0.4, fixing θ max using the observed cytotoxicity at the maximum concentration, and inspection revealed good fits in such instances. Although, in principle, θ max should have been 1.0, several plates exhibited a drift from this value, and the parameter was estimated from the data. We used maximum likelihood by numeri cal optimization in R v2.15 (R Core Team 2012) to fit [β 0 , β 1 , σ 2 , θ max ]. We devised automatic outlier detection, dropping each concentration value in succession and flagging values for which the maximum likelihood improved by a factor of ≥ 10 (see Supplemental Material, Figure S2), refitting the model using non-outlying observations.
We characterized inter individual variability using the distribution of estimated EC 10 across cell lines. Summary statistics, including the mean, SD, and selected quantiles (q 01, q 05, q 95, and q 99 ), were calculated for log(EC 10 ) (see Supplemental Material, Table S1). For risk assessment, the relevant variability measure is the ratio of EC 10 for the median compared with a "sensitive" individual, because the uncertainty factor is intended to cover the more sensitive population "tail" (i.e., those for whom a lower concentration elicits effect). There is no standard definition for a sensitive population threshold, so we selected 1% as a nominal value that could be estimated reliably from a sample size of 1,000 individuals, and we defined a toxico dynamic variability factor as 10 q50-q01 analogous to a chemical-specific adjustment factor for human toxico dynamic variability.
Attenuated variability estimates to account for sampling variation. To account for the inflationary effect of sampling variance, we considered the model logEC 10 = μ + ε, where μ is the underlying true (unknown) logEC 10 and ε represents sampling variation. We assumed each chemical has an underlying true sampling variability of σ s 2 per observation; observed log EC 10 values were, in many instances, averaged across multiple observations. For an individual measured n´ times, var(ε) = σ s 2 /n´. We conservatively estimated σ s 2 by computing the sample variance for paired replicate instances for the chemical across different batches and averaging across pairs. Then we computed a variance inflation factor (VIF): where mean(n´) is the average number of replicates per individual. Finally, we considered individual measurements to have been inflated by VIF 1/2 so that, for example, the "shrunken" toxico dynamic variability factor is 10 (q50-q01)/VIF 1/2 . Comparison with estimated in vivo toxico dynamic variability. The World Health Organization recently reviewed available data on human in vivo toxico dynamic variability as part of a new harmonized framework (IPCS 2014). For each of the available volume 123 | number 5 | May 2015 • Environmental Health Perspectives data sets, variation in systemic concentration eliciting a toxic response was represented by a geometric SD (GSD) for population toxicodynamic variability based on fitting to a lognormal distribution [Tables A4.5 and A4.6 in IPCS (2014)]. We calculated an analogous toxico dynamic variability factor using our in vitro data as the ratio of the median to the 1% quantile, equal to GSD 2.326 , where the exponent is the 99% standard normal quantile, forming a basis for comparison with in vivo summaries.
Multivariate association analysis. We used MAGWAS multivariate analysis of covariance model (Brown et al. 2012b) for primary association mapping. The approach uses the full concentration-response profile instead of a univariate summary (such as EC 10 ), with advantages in robustness and power under a variety of association patterns. The model for the jth individual and genotype i for the chemical/SNP is where Y ij is the response vector (across eight concentrations) for the jth individual having genotype i; X ij is the design matrix of covariates, including sex, indicator variables for laboratory batch, and the first 10 genotype principal components, and μ i is the eightvector of parameters modeling the effects of genotype i. The multivariate normal error model allows dependencies in the variancecovariance matrix Σ. We obtained p-values using Pillai's trace (Pillai 1955). Because this method makes use of asymptotic theory, we removed markers with < 20 individuals representing any genotype, leaving 692,013 SNPs for analysis. Heritability. We calculated the proportion of chemical response variation due to genetic variation (heritability) for each compound using the mean batch-adjusted EC 10 value across the 401 related individuals belonging to nuclear family trios. We used the Multipoint Engine for Rapid Likelihood Inference (MERLIN) (Abecasis et al. 2002) package to estimate heritability. Consideration of covariates, including sub population by ethnicity [Utah residents with European ancestry (CEU), Mexican ancestry in Los Angeles (MXL), and Nigeria (YRI)] and population stratification (first three principal components) did not have a substantial effect (not shown). In addition, we performed variance component analysis and hypothesis testing with Sequential Oligogenic Linkage Analysis Routines (SOLAR) (Almasy and Blangero 1998) to evaluate the significance and standard error for each heritability.
Using the 884 unrelated individuals, we also ran genome-wide complex trait analysis (GCTA) (Yang et al. 2011) to estimate heritability, using default settings and the 1.3 million SNPs. To assess whether the concordance between MERLIN and GCTA was as expected, we used the 179-vector of MERLIN heritability estimates as a hypothetical true set of heritabilities. We used these "true" values and associated standard errors from both MERLIN and GCTA to simulate independent normal errors to create 10,000 paired vectors of MERLIN and GCTA estimates, which we then compared.

Results
Cell lines and genotypes. An initial set of 1,104 LCLs was representative of nine geographically and ancestry-diverse populations: Utah residents with European ancestry (CEU); Han Chinese in Beijing, China (CHB); Japanese in Tokyo, Japan (JPT); Luhya in Webuye, Kenya (LWK); Mexican ancestry in Los Angeles, California (MXL); Tuscans in Italy (TSI); Yoruban in Ibadan, Nigeria (YRI); British from England and Scotland (GBR); and Colombian in Medellin, Colombia (CLM). A few cell lines (18; 1.6%) were not viable or grew very slowly, or they had insufficiently available genotypes; therefore, the final set consisted of 1,086 cell lines.
To reduce multiple comparisons, we initially focused on approximately 1.3 million markers typed on the Omni 2.5 platform and further filtered by MAF. Because 172 individuals had not been genotyped on the platform, dosage imputation was performed using the appropriate 1000 Genomes reference population. We performed separate analyses on 400 individuals belonging to parent-child trios (not all complete) in the CEU (164), MXL (83), and YRI (153) populations, and on a maximal set of 884 individuals in the remaining populations with no first-degree relationships (unrelateds). We also performed association analyses using a larger set (~ 12 million) of typed SNPs available from the sequencing data. Figure 1A shows the distribution of populations and continental ancestry. We randomly divided LCLs into screening batches with equal distribution of populations and sex in each batch, without regard to family structure. The major HapMap/1000 Genomes continental ancestry populations were represented, as well as admixed populations from the Americas ( Figure 1B).
Cytotoxicity profiling. Supplemental Material, Figure S1, shows a flow chart of the data analysis from cyto toxicity profiling across eight concentrations ranging from 0.33 nM to 92 μM. We used logistic curve fitting with outlier detection (see Supplemental Material, Figure S2) to obtain EC 10 values, which were batch-corrected and averaged across replicates for each cell line.
To place our study in context, we reviewed comparable studies, identifying 19 reports (see Supplemental Material, Table S2). These studies had more than one chemical, except for Brown et al. (2012a), and at least 50 cell lines. Figure 2A depicts a heat map of the cytotoxicity measurements across cell lines and chemicals, and shows, to scale, the size of the other studies in terms of cell lines and number of chemicals/drugs. In these terms, our study is an order of magnitude greater than any single previous study, and several times larger than the other reports combined.
For approximately 700 cell lines for which there was at least one replicate plate, Figure 2B depicts the EC 10 values for replicates (r = 0.90). We assayed 9 of the chemicals in duplicate on each plate, and duplicate chemicals showed similar median EC 10 values and ranges of variability ( Figure 2C). The entire range of EC 10 values across all chemicals exhibited remarkable cyto toxicity variation ( Figure 2D). Only one other report has been of similar scale in number of chemicals [240 chemicals investigated by Lock et al. (2012)]. However, our comparisons are much more definitive in the ability to rank and  prioritize compounds by cyto toxic activity because of the large number of cell lines [n = 1,086 used here vs. n = 81 described by Lock et al. (2012)]. Figure 3A shows EC 10 estimation for all cell lines for an illustrative chemical β-nitrostyrene, as well as results from the logistic fit applied to the pooled data. The histogram depicts individual EC 10 estimates, showing overall variation of more than an order of magnitude. To quantify sensitivity variation, we recorded the 1st and 50th percentiles of log EC 10 values for each chemical, and refer to the natural-scale quantile difference 10 (q50-q01) as a "toxicodynamic variability factor." Figure 3B shows the range in these factors across chemicals and as a function of median EC 10 values. The figure also shows a shrunken estimate of the true under lying distribution after removing inflation due to pure sampling variation. For 30 chemicals a shrunken estimate could not be determined, so only 149 chemicals are shown. About half of these chemicals show a shrunken range < 10 ½ ; however, for some chemicals the estimated cyto toxicity range is > 10 (see Supplemental Material, Table S1). Figure 3C shows the cumulative distribution of in vitro toxicodynamic variability factors across 149 chemicals in comparison to in vivo toxico dynamic variability factors across 34 chemicals (IPCS 2014). The distributions are strikingly similar, with medians equal to 3.04 (90% confidence intervals of 1.48-10.3) in vitro and 3.10 (1.70-38.5) in vivo, and not significantly different (Kolmogorov-Smirnov p = 0.548).
Next, we profiled the EC 10 for each chemical by averaging within each population. Hierarchical clustering of these averaged profiles ( Figure 3D) shows general assortment by ancestry, although variation was generally greater within than across populations. Although a large number of chemicals showed significant EC 10 variation across populations or by sex (false discovery q < 0.05; see Supplemental Material, Table S3), this variation was modest; two examples are shown in Figure 3E.
Heritability and mapping. Trio-based analysis provided evidence of additive heritability for 17 chemicals (q < 0.2), with significant trio-based heritability estimates (h 2 ) ranging from approximately 0.25 to approximately 0.5 ( Figure 3F; results for all chemicals shown in Supplemental Material, Table S4). We augmented this analysis by essentially independent heritability estimation using GCTA (Yang et al. 2011) performed using the maximal set of 884 unrelated individuals. GCTA-based h 2 ranged from approximately 0.4 to 0.8 for 34 significant chemicals (see Supplemental Material, Figure S3A,B). Correlation of these two heritability estimates was modest (Spearman r = 0.22, p = 0.0026) but highly consistent with simulations (average r = 0.24) as described in "Materials and Methods." Our use of EC 10 values was motivated by relevance to human health assessment practices; however, elucidation of the underlying genetic mechanisms may be more powerful without assumptions about the point of departure. Moreover, EC 10 is not sensitive to genetic influences apparent only at high concentrations. We thus adopted a three-stage approach to mapping, using 10 genotype principal components and sex as covariates. For the primary analysis, using the unrelated individuals, we applied the multi variate MAGWAS approach (Brown et al. 2012b), sensitive to any pattern of variation of cyto toxicity measurements due to genotype. Second, for the same individuals, we used EC 10 values as a quantitative pheno type in regression analysis for an additive SNP model, using the larger set of 1.3 million SNPs (chr1-X). For individual SNPs, this analysis identified associations that might have been missed by MAGWAS and allowed us to investigate pathway-based associations (Schaid et al. 2012). Finally, to capture a larger number of SNPs and variants with lower MAF ), we applied the EC 10 regression approach to 690 of the unrelated individuals who were among 1000 Genomes Phase I, and used approximately 12.4 million variants with MAF ≥ 0.01. Preliminary analysis indicated pheno type outlier effects causing spurious significant findings due to the lower MAF threshold; after applying an initial filter of association p < 5 × 10 -8 , we recomputed the chemical × SNP analyses after applying an inverse quantile normalization to EC 10 . We deemed each chemical worthy of separate investigation and applied perchemical false discovery control, following proposals that SNPs with false discovery rates q < 0.10 be declared significant (van den Oord and Sullivan 2003). Table 1 shows these 48 chemical-SNP associations, after removing redundant regional findings within ± 1 Mb. The nearest gene is reported, along with partial R 2 , the portion of variance explained by MAGWAS across the concentrations after considering covariates. The most significant MAGWAS findings tend to have larger partial R 2 (see Supplemental Material, Figure S4). Table 1 shows data for each chemical, but a re-ranking by p-values revealed that the top 10 significant associations includes three solute carriers (SLC7A11 for 2-amino-4-methylphenol, SLC39A14 for 1,3-dicyclohexyl carbodiimide, and SLCO3A1 for titanocene dichloride), the transmembrane protein TMEM196 for N-isopropyl-N´phenyl-p-phenylenediamine, and NFAT5, which activates several solute carriers in response to osmotic stress, for o-aminophenol. Our findings suggest a major role for membrane proteins and solute carrier transporters in mediating cyto toxicity, as has been reported for the chemotherapeutic agent paclitaxel (Njiaju et al. 2012).
The most significant MAGWAS association (p = 8.4 × 10 -10 ) was 2-amino-4-methylphenol at rs13120371 in the 3´ UTR of The result was highly significant on a perchemical basis (q = 0.0006), and at the significance threshold for the entire combined set of SNPs × chemicals (q = 0.10). Figure 4A,B shows the corresponding Manhattan and regional plots. Same exact SNP also appeared with q < 0.10 for methyl mercuric chloride and N-methyl-p-aminophenol sulfate (Table 1). Comparative curves show that the differences in cyto toxic response appear mainly at the highest concentration ( Figure 4C). The plot illustrates the contrast between EC 10 , which did not differ significantly by genotype, and the multi variate MAGWAS finding, which is sensitive to concentration-response variation.
Supplemental Material, Table S5 shows results from the EC 10 regression analyses, with all significant findings (per-chemical q < 0.10) shown after removing redundant regional findings (63 unique chemicals, 260 unique nearest gene assignments). For many chemicals, we observed the effects of genotype both for EC 10 and across the multivariate response, and the two approaches provided similar evidence (see Supplemental Material, Figure S5). At the false discovery rate of < 0.1, only approximately 18 unique chemicals would be expected to appear in Supplemental Material, Table S5. SNPs in four genes appear for three or more chemicals: GRIP1 (glutamate receptor interacting protein 1), which directs localization of transmembrane proteins; FMN2, a component of p21-based cell cycle arrest; DNER, a trans membrane protein associated with glioblastoma propagation; and the cell membrane cadherin CDH13, an epithelial tumor suppressor. As we observed with MAGWAS analysis, membrane-localized proteins appear to play an important role. Because EC 10 values were available for 179 chemicals, we found that GCTA-based heritability estimates are largely reflected in a tendency toward small p-values, a phenomenon that is difficult to discern for single-trait GWAS studies (see Supplemental Material, Figure S3C). Supplemental Material, Table S6 shows  Figure 4D). Using a standard false-discovery computation, we estimated the proportion of true discoveries for the SNP across the chemicals as 0.25, a significant trend that remained even after removal of the three top MAGWAS-identified chemicals. The estimated number of true discoveries, corresponding to an estimated 44 chemicals showing true cyto toxicity association with rs13120371, is subject to considerable sampling variation. Nonetheless, the data indicate that SLC7A11 may be a cyto toxicity mediator, and a role for SLC7A11 has been proposed in glutathione-mediated chemotherapeutic resistance (Huang et al. 2005).
We performed "pathway" association analysis of gene sets/ontologies for EC 10 phenotypes and the 1.3-million Omni 2.5 SNPs using gene set scan (Schaid et al. 2012) which computes significance of SNPs, genes, and ontologies. Eleven chemicals had significant pathways, and several chemicals showed significant associations with immune-response pathways and ontologies (see Supplemental Material, Table S7) at a family-wise error rate of < 0.05.

Discussion
Despite early concerns over the ability to map meaningful response traits in LCLs and questions about this model's relevance to toxicity studies of chemicals that require metabolism, our results suggest that large sample sizes-on the order necessary for mapping human complex traits (Goldstein 2009)can overcome challenges. Importantly, we have demonstrated the feasibility of using an in vitro population-based model system for assessment of individual variability in nextgeneration risk assessment (Zeise et al. 2013). Although here we present our results as a survey, results for each chemical screened will be useful for future targeted investigations. Moreover, use of a common protocol enables valid comparisons across chemicals that are difficult to perform across individual studies.
Quantitative high-throughput screening of a large number of compounds affords detailed investigation of concentration response, which is critical for safety margins and informed decisions on relative hazard ranking/prioritization. Most similar in vitro studies have charac terized the concentration effect through EC 50 (Neubig et al. 2003); however, there are many limitations of this approach for screening data (Sand et al. 2012). Here, we derived EC 10 or no-effect values to describe variability across cell lines and among chemicals, and for GWAS analyses. In addition, we used the full complement of the concentration-response values for multi variate analysis.
To date, high-throughput screening for chemical prioritization has been largely limited to small numbers of genetic variants, and to models that are limited in diversity. Although cyto toxicity in LCLs is just one among multiple measures of toxicity, the availability of > 1,000 samples from global populations allows for precise estimation of population response range, filling a critical need (Collins et al. 2008). Thus, prioritization may be based on central tendency (e.g., median) or sensitive sub population (population quantile) estimates of activity, depending on contextual suitability.
The data generated using this approach also may help refine risk assessment (Zeise et al. 2013), potentially providing the basis for chemical-specific factors for toxico dynamic variability, replacing the canonical 10 ½ uncertainty/assessment factor. Cytotoxicity is often considered a crude measure, but for most chemicals evaluated in the ToxCast program, it constitutes a large proportion of "signal" detected in various high-throughput assays. Therefore, cyto toxicity may often be an appropriate surrogate for systemic toxicity.
We also compared our results on interindividual variability to those collected from human studies (IPCS 2014). Although data on in vivo human toxico dynamic variability are limited, we found that they appear largely consistent with our in vitro estimates. Interestingly, both in vivo and in vitro data suggest that the usual 10 ½ factor is appropriate "on average," but for roughly half of the chemicals the estimated factor would be greater. An estimate of the extent of overall human variability would also necessitate incorporating toxico kinetic variability (Judson et al. 2011).
Beyond immediate utility of our data in health assessments, we observed in GWAS analyses that genes with protein localization to cell membranes, including solute carriers, are enriched. Solute carrier transporters have been investigated as potential mediators of cyto toxicity for chemo therapeutics (DeGorter et al. 2012;Njiaju et al. 2012), controlling cellular influx and efflux of drugs/ toxicants. Moreover, several families of solute carriers are important toxicity mediators in liver and kidney (DeGorter et al. 2012). To our knowledge, we are the first to highlight the role of membrane transporters in interindividual susceptibility to a wide range of environmental chemicals, beyond chemotherapeutic agents.
The results for rs13120371 in SLC7A11 were striking, and are supported by growing literature on its importance in chemoresistance (Lo et al. 2008). Small interfering SLC7A11 RNA increased sensitivity to various agents in cancer cell lines (Pham et al. 2010). Expression was altered in drugresistant ovarian cancer cell lines (Januchowski et al. 2013), was downregulated in response to thymo quinone in breast cancer cells (Motaghed et al. 2014), and predicted poor survival in vivo (Kinoshita et al. 2013). In addition, SLC7A11 was inversely correlated with clinical outcome in bladder cancer and negatively regulated by a microRNA for cisplatin-resistant cells (Drayton et al. 2014).

Conclusions
Although the risk assessment process is shifting toward greater reliance on in vitro data, none of the in vitro assays in Tox21, ToxCast, or other large-scale screening programs is designed to address individual variability (Rusyn and Daston 2010). The present study demonstrates how a large-scale systems biology experiment (toxicity phenotyping and genetic mapping) can aid translation to public health protection, and provides novel information about global inter individual variability. The availability of genetically diverse, genetically defined renewable human cell lines opens an opportunity for in vitro toxicity testing at the population scale. Our heritability estimates show that genetic variation may have a profound effect on differences between cell lines and can be quantified and used to generate testable hypotheses about mechanisms of toxicity.