Environmental Impact on Vascular Development Predicted by High-Throughput Screening

Background: Understanding health risks to embryonic development from exposure to environmental chemicals is a significant challenge given the diverse chemical landscape and paucity of data for most of these compounds. High-throughput screening (HTS) in the U.S. Environmental Protection Agency (EPA) ToxCast™ project provides vast data on an expanding chemical library currently consisting of > 1,000 unique compounds across > 500 in vitro assays in phase I (complete) and phase II (under way). This public data set can be used to evaluate concentration-dependent effects on many diverse biological targets and build predictive models of prototypical toxicity pathways that can aid decision making for assessments of human developmental health and disease.
Objective: We mined the ToxCast phase I data set to identify signatures for potential chemical disruption of blood vessel formation and remodeling.
Methods: ToxCast phase I screened 309 chemicals using 467 HTS assays across nine assay technology platforms. The assays measured direct interactions between chemicals and molecular targets (receptors, enzymes), as well as downstream effects on reporter gene activity or cellular consequences. We ranked the chemicals according to individual vascular bioactivity score and visualized the ranking using ToxPi (Toxicological Priority Index) profiles.
Results: Targets in inflammatory chemokine signaling, the vascular endothelial growth factor pathway, and the plasminogen-activating system were strongly perturbed by some chemicals, and we found positive correlations with developmental effects from the U.S. EPA ToxRefDB (Toxicological Reference Database) in vivo database containing prenatal rat and rabbit guideline studies. We observed distinctly different correlative patterns for chemicals with effects in rabbits versus rats, despite derivation of in vitro signatures based on human cells and cell-free biochemical targets, implying conservation but potentially differential contributions of developmental pathways among species. Follow-up analysis with antiangiogenic thalidomide analogs and additional in vitro vascular targets showed in vitro activity consistent with the most active environmental chemicals tested here.
Conclusions: We predicted that blood vessel development is a target for environmental chemicals acting as putative vascular disruptor compounds (pVDCs) and identified potential species differences in sensitive vascular developmental pathways.


Disclaimer: The views expressed in this article are those of the authors and do not necessarily reflect the views or policies of the U.S. Environmental Protection Agency. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

Virtual Tissues Knowledgebase (VT-KB)
VT-KB represents a flexible platform to extract and organize relevant facts from the existing body of scientific literature. Briefly, a vocabulary of terms was built to describe concepts relevant to health and disease using publicly available ontologies including genes, pathways, anatomy, clinical outcomes, and chemicals. We compiled a list of keywords relevant to embryonic vascular formation (Supplemental Figure 1), and varying combinations of these and like terms were cross-referenced in the VT-KB with the list of ToxCast in vitro assay targets.
Supplemental Figure 1 shows an example of cross-referencing the vascular developmental keywords with the VEGFR2 receptor and its many synonyms (KDR, etc.) found in the literature. Initial results showed a high incidence of publications related to tumor neovascularization; therefore, the query was filtered using NOT logic with the keyword "cancer".
The keyword search results (over 20 million PubMed abstracts searched) were stored in a MySQL database for statistical analyses to summarize relationships and map them to biological concepts and pathways.
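The boolean logic of the literature query (vascular keyword AND gene synonym, NOT "cancer") can be illustrated with a minimal sketch. This is not the VT-KB implementation: the function names, the whole-word matching rule, and the in-memory list of abstracts are assumptions made for illustration, while the term lists are taken from the keyword and gene queries reported in this supplement.

```python
import re

# Term lists copied from the keyword and gene queries in this supplement.
KEYWORDS = [
    "vasculogenesis", "angiogenesis", "endothelial cell", "vascular development",
    "embryogenesis", "vascularization", "embryo", "tubulogenesis",
    "vessel formation", "vessel growth",
]
GENE_SYNONYMS = ["VEGFR2", "VEGFR-2", "VEGFRII", "KDR", "FLK1", "FLK-1", "CD309"]
EXCLUDE = ["cancer"]


def contains_any(text, terms):
    """Return True if any term occurs as a whole word or phrase (case-insensitive).

    Whole-word matching is an assumption; the matching rule used by the
    VT-KB is not described in the text.
    """
    return any(
        re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", text, re.IGNORECASE)
        for term in terms
    )


def matches_query(abstract):
    """Vascular keyword AND gene synonym, NOT any excluded term."""
    return (
        contains_any(abstract, KEYWORDS)
        and contains_any(abstract, GENE_SYNONYMS)
        and not contains_any(abstract, EXCLUDE)
    )


def count_hits(abstracts):
    """Number of abstracts that satisfy the query."""
    return sum(1 for abstract in abstracts if matches_query(abstract))
```

For a real corpus the same predicate would more likely be expressed as a database or search-engine query; the sketch only makes the AND/NOT structure explicit.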

Vascular Bioactivity Score (VBS)
A weighted score (VBS) was created for each chemical across the six in vitro targets with the highest ranking, based on the log transform of the AC50/LEC values (Equations 1-3).
Here, i represents the assay target (feature) and j represents the number of different assay systems each target was measured in. When applicable, the directional regulation of each of these targets with respect to blood vessel development was considered. For example, the upregulation of CXCL10, an anti-angiogenic chemokine, was considered to be relevant while downregulation was not, as suppression of that chemokine would provide an environment favorable for angiogenesis, whereas our hypothesis is focused on disruption of vascular processes. The assay scores were normalized and summed (Equation 2), and chemicals were ranked based on the Vascular Bioactivity Score (VBS) shown in Equation 3, where $k_i$ is the weighted coefficient value determined by the number of term associations in the VT-KB search and by the specificity to vascular developmental processes: $k_{1,2} = 3$ (VEGFR2, TIE2), $k_{3,4} = 2$ (CCL2, PAI-1), and $k_{5,6} = 1$ (CXCL10, uPAR). Those that had a VBS above the mean (123 chemicals out of 309) were classified as putative VDCs. The pVDC ToxPi profiles are shown in Supplemental Figure 2, where each sector is normalized to show the relative effect of each chemical on each read-out and the slice widths represent the relative weights ($k_i$) across VBS targets.
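A weighted score of this kind can be sketched in a few lines. The exact forms of Equations 1-3 are not reproduced in this text, so the scaling below is an assumption: inactive chemical-assay pairs are given a hypothetical sentinel AC50 of 10^6 µM so that their log-scaled score is zero, and each target score is the average over the assay systems in which the target was measured. Only the target weights (3, 3, 2, 2, 1, 1) and the above-the-mean rule come from the text.

```python
import math

# Target weights as stated in the text.
WEIGHTS = {"VEGFR2": 3, "TIE2": 3, "CCL2": 2, "PAI-1": 2, "CXCL10": 1, "uPAR": 1}

# Assumption: inactive chemical-assay pairs are coded with this sentinel AC50 (µM),
# which makes their scaled score exactly zero.
INACTIVE_AC50_UM = 1e6


def scaled_score(ac50_um):
    """Log-transformed potency: 0 for the inactive sentinel, larger when more potent."""
    return math.log10(INACTIVE_AC50_UM / ac50_um)


def target_score(ac50_values):
    """Assumed normalization: average the scaled scores over a target's assay systems."""
    return sum(scaled_score(v) for v in ac50_values) / len(ac50_values)


def vascular_bioactivity_score(chemical):
    """Weighted sum of target scores; `chemical` maps target name -> list of AC50s (µM)."""
    return sum(WEIGHTS[target] * target_score(values) for target, values in chemical.items())


def classify_pvdc(vbs_by_chemical):
    """Names of chemicals whose VBS lies above the mean VBS of all chemicals given."""
    mean_vbs = sum(vbs_by_chemical.values()) / len(vbs_by_chemical)
    return {name for name, score in vbs_by_chemical.items() if score > mean_vbs}
```

Directional filtering (for example, counting only upregulation of CXCL10) is assumed to have been applied before the AC50 values reach these functions.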

Multivariate Modeling
We used machine learning tools to build a step-wise linear discriminant analysis (LDA) algorithm that yields a multivariate toxicity signature based on significant associations between the feature set (in vitro ToxCast assay data and pathway perturbation scores) and a predefined endpoint. Typically, the endpoints are in vivo phenotypes culled from guideline animal studies and reported in ToxRefDB. Vascular disruption may result in a variety of developmental endpoints, including fetal resorption, limb defects, and microphthalmia; these same endpoints could also result from disruption of non-vascular developmental processes or from maternal toxicity. To separate out those compounds potentially acting via a vascular disruptive mechanism, we defined pVDCs as those chemicals with a VBS greater than the mean over the entire phase I chemical space, based solely on six assay targets critical to vascular development. We then identified those compounds with species-specific developmental toxicity and searched for correlations among the remainder of the ToxCast in vitro data and the aggregated pathway-perturbation scores.
First, individual (univariate) statistical associations were calculated for the chemical-assay and chemical-endpoint space. Two statistical tests were used. In the first, the data matrix was dichotomized so that if activity was seen at any concentration (assays) or dose (endpoints), a value of 1 was assigned to the chemical-assay (or chemical-endpoint) pair; otherwise, a value of 0 was assigned. Next, one assay was selected as the input (predictor variable) and another assay (or endpoint) as the output (predicted variable). A 2×2 contingency table was created with values for TP (true positive, number of chemicals for which the input and output were both positive), FP (false positive, number of chemicals for which the input was positive and the output negative), FN (false negative, number of chemicals for which the input was negative and the output positive), and TN (true negative, both input and output negative). The significance of association was tested using Fisher's exact test. In the second statistical method, the input assay AC50 values were log transformed and scaled as in Equation 1. This scaling yields a value of zero for inactive chemical-assay combinations. The output variable was dichotomized as before. We then performed a t-test comparing the score distribution for the output-positive vs. output-negative chemicals. Each pair of associations was ranked by the minimum p-value from either test, with a cut-off of p < 0.05 designated as statistically significant.
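The first of the two univariate tests can be sketched as follows, using only the Python standard library. The function names are hypothetical, and the choice of a two-sided test (summing the probabilities of all tables no more likely than the observed one) is an assumption, since the text does not state which alternative was used. The t-test on scaled AC50 values is not covered here.

```python
from math import comb


def contingency(inputs, outputs):
    """Build (TP, FP, FN, TN) from paired 0/1 activity vectors over the same chemicals."""
    pairs = list(zip(inputs, outputs))
    tp = sum(1 for i, o in pairs if i and o)          # input positive, output positive
    fp = sum(1 for i, o in pairs if i and not o)      # input positive, output negative
    fn = sum(1 for i, o in pairs if not i and o)      # input negative, output positive
    tn = sum(1 for i, o in pairs if not i and not o)  # both negative
    return tp, fp, fn, tn


def fisher_exact_two_sided(tp, fp, fn, tn):
    """Two-sided Fisher's exact p-value for the table [[TP, FP], [FN, TN]]."""
    input_pos = tp + fp
    output_pos = tp + fn
    n = tp + fp + fn + tn

    def prob(a):
        # Hypergeometric probability of a table with `a` true positives
        # when the row and column totals are held fixed.
        return comb(output_pos, a) * comb(n - output_pos, input_pos - a) / comb(n, input_pos)

    p_observed = prob(tp)
    lowest = max(0, input_pos - (n - output_pos))
    highest = min(input_pos, output_pos)
    # Sum every table at least as extreme as the observed one; the small
    # relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(a) for a in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
```

In practice a statistics library would normally be used for both tests; the explicit version is given only to show how the four cell counts enter the calculation.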
Multivariate models were built with the species-specific pVDCs designated as outputs and the ToxCast assays and pathway perturbation scores as inputs. The predictive model was constructed from the most significant univariate features and cross-validated over 20 iterations. For this analysis, the original data matrix was log-transformed. The model is a linear combination of assay terms in which each assay carries an inclusion indicator: if the indicator for assay i is 1, then assay i is included; otherwise it is not. If the model score for chemical x is > 0, the chemical is predicted to be active in the output class or endpoint; otherwise it is predicted to be inactive. Assays are added to the sum in a stepwise fashion: the one with the most significant univariate association is added first, the second most significant next, and so on. Model performance was evaluated using a 2×2 contingency table as described above, where the true activity vector for the output assay is compared with the predicted activity vector. The model was implemented using a k-fold cross-validation algorithm in which the data are randomly divided into training (80%) and test (20%) portions and optimal linear combinations of features are found that maximize the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. In addition to the AUC and Fisher's exact p-value, we also calculated the sensitivity, specificity, balanced accuracy (BA; the average of sensitivity and specificity), and other metrics. The algorithm was run multiple times with varying feature sets, allowing for linear combinations of ToxCast in vitro assays (excluding the VBS ranking assays), ToxRefDB in vivo data, and pathway perturbation scores. The model with the best cross-validation test BA was selected for further consideration. The algorithm is implemented in R and is available upon request ("linmod.R"; http://www.epa.gov/ncct/).
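The scoring rule and the performance metrics described above can be sketched as follows. This is not the authors' "linmod.R" code: the weights, the intercept, and all function names are hypothetical, and only the inclusion indicator, the score > 0 decision rule, and the definition of balanced accuracy are taken from the text.

```python
def model_score(features, weights, include, intercept=0.0):
    """Linear score for one chemical; a feature contributes only if its indicator is 1.

    `weights` and `intercept` are hypothetical fitted values.
    """
    return intercept + sum(d * w * x for d, w, x in zip(include, weights, features))


def predict(feature_rows, weights, include, intercept=0.0):
    """Predict 1 (active) when the model score is > 0, otherwise 0 (inactive)."""
    return [
        1 if model_score(row, weights, include, intercept) > 0 else 0
        for row in feature_rows
    ]


def performance(truth, predicted):
    """Sensitivity, specificity, and balanced accuracy from paired 0/1 vectors.

    Assumes `truth` contains at least one positive and one negative chemical.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }
```

A stepwise procedure would switch indicators on one at a time in order of univariate significance and keep the combination with the best cross-validated balanced accuracy; that search loop is omitted here.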
Keyword Query: "vasculogenesis" OR "angiogenesis" OR "endothelial cell" OR "vascular development" OR "embryogenesis" OR "vascularization" OR "embryo" OR "tubulogenesis" OR "vessel formation" OR "vessel growth" NOT "cancer"
AND
Gene Query: "VEGFR2" OR "VEGFR-2" OR "VEGFRII" OR "KDR" OR "FLK1" OR "FLK-1" OR "CD309"

Gene      PubMed Abstracts
VEGFR-2   1775
Tie

Arrows represent direction of regulation where applicable.