
Network-assisted target identification for haploinsufficiency and homozygous profiling screens

Abstract

Chemical genomic screens have recently emerged as a systematic approach to drug discovery on a genome-wide scale. Drug target identification and elucidation of the mechanism of action (MoA) of hits from these noisy high-throughput screens remain difficult. Here, we present GIT (Genetic Interaction Network-Assisted Target Identification), a network analysis method for drug target identification in haploinsufficiency profiling (HIP) and homozygous profiling (HOP) screens. In addition to the drug-induced phenotypic fitness defect of a gene deletion, GIT incorporates the fitness defects of the gene’s neighbors in the genetic interaction network. On three genome-scale yeast chemical genomic screens, GIT substantially outperforms previous scoring methods on target identification in both HIP and HOP assays. Finally, we show that by combining HIP and HOP assays, GIT further boosts target identification and helps reveal a drug’s potential mechanism of action.

Author summary

Chemical genomic screens have been developed to systematically explore compound-gene interactions with the goal of identifying new drugs and drug targets. Haploinsufficiency profiling and homozygous profiling screens measure the drug-induced growth sensitivities of deletion strains that are grown in the presence of a compound. Traditionally, putative target genes are ranked according to their importance for growth in HIP-HOP screens, since the most sensitive strains often carry deletions of genes that are related to the drug target(s). Here, we present GIT (Genetic Interaction Network-Assisted Target Identification), a network analysis method for drug target identification. GIT scores a gene by combining its fitness defect with the screen outcomes of the gene’s neighbors in the genetic interaction network. We demonstrate that GIT significantly improves target identification and elucidates molecular and functional mechanisms of drug action.

Introduction

Chemical genomic screens have been extensively used to discover functional interactions between genes and small molecular compounds in vivo [1–8]. Due to its short generation time, inexpensive cultivation, and facile genetics, the budding yeast S. cerevisiae has been widely used as a platform for chemical genomic screens to decipher proteins and pathways targeted by small molecular compounds [9–11]. In comparison to other approaches [12–15], yeast chemical genomic screens provide comprehensive and systematic genome-wide measurements of a complete set of deletion strains. When the functions of the target protein(s) are conserved throughout evolution, results of chemical genomic screens in yeast can be readily transferred to other species, including human [16–18]. There are two types of yeast chemical genomics assays: haploinsufficiency profiling (HIP) and homozygous profiling (HOP). A HIP assay consists of a set of heterozygous deletion diploid strains that are grown in the presence of a compound. Decreasing the gene dosage of a drug target from two copies to one copy results in increased drug sensitivity, or drug-induced haploinsufficiency [19]. Under normal conditions, one copy of a gene is adequate for normal growth of diploid yeast, but haploinsufficiency can arise when the strain is exposed to a drug. Consequently, HIP experiments are designed to identify the relationship between gene haploinsufficiency and compounds. In contrast, a HOP assay measures drug sensitivities of strains with complete deletion of non-essential genes in either haploid or diploid strains. Because of the complete deletion, HOP assays identify genes that act to buffer the drug target pathway. The fitness defect score (FD-score) is widely used to predict drug targets by comparing the perturbed growth rates to those of a set of control strains [9, 10].

Recently, a large-scale Synthetic Genetic Analysis (SGA) study [20] showed that a target’s genetic interaction profiles are highly correlated with the outcomes of chemical genomic screens, suggesting the possibility of combining genetic interaction profiles with chemical genomic screens for drug target identification [21–23]. A genetic interaction is measured as the difference between the experimentally measured double-mutant phenotype and the expected double-mutant phenotype [24]. A negative genetic interaction occurs when two genes have similar functions that compensate for each other’s absence to support cell viability [25–27]. In contrast, a positive genetic interaction occurs when a mutation in one gene rescues the fitness defect associated with a mutation in another gene [28–30].

Intuitively, a gene’s genetic interaction neighbors are also modulated if the gene is targeted and perturbed by a compound. Hence, we can use genetic interaction neighbors’ FD-scores to assist the inference of drug targets on chemical genomic screens. To the best of our knowledge, the only previous attempt to combine chemical genomic screens with genetic interaction profiles is by computing the Pearson correlation coefficient between their outcomes [23]. A higher and positive Pearson correlation coefficient indicates a potential compound-target interaction because genetic perturbation is inherently similar to chemical perturbation. However, this approach often works poorly because the Pearson correlation coefficient is sensitive to the noise in high-throughput chemical genomic screens and SGA profiles. Moreover, these existing methods ignore the inherent differences between HIP and HOP assays. Since HIP and HOP assays are complementary, combining HIP and HOP should further improve target identification and enhance our understanding of compounds’ mechanisms of action (MoA) in a comprehensive way.

In this work, we introduce GIT, a novel Genetic Interaction Network-Assisted Target Identification scoring method for HIP-HOP screens. Due to the inherent similarity between genetic perturbation and chemical perturbation, it is possible to use genetic interaction neighbors’ chemical genomic profiles to assist drug target inference. Therefore, we adopt a network biology perspective and detect a drug target through its neighbors. We first constructed a weighted, signed genetic interaction network from SGA profiles. For HIP assays, GIT supplements a gene’s FD-score with the FD-scores of its neighboring genes in the genetic interaction network. If the FD-scores of its positive genetic interaction neighbors are high while the FD-scores of its negative genetic interaction neighbors are low, the gene is more likely to be a target. For HOP assays, GIT incorporates the FD-scores of long-range two-hop neighbors to identify drug targets, since HOP is more likely to prioritize genes that buffer the drug target pathway rather than the direct targets. By combining HIP and HOP assays using GIT, we observed further improvement in target identification. Extensive experiments on three genome-wide chemical genomic screens demonstrated that GIT substantially improves target identification in comparison with existing scoring methods. We also identified many novel compound-target interactions that are currently not in any curated database but are supported by the literature. In addition to target identification, we further demonstrated that GIT can be used to reveal the mechanisms of action of compounds and uncover co-functional gene complexes.

Methods

FD-score

The fitness defect score (FD-score) is the log-ratio of the growth defect of a deletion strain in response to a compound treatment, relative to its growth under control conditions. For gene deletion strain i and compound c, the corresponding FD-score is defined as

$$\mathrm{FD}_{ic} = \log\left(\frac{r_{ic}}{\bar{r}_{i}}\right) \tag{1}$$

where $r_{ic}$ is the growth defect of deletion strain i in the presence of compound c, and $\bar{r}_{i}$ is the average growth defect of deletion strain i measured under multiple control conditions without any compound treatment. The FD-score reflects the sensitivity of a gene deletion strain to a compound treatment. Specifically, a negative FDic score means that the growth fitness of strain i in the presence of compound c is weaker than that of the untreated control. Therefore, a low, negative FD-score indicates a putative interaction between the deleted gene and the compound.
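As a minimal sketch of the FD-score computation described above, the snippet below takes one treated measurement and a list of control measurements for a single strain. The function name, the toy values, and the use of the natural logarithm are illustrative assumptions; the text does not state the base of the logarithm.

```python
import math

def fd_score(r_ic, control_values):
    """Log-ratio of the treated measurement to the mean control measurement.

    The natural log is an assumption; the base is not stated in the text.
    """
    r_bar = sum(control_values) / len(control_values)
    return math.log(r_ic / r_bar)

# Hypothetical measurements for one deletion strain.
controls = [1.0, 1.2, 0.8]
sensitive = fd_score(0.5, controls)    # treated value below the control mean
unaffected = fd_score(1.0, [1.0, 1.0]) # treated value equal to the control mean
```

Under this definition a strain whose treated measurement falls below its control mean receives a negative score, and a strain that is unaffected by the compound scores zero.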

The FD-score does not consider epistasis or interactions among genes. However, recent studies indicate that the phenotype of a particular strain can be caused by the deletion of a genetic modifier of a neighboring gene that is responsible for the phenotype [9, 31–34]. Therefore, it is necessary to consider a gene’s neighboring genes for target identification.

Genetic interaction network

We first obtained genetic interaction profiles of 5.4 million gene-gene pairs in yeast from a recent genome-scale Synthetic Genetic Array (SGA) study [20]. We then constructed a signed, weighted genetic interaction network based on these profiles. The edge weight gij between gene i and gene j in the genetic interaction network is defined as

$$g_{ij} = f_{ij} - f_{i} f_{j} \tag{2}$$

where $f_{ij}$ is the double-mutant growth fitness, and $f_{i}$ is the single-mutant growth fitness of gene i. A negative genetic interaction refers to a more severe growth fitness observed than expected, with an extreme case being synthetic lethality, whereas a positive genetic interaction refers to double mutants with a less severe growth fitness than expected [35, 36].
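The construction of the signed, weighted network can be sketched as follows, assuming the edge weight is the observed double-mutant fitness minus the product of the two single-mutant fitnesses (the usual multiplicative expectation in SGA studies). The function names, the dictionary layout, and the fitness values are hypothetical.

```python
def interaction_weight(f_ij, f_i, f_j):
    """Observed double-mutant fitness minus the multiplicative expectation."""
    return f_ij - f_i * f_j

def build_network(single, double):
    """single: gene -> single-mutant fitness.
    double: (gene, gene) -> double-mutant fitness.
    Returns a symmetric adjacency map: gene -> {neighbor: signed weight}.
    """
    network = {gene: {} for gene in single}
    for (i, j), f_ij in double.items():
        g = interaction_weight(f_ij, single[i], single[j])
        network[i][j] = g
        network[j][i] = g
    return network

# Toy fitness values: A-B is synthetic lethal, A-C grows better than expected.
single = {"A": 0.9, "B": 0.8, "C": 1.0}
double = {("A", "B"): 0.0, ("A", "C"): 0.95}
network = build_network(single, double)
```

In this toy example the A-B edge is negative and the A-C edge is positive, matching the sign convention in the paragraph above.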

Network-assisted target identification in HIP assays

To identify drug targets in HIP assays, we introduce the GITHIP-score, which combines a gene’s FD-score and the FD-scores of its neighboring genes in the genetic interaction network. For gene i and compound c, we define the GITHIP-score as

$$\mathrm{GIT}^{\mathrm{HIP}}_{ic} = \mathrm{FD}_{ic} - \sum_{j} \mathrm{FD}_{jc} \cdot g_{ij} \tag{3}$$

where the sum runs over the genetic interaction neighbors j of gene i. The GITHIP-score considers two types of information of gene i: the FD-score of gene i and the FD-scores of gene i’s genetic interaction neighbors (Fig 1).

Fig 1. Illustration of how GIT identifies drug targets in HIP assays.

Red (Blue) nodes indicate genes with high (low) FD-scores. Red (Blue) lines indicate positive (negative) genetic interactions. The yellow node indicates the drug target. GIT supplements a gene’s FD-score with the FD-scores of its neighboring genes in the genetic interaction network. GIT identifies a gene as the target if the gene’s positive genetic interaction neighbors have high FD-scores and the gene’s negative genetic interaction neighbors have low FD-scores.

https://doi.org/10.1371/journal.pcbi.1005553.g001

To account for different signs and strengths of genetic interactions, we compute a linear combination of the FD-scores of neighboring genes according to their genetic interaction edge weights to gene i. When gene i and gene j are a negative genetic interaction pair (gij < 0) and gene i is the target of compound c, it is very likely that we also observe a negative FDjc value. This is because the deletion of one copy of gene j will make gene i more essential to the cell growth, according to the SGA assays. Likewise, if gene i and gene j are a positive genetic interaction pair (gij > 0) and gene i is the target of compound c, it is very likely that we observe a positive FDjc value. This is because the deletion of one copy of gene j will make gene i less essential to the cell growth. Therefore, we designed the GIT score to integrate the information from the genetic interaction neighbors to increase the signal-to-noise ratio, thus improving the sensitivity of the target identification. A low GITHIP-score indicates a potential compound-target interaction.
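The neighbor correction can be sketched in a few lines, assuming Eq 3 has the form FD_ic minus the weighted sum of g_ij times FD_jc over the neighbors j of gene i. The gene labels, weights, and FD-scores below are made up for illustration, and neighbors without an FD-score are simply skipped, which is an assumption.

```python
def git_hip(fd, network):
    """fd: gene -> FD-score for one compound.
    network: gene -> {neighbor: signed weight g_ij}.
    Returns gene -> FD_ic minus the weighted sum of neighbor FD-scores.
    """
    scores = {}
    for gene, fd_i in fd.items():
        neighbor_term = sum(
            g_ij * fd[j]
            for j, g_ij in network.get(gene, {}).items()
            if j in fd  # skip neighbors that were not measured
        )
        scores[gene] = fd_i - neighbor_term
    return scores

# Toy case: T is the true target but has an uninformative FD-score.
fd = {"T": 0.1, "N": -1.5, "P": 1.5, "X": -0.5}
network = {
    "T": {"N": -0.8, "P": 0.6},  # N is a negative neighbor, P a positive one
    "N": {"T": -0.8},
    "P": {"T": 0.6},
}
scores = git_hip(fd, network)
ranked = sorted(scores, key=scores.get)  # lowest score first
```

In this toy example the FD-score alone would rank N first, whereas the corrected score moves T to the top because its negative neighbor has a low FD-score and its positive neighbor has a high one.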

For a target that cannot be accurately identified by the FD-score because of noise in chemical genomic screens or the neighboring gene effect [31], the GITHIP-score corrects the target’s FD-score according to the FD-scores of its genetic interaction neighbors. A previous study proposed identifying essential genes in human cancer cell lines according to the expression profiles of genetic interaction neighbors [37]. Specifically, a gene becomes more essential if its negative genetic interaction neighbors are inactive and its positive genetic interaction neighbors are active [38]. In this paper, we studied more direct phenotypic growth fitness from chemical genomics assays, instead of gene expression data. Accordingly, the GITHIP-score can also be viewed as a conditional essentiality score since it captures the growth fitness of genetic interaction neighbors.

Network-assisted target identification in HOP assays

Compared to HIP, HOP assays delete both copies of non-essential genes in either haploid or diploid strains. Therefore, the FD-score in HOP assays prioritizes genes that buffer the drug target pathway. In contrast to existing studies [9–11, 23], which apply the same scoring methods for HIP and HOP assays, we introduce a different network-assisted approach for HOP assays to tackle the inherent difference between HIP and HOP assays. Since genes with low FD-scores in the HOP assay are often close to or located in the drug target pathway, we extend our framework to consider the FD-scores of long-range two-hop genetic interaction neighbors.

To utilize long-range two-hop genetic interaction neighbors’ FD-scores, we first calculate the first-order GIT-score of each gene. We define the first-order GIT-score of gene i in the presence of compound c as

$$\mathrm{GIT}^{\mathrm{1st}}_{ic} = \mathrm{FD}_{ic} - \sum_{j} \mathrm{FD}_{jc} \cdot g_{ij} \tag{4}$$

We then use the first-order GIT-score to calculate the GITHOP-score for target identification in HOP assays.

$$\mathrm{GIT}^{\mathrm{HOP}}_{ic} = \mathrm{FD}_{ic} - \sum_{j} \mathrm{GIT}^{\mathrm{1st}}_{jc} \cdot g_{ij} \tag{5}$$

The GITHOP-score considers two types of information for each gene: the FD-score of the gene itself and the first-order GIT-scores of the gene’s genetic interaction neighbors (Fig 2). A low GITHOP-score indicates a potential compound-target interaction.

Fig 2. Illustration of how GIT identifies drug targets in HOP assays.

Red (Blue) nodes indicate genes with high (low) FD-scores. Red (Blue) lines indicate positive (negative) genetic interactions. Yellow node indicates drug target. GIT supplements a gene’s FD-score by the FD-scores of its long-range two-hop neighbors in the genetic interaction network. These long-range two-hop neighbors capture the drug target pathway buffer effect in HOP assays.

https://doi.org/10.1371/journal.pcbi.1005553.g002

While genes that act to buffer the drug target pathway may not be one-hop neighbors of the target, the GITHOP-score takes into account indirect neighboring genes in the genetic interaction network. Specifically, if the FD-scores of positively (negatively) weighted two-hop neighbors are high (low), the corresponding one-hop neighbors will have significantly low GIT1st-scores. On the other hand, if the FD-scores of positively (negatively) weighted two-hop neighbors are low (high), the corresponding one-hop neighbors will have significantly high GIT1st-scores. By correcting a gene’s FD-score according to the GIT1st-scores of the gene’s genetic interaction neighbors, the GITHOP-score explicitly considers the FD-scores of two-hop neighbors, which capture the drug target pathway buffer effect in the HOP assay.

The GITHOP-score can be viewed as a second-order GIT-score. It can be calculated by iteratively multiplying the FD-score vector with the genetic interaction network matrix. Although this scoring framework extends naturally to the kth order, we did not observe substantial improvement for k > 2 in our experiments. The performance for different values of k is shown in S8 Fig.
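The two-step calculation can be sketched as below, under the assumption that the first-order score has the same form as the HIP score (FD_ic minus the weighted sum of neighbor FD-scores) and that the HOP score repeats the correction using the neighbors' first-order scores. The helper names and the three-gene chain are hypothetical.

```python
def neighbor_corrected(fd, previous, network):
    """One correction step: FD_ic minus sum_j g_ij * previous_jc."""
    return {
        i: fd[i] - sum(
            g * previous[j]
            for j, g in network.get(i, {}).items()
            if j in previous
        )
        for i in fd
    }

def git_first_order(fd, network):
    return neighbor_corrected(fd, fd, network)

def git_hop(fd, network):
    return neighbor_corrected(fd, git_first_order(fd, network), network)

# Toy chain T - A - B: only the two-hop neighbor B shows a fitness defect.
fd = {"T": 0.0, "A": 0.0, "B": -2.0}
network = {
    "T": {"A": -0.5},
    "A": {"T": -0.5, "B": -0.5},
    "B": {"A": -0.5},
}
first = git_first_order(fd, network)
hop = git_hop(fd, network)
```

In this chain the first-order score of T stays at zero because its only direct neighbor A has an FD-score of zero, while the second step lowers T's score by propagating the defect of B through A.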

ρ-score

In addition to the FD-score, we compared our GIT-score with the ρ-score, which combines genetic interaction profiles with chemical genomic screens by computing the Pearson correlation coefficient between their outcomes [23]. The ρ-score was proposed to leverage the inherent similarity between genetic perturbation and chemical perturbation. Both profile the relative growth fitness when a gene i is knocked down, either by mutation or by a chemical compound. Thus, if the genetic interaction profile of a gene i is positively correlated with the FD-score profile of a compound c, the effect of mutating gene i is similar to that of adding compound c.

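For gene i and compound c, the ρ-score is defined as

$$\rho_{ic} = \frac{\sum_{k \in \mathrm{ngh}(i)} (g_{ik} - \bar{g}_{i})(\mathrm{FD}_{kc} - \overline{\mathrm{FD}}_{c})}{\sqrt{\sum_{k \in \mathrm{ngh}(i)} (g_{ik} - \bar{g}_{i})^{2}}\,\sqrt{\sum_{k \in \mathrm{ngh}(i)} (\mathrm{FD}_{kc} - \overline{\mathrm{FD}}_{c})^{2}}} \tag{6}$$

where ngh(i) denotes the genetic interaction neighbors of gene i, and $\bar{g}_{i}$ and $\overline{\mathrm{FD}}_{c}$ are the means of $g_{ik}$ and $\mathrm{FD}_{kc}$ over k in ngh(i). When calculating ρic, we excluded pairs of k and c if FDkc is missing. A high, positive ρ-score indicates a potential compound-target interaction.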
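A small sketch of the ρ-score as a Pearson correlation over a gene's neighbors, with missing FD-scores dropped pairwise as described above. Representing a missing value as `None` and returning NaN for degenerate inputs are assumptions made for this illustration.

```python
import math

def rho_score(gene, fd, network):
    """Pearson correlation between g_ik and FD_kc over the neighbors k of gene."""
    pairs = [
        (g, fd[k])
        for k, g in network.get(gene, {}).items()
        if fd.get(k) is not None  # drop neighbors with a missing FD-score
    ]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_g = sum(g for g, _ in pairs) / n
    mean_f = sum(f for _, f in pairs) / n
    cov = sum((g - mean_g) * (f - mean_f) for g, f in pairs)
    var_g = sum((g - mean_g) ** 2 for g, _ in pairs)
    var_f = sum((f - mean_f) ** 2 for _, f in pairs)
    if var_g == 0 or var_f == 0:
        return float("nan")
    return cov / math.sqrt(var_g * var_f)

# Toy data: FD-scores are exactly proportional to the edge weights; D is missing.
network = {"T": {"A": -1.0, "B": 0.0, "C": 1.0, "D": 0.5}}
fd = {"A": -2.0, "B": 0.0, "C": 2.0, "D": None}
rho = rho_score("T", fd, network)
```

Because the correlation depends on every neighbor pair equally, a handful of noisy measurements can shift it noticeably, which is the sensitivity to noise discussed in the Introduction.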

Datasets

STITCH compound-target interactions and genetic interaction profiles.

We obtained known compound-target interactions from the STITCH 4 database [39] as a benchmark to evaluate the performance of different target identification scoring methods. These compound-target interactions are built from heterogeneous data sources, including experiments, expert-curated databases, and literature mining. Each interaction has a combined score (between 0 and 1) representing the confidence. We excluded low-confidence interactions (combined score < 0.4), as suggested by the STITCH 4 database. We further excluded interactions predicted solely from putative homologs from other species. After filtering compounds that are not in any of the collected chemical genomics screens, we obtained 1472 compound-target interactions. We obtained genetic interaction profiles of 5.4 million gene-gene pairs in yeast from a recent genome-scale Synthetic Genetic Array (SGA) study [20].

Genome-wide HIP-HOP screens.

We obtained three yeast HIP-HOP chemical genomic screens from [9–11]. For evaluation purposes, we only included compounds that had at least one STITCH compound-target interaction. The first screen from Hoepfner et al. 2014 has 4,146 and 4,921 deletion strains grown under 71 and 73 compound treatment conditions for heterozygous and homozygous deletion collections, respectively. The second screen from Lee et al. 2014 has 1,095 and 4,810 deletion strains grown under 382 compound treatment conditions for heterozygous and homozygous deletion collections, respectively. The third screen from Hillenmeyer et al. 2008 has 5,307 and 4,810 deletion strains grown under 333 and 162 compound treatment conditions for heterozygous and homozygous deletion collections, respectively.

Experimental settings

We evaluated the performance of each scoring method based on two criteria: 1) the number of compound-target interactions that can be identified in the top k genes and 2) the number of compounds, at least one target of which can be identified in the top k genes.

For each criterion, we plotted a curve that describes, for each scoring method, the number of identified compound-target interactions (drugs for the second criterion) against the rank of the gene. The y-axis shows the number of compound-target interactions (drugs for the second criterion) identified in the top k genes, where k is shown on the x-axis. We first calculated the area under the curve (AUC) of each scoring method.

Then we calculated a normalized AUC (nAUC) for scoring method X as

$$\mathrm{nAUC}_{X} = \frac{\mathrm{AUC}_{X}}{\mathrm{AUC}_{\mathrm{FD\text{-}score}}} \tag{7}$$

where $\mathrm{AUC}_{\mathrm{FD\text{-}score}}$ is the AUC of the FD-score and $\mathrm{AUC}_{X}$ is the AUC of the scoring method X. If the nAUC is larger than 1, then the corresponding scoring method is better than the FD-score method.

We denoted the nAUC obtained from the first criterion as nAUCt and the nAUC obtained from the second criterion as nAUCd. Since one drug may have multiple targets, nAUCt captures every recovered target, whereas nAUCd counts a drug only once. Therefore, nAUCt is the primary metric used for evaluation in this paper.
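Both evaluation criteria and the normalized AUC can be sketched as follows. The rankings, compound names, and target sets are invented, and summing the curve at unit steps of k is an assumed integration rule, since the text does not specify one.

```python
def target_hits(rankings, known, k):
    """Criterion 1: known compound-target pairs recovered in the top k genes."""
    return sum(
        len(set(genes[:k]) & known.get(c, set()))
        for c, genes in rankings.items()
    )

def drug_hits(rankings, known, k):
    """Criterion 2: compounds with at least one known target in the top k genes."""
    return sum(
        bool(set(genes[:k]) & known.get(c, set()))
        for c, genes in rankings.items()
    )

def auc(rankings, known, max_k, hits=target_hits):
    # Rectangle rule with unit steps in k (an assumption).
    return sum(hits(rankings, known, k) for k in range(1, max_k + 1))

def nauc(rankings_x, rankings_fd, known, max_k, hits=target_hits):
    """AUC of method X divided by the AUC of the FD-score baseline."""
    return auc(rankings_x, known, max_k, hits) / auc(rankings_fd, known, max_k, hits)

# Toy rankings (best gene first) for two compounds.
known = {"drugA": {"t1", "t2"}, "drugB": {"t3"}}
fd_rank = {"drugA": ["x", "t1", "y", "t2"], "drugB": ["x", "y", "z", "t3"]}
git_rank = {"drugA": ["t1", "t2", "x", "y"], "drugB": ["x", "t3", "y", "z"]}

nauc_t = nauc(git_rank, fd_rank, known, max_k=4)
nauc_d = nauc(git_rank, fd_rank, known, max_k=4, hits=drug_hits)
```

With these toy rankings both ratios exceed 1, meaning the hypothetical method recovers targets earlier in the ranking than the baseline under either criterion.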

When calculating the GITHIP-score and the GITHOP-score, we only considered the top q positive genetic interaction neighbors with highest weights and the top q negative genetic interaction neighbors with highest absolute weights for each gene in the network. We set q to 100 in all three screens. We showed in S7 Fig that the performance of our method is not sensitive to different values of q.
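The neighbor pruning step can be illustrated with a short helper that keeps the q strongest positive and the q strongest negative edges of one gene. The function name and the example weights are hypothetical.

```python
def prune_top_q(neighbors, q):
    """neighbors: {gene: signed weight}. Keep the q largest positive weights
    and the q negative weights with the largest absolute value."""
    positive = sorted(
        ((g, j) for j, g in neighbors.items() if g > 0), reverse=True
    )[:q]
    negative = sorted((g, j) for j, g in neighbors.items() if g < 0)[:q]
    return {j: g for g, j in positive + negative}

neighbors = {"a": 0.9, "b": 0.1, "c": -0.7, "d": -0.2, "e": 0.5}
kept_one = prune_top_q(neighbors, q=1)
kept_two = prune_top_q(neighbors, q=2)
```

Pruning each sign separately keeps positive and negative evidence balanced for genes whose interaction profile is dominated by one sign.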

Results

GIT substantially improves target identification in HIP assays

To evaluate GIT in HIP assays, we performed large-scale target identification on three chemical genomic screens. The results are summarized in Fig 3 and S1 Fig. Our GITHIP-score substantially outperforms other scoring methods on all three screens. For example, on the Hoepfner et al. 2014 screen, GIT achieves an nAUCt of 1.270, compared with 1.000 for the FD-score. On the Hillenmeyer et al. 2008 screen, GIT identifies 89 compound-target interactions in the top 150 genes, compared with 75 identified by the FD-score. Similar improvement was observed in terms of nAUCd on all three screens.

Fig 3. Comparison of GIT with other scoring methods in terms of nAUCt in HIP assays on three chemical genomic screens.

The y-axis shows the number of compound-target interactions identified in the top k genes, where k is shown on the x-axis.

https://doi.org/10.1371/journal.pcbi.1005553.g003

To further understand how the GITHIP-score achieves improved performance, we listed compound-target interactions that are ranked higher by the GITHIP-score than by the FD-score in Table 1. For instance, the FD-score failed to identify the interaction between HSP90 and geldanamycin because of the large FD-score (2.01) of HSP90 in the presence of geldanamycin. In contrast, our GITHIP-score successfully identified HSP90 as the target of geldanamycin by considering HSP90’s relevant negative genetic interaction neighbors (e.g., SGT1 with FD-score -8.77) and its positive genetic interaction neighbors (e.g., NSE3 with FD-score 0.68).

Table 1. Top compound-target interactions identified by GIT in HIP assays.

We listed examples of compound-target interactions that are ranked higher by the GITHIP-score than by the FD-score. We showed compound-target interactions that are identified in the top 100 genes by the GITHIP-score.

https://doi.org/10.1371/journal.pcbi.1005553.t001

We further compared the GITHIP-score with the ρ-score, which combines SGA profiles with chemical genomic screens via the Pearson correlation coefficient. We found that GIT substantially outperforms the ρ-score on all three chemical genomic screens (Fig 3 and S1 Fig). For instance, GIT achieves an nAUCt of 1.270 on the Hoepfner et al. 2014 screen, compared with 0.919 for the ρ-score. Consistent with previous work [23], the ρ-score performs consistently worse than the FD-score, possibly because the Pearson correlation coefficient is sensitive to the noise in high-throughput chemical genomic screens and SGA profiles.

Finally, we examined whether other molecular networks also enable good target identification performance. Since previous work [40] has used a protein-protein interaction network to identify essential genes in CRISPR screens, we tested whether a protein interaction network can also achieve good performance when used to identify drug targets. Therefore, we compared the performance of the protein interaction network and the genetic interaction network on chemical genomics screens. We first obtained a physical interaction network (PI) of yeast proteins from BioGRID V3.4 [41]. The edges in this physical interaction network are weighted but unsigned. We then used Eq 3 to calculate the GITHIP-score(PI) based on this physical interaction network, where gij is the weight between gene i and gene j in the physical interaction network. We found that the GITHIP-score performs better than the GITHIP-score(PI) (S2 Fig). Since the physical interaction network is unsigned, the GITHIP-score(PI) inevitably prioritizes hubs, which have a large number of neighbors. In contrast, by considering both negatively weighted edges and positively weighted edges in the genetic interaction network, our GITHIP-score is not biased towards hubs and thus substantially enhances target identification performance.

GIT substantially improves target identification in HOP assays

We next performed large-scale target identification on three chemical genomic screens to evaluate GIT in HOP assays. The results are summarized in Fig 4 and S3 Fig. Our GITHOP-score achieves the best target identification performance on all three chemical genomic screens. For instance, the GITHOP-score achieves an nAUCt of 1.355 on Lee et al. 2014, compared with 1.000 for the FD-score. We listed compound-target interactions that are ranked higher by the GITHOP-score than by the FD-score in Table 2. For instance, the FD-score fails to identify YSR2 as the target of sphingosine because the FD-score of YSR2 in the presence of sphingosine is close to zero (-0.017). In contrast, the GITHOP-score successfully identifies the interaction between YSR2 and sphingosine, mainly due to the high GIT1st-scores of YSR2’s positive genetic interaction neighbors (e.g., CSF1 with a GIT1st-score of 1.60) and the low GIT1st-scores of YSR2’s negative genetic interaction neighbors (e.g., VBM2 with a GIT1st-score of -3.00). The substantial improvement of GIT over the FD-score demonstrates the promise of utilizing two-hop genetic interaction neighbors’ FD-scores to identify targets in HOP assays.

Fig 4. Comparison of GIT with other scoring methods in terms of nAUCt in HOP assays on three chemical genomic screens.

The y-axis shows the number of compound-target interactions identified in the top k genes, where k is shown on the x-axis.

https://doi.org/10.1371/journal.pcbi.1005553.g004

Table 2. Top compound-target interactions identified by GIT in HOP assays.

We listed examples of compound-target interactions that are ranked higher by the GITHOP-score than by the FD-score. We showed compound-target interactions that are identified in the top 100 genes by the GITHOP-score.

https://doi.org/10.1371/journal.pcbi.1005553.t002

It is crucial to understand whether it is necessary to apply different scoring methods to the HIP assay and the HOP assay. To this end, we calculated the GITHIP-score in the HOP assay based on Eq 3, where FDic (FDjc) is the FD-score of the HOP assay rather than the HIP assay. Consequently, the GITHIP-score in the HOP assay corrects a gene’s FD-score only according to the FD-scores of its one-hop neighbors. Although the GITHIP-score in HOP assays achieves better performance in comparison to the FD-score and the ρ-score, we noticed that it is consistently worse than the GITHOP-score on all three screens (S4 Fig), suggesting the necessity of utilizing the FD-scores of two-hop neighbors to capture the drug target pathway buffer effect in the HOP assay.

We further investigated whether two-hop neighbors also enable better performance in HIP assays. We calculated the GITHOP-score in HIP assays by using the FD-score of HIP assays in Eqs 4 and 5. Unlike in HOP assays, the GITHOP-score does not yield a substantial improvement over the GITHIP-score in HIP assays, reflecting the inherent differences between the HIP assay and the HOP assay.

Finally, we examined whether using the physical interaction network enables good target identification performance in HOP assays. We used Eqs 4 and 5 to calculate the GITHOP-score(PI) based on the obtained physical interaction network, where gij is the edge weight between gene i and gene j in the physical interaction network. We compared the GITHOP-score(PI) with the GITHOP-score in HOP assays (S4 Fig). Similar to our observation in HIP assays, we found that using the genetic interaction network has an overall better performance than using the physical interaction network in HOP assays.

Combining the HIP assay with the HOP assay further improves target identification

Since the HIP assay and the HOP assay are inherently different, we then studied whether these two assays can be combined for improving target identification. We noticed that either using the GITHIP-score from the HIP assay or using the GITHOP-score from the HOP assay identifies compound-target interactions that are not identified by the other (S5 Fig). For example, in the Hoepfner et al. 2014 screen, the GITHOP-score identifies 11 compound-target interactions that are not discovered by the GITHIP-score. Since the HIP assay and the HOP assay are complementary, we sought to combine them in order to further enhance the target identification performance.

We proposed a combined scoring method that averages the z-score of the GITHIP-score from the HIP assay and the z-score of the GITHOP-score from the HOP assay. Averaging these two scores can be viewed as boosting two weaker scoring methods to create a more robust and better scoring method. We denote this score as the GIT-score. Since Lee et al. 2014 only measures the FD-scores of the heterozygous strains of essential genes and the homozygous strains of nonessential genes, there are no overlapping genes between the HIP assay and the HOP assay in this screen. Therefore, we evaluated the GIT-score on the other two chemical genomic screens. We observed that the combined GIT-score is substantially better than both the GITHIP-score and the GITHOP-score (Fig 5). For example, on the Hoepfner et al. 2014 screen, the GIT-score achieves an nAUCt of 1.802, compared with 1.284 for the GITHOP-score and 1.000 for the GITHIP-score.
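A rough sketch of the combination step for a single compound is given below. Standardizing across genes within each assay, using the population standard deviation, and restricting the result to genes measured in both assays are assumptions; the scores are invented.

```python
import statistics

def zscores(scores):
    """Standardize one compound's scores across genes (population std; assumption)."""
    values = list(scores.values())
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)  # must be non-zero for the division below
    return {gene: (s - mu) / sd for gene, s in scores.items()}

def combined_git(hip_scores, hop_scores):
    """Average the two z-scores for genes present in both assays."""
    z_hip, z_hop = zscores(hip_scores), zscores(hop_scores)
    shared = z_hip.keys() & z_hop.keys()
    return {gene: 0.5 * (z_hip[gene] + z_hop[gene]) for gene in shared}

# Toy scores: gene "a" is low in both assays, "b" and "c" are low in neither.
hip = {"a": -2.0, "b": 0.0, "c": 2.0}
hop = {"a": -1.0, "b": 1.0, "c": 0.0}
combined = combined_git(hip, hop)
best = min(combined, key=combined.get)
```

Converting to z-scores first puts the two assays on a common scale, so neither dominates the average simply because its raw scores have a wider range.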

Fig 5. Comparison of GIT with other scoring methods in terms of nAUCt on two chemical genomic screens.

The y-axis shows the number of compound-target interactions identified in the top k genes, where k is shown on the x-axis. The GITHIP-score is calculated by using the FD-score from the HIP assay. The GITHOP-score is calculated by using the FD-score from the HOP assay. To calculate nAUCt, we divided the AUC of each method by the AUC of the GITHIP-score.

https://doi.org/10.1371/journal.pcbi.1005553.g005

Statistical assessment of using genetic interaction network in GIT

Since drug targets are likely to be enriched with high-degree nodes in the network [42], the improvement of GIT may be caused by its ability to prioritize these high-degree nodes instead of by its use of neighboring genes’ FD-scores. We therefore examined whether the improvement of GIT comes from prioritizing high-degree nodes. We first constructed a large set of random networks according to the following procedure. For each node, we replaced each of its neighbors with another random node in the network while keeping the same edge weight and sign. Hence, each node in the new random networks has the same number of positively weighted neighbors and negatively weighted neighbors as in the original genetic interaction network. We then used these networks to calculate the GIT-score, where gij is the edge weight between gene i and gene j. We calculated an empirical p-value according to the number of random networks that perform better than the genetic interaction network when used to identify targets. Since each node in the new random networks has the same degree as it does in the original genetic interaction network, this empirical p-value tests whether the improvement of GIT is achieved by identifying high-degree nodes. We obtained significant empirical p-values on both screens (empirical p-value < 0.009 on Hoepfner et al. 2014; empirical p-value < 0.018 on Hillenmeyer et al. 2008). GIT on these random networks is therefore significantly worse than GIT on the original genetic interaction network. This demonstrates that the improvement comes from correcting each gene’s FD-score with its neighbors’ FD-scores rather than from the network topology alone.
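The randomization test can be sketched as follows. The rewiring helper reassigns each node's neighbors to random other nodes while preserving its list of signed weights, and the p-value is the fraction of random networks that score strictly better than the real one. The exact sampling scheme (no self-loops, no repeated neighbors), the plain fraction without a pseudo-count, and the example values are assumptions.

```python
import random

def rewire(network, rng):
    """Replace every node's neighbors with random other nodes,
    keeping that node's edge weights and signs unchanged."""
    genes = list(network)
    randomized = {}
    for gene, neighbors in network.items():
        candidates = [g for g in genes if g != gene]
        new_neighbors = rng.sample(candidates, len(neighbors))
        randomized[gene] = dict(zip(new_neighbors, neighbors.values()))
    return randomized

def empirical_p(observed, null_values):
    """Fraction of random networks that perform strictly better than observed."""
    better = sum(1 for value in null_values if value > observed)
    return better / len(null_values)

network = {
    "a": {"b": 0.5, "c": -0.3},
    "b": {"a": 0.5},
    "c": {"a": -0.3},
    "d": {},
}
shuffled = rewire(network, random.Random(0))

# Hypothetical nAUC values: one for the real network, four for random networks.
p_value = empirical_p(1.27, [1.0, 1.1, 1.3, 0.9])
```

Because every node keeps its own multiset of edge weights, a method that merely rewarded high-degree nodes would perform equally well on the rewired networks.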

GIT elucidates established mechanism of action of compound

To understand how GIT achieves the substantial improvement, we studied how the GIT-score elucidates a compound’s MoA. We examined caffeine, which is a distinct, small molecular inhibitor of the TOR complex [43, 44]. We analyzed GIT’s performance based on the Hoepfner et al. 2014 chemical genomic screen, which is the most recent of the three screens. We first noticed that the FD-scores of TOR1 in the HIP assay and the HOP assay are 0.079 and -1.827 in the presence of caffeine, respectively. Consequently, the FD-score fails to identify TOR1 as the very top target candidate. In contrast, our GIT-score successfully identifies TOR1 as the target of caffeine, mainly due to the high FD-scores of its positive genetic interaction neighbors SAC7 and UGP. The high FD-scores of SAC7 (1.75) and UGP (2.38) indicate that their positive genetic interaction neighbor TOR1 is inhibited by caffeine. Moreover, most of TOR1’s negative genetic interaction neighbors have substantially low FD-scores (e.g., GTR1 has an FD-score of -4.52). We show the genetic interaction neighbors of TOR1 in Fig 6. TOR1’s negative genetic interaction neighbors have low FD-scores, whereas its positive genetic interaction neighbors have high FD-scores. Even though TOR1 does not have a substantially low FD-score, GIT still accurately identifies TOR1 as the target of caffeine by correcting TOR1’s FD-score according to its genetic interaction neighbors’ FD-scores.

Fig 6. GIT elucidates MoA of caffeine.

Green lines indicate positive genetic interactions. Red lines indicate negative genetic interactions. Red nodes (e.g., GTR1) are genes with low FD-scores in the HIP assay and the HOP assay. Green nodes (e.g., SAC7) are genes with high FD-scores in the HIP assay and the HOP assay. TOR1 and TOR2 are established targets of caffeine. The GIT-score successfully identifies their interactions with caffeine by using neighboring genes’ FD-scores. TOR1 complex, TOR2 complex, and GSE complex are identified as crucial cellular components that interact with caffeine.

https://doi.org/10.1371/journal.pcbi.1005553.g006

Notably, the GIT-score also identifies TOR2 as a target of caffeine. Although the GIT-score ranks TOR2 lower than TOR1, it ranks TOR2 higher than the FD-score does. TOR2 is difficult to identify from the FD-scores of its one-hop genetic interaction neighbors alone. Both TOR1 and AVO1 have FD-scores close to zero. GTR1 has a substantially low FD-score, but the genetic interaction edge weight between GTR1 and TOR2 is much lower than the one between GTR1 and TOR1. Nevertheless, the GIT-score still identifies TOR2 as a target of caffeine through the GIT1st-scores of TOR2’s genetic interaction neighbors (e.g., GTR1, TOR1 and AVO1).

Fig 6 not only elucidates how the GIT-score identifies the targets of caffeine, but also reveals the underlying MoA of caffeine. Three major functional complexes interact with caffeine. Both the TOR1 complex and the TOR2 complex are established cellular components that are affected by caffeine [43, 44]. The GSE complex, along with PIB2 and YCL062W, is associated with vacuolar membranes, which play important roles in the TOR2 complex [45].

GIT discovers novel compound-target interactions

We then investigated whether the novel compound-target interactions discovered by the GIT-score are supported by existing literature. The output of GIT is a score for each compound-gene pair, and the top-ranking genes are the potential targets of each drug. Based on the average number of targets of each compound, we propose an empirical p-value of 0.001 as the cut-off for significant compound-target interactions. Here, for each compound, we examined the top five genes that the GIT-score predicted to be potential targets. We list the novel compound-target interactions that are supported by literature in Table 3. To show the advantage of using the genetic interaction network, we list only targets that cannot be identified by the FD-score. For example, the GIT-score successfully identifies the interaction between 5-fluorouracil and SSF1. This interaction is verified by a screen of haploid yeast knockout strains [46]. In contrast to the FD-score, which fails to identify SSF1, the GIT-score identifies this interaction according to the high FD-scores of SSF1’s positive genetic interaction neighbors (YLR407W, TSL26) and the low FD-scores of SSF1’s negative genetic interaction neighbors (RRP6, YMR268W-A). The GIT-score also identifies POL32 as the target of camptothecin, which is verified by a recent cross-species chemical genomic profiling study [47]. Again, the FD-score of POL32 in the presence of camptothecin is not significantly low (-0.38 in the HIP assay and -1.38 in the HOP assay). The GIT-score identifies POL32’s interaction with camptothecin through the high FD-scores of its positive genetic interaction neighbors (VPS39, BTS1 and RPL13A) and the low FD-scores of its negative genetic interaction neighbors (YCL060C, XRS1 and RTT110).

Table 3. Novel compound-target interactions identified by GIT.

We list compound-target interactions that are identified by the GIT-score but are currently not in any curated database. All these interactions are supported by literature. For each identified target, we also list its positive genetic interaction neighbors that have high FD-scores and its negative genetic interaction neighbors that have low FD-scores.

https://doi.org/10.1371/journal.pcbi.1005553.t003

GIT groups genes into co-functional gene complexes

In addition to elucidating a compound’s MoA, we studied whether the GIT-score can be used to identify co-functional gene complexes. We used k-means to cluster yeast genes into 100 clusters based on their GIT-scores in the presence of different compounds. For each cluster, we used Fisher’s exact test to test whether it was enriched with the annotated genes of at least one Gene Ontology term. We obtained Gene Ontology annotations from BioGRID V3.4 [41]. We compared the clustering obtained from the GIT-score with the clusterings obtained from the FD-score in the HIP assay and from the FD-score in the HOP assay on three Gene Ontology categories (Fig 7 and S6 Fig). The GIT-score discovers more established complexes than the FD-score. For example, at a false discovery rate of 0.005, 93 of the 100 clusters identified by the GIT-score are significantly enriched with at least one cellular component term. In contrast, the FD-score identifies only 75 such clusters in the HOP assay and 65 in the HIP assay. In addition, the GIT-score exclusively identifies many important cellular component complexes, such as pore complex; chromosome, centromeric region; and microtubule.
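The enrichment step can be sketched as follows, taking the cluster assignments (e.g., from k-means) as given. This is a minimal illustration under several assumptions that the text does not specify: the test is one-sided (over-representation only), p-values are corrected with the Benjamini-Hochberg procedure across all cluster-term pairs, and the function names and toy gene sets are hypothetical.

```python
from math import comb

def enrichment_p_value(n_genes, n_annotated, cluster_size, overlap):
    """One-sided Fisher's exact test (hypergeometric upper tail):
    probability of seeing at least `overlap` annotated genes in a
    cluster of `cluster_size` genes drawn from `n_genes` genes."""
    upper = min(n_annotated, cluster_size)
    tail = sum(
        comb(n_annotated, k) * comb(n_genes - n_annotated, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(n_genes, cluster_size)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest to smallest p
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def count_enriched_clusters(clusters, go_terms, all_genes, fdr=0.005):
    """Number of clusters enriched for at least one term at the given FDR."""
    cluster_ids, p_values = [], []
    for cluster_id, members in clusters.items():
        for annotated in go_terms.values():
            overlap = len(members & annotated)
            p_values.append(enrichment_p_value(
                len(all_genes), len(annotated), len(members), overlap))
            cluster_ids.append(cluster_id)
    adjusted = benjamini_hochberg(p_values)
    return len({c for c, q in zip(cluster_ids, adjusted) if q <= fdr})

# Toy data: 20 hypothetical genes, two clusters, one annotation term.
all_genes = {f"g{i}" for i in range(20)}
clusters = {1: {f"g{i}" for i in range(5)}, 2: {f"g{i}" for i in range(5, 10)}}
go_terms = {"term_x": {f"g{i}" for i in range(5)}}
n_enriched = count_enriched_clusters(clusters, go_terms, all_genes)
```

In this toy example only the first cluster coincides with the annotated gene set, so it is the only one counted as enriched. The exact integer arithmetic in math.comb avoids the floating-point underflow that factorial-based formulas can suffer on genome-scale gene counts.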

Fig 7. GIT identifies co-functional gene complexes.

The y-axis shows the number of co-functional gene complexes that can be identified by the GIT-score, the FD-score from the HIP assay, and the FD-score from the HOP assay at false discovery rate r, where r is shown on the x-axis. (A) shows the biological process category. (B) shows the cellular component category.

https://doi.org/10.1371/journal.pcbi.1005553.g007

Discussion

Here we have reported that, by using the prior knowledge captured in the genetic interaction network, compound-target interactions can be identified more accurately in chemical genomic screens. Our method identifies many compound-target interactions that are present in existing curated databases, as well as novel compound-target interactions that are supported by literature evidence. Because GIT models the genetic interactions among genes, it helps us better understand the mechanism of action of compounds, which may provide new insight into drug discovery and drug repositioning. Historically, genetic interaction profiles have been integrated with chemical genomic screens to identify compound-target interactions via the Pearson correlation coefficient [23], owing to the inherent similarity between genetic perturbation and chemical perturbation. Our study differs from these previous works in that different local network topology features are taken into consideration in HIP and HOP assays. To the best of our knowledge, this is the first time that HIP and HOP assays have been treated differently to decipher compound-target interactions.

One might consider at least three potential reasons for the good performance of GIT. First, existing high-throughput chemical genomic screens might be noisy. GIT is more robust to this noise because it uses genetic interaction neighbors’ FD-scores to assist the inference of drug targets. Second, the HOP assay and the HIP assay are fundamentally different biological assays and thus prioritize different sets of genes. We use direct neighbors’ FD-scores to identify compound-target interactions in the HIP assay, whereas we consider two-hop neighbors’ FD-scores to capture the drug target pathway buffer effect in the HOP assay. Moreover, combining predictions from the HIP assay and the HOP assay makes GIT even more robust. Finally, the genetic interaction network reflects the consequences of perturbing gene function and uncovers broader relationships between diverse functional modules, thus providing functional information that is largely invisible to physical interactions.

One interesting observation is that GIT achieves a substantial improvement in comparison to the ρ-score. Both the ρ-score and the GIT-score use genetic interactions to assist target identification. However, the ρ-score prioritizes genes according to the Pearson correlation between one gene’s chemical genomic profile and its genetic interaction profile. Consequently, it is sensitive to the noise in SGA and chemical genomic screens. In contrast, GIT scores a gene according to the dot product between its neighbors’ FD-scores and their genetic interaction edge weights, making it more robust to the noise. More importantly, in HOP assays, unlike the ρ-score, which considers only a gene’s one-hop genetic interaction neighbors, we also consider its two-hop genetic interaction neighbors. Our observation that the GITHOP-score (two-hop) performs much better than the GITHIP-score (one-hop) and the ρ-score (one-hop) in HOP assays demonstrates the promise of considering two-hop genetic interaction neighbors in the HOP assay.
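To make the dot-product idea concrete, the sketch below computes a neighbor-corrected score for TOR1 using the caffeine FD-scores reported above. It is an illustration under loudly stated assumptions, not a restatement of Eq 3 or Eq 5: the correction is assumed to be subtracted from the gene's own FD-score (so that a lower score marks a more likely target), the two-hop variant is assumed to reuse the neighbors' one-hop scores, and all edge weights are hypothetical.

```python
def one_hop_score(gene, fd, network):
    """FD-score of `gene` minus the dot product of its neighbors'
    FD-scores and edge weights (subtraction is an assumed convention)."""
    neighbors = network.get(gene, {})
    correction = sum(w * fd.get(nb, 0.0) for nb, w in neighbors.items())
    return fd.get(gene, 0.0) - correction

def two_hop_score(gene, fd, network):
    """Assumed two-hop variant: apply the same correction to the one-hop
    scores, so that neighbors of neighbors also influence `gene`."""
    all_genes = set(fd) | set(network)
    first_order = {g: one_hop_score(g, fd, network) for g in all_genes}
    return one_hop_score(gene, first_order, network)

# FD-scores in the presence of caffeine, as reported in the text.
fd = {"TOR1": 0.079, "SAC7": 1.75, "UGP": 2.38, "GTR1": -4.52}

# Edge weights are HYPOTHETICAL: positive for SAC7 and UGP (positive
# genetic interactions), negative for GTR1 (negative genetic interaction).
network = {"TOR1": {"SAC7": 0.2, "UGP": 0.1, "GTR1": -0.3}}

tor1_one_hop = one_hop_score("TOR1", fd, network)
tor1_two_hop = two_hop_score("TOR1", fd, network)
```

With these illustrative weights, high FD-scores on positive neighbors and low FD-scores on negative neighbors both push the corrected TOR1 score well below its raw FD-score of 0.079, which is the behavior described for caffeine. Because the score is a weighted sum rather than a correlation, a single noisy neighbor shifts it only in proportion to that neighbor's edge weight.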

Finally, we see many opportunities to improve upon the basic concept of GIT in future work. First, although the current GIT framework is unsupervised, the GIT-score can be used as a feature in an off-the-shelf machine learning classifier for compound target identification on chemical genomic screens. Second, although this study focused on yeast chemical genomic assays, the GIT method is broadly applicable to drug perturbation screens in other species [48]. Third, current genetic interaction networks are still noisy, and GIT could potentially be used to predict genetic interactions given known compound-target interactions. For example, a gene with a low FD-score might have a negative genetic interaction with the drug target. In comparison to model organisms such as yeast and worm, high-throughput measurement of genetic interactions in human is inherently difficult due to the lower efficiency of genetic engineering and the absence of resources like the yeast knockout collection. With large-scale drug perturbation screens available in human [48], GIT offers the intriguing opportunity to explore genetic interactions in human.

Supporting information

S1 Fig. Comparison of GIT with other scoring methods in terms of nAUCd in HIP assays on three chemical genomic screens.

The y-axis shows the number of compounds, at least one target of which can be identified in the top k genes, where k is shown on the x-axis.

https://doi.org/10.1371/journal.pcbi.1005553.s001

(EPS)

S2 Fig.

Panels A, C, and E compare GIT with other scoring methods in terms of nAUCt in HIP assays on three chemical genomic screens. Panels B, D, and F compare GIT with other scoring methods in terms of nAUCd in HIP assays on three chemical genomic screens. The GITHOP-score is calculated by applying Eq 5 to the HIP assay. The GITHIP-score(PI) is calculated based on the physical interaction network instead of the genetic interaction network.

https://doi.org/10.1371/journal.pcbi.1005553.s002

(EPS)

S3 Fig. Comparison of GIT with other scoring methods in terms of nAUCd in HOP assays on three chemical genomic screens.

The y-axis shows the number of compounds, at least one target of which can be identified in the top k genes, where k is shown on the x-axis.

https://doi.org/10.1371/journal.pcbi.1005553.s003

(EPS)

S4 Fig.

Panels A, C, and E compare GIT with other scoring methods in terms of nAUCt in HOP assays on three chemical genomic screens. Panels B, D, and F compare GIT with other scoring methods in terms of nAUCd in HOP assays on three chemical genomic screens. The GITHIP-score is calculated by applying Eq 3 to the HOP assay. The GITHOP-score(PI) is calculated based on the physical interaction network instead of the genetic interaction network.

https://doi.org/10.1371/journal.pcbi.1005553.s004

(EPS)

S5 Fig. Venn diagrams show the overlap between the identified interactions by using the GITHIP-score in the HIP assay and the identified interactions by using the GITHOP-score in the HOP assay on (A) Hoepfner et al. 2014 and (B) Hillenmeyer et al. 2008, respectively.

https://doi.org/10.1371/journal.pcbi.1005553.s005

(EPS)

S6 Fig. GIT identifies co-functional gene complexes in molecular function category.

The y-axis shows the number of co-functional gene complexes that can be identified by the GIT-score, the FD-score from the HIP assay, and the FD-score from the HOP assay at false discovery rate r, where r is shown on the x-axis.

https://doi.org/10.1371/journal.pcbi.1005553.s006

(EPS)

S7 Fig. The robustness of the result on different q values on three chemical genomics screens.

The y-axis is the nAUCt. The x-axis is different q values. We show the nAUCt of GITHIP-score by using different q values.

https://doi.org/10.1371/journal.pcbi.1005553.s007

(EPS)

S8 Fig. The results of using k-hop neighbors, where k ranges from 1 to 10.

The y-axis is the nAUCt. The x-axis is different k values. We show the nAUCt of GITHOP-score by using different k values. We found that k = 2 has the best overall performance on HOP assays. When k is larger, the performance becomes worse due to the increasing noise from long distance neighbors in the genetic interaction network.

https://doi.org/10.1371/journal.pcbi.1005553.s008

(EPS)

Author Contributions

  1. Conceptualization: SW JP.
  2. Data curation: SW.
  3. Formal analysis: SW JP.
  4. Funding acquisition: JP.
  5. Investigation: SW JP.
  6. Methodology: SW JP.
  7. Project administration: JP.
  8. Resources: JP.
  9. Software: SW.
  10. Supervision: JP.
  11. Validation: SW JP.
  12. Writing – original draft: SW JP.
  13. Writing – review & editing: SW JP.

References

  1. Smith AM, Ammar R, Nislow C, Giaever G. A survey of yeast genomic assays for drug and target discovery. Pharmacol Ther. 2010;127(2):156–164. pmid:20546776
  2. Nijman SMB. Functional genomics to uncover drug mechanism of action. Nat Chem Biol. 2015;11(12):942–948. pmid:26575241
  3. Giaever G, Flaherty P, Kumm J, Proctor M, Nislow C, Jaramillo DF, et al. Chemogenomic profiling: identifying the functional interactions of small molecules in yeast. Proc Natl Acad Sci U S A. 2004;101(3):793–798. pmid:14718668
  4. Roemer T, Davies J, Giaever G, Nislow C. Bugs, drugs and chemical genomics. Nat Chem Biol. 2012;8(1):46–56. pmid:22173359
  5. Giaever G, Flaherty P, Kumm J, Proctor M, Nislow C, Jaramillo DF, et al. Chemogenomic profiling: identifying the functional interactions of small molecules in yeast. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(3):793–798. pmid:14718668
  6. Chua HN, Roth FP. Discovering the targets of drugs via computational systems biology. Journal of Biological Chemistry. 2011;286(27):23653–23658. pmid:21566122
  7. Lum PY, Armour CD, Stepaniants SB, Cavet G, Wolf MK, Butler JS, et al. Discovering modes of action for therapeutic compounds using a genome-wide screen of yeast heterozygotes. Cell. 2004;116(1):121–137. pmid:14718172
  8. Hoon S, Smith AM, Wallace IM, Suresh S, Miranda M, Fung E, et al. An integrated platform of genomic assays reveals small-molecule bioactivities. Nature chemical biology. 2008;4(8):498–506. pmid:18622389
  9. Hillenmeyer ME, Fung E, Wildenhain J, Pierce SE, Hoon S, Lee W, et al. The chemical genomic portrait of yeast: uncovering a phenotype for all genes. Science. 2008;320(5874):362–365. pmid:18420932
  10. Lee AY, St Onge RP, Proctor MJ, Wallace IM, Nile AH, Spagnuolo PA, et al. Mapping the cellular response to small molecules using chemogenomic fitness signatures. Science. 2014;344(6180):208–211. pmid:24723613
  11. Hoepfner D, Helliwell SB, Sadlish H, Schuierer S, Filipuzzi I, Brachat S, et al. High-resolution chemical dissection of a model eukaryote reveals targets, pathways and gene functions. Microbiol Res. 2014;169(2–3):107–120. pmid:24360837
  12. Rix U, Uwe R, Giulio SF. Target profiling of small molecules by chemical proteomics. Nat Chem Biol. 2009;5(9):616–624. pmid:19690537
  13. Teotico DG, Babaoglu K, Rocklin GJ, Ferreira RS, Giannetti AM, Shoichet BK. Docking for fragment inhibitors of AmpC beta-lactamase. Proc Natl Acad Sci U S A. 2009;106(18):7455–7460. pmid:19416920
  14. Schreiber SL. Target-oriented and diversity-oriented organic synthesis in drug discovery. Science. 2000;287(5460):1964–1969. pmid:10720315
  15. Keiser MJ, Vincent S, Irwin JJ, Christian L, Abbas AI, Hufeisen SJ, et al. Predicting new molecular targets for known drugs. Nature. 2009;462(7270):175–181. pmid:19881490
  16. Chervitz SA, Aravind L, Sherlock G, Ball CA, Koonin EV, Dwight SS, et al. Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science. 1998;282(5396):2022–2028. pmid:9851918
  17. McGary KL, Park TJ, Woods JO, Cha HJ, Wallingford JB, Marcotte EM. Systematic discovery of nonobvious human disease models through orthologous phenotypes. Proceedings of the National Academy of Sciences. 2010;107(14):6544–6549. pmid:20308572
  18. Kohlwein SD. Obese and anorexic yeasts: experimental models to understand the metabolic syndrome and lipotoxicity. Biochimica et Biophysica Acta (BBA)-Molecular and Cell Biology of Lipids. 2010;1801(3):222–229. pmid:20056167
  19. Giaever G, Shoemaker DD, Jones TW, Liang H, Winzeler EA, Astromoff A, et al. Genomic profiling of drug sensitivities via induced haploinsufficiency. Nature genetics. 1999;21(3):278–283. pmid:10080179
  20. Costanzo M, VanderSluis B, Koch EN, Baryshnikova A, Pons C, Tan G, et al. A global genetic interaction network maps a wiring diagram of cellular function. Science. 2016;353(6306):aaf1420. pmid:27708008
  21. Yan Tong AH, Boone C. Synthetic genetic array analysis in Saccharomyces cerevisiae. Yeast Protocol. 2006; p. 171–191.
  22. Parsons AB, Brost RL, Huiming D, Zhijian L, Chaoying Z, Bilal S, et al. Integration of chemical-genetic and genetic interaction data links bioactive compounds to cellular target pathways. Nat Biotechnol. 2003;22(1):62–69. pmid:14661025
  23. Heiskanen MA, Aittokallio T. Predicting drug–target interactions through integrative analysis of chemogenetic assays in yeast. Mol Biosyst. 2013;9(4):768–779. pmid:23420501
  24. Baryshnikova A, Costanzo M, Myers CL, Andrews B, Boone C. Genetic interaction networks: toward an understanding of heritability. Annu Rev Genomics Hum Genet. 2013;14:111–133. pmid:23808365
  25. Bender A, Pringle JR. Use of a screen for synthetic lethal and multicopy suppressee mutants to identify two new genes involved in morphogenesis in Saccharomyces cerevisiae. Mol Cell Biol. 1991;11(3):1295–1305. pmid:1996092
  26. Guarente L. Synthetic enhancement in gene interaction: a genetic tool come of age. Trends Genet. 1993;9(10):362–366. pmid:8273152
  27. Tong AH, Evangelista M, Parsons AB, Xu H, Bader GD, Pagé N, et al. Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science. 2001;294(5550):2364–2368. pmid:11743205
  28. St Onge RP, Mani R, Oh J, Proctor M, Fung E, Davis RW, et al. Systematic pathway analysis using high-resolution fitness profiling of combinatorial gene deletions. Nat Genet. 2007;39(2):199–206. pmid:17206143
  29. Drees BL, Thorsson V, Carter GW, Rives AW, Raymond MZ, Avila-Campillo I, et al. Derivation of genetic interaction networks from quantitative phenotype data. Genome Biol. 2005;6(4):R38. pmid:15833125
  30. Mani R, St Onge RP, Hartman JL, Giaever G, Roth FP. Defining genetic interaction. Proceedings of the National Academy of Sciences. 2008;105(9):3461–3466.
  31. Ben-Shitrit T, Yosef N, Shemesh K, Sharan R, Ruppin E, Kupiec M. Systematic identification of gene annotation errors in the widely used yeast mutation collections. Nature methods. 2012;9(4):373–378. pmid:22306811
  32. Addinall SG, Downey M, Yu M, Zubko MK, Dewar J, Leake A, et al. A genomewide suppressor and enhancer analysis of cdc13-1 reveals varied cellular processes influencing telomere capping in Saccharomyces cerevisiae. Genetics. 2008;180(4):2251–2266. pmid:18845848
  33. Pan X, Ye P, Yuan DS, Wang X, Bader JS, Boeke JD. A DNA integrity network in the yeast Saccharomyces cerevisiae. Cell. 2006;124(5):1069–1081. pmid:16487579
  34. Tong AHY, Evangelista M, Parsons AB, Xu H, Bader GD, Page N, et al. Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science. 2001;294(5550):2364–2368. pmid:11743205
  35. Baryshnikova A, Costanzo M, Kim Y, Ding H, Koh J, Toufighi K, et al. Quantitative analysis of fitness and genetic interactions in yeast on a genome scale. Nature methods. 2010;7(12):1017–1024. pmid:21076421
  36. Srivas R, Shen JP, Yang CC, Sun SM, Li J, Gross AM, et al. A Network of Conserved Synthetic Lethal Interactions for Exploration of Precision Cancer Therapy. Molecular Cell. 2016;63(3):514–525. pmid:27453043
  37. Jerby-Arnon L, Pfetzer N, Waldman YY, McGarry L, James D, Shanks E, et al. Predicting cancer-specific vulnerability via data-driven detection of synthetic lethality. Cell. 2014;158(5):1199–1209. pmid:25171417
  38. Paul JM, Templeton SD, Akanksha B, Andrew F, Vizeacoumar FJ. Building high-resolution synthetic lethal networks: a ‘Google map’ of the cancer cell. Trends Mol Med. 2014;20(12):704–715. pmid:25446836
  39. Kuhn M, Michael K, Damian S, Sune PF, Blicher TH, von Mering C, et al. STITCH 4: integration of protein–chemical interactions with user data. Nucleic Acids Res. 2013;42(D1):D401–D407. pmid:24293645
  40. Jiang P, Wang H, Li W, Zang C, Li B, Wong YJ, et al. Network analysis of gene essentiality in functional genomics experiments. Genome biology. 2015;16(1):239. pmid:26518695
  41. Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. BioGRID: a general repository for interaction datasets. Nucleic acids research. 2006;34(suppl 1):D535–D539. pmid:16381927
  42. Russell RB, Aloy P. Targeting and tinkering with interaction networks. Nature chemical biology. 2008;4(11):666–673. pmid:18936751
  43. Reinke A, Chen JCY, Aronova S, Powers T. Caffeine targets TOR complex I and provides evidence for a regulatory link between the FRB and kinase domains of Tor1p. J Biol Chem. 2006;281(42):31616–31626. pmid:16923813
  44. Wanke V, Cameroni E, Uotila A, Piccolis M, Urban J, Loewith R, et al. Caffeine extends yeast lifespan by targeting TORC1. Mol Microbiol. 2008;69(1):277–285. pmid:18513215
  45. Cardenas ME, Heitman J. FKBP12-rapamycin target TOR2 is a vacuolar protein with an associated phosphatidylinositol-4 kinase activity. EMBO J. 1995;14(23):5892–5907. pmid:8846782
  46. Gustavsson M, Ronne H. Evidence that tRNA modifying enzymes are important in vivo targets for 5-fluorouracil in yeast. RNA. 2008;14(4):666–674. pmid:18314501
  47. Kapitzky L, Beltrao P, Berens TJ, Gassner N, Zhou C, Wüster A, et al. Cross-species chemogenomic profiling reveals evolutionarily conserved drug mode of action. Mol Syst Biol. 2010;6:451. pmid:21179023
  48. Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 2006;313(5795):1929–1935. pmid:17008526