A negative genetic interaction map in isogenic cancer cell lines reveals cancer cell vulnerabilities

Improved efforts are necessary to define the functional product of cancer mutations currently being revealed through large-scale sequencing efforts. Using genome-scale pooled shRNA screening technology, we mapped negative genetic interactions across a set of isogenic cancer cell lines and confirmed hundreds of these interactions in orthogonal co-culture competition assays to generate a high-confidence genetic interaction network of differentially essential or differential essentiality (DiE) genes. The network uncovered examples of conserved genetic interactions, densely connected functional modules derived from comparative genomics with model systems data, functions for uncharacterized genes in the human genome and targetable vulnerabilities. Finally, we demonstrate a general applicability of DiE gene signatures in determining genetic dependencies of other non-isogenic cancer cell lines. For example, the PTEN−/− DiE genes reveal a signature that can preferentially classify PTEN-dependent genotypes across a series of non-isogenic cell lines derived from the breast, pancreas and ovarian cancers. Our reference network suggests that many cancer vulnerabilities remain to be discovered through systematic derivation of a network of differentially essential genes in an isogenic cancer cell model.

Supplementary Information
Vizeacoumar et al.

pooled lentiviral plasmid DNA. 4 × 10^7 HCT116 cells per replicate were infected with 80 k lentiviral shRNA pools at an MOI of 0.3-0.4. After two days of selection in media containing 2 µg/mL puromycin (Sigma-Aldrich) to eliminate uninfected cells, genomic DNA was prepared from shRNA-infected cell populations (Blood Maxi prep kit, Qiagen). Therefore, each hairpin was represented >200 times in the screening populations. Half-hairpin barcodes were prepared from genomic DNA samples using 30 μg of DNA obtained from at least 5 × 10^6 infected cells, so that 60-70 fold representation was obtained from the starting amount of gDNA. A master mixture for each sample containing 30 μg of template DNA, 2x PCR buffer, 2x enhancer solution, 300 nM each dNTP, 900 μM each oligonucleotide primer (PCR_BF 5'-Biotin-AATGGACTATCATATGCTTACCGTAACTTGAA-3' and PCR_R 5'-TGTGGATGAATACTGCCATTTGTCTCGAGGTC-3'), 50 mM MgSO4, 45 units of Platinum Pfx polymerase (Invitrogen), and water to 1200 μl was made and divided into 100 μl aliquots. The amplification reaction was performed by denaturing once at 94°C for 5 minutes, followed by 30 cycles of (94°C for 15 seconds, 55°C for 15 seconds, 68°C for 20 seconds), 68°C for 5 minutes, then cooling to 4°C. The PCR product (178 bp) was run on a 2% agarose gel to make sure that the amplified shRNA sequence did not form a cruciform structure (225 bp). PCR products were immediately purified using the QIAquick PCR purification kit (Qiagen) to avoid conversion of the linear product to cruciform DNA and immediately digested with XhoI (New England Biolabs) for 2 hours at 37°C to generate a thermostable half-hairpin probe (~106 bp). This was then gel purified, and remaining salts were cleared using a PCR purification kit (Qiagen) with two elutions of 30 μl of EB buffer (Qiagen). An average yield of 3 to 3.5 μg of each sample was obtained from this procedure. Probe hybridization onto UT-GMAP 1.0 microarrays (Affymetrix Inc) was done as described previously.
The pooled shRNA screen to identify genetic interactions with Cetuximab/Erbitux (Bristol-Myers Squibb and Eli Lilly & Company) was performed similarly to the HCT116 screens described above. Briefly, Lim1215 colon cancer cells were grown to a density of Twenty-four hours later, the medium in each of three of the six replicates was replaced with fresh medium containing 0.01 µg/mL Cetuximab; the medium in the three untreated controls was replaced with normal McCoy's 5A full medium without Cetuximab. Six days post treatment, three aliquots of 1.6 × 10^7 cells from each replicate were removed, pelleted, and frozen (i.e. T6 time point) while one aliquot of 1.6 × 10^7 cells was re-plated for further growth. T12 and T18 time points were collected the same way as the T6 time point for the cells growing in the presence or the absence of Cetuximab. Genomic DNA was prepared from cell pellets using the QIAamp Blood Maxi kit (Qiagen), precipitated using ethanol and NaCl, and resuspended at 400 ng/mL in 10 mM Tris-HCl, pH 7.5. shRNA populations from cell lines were amplified via PCR and applied to UT-GMAP microarrays (Affymetrix) as described (Marcotte et al, 2012).

Computational scoring of pooled screens.
Hairpin scoring was as described previously (Marcotte et al, 2012). Briefly, expression intensities from triplicate screens were averaged, and individual hairpin features were filtered from further consideration if the initial mean log2-expression intensity was below 7.5.
Measurements collected over multiple time points were then integrated using the shRNA Activity Ranking Profile score (shARP), as shown in Equation 1:

$$\mathrm{shARP} = \frac{1}{n}\sum_{i=1}^{n}\frac{\Delta y_i}{\Delta x_i} \qquad (1)$$

where n is the number of time points, Δy_i is the change in expression intensity at t_i relative to t_0, and Δx_i is the number of doublings for the cell line at t_i relative to t_0. shARP scores were determined for each of the 78,432 library hairpins, and were then used to calculate the Gene Activity Ranking Profile score (GARP) by averaging the two lowest shARP scores. A significance value was assigned to each GARP score through bootstrapping, where the shARP scores were randomly permuted 1000 times, GARP scores recomputed, and a p-value determined by the frequency with which the actual GARP score was lower than the permuted GARP scores. To facilitate comparisons between screens, GARP scores were Z-score normalized. Finally, to obtain the top hits from the GARP score, we subtracted the GARP of the parental cell line from the GARP of the mutant (dGARP) on a gene-by-gene basis, and this was sorted to obtain the rank-ordered list of hits.
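The scoring steps can be illustrated with a minimal Python sketch. It assumes that shARP is the mean slope of intensity change per doubling across time points, that the permutation p-value is the fraction of random hairpin draws whose GARP is at or below the observed GARP, and that Z-scores use the population standard deviation. All function names and the data layout are hypothetical and are not taken from the published pipeline.

```python
import random
from statistics import mean, pstdev

def sharp(intensities, doublings):
    """Mean slope of intensity vs. doublings relative to t0 (index 0)."""
    y0 = intensities[0]
    slopes = [(y - y0) / x for y, x in zip(intensities[1:], doublings[1:])]
    return mean(slopes)

def garp(sharp_scores):
    """Average of the two lowest shARP scores of one gene."""
    return mean(sorted(sharp_scores)[:2])

def garp_pvalue(observed_garp, n_hairpins, all_sharp_scores, n_perm=1000, seed=0):
    """Fraction of random hairpin draws with GARP <= observed (assumed convention)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        sampled = rng.sample(all_sharp_scores, n_hairpins)
        if garp(sampled) <= observed_garp:
            hits += 1
    return hits / n_perm

def zscores(values):
    """Z-score normalization using the population standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def dgarp(mutant_garp, parental_garp):
    """Sorted (dGARP, gene) pairs: Z-normalized mutant minus parental GARP."""
    genes = sorted(mutant_garp)
    zm = dict(zip(genes, zscores([mutant_garp[g] for g in genes])))
    zp = dict(zip(genes, zscores([parental_garp[g] for g in genes])))
    return sorted((zm[g] - zp[g], g) for g in genes)
```

Sorting the `dgarp` output in ascending order places the strongest candidate negative interactions first, which mirrors the rank-ordered hit list described above.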

mRNA expression profiling
Transcriptome profiles for all the HCT116 cell lines screened were generated using Affymetrix GeneChip® Human Gene 1.0 ST arrays. Total RNA was extracted from cells grown to 70% confluency using the RNeasy kit (Qiagen). One microgram of RNA for each sample was processed using the Affymetrix GeneChip® Whole Transcript Sense Target Labeling Assay. The array carries at least 26 probes spread across the entire gene sequence for each of 28,869 genes in the human genome. Hybridization targets were Clara, USA). The arrays were scanned using the Affymetrix GeneChip® Scanner 3000 7G plus (Affymetrix, Santa Clara, USA). An example of the cumulative percentage of expressed genes, Affymetrix controls including expressed and non-expressed genes, as well as negative controls (e.g. intron sequences, miRNAs) for a given sample is shown in Supplementary Fig. 14A. Genes were filtered as not expressed if the mean log2-expression intensity was below 6.75. Using this cutoff, a Venn diagram was constructed to summarize the number of genes that were expressed in each of the HCT116 cell lines and also targeted by the TRC lentiviral pooled library that was used to screen these cell lines (Supplementary Fig. 14B).

Quantitative PCR methods.
Cells were plated into 24-well plates (5 × 10^4 cells/well) or 6-well plates (200,000 cells/well) and infected the following day with the appropriate shRNA.

Dual color HCS-based competition assay.
To validate the candidate negative genetic interactions or synthetic lethal interactions from the primary screens, we first selected a list of hits based on the GARP scores (p<0.05), shARP scores (p<0.01), genes that had significant GARP scores and were also differentially expressed in the mutant lines versus the parental cells (p<0.05), and finally genes that had yeast orthologs with significant GARP scores (p<0.05) (Supplementary Figure 2B). In the selection of yeast orthologs, we included paralogs of human genes based on Inparanoid, P-POD, OrthoMCL and Genecards in order to maximize the number of candidate conserved genetic interactions in our validation list (Chen et al, 2006; Heinicke et al, 2007; Ostlund et al, 2010; Safran et al, 2010).
Dual color competition assays were developed (Torrance et al, 2001) for all five queries, where we monitored live cells by time-lapse imaging over a period of seven days following siRNA transfection against target genes listed in Supplemental Table 4. Briefly, red (mRFP) or green (EGFP) fluorescent proteins were integrated into each of the HCT116 cell lines that were screened using pLJM5 (RFP-hygro) or pLJM7 (GFP-hygro) lentiviral constructs, respectively (J. Moffat, unpublished). Stable cell lines expressing red or green fluorescent proteins were generated following a 2-week selection in 100 µg/mL hygromycin and tested for growth rate either alone or in a mixing experiment. For the competition assays, mutant query cells that were green and parental cells that were red were seeded in 384-well plates for reverse transfection with siRNAs from an orthogonal RNA interference library

Time-lapse imaging.
HeLa cells stably expressing murine NEDD1::GFP from a bacterial artificial chromosome (Lawo et al, 2009) were infected with the indicated shRNAs and, following two days of selection with puromycin, were seeded in Lab-Tek II chambers (Nalge Nunc 10 thymidine and fresh media was added and incubated for 7.5 hours to release the cells. Following release, fresh media containing 2 mM thymidine (to block cells in G1/S and S phase) or 200 mM nocodazole (to block cells in G2/M and M) was added and cells were incubated for 16 hours (second block). After the second block, thymidine or nocodazole was removed by washing with 1X PBS and the cells were released into fresh media. Lysates were prepared at different times following release and were probed for different Cyclin proteins by western blotting using Cyclin Antibody Sampler Kit #9869 (Cell Signaling).

Epitope tagging and western blotting.
The TTC31 gene-containing plasmid (HsCD00336882) was obtained from PlasmID in the pBluescript vector and converted to a Gateway-compatible construct by adding appropriate "att B1 and B2" sites. Forward primer GGGG ACA AGT TTG TAC AAA AAA GCA GGC TNN-ATG GCG CCG ATT CCA AAG and reverse primer GGGGAC CAC TTT GTA CAA GAA AGC TGG GTN TCT GGC CTG AGA CAG ATG were used to amplify TTC31 and clone it into pDONR221. The Gateway-compatible destination plasmids pLD-puro-CcVA or pLX304 were used to generate TTC31 expression constructs with either a VA tag or a V5 tag, respectively (Mak et al, 2010; Yang et al, 2011). To construct TUB3B-GFP and Histone-BFP, we used the pLD-puro-Cc-tGFP. This destination plasmid was constructed using a turbo-GFP amplicon, cloned into pLD-puro-CcVA using the XbaI and BstBI sites using the primers XbaI-turboGFP

Cetuximab validation assay.
Lim1215 cells were infected in suspension with individual shRNA and plated in triplicate on 24-well plates (5 × 10^3 cells/well, 25 µl virus/well, 8 µg/mL of polybrene). Twenty-four hours post-infection, the medium was replaced with fresh medium containing 3 µg/mL puromycin. After 48 hours of incubation, the medium was replaced with fresh full growth medium or medium containing Cetuximab (0.01 µg/mL) and incubated for an additional four days.
Culture dishes were washed with warm PBS to remove dead cells, and surviving cells were collected by trypsinization at 37°C and counted using a Hemocytometer. LacZ shRNA was used as a control.

CellTiter-Glo® based luminescent cell viability assay.
HCT116 cells were infected in suspension with individual shRNA and plated in triplicate on 96-well plates (0.8 × 10^3 cells/well, 5 µl virus/well, 8 µg/mL of polybrene). Twenty-four hours post-infection, the medium was replaced with fresh medium containing 3 µg/mL puromycin. After 48 hours of incubation, the medium was replaced with fresh full growth medium or medium containing Cetuximab (0.01 µg/mL) and cells were incubated for an additional four days. The CellTiter-Glo® Luminescent Cell Viability Assay was performed according to the manufacturer's protocol. Briefly, CellTiter-Glo® Reagent was added directly to the wells containing cells cultured in full medium and luminescence was recorded 10 minutes after reagent addition using a Synergy 2 plate reader (BioTek).

Rescue/complementation of TTC31 and CD83 knockdown phenotypes.
The TTC31 gene-containing plasmid (HsCD00336882) was obtained from PlasmID in the pBluescript vector, converted to a Gateway-compatible construct by adding appropriate "att B1 and B2" sites, and cloned into the pLX304 plasmid for use as a wild-type expression construct.
Gene synthesis was used to introduce silent mutations in the ORF region of TTC31 corresponding to the binding sites of the two hairpins as shown below to make shRNA resistant (shR) constructs.
For CD83, human CD83 cDNA (IMAGE ID 4818856) was cloned into the destination plasmid pLD-puro-Cc-tGFP. The CD83 shRNA-resistant construct was created using the GENEART Site-Directed Mutagenesis System (Invitrogen) with the following mutations: Stable cell lines expressing both the wild type and the shRNA-resistant constructs were generated as described above and infected with hairpins to examine whether or not the shR constructs could rescue the defects in proliferation. For example, HCT116 cells were infected with lentiviruses expressing WT-CD83-GFP or shR-CD83-GFP and stable lines were generated. GFP-expressing cells were infected in suspension with individual lentiviruses (sh1CD83, shLacZ or shPSMD1) and plated into 12-well plates at 10^4 cells/well, 50 µl virus/well, 8 µg/mL of polybrene. Twenty-four hours post-infection, virus was removed and the medium was replaced with fresh medium. Six days later, cells were harvested by trypsinization, stained with 7-AAD (BioLegend), counted using a BD FacsCalibur analyzer with CellQuest Pro software and analyzed using FlowJo (v.7.6.5). Each sample was run for 204.

A bootstrap approach was used to assess the significance of the difference in tumor growth as follows. Each tumor was associated with a treatment T (either Cetuximab or Mock) and a hairpin H (either Untransduced, shLacZ or sh1_CD83). The mean of tumor weights associated with each combination of treatment and hairpin was denoted W_{H,T}. With this, the following quantities were defined The analysis was designed to assess whether, as indicated by the synthetic lethal relationship between CD83 and Cetuximab observed in vitro, CD83-silenced tumors treated with Cetuximab exhibited a greater relative decrease in weight when compared to CD83-normal tumors subjected to the same treatment. In terms of the quantities defined above, these questions were expressed as the alternative hypotheses RE_{sh1CD83,LacZ} < 1 and RE_{sh1CD83,Untransduced} < 1.
To ensure bootstrap distribution symmetry, log(RE_{sh1CD83,LacZ}) < 0 and log(RE_{sh1CD83,Untransduced}) < 0 were used in the calculations instead. To test these hypotheses, 10^6 bootstrap samples were generated from the tumor weights data. The sets of weights associated with each treatment and hairpin pair were separately sampled with replacement and aggregated to form the bootstrap sample, in order to adequately represent the experimental structure as outlined in Efron and Tibshirani (Efron & Tibshirani, 1993).

inspection and Quantile-Quantile plots against the Normal distribution. Estimates of bootstrap bias for log(RE) estimates were less than 2.5% of estimates of log(RE) standard error in all cases. Using these bootstrap samples, single-tailed p-values were calculated for each hypothesis test following Efron and Tibshirani (Efron & Tibshirani, 1993). Displayed error bars were also estimated by bootstrap for each of the plotted bars representing the Weights Ratios
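A stratified bootstrap of this kind can be sketched in Python as follows. Two assumptions are made for illustration only: the Weights Ratio for a hairpin H is taken to be W_{H,Cetuximab} / W_{H,Mock}, and RE_{H1,H2} is taken to be the ratio of the two Weights Ratios. The single-tailed p-value is approximated as the fraction of bootstrap replicates with log(RE) ≥ 0, which is one simple percentile-style choice and may differ from the exact procedure that was used. Function names and the data layout are hypothetical.

```python
import math
import random
from statistics import mean

CETUXIMAB, MOCK = "Cetuximab", "Mock"

def log_re(weights, h1, h2):
    """log(RE_{h1,h2}) under the assumed definition WR_h1 / WR_h2."""
    def weights_ratio(h):
        # Assumed: mean treated weight divided by mean mock weight.
        return mean(weights[(h, CETUXIMAB)]) / mean(weights[(h, MOCK)])
    return math.log(weights_ratio(h1) / weights_ratio(h2))

def bootstrap_test(weights, h1, h2, n_boot=10_000, seed=0):
    """Return (observed log RE, one-tailed p-value for log RE < 0).

    weights maps (hairpin, treatment) -> list of tumor weights. Each
    group is resampled with replacement on its own, preserving the
    group sizes of the experiment.
    """
    rng = random.Random(seed)
    observed = log_re(weights, h1, h2)
    not_below_zero = 0
    for _ in range(n_boot):
        resampled = {group: rng.choices(values, k=len(values))
                     for group, values in weights.items()}
        if log_re(resampled, h1, h2) >= 0:
            not_below_zero += 1
    return observed, not_below_zero / n_boot
```

Resampling within each treatment and hairpin group, rather than from the pooled weights, keeps every bootstrap sample structurally identical to the original experiment.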

Soft agar assay.
Soft agar assays were performed using Human fibroblast ( expression of HKDC1 was confirmed by Western blotting using anti-FLAG antibody (Abcam-ab49763).

Expression correlation analysis for (1) HKDC1 expression versus RAS signaling and (2) PTTG1 expression versus DHFR expression in methotrexate sensitive and resistant cell lines.
To examine the association between HKDC1 expression and RAS signaling dependence, two multi-tissue expression studies were used: the Cancer Cell Line (Loboda et al, 2010) was used to score each sample in the CCLE and EXPO datasets, following the procedure detailed elsewhere (Loboda et al, 2009). Briefly, 147 gene symbols from the RAS signature were matched against genes profiled on the U133plus2 platform, yielding expression profiles for 143 signature genes. Each study was scored separately. The signature's two branches reflect genes up- and down-regulated in a RAS signaling-dependent state. The RAS dependence score for each sample S was calculated as follows: Higher scores suggest greater reliance of the tumor or cell line on RAS pathway signaling, and a potential decrease in proliferation rate and viability should this signaling be inhibited.
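As an illustration of how a two-branch signature score can be computed, the Python sketch below assumes that the score of a sample is the mean expression of the up-regulated branch minus the mean expression of the down-regulated branch. This form is an assumption made here for illustration; the exact formula should be taken from Loboda et al. The function name and data layout are hypothetical.

```python
from statistics import mean

def ras_dependence_score(expression, up_genes, down_genes):
    """Assumed score: mean(up branch) - mean(down branch) for one sample.

    expression maps gene symbol -> log2 expression in the sample.
    Signature genes that are not profiled on the platform are skipped,
    mirroring the matching of signature symbols to profiled genes.
    """
    up = [expression[g] for g in up_genes if g in expression]
    down = [expression[g] for g in down_genes if g in expression]
    return mean(up) - mean(down)
```

Under this assumed form, a positive score indicates that the up-regulated branch is more highly expressed than the down-regulated branch in that sample.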
Examination of the RAS pathway dependence scores obtained for each study revealed bimodal score distributions, with one major peak on either side of zero. These score peaks

Collection of negative genetic interaction data from model systems and other human screens.
To identify interactions which directly (i.e. human interactions) or indirectly (i.e. interactions of orthologs in other species) support the relationships uncovered in our screen, we collected an exhaustive dataset of known genetic and physical interactions from human, yeast, worm, fly, and mouse by combining data from iRefWeb (Turner et al, 2010) and BioGrid (Stark et al, 2011). Both resources curate interactions from various primary sources and thus comprise the largest collection of interaction data available. We also integrated data from a recent KRAS synthetic lethal screen (Luo et al, 2009), and an updated set of genetic interaction data from yeast, neither of which was available in the curated databases. For the yeast genetic interaction data, we chose a cut-off for the absolute genetic interaction score of 0.08, which corresponds to a p-value of 0.05 (Koh et al, 2010). For KRAS, we used yeast genetic interactions of IRA1 and IRA2, as deletion of these genes mimics a constitutively active form of KRAS (Tanaka et al, 1989; Tanaka et al, 1990a; Tanaka et al, 1990b). We then mapped the set of the most significant genetic interactions to these databases using the MP-eggNOG procedure described below.
However, these orthologous groupings contain many-to-many mappings between genes that do not necessarily reflect meaningful relationships (e.g. divergent paralogs with distinct functions). Orthologous group based methods are more sensitive for large evolutionary distances (e.g. between yeast and human) compared to methods based on bi-directional best BLAST hits such as Inparanoid (Ostlund et al, 2010). We implemented a stringent approach based on eggNOG 2.0 resulting in a mapping of the most probable ortholog for each human gene, which we describe here. We adapted an initial ortholog mapping using a combination of eukaryotic and general orthologous groups (euNOGs, COGs, and KOGs in the eggNOG terminology) from the eggNOG database (Muller et al, 2010). These orthologous groups contain orthologous pairs of the same genes in different species, but might also comprise paralogs and their respective orthologs. Consequently, we determined for each human gene the most probable orthologous gene in the species of the evolutionarily older, less complex clades. To do this, we detected for each human gene the most similar sequence for each species of interest within the largest group it has been assigned to (by measuring similarity using the Smith-Waterman algorithm with BLOSUM62 substitution matrix, gap opening penalty 2, extension penalty 1), since the largest group should be most sensitive in picking up distant relationships. As a consequence, most human genes have been assigned to exactly one gene in each of these species, where several human genes may target the same ortholog (e.g. certain kinases that have undergone gene duplication events leading to multiple paralogs/co-orthologs will have the same yeast gene as an ortholog). Our goal was to report only highly likely orthologous relationships, so we further pruned potential false positive relationships based on shared domains. Thus, we removed all pairs of the detected bona fide orthologs with a sequence identity <25% over the complete sequence and those forming alignments with less than 75% coverage for either of the two sequences. This procedure is represented in Supplementary Figure 3. For the orthologs used as supporting evidence of a genetic interaction in our dataset (see below), we manually inspected the orthology assignment and removed further false positive ortholog relationships if contradictory annotations (e.g. molecular function) of the sequences were present. The sequence alignment for each ortholog pair can be found in Supplementary Table 11.
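The identity and coverage pruning step can be sketched in Python as below. The sketch assumes that the alignment has already been computed and is given as two equal-length gapped strings, that identity is measured against the length of the longer full sequence, and that coverage is the aligned fraction of each full sequence. These conventions and the function name are assumptions made for illustration; the alignment itself is not implemented here.

```python
def keep_ortholog_pair(aligned_a, aligned_b, len_a, len_b,
                       min_identity=0.25, min_coverage=0.75):
    """Return True if an aligned pair passes both pruning thresholds.

    aligned_a / aligned_b: aligned segments with '-' marking gaps.
    len_a / len_b: lengths of the complete, unaligned sequences.
    """
    # Identical, non-gap columns in the alignment.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    # Assumed: identity over the complete (longer) sequence.
    identity = matches / max(len_a, len_b)
    # Fraction of each complete sequence that is covered by the alignment.
    cov_a = sum(1 for x in aligned_a if x != "-") / len_a
    cov_b = sum(1 for y in aligned_b if y != "-") / len_b
    return identity >= min_identity and min(cov_a, cov_b) >= min_coverage
```

Requiring the coverage threshold for both sequences rejects pairs in which a short protein matches only one domain of a much longer one.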

Assessment of conservation of the negative genetic interactions.
While genes themselves may be conserved, there is conflicting evidence whether the corresponding genetic interactions are conserved over large evolutionary distances (Byrne et al, 2007;Tarailo et al, 2007;Tischler et al, 2008). Therefore, we tested the significance of the compiled supporting evidence as follows. First, we calculated empirical p-values by permuting the experimentally determined human genetic interaction network 500 times, and determined the probability of obtaining a greater or equal amount of evidence for each of the query genes.
Permutations were carried out by randomly assigning one of the ~16,000 genes represented in the shRNA library to each node in the network, while retaining the evidence networks, which simulates random shRNA experiments equivalent to the one we performed. Due to the bipartite nature of the network, this is equivalent to an edge re-wiring strategy. Overall, the overlap between the human and model organism networks is not significant using this approach. The results of this analysis and the individual pieces of evidence uncovered are summarized in Supplementary Table 7.
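A minimal Python sketch of such a permutation test is shown below. It assumes that the supporting evidence has already been mapped to human gene identifiers and is stored as a set of (query, gene) pairs, and that each permutation replaces the hits of a query by an equal number of genes drawn at random from the library. The function names and data layout are hypothetical.

```python
import random

def support_count(hits_by_query, evidence):
    """Number of (query, hit) pairs that appear in the evidence set."""
    return sum((q, h) in evidence
               for q, hits in hits_by_query.items() for h in hits)

def empirical_pvalue(hits_by_query, evidence, library_genes,
                     n_perm=500, seed=0):
    """Return (observed support, fraction of permutations with >= support)."""
    rng = random.Random(seed)
    observed = support_count(hits_by_query, evidence)
    at_least = 0
    for _ in range(n_perm):
        # Random screen: same number of hits per query, random library genes.
        permuted = {q: rng.sample(library_genes, len(hits))
                    for q, hits in hits_by_query.items()}
        if support_count(permuted, evidence) >= observed:
            at_least += 1
    return observed, at_least / n_perm
```

Passing a dictionary with a single query gene gives the per-query p-value; passing all queries at once tests the network as a whole.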

Construction of a high confidence network.
The high-confidence differential essentiality or DiE network (Figure 2) comprises all interactions which tested positive in the secondary screen at the 80% confidence level and all hits in the top 5% (p-value <0.05) that have any evidence in the form of a human genetic or physical interaction, or in the respective orthologous networks in mouse, fly, worm, or yeast.
This high confidence network comprises 264 genes connected by 291 interactions and is represented in Figure 2.

Construction and analysis of genetic sub-networks.
In order to investigate the relationships between the genes that were found to interact genetically with the same query gene, we created sub-networks by gathering interactions between them in human as well as in model organisms. For the latter, we used MP-eggNOG. Negative genetic interactions as observed in other model systems tend to be more related to each other compared to a random selection of genes, indicating that the experimentally-determined interactions tend to share complexes or functional modules. We tested this by assessing the significance of the observed structure within each sub-network as measured by their average clustering coefficient (Barabasi & Oltvai, 2004). The sub-networks were compared against a random model generated by permutations of the evidence networks (i.e. the physical and genetic interaction networks from fly, worm, mouse, yeast, human). In order to avoid reporting structural features of these input networks alone, which will be reflected in the projected sub-networks, we applied a conservative random model: for each input network, we generated 500 random networks with the same structure as the original network (i.e. retaining the same degree and clustering coefficient distribution) by node shuffling. We then selected all interactions that are shared by nodes in the top 5% of each screen for the original (constituting the sub-network) and the randomized networks (constituting the random background networks). We integrated each type of interaction (genetic and physical interactions from each species) into a unified sub-network for each query gene (BLM, MUS81, PTEN, PTTG1, KRAS). We then computed the average clustering coefficient of the original sub-network and of the sub-networks of the random trials. We determined empirical p-values reflecting the increase of cross-talking genes picked up by each screen against the background as the empirical probability of observing the same or higher average clustering coefficients.
We found a statistically significant increase in the average clustering coefficient for several of the sub-networks (Supplementary Table 8).
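The clustering coefficient comparison can be sketched in Python as follows. The sketch treats a single evidence network as an undirected edge list, induces the sub-network on the hit genes, and implements node shuffling as a random relabeling of the network nodes, which preserves the degree and clustering coefficient distributions of the input network. The integration of several evidence networks is omitted, and all names are hypothetical.

```python
import random
from itertools import combinations

def average_clustering(nodes, edges):
    """Average local clustering coefficient of the sub-network on `nodes`."""
    if not nodes:
        return 0.0
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        # Keep only edges with both endpoints among the selected nodes.
        if a in neighbors and b in neighbors and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    total = 0.0
    for n in nodes:
        k = len(neighbors[n])
        if k < 2:
            continue  # coefficient is 0 for nodes with fewer than 2 neighbors
        links = sum(1 for u, v in combinations(neighbors[n], 2)
                    if v in neighbors[u])
        total += 2 * links / (k * (k - 1))
    return total / len(nodes)

def clustering_pvalue(hit_genes, network_nodes, network_edges,
                      n_perm=500, seed=0):
    """Return (observed coefficient, empirical p-value from node shuffling)."""
    rng = random.Random(seed)
    nodes = list(network_nodes)
    node_set = set(nodes)
    hits = [g for g in hit_genes if g in node_set]
    observed = average_clustering(hits, network_edges)
    at_least = 0
    for _ in range(n_perm):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        perm_edges = [(relabel[a], relabel[b]) for a, b in network_edges]
        if average_clustering(hits, perm_edges) >= observed:
            at_least += 1
    return observed, at_least / n_perm
```

Because relabeling leaves the topology of the evidence network untouched, a low p-value reflects where the hit genes sit in the network rather than a structural property of the network itself.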

We assessed the annotation status of our top dGARP hits using GO annotations obtained from the UniProt-GOA consortium (Dimmer et al, 2012), downloaded on 10 January 2013. Each gene that had at least one annotation that had not been electronically transferred (ignoring all entries with the GO IEA evidence code) was counted as 'annotated'. In total, we found that 23% of the genes picked up in our screen lack any experimentally derived evidence.
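The annotation filter can be expressed in a few lines of Python. The sketch assumes the GO annotations are available as (gene, GO term, evidence code) tuples; the function name and this layout are hypothetical.

```python
def unannotated_fraction(hit_genes, annotations):
    """Fraction of hit genes without any non-IEA GO annotation.

    annotations: iterable of (gene, go_term, evidence_code) tuples.
    """
    # A gene counts as annotated if at least one entry is not IEA.
    annotated = {gene for gene, _term, code in annotations if code != "IEA"}
    hits = set(hit_genes)
    return len(hits - annotated) / len(hits)
```

A gene whose only annotations carry the IEA evidence code is counted as unannotated, as is a gene with no annotation at all.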

Assessment of overlap of top delta-GARP scores in PTEN query with cancer essentials.
In order to assess the general applicability of the digenic relationships derived from the isogenic colon cancer cells for other non-isogenic cancer types, we systematically compared data generated from cancer-specific essentiality screens from breast, ovarian, and pancreatic cancers to the data from our isogenic screens (Marcotte et al, 2012). As a proof of principle, Each mutation was classified as relevant if it was found in the literature or in UniProt (UniProt, 2012) described as either a PTEN inactivating or a PIK3CA activating mutation. Coding mutations in close proximity to these described mutations have been categorized as bona fide equivalent mutations. We also searched the literature and annotated additional cell lines not known to carry these mutations (e.g. SW1990 and ASPC1), but that display constitutively active AKT due to some other mechanism (Cheng et al, 1996; Halilovic et al, 2010 hereafter referred to as "PTEN*" and "WT", respectively. We determined the overlap of the genes with the "m" most significant dGARP scores from the PTEN screen (the DiE profile) with the "n" top cancer essentials from non-isogenic lines by computing the Jaccard index (Jaccard, 1901). The Jaccard index is defined as the size of the intersection between "m" and "n" divided by the size of their union. We then sorted each set by the resulting value from high to low and plotted them as shown in Figure 7B for m=n=750, which represents genes with dGARP/GARP score with a p-value <0.05. We tested several cut-offs for n and m, resulting in consistently similar results, as represented by the example in Figure 7B: the top PTEN* cell lines exhibit higher Jaccard indices on average. We tested the significance of the difference between the two sets as the shift in the mean of their Jaccard distributions (with n=750, m=750) using a one-sided t-test (p-value=0.0002). A box-plot summarizing this data was created in R using the ggplot2 package. From this we determined the signature of PTEN dependency using genes that are more frequent within the top 5% of the PTEN* lines compared to WT lines (Fig. 7C).
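The Jaccard comparison can be sketched in Python as follows. The function names and the dictionary layout (cell line name mapped to its list of top essential genes) are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_cell_lines(die_profile, essentials_by_line):
    """Cell lines sorted from high to low overlap with the DiE profile."""
    scores = {line: jaccard(die_profile, genes)
              for line, genes in essentials_by_line.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With m = n, as in the m=n=750 example, every cell line is compared against the DiE profile using gene sets of equal size, so differences in the index reflect differences in overlap only.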

Determining PTEN signature from non-isogenic lines.
An alternative approach to determine a PTEN-dependency signature would be to assess the number of genes that behave differentially between PTEN* lines and WT non-isogenic cell lines using data from Marcotte et al (Marcotte et al, 2012). We compared the Z-normalized GARP distributions between the PTEN* lines and WT lines and collected genes that exhibit a significant shift to more severe scores in the PTEN* lines (p-value ≤ 0.05, using a one-sided Student's t-test). This procedure resulted in 1147 genes. To assess the significance of this number of genes, we repeated this procedure 500 times on randomized data (while shuffling the cell-line labels). On average, the random data yielded ~750 genes with significant behavior, resulting in an empirical p-value of 0.05. An overview of this procedure can be found in Supplementary Figure 13A. While this procedure may capture genes independent of the GARP score (since their shift but not the actual essentiality is evaluated), the resulting genes have a small but significant tendency to be more essential in the PTEN* lines than in the WT lines. This is pictured in the box plot in Supplementary Figure 13B. We then assessed the overlap of the genes derived from this procedure with our top 5% hits of the DiE profile of PTEN using a Fisher exact test and found that the two sets significantly overlap (102 instances, p-value 5e-10). We further investigated this enrichment using different fractions of our PTEN DiE profile. The most significant enrichment was discovered using a cut-off of ~750 top genes, consistent with the applied cut-off of top 5% used throughout this study (Supplementary Figure 13C).
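A one-sided Fisher exact test for the overlap of two gene sets is equivalent to the upper tail of a hypergeometric distribution, which can be computed directly in Python as sketched below. The function name is hypothetical, and the size of the gene universe (for example, the number of genes covered by the shRNA library) is an input that has to be chosen by the analyst.

```python
from math import comb

def fisher_one_sided(overlap, size_a, size_b, universe):
    """P(X >= overlap) when size_b genes are drawn from `universe` genes,
    of which size_a belong to the first set (hypergeometric upper tail)."""
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    tail = sum(comb(size_a, k) * comb(universe - size_a, size_b - k)
               for k in range(overlap, upper + 1))
    return tail / total
```

For instance, `fisher_one_sided(102, 1147, 750, 16000)` would test an overlap of 102 genes between sets of 1147 and 750 genes, under the assumption of a 16,000-gene universe; the resulting p-value depends on that assumed universe size.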

Precision-Recall Plots.
Precision-recall plots were computed to show that high scoring digenic pairs are more likely to be functionally related to each other than less significant ones. We defined a true positive prediction as one in which the interacting gene shares one or more functional categories with the query gene. These functional categories have been manually assigned based on the publicly available GO annotation and expert knowledge. We restricted this analysis to BLM and MUS81 as query genes as they have very clear and distinct roles in DNA damage and nucleic acid metabolism. Recall was computed as TP/(FN+TP) and Precision as TP/(TP+FP), where TP is the number of true positives (interacting and sharing an annotation term), FP the number of false positives (interacting but not sharing annotation), and FN the number of false negatives (not interacting but sharing annotation). Interacting genes have been defined as genes with a GARP score smaller (more aggravating) than a certain cut-off. By varying the cut-off from low (strongly aggravating) to high (more neutral) GARP scores, the plots were then generated. The ROCR package was used to generate the plots (Sing et al, 2005). We computed p-values for the observation of the area under the precision-recall curve (pAUC), which is higher when more signal is recovered in the left part of the curve. We empirically computed the p-value by randomizing the annotation 500 times while retaining the frequency of each annotation term as well as the frequency of term co-occurrence. None of the random runs yielded an AUC as high as that found for the original data, resulting in p-values <1/500 in both cases.
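The precision and recall definitions can be illustrated with a short Python sketch that sweeps the GARP cut-off from the most aggravating gene upward. This is an illustration of the definitions only and not a substitute for the ROCR-based analysis; ties between GARP scores are not treated specially, and the function name and data layout are hypothetical.

```python
def precision_recall_curve(garp_scores, shares_annotation):
    """Return (recall, precision) points for increasingly lenient cut-offs.

    garp_scores: dict mapping gene -> GARP score (lower = more aggravating).
    shares_annotation: set of genes sharing a functional category with the query.
    """
    ranked = sorted(garp_scores, key=garp_scores.get)
    total_pos = sum(1 for g in ranked if g in shares_annotation)
    curve, tp = [], 0
    for called, gene in enumerate(ranked, start=1):
        # `called` genes are below the cut-off, so called = TP + FP.
        if gene in shares_annotation:
            tp += 1
        precision = tp / called
        recall = tp / total_pos if total_pos else 0.0  # TP / (TP + FN)
        curve.append((recall, precision))
    return curve
```

Each point corresponds to one cut-off, so the first points of the curve describe the most strongly aggravating genes.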

Meta-analysis calculating the overlap of our study with other genomic datasets.
In order to assess the significance of the DiE genes with respect to other screens, we computed the frequency with which genes in the DiE profile overlapped with genes from the other indicated datasets. In cases of enrichment in the DiE genes, we computed a p-value using a one-sided Fisher exact test (Supplementary Figure 12).

lists gene-level dGARP scores calculated as described in the methods section. The Gene ID and the score in independent screens are listed. To get the top negative genetic interactions from each screen, sort the column in ascending order.

most probable ortholog for each human gene. The data is mapped to the Uniprot identifier as used in the iRefWeb resource. For each gene and species (human, yeast, fly, mouse, worm), the Uniprot gene name and gene names are listed. N/A indicates the absence of a common gene name for a gene (only Uniprot name available) and '-' denotes the absence of an ortholog.

Supplementary Table 7. List of genes represented in the high confidence differential essentiality network. Sheet 1 of this table lists all the genes that are represented in Figure 2. Interaction pairs that were conserved in other species are also included with their published references. Sheet 2 of this table shows the statistical significance of the conserved interactions.

Supplementary Table 8. Supporting evidence for sub-networks using a cross-species analytical approach. This table lists the supporting evidence that connects the genes within each screen (p<0.05). This supporting evidence comprises known genetic interactions and physical interactions that resemble the genetic interaction in human, or between any pair of orthologs in yeast, fly, worm, or mouse. The first sheet lists all of them together and the subsequent sheets list them separately for each screen for convenience. The last sheet provides information on the non-random structure of the resulting sub-networks as measured by the average clustering coefficient. P-values for finding higher clustering coefficients have been computed from 500 rounds of randomized data.

Supplementary Table 9. Genes that constitute the PTEN signature to determine PTEN genetic dependency. List of genes represented in Figure 7D. The alternative signatures derived from non-isogenic cell lines are listed in Sheet 2. Sheet 3 lists the intersection of signature genes from the PTEN signature derived from HCT116-PTEN−/− versus HCT116-PTEN+/+ screening results and the PTEN/PI3K signature derived from non-isogenic cell lines.

"PIK3CA" denotes activating mutation in PI3K pathway, "PIK3CA dependency" denotes cell lines that have activated PI3K pathway but no detected mutations in the PIK3CA gene. Mutation information has been derived from the CCLE database. Reference/Mutation lists the actual mutations or provides references to relevant publications describing the cell line.

Left: Percentage of HCT116-PTTG1+/+ and HCT116-PTTG1−/− cells with a congression defect when transduced with constitutively expressed shRNAs against LacZ or two independent shRNAs targeting ESPL1, measured in cells with spindle poles more than 6 μm apart as determined by NEDD1 staining. Middle: Percentage of HCT116-PTTG1+/+ and HCT116-PTTG1−/− cells with lagging chromosomes when transduced with shRNAs against LacZ or ESPL1, measured as cells were progressing through anaphase. Right: Percentage of HCT116-PTTG1+/+ and HCT116-PTTG1−/− cells expressing a negative control shRNA targeting LacZ (shControl) or the same two independent shRNAs targeting ESPL1 as described above, in different phases of mitosis including prometaphase, metaphase and anaphase (n=3). ** represents p<0.01; *** represents p<0.001, calculated using chi-square test.

[Figure panel A, axis labels (cell lines): HPDE, CFPAC-1, Capan-2, Capan-1, BxPC-3, AsPC-1, IMIM-PC-2, IMIM-PC-1, Hs766T, HPAFII, HPAC, Panc08.13, Panc05.04, Panc04.03, Panc03.27, Panc02.03, MiaPaca2, KP-2, KP-4, KP-3, Panc10.05, SK-PC-1, RWP-1, PL-45, PATU8988T, PATU8988S, Panc-28, Panc-1, SW1990, SU8686, SK-PC-3]