Genetic interactions of G-quadruplexes in humans

G-quadruplexes (G4) are alternative nucleic acid structures involved in transcription, translation and replication. Aberrant G4 formation and stabilisation is linked to genome instability and cancer. G4 ligand treatment disrupts key biological processes leading to cell death. To discover genes and pathways involved with G4s and gain mechanistic insights into G4 biology, we present the first unbiased genome-wide study to systematically identify human genes that promote cell death when silenced by shRNA in the presence of G4-stabilising small molecules. Many novel genetic vulnerabilities were revealed opening up new therapeutic possibilities in cancer, which we exemplified by an orthogonal pharmacological inhibition approach that phenocopies gene silencing. We find that targeting the WEE1 cell cycle kinase or USP1 deubiquitinase in combination with G4 ligand treatment enhances cell killing. We also identify new genes and pathways regulating or interacting with G4s and demonstrate that the DDX42 DEAD-box helicase is a newly discovered G4-binding protein.


Introduction
G-quadruplex secondary structures (G4s) form in nucleic acids through the self-association of guanines (G) in G-rich sequences to form stacked tetrad structures (reviewed in Bochman et al., 2012; Rhodes and Lipps, 2015). In the human genome, over 700,000 G4s have been detected in vitro (Chambers et al., 2015). Sequences encoding G4s are enriched in regulatory regions, consistent with roles in transcription and RNA regulation (Huppert, 2008), and their over-representation in oncogene promoters, such as MYC, KRAS and KIT, suggests that they are important in cancer and are potential therapeutic targets (reviewed in Balasubramanian et al., 2011). Computationally predicted G4s have also been linked to replication origins (Besnard et al., 2012) and telomere homeostasis (reviewed in Neidle, 2010). In the transcriptome, more than 3000 mRNAs have been shown to contain G4 structures in vitro, particularly at 5' and 3' UTRs, suggestive of roles in posttranscriptional regulation (Bugaut and Balasubramanian, 2012; Kwok et al., 2016).
G4-specific antibodies have been used to visualise G4s in protozoa (Schaffitzel et al., 2001) and mammalian cells (Biffi et al., 2013; Henderson et al., 2014; Liu et al., 2016). More G4s are detected in transformed versus primary cells, and in human stomach and liver cancers compared to non-neoplastic tissues, supporting an association between G4 structures and cancer (Biffi et al., 2014; Hänsel-Hertsch et al., 2016). More recently, ChIP-seq was used to map endogenous G4 structure formation in chromatin, revealing a link between G4s, promoters and transcription (Hänsel-Hertsch et al., 2016). G4s are found predominantly in nucleosome-depleted chromatin within promoters and 5' UTRs of highly transcribed genes, including cancer-related genes and regions of somatic copy number alteration. G4s may therefore be part of a regulatory mechanism to switch between different transcriptional states. At telomeres, tandem G4-repeat structures may also help protect chromosome ends by providing binding sites for shelterin complex components (reviewed in Brázda et al., 2014). As G4 structures can pause or stall polymerases, they must be resolved by helicases to allow replication and transcription to proceed. Several helicases, including WRN, BLM, PIF1, DHX36 and RTEL1, have been shown to unwind G4 structures in vitro (Brosh, 2013; Mendoza et al., 2016), and it is notable that fibroblasts from Werner (WRN) and Bloom (BLM) syndrome patients, who are predisposed to cancer, show altered gene expression that correlates with sites with potential to form G4s (Damerla et al., 2012).
Small molecules that selectively bind and stabilise G4 formation in vitro have been used to probe G4 biological function. G4 ligands, such as pyridostatin (PDS), PhenDC3 and TMPyP4, can reduce transcription of many genes harbouring a promoter G4, including oncogenes such as MYC, in multiple cancer cell lines (Halder et al., 2012; McLuckie et al., 2013; Neidle, 2017). G4-stabilising ligands also interfere with telomere homeostasis by inducing telomere uncapping/DNA damage through the inhibition of telomere extension by telomerase, leading to senescence or apoptosis (reviewed in Neidle, 2010). 5' UTR RNA G4 structures may also be involved in eIF4A-dependent oncogene translation (Wolfe et al., 2014) and their stabilisation by G4 ligands can inhibit translation in vitro (Bugaut and Balasubramanian, 2012). The identification of several RNA G4-interacting proteins (reviewed in Cammas and Millevoi, 2016), including DEAD/DEAH helicases such as DDX3X and DHX36 (Chen et al., 2018; Herdy et al., 2018), additionally suggests specific roles for G4 structures in RNA.
Some G4-stabilising ligands cause a DNA damage response (DDR); for example, DNA damage sites induced by PDS in human lung fibroblasts mapped to genomic regions at G4s within several oncogenes including SRC (Rodriguez et al., 2012). Subsequent studies demonstrated that homologous recombination (HR) repair deficiencies can be exploited to selectively kill BRCA1/2-deficient cancer cells with G4 ligands (McLuckie et al., 2013; Zimmer et al., 2016). Recently, this concept has been applied to BRCA1/2-deficient breast cancers using CX-5461, a G4 ligand currently in clinical trials (Xu et al., 2017; ClinicalTrials.gov NCT02719977). Overall, these initial studies demonstrate that specific genotypes can be selectively vulnerable to G4 stabilisation and raise the question of what other genotypes might provide further such opportunities.
We set out to address two main questions (Figure 1): 1) which human genes and cellular pathways interact with G4s and 2) what genetic backgrounds selectively lead to enhanced cell killing in the presence of G4-stabilising ligands? We employed PDS and PhenDC3 as representative G4 ligands as these are chemically and structurally dissimilar, but each shows a broad specificity for different G4 structural variants. Both ligands have been widely used as G4-targeting probes in biophysical (De Cian et al., 2007b; Rodriguez et al., 2008) and biological studies in which they have been shown to impart transcriptional inhibition, telomere dysfunction and replication stalling (De Cian et al., 2007a; Halder et al., 2012; Mendoza et al., 2016).

Identification of genetic vulnerabilities to G4-ligands via genome-wide screening
An unbiased genome-wide shRNA screen was performed in A375 human melanoma cells to globally evaluate genetic vulnerabilities to G4-ligands and to identify genes and pathways involved with G4 structures (Figure 2A). For this, the pyridine-2,6-bis-quinolino-dicarboxamide derivative, PDS (Rodriguez et al., 2012), and the bisquinolinium compound, PhenDC3 (De Cian et al., 2007b), were chosen (Figure 2B). We used the latest generation shERWOOD-Ultramir shRNA pLMN retroviral library, comprising 132,000 shRNAs across 12 randomised pools targeting the protein coding genome, with an average of five optimised hairpins per gene (Figure 2C) (Knott et al., 2014). A375 melanoma cells were used due to their rapid doubling, stable ploidy and success in other shRNA dropout screens (Sims et al., 2011); they are TP53 wild-type and driven by oncogenic BRAF (V600E) and CDKN2A loss (Forbes et al., 2015). Figure 2D outlines our shRNA screening strategy. To identify shRNAs that are lost between the initial (t0) and final (tF) timepoints, unique 3'-antisense sequences were recovered by PCR and quantified by sequencing. If a gene knockdown compromises cell viability, then the associated shRNA will be depleted compared to those targeting non-essential genes: the tF sequence count will be lower than the t0 count, so the log2 fold change (FC, tF/t0) is negative. A pilot using one shRNA pool established that a tF of 15 population doublings can be used to reveal significant G4-ligand-mediated changes [false discovery rate (FDR) 0.05] in shRNA levels using a ligand concentration resulting in 20% cell death (GI20; see Materials and methods).
To understand the complete spectrum of G4 vulnerabilities, we first considered the combined set of sensitivities to PDS and PhenDC3 together. For the whole library, when individual shRNAs are considered, 9509 (~7%) G4-ligand-specific hairpins (i.e. those not depleted in DMSO) were found to be depleted (FDR 0.05; log2FC <0, Figure 3A, Supplementary file 1).
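The depletion measure described above can be illustrated with a minimal Python sketch. It assumes raw sequencing counts per hairpin at t0 and tF. The library-size scaling and the pseudocount are illustrative assumptions that are not taken from the source, which defines only FC as tF/t0, and the function name `log2_fold_change` is hypothetical.

```python
import math


def log2_fold_change(count_t0, count_tf, total_t0, total_tf, pseudocount=0.5):
    """Return log2(tF/t0) for one shRNA after scaling counts by library size.

    The pseudocount (an assumption, not from the source) avoids division by
    zero when a hairpin has no reads at one of the timepoints.
    """
    scaled_t0 = (count_t0 + pseudocount) / total_t0
    scaled_tf = (count_tf + pseudocount) / total_tf
    return math.log2(scaled_tf / scaled_t0)


# A hairpin that drops from 400 to 100 reads in equally deep libraries.
depleted = log2_fold_change(400, 100, 1_000_000, 1_000_000, pseudocount=0)
# A hairpin whose abundance is unchanged between t0 and tF.
neutral = log2_fold_change(250, 250, 1_000_000, 1_000_000)
```

A negative value marks a hairpin that was lost from the population between t0 and tF; a value near zero marks one with no effect on growth.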
Figure 1. Strategy identifying genetic vulnerabilities involved with G4 biology. Genome-wide shRNA silencing combined with G4 structure stabilisation by small molecules identifies genes that when depleted compromise cell viability. Cells are infected with a genome-wide pool of shRNA lentiviruses targeting the protein coding genome followed by G4 ligand treatment to stabilise genomic and/or RNA G4 structures. Two general outcomes are possible: a gene is not required in a G4-dependent process so there is no effect on cell viability (left); or gene silencing results in cell death either due to loss of a direct G4 interaction (e.g. binding/unwinding) or indirectly through gene loss in a G4-dependent pathway (right). In absence of ligand, cells are viable in presence of the shRNA. Dotted boxes highlight genotypes of disease significance for possible G4-based therapies (blue) and genes and biological pathways that involve and/or interact with G4 structures (orange).
Figure 3. Genome-wide screening in A375 cells reveals deficiencies in known G4-associated genes as sensitive to G4-stabilising small molecules. (A-C) Venn diagrams for: (A) significantly differentially expressed individual shRNAs (FDR 0.05); (B) significantly depleted genes (50% or three hairpins, FDR 0.05, median log2FC < 0) following DMSO, PDS and PhenDC3 treatment; and (C) significant PDS and PhenDC3 sensitiser genes not in DMSO and after applying a median log2FC −1 cut-off. (D-F) Tables showing the number of depleted hairpins and median log2FC values for: (D) known G4 (Figure 3 continued below)
We then reasoned that, for a gene knockdown to have compromised cell growth, a minimum of either 50% or three shRNA hairpins should be significantly depleted for that gene (median log2FC <0). This resulted in the identification of 843 G4 ligand-specific gene knockdowns not present in DMSO (Figure 3B). We then denoted a more stringent preliminary list of 758 G4 sensitisers as those passing a median log2FC cut-off of −1 (Figure 3C). It is reassuring that in this list we independently validated the known G4 sensitisers BRCA1/2, ATRX and HERC2 (McLuckie et al., 2013; Wang et al., 2019; Watson et al., 2013; Wu et al., 2018; Xu et al., 2017; Zimmer et al., 2016; Figure 3D). We next explored further genes already implicated in G4 biology, but whose deficiency has not yet been linked with any enhanced sensitivity to G4 ligands. For genes annotated with G4-related terms in the UniprotKB, Gene Ontology (GO) and G4IPDB databases (Mishra et al., 2016), an additional eight sensitisers (ADAR, DHX36, DNA2, FUS, MCRS1, RECQL4, SF3B3 and XRN1) were uncovered (Figure 3E). Text-mining with G4 search terms using PolySearch2 on PubMed abstracts and open access full texts (see Materials and methods; Liu et al., 2015b) revealed a further 12 sensitisers arising from our screen including helicases (RTEL1), DDR components (CHEK1, RAD17), transcriptional proteins (POLR1A, CNBP) and replication factors (ORC1, RPA3, TOP1) (Figure 3F).
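The gene-level calling rule described in this section (at least 50% or at least three significantly depleted hairpins, followed by a more stringent median log2FC cut-off of −1) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the function name `call_sensitiser`, the use of `<=` for both the FDR and median comparisons, and taking the median over the significantly depleted hairpins only are all assumptions, and the input values are invented.

```python
from statistics import median


def call_sensitiser(hairpins, median_cutoff=0.0, fdr_cutoff=0.05,
                    min_fraction=0.5, min_count=3):
    """Decide whether one gene passes the hairpin-based depletion rule.

    hairpins: list of (log2fc, fdr) tuples, one per shRNA targeting the gene.
    A hairpin counts as depleted when fdr <= fdr_cutoff and log2fc < 0.
    """
    depleted = [fc for fc, fdr in hairpins if fdr <= fdr_cutoff and fc < 0]
    if not depleted:
        return False
    enough = (len(depleted) >= min_count
              or len(depleted) / len(hairpins) >= min_fraction)
    # Assumption: the median is taken over the depleted hairpins only.
    return enough and median(depleted) <= median_cutoff


# Invented example data: (log2FC, FDR) per hairpin.
gene_a = [(-1.5, 0.01), (-1.2, 0.03), (-0.2, 0.50), (0.1, 0.90), (-2.0, 0.001)]
gene_b = [(-0.4, 0.01), (-0.6, 0.02), (0.3, 0.80)]
gene_c = [(-2.0, 0.01), (0.1, 0.90), (0.2, 0.90), (0.0, 0.90), (0.3, 0.80)]

first_tier = [call_sensitiser(g) for g in (gene_a, gene_b, gene_c)]
stringent = [call_sensitiser(g, median_cutoff=-1.0)
             for g in (gene_a, gene_b, gene_c)]
```

In this toy data, `gene_b` passes the first tier through the 50% criterion but fails the stringent cut-off, while `gene_c` fails both because only one of its five hairpins is depleted.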
Within the total 758 G4-sensitiser gene list, we uncovered five significantly enriched KEGG pathway clusters (p<0.05): 'cell cycle', 'ribosome', 'spliceosome', 'ubiquitin-mediated proteolysis' and 'DNA replication' (Figure 4A, Supplementary file 1). Within each cluster are gene targets common to both G4 ligands, as well as genes unique to each ligand. To gain functional insights, enriched GO 'Biological Process' and 'Molecular Function' terms were determined (Figure 4B; Supplementary file 1). Of these, 20 of the 45 'Biological Process' terms and all of the 'Molecular Function' terms fell into DNA or RNA classifications, consistent with PDS/PhenDC3 directly binding nucleic acid G4 targets. Furthermore, when protein domains were considered using the GENE3D and PFAM databases (Figure 4C), we discovered enrichments in helicase C-terminal domains, RNA recognition motifs including RRM, RBD and RNP domains, and DNA-binding domains including zinc fingers, bZIP motifs and HMG boxes. Consistent with the ubiquitin-mediated proteolysis KEGG cluster, enrichments in multifunctional ATPase domains and in ubiquitin hydrolase domains were also found. These latter findings suggest important areas of biology not previously known to be affected by G4 intervention in mammalian cells.

Cancer-associated gene depletion enhances sensitivity to G4-ligands
We next used the complete list of 758 genes, identified as stringent G4 ligand sensitisers above, to discover new cancer-associated gene vulnerabilities to G4-stabilising ligands. For this, we searched this list for any significant enrichment in the COSMIC database (v83) of genes causally implicated in cancer (Forbes et al., 2015). Of the 758 sensitisers, there was a two-fold enrichment (p = 9.1 × 10⁻⁶) for 50 cancer-associated genes, which increases to three-fold (p = 2.5 × 10⁻³) when considering only sensitisers common to both G4 ligands (Figure 5A,B, Supplementary file 1). Notably, when STRING network analysis (Szklarczyk et al., 2017) was used to investigate functional interactions, this revealed a DDR cluster that included BRCA1 and BRCA2, as well as their interacting tumour suppressor partners PALB2 and BAP1, two cancer-associated DDR genes not previously indicated as G4 ligand sensitisers (Figure 5C). This analysis also identified as sensitisers a cluster consisting of several chromatin modifiers including SMARCA4, SMARCB1 and SMARCE1.
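To make the notion of fold enrichment concrete, the following Python sketch computes a fold enrichment and a one-sided over-representation p-value for the overlap between a hit list and an annotated gene set. The source does not state which statistical test was used for the COSMIC comparison, so the hypergeometric test here is an assumption, as are the function names. The numbers are toy values chosen for readability and are not the study's gene counts.

```python
from math import comb


def fold_enrichment(hits, list_size, annotated, background):
    """Observed fraction of annotated genes in the list over the expected fraction."""
    return (hits / list_size) / (annotated / background)


def hypergeom_upper_tail(hits, list_size, annotated, background):
    """P(X >= hits) when drawing list_size genes without replacement from a
    background containing `annotated` genes of interest."""
    total = comb(background, list_size)
    upper = min(list_size, annotated)
    tail = sum(
        comb(annotated, k) * comb(background - annotated, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy values: 1000 background genes, 50 annotated, a 100-gene hit list, 10 overlaps.
fold = fold_enrichment(hits=10, list_size=100, annotated=50, background=1000)
p_value = hypergeom_upper_tail(hits=10, list_size=100, annotated=50, background=1000)
```

With these toy values the expected overlap is five genes, so ten overlaps correspond to a two-fold enrichment; the p-value gives the probability of seeing at least that many overlaps by chance.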
Figure 3 (continued). (D, continued) ligand sensitisers, ATRX, HERC2, BRCA1 and BRCA2, that are independently validated in our screen; (E) sensitisers annotated with a G4-associated term in GO, UniprotKB or G4IPDB databases and (F) sensitisers identified as G4-related by text-mining showing the associated PolySearch2 algorithm score and summary of the G4 association. Sensitisers are defined as a gene where 50% or three hairpins were significantly differentially expressed (FDR 0.05) with median log2FC −1. See also Supplementary file 1. DOI: https://doi.org/10.7554/eLife.46793.005

Focused G4-sensitiser shRNA screening reveals robust G4-ligand genetic vulnerabilities and potential therapeutic targets
To enable more rigorous and further comparative analyses that focus solely on G4 sensitisers, we developed a custom shRNA screening panel encompassing the gene sensitisers identified above plus additional G4-associated genes noted from the literature (Figure 6A, Figure 6-figure supplement 1, see Materials and methods). This panel consisted of a single retroviral shRNA pool to allow all shRNAs to be screened simultaneously under standardised conditions and to minimise technical fluctuations. We first used this panel to recapitulate the findings of the genome-wide screen above and compare responses with different G4 ligands. Using A375 melanoma cells with PDS and PhenDC3, the custom panel recovered a total of 342 G4 sensitisers corresponding to 40.6% overlap (308 genes) with the complete genome-wide screen (Figure 6B,C). From this, we identified 290 G4 sensitisers with 89 and 161 unique for PDS and PhenDC3, respectively, and 40 genes common for both ligands (Figure 6-figure supplement 1E). Comparing PDS and PhenDC3 sensitisers by KEGG analysis shows that each ligand mostly interacts with different but related pathways (Figure 6D,E). Consistent with direct G4-targeting, nucleic-acid-related GO terms were enriched (Figure 6-figure supplement 1F & G, Supplementary file 2).
We next considered that the 40 sensitiser genes common between PDS and PhenDC3 reflected the most robust sensitisers for G4 ligands in general, and it is notable that 27 out of 40 are associated with DNA or RNA binding processes, such as chromatin modification, replication, transcription and translation (Figure 6F). Again, the ubiquitin processes, which previously were not linked with G4 biology, were also uncovered as a significant sensitiser pathway. Overall, these results clearly show the spectrum of biological vulnerabilities that underpin the observed enhanced sensitivities for each G4-targeting ligand. We next reasoned that the robust set of 290 G4 ligand sensitiser genes above provides a suitable test bed for exploring the therapeutic potential of combining pharmacological inhibition with G4 ligands. We therefore looked for the presence of these sensitiser genes within the Drug Gene Interaction database (DGIdb) (Griffith et al., 2013). A total of 74 G4 sensitisers were found in the classifications 'Druggable Genome' (genes with known or predicted drug interactions) and 'Clinically Actionable' (genes used in targeted clinical cancer sequencing for precision medicine), with 13 being common to both classifications (Figure 6G, Supplementary file 1). Notably, this included KEAP1, an E3 ubiquitin ligase adapter protein, which highlights a new therapeutic domain for the application of G4-based drugs. Performing a similar analysis on the 40 most robust sensitisers common to both G4 ligands gave 12 genes within DGIdb (Figure 6H, Supplementary file 1), including 5 (BRCA1, CHEK1, CDK12, TOP1, PDKP1) common to both druggable and clinically actionable classifications. These results therefore open up new possibilities for cancer therapies based on vulnerabilities to G4 ligands.

G4 sensitisers common to two independent cell lines
We next sought to extend the use of the custom shRNA lentiviral library to gain initial insights into possible commonalities and differences in the response to G4 ligands in cells from different lineages. We therefore applied the custom library to mesenchymal-derived HT1080 fibrosarcoma cells (wild-type TP53, driven by activated NRAS (Q61K) and IDH1 mutation (R132C)) and compared the results to those from ectodermal A375 melanoma cells above. The custom HT1080 screen recovered a total of 121 G4 ligand sensitisers, with the majority (73 genes, 58%) shared with those seen for each ligand in the A375 genome-wide screen. Cytoscape network analysis (Figure 7A) revealed a core set of G4-associated genes/pathways for these genes in spliceosome, HR and ubiquitin-mediated proteolysis processes (p<0.0005). Overall, 29 PDS and 22 PhenDC3 gene sensitivities were found to be shared across all three screens (Figure 7B,C), and it is noteworthy that both G4 ligands targeted similar processes including transcription, splicing and ubiquitin-mediated proteolysis (Figure 7D,E).

BRCA1, TOP1, DDX42 and GAR1 are key G4 ligand sensitiser genes
When we evaluated the data collectively from all screens, it was apparent that four genes (BRCA1, TOP1, DDX42 and GAR1) were repeatedly found as G4 ligand sensitisers, as they consistently appeared in both cell types and with both G4 ligands in all screens (Figure 7F, Figure 7-figure supplement 1F). To corroborate these genes as genuine G4 sensitisers, we developed an independent siRNA knockdown approach using a shorter timeframe (~6 days) to recapitulate ligand-induced growth inhibition (Figure 8). Both A375 and HT1080 cells were transfected with siRNAs targeting BRCA1, TOP1, DDX42 or GAR1 alongside non-targeting siRNA and non-transfected controls. After 24 hr, cells were treated with two concentrations of PDS and PhenDC3 or vehicle control DMSO for 144 hr.
Figure 5. Identification of cancer-associated genes whose loss promotes sensitivity to G4 ligands. (A, B) Median log2FC and number of significantly depleted hairpins for G4 sensitisers overlapping the COSMIC database for PDS (A) and PhenDC3 (B). Genes common to both are indicated in blue. See also Supplementary file 1. (C) Functional interaction network analysis using STRING for the 50 COSMIC proteins indicated in A and B. Clusters are shown using confidence interactions > 0.4 from co-expression and experimental data. Box indicates the DDR cluster.
Growth curves for non-transfected and non-targeting siRNA controls were similar across ligand treatments in both cell lines. Protein depletion following siRNA transfection was confirmed after 48 hr by immunoblotting cell lysates with the appropriate antibodies (average 76-92% knockdown for HT1080; 41-69% knockdown for A375). The percentage difference in confluency compared to non-targeting siRNA control cells was [...] Mirroring the shRNA screen findings, siRNA knockdown of all four genes in HT1080 cells imparted significant increases in sensitivity to PDS or PhenDC3 compared to DMSO. Some differences between the ligands and individual gene knockdowns were noted. For BRCA1 and TOP1, the lowest concentration of PDS resulted in the most sensitisation and this was evident early, at 72 hr, whereas both PhenDC3 concentrations resulted in similar growth inhibition that became apparent later. Results with the A375 cells also lend support to our observations, although there were some differences compared to HT1080 cells (Figure 8-figure supplements 2 and 3). While GAR1 knockdown showed a similar sensitivity profile, BRCA1 and TOP1 deficiencies were sensitive to PDS but not PhenDC3. DDX42 knockdown in A375 cells did not reflect the ligand sensitivities seen in the screens, and this may in part be due to the comparatively low knockdown efficiency (~40%).
Nonetheless, these independent siRNA short-term assays substantiate that BRCA1, TOP1, DDX42 and GAR1 are genetic vulnerabilities to G4 ligands and these may open up future possibilities for therapeutic development.

G4-targeting ligands plus pharmacological inhibitors of G4 sensitiser genes demonstrate synergistic cell killing
One of our aims was to identify potential cancer genotypes where G4-ligands could be therapeutically exploited. Cancers deficient in our newly discovered G4 sensitisers may be preferentially sensitive to G4-ligands as single agents. Alternatively, rather than exploiting a genetic deficiency per se, it may be possible to use pharmacological inhibition of a critical cancer gene product that phenocopies the deficiency in combination with G4 ligands as an orthogonal approach (Figure 9A). As proof-of-principle, we systematically evaluated cell death potentiation with the G4 ligand PDS in combination with pharmacological inhibitors for two new G4 sensitiser gene products, the WEE1 kinase and the deubiquitinase USP1 (Figure 9B). WEE1 is a crucial G2/M regulator overexpressed in several cancers (Matheson et al., 2016), and USP1 is involved in DDR regulation and is overexpressed in non-small cell lung and other cancers (reviewed in García-Santisteban et al., 2013). For our studies, we used MK1775 (AZD1775), a WEE1 kinase inhibitor that is being clinically evaluated in several cancers (Richer et al., 2017), and pimozide, a potent USP1-targeting drug (Chen et al., 2011a). HT1080 and A375 cells were cultured in matrix combinations of PDS with MK1775 or pimozide at concentrations surrounding the GI50 values, and cell viability was measured after 96 hr using an end-point ATP luminescence-based assay (CellTiter-Glo, Promega). Combenefit software (Di Veroli et al., 2016) was then used to calculate synergy for the different treatment combinations, in which the percentage growth inhibition compared to single agent controls is used to plot a 3D dose-response surface of synergy distribution in concentration space (Figure 9C-F).
In HT1080 cells, synergy was found for both PDS and MK1775 or pimozide combinations (Figure 9C,D, Figure 9-figure supplement 1) with peak synergies of 21% and 24% at 156 nM PDS with 21 nM MK1775 or 6.25 µM pimozide, respectively (GI50 for PDS, MK1775 and pimozide alone = 322 nM, 59 nM and 8.4 µM, respectively). A375 cells showed lower synergy with the PDS and MK1775 combination (Figure 9E, Figure 9-figure supplement 1), with peak synergy of 15% at 8 µM PDS, 444 nM MK1775 (GI50 for PDS, MK1775 and pimozide alone = 8.5 µM, 625 nM and 12.2 µM, respectively). The greatest synergy was seen in combinations of PDS and pimozide in A375 cells (Figure 9F, Figure 9-figure supplement 1) with a peak synergy of 61% at 5.33 µM PDS, 6.25 µM pimozide. Furthermore, long-term clonogenic survival assays revealed a similar potentiation of growth inhibition, albeit at lower compound concentrations, for PDS/MK1775 and PDS/pimozide drug combinations for both cell lines tested (Figure 9-figure supplement 2). Altogether, these results validate that appropriate drug combinations can act synergistically as a surrogate for gene deficiencies in the presence of G4 ligands and thus complement the findings uncovered by our genetic screening approach.

Identification of DDX42 as a new G4-binding protein

Another of our aims was to use the findings of our shRNA screen to identify proteins that may bind and/or regulate G4 structures in cells, such as G4 helicases. Indeed, DHX36 and DHX9, known G4 helicases (Giri et al., 2011; Chen et al., 2018; Chakraborty and Grosse, 2011; Creacy et al., 2008; Vaughn et al., 2005), and the DEAD box protein DDX3X, which was recently shown to bind RNA G4s (Herdy et al., 2018), were identified as G4 sensitisers in our screen. Further members of the DDX/DHX helicase family also appeared as G4 sensitisers (Figure 10A), raising the question of whether these represent previously uncharacterised G4-binding proteins. To address this directly, we chose to investigate DDX42 as this was one of the four key G4 sensitisers identified above. DDX42 is a non-processive RNA helicase (Uhlmann-Schiffler et al., 2006) and has been associated with splicing; however, this protein remains largely uncharacterised. By immunoblotting of nuclear and cytoplasmic sub-cellular fractions (Figure 10B-E), we first established that DDX42 predominantly localises to the nucleus (~4 to 9-fold greater than cytoplasmic levels) in three independent cell lines (HT1080, HEK293 and HeLa). As controls for fractionation, LaminB1 and GAPDH were found to partition as expected into nuclear and cytoplasmic fractions, respectively (Figure 10C,D). As DDX42 is known to bind RNA, we next set out to demonstrate DDX42 affinity for an RNA G4 structure as this has not previously been documented. For this, a G4 RNA oligonucleotide from the NRAS 5' UTR sequence, which forms a stable parallel G4 (Kumari et al., 2007), was used together with a mutated oligonucleotide unable to form a G4 structure and also an RNA hairpin as negative controls (Herdy et al., 2018). Oligonucleotides were folded in 100 mM KCl to promote G4 structure formation and the resultant structures confirmed by circular dichroism (CD) spectroscopy (Figure 10-figure supplement 1).
The affinity of recombinant DDX42 was then investigated by enzyme-linked immunosorbent assay (ELISA, Figure 10F), and binding parameters were calculated in Prism software using a nonlinear regression model assuming one-site specific binding and saturation kinetics. DDX42 bound the NRAS G4 folded in KCl with an apparent Kd of 71.1 ± 3.5 nM and did not bind detectably to the mutant oligonucleotide or RNA hairpin controls.
Given the nuclear localisation of DDX42 and as some DDX proteins also have DNA helicase activity (Kikuma et al., 2004), the DDX42 affinity for a DNA G4 structure was investigated. For this, an oligonucleotide corresponding to the stable parallel G4 structure in the promoter of MYC (González and Hurley, 2010; Yang and Hurley, 2006), and a non-G4 forming control, were used. The oligonucleotides were folded in 100 mM KCl and structures verified by CD spectroscopic analysis (Figure 10-figure supplement 1B). DDX42 affinity by ELISA (Figure 10G) showed that DDX42 binds to the MYC DNA G4 with an apparent Kd of 232.9 ± 23.5 nM with little binding to the mutant control. Thus, the G4 sensitiser screen has enabled us to identify DDX42 as a previously unreported G4-interacting protein.
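The one-site specific binding model named above has the standard saturation form Y = Bmax · X / (Kd + X). The short Python sketch below evaluates that curve using the two apparent Kd values reported here; it does not perform the regression itself, the function name `one_site_specific` is hypothetical, and Bmax is normalised to 1 as an assumption so that the output reads as fractional occupancy.

```python
def one_site_specific(conc_nm, kd_nm, bmax=1.0):
    """Specific binding at a given ligand concentration for a single-site model.

    conc_nm and kd_nm share the same unit (nM here); bmax=1.0 is an assumed
    normalisation, so the result is the fraction of maximal binding.
    """
    return bmax * conc_nm / (kd_nm + conc_nm)


KD_NRAS_RNA_G4 = 71.1   # nM, apparent Kd reported for the NRAS RNA G4
KD_MYC_DNA_G4 = 232.9   # nM, apparent Kd reported for the MYC DNA G4

# Predicted fractional occupancy of each G4 at 100 nM DDX42.
rna_at_100 = one_site_specific(100.0, KD_NRAS_RNA_G4)
dna_at_100 = one_site_specific(100.0, KD_MYC_DNA_G4)
```

By construction the model gives half-maximal binding when the concentration equals Kd, so under these assumptions the RNA G4 is more than half occupied at 100 nM whereas the DNA G4 is not.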

Discussion
G4 structures are emerging as promising clinical targets in cancer (Xu et al., 2017) but the range of disease-associated genetic backgrounds that potentiate G4 ligand effects has yet to be defined. Here, we have discovered many genes that when depleted enhance cell killing with the G4 ligands [...] Wu et al., 2018; Xu et al., 2017; Zimmer et al., 2016). We now report for the first time genetic vulnerabilities in 20 other known G4-associated genes that promote sensitivity to G4-stabilising ligands. These include direct nucleic acid binders and/or unwinders, such as ADAR, DHX36, DNA2, FUS, MCRS1, RECQL4, SF3B3 and XRN1.
Figure 6 (continued). [...] genes common to the genome-wide and A375 focused screens. A right-sided enrichment test with Bonferroni correction was used (see Materials and methods). (F) DAVID, STRING (experimental data, co-expression, medium confidence ≥0.4) interaction and UniprotKB data were used to categorise biochemical roles for the 40 high-confidence G4 sensitisers common to both ligands. Genes in red indicate those found in the DGIdb (2.0). * = genes in multiple categories. (G, H) Overlap of all 290 robust G4 sensitisers (G) and the 40 G4 sensitisers common to both ligands (H) with the Drug Gene Interaction database. The druggable genome denotes genes with known or predicted drug interactions. Clinically actionable denotes genes used in targeted cancer clinical sequencing panels.
The clinical PARP inhibitor, olaparib, has exemplified the concept of synthetic lethality in BRCA-deficient cells (Bryant et al., 2005; Farmer et al., 2005), and it is notable that BRCA deficiencies were isolated as one of the top genetic vulnerabilities for both G4 ligands in both A375 and HT1080 cells. While PDS and PhenDC3 have not been optimised by medicinal chemistry, the findings of Zimmer et al. showing similar efficacy of PDS and olaparib in several BRCA-deficient models (Zimmer et al., 2016) lend further support that our screen detects robust, biologically relevant effects.
In dropout screens, dissociating minor from robust growth effects is important and is highly dependent on parameters such as compound dose, genotype and cell line selected. Our screen was designed with stringent parameters to detect gene deficiencies worthy of further exploration. Indeed, we demonstrate potent growth inhibition of up to 80% upon knockdown of the four top G4 sensitiser genes in a parallel siRNA approach.
The gene sensitivities uncovered here have potential to be exploited chemotherapeutically in cancer by deploying a G4-stabilising drug as a single-agent therapy. Alternatively, in the absence of a particular gene deficiency, pharmacological inhibition of a critical oncogene could phenocopy the genetic sensitivities described here and be used in combinatorial treatments with G4-stabilising drugs. This may be attractive as cells are less likely to simultaneously develop resistance against two drugs (reviewed in Chan and Giaccia, 2011). Furthermore, because lower drug doses are used, this approach widens the therapeutic window and should cause fewer adverse side effects. As proof-of-principle for this, we selected the WEE1 cell cycle kinase and the deubiquitinase USP1, and demonstrated that their pharmacological inhibition, with MK1775 and pimozide, respectively, leads to the potentiation of cell death in conjunction with the G4 ligand PDS. For example, 5.3 µM PDS or 6.25 µM pimozide alone impart little growth inhibition (14% and 6%, respectively), but together they lead to strong growth inhibition (79%). Table 1 highlights further potential combinatorial opportunities for cancer-associated genes with clinical and/or experimental drugs. Additional therapeutic possibilities for other gene sensitivities that are largely still to be explored from a pharmacological perspective are illustrated in Table 2.
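The PDS/pimozide example above can be turned into a small worked calculation. The Python sketch below uses the Bliss independence reference model, which is one of several models available for synergy analysis; the source does not state which model was applied, so this choice is an assumption, and the function names are hypothetical. The three input values are the growth inhibition figures quoted in the preceding paragraph.

```python
def bliss_expected(inhibition_a, inhibition_b):
    """Expected fractional inhibition if two agents act independently."""
    return inhibition_a + inhibition_b - inhibition_a * inhibition_b


def bliss_excess(observed, inhibition_a, inhibition_b):
    """Observed minus expected inhibition; positive values suggest synergy."""
    return observed - bliss_expected(inhibition_a, inhibition_b)


# Fractional growth inhibition values quoted in the text.
pds_alone = 0.14        # 5.3 µM PDS
pimozide_alone = 0.06   # 6.25 µM pimozide
combination = 0.79      # both agents together

expected = bliss_expected(pds_alone, pimozide_alone)
excess = bliss_excess(combination, pds_alone, pimozide_alone)
```

Under this assumed model the two agents would be expected to give about 19% inhibition if they acted independently, so the observed 79% exceeds expectation by roughly 60 percentage points.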
While the custom HT1080 screen recovered 58% of sensitisers seen for each ligand in the A375 genome-wide screen, it is striking that this increases to 93% (i.e. 112 out of 121) when considering all screens irrespective of G4 ligand, suggesting remarkable consistency when comparing G4 ligand effects globally. Differences in individual ligand sensitivities may arise from variances in cellular uptake and dose (for example, the GI20 dose of PhenDC3 is ten-fold higher for A375 than for HT1080), G4 ligand-dependent molecular preference for G-tetrad end binding (Le et al., 2015) and/or the accessibility of G4s in the chromatin of individual cell lines (Hänsel-Hertsch et al., 2016). These points, plus differences in protein knockdown efficiency, especially in A375 cells, may contribute to the differences in G4 ligand growth inhibition in our siRNA experiments. In the siRNA experiments, the G4 ligand-induced growth inhibition of both A375 and HT1080 appears not to follow a 'typical' dose response where higher concentrations lead to greater effects. This may in part be due to there being an optimum G4 ligand dose for a particular gene loss leading to enhanced cell death. Indeed, it is thought that lower drug concentrations better fall within a 'synthetic lethality window' (Nijman, 2011). Higher doses may mask these effects by targeting more G4s that are not dependent on the particular gene lost, and/or this may be due to other off-target effects. This is also supported by the experiments in Figure 9, which show that synergy is only apparent at defined concentrations.
Our data additionally provide insights into the possible functions of the identified G4 sensitisers and indicate roles in DNA damage response (DDR), transcription/chromatin remodelling, nucleic acid unwinding, splicing and ubiquitin-mediated proteolysis. Our findings substantially advance our knowledge of G4 interactions with DDR beyond BRCA1/2, as several key HR genes were identified as novel G4-sensitisers including PALB2, BAP1 and the deubiquitinase USP1. Importantly, this highlights that such HR repair mechanisms are an integral and important cellular response in preventing cell death induced through the increased persistence of G4s. Persistent G4 structures are also inhibitory to DNA replication/cell cycle progression (reviewed in Valton and Prioleau, 2016), and it is of note that we also uncovered many cell cycle/DNA replication sensitivities such as PCNA, CHEK1, CCND1, CDC7, RFC2 and RFC4. Taken together, these suggest that G4 stabilisation with small molecules could be an attractive therapeutic strategy to inhibit cell growth.
Deficits in G4-unwinding helicases are predicted to increase the persistence of G4 structures, resulting in heightened sensitivity to G4 ligands. Several known G4-associated helicase deficiencies were recovered, including RECQL4, RTEL1 and DHX36, alongside many others with no known G4 link (see Figure 10A). Here, we demonstrate for the first time that the DDX42 DEAD/DEAH helicase is in fact a previously unidentified structure-specific G4-binding protein. On a wider level, this acts as proof-of-principle that other specific G4-interacting proteins exist within the sensitiser list of over 700 proteins. Other known G4-helicases such as BLM, WRN, PIF1 and FANCJ (reviewed in Wu and Brosh, 2010) were not identified as sensitisers, which may reflect functional redundancy (Spillare et al., 2006), low ligand concentrations and/or cell type effects.
Our findings highlight the ubiquitin-proteasome pathway and modifications such as neddylation as unexplored areas with respect to G4s. The only documented ubiquitin-G4 relationship in human cells is with HERC2, an E3 ubiquitin ligase that is implicated in G4 resolution and whose loss sensitises cells to G4 ligands (Wu et al., 2018). We also independently validate HERC2 as a G4 sensitiser in our screen and extend our observations to cover the full breadth of the proteasomal degradation pathway, including members of E1 ligase (UBA3, UBA2, SAE1), E2 ligase (UBE2H), E3 ligase (NEDD4L, RBX1, CUL1, RNF20), deubiquitinating enzyme (USP1 and USP37) and proteasome (PSMC2) families (see Table 2) (Senft et al., 2018; Wei and Lin, 2012). Given the involvement of ubiquitin-proteasomal regulation in pathways, such as DDR and cell cycle, that are generally deregulated in cancer (Harrigan et al., 2018), this opens up an interesting intersection between ubiquitin regulation and G4s. As ubiquitin components are being targeted for anticancer therapies (Huang and Dixit, 2016), their efficacy might be enhanced through simultaneous G4 targeting, and here we have provided strong proof-of-principle of this using synergistic combinations of pimozide (targeting USP1) and the G4 ligand PDS.
In contrast to other genetic screens identifying sensitiser genes that enhance the efficacy of anticancer agents (Azorsa et al., 2009; Martens-de Kemp et al., 2017), our work suggests that persistent G4s are problematic for splicing. We identified several cancer-associated splicing factors as G4 sensitisers, including SRSF10, HNRNPM and the known G4-interactor FUS, which is overexpressed in several cancers (Crozat et al., 1993; Dvinge et al., 2016; Takahama et al., 2013). For the latter, a drug inhibiting general spliceosome assembly (Table 1) has been pharmacologically explored (Kotake et al., 2007), raising the possibility of potentiation by G4-stabilising ligand combinatorial treatment.
We designated four of the genetic vulnerabilities as 'key' genes (BRCA1, TOP1, DDX42, and GAR1) whose deficiencies stood out with respect to consistent sensitivity to PDS and PhenDC3 in both cell lines tested. Given this, we postulate that deficiencies in any of these four genes will impart significant G4 ligand sensitivity for a range of cell types and/or with other G4 ligands. As GAR1-deficiencies are implicated in chronic lymphocytic leukaemia and contribute to telomere dysfunction (Dos Santos et al., 2017), we suggest that this cancer may be acutely sensitive to G4-stabilisation by small molecules.
In conclusion, we have revealed genes and pathways that interact with stabilised G4 structures. This information provides new insights into G4-related biology, especially into the functional pathways and roles as G4-interacting proteins. Furthermore, this work reveals novel disease-related genetic vulnerabilities for G4-ligands. Overall, these data provide a unique and comprehensive resource that can be further explored to understand biology that may involve G4s and also inspire new therapeutic possibilities.

Cell lines
HT1080 (RRID: CRL-1619) and A375 (RRID: CRL-121) were obtained from the American Type Culture Collection repository (ATCC) (LGC Standards, United Kingdom) and Plat-A (RRID: RV-102) was obtained from Cell Biolabs Incorporation. All cell lines were cultured in DMEM medium (Thermo Fisher Scientific, cat #41966029) supplemented with 10% (v/v) heat-inactivated FBS (Thermo Fisher Scientific, cat #10500064) and grown at 37˚C in a 5% CO2 humidified atmosphere. Cell lines were authenticated using short tandem repeat (STR) profiling and regularly checked to be mycoplasma-free by RNA-capture ELISA. All cell lines tested negative for Mycoplasma contamination. None of the cell lines used in our studies was mentioned in the list of commonly misidentified cell lines maintained by the International Cell Line Authentication Committee.

Quantification of live cell numbers
Live cell numbers (e.g. for plating cells for CellTitre-Glo assays, the screens and Incucyte experiments) were determined using the Muse Cell Analyzer (Merck) 'Count and Viability' assay according to the manufacturer's instructions. Cells were diluted either 1:10 or 1:20 in 'Muse Count and Viability kit' solution (Merck, cat # MCH60013) to give a viable cell concentration of 1-2 × 10^6 cells/mL, with the 'Events to Acquire' parameter set at 1000 events. Three cell counts were recorded.
Determination of G4 ligand concentration for shRNA screens
PDS and PhenDC3 (both synthesised in-house) (De Cian et al., 2007b; Rodriguez et al., 2008) were used as 100 mM stocks, dissolved in DMSO (Thermo Fisher Scientific, cat # 20688). GI20 values were calculated by treating A375 and HT1080 cells with serial dilutions of PDS and PhenDC3 for 96 hr and determining cell death via a CellTitre-Glo One Solution assay (Promega, cat # G8461) according to the manufacturer's protocol. Each serial dilution was replicated four times for two cell-seeding densities (1000 and 1500 cells per well). For both densities, curves were plotted averaging the four replicates in Prism (GraphPad v6) using a non-linear regression model with the 'dose-response - inhibition' equation [log(inhibitor) vs. normalised response - variable slope], and GI20 values were calculated. The GI20 concentrations used represent an average of three separate assays per cell line and yielded the following concentrations for the screens: A375, 10 µM PhenDC3 and 1.5 µM PDS; HT1080, 1 µM PhenDC3 and 0.5 µM PDS.
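As an illustration of how a GI20 can be read off a fitted variable-slope curve, the following minimal sketch assumes the normalised response takes the form 100/(1 + (x/IC50)^h), with a positive Hill coefficient h. The function names and the fitted parameters are hypothetical and are not values from this study; the curve fitting itself was carried out in Prism and is not reproduced here.

```python
def normalised_response(conc, ic50, hill):
    """Variable-slope inhibition curve: response as % of control (100 down to 0)."""
    return 100.0 / (1.0 + (conc / ic50) ** hill)


def gi_from_fit(ic50, hill, inhibition_pct=20.0):
    """Concentration giving the requested % growth inhibition on the fitted curve."""
    remaining = 100.0 - inhibition_pct           # e.g. 80% of control for GI20
    ratio = (100.0 / remaining) - 1.0            # equals (x / IC50) ** hill
    return ic50 * ratio ** (1.0 / hill)


# Hypothetical fitted parameters, for illustration only.
ic50_um, hill = 4.0, 1.0
gi20 = gi_from_fit(ic50_um, hill)
print(f"GI20 = {gi20:.2f} uM, response there = "
      f"{normalised_response(gi20, ic50_um, hill):.1f}% of control")
```

Averaging the GI20 obtained from separate assays, as described above, would then give the screening concentration.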

Composition and recombinant DNA production of shRNA libraries
The genome-wide screen uses the transOMIC LMN shRNA library against the human protein-coding genome, consisting of 113,002 total shRNAs split between 12 pools for ease of handling (approximately 10,000 shRNAs per pool), with an average of five optimised hairpins per gene. The G4-focused screen consists of a custom shRNA pool (transOMIC technologies) in the same LMN vector (8018 shRNAs); this includes 1247 genes (7436 shRNAs) uncovered in the genome-wide screen (751 sensitisers and 496 upregulated genes), 116 additional genes identified from the literature as potentially G4-associated (439 shRNAs) and shRNAs targeting 37 olfactory receptors as non-targeting controls (143 shRNAs). The 496 upregulated genes (FDR ≤ 0.05, 50% or three hairpins; log2FC ≥ 1) were included to mimic the genome-wide screen on a smaller scale by maintaining the population ratio of sensitisation and resistance. In this custom pool, unlike the commercially available genome-wide library, we capped the number of shRNAs at seven per gene. The backbone of both libraries contains NeoR and ZsGreen markers to allow monitoring of infected cell lines by Geneticin (Gibco, cat # 10131035) selection and fluorescence (MacsQUANT), respectively. Both libraries were provided as glycerol stocks. Bacterial density was determined by calculating the colony-forming units (CFU) from dilutions of the original glycerol stock after plating on agar plates (overnight, 37˚C, 100 µg/mL ampicillin). Glycerol stocks were thawed completely, and sufficient volume (based on CFU) was taken to ensure a minimum of 1000-fold hairpin representation and inoculated into liquid culture (LB media + 100 µg/mL ampicillin). Plasmid DNA was isolated using the ZR Gigaprep kit D4057 (Zymo Research) according to the manufacturer's protocol and DNA was quantified by NanoDrop OneC (Thermo Fisher Scientific).
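The arithmetic behind the focused pool composition and the 1000-fold inoculum can be sketched as follows. The shRNA counts are those stated above; the CFU titre and the helper name `inoculum_volume_ul` are hypothetical and serve only to show the calculation.

```python
# Composition of the focused pool as stated in the text.
focused_pool = {
    "genome_wide_hits": 7436,
    "literature_genes": 439,
    "olfactory_controls": 143,
}
total_shrnas = sum(focused_pool.values())


def inoculum_volume_ul(n_shrnas, fold, cfu_per_ul):
    """Glycerol stock volume (in microlitres) giving `fold` CFU per shRNA."""
    return n_shrnas * fold / cfu_per_ul


# Hypothetical titre of 1e6 CFU per microlitre, for illustration only.
volume = inoculum_volume_ul(total_shrnas, 1000, 1e6)
print(total_shrnas, "shRNAs;", round(volume, 3), "uL of stock needed")
```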

shRNA stable cell line creation
For the genome-wide screen, each pool was treated independently, necessitating the creation of 12 different polymorphic cell lines each containing an average of 10,000 shRNAs, for both HT1080 and A375, per replica (three replicas, 36 polymorphic cell lines). Virus was produced using the Platinum-A packaging cell line (4-6 × 15 cm plates per pool) and calcium phosphate transfection. 24 hr after plating Platinum-A cells (70-80% confluency), media was replaced with DMEM medium supplemented with 1% (v/v) PenStrep (Thermo Fisher Scientific, cat # 150763) and 10% (v/v) heat-inactivated FBS. shRNA library plasmid (75 µg) was then mixed with pCMV-VSV-G plasmid (7.5 µg, Addgene cat # 8454), Pasha/DGCR8 siRNA (2.7 µM, Qiagen cat # 1027423) to increase viral titre and 0.25 M CaCl2 in a total volume of 1.5 mL per 15 cm dish, bubbled with 1.5 mL 2x HBS (50 mM HEPES, 10 mM KCl, 12 mM dextrose, 280 mM NaCl, 1.5 mM Na2HPO4 at pH 7.00) and added to the Platinum-A cells (containing 17 mL media) in a dropwise fashion. Immediately before adding the DNA-Pasha-transfection mixture to the Platinum-A cells, chloroquine diphosphate (lysosomal inhibitor, Acros Organics cat # 455200250) was added to the plates at a final concentration of 2.5 µM. 14-16 hr after transfection, fresh media was added with 1:1000 1 M sodium butyrate (Merck, cat # 303410) for enhanced mammalian expression of the shRNA LMN plasmid. Virus was then harvested 48 hr after transfection, filter sterilised (0.45 µm) and stored at 4˚C for a maximum of 7 days. Viral titre was determined by performing mock infections and quantifying fluorescent cells via flow cytometry (MacsQUANT, Miltenyi Biotec Ltd.) 48 hr after infection. For both the genome-wide and focused screen, 3.6 × 10^6 target cells were infected with a viral volume predicted to cause 30% infection (MOI 0.3) to minimise multiple shRNA integrations per cell. This provides approximately 10 × 10^6 shRNA-expressing cells (1000-fold shRNA representation). Virus was diluted in serum-free media plus polybrene (8 µg/mL), with infections carried out in triplicate and treated as independent replicates hereafter. 48 hr after infection, antibiotic selection was performed with 800 µg/mL (HT1080) and 1000 µg/mL (A375) geneticin for 7-9 days (antibiotic concentrations were determined from 7-day toxicity curves prior to transfection setup).
Cell culture for pilot, genome-wide and focused shRNAs pools
Following complete antibiotic selection, a reference time point was harvested (t0) and cells were split into 3 × 15 cm plates per replica: PDS, PhenDC3 and DMSO vehicle control, each containing 8-10 × 10^6 cells to maintain 1000-fold hairpin representation. Every 72 hr, cells were trypsinised, counted via Muse Cell Analyzer to determine the number of population doublings, and 10 × 10^6 (A375 genome-wide and focused) or 8 × 10^6 (HT1080) cells per replica were re-plated in fresh drug/DMSO and media (17 mL media per plate). At all times, sufficient cell numbers were used so that a minimum of 1000 or 800 cells per shRNA was maintained (A375 and HT1080, respectively), to ensure maximal potential for uncovering phenotypic effects from each shRNA hairpin tested (Knott et al., 2014). The volume of DMSO used in the 'vehicle' condition is equal to the volume for 10 µM PhenDC3. The remaining drug treatments were supplemented with DMSO to match this volume, keeping the same DMSO concentration between treatments, cell lines and screens. For the pilot screen, two final time points were harvested after 7 and 15 population doublings (t7 and t15), and pellets were extracted and analysed as described below. Based on the pilot screen, discussed below, a final time point (tF) was harvested after 15 population doublings for the subsequent genome-wide and focused screens. For each pool of the genome-wide screen, 12 samples were generated (t0, DMSO tF, PDS tF, PhenDC3 tF; three replicas each). Therefore, 144 samples of 10 × 10^6 cells were generated to cover the entire screen. For each cell line of the focused screen, 12 samples were generated (t0, DMSO tF, PDS tF, PhenDC3 tF; three replicas each).
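The representation figures quoted above follow directly from the number of cells re-plated and the pool size. The sketch below assumes a nominal pool of about 10,000 shRNAs, as stated for the genome-wide pools; the function name is hypothetical.

```python
def cells_per_shrna(cells_replated, shrnas_in_pool):
    """Average number of cells carrying each shRNA after re-plating."""
    return cells_replated / shrnas_in_pool


pool_size = 10_000  # approximate shRNAs per pool (assumed nominal value)

a375 = cells_per_shrna(10e6, pool_size)    # 10 x 10^6 cells re-plated
ht1080 = cells_per_shrna(8e6, pool_size)   # 8 x 10^6 cells re-plated
print(f"A375: {a375:.0f} cells/shRNA, HT1080: {ht1080:.0f} cells/shRNA")
```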

Pilot screen technique to determine genome-wide parameters
To determine the most appropriate tF for the genome-wide screen, cells (from shRNA pool 8) were harvested after 7 and 15 population doublings (t7 and t15, respectively) and the average log2FC (tF/t0) counts for each hairpin were determined as described below. For t7, 13 and 115 shRNAs were significantly altered following PDS and PhenDC3 treatment, respectively (FDR ≤ 0.05). At t15, more hairpins were significantly depleted following PDS and PhenDC3 treatment (746 and 93 shRNAs, respectively, excluding those significantly changed in DMSO).

Barcode recovery, adapter ligation and sequencing
All PCR and sequencing oligonucleotides (Merck) are summarised in the table below. Cell pellets (10 × 10^6 cells) were resuspended in PBS and genomic DNA was extracted using the QIAamp DNA Blood Maxi Kit (Qiagen, cat # 51194) according to the manufacturer's spin protocol, eluted in a final volume of 1200 µL and quantified by Qubit DNA HS Assay Kit (Thermo Fisher Scientific, cat # Q32851). The shRNA inserts were PCR-amplified from all DNA in each sample, in multiple 50 µL reactions each using 1.5 µg gDNA, with KOD Hot Start DNA Polymerase (Merck, cat # 710864) and the following reagents (included within the kit): 5 µL 10x buffer, 5 µL 2 mM each dNTPs, 4 µL MgSO4 (25 mM), 1.5 µL polymerase, 4 µL DMSO. Forward (Mir-F) and reverse (PGKpro-R) primers flanking the loop and antisense sequence of the hairpin region were used at a final concentration of 300 nM. PCR was performed under the following conditions: 98˚C for 5 min, then 25 cycles of 98˚C for 35 s, 58˚C for 35 s and 72˚C for 35 s, followed by a final extension at 72˚C for 5 min. 1.2 mL of pooled PCR reaction was cleaned up using the QIAquick PCR purification kit (Qiagen, cat # 28104) according to the manufacturer's protocol. 2 µg purified PCR product was PCR-amplified in a second step, using forward (P5-Seq-P-Mir-Loop) and reverse (P7-Index-n-TruSeq-PGKpro-R) primers containing the P5 and P7 flowcell adapters, respectively. PCR was performed in 8 × 50 µL reactions each with 500 ng template DNA. The reverse primer contains TruSeq adapter small RNA Indexes for multiplexing and a 6-nucleotide barcode, denoted 'nnnnnn' below. PCR reagents were as for the first PCR, with the exception of the primers, which were used at a final concentration of 1.5 µM. The second PCR was performed under the following conditions: 98˚C for 5 min, then 25 cycles of 94˚C for 35 s, 52˚C for 35 s and 72˚C for 35 s, followed by a final extension at 72˚C for 5 min.
200 µL of pooled secondary PCR product was cleaned up as previously and the desired product (~340 bp) was extracted using the BluePippin (Sage Science) 2% Internal Standard Marker Kit (DF marker 100-600 bp; Sage Science, BDF2010), according to the manufacturer's protocol using a broad range elution (300-400 bp). Individual samples were quantified with a KAPA library quantification kit (KAPA Biosystems, cat # 0796-6014-0001) using a BioRad CFX96 Real Time PCR instrument with no Rox according to the manufacturer's protocols. Libraries were diluted to 4 nM in RNase-free water. For the genome-wide screen samples, 24 libraries (12 pools) and for the focused screen samples, 24 libraries (both cell lines) were combined to create a pooled 4 nM stock, with each sample having a unique TruSeq adapter. The genome-wide screen samples were sequenced in six batches; all focused screen samples were sequenced simultaneously. DNA-Seq libraries were prepared from these samples using the NextSeq Illumina Platform v2 High Output Kit 75 cycles, followed by 36 base pair single-read sequencing performed on an Illumina NextSeq instrument, using a custom sequencing primer.
(Andrews, 2010) and bases were filtered from the 3' end with a Phred quality threshold of 33 using the FASTX-Toolkit v0.0.14 (Gordon and Hannon, 2010). Trimmed reads were aligned to the 113,002 reference shRNA sequences provided by transOMIC technologies (Knott et al., 2014) using Bowtie 2 v2.2.6 with default parameters (Langmead and Salzberg, 2012), which resulted in overall alignment rates of 90-95% with an average of 98% of reference sequences detected. The generated SAM files were processed to obtain shRNA counts using Unix tools (https://opengroup.org/unix) and Python scripts (v2.7.10, https://www.python.org), and library purity and potential contaminations were investigated with stacked bar plots and multidimensional scaling (MDS) using the R programming language v3.2.1 (https://cran.r-project.org).
The code and scripts developed during the project are available on our group's GitHub site (Martínez Cuesta, 2019; copy archived at https://github.com/elifesciences-publications/GWscreen_G4sensitivity).
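For orientation, counting aligned reads per reference shRNA from a SAM file can be sketched as below. This is a minimal Python 3 illustration and not the project's own script; the function name and the example records are hypothetical. It relies only on the standard SAM layout, in which the second field is the bitwise flag and the third is the reference name.

```python
from collections import Counter


def count_shrna_alignments(sam_lines):
    """Count mapped primary alignments per reference sequence name."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):           # header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:               # malformed record
            continue
        flag, ref = int(fields[1]), fields[2]
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & (0x4 | 0x100 | 0x800) or ref == "*":
            continue
        counts[ref] += 1
    return counts


# Hypothetical SAM records, for illustration only.
example = [
    "@SQ\tSN:shRNA_A\tLN:22",
    "r1\t0\tshRNA_A\t1\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tshRNA_A\t1\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tshRNA_B\t1\t42\t4M\t*\t0\t0\tTTGA\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tGGGG\tIIII",
]
counts = count_shrna_alignments(example)
print(dict(counts))
```

In practice the same function could be applied to an open file handle, since it only iterates over lines.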
Filtering, normalisation, differential representation analysis and defining sensitisation
To discard shRNAs bearing low counts, each library was filtered based on a counts-per-million threshold of 0.5 for all initial time points (t0); for example, in a library of 10 million reads, shRNAs with at least five counts at every initial time point pass this filter. Normalisation factors were calculated to scale the raw library sizes using the weighted trimmed mean of M-values (TMM) approach (Robinson et al., 2010). To compare groups of replicates (time points and chemical treatments) for each pool, differential representation analysis of shRNA counts was performed using edgeR (Robinson et al., 2010). Common and shRNA-specific dispersions were estimated to allow the fitting of a negative binomial generalised linear model to the treatment counts. Contrasts between the initial time point and the treatments were defined (PDS-t0, PhenDC3-t0 and DMSO-t0) and likelihood ratio tests were carried out accordingly (Dai et al., 2014). Fold changes (FC) were then computed for every shRNA, and false discovery rates (FDR) were estimated using the Benjamini-Hochberg method. A gene was defined as significantly differentially represented for a given treatment if at least 50% or a minimum of 3 shRNAs were significant (FDR ≤ 0.05); sensitisation was additionally determined by applying a log2FC ≤ −1.
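The two thresholding steps in this section can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code: the thresholds are read as FDR ≤ 0.05 and log2FC ≤ −1, the fold-change cut-off is assumed to be applied per hairpin, and the function names and example values are hypothetical. The statistical modelling in edgeR is not reproduced.

```python
def passes_cpm_filter(t0_counts, t0_library_sizes, min_cpm=0.5):
    """True if the shRNA reaches min_cpm in every initial (t0) library."""
    return all(c * 1e6 / n >= min_cpm
               for c, n in zip(t0_counts, t0_library_sizes))


def is_sensitiser(hairpins, fdr_cutoff=0.05, lfc_cutoff=-1.0,
                  min_fraction=0.5, min_count=3):
    """Gene-level call from a list of (log2fc, fdr) tuples, one per hairpin.

    Assumption: a hairpin counts as a hit when it is both significant and
    depleted; the gene is called if hits reach min_count or min_fraction.
    """
    if not hairpins:
        return False
    hits = sum(1 for lfc, fdr in hairpins
               if fdr <= fdr_cutoff and lfc <= lfc_cutoff)
    return hits >= min_count or hits / len(hairpins) >= min_fraction


# A library of 10 million reads: five counts correspond to 0.5 CPM.
sizes = [10_000_000] * 3
print(passes_cpm_filter([5, 6, 7], sizes), passes_cpm_filter([4, 6, 7], sizes))

# Hypothetical hairpin results for one gene: two of four hairpins are hits.
gene = [(-1.5, 0.01), (-2.0, 0.03), (-0.2, 0.60), (-0.1, 0.90)]
print(is_sensitiser(gene))
```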
Exploring genes associated to G4s in databases and biomedical literature
Three different approaches were developed to uncover genes linked to G4s in the literature and molecular biology databases. 18 high-confidence G4-related genes were obtained by scanning for genes in which the corresponding UniProtKB (The UniProt Consortium, 2017) entry is annotated with the term 'quadruplex' or genes annotated with at least one of the following 11 GO terms with any evidence assertion method (Ashburner et al., 2000):
treatment. Specifically, a right-sided (Enrichment) test based on the hypergeometric distribution was performed on the corresponding Entrez gene IDs for each gene list and the Bonferroni adjustment (p < 0.05) was applied to correct for multiple hypothesis testing. Only experimental evidence codes (EXP, IDA, IPI, IMP, IGI, IEP) were used. The Kappa-statistics score threshold was set to 0.4 and GO term fusion was used to diminish redundancy of terms shared by similar proteins. Other parameters include: GO level intervals (3-8 genes) and Group Merge (50%). Protein domains were investigated using DAVID (v6.8) to integrate GENE3D crystallographic data and PFAM sequence information, and enrichment was considered significant if the EASE score p < 0.05 (Finn et al., 2016; Yeats et al., 2006).
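A right-sided hypergeometric test with Bonferroni adjustment, as named above, can be sketched in a few lines. This is a generic illustration of the statistics and not the enrichment software's implementation; the function names and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_right_tail(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / comb(N, n)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy example: 20 background genes, 5 annotated, 4 in the list, 2 annotated hits.
p = hypergeom_right_tail(k=2, K=5, n=4, N=20)
print(round(p, 4), bonferroni([p, 0.001, 0.2]))
```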

COSMIC analysis
Cancer mutation data (CosmicMutantExport.tsv) from the COSMIC database v82 (Forbes et al., 2015) were used to investigate the association between G4 sensitisers and cancer genes. ~150,000 mutations were available in COSMIC for 702 (93%) sensitiser genes, with some predicted to be pathogenic by the FATHMM algorithm embedded within the COSMIC database. The Cancer Gene Census (http://cancer.sanger.ac.uk/census) was used to investigate whether G4 sensitisers are enriched in genes containing mutations causally implicated in cancer. Fisher's exact tests as implemented in R were used to calculate the significance of the fold enrichment of sensitisers that are cancer genes in COSMIC (compared to the percentage of protein-coding genes in COSMIC, 3.3%).
siRNA validation experiments - transfection, experimental outline, immunoblotting
ON-TARGETplus siRNAs (Dharmacon/GE Healthcare) were used as summarised in the table below. Cells were transfected with either targeting or non-targeting control siRNAs using Lipofectamine RNAiMAX (Thermo Fisher Scientific, cat # 13778150) and OptiMEM reduced serum medium (Thermo Fisher Scientific) according to the manufacturer's protocol (reagent protocol 2013) alongside a non-transfected control. 24 hr after transfection, cells were trypsinised, counted and re-plated in media supplemented with PDS, PhenDC3 or DMSO vehicle control (minimum two biological replicates per condition) in a 48-well plate format (seeding density: 8,000 cells per well for A375; 4,000 cells per well for HT1080). Cell growth was monitored for 144 hr using IncuCyteZOOM live cell analysis (Sartorius) and cell confluency was calculated as a percentage of the well area covered. Scans were performed every 3 hr, with nine scans per well. To monitor protein levels, cells transfected simultaneously with the same siRNA-reagent mixture were harvested 48 hr and 144 hr after transfection by cell scraping, and lysed on ice (30 min) with RIPA lysis buffer with protease inhibitor + EDTA (Thermo Fisher Scientific, cat # 8990). Lysates were analysed by capillary electrophoresis via the Protein Simple Wes platform according to the manufacturer's protocol with antibodies summarised in the key resource table above. Lysates from non-transfected and siRNA-treated (targeting and non-targeting) samples were probed with antibodies against BRCA1 (Cell Signalling Technology, cat # 4970-CST), TOP1 (Abcam, cat # AB109374), GAR1 (NovusBio, cat # NBP2-31742) or DDX42 (Abcam, cat # AB80975), plus anti-beta actin antibody (mouse Merck cat # A5441; rabbit cat # 4970-CST) by multiplexing.
For non-targeting and targeting lysates, the area of the desired band was normalised to beta-actin and then normalised to the protein level in the non-targeting sample, for three (48 hr after transfection lysates) or two (144 hr after transfection) independent Wes runs. Protein depletion is expressed as an average of these normalised values. All lysates were used at a concentration of 0.8 mg/mL, with antibody dilutions as follows: BRCA1 1:50; TOP1 1:250; GAR1 1:100; DDX42 1:250; rabbit-actin 1:500; mouse-actin 1:250.
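The two-step normalisation described above can be written out as a short calculation. The band areas below are hypothetical and the function names are illustrative; only the order of operations (actin normalisation, then normalisation to the non-targeting sample, then averaging across runs) is taken from the text.

```python
def relative_protein_level(target_area, actin_area,
                           nt_target_area, nt_actin_area):
    """Actin-normalised band area relative to the non-targeting control."""
    return (target_area / actin_area) / (nt_target_area / nt_actin_area)


def mean_relative_level(runs):
    """Average the normalised values over independent Wes runs."""
    levels = [relative_protein_level(*run) for run in runs]
    return sum(levels) / len(levels)


# Hypothetical band areas per run:
# (target siRNA band, its actin, non-targeting band, its actin)
runs = [
    (200.0, 1000.0, 800.0, 1000.0),
    (300.0, 1200.0, 900.0, 1200.0),
    (150.0, 900.0, 600.0, 900.0),
]
level = mean_relative_level(runs)
print(f"protein remaining: {level:.2f} of non-targeting control")
```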

G4 ligand and drug treatments
10 mM stocks in DMSO of PDS (in-house synthesis), MK1775 (Cambridge Bioscience, cat # CAY21266) and pimozide (Merck, cat # P1793-500MG) were used for the synergy experiments. Cells were seeded in Corning tissue culture-treated 96-well clear-bottom plates (Thermo Fisher Scientific) for the HT1080 (1000 cells per well) and A375 (1500 cells per well) cell lines. 24 hr after plating, media was removed and cells were treated with different concentrations of PDS and MK1775 or pimozide in media in a final volume of 150 µL, alongside non-treated and solvent-treated controls.
After 96 hr, cell death was determined via a CellTitre-Glo One Solution assay (Promega, cat # G8461) according to the manufacturer's protocol, using the PHERAstar FS (BMG Labtech) to detect luminescence with the recommended settings. Values were normalised to and expressed as a percentage of the untreated controls. This was performed for three biological replicas. Data were analysed via Combenefit software using the BLISS independence model, since the molecules have independent targets (Di Veroli et al., 2016), to determine synergy. For the clonogenic cell survival assay, A375 (300 cells per well) and HT1080 (400 cells per well) were plated as single cells in 12-well plates. The next day, cells were treated with DMSO or the indicated doses of PDS, pimozide and/or MK1775 in media. After 8 days, colonies were fixed with 3% trichloroacetic acid (TCA) for 90 min at 4˚C, washed with MilliQ water, air dried and then stained with 0.057% (v/v) Sulforhodamine B solution (Merck, cat # 230162-5G) for 30 min at room temperature. Plates were then washed four times with 1% acetic acid and air dried, and colonies were visualised using GelCount (Oxford Optronix). Colony growth was determined using the 'colony intensity percentage' parameter in the ColonyArea ImageJ plugin (Guzmán et al., 2014), which considers both the intensity and percentage of area covered by the colonies. Values were normalised to and expressed as a percentage of the untreated controls and then further processed by Combenefit software, as described above, to determine synergy. A total of three independent biological replicates were performed.
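Under Bliss independence, the expected fractional inhibition of a combination is fa + fb − fa·fb, and an observed effect above this value indicates synergy. The single-point sketch below applies this to the growth inhibition values quoted in the Discussion for PDS (14%), pimozide (6%) and their combination (79%). It is an illustration only: Combenefit evaluates the model over the full dose matrix, and the function names here are hypothetical.

```python
def bliss_expected(fa, fb):
    """Expected fractional inhibition if the two agents act independently."""
    return fa + fb - fa * fb


def bliss_excess(observed, fa, fb):
    """Observed minus expected inhibition; positive values suggest synergy."""
    return observed - bliss_expected(fa, fb)


# Fractional growth inhibition: PDS alone, pimozide alone, combination.
expected = bliss_expected(0.14, 0.06)
excess = bliss_excess(0.79, 0.14, 0.06)
print(f"expected {expected:.4f}, observed 0.79, excess {excess:.4f}")
```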
The supernatant was then collected as nuclear extract. Cytoplasmic and nuclear lysates were quantified on a Direct Detect platform (Merck) and DDX42 expression was analysed by immunoblotting using the Protein Simple Wes instrument as described above, with a lysate concentration of 0.5 mg/mL. Samples were also immunoblotted with antibodies against nuclear lamin B1 (CST 12586; 1:250) and cytoplasmic GAPDH (CST 5174; 1:50) to confirm subcellular fractionation efficiency.

Oligonucleotide annealing
Biotinylated oligonucleotides for G4 and non-G4 forming sequences (IDT technologies; see Table below) were annealed in 10 mM Tris-HCl pH 7.4, 100 mM KCl by heating at 95˚C for 10 min followed by slow cooling to room temperature overnight at a controlled rate of 0.2˚C/min. Annealed oligonucleotides were stored at 4˚C for a maximum of 1 month.

Enzyme-Linked immunosorbent assay
Recombinant human DDX42 with an N-terminal GST tag was purchased from NovusBio (cat # H0001325-P01). Streptavidin-Coated High-Binding Capacity 96-well plates (ThermoScientific, prod # 15501) were hydrated with PBS (30 min) and coated with 50 nM biotinylated oligonucleotides (1 hr, shaking at 450 rpm). Wells were washed three times with ELISA buffer (50 mM K2HPO4 pH 7.4 and 100 mM KCl/100 mM LiCl), with 1 min shaking at 450 rpm. Wells were then blocked with 3% (w/v) BSA (Merck, cat # A7030) in ELISA buffer for 1 hr at room temperature and then incubated with serial dilutions of DDX42 up to 200 nM for 1 hr. Wells were washed three times with 0.1% TWEEN-20 in ELISA buffer and then incubated for 1 hr with anti-GST HRP-conjugated antibody (Abcam AB3416) diluted 1:10,000 in blocking buffer. Wells were again washed three times with ELISA-Tween, and the bound anti-GST HRP was detected with TMB substrate (Merck, cat # T4444) for 2 min. Reactions were stopped with 2 M HCl. Absorbance at 450 nm was measured with a PheraSTAR plate reader (BMG Labtech).
Binding curves with standard error of the mean (SEM) were fitted using GraphPad Prism software, using a non-linear regression fit, one site, specific binding model with saturation kinetics. The following equation was used: y = (Bmax × x)/(Kd + x), where x is the concentration of DDX42 (nM), Bmax is the maximum specific binding (i.e. saturation) and Kd is the equilibrium dissociation constant.
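The one-site specific binding model can be evaluated directly, which makes its key property explicit: the signal reaches half of Bmax when the protein concentration equals Kd. The parameter values below are hypothetical and are not fitted values from this study; the function name is illustrative.

```python
def one_site_specific(x, bmax, kd):
    """One-site specific binding: y = Bmax * x / (Kd + x)."""
    return bmax * x / (kd + x)


# Hypothetical parameters: Bmax in absorbance units, Kd in nM.
bmax, kd = 1.2, 50.0

half = one_site_specific(kd, bmax, kd)       # signal at x = Kd
top = one_site_specific(200.0, bmax, kd)     # signal at the 200 nM top dose
print(f"y(Kd) = {half:.2f}, y(200 nM) = {top:.2f} "
      f"({100 * top / bmax:.0f}% of Bmax)")
```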

Circular dichroism spectroscopy
200 µL of 10 µM oligonucleotide was prepared in assay buffer and annealed as described above. CD spectra were recorded on an Applied Photophysics Chirascan CD spectropolarimeter using a 1 mm path length quartz cuvette. CD measurements were performed at 298 K over a range of 200-320 nm using a response time of 0.5 s, 1 nm pitch and 0.5 nm bandwidth. The recorded spectra represent a smoothed average of three scans, zero-corrected at 320 nm (molar ellipticity is quoted in 10^5 deg cm^2 dmol^−1). The absorbance of the buffer was subtracted from the recorded spectra.