Functional interrogation of HOXA9 regulome in MLLr leukemia via reporter-based CRISPR/Cas9 screen

Aberrant HOXA9 expression is a hallmark of most aggressive acute leukemias, notably those with KMT2A (MLL) gene rearrangements. HOXA9 overexpression not only predicts poor prognosis and outcome but also plays a critical role in leukemia transformation and maintenance. However, our current understanding of HOXA9 regulation in leukemia is limited, hindering the development of therapeutic strategies. Here, we generated HOXA9-mCherry knock-in reporter cell lines to dissect HOXA9 regulation. By combining the reporter with CRISPR/Cas9 screens, we identified transcription factors controlling HOXA9 expression, including a novel regulator, USF2, whose depletion significantly downregulated HOXA9 expression and impaired MLLr leukemia cell proliferation. Ectopic expression of Hoxa9 rescued the impaired leukemia cell proliferation upon USF2 loss. Cut and Run analysis revealed direct occupancy of USF2 at the HOXA9 promoter in MLLr leukemia cells. Collectively, the HOXA9 reporter facilitated the functional interrogation of the HOXA9 regulome and has advanced our understanding of the molecular regulation network in HOXA9-driven leukemia.


Introduction
Dysregulation of the homeobox (HOX)-containing transcription factor HOXA9 is a prominent feature in most aggressive acute leukemias (Collins and Hess, 2016a; Alharbi et al., 2013). During normal hematopoiesis, HOXA9 plays a critical role in hematopoietic stem cell expansion and is epigenetically silenced during lineage differentiation (Alharbi et al., 2013). In certain leukemia subtypes, this regulatory switch fails and HOXA9 is maintained at high levels to promote leukemogenesis. However, the mechanisms governing HOXA9 expression remain to be fully understood. HOXA9 overexpression is observed in over 70% of human acute myeloid leukemia (AML) cases and ~10% of acute lymphoblastic leukemia (ALL) cases (Jambon et al., 2019). Notably, high expression of HOXA9 is strongly correlated with poor prognosis and outcome in human leukemia (Golub et al., 1999; Baccelli et al., 2019). An accumulating body of evidence indicates that HOXA9 dysregulation is both sufficient and necessary for leukemic transformation (Collins and Hess, 2016a; Alharbi et al., 2013). Forced expression of HOXA9 enforces self-renewal, impairs myeloid differentiation of murine marrow progenitors, and ultimately leads to late-onset leukemic transformation (Bach et al., 2010), which is accelerated by co-expression with the interacting partner protein MEIS1 (Kroon, 1998). Conversely, knocking down HOXA9 expression results in leukemic cell differentiation and apoptosis (Ayton and Cleary, 2003; Zeisig et al., 2004). Thus, excessive HOXA9 expression has emerged as a critical mechanism of leukemia transformation in many hematopoietic malignancies.
Consistent with the broad overexpression pattern of HOXA9 in many leukemia cases, a wide variety of genetic alterations in leukemia contribute to HOXA9 dysregulation, including MLL gene rearrangements (MLLr), NPM1 mutations, NUP98 fusions, EZH2 loss-of-function mutations, ASXL1 mutations, MOZ fusions and other chromosome alterations (Collins and Hess, 2016a; Jambon et al., 2019; De Braekeleer et al., 2014; Collins and Hess, 2016b). Additionally, our recent work shows that DNMT3A hotspot mutations may also contribute to HOXA9 overexpression by preventing DNA methylation at its regulatory regions (Lu et al., 2016). Given that genomic variation of HOXA9, including NUP98-HOXA9 fusion and gene amplification, accounted for less than 2% of HOXA9 overexpression in AML cases (Xu et al., 2016; Gough et al., 2011; Nakamura et al., 1996), uncovering the upstream epigenetic and transcriptional regulators of HOXA9 in leukemia could advance the design of novel therapeutic interventions. For example, because MLLr proteins recruit the histone methyltransferase DOT1L to the HOXA locus, promoting hyper-methylation at histone H3 lysine 79 and subsequent high HOXA9 transcription (Krivtsov et al., 2008), selective DOT1L inhibitors have been exploited to inhibit leukemia development and HOXA9 expression in MLLr leukemias and are now in clinical trials (Chen et al., 2015; Stein and Tallman, 2015). However, DOT1L inhibitors usually act slowly and their effects remain sub-optimal. To date, most known HOXA9 regulator proteins are epigenetic modifiers, and little is known about which DNA-binding transcription factors are involved in directly regulating HOXA9 expression in acute leukemia (Godfrey et al., 2017; Daigle et al., 2011; Yu et al., 2012; Shi et al., 2012).
Previous studies have also proposed that the organization of chromatin domains at the HOXA gene cluster contributes to high HOXA9 expression in cancer cells (Luo et al., 2018; Xu et al., 2014). Specifically, the CCCTC-binding factor CTCF may potentiate HOXA9 expression through direct binding at the conserved motif between HOXA7 and HOXA9 (CBS7/9) to establish the necessary chromatin looping interaction networks in MLLr AML MOLM13 cells (Luo et al., 2018; Luo et al., 2019). In contrast, Ghasemi et al. reported that HOXA gene expression was maintained in CTCF-binding site deletion mutants derived from AML OCI-AML3 cells, suggesting that transcriptional activity at the HOXA locus in NPM1-mutant AML cells does not require long-range CTCF-mediated chromatin interactions (Ghasemi et al., 2020). These data also suggest that CTCF may play a cell-type-dependent role in HOXA9 regulation. However, whether loss of CTCF has a direct effect on HOXA9 expression remains to be studied. Lastly, although the clinical significance of HOXA9 has been recognized for more than two decades, it is technically difficult to systematically discover regulators of HOXA9 in acute leukemia owing to the lack of an endogenous reporter to report HOXA9 expression.
In this work, we sought to establish an endogenous reporter system enabling real-time monitoring of HOXA9 expression in conjunction with high-throughput CRISPR/Cas9 screening in a human B-ALL MLLr t(4;11) cell line, SEM, and an AML MLLr t(6;11) cell line, OCI-AML2, each equipped with an endogenous HOXA9 P2A-mCherry reporter allele. The HOXA9 P2A-mCherry reporter allele authentically recapitulated endogenous transcription of the HOXA9 gene and did not affect endogenous transcription of adjacent HOXA genes. To gain a global understanding of the transcription factors regulating HOXA9 expression, we performed a CRISPR/Cas9 loss-of-function screen specifically targeting 1639 human transcription factors. Our screening robustly re-identified expected targets such as KMT2A, DOT1L and HOXA9 itself. Surprisingly, the CRISPR screen and global depletion of CTCF via siRNA and degron-associated protein degradation all demonstrated that HOXA9 is not downregulated upon CTCF loss. More importantly, we identified novel functional regulators of HOXA9, including Upstream Transcription Factor 2 (USF2). USF2 depletion selectively downregulated HOXA9 expression in MLLr leukemia cells and impaired cell growth, which could be rescued by ectopic expression of HOXA9 and its partner MEIS1. Thus, our HOXA9 P2A-mCherry reporter lines are robust tools for the discovery of novel HOXA9 regulators.

Results
Establishment and characterization of the HOXA9 P2A-mCherry reporter human MLLr leukemia cell line
As shown by many previous studies, HOXA9 overexpression was observed in refractory MLL-rearranged ALL and AML patients (Gu et al., 2019; Haferlach et al., 2010; Kohlmann et al., 2008; Figure 1-figure supplement 1A-C). Therefore, we utilized our previously reported high-efficiency knock-in strategy, 'CHASE knock-in', to deliver the P2A-mCherry cassette upstream of the HOXA9 stop codon in a patient-derived human B-ALL cell line, SEM, which has a typical B-ALL signature along with a t(4;11) translocation and maintains expression of the HOXA gene cluster from a single allele (Figure 1-figure supplement 1D). Because P2A-mediated ribosome skipping disrupts the synthesis of the glycyl-prolyl peptide bond at the C-terminus of the P2A peptide, translation leads to dissociation of the P2A peptide and its immediate downstream mCherry protein (Kim et al., 2011). Therefore, the knock-in allele would produce a functional HOXA9 protein under control of the endogenous promoter and intrinsic cis-regulatory elements while delivering a separate mCherry protein. In brief, we constructed the knock-in vector containing a P2A-mCherry cassette flanked by 5' and 3' HOXA9 homology arms (HAs) of approximately 800 bp, which were cloned from SEM cells. A single guide RNA (sgRNA) sequence and a protospacer adjacent motif (PAM) targeting the genomic sequence 5' of the HOXA9 stop codon were inserted at the 5' end of the 5' HA and the 3' end of the 3' HA (Figure 1A). When the HA/knock-in cassette was co-electroporated with an all-in-one vector expressing wild-type Cas9 and the same HOXA9 sgRNA, the HA/knock-in cassette was released from the donor vector by two nuclease cleavages and delivered to the target genomic region where double-strand breaks occurred.
Successful knock-in cells were enriched by flow cytometry sorting for mCherry (Figure 1B) and characterized via genotyping PCR and Sanger sequencing (Figure 1C). To examine the possibility of random integration of the P2A-mCherry cassette, fluorescence in situ hybridization (FISH) was performed with a P2A-mCherry DNA probe (red) and a FITC-labeled fosmid DNA probe targeting the HOXA9 locus (green). On-target knock-in cells displayed co-localization of red and green fluorescence without random integration signals in the rest of the genome (Figure 1D and Figure 1-figure supplement 2A-D). The bulk knock-in population from SEM cells, hereafter called HOXA9 P2A-mCherry, was used as a reporter cell line for the entire study. Similarly, a HOXA9 P2A-mCherry allele was delivered to a human MLLr AML cell line, OCI-AML2 (Figure 1-figure supplement 2E). Many knock-in studies have reported that an exogenous DNA fragment may affect normal endogenous gene expression in a complex chromatin niche (Zu et al., 2013). Therefore, to test whether the inserted P2A-mCherry segment would affect the gene expression pattern of HOXA9 and its neighboring HOXA cluster genes, Q-PCR analysis was conducted on both wild-type (WT) and HOXA9 P2A-mCherry knock-in (KI) cells. RNA-seq data collected from SEM cells in our previous studies suggested that HOXA7, HOXA9 and HOXA10 were the only highly expressed HOXA genes in MLLr leukemia SEM cells (Figure 1E), and these patterns were indistinguishable between WT and KI populations, indicating that the P2A-mCherry knock-in did not alter the gene expression landscape at the HOXA cluster (Figure 1F).

The HOXA9 P2A-mCherry reporter allele recapitulates endogenous transcription of HOXA9 in MLLr cells
To evaluate whether the HOXA9 P2A-mCherry reporter allele would faithfully respond to the transcriptional regulation of the cellular HOXA9 promoter, we genetically perturbed or pharmacologically inhibited HOXA9's upstream regulators. Previous studies have shown that DOT1L and ENL positively regulate HOXA9 expression in MLLr leukemia via direct occupancy on HOXA9's promoter (Zeisig et al., 2004; Chen et al., 2015). Therefore, two sgRNAs targeting the coding regions of DOT1L (sgDOT1L) and ENL (sgENL) were introduced into the HOXA9 P2A-mCherry cells expressing Cas9. Flow cytometry and Q-PCR analysis each revealed that mCherry and HOXA9 expression were both downregulated by sgRNAs targeting DOT1L or ENL (Figure 2A-D), and that mCherry expression correlated well with the expression of HOXA9 (Figure 2E). Additionally, a DOT1L-selective inhibitor, SGC0946 (Yu et al., 2012), was added at different dosages for 6 days to the HOXA9 P2A-mCherry cells in culture, resulting in a dosage-dependent reduction of mCherry fluorescence intensity measured by fluorescence imaging (Figure 2F-G) and flow cytometry (Figure 2H). Again, Q-PCR analysis of the DMSO- and SGC0946-treated cells showed that mRNA expression of mCherry [...]. Taken together, these data confirm that the newly established HOXA9 P2A-mCherry alleles were authentically controlled by the endogenous HOXA9 promoter and its local chromatin niche.

Pooled CRISPR/Cas9 screening identified a novel transcription factor, USF2, that regulates HOXA9 expression
Although a few regulators of HOXA9 in MLLr leukemia have been previously identified (Zeisig et al., 2004; Collins and Hess, 2016b; Collins et al., 2014; Li et al., 2013a; Sun et al., 2013; Li et al., 2013b; Ogawara et al., 2015; de Bock et al., 2018; Lynch et al., 2019), to date a comprehensive CRISPR/Cas9 screen to identify novel upstream regulatory factors of HOXA9 in an unbiased manner has not been feasible owing to the lack of a reliable reporter cell line. Therefore, we combined the HOXA9 P2A-mCherry reporter line and an in-house CRISPR-Cas9 sgRNA library targeting 1639 human transcription factors to identify novel regulatory effectors (Lambert et al., 2018). In this library, seven sgRNAs spanning multiple coding exons were designed per transcription factor, seven sgRNAs targeting DOT1L were included as a positive control, and an additional 100 non-targeting sgRNAs were included as negative controls. Two parallel screens were performed on the same HOXA9 P2A-mCherry reporter line stably expressing Cas9, which was transduced with the lentiviral sgRNA library at a low M.O.I. (less than 0.3). Cells were selected with antibiotics, enriched, and fractionated by flow cytometric sorting for the top 10% (mCherry High) and bottom 10% (mCherry Low) mCherry populations, followed by genomic DNA extraction, PCR, and deep sequencing to identify differentially represented sgRNAs (Figure 3A). The differentially represented sgRNAs were calculated by DESeq2 analysis and combined for MAGeCK testing at the gene level. The positive control genes HOXA9 and DOT1L were identified among the top hits between the mCherry High and mCherry Low populations, suggesting that the screening was successful (Figure 3B).
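The library composition and the low-M.O.I. transduction described above can be illustrated with a small calculation. The sketch below is not part of the study: it assumes that the DOT1L and non-targeting guides are counted separately from the 1639 transcription factors, and that lentiviral integrations per cell follow a Poisson distribution with mean equal to the M.O.I. The function names are hypothetical.

```python
import math


def library_size(n_genes=1639, guides_per_gene=7,
                 n_positive_control=7, n_non_targeting=100):
    """Total sgRNA count, assuming control guides are counted separately
    from the transcription-factor guides (an assumption, not stated in the text)."""
    return n_genes * guides_per_gene + n_positive_control + n_non_targeting


def single_integrant_fraction(moi):
    """Among infected cells (>= 1 integration), the fraction carrying exactly
    one integration, assuming a Poisson model with mean `moi`."""
    p_exactly_one = moi * math.exp(-moi)
    p_at_least_one = 1.0 - math.exp(-moi)
    return p_exactly_one / p_at_least_one


if __name__ == "__main__":
    print(library_size())                            # 11580
    print(round(single_integrant_fraction(0.3), 3))  # 0.857
```

Under these assumptions, an M.O.I. of 0.3 leaves roughly 86% of transduced cells with a single sgRNA, which is why a low M.O.I. helps tie each sorted cell's mCherry level to one perturbation.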
To mitigate the possibility that key upstream regulators of HOXA9 could be missed due to a survival disadvantage, we conducted an independent CRISPR/Cas9 TF screen in HOXA9 P2A-mCherry reporter SEM cells with ectopically expressed HOXA9 together with its functional partner MEIS1. In this setting, exogenously expressed HOXA9 could rescue the potential cell loss due to decreased HOXA9 expression in SEM cells, while the level of endogenous HOXA9 is still monitored by the mCherry reporter. As a result, our CRISPR screening using the HOXA9/MEIS1 pre-rescued reporter line identified additional well-known regulators of HOXA9 that are also considered survival-essential genes. Among the top 10 hits from this screen, DOT1L and HOXA9 were enriched. KMT2A, the gene rearranged in the MLL-AF4 translocation in SEM cells, was identified in the HOXA9-MEIS1 rescue TF screen but not in the original screen without ectopic expression of HOXA9. Notably, the MYST acetyltransferase HBO1 (also known as KAT7 or MYST2) and several members of the HBO1 protein complex, which were recently shown to be critical regulators of leukemia stem cell maintenance, were also identified among the top hits (MacPherson et al., 2020; Au et al., 2020). Most importantly, USF2 was enriched among the top hits in both screens (Figure 3B), suggesting that USF2 is likely a positive regulator with less survival essentiality than KMT2A. Consistent with the significant enrichment of these three candidates at the gene level, DESeq2 analysis and sgRNA enrichment plotting both suggested that most of the sgRNAs against these genes were differentially represented (Figure 3C-F and Figure 3-figure supplement 1A-F). Importantly, all the non-targeting control sgRNAs were similarly distributed across the mCherry High and mCherry Low populations, indicating that the sorting-based screen did not bias the enrichment.
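To make the notion of a "differentially represented" sgRNA concrete, the sketch below computes a per-guide Log2[Fold Change (mCherry High/mCherry Low)] from raw read counts. It is a deliberately simplified stand-in: the study used DESeq2 and MAGeCK, whereas this example only applies counts-per-million scaling and a pseudocount. The guide names and read counts are hypothetical.

```python
import math


def log2_fold_changes(high_counts, low_counts, pseudocount=1.0):
    """Per-sgRNA log2(High/Low) after scaling each sample to counts per million.

    This is a simplified illustration, not the DESeq2/MAGeCK procedure.
    """
    high_total = sum(high_counts.values())
    low_total = sum(low_counts.values())
    result = {}
    for guide, high_reads in high_counts.items():
        high_cpm = high_reads / high_total * 1e6
        low_cpm = low_counts.get(guide, 0) / low_total * 1e6
        result[guide] = math.log2((high_cpm + pseudocount) / (low_cpm + pseudocount))
    return result


# Hypothetical read counts for one targeting guide and two non-targeting guides.
high = {"sgUSF2_1": 20, "sgNT_1": 490, "sgNT_2": 490}
low = {"sgUSF2_1": 160, "sgNT_1": 420, "sgNT_2": 420}

if __name__ == "__main__":
    for guide, lfc in log2_fold_changes(high, low).items():
        print(f"{guide}\t{lfc:+.2f}")
```

A strongly negative value means the guide is depleted from the mCherry High fraction and enriched in the mCherry Low fraction, which is the pattern expected when the targeted gene is a positive regulator of the reporter; non-targeting guides should stay near zero.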
Interestingly, the most-characterized looping factors, CTCF and YY1, were not enriched in the HOXA9 P2A-mCherry reporter screen (Figure 3B). CTCF was reported to be essential for HOXA9 expression by occupying the boundary sequence between HOXA7 and HOXA9 (CBS7/9) in the MLLr AML cell line MOLM13 (Luo et al., 2018). CRISPR-mediated deletion of the core CTCF-binding motif sequence in CBS7/9 significantly decreased HOXA9 expression and tumor progression (Luo et al., 2018; Luo et al., 2019). Given that CTCF is generally essential for cell survival, it is possible that cells targeted by CTCF sgRNAs in the HOXA9 P2A-mCherry reporter and TF screen quickly dropped out of the population, so that CTCF could not be enriched as a regulator of HOXA9. To address this challenge, we utilized a previously described auxin-inducible degron (AID) cellular system (Morawska and Ulrich, 2013; Natsume et al., 2016; Nora et al., 2017) to acutely deplete the CTCF protein in SEM cells and evaluate the immediate transcriptional response of HOXA9 (Figure 3-figure supplement 2A). Upon acute depletion of CTCF via auxin (IAA) treatment in three CTCF AID bi-allelic knock-in clones, the protein expression of MYC, a previously identified vulnerable gene used as a positive control, was significantly reduced (Figure 3-figure supplement 2B). Moreover, a Cut and Run assay using a CTCF antibody for chromatin immunoprecipitation confirmed loss of CTCF occupancy throughout the HOXA9 locus, including CBS7/9 (Figure 3-figure supplement 2C). However, RNA-seq data (Figure 3-figure supplement 2D-E) and Q-PCR analysis (Figure 3-figure supplement 2F-G) collected from these three clones further confirmed the observation that loss of CTCF occupancy did not correlate with a decrease in HOXA7 or HOXA9 expression at the mRNA level. Instead, long-term depletion of CTCF by auxin for 48 hr slightly increased the transcription of HOXA7 and HOXA9.
Upon washout of auxin from the culture medium for an additional 48 hr, both HOXA7 and HOXA9 expression were restored to levels indistinguishable from those of the parental untreated cells (Figure 3-[...]) [...] (Luo et al., 2018). Collectively, these data further confirmed the results of our CRISPR screening that CTCF is not a key regulator of HOXA9 in MLLr B-ALL SEM cells and likely plays a role in regulating HOXA9 transcription in a cell-type-specific manner.

USF2 is required to maintain HOXA9 expression in MLLr leukemia
Aside from the positive controls confirmed from the CRISPR/Cas9 transcription factor screen in HOXA9 P2A-mCherry cells, the top-ranked candidate among positive regulators was USF2. To further validate the CRISPR screen result and investigate the regulatory effect of USF2 on HOXA9 expression, we individually delivered four lentiviral sgRNAs targeting USF2 exons 1, 2, 7, and 9 into the HOXA9 P2A-mCherry reporter line stably expressing Cas9. Similar to the results seen in sgENL-targeted cells, USF2 knock-down significantly decreased the mCherry fluorescence in a time-dependent manner compared to that of the luciferase sgRNA-targeted control (sgLuc) (Figure 4A and Figure 4-figure supplement 1). Q-PCR and immunoblotting analysis further confirmed the concordant downregulation of both HOXA9 and mCherry (Figure 4B-C). Collectively, these data suggest that USF2 positively controls HOXA9 expression in the MLLr B-ALL SEM cell line.
Figure 3 (legend, continued). [...] The ratios for all sgRNAs targeting HOXA9, USF2, and KMT2A are shown between the mCherry High and mCherry Low sorted populations. NT sgRNAs were overlaid on a gray gradient depicting the overall distribution. NT: 100 sgRNAs. Transcription factors: seven sgRNAs each. The RRA score of each gene was collected from MAGeCK analysis. (E) The overall distribution of all sgRNAs from the HOXA9-MEIS1 overexpressing SEM HOXA9 reporter screening is shown based on the p-value and the DESeq2 score calculated by Log2[Fold Change (mCherry High/mCherry Low)]. NT, HOXA9, USF2 and KMT2A sgRNAs are highlighted by different color codes. (F) The ratios for all sgRNAs targeting HOXA9, USF2, and KMT2A are shown between the mCherry High and mCherry Low sorted populations. NT sgRNAs were overlaid on a gray gradient depicting the overall distribution. NT: 100 sgRNAs. Transcription factors: seven sgRNAs each. The RRA score of each gene was collected from MAGeCK analysis.
USF2 was reported to generally bind to a symmetrical DNA sequence (E-box motif; 5'-CACGTG-3') in a variety of cellular promoters (Henrion et al., 1995). Publicly available ChIP-seq data collected from human ES cells suggested that USF2 can directly bind to the conserved E-box element at both the HOXA7 and HOXA9 promoters (Cheng et al., 2014). A Cut and Run assay was performed in control sgLuc- and sgUSF2-targeted SEM cells to study genome-wide USF2 occupancy. In control SEM cells, USF2 bound to HOXA1, HOXA-AS3, HOXA7, and HOXA9 in the HOXA cluster. Upon USF2 depletion, binding occupancy at these regions was significantly reduced (Figure 4D), further supporting the specificity of the USF2 binding identified by the Cut and Run assay. Taken together, these data suggest that USF2 could regulate HOXA9 expression, as well as that of other HOXA genes, through interactions with regulatory elements at the HOXA cluster gene loci.
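As a small illustration of the E-box motif mentioned above, the following sketch scans a DNA string for CACGTG. Because this hexamer is its own reverse complement, scanning one strand is sufficient. The function name and the example sequence are hypothetical and are not taken from the HOXA locus; real motif analysis would normally use a position weight matrix rather than an exact string match.

```python
def find_ebox_sites(sequence, motif="CACGTG"):
    """Return 0-based start positions of every exact (possibly overlapping)
    occurrence of `motif` in `sequence`, ignoring case."""
    seq = sequence.upper()
    motif = motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    # Made-up sequence with two E-boxes, used only to exercise the function.
    example = "ttgaCACGTGacctCACGTGg"
    print(find_ebox_sites(example))  # [4, 14]
```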

USF2 is an essential gene in MLLr B-ALL by controlling HOXA9 expression
To evaluate the survival dependency of SEM cells on USF2 in an unbiased manner, we conducted a dropout CRISPR/Cas9 screen targeting 1639 transcription factors. SEM cells infected with the pooled library of sgRNAs were collected at day 0 and day 12 and sequenced for sgRNA distribution (Figure 5A). In accordance with prior genome-wide CRISPR screens and functional studies in B-ALL, many survival-dependency genes were identified among the top 50 genes in our screen, including PAX5, DOT1L, ZFP64, YY1, MEF2C, MYC, and KMT2A (Gu et al., 2019; Hyle et al., 2019; Pridans et al., 2008; Lu et al., 2018). USF2 ranked as the 24th most essential gene in MLLr SEM cells (Figure 5B). Taken together, these findings suggest that the USF2/HOXA9 axis might play a role in supporting MLLr B-ALL cell proliferation. To evaluate the importance of the USF2/HOXA9 axis in MLLr B-ALL progression, we sought to investigate the knockout phenotype of USF2 in MLLr B-ALL cells. A competition-based proliferation assay was performed by infecting SEM Cas9 cells with lentiviral-mCherry-sgRNAs against the HOXA9 promoter at ~50% targeting efficiency (Figure 5C). The proportion of mCherry+ cells was monitored over a 12-day time course (days 3, 6, 9, and 12) to investigate the proliferation disadvantage of HOXA9 knock-down cells (Figure 5D). Next, the same assay was performed by infecting SEM Cas9 cells with three individual lentiviral-mCherry-sgRNAs against USF2 (sgRNA-2, -3 and -5) at ~50% infection efficiency. As a result, the proliferation-arrested phenotype was observed in cells targeted by each of the three sgRNAs but not in cells targeted with sgLuc (Figure 5E). Importantly, in SEM cells constitutively expressing ectopic retroviral mouse Hoxa9 (SEM-HOXA9), USF2 knock-down had little effect on cell growth (Figure 5F), suggesting that HOXA9 is a functional and essential downstream gene of USF2 in USF2-mediated leukemia propagation.
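The readout of the competition-based proliferation assay described above is the fraction of mCherry+ (sgRNA-carrying) cells over time. One common way to compare guides is to normalize each time course to its first measurement, as sketched below. The normalization choice, the function name, and all percentages are illustrative assumptions, not data or methods from the study.

```python
def normalize_to_first_timepoint(percent_by_day):
    """Divide each mCherry+ percentage by the value at the earliest day,
    so every time course starts at 1.0."""
    days = sorted(percent_by_day)
    baseline = percent_by_day[days[0]]
    if baseline <= 0:
        raise ValueError("baseline mCherry+ percentage must be positive")
    return {day: percent_by_day[day] / baseline for day in days}


# Hypothetical mCherry+ percentages at days 3, 6, 9, and 12.
sg_luc = {3: 50.0, 6: 49.0, 9: 51.0, 12: 50.0}
sg_usf2 = {3: 50.0, 6: 40.0, 9: 25.0, 12: 12.5}

if __name__ == "__main__":
    print(normalize_to_first_timepoint(sg_luc))
    print(normalize_to_first_timepoint(sg_usf2))
```

In this toy example the control guide stays near 1.0, while the targeting guide falls to 0.25 by day 12, the kind of progressive depletion that indicates a growth disadvantage for the edited cells.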

USF1 and USF2 synergistically regulate HOXA9 expression in MLLr leukemia
Previous studies showed that the USF2 homolog USF1 shares a similar protein structure with USF2 (49, 53). USF1 and USF2 bind to the same type of E-box elements and are also able to form homo- or heterodimers (Kumari and Usdin, 2001; Wang and Sul, 1995; Prasad and Singh, 2008; Spohrer et al., 2017), suggesting that these two proteins may function in synergy to regulate HOXA9. Interestingly, in our HOXA9-reporter-based CRISPR screen, USF1 was also among the top 50 positive regulator genes identified (49th) (Supplementary file 2). To test whether USF1 and USF2 have redundant roles in regulating HOXA9 expression, we co-delivered sgRNAs against USF2 (sgUSF2) and USF1 (sgUSF1) to the SEM HOXA9 P2A-mCherry reporter line stably expressing Cas9. Notably, both flow cytometry and Q-PCR analysis confirmed a significantly greater decrease in HOXA9 expression with double inactivation of USF1 and USF2 than with inactivation of USF2 alone (Figure 5G and Figure 5-figure supplement 1A), which was also supported by a synergistic effect in the competitive proliferation assay (Figure 5H). To further evaluate whether USF2 and USF1 could regulate HOXA9 expression in other MLLr leukemias, sgUSF2 and sgUSF1, alone or in combination, were delivered into the human MLLr AML cell line OCI-AML2, which carries the MLL-AF6 translocation. Similar to observations in SEM cells, CRISPR targeting of USF1 or USF2 notably suppressed HOXA9 expression (Figure 5I). In addition, USF1 and USF2 synergistically regulate HOXA9 expression and leukemia survival in OCI-AML2 (Figure 5J and [...]). To examine whether USF2 regulation of HOXA9 expression was unique to MLLr leukemias, we used two sgRNAs, sgUSF2#2 and sgUSF2#3, to knock down USF2 expression in two human non-MLLr leukemia cell lines, OCI-AML3 and U937, which both express HOXA9. Upon complete USF2 depletion, HOXA9 expression remained unchanged, suggesting that the USF2/HOXA9 axis may function in an MLLr-dependent manner ([...]).

Discussion
HOX genes are a cluster of genes strictly regulated in development by various transcriptional and epigenetic modulators. Dysregulation of HOX genes has been frequently linked to human diseases, particularly cancer. Here, we focus on HOXA9, the aberrant expression of which is one of the most significant features of the most aggressive human leukemias. The HOXA9 P2A-mCherry knock-in MLLr cell line derived in this study fully recapitulated transcriptional regulation of the endogenous gene. Previously, Godwin et al. derived two mouse strains by delivering an in-frame GFP cassette to two different murine Hox genes, Hoxa1 and Hoxc13, to visualize the proteins during mouse embryogenesis (Godwin et al., 1998). Although this previous study certainly added to the repertoire of research tools available to investigate HOXA-related gene expression and gene function, our HOXA9 reporter cell line provides a unique intrinsic cellular model with which to study transcriptional regulation of human HOXA9 directly. Additionally, the CHASE-knock-in protocol developed to generate the HOXA9 reporter is user-friendly, highly efficient, and reproducible, and could be easily adapted to a wide variety of HOXA9-driven human leukemia cell models and other HOXA9-expressing cancer types.
In mammalian cells, each chromosome is hierarchically organized into hundreds of megabase-sized topologically associating domains (TADs) (ENCODE Project Consortium, 2012; Ji et al., 2016; Rowley et al., 2017; Rowley and Corces, 2018), each of which is insulated by boundary elements. Within the TAD scaffold, promoter/enhancer physical contacts intricately regulate gene expression (Pombo and Dillon, 2015). Intra-TAD chromatin interactions can be facilitated by a pair of CTCF-binding sites engaged in contact with each other when they are in a convergent linear orientation (Rao et al., 2014; Vietri Rudan et al., 2015). The HOXA9 cluster is located on the TAD boundary, providing an opportunity to interact with neighboring genomic elements. However, because of the low resolution of publicly available Hi-C data and the lack of DpnI restriction enzyme sites within the HOXA gene cluster that are necessary to generate high-quality 3C libraries, the impact of chromatin interactions on the regulation of HOXA9 remains unclear. Using a chromosome conformation capture-based PCR assay and CRISPR-mediated deletion of a minimal CTCF-binding motif between HOXA7 and HOXA9 (CBS7/9), Luo and colleagues proposed that the CTCF boundary was crucial for higher-order chromatin organization by showing that deletion of CBS7/9 disrupted chromatin interactions and significantly reduced HOXA9 transcription in MLLr AML MOLM13 cells with t(9;11) (Luo et al., 2018; Luo et al., 2019).
Figure 5 (legend, continued). [...] efficiency. The mCherry% was quantified every 3 days by flow cytometry to evaluate the growth disadvantage. (E) Competitive proliferation assay was conducted by infecting SEM Cas9 cells with Lentiviral-mCherry-sgRNAs against luciferase (sgLuc) and USF2 (sgUSF2#2, #3 and #5) at about 50% efficiency. The mCherry% was quantified every 3 days by flow cytometry to evaluate the growth disadvantage. (F) Rescued competitive proliferation assay was conducted by infecting SEM cells overexpressing ectopic Hoxa9 with Lentiviral-mCherry-sgRNAs against luciferase (sgLuc) and USF2 (sgUSF2#2, #3 and #5) at about 50% efficiency. The mCherry% was quantified every 3 days by flow cytometry to evaluate the growth disadvantage. (G) Q-PCR analysis was conducted on the sgUSF2-, sgUSF1- and sgUSF1/2-targeted SEM cells to monitor the reduction of HOXA9. Data shown are means ± SEM from three independent experiments. **p<0.01, two-tailed Student's t test. (H) Competitive proliferation assay was conducted by infecting SEM Cas9 cells with Lentiviral-mCherry-sgLuc, sgUSF1, sgUSF2, and sgUSF1/2 (DKO) at about 50% efficiency. The mCherry% was quantified at days 3, 7, 11, 15, 19, and 23 by flow cytometry to evaluate the growth disadvantage. A guide RNA targeting the survival-essential gene RPS19 was included as a positive control for Cas9 activity. Guide RNAs targeting the luciferase gene (sgLuc) and the human ROSA26 gene (sgROSA26) were included as negative controls. (I) Q-PCR analysis was conducted on the sgUSF2-, sgUSF1- and sgUSF1/2-targeted OCI-AML2 cells to monitor the reduction of HOXA9. Data shown are means ± SEM from three independent experiments. *p<0.05, **p<0.01, two-tailed Student's t test. (J) Competitive proliferation assay was conducted by infecting OCI-AML2 Cas9 cells with Lentiviral-mCherry-sgLuc, sgUSF1, sgUSF2, and sgUSF1/2 (DKO) at about 50% efficiency. The mCherry% was quantified at days 3, 7, 11, 15, 19, and 23 by flow cytometry to evaluate the growth disadvantage. A guide RNA targeting the survival-essential gene RPS19 was included as a positive control for Cas9 activity. Guide RNAs targeting the luciferase gene (sgLuc) and the human ROSA26 gene (sgROSA26) were included as negative controls.
In our study, the loss-of-function results from auxin-inducible degradation of CTCF, siRNA-mediated CTCF knock-down, and the unbiased transcription factor screening suggested that CTCF is not required to maintain HOXA9 expression in MLLr SEM cells with t(4;11). We speculate that the discrepancy could be due to the following reasons. Although both cell lines carry an MLLr translocation as a driver oncogenic mutation, MOLM13 and SEM are classified as AML and B-ALL, respectively. Besides the lineage difference, SEM cells are also less sensitive to many well-known pharmaceutical inhibitors, including JQ1 and DOT1L inhibitors. Therefore, we hypothesize that other, as yet unidentified, looping factors might be involved in the transcriptional regulation of the HOXA9 locus in MLLr SEM cells, and that CTCF regulates HOXA9 expression in a cell-type-specific context.
By performing unbiased CRISPR screens designed to target 1639 known human transcription factors in a HOXA9 P2A-mCherry reporter cell line, we identified USF2 as a novel regulator of HOXA9. In addition, two known HOXA9 regulators, HOXA9 and DOT1L, were identified among the top hits, supporting the sensitivity and reliability of both the reporter system and the CRISPR screening strategy. USF2 is a ubiquitously expressed basic helix-loop-helix-leucine-zipper transcription factor that generally recognizes E-box DNA motifs (Henrion et al., 1995; Groenen et al., 1996; Luo and Sawadogo, 1996). USF1 and USF2 usually form homo- or heterodimers to modulate gene expression (Kumari and Usdin, 2001). Interestingly, USF1 was also enriched in our CRISPR screening. Moreover, the function of USF2 in controlling leukemia progression has not been reported. Although our study identified the regulatory function of USF1/USF2 in HOXA9 maintenance and leukemia cell survival in MLLr B-ALL and AML cell lines, other HOXA9-independent functions of USF1/2 cannot be excluded and require further study.
In summary, we revealed that candidate transcription factors identified from the CRISPR/Cas9 screen, including USF2 and USF1, regulate HOXA9, thereby providing a more comprehensive understanding of how the HOXA9 locus is regulated in human cancer cells. Given the well-recognized role of HOXA9 in hematopoietic malignancies, we anticipate that the HOXA9 reporter cells will advance many lines of investigation, including drug screening and the identification of concordant epigenetic modifiers and transcription factors that are required for activation and maintenance of HOXA9 expression in leukemia progression. Collectively, these efforts would clarify the molecular mechanisms underlying aberrant HOXA9 activation in leukemias, thus providing the foundation to develop clinically relevant therapies that target the expression and/or function of HOXA9 in leukemia patients.

Generation of a HOXA9 P2A-mCherry reporter allele
SEM and OCI-AML2 cells were electroporated by using the Nucleofector-2b device (Lonza) with the V-kit and program X-001. For HOXA9 P2A-mCherry knock-in delivery, 2.5 μg of the donor plasmid and 2.5 μg of the CRISPR/Cas9-HOXA9-C-terminus-sgRNA all-in-one plasmid were used for 5 million SEM cells. Twenty-four hours after transfection, cells were sorted for the GFP fluorescent marker linked to the Cas9 expression vector to enrich the transfected cell population. After the sorted cells recovered in culture for up to 3 weeks, a second sort was performed to select cells with successful knock-in by sorting for cells expressing the knock-in mCherry fluorescent marker. Two weeks later, a third sort was performed to again select mCherry-expressing cells.

Characterization of successful knock-in events by PCR and Sanger sequencing
DNA from single-cell-derived bacterial or cell colonies was extracted with a Quick-DNA Miniprep Kit (Zymo #D3025). Combinatorial primer sets designed to recognize the 5′ and 3′ knock-in boundaries were used with the following PCR cycling conditions: 98°C for 2 min, followed by 40 cycles of 98°C for 30 s and 68°C for 60 s. The sequences for genotyping primers are provided in Supplementary file 1. After electrophoresis, the bands of the expected size were cut out, purified, and sequenced with two specific primers (Supplementary file 1).

CRISPR library construction and screening
A set of ~11,000 sgRNA oligos targeting 1639 human transcription factors was designed for array-based oligonucleotide synthesis (CustomArray). Unique binding of each sgRNA was verified by a BLAST sequence search against the whole human genome. For the pooled sgRNA library, seven sgRNAs against each of the 1639 human transcription factors were obtained from previously published, validated sgRNA libraries (Doench et al., 2016; Sanjana et al., 2014; Ma et al., 2015; Tzelepis et al., 2016; Hart et al., 2015; Hart et al., 2017; Smith et al., 2008; Park et al., 2017). The synthesized oligo pool was amplified by PCR and cloned into the LentiGuide-Puro backbone (Addgene #52963) by In-Fusion assembly (Clontech #638909). The HOXA9 P2A-mCherry reporter cell line was transduced with lentiviral Cas9 and then infected with the pooled sgRNA library at a low multiplicity of infection (MOI, ~0.3). Infected cells were selected with blasticidin and puromycin and later sorted into mCherry High and mCherry Low populations between days 10-12. The sgRNA sequences were recovered by genomic PCR and deep sequencing on a MiSeq with single-end 150 bp reads (Illumina). The primer sequences used for cloning and sequencing are listed in Supplementary file 1. The sgRNA sequences are described in Supplementary file 2. High-titer lentivirus stocks were generated in 293T cells as previously described (Vo et al., 2017).
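
The library size and the rationale for a low MOI can be checked with a short calculation. The following is a minimal sketch, assuming that viral integrations per cell follow a Poisson distribution (a common approximation, not a statement from the protocol above); the per-sgRNA coverage value `COVERAGE` is a hypothetical number chosen only for illustration.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k integrations at a given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def infection_summary(moi: float) -> dict:
    """Fractions of uninfected and infected cells, and of single integrants among infected cells."""
    p0 = poisson_pmf(0, moi)
    p1 = poisson_pmf(1, moi)
    infected = 1.0 - p0
    return {
        "uninfected": p0,
        "infected": infected,
        "single_among_infected": p1 / infected,
    }


# Library dimensions taken from the text: 7 sgRNAs for each of 1639 factors.
N_FACTORS = 1639
GUIDES_PER_FACTOR = 7
library_size = N_FACTORS * GUIDES_PER_FACTOR

summary = infection_summary(0.3)

# Hypothetical target coverage (infected cells per sgRNA); not from the source.
COVERAGE = 500
cells_to_infect = math.ceil(library_size * COVERAGE / summary["infected"])

print(f"library size: {library_size} sgRNAs")
print(f"infected fraction at MOI 0.3: {summary['infected']:.3f}")
print(f"single integrants among infected: {summary['single_among_infected']:.3f}")
print(f"cells to infect for {COVERAGE}x coverage: {cells_to_infect}")
```

Under this approximation, roughly a quarter of the cells are infected at an MOI of 0.3, and most infected cells carry a single sgRNA, which is what allows a sorted phenotype to be attributed to one guide.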

Data analysis of CRISPR screening
The raw FASTQ data were de-barcoded and mapped to the original reference sgRNA library. Differentially enriched sgRNAs were defined by comparing normalized counts between sorted cells in the top 10% and those in the bottom 10% of the mCherry-expressing bulk population. Two independent replicate screens were performed with the HOXA9 P2A-mCherry reporter cell line stably expressing Cas9. Normalized counts for each sgRNA were extracted and used to identify differentially enriched sgRNAs with DESeq2. The combined analysis of the seven sgRNAs against each human transcription factor was conducted by using the MAGeCK algorithm. Detailed screening results are included in Supplementary file 2.
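
The logic of comparing the two sorted populations can be illustrated with a simplified calculation. The sketch below is not DESeq2 or MAGeCK: it assumes counts-per-million normalization, a log2 ratio of mCherry Low over mCherry High counts, and a median across sgRNAs as the gene-level score. The sgRNA names, the `GENE_sgN` naming scheme, and all counts are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def normalize_cpm(counts: dict) -> dict:
    """Scale raw sgRNA read counts to counts per million reads."""
    total = sum(counts.values())
    return {guide: count * 1e6 / total for guide, count in counts.items()}


def log2_low_over_high(high: dict, low: dict, pseudocount: float = 1.0) -> dict:
    """Per-sgRNA log2 ratio; positive values mean enrichment in mCherry Low cells."""
    high_n, low_n = normalize_cpm(high), normalize_cpm(low)
    return {
        guide: math.log2((low_n[guide] + pseudocount) / (high_n[guide] + pseudocount))
        for guide in high
    }


def gene_scores(guide_lfc: dict) -> dict:
    """Median log2 ratio over all sgRNAs of a gene (assumes names like GENE_sg1)."""
    by_gene = defaultdict(list)
    for guide, lfc in guide_lfc.items():
        by_gene[guide.split("_")[0]].append(lfc)
    return {gene: median(values) for gene, values in by_gene.items()}


# Hypothetical toy counts for two sorted populations.
mcherry_high = {"USF2_sg1": 10, "USF2_sg2": 20, "CTRL_sg1": 100, "CTRL_sg2": 110}
mcherry_low = {"USF2_sg1": 80, "USF2_sg2": 60, "CTRL_sg1": 50, "CTRL_sg2": 50}

scores = gene_scores(log2_low_over_high(mcherry_high, mcherry_low))
for gene, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{gene}: {score:+.2f}")
```

In this toy example, sgRNAs against a positive regulator of the reporter accumulate in the mCherry Low gate and receive a positive score. Dedicated tools additionally model count dispersion and estimate statistical significance, which this sketch does not attempt.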

Fluorescence imaging and analysis
DMSO at 0.1% (vehicle control) or 10 doses of SGC0946 on a half-log scale (0.3 nM to 10 μM) were first dispensed into 384-well plates (in quadruplicate, four wells per dose). Suspension-cultured SEM cells were immediately plated into the 384-well plates (20,000 cells/well). Six days after drug treatment, the cells were fixed with 4% paraformaldehyde for 10 min at room temperature, followed by Hoechst staining for 15 min at room temperature. Fluorescence images (Hoechst and mCherry) were taken with a CellVoyager 8000 high-content imager (Yokogawa). The acquired images were processed by using the Columbus Image Data Storage and Analysis system (Perkin Elmer) to count the number of positive cells and measure fluorescence intensity. To determine the changes in mCherry intensity in SEM cells expressing HOXA9 P2A-mCherry, we measured the average mCherry intensity of four fields per well and normalized it to that of the vehicle-treated (0.1% DMSO) control. Wild-type SEM cells with no fluorescence were included as negative controls.
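
The dose ladder and the vehicle normalization can be written out explicitly. This is a minimal sketch, assuming exact half-log steps (a factor of 10^0.5 per dose) starting at 0.3 nM and a simple mean over the four fields of each well; the intensity values are hypothetical.

```python
from statistics import mean


def half_log_series(start_nm: float, n_doses: int) -> list:
    """Doses in nM, each a factor of 10**0.5 above the previous one."""
    return [start_nm * 10 ** (i / 2) for i in range(n_doses)]


def normalize_to_vehicle(treated_wells: list, vehicle_wells: list) -> list:
    """Mean field intensity of each treated well divided by the mean vehicle intensity."""
    vehicle_mean = mean(mean(fields) for fields in vehicle_wells)
    return [mean(fields) / vehicle_mean for fields in treated_wells]


doses_nm = half_log_series(0.3, 10)
# Exact half-log steps end near 9,487 nM, conventionally rounded to a nominal 10 uM top dose.
print([round(dose, 1) for dose in doses_nm])

# Hypothetical mCherry intensities: four wells per condition, four fields per well.
vehicle = [[100, 102, 98, 100]] * 4
treated = [[50, 52, 48, 50]] * 4
relative_intensity = normalize_to_vehicle(treated, vehicle)
print(relative_intensity)
```

Ten half-log steps from 0.3 nM span four and a half orders of magnitude, so the top of the series sits in the micromolar range.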

Cut and Run assay
Cut and Run assay was conducted following the protocol described previously (Skene and Henikoff, 2017). In brief, three million cells were collected for each sample. The USF2 antibody (NBP1-92649, Novus) was used at a 1:100 dilution. Library construction was performed using the NEBNext UltraII DNA Library Prep Kit from NEB (E7645S). Indexed samples were run using the Illumina NextSeq 300-cycle kit. Cut and Run raw reads were mapped to the hg19 genome with Bowtie 2 (version 2.3.4) using default parameters. The mapping files were converted to .bw (bigWig) files with bamCoverage (Langmead and Salzberg, 2012; Ramírez et al., 2014).

Flow cytometry
Suspension-cultured SEM and OCI-AML2 cells were collected by centrifugation at 800 × g, filtered through a 70 μm filter, and analyzed for mCherry on a BD FACS Aria III flow cytometer with a negative control. 4′,6-diamidino-2-phenylindole (DAPI) staining was conducted prior to sorting to exclude dead cells.

Inhibitor treatment
SEM and OCI-AML2 cells were seeded at a density of 1 × 10⁵ cells/mL in medium supplemented with DMSO vehicle or different doses (from 0.5 μM to 15 μM) of the DOT1L inhibitor SGC0946 (MedChemExpress #HY-15650). Medium was replaced every three days, and fresh inhibitor was added. At day 6 post-treatment, cells were collected for flow cytometry analysis and RNA extraction.

Fluorescence in situ hybridization
An ~800 bp purified P2A-mCherry DNA fragment was labeled with a red-dUTP (AF594, Molecular Probes) by nick translation, and a HOXA9 BAC clone (CH17-412I12/7p15.2) was labeled with a green-dUTP (AF488, Molecular Probes). Both labeled probes were combined with sheared human DNA and hybridized to fixed interphase and metaphase nuclei derived from each sample by using routine cytogenetic methods in a solution containing 50% formamide, 10% dextran sulfate, and 2× SSC. The cells were then stained with DAPI and analyzed.

Quantitative real-time PCR
Total RNA was collected by using TRIzol (Thermo Fisher Scientific #15596026) or a Direct-zol RNA Miniprep Kit (Zymo #R2052). Reverse transcription was performed by using a High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems #4374966). Real-time PCR was performed by using FAST SYBR Green Master Mix (Applied Biosystems #4385612) in accordance with the manufacturer's instructions. Relative gene expression was determined by using the ΔΔCT method (Schmittgen and Livak, 2008). All Q-PCR primers used in this study are listed in Supplementary file 1.
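
For reference, the comparative CT calculation can be expressed in a few lines. The sketch below assumes approximately 100% amplification efficiency for both the target and the reference gene, which is the standard assumption of the method; the CT values and the function name are hypothetical.

```python
def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Fold change of a target gene in a sample relative to a control (2^-ddCT)."""
    # Normalize the target gene to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    # Compare the two conditions.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical CT values: the target crosses threshold two cycles later in the
# treated sample, while the reference gene is unchanged.
fold_change = relative_expression(
    ct_target_sample=26.0,
    ct_reference_sample=18.0,
    ct_target_control=24.0,
    ct_reference_control=18.0,
)
print(fold_change)
```

A two-cycle delay at unchanged reference CT corresponds to a four-fold reduction in relative expression.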

Competitive proliferation assay
To evaluate the impact of USF2 sgRNAs on leukemia cell expansion, cell cultures were lentivirally transduced with individual USF2 sgRNAs in an mCherry-expressing vector, and the percentage of mCherry-positive cells was measured by flow cytometry at various days post-infection. The mCherry-positive percentage at each time point was normalized to that at day 3; a decline over time was used to infer a defect in cell accumulation conferred by a given sgRNA targeting USF2 relative to the uninfected cells in the same culture.
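
The day 3 normalization can be sketched as follows. The time points match those used in the competitive proliferation assay reported for OCI-AML2 cells (days 3 to 23); the mCherry percentages and the sgRNA labels attached to them are hypothetical values for illustration only.

```python
def normalize_to_day3(percent_by_day: dict) -> dict:
    """Divide each mCherry-positive percentage by the day 3 value of the same culture."""
    baseline = percent_by_day[3]
    return {day: percent / baseline for day, percent in sorted(percent_by_day.items())}


# Hypothetical mCherry-positive percentages for a control and a targeting sgRNA.
sg_control = {3: 50.0, 7: 50.0, 11: 49.0, 15: 50.0, 19: 51.0, 23: 50.0}
sg_target = {3: 50.0, 7: 40.0, 11: 30.0, 15: 20.0, 19: 15.0, 23: 10.0}

control_curve = normalize_to_day3(sg_control)
target_curve = normalize_to_day3(sg_target)

for day in target_curve:
    print(f"day {day:>2}: control {control_curve[day]:.2f}, target {target_curve[day]:.2f}")
```

A normalized value that stays near 1 indicates that infected and uninfected cells expand at similar rates, whereas a steady decline indicates that cells carrying the sgRNA are outcompeted.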

Statistics
All values are shown as the mean ± SEM. Statistical analyses were performed with GraphPad Prism software, version 8.0. p-Values were calculated by performing a two-tailed t-test.

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Data availability
All plasmids created in this study will be deposited to Addgene. Raw data collected from Cut and Run were deposited at NCBI GEO (GSE140664). Raw data collected from CRISPR screening were included in Supplementary file 2. Publicly available datasets used in this study were cited accordingly including Figures