3′HS1 CTCF binding site in human β-globin locus regulates fetal hemoglobin expression

Mutations in the adult β-globin gene can lead to a variety of hemoglobinopathies, including sickle cell disease and β-thalassemia. An increase in fetal hemoglobin expression throughout adulthood, a condition named hereditary persistence of fetal hemoglobin (HPFH), has been found to ameliorate hemoglobinopathies. Deletional HPFH occurs through the excision of a significant portion of the 3′ end of the β-globin locus, including a CTCF binding site termed 3′HS1. Here, we show that the deletion of this CTCF site alone induces fetal hemoglobin expression in both adult CD34+ hematopoietic stem and progenitor cells and HUDEP-2 erythroid progenitor cells. This induction is driven by the ectopic access of a previously postulated distal enhancer located in the OR52A1 gene downstream of the locus, which can also be insulated by the inversion of the 3′HS1 CTCF site. This suggests that genetic editing of this binding site can have therapeutic implications to treat hemoglobinopathies.


Introduction
The human β-globin locus consists of five globin genes embedded in the olfactory receptor cluster. During early development, these globin genes undergo gene switching from embryonic ε-globin (HBE) to fetal γ-globin (HBG1/2) and finally to adult β-globin (HBB). Inherited mutations in the HBB gene lead to dysfunction of the adult β-globin protein, causing hemoglobinopathies (Bauer et al., 2012). The symptoms of these disorders, including sickle cell disease and β-thalassemia, can be alleviated by persistent expression of fetal hemoglobin (hereditary persistence of fetal hemoglobin [HPFH]) throughout adulthood, which compensates for the mutant adult β-globin (Bank, 2006;Hassell, 2010). As such, multiple genome-editing strategies have been proposed to mimic HPFH as a treatment for hemoglobinopathies (Bauer et al., 2012;Breda et al., 2016;Sankaran et al., 2008;Sankaran et al., 2011;Sankaran et al., 2009;Traxler et al., 2016;Xu et al., 2011;Ye et al., 2016;Zeng et al., 2020). Two types of HPFH have been identified based on patient genetics. First is the non-deletional HPFH caused by point mutations in the BCL11A binding site at the HBG1/2 promoters, and disruption of this transcriptional repressor binding leads to the activation of these genes (Forget, 1998;Liu et al., 2018;Martyn et al., 2018;Traxler et al., 2016). Second is the deletional HPFH that consists of the excision of a large genomic region within the β-globin locus, frequently including HBB and HBD    (Liu et al., 2018) is shown in the lower panel. Black cycle indicates the position of loops previously identified (Huang et al., 2017). Yellow dotted line indicates the three sub-TAD domains identified previously (Huang et al., 2017). HPFH1-7 deletion is illustrated and 3′HS1 is marked in blue shade. (B). The scheme of CTCF binding motif orientation engineering in HUDEP-2 cells. (C-E) In situ Hi-C contact map around β-globin gene cluster in HUDEP-2 cells of wild type (C), 3′HS1 deletion (D), and 3′HS1 inversion (E). CTCF CUT&RUN tracks of WT (Liu et al., 2018), 3′HS1 deletion and 3′HS1 inversion HUDEP-2 cells are shown on the top of corresponding Hi-C plots. All loops that called in the HUDEP2 cells of three genotypes are marked with circles of different colors. (F) The HiCCUPS quantification of loops interaction strength by q value in β-globin locus. Dotted line annotates q = 0.1. n.d.: not detected by HiCCUPS (q value > 0.1).
The online version of this article includes the following figure supplement(s) for figure 1:  Next, we evaluated the expression of the β-globin genes and found that the HBG1/2 and HBE genes upregulated 2.5-to 8-fold in the Δ3′HS1 clones ( Figure 1G and H). In contrast, the inversion of 3′HS1 resulted in a >50% reduction of HBE and near-complete depletion of HBG1/2 ( Figure 1H). Most notably, the increase in HBG1/2 upon deletion of the 3′HS1 CTCF site leads to a significant increase in fetal hemoglobin HbF ( Figure 1I). Consequently, we evaluated the clones for HbF+ cells by flow cytometry. Consistent with transcription level, we observed an increase in HbF+ cells from 4.3 % in the Cas9 control clone to 37.8% and 53.1% in Δ3′HS1 B6 and D3 clones, respectively ( Figure 1J). Meanwhile, inversion of 3′HS1 resulted in a decrease of HbF+ cells to below 1% . In contrast, deletion of the upstream HS5 CBS did not induce nor abrogate the quantity of HbF+ cells ( Figure 1J). All these clones are well differentiated at the same stage when HbF is measured by flow cytometry (Figure 1figure supplement 3).
We then performed ATAC-seq and H3K27ac ChIP-seq in the Δ3′HS1 and 3′HS1 inversion clones to examine the regulatory landscape of the β-globin gene cluster. Following 3′HS1 CBS deletion, we observed significant open chromatin at the HBG1/2, BGLT3, and HBBP1 genes (Figure 2-figure supplement 1A). There was also a significant increase in activating H3K27ac in the HBG2 gene body. Interestingly, these epigenetic changes upon deletion of 3′HS1 CBS do not occur at the promoter of HBG2, which suggests that this 3′HS1-dependent regulation is independent from BCL11A transcriptional regulation (Figure 2-figure supplement 1A).
To determine if other regulatory pathways, such as the transcriptional repressor BCL11A, are involved in the upregulation of γ-globin expression in Δ3′HS1 and 3′HS1 inversion clones, we performed RNAseq to identify differentially expressed genes in these clones. We found 161 upregulated and 153 downregulated genes in the Δ3′HS1 clones with HBG1/2 genes being the most significantly upregulated as well as β-globin cluster genes HBE1, HBBP, and BGLT3 (Figure 2A and C). In the 3′HS1 inversion clones, we identified only 3 upregulated genes and 51 downregulated genes ( Figure 2B). In these clones, we observed downregulation of HBG2 and upregulation of the nearby OR52A5 gene; however, we did not observe any change in the expression of known γ-globin regulators (BCL11A, ZBTB7A (LRF), ELF2AK1 (HRI), ATF4, ZNF410, and NFIX) ( Figure 2B and D, Supplementary file 1; Grevet et al., 2018;Masuda et al., 2016). We further verified BCL11A protein level by immunoblotting and observed no significant reduction in Δ3′HS1 clones ( Figure 2, Figure 2-source data 1). Taken together, we concluded that BCL11A does not contribute to the induction of HbF in the edited HUDEP-2 cells. To further elucidate the role of BCL11A in the regulation of HBG1/2, we performed targeted disruption of a known BCL11A gene enhancer in both WT and Δ3′HS1 HUDEP-2 clones (  Bauer et al., 2013). This mutation resulted in a significantly lower BCL11A expression, and we observed further increase of HBG1/2 expression level and HbF+ cells ( Figure 2F, Figure 2-figure supplement 1C and F). Pomalidomide was found to boost the level of HbF in adult erythroblasts by destabilizing the BCL11A protein in cells (Grevet et al., 2018). Therefore, we treated HUDEP-2 3′HS1 deletion clones with pomalidomide and observed a significant reduction in BCL11A protein. This led to further increase in expression of fetal globin ( These results show that genetic editing of the regulatory cis-element (3′HS1 CBS) accentuates with depletion of the transcriptional repressor BCL11A. This further indicates that the γ-globin activation in 3′HS1 deletion HUDEP-2 cells was not driven by the BCL11A-associated pathways. The double disruption of BCL11A and 3′HS1 also led to similar level of γ-globin activation as the disruption of BCL11A alone ( Figure 2F). This data suggests that 3′HS1-regulated γ-globin repression might be hypostatic to the BCL11A-mediated γ-globin repression.
Deletional HPFH has been proposed to be the result of distal enhancer juxtaposition in the region downstream of 3′HS1, and several HPFH enhancers have been identified (Forget, 1998). The reduction of HbF in the 3′HS1 inversion HUDEP-2 clones suggests that potential enhancers may be located between 3′HS1 and 3′-OR52A5-CBS. These enhancers may be insulated by the loop formed between     2C and D), it suggests that the HBG1/2 expression increase is controlled by a cis-element other than the LCR. We then analyzed the ATAC-seq, GATA1 ChIP-seq data together to search for potential cis-regulatory elements. We found the OR52A1 region bound by the GATA1 transcriptional activator in the erythroid lineages (Corces et al., 2016;Feingold and Forget;Liu et al., 2018; Figure 3A). Importantly, the mapping of a previously described HPFH enhancer encompassed both OR52A1 gene and GATA1 binding site (Elder et al., 1990;Feingold and Forget, 1989). We proceeded to delete the GATA1 binding site within the HPFH enhancer site and the entire HPFH enhancer marked by ATAC-seq in Δ3′HS1 clones ( Figure 3B). We found that both deletions reduced the HBG1/2 expression and HbF+ cell percentage in these cells, suggesting that the HPFH enhancer contributes to the activation of HBG1/2 ( Figure 3C-E). We further evaluated whether there were direct interactions between HBG1/2 and HPFH enhancer after the deletion of 3′HS1 by virtual 4C (v4C) in our Hi-C dataset. We observed mildly increased v4C signal in HBG2 promoter region of Δ3′HS1 cells compared with WT and 3′HS1 inversion clones when HPFH  Source data 1. The immunoblot data of BCL11A, ZBTB7A, β-actin, β-globin, and γ-globin of clones displayed in Figure 2.          , and erythroblast is shown in the zoomed view for the OR52A1 region. Red-shaded area indicates the locus of OR52A1. HPFH 3' beak and δβ-thalassemia 3' break is annotated (Feingold and Forget, 1989). (B) The experimental scheme . Vice versa, we also observed mildly increased v4C signal in HPFH enhancer region of Δ3′HS1 cells compared with WT and 3′HS1 inversion clones when HBG2 promoter region acts as viewpoint (Figure 3-figure supplement 1A). These data suggest that the deletion of 3′HS1 might have induced more frequent interaction between HPFH enhancer and HBG2 promoter regions. Previously, mouse and human β-globin locus and its surrounding regions have been shown to be evolutionally conserved (Bulger et al., 2003). More interestingly, previous report showed that the disruption of 3′HS1 site in mouse did not result in any change in the β-like globin gene expression (Bender et al., 2006). We hypothesized that the alteration of HPFH enhancer sequence might contribute to the different outcome of 3′HS1 deletion in mouse and human. Therefore, we evaluated the evolutionary conservation of OR52A1 in mammals and found mouse and rat homolog of human OR52A1-Olfr68 bears two single-nucleotide substitutions right at the core binding site of GATA1 ( Figure 3F). When we further checked the GATA1 ChIP-seq in mouse erythroid cells, we also found the absence of binding in the OR52A1's homolog Olfr68 and its surrounding region ( Figure 3G). This data suggests that the mouse olfactory receptor region 3′ to β-globin genes no longer bears GATA1 binding sites and enhancer activity. Overall, the loss of GATA1 binding in mouse clearly explains the difference between mouse and human on the effect of 3′HS1 deletion on globin gene expression.
Previously, it has been proposed that the juxtaposition of HPFH enhancer results in the activation of γ-globin. We wonder if the chromosomal distance between HPFH enhancer and globin genes also contributes to the regulation of globin gene expression. We used paired guide RNA to delete a 48 kb region between HPFH enhancer and 3′HS1, to access the effect of distance between enhancer and target genes in the presence of chromosomal insulators (Figure 3-figure supplement 1B). We obtained a heterozygous clone bearing this 48 kb deletion (Figure 3-figure supplement 1B). We found that moving the HPFH enhancer to the proximity of globin locus mildly increases the HBG1/2 gene expression by threefold with increase of HBE1 gene and reduction of HBB gene (Figure 3figure supplement 1C and D). Overall, the data suggests that both chromosomal distance and insulator contribute to the low expression of HBG1/2, but chromosomal insulators are dominant to insulate HPFH enhancer to access the globin genes.
To assess the therapeutic potential of 3′HS1 deletion in primary HSPCs, we performed the 3′HS1 and HS5 CBS deletions in adult mobilized peripheral blood CD34+ HSPC from three different donors ( Figure 4A). We achieved high deletion percentage of both CBSs in these primary cells ( Figure 4B, Figure 4-source data 1). Upon differentiation, we observed a robust increase in HbF+ cells across all three donors with 3′HS1 deletion but not with HS5 deletion ( Figure 4C and D). The normal erythroid differentiation was not affected by either CBS deletion (Figure 4-figure supplement 1A and B). While the disruption of the 3'HS1 CBS in primary patient cells did not yield as great of an effect on HbF+ cells as in HUDEP-2 cells, the results do support the involvement of a downstream cis-acting regulatory element on the HBG1/2 genes.
Finally, based on our data, we propose a model where the 3′HS1 CBS modulates the HPFH enhancer's access to the HBG1/2 genes ( Figure 4E).

Discussion
Here, we show that the deletion of a CBS -3′HS1-in the human β-globin locus can phenocopy HPFH. This condition is driven by alteration of the 3D genomic organization around the β-globin of hereditary persistence of fetal hemoglobin (HPFH) deletion in the 3'HS1 deletion background. (C) The composition of β-like globin Δ3′HS1 (clone B6) HUDEP-2 cells with GATA1 binding site and HPFH region deletion. Mean ± SD is displayed, n = 3. (D) Relative expression of HBE, HBG (probe measures both HBG1 and HBG2), and HBB in the Δ3'HS1 (clone B6) HUDEP-2 cells with GATA1 binding site and HPFH enhancer region deletion. Mean ± SD is displayed, n = 3. (E) The representative HbF flow plot of Δ3'HS1 (clone B6) HUDEP-2 cells with GATA1 binding site and HPFH enhancer region deletion. (F) Evolution conservation of OR52A1 GATA1 binding site in vertebrates. GATA1 binding motif is shown in the middle. The site in mouse and rat associated with human GATA1 binding is boxed out. (G) Chromatin landscape of mouse β-globin gene cluster in mouse erythroid cells MEL and G1-ER4. CTCF, GATA1, and TAL1 ChIP-seq is shown. Orange stripe highlights the mouse homolog of human OR52A1-Olfr68.
The online version of this article includes the following figure supplement(s) for figure 3:     locus, which allows the long-range interaction of a distal enhancer in the OR52A1 gene to drive the expression of HBG1/2. We further show that inversion of 3′HS1 insulates the enhancer element and further suppresses HBG1/2. Previously, induced LCR to HBG1/2 interaction showed the importance of high-order chromatin structure in the regulation of globin gene expression (Breda et al., 2016;Deng et al., 2014). However, the role of CTCF protein bound around the gene cluster was not clear. Our study reveals how CTCF binding at this locus modulates the accessibility of the fetal HBG1/2 genes to a downstream enhancer. In the HPFH enhancer scenario, 3′HS1 limits the HPFH enhancer access to HBG1/2 by forming the sub-TAD with 5′HS ( Figure 4E; Oudelaar et al., 2021). When the 3′HS1 CBS is deleted, the HPFH enhancer gains access to HBG1/2 without the hinder of 3′HS1-HS5 loop. When the 3′HS1 CBS motif is inverted, the HPFH enhancer is further restricted by the pairing of 3′-OR52A5-CBS to the inverted 3′HS1 CBS, which results in the strong chromosomal loop formation between the two CBSs. This insulation leads to the reduced HBG1/2 expression and upregulation of OR52A5 ( Figures  2B and 4E).
Furthermore, we have elucidated the function of long speculated HPFH enhancer in the induction of HbF with our Δ3′HS1 cells. However, we suspect that other regulatory elements also exist between 3′HS1 and 3′-OR52A5-CBS since we only get a partial decrease of HBG1/2 expression with HPFH enhancer deletion in Δ3′HS1 cells. This may suggest that other previously identified HPFH enhancers also contribute to HbF induction in the Δ3′HS1 cells (Feingold and Forget, 1989;Forget, 1998). Indeed, other GATA1 binding sites can be observed between OR52A1 and OR52A5 genes of erythroid progenitor cells. Overall, these observations suggest that GATA1 binding across the locus works collectively to activate HBG1/2 expression with or without direct promoter-enhancer interactions. Interestingly, despite the evolution conservation of OR52A1 gene in mammals, TF binding disrupting nucleotide substitutions occur at the GATA1 binding site in HPFH enhancer in some mammal species that do not express distinct form of fetal hemoglobin (mouse and rat) ( Figure 3F and G). This evolutionary alteration in the TF binding site suggests that HPFH enhancer may also play a role in regulating globin expression in other developmental stages. The consistent binding of GATA1 in high HBE/HBGexpressing K562 cells and high HBB-expressing HUDEP-2 cells suggests that the potential function of HPFH enhancer may still need the modulation of 3D genomic reorganization.
Despite our data suggesting that deletion of 3′HS1 is sufficient to induce the γ-globin activation, yet, the upregulation does not completely phenocopy the HPFH condition, in which the HbF is expressed at pancellular level. The deletion of 3′HS1 only induces a portion of cells to be F+ cells, with an interesting differentiation block phenotype in a large portion of clones we have selected ( Figure 1J). Although HUDEP-2 is known to be heterogenous in clonal level, the F+ cell phenotype of 3′HS1-deleted cells suggests that the full HPFH phenotype may require the deletion of HBB and HBD genes to erase the strong enhancer-promoter interaction between LCR and HBB. Furthermore, our results indicate that the distance between HPFH enhancer and the globin locus could also play a role in regulating γ-globin gene expression. This data hints that the deletion of both 3′HS1 and the flanking region between 3′HS1 and HPFH enhancer may activate the fetal hemoglobin to an even higher level. Larger deletions will result in the reorganization of chromosomal loop interactions as well as the decrease of physical distance from HPFH enhancer to globin locus. The enhancers have to locate at certain range of distance from the target gene promoters to exert the maximal activation effect (Jessica Zuin et al., 2021). Therefore, a method that can create large deletion that disrupts 3′HS1 and reduces enhancer-promoter distance simultaneously could be a potential gene therapy to effectively activate the γ-globin expression in hemoglobinopathies.
The differentiation stage of HUDEP-2 cell clones used in Figure 1 was profiled by flow cytometry of CD71 and CD235a (Figure 2-figure supplement 1-source data 1). The immunoblot data of plot for HbF+ cells at day 21 in 3′HS1-deleted and HS5-deleted PBMC HSPC. The data is from donor #1. (E) The model of fetal hemoglobin regulation through 3′HS1.

The online version of this article includes the following figure supplement(s) for figure 4:
Source data 1. The gel picture of paired guide deletion for HS5 and 3′HS1 in HSPC.

Continued on next page
Key resources

Tissue culture of cell lines
Our HUDEP-2 cell lines were directly obtained from the cell bank of Riken Institute, Japan, The provider validates the cell line before shipment. Additionally, the cell line has been validated by HiC sequencing on the chromosome copy number (Figure 1-figure supplement 2C). We also validate that all cell lines used in the study are free from mycoplasma contamination. We routinely test the mycoplasma contamination status by PCR. K562 cells were maintained in RPMI medium with 10% FBS. HUDEP clone 2 (HUDEP-2) cells were cultured as previously described (Kurita et al., 2013). Cells were expanded in StemSpan serum-free expansion medium supplemented with 1 μM dexamethasone (D2915, Sigma), 1 μg/mL doxycycline (D9891, Sigma), 50 ng/mL human SCF, 3 units/mL EPO, and 1% penicillin/streptomycin. HUDEP-2 cells were differentiated in a two-phase differentiation protocol consisting of IMDM supplemented with 5% human AB serum, 10 μg/mL recombinant human insulin, 330 μg/mL holo-transferrin, 3 units/mL EPO, 1 μg/mL doxycycline, 2 units/mL heparin, and 1% penicillin/streptomycin. 50 ng/mL human SCF was included in phase 1 of the culture (days 1-3) and withdrawn in phase 2 of the culture (days 4 and beyond). Samples were collected on day 5 for flow cytometry, RNA extraction, immunoblot, HiC, ATAC-seq, and HPLC analyses.

CRISPR/Cas9-mediated deletion and homologous recombination
Cas9 nuclease (1081058, IDT) were mixed with synthetically modified sgRNAs (synthesized by Synthego) at a 1:3 molar ratio in resuspension buffer T and incubated at 37 °C for 10 min to form ribonucleoprotein complexes (RNPs). For the CTCF inversion clones, homology-directed repair (HDR) approach was used by adding 2 μg of dsDNA homology repair template (synthesized by Gene Universal) to the RNP complexes. dsDNA repair templates were designed to have 90-130 bp of homology arms distal and proximal to the PAM sequence.
For CD34+ cells, about 2.5 × 10 5 cells were harvested 24 hr after thawing, washed in HBSS (Gibco, 14170112), resuspended in RNP complexes, and electroporated at 1600 V and 3 pulses of 10 ms using the Neon Transfection system (Thermo Fisher Scientific). For HUDEP-2 cells, 2 × 10 5 were harvested, resuspended in RNP complexes, and electroporated at 1300 V and 1 pulse of 20 ms using the Neon Transfection system. 24 hr after electroporation, cells were harvested to assay for deletion or inversion. Genomic DNA was extracted using DirectPCR lysis reagent (102T , Viagen) followed by proteinase K treatment at 55 °C. PCR was performed using the EconoTaq PLUS 2X master mix (30033-2, Lucigen) with the following cycling conditions: 95 °C for 2 min; 45 cycles of 95 °C for 30 s, 55 °C for 30 s, 72 °C for 40 s; 72 °C for 5 min. Amplicons were purified using the Zymoclean Gel DNA recovery kit and Sanger sequenced.
The sequence of guide RNA and other oligo sequence used to generate guideRNA, genotyping deletion, and HDR template is listed in Key resources table.

Flow cytometry analysis of fetal hemoglobin protein expression
Upon differentiation, the expression of fetal hemoglobin was analyzed by intracellular flow cytometry staining. Briefly, 50,000 cells were fixed and permeabilized in CytoFast Fix/Perm buffer set (426803, BioLegend), and incubated with FITC-conjugated anti-Human Fetal hemoglobin antibody (Miltenyl Biotec,clone # REA533) in the dark at room temperature for 15 min. In addition, phenotypic characterization of cells upon differentiation was done by cell-surface antigens staining with PE anti-human CD71 (334105, BioLegend) and APC anti-human CD235a (561775, BD Biosciences) monoclonal antibodies for 30 min at 4 °C. For CD34+ cells,Miltenyi Biotec) was also used to assess differentiation. Cells were analyzed using CytoFLEX S flow cytometer, and FlowJo cytometry software was used for data visualization.

RNA isolation and quantitative PCR with reverse transcription (RT-qPCR)
RNA was extracted from 1 to 2 million cells using TRIzol (Invitrogen) followed by phenol-chloroform extraction. Reverse transcription reactions were performed with random hexamers using iScript (Bio-Rad). BCL11A, HBB, HBD, HBG1/2, and HBE mRNAs were quantified by SsoAdvanced Universal SYBR Green Supermix (1725272, Bio-Rad) and run on a CFX96 Touch Real-Time PCR Detection System (Bio-Rad). The sequence of oligos is listed in key resources table.
Hemoglobin HPLC 1 million HUDEP-2 cells were harvested and washed with PBS two times. After the harvest of the cells, cells were snap frozen at -80 °. Frozen cell pellets were collected for HPLC at the University of Michigan Hospital.

RNA-sequencing
Total RNA from HUDEP-2 cells upon 5 days of differentiation were isolated using TRIzol followed by phenol-chloroform extraction. Sequencing libraries were prepared from 500 ng of total RNA using Swift Biosciences rapid RNA library kit following the manufacturer's protocol. Libraries were sequenced in the Illumina HiSeq 4000 platform to generate paired-end reads of 2 × 150 bp.
Paired-end reads were trimmed with trim galore and aligned to hg19 genome with STAR (v2.7.0f) using the default parameters with the following modifications:(1) sjdbOverhang was set to sequence length -1 as recommended in the STAR manual, (2) twopassMode was set to Basic, (3) outReadsUnmapped was set to None, (4) outSAMtype was set to BAM SortedByCoordinate, and (5) quant-Mode was set to GeneCounts. Gene expression was quantified using STAR's built-in and counts were imported into R (v4.0) using readDGE function to produce DGE list object (Dobin et al., 2013;Robinson et al., 2010).
Differential expression (DE) analysis was performed in R with the edgeR package (v3.30.3). A paired DE analysis was performed to assess changes between groups (3′HS1 deletion or inversion versus WT). Normalization factors and effective library size were applied. Dispersion was estimated using the "estimateDisp" function, with the design matrix as: ~ replicate + group, where "replicate" refers to biological replicates of each sample and "group" refers to the individual clones of deletion or inversion of 3′HS1 and WT. Likelihood ratio test was performed for differential expression with the "glmFit" and "glmLRT" functions. The list of DE genes was further filtered by setting p-value < 0.01 and absolute value of log2 fold-change >1.
Paired-end reads were trimmed using Trim Galore (version 0.6.1) and aligned to hg19 using Burrows-Wheeler Aligner (bwa-mem, version 0.7.17). The resulting alignments were sorted, indexed using SAMtools (version 1.9), and marked for duplicates with samblaster (version 0.1.24). Reads were then normalized using deeptools bamCoverage with RPGC parameter and visualized with IGV.

CUT&RUN
CTCF CUT&RUN was performed according to published protocol (Meers et al., 2019). Briefly, upon 5 days of differentiation, 500,000 HUDEP2 cells were immobilized with BioMag Plus Concanavalin A (BangsLabs, Inc). Cells were permeabilized and incubated with CTCF antibody (ab70303, Abcam) overnight. After washing away unbound antibody, pA-MNase (a gift from Dr. Steven Henikoff) was then added to the cells and incubated for 5 min in a metal block on ice. The MNase reaction was stopped, chromatin was released by diffusion at 37° C for 30 min, and DNA was extracted. Swift Biosciences Accel-NGS 2S Plus DNA library kit was used to construct NGS libraries. Libraries were sequenced on the Illumina HiSeq 4000 platform to generate paired-end reads of 2 × 150 bp.
Paired-end sequencing reads were trimmed using Trimmomatic (version 0.36) and aligned to hg19 genome assembly using Bowtie2 (version 2.3.5) with the parameter "--dovetail --phred33". The resulting alignments were indexed, sorted, and marked for duplicates with Picard "MarkDuplicates" function. Reads were then normalized using deeptools bamCoverage with RPGC parameter and visualized with IGV.
Reads were aligned to the human genome (hg19) using Bowtie2 (version 2.3.5). The resulting alignments bam files were indexed, sorted with SAMtools, and marked for duplicates with Picard "MarkDuplicates" function. Reads were then normalized using deeptools bamCoverage with RPGC parameter and visualized with IGV.
Chromatin conformation capture (HiC and capture HiC)

Hi-C library preparation
Approximately 2 million cells of each HUDEP-2 clone were differentiated for 5 days and fixed in 1% formaldehyde. Hi-C libraries were generated using the Arima-HiC kit, according to the manufacturer's protocols. Libraries were prepared using the Accel-NGS 2S Plus kit (Swift Biosciences, 21096), with single indexing kit 2S Set A (Swift Biosciences, 26148). The final amplification cycle numbers for each library were determined by qPCR in the QC2 step of the Arima protocol. Quantification of the libraries was performed with KAPA Library Quantification Kit (Roche, KK4824). The libraries were then pooled for sequencing on NovaSeq S4 to get between 300 and 400 million reads each.

Probe design
The capture probes were designed as previously described (Sanborn et al., 2015). 3822 oligos of probe sequence covering Chr11:4665299-5954156 with adaptor sequence on both end as follows: ATCGCACCAGCGTGT N120 CACTGCGGCTCCTCA was synthesized by GeneScripts. The biotinylated RNA probes specific to the β-globin locus were made as described. Briefly, the desired sequences were amplified out of the pool with primer sequences complementary to both ends using KAPA HiFi HotStart MasterMix for 12 cycles. The oligonucleotides were then prepared for in vitro transcription by adding a T7 promoter to the forward primer and amplifying for a further 15 cycles. In vitro transcription with Biotin-16-UTP was performed with NEB HiScribe T7 for 2 hr at 37 °C. The template DNA was degraded by adding 1 μL DNase and incubating at 37 °C for a further 15 min, and then stopping the reaction by adding 1 μL 0.5 M EDTA. The RNA was then purified using Zymo RNA Clean & Concentrator columns, eluting in 15 μL elution buffer. We then added 1 U/μL RNase Inhibitor (NEB, M0314), aliquoted, and stored at -80 °C until needed for the capture.
Capture 150-500 ng of the previously created Hi-C libraries were diluted to 25 μL and mixed with 2.5 μg Cot-1 DNA and 10 μg salmon sperm DNA and heated to 95 °C for 5 min, then held at 65 °C for at least 5 min. 33 μL of prewarmed (to 65 °C) hybridization buffer (10× SSPE, 10× Denhardt's buffer, 10 mM EDTA, and 0.2% SDS), along with 6 μL of RNA probe mixture (500 ng biotinylated-RNA probes and 20 U RNase inhibitor) were added to the DNA mixture and hybridized for 24 hr at 65 °C. After the hybridization incubation was complete, 50 μL of Streptavidin T1 beads (Dynabeads, Life Technologies) were washed in Bind-and-Wash buffer (1 M NaCl, 10 mM Tris-HCl, pH 7.5, and 1 mM EDTA), resuspended in 134 μL of the same buffer, and then added to the hybridization mixture. The beads were allowed to bind to the biotinylated, hybridized DNA and RNA mixture for 30 min at room temperature before separating and discarding the supernatant. The bead-bound DNA was then washed once with low-stringency wash buffer (1× SSC, 0.1% SDS), for 15 min at room temperature, and three times with high-stringency wash (0.1× SSC, 0.1% SDS) for 10 min at 65 °C, separating the beads on a magnet each time before discarding the supernatant. After the last wash, the beads were resuspended in 21 μL nuclease-free water. 1 μL was diluted 1:1000 and had qPCR performed as for QC2 of the Arima protocol to determine the number of cycles to amplify the enriched libraries.