Polycomb-mediated genome architecture enables long-range spreading of H3K27 methylation

Significance
The relationship between long-range Polycomb-associated chromatin contacts and the linear propagation of histone H3 lysine 27 trimethylation (H3K27me3) by Polycomb repressive complex 2 (PRC2) is not well characterized. Here, we nominate a role for developmental loci as genomic architectural elements that enable long-range spreading of H3K27me3. Polycomb-associated loops are disrupted upon loss of PRC2 binding, and deletion of loop anchors alters H3K27me3 deposition and causes ectopic gene expression. These results suggest that Polycomb-mediated genome architecture is important for gene repression during embryonic development.


HiChIP data processing
HiChIP data were processed as described previously (1). Briefly, paired-end reads were aligned to the hg38 or mm10 genome using the HiC-Pro pipeline (2) (version 2.11.0). Default settings were used to remove duplicate reads, assign reads to MboI restriction fragments, filter for valid interactions, and generate binned interaction matrices. Loops were identified with the HiCCUPS tool from the Juicer pipeline and with FitHiChIP (3,4). For HiCCUPS, filtered read pairs from the HiC-Pro pipeline were converted into .hic format files and run with default settings.
Dangling-end, self-circularized, and re-ligation read pairs were merged with valid read pairs to create a 1D signal bed file. FitHiChIP was used to identify "peak-to-all" interactions at 10 kb resolution using peaks called from the one-dimensional HiChIP data. A lower distance threshold of 20 kb was used. Bias correction was performed using coverage-specific bias. 1D signal bed files were converted to bigwig format for visualization using deepTools bamCoverage (version 3.3.1) with the following parameters: --bs 5 --smoothLength 105 --normalizeUsing CPM --scaleFactor 10 (5). Enrichment of 1D signal at ChIP-seq peaks was computed using deepTools computeMatrix (version 3.3.1) (5). TAD and A/B compartment annotations were obtained from a previously published mESC Hi-C dataset (6). Gene ontology enrichment at loop anchors was performed using GREAT (7) (version 4.0.4).
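
To make the "peak-to-all" criterion and the 20 kb lower distance threshold concrete, the following is a minimal Python sketch of the candidate-selection step only. It is an illustration under stated assumptions, not FitHiChIP's implementation: the function name, the input layout, and the choice to measure distance between bin start coordinates are all hypothetical, and the statistical modelling and bias correction that FitHiChIP performs are not shown.

```python
from collections import Counter

BIN_SIZE = 10_000      # 10 kb resolution, as stated in the text
MIN_DISTANCE = 20_000  # lower distance threshold, as stated in the text


def peak_to_all_counts(read_pairs, peak_bins,
                       bin_size=BIN_SIZE, min_distance=MIN_DISTANCE):
    """Count cis read pairs per bin pair, keeping pairs with at least one peak bin.

    read_pairs: iterable of (chrom, pos1, pos2) intrachromosomal read pairs
                (hypothetical input layout)
    peak_bins:  set of (chrom, bin_start) for bins overlapping a 1D peak
    """
    counts = Counter()
    for chrom, pos1, pos2 in read_pairs:
        # Snap both ends to the start coordinate of their bin.
        b1 = (min(pos1, pos2) // bin_size) * bin_size
        b2 = (max(pos1, pos2) // bin_size) * bin_size
        # Assumption: distance is measured between bin starts.
        if b2 - b1 < min_distance:
            continue
        # "Peak-to-all": at least one of the two bins must overlap a peak.
        if (chrom, b1) in peak_bins or (chrom, b2) in peak_bins:
            counts[(chrom, b1, b2)] += 1
    return counts
```

In this sketch a bin pair is retained when either end overlaps a peak, in contrast to a "peak-to-peak" scheme, which would require both ends to overlap peaks.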

Virtual 4C
Virtual 4C plots were generated from contact matrices dumped with Juicebox. The Juicebox tools dump command was used to extract the chromosome of interest from the HiChIP .hic file or from the Hi-C .hic file of Bonev et al. (6), obtained from the 4D Nucleome Data Portal. E11.5 limb bud Hi-C data (8) were obtained from the Gene Expression Omnibus (GEO) (GSE116794) and aligned to the mm10 genome using the HiC-Pro pipeline (2) (version 2.11.0). The interaction profile of the bin containing the anchor was then plotted in R at the indicated resolution, after scaling by the total number of filtered reads in each experiment.
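
The extraction of a virtual 4C profile from a dumped matrix can be sketched as follows. This is a minimal Python illustration rather than the R code used for the figures. It assumes the dumped matrix is available as sparse (bin1_start, bin2_start, count) triplets with each bin pair listed once, and it interprets "scaling" as a simple division by the total filtered read count; the function and argument names are hypothetical.

```python
def virtual_4c(records, anchor_pos, resolution, total_filtered_reads):
    """Return {partner_bin_start: scaled_count} for the bin containing anchor_pos.

    records: iterable of (bin1_start, bin2_start, count) sparse triplets
             (assumed layout of the dumped matrix, one entry per bin pair)
    """
    # Start coordinate of the bin that contains the anchor.
    anchor_bin = (anchor_pos // resolution) * resolution
    profile = {}
    for bin1, bin2, count in records:
        # The anchor may appear on either side of a triplet.
        if bin1 == anchor_bin:
            partner = bin2
        elif bin2 == anchor_bin:
            partner = bin1
        else:
            continue
        # Assumption: scaling is a plain division by the filtered read total.
        scaled = count / total_filtered_reads
        profile[partner] = profile.get(partner, 0.0) + scaled
    # Sort by genomic coordinate so the profile can be plotted directly.
    return dict(sorted(profile.items()))
```

Dividing by the number of filtered reads puts profiles from experiments of different sequencing depth on a comparable scale before they are overlaid.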

Cut&Tag data processing
Cut&Tag data were processed as described previously (9) with the following modifications.

ChIP-seq data processing
ChIP-seq data were obtained from the Gene Expression Omnibus (GEO) using the following accession numbers: mESC EZH2 ChIP-seq (14) (GSE23943), EED cage mutant mESC

4C-seq data processing
4C-seq data were processed using pipe4C (22) and aligned to the hg38 genome using the reading primer sequences listed in SI Appendix, Table S1.

ORCA data analysis
For all analyses in Figure 2 and supp. Figure 2, ORCA data were processed to calculate absolute distances between all barcodes for all cells.
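
The all-by-all distance calculation can be sketched as below. This is a minimal Python illustration, not the analysis code used for the figures. It assumes each cell is represented by one (x, y, z) coordinate per barcode, that undetected barcodes are recorded as None, and that "absolute distance" means Euclidean distance in the units of the input coordinates; all names are hypothetical.

```python
import math


def pairwise_distances(cell):
    """Return an n x n matrix of Euclidean distances between barcodes in one cell.

    cell: list with one entry per barcode, either an (x, y, z) tuple or
          None when the barcode was not detected (assumed convention)
    """
    n = len(cell)
    # Start with NaN everywhere so missing barcodes stay marked as missing.
    dist = [[math.nan] * n for _ in range(n)]
    for i in range(n):
        if cell[i] is None:
            continue
        for j in range(n):
            if cell[j] is None:
                continue
            dist[i][j] = math.dist(cell[i], cell[j])
    return dist


def all_cell_distances(cells):
    """Apply pairwise_distances to every cell and return a list of matrices."""
    return [pairwise_distances(cell) for cell in cells]
```

Keeping missing barcodes as NaN, rather than dropping them, preserves a fixed matrix shape across cells, so per-pair summaries can be computed over whichever cells have both barcodes detected.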