Polycomb contraction differentially regulates terminal human hematopoietic differentiation programs

Lifelong production of the many types of mature blood cells from less differentiated progenitors is a hierarchically ordered process that spans multiple cell divisions. The nature and timing of the molecular events required to integrate the environmental signals, transcription factor activity, epigenetic modifications, and changes in gene expression involved are thus complex and still poorly understood. To address this gap, we generated comprehensive reference epigenomes of 8 phenotypically defined subsets of normal human cord blood. We describe a striking contraction of H3K27me3 density in differentiated myelo-erythroid cells that resembles a punctate pattern previously ascribed to pluripotent embryonic stem cells. Phenotypically distinct progenitor cell types display a nearly identical repressive H3K27me3 signature characterized by large organized chromatin K27-modification domains that are retained by mature lymphoid cells but lost in terminally differentiated monocytes and erythroblasts. We demonstrate that inhibition of polycomb group members predicted to control large organized chromatin K27-modification domains influences lymphoid and myeloid fate decisions of primary neonatal hematopoietic progenitors in vitro. We further show that a majority of active enhancers appear in early progenitors, a subset of which are DNA hypermethylated and become hypomethylated and induced during terminal differentiation. Primitive human hematopoietic cells display a unique repressive H3K27me3 signature that is retained by mature lymphoid cells but is lost in monocytes and erythroblasts. Intervention data indicate that control of this chromatin state change is a requisite part of the process whereby normal human hematopoietic progenitor cells make lymphoid and myeloid fate decisions.


Background
Epigenetic modifications govern local chromatin activity and support activation or silencing of gene transcription through the regulation of chromatin structure and DNA accessibility [1][2][3]. Numerous cell differentiation processes are accompanied by obligatory changes in chromatin structure mediated by proteins that specifically modify chromatin and thereby establish and maintain defined transcriptional regulatory states [4][5][6][7][8]. However, little is known about the details of epigenomic programming changes that promote or determine lineage restriction in normal tissues, or how such changes may contribute to the initiation and completion of terminal differentiation programs.
*Correspondence: mhirst@bcgsc.ca; Canada's Michael Smith Genome Science Centre, BC Cancer, Vancouver, Canada
Hematopoiesis refers to the general process by which the different types of short-lived, mature blood cells are produced throughout life [9]. The accessibility of the cells directly involved has made this system an attractive one for identifying the types of epigenomic changes that characterize this process, the sequence in which they occur, and those that are necessary [10][11][12][13]. At birth, human cord blood (CB) has been particularly useful for such studies, because it contains a self-maintaining population of hematopoietic stem cells (HSCs) that regenerate the entire system in transplanted receptive hosts lifelong as well as a full spectrum of derivative progenitors with reduced proliferative capacity and different lineage options. These progenitor cells then undergo further restriction of their proliferative and lineage potentials to eventually allow the initiation of a single lineage program and the production of one of the many different types of mature blood cells [14,15]. A complex coordinated action between a network of transcription factors (TFs) and chromatin controls cellular specification and differentiation. TF-mediated chromatin state changes are accomplished through interactions between TFs and epigenetic modifiers [2]. Prior efforts have identified a series of lineage-specific TFs like PU.1 (in the myeloid lineage) and GATA1 (in the erythroid lineage) [16] as drivers of hematopoietic lineage differentiation. Furthermore, previous studies have suggested that the process of lineage restriction involves the epigenetic demarcation of specific genomic regions that make them accessible to particular TFs to ultimately activate single lineage-specific gene expression programs [10,11,17]. However, neither the molecular details of these steps nor their dynamics in relation to cell phenotype changes have been delineated.
Previous studies using chromatin immunoprecipitation sequencing (ChIP-seq) of sites of permissive histone modifications and transposase-accessible chromatin sequencing (ATAC-seq) have provided insight into the identity of regulatory regions in the genomes of various phenotypically defined primitive subsets of mouse and human hematopoietic cells that change during their differentiation [10][11][12]. Together, these findings have suggested a model in which certain enhancers are initially "primed" in primitive multi-potent hematopoietic cells by an acquisition of "permissive" histone 3 lysine 4 mono-methylation (H3K4me1) modifications, followed by a gain of additional histone 3 lysine 27 acetylation (H3K27ac) modifications by the same histones to enable terminal differentiation programs to become activated [10]. However, the potential existence of "stage-specific" as well as lineage-specific changes in the chromatin as an integral part of the process [18,19] has remained uncharacterized, particularly in human hematopoiesis.
To obtain such information, we isolated 8 previously well-characterized, phenotypically defined subsets of normal human CB cells at high purity [18,19] and generated detailed genome-wide datasets for permissive and repressive histone modifications, sites of DNA methylation, and transcriptomes from them. Four of these were highly enriched in cells with progenitor activity but different lineage output capabilities and together constitute the bulk of all of the CD34+ CB cells. They consisted of a CD38− subset, which contains the most primitive cells and all HSCs, and 3 distinct subsets within the CD38+ subpopulation of CD34+ cells, the so-called common myeloid progenitors (CMPs), granulocyte-macrophage progenitors (GMPs), and megakaryocyte-erythroid progenitors (MEPs). The 4 "mature" cell types analyzed were low-density, circulating monocytes, erythroblasts, B cells, and T cells.
Analysis of repressive chromatin states in the 4 (CD34+) progenitor populations showed these all shared a nearly identical H3K27me3 signature (gene promoter Spearman correlation R > 0.97). This signature included large organized chromatin K27-modification domains (LOCKs) that were also present in the B and T cell isolates, but absent from the monocytes and erythroblasts. The LOCKs present in all of the progenitor fractions examined as well as in the mature lymphoid cells were also found to be co-marked with H3K9me3 and located within lamin-associated domains. The CpGs within the B and T cells also resembled the CD34+ progenitor fractions in their general hypermethylation status in comparison to the CpGs of the co-isolated monocytes and erythroblasts, consistent with the acquisition of an altered heterochromatic state by the latter two cell types. Analysis of the enhancer landscape of these 8 populations revealed that a majority of traditionally defined active enhancers found in differentiated cell types were already evident within their progenitors in a hypermethylated state, and that those unique to terminally differentiated cells were found almost exclusively within the boundaries of super-enhancers. From these findings, we propose a model in which a genome-wide contraction of heterochromatin is a critical step in the process by which human hematopoietic progenitor cells with lympho-myeloid potential lose their lymphoid potential.

Identification of a chromatin signature shared by progenitors and mature lymphoid cells but not mature myeloid cells
We first undertook a comprehensive mapping of the epigenetic and transcriptional states of historically defined immunophenotypes of human CB cells using our low input ChIP-seq protocol [20,21] to identify H3K4me3, H3K4me1, H3K27me3, H3K27ac, H3K36me3, and H3K9me3 sites genome-wide, plus whole genome bisulfite sequencing and RNA-seq protocols following International Human Epigenome Consortium (IHEC) standards [22] (Fig. 1A and Additional file 1: Fig. S1A). A standardized analytical pipeline was then applied to qualify and analyze the resulting data (see the "Methods" section).
RNA expression profiles for each of the progenitor populations analyzed (CD34+CD38− cells, CMPs, GMPs, and MEPs) were more highly correlated with one another (Spearman R > 0.92) than with any of the 4 more mature CB cell types examined; i.e., monocytes, erythroblasts, and B and T cells (average Spearman R = 0.76, 0.83, 0.84, and 0.84, respectively, Fig. 1B). These confirm relationships also evident in previously published datasets [4] (Additional file 1: Fig. S1B). Expression signatures derived for each progenitor subset were also in agreement with previously published features of these same subsets (see the "Methods" section and Additional file 1: Fig. S2A and B). For example, pathway and gene enrichment analysis of uniquely upregulated transcripts in GMPs showed these were enriched in terms related to leukocyte differentiation, inflammation response, and regulation of immune response (Benjamini q-value <10e−20, Additional file 1: Fig. S2C), and included the CD135 cell surface marker. Transcripts uniquely upregulated in MEPs were enriched in terms related to myeloid lineage differentiation (Benjamini q-value <10e−3) (Additional file 1: Fig. S2D). Genes upregulated in CD34+CD38− cells and CMPs as compared to the 4 populations of more differentiated cells examined were enriched in terms related to hematopoiesis regulation and differentiation (Benjamini q-value <10e−3) (Additional file 1: Fig. S2E and F).
Examination of the chromatin state of the 8 different cell types examined here showed H3K4me3 and H3K36me3 densities correlated with expected transcript levels and cell-type-specific signatures (Figs. 1C, E-G, S2G and H). H3K4me3 densities at promoters of genes that were expressed at different levels in each of the different progenitor subsets, or between them and the 4 later cell types, also showed expected relationships (Additional file 1: Fig. S3A). In contrast, H3K27me3 occupancy was nearly identical across all of the 4 different progenitor subsets (Fig. 1D, H; promoter Spearman R > 0.97), and cell-type-specific signatures were apparent only in the mature cell types (Fig. 1E, F). However, the H3K27me3 densities at promoters in the progenitor fractions were more significantly correlated with those in the lymphoid cells as compared to either the monocytes or the erythroblasts (Fig. 1D). In fact, RNA expression, H3K4me3 and H3K27me3 signatures at promoters in the erythroid precursor population all showed the lowest correlation (average Spearman R < 0.56) with the corresponding features in all 4 progenitor populations, including the MEPs (Fig. 1D). Moreover, a comparison of the H3K27me3 densities within the promoters of genes that were differentially expressed in GMPs and MEPs showed these were not significantly different (Additional file 1: Fig. S3B). In contrast, genes whose expression appeared downregulated in monocytes and erythroblasts compared to progenitor subsets showed a significant gain of H3K27me3 density at the corresponding promoters (2-sided t-test, p < 2.2 × 10^−16).
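The promoter-level comparisons above rest on Spearman rank correlation between pairs of cell types. As a minimal sketch of that calculation (not the pipeline used in this study), the following computes Spearman R from two vectors of promoter densities. The function names and the toy density values are illustrative assumptions, not data from the paper.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman R: the Pearson correlation of the rank-transformed vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical H3K27me3 densities at the same five promoters (toy values).
cmp_density = [0.1, 2.5, 7.0, 0.4, 3.3]
gmp_density = [0.2, 2.1, 6.5, 0.3, 3.9]
ery_density = [5.0, 0.3, 0.1, 4.2, 0.9]

print(spearman(cmp_density, gmp_density))  # same promoter ordering
print(spearman(cmp_density, ery_density))  # largely reversed ordering
```

Because the statistic depends only on ranks, it is insensitive to differences in sequencing depth that rescale densities monotonically, which is one reason it suits comparisons across independently prepared libraries.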

Terminally differentiated monocytes and erythroblasts exhibit a genome-wide contraction of H3K27me3 density
Examination of the patterns of H3K27me3 occupancy in the chromatin of the 4 different types of mature blood cells analyzed (Fig. 1D, H) showed both the monocytes and erythroblasts contained a significantly reduced overall frequency of H3 histones that were methylated at K27 (30-52%) (Additional file 1: Fig. S3C). This was explained in part by the observation of a pronounced contraction of the extensive contiguous stretches of chromatin (domains) containing the H3K27me3 mark characteristic of the 4 progenitor populations. The remaining H3K27me3 in the monocytes and erythroblasts displayed a more punctate structure reminiscent of that seen in pluripotent cell types [23] (Fig. 2A-E). Immunoprecipitated (IP) fragment distributions at promoters [21] (± 2 Kb of transcription start sites (TSS)) were also significantly different in the monocytes and erythroblasts in comparison to the 4 progenitor subsets or to either the mature B or T cells present in the same samples (Kolmogorov-Smirnov test, p < 7 × 10^−12; Fig. 2C). Examination of H3K27me3 distributions in all of these cell phenotypes also revealed a decrease in the proportion of H3K27me3-marked histones outside the promoters in the monocytes and erythroblasts compared to the progenitors from which they are thought to be most immediately derived (GMPs, Additional file 1: Fig. S3D; and MEPs, Additional file 1: Fig. S3F), without a measurable change in H3K4me3 occupancy (Additional file 1: Fig. S3E and G).
To examine the functional consequence of the global alteration in H3K27me3 occupancy, we examined the levels of H3K27me3 within gene bodies in relation to transcript levels in all 8 cell types (Fig. 2F). Consistent with the genome-wide patterns, the monocytes and erythroblasts showed a loss in H3K27me3 density on more genes (1418 and 1980, respectively) than the B and T cells (942 and 873, respectively) when both sets were compared to the most primitive CD34+CD38− progenitor compartment (Fig. 2F). In addition, loss of H3K27me3 correlated with a greater proportion of upregulated genes in monocytes and erythroblasts compared to the B and T cells (Fig. 2G). Among the genes that lost H3K27me3 and were upregulated were well-known markers of monocyte and erythroid lineage differentiation including CD14, EPB42, and CD36 (Fig. 2H-J).
In addition to the differential gene-specific losses of H3K27me3 in the monocytes and erythroblasts, there was also an associated loss of many large organized chromatin K27me3 domains (LOCKs) [24] (Fig. 3A, B) that were, nevertheless, generally retained in the B and T cells (Figs. 3A, B and S3H). A majority (average 87%) of the few remaining LOCKs identified in the monocytes and erythroblasts overlapped with LOCKs present in the 4 progenitor populations (Fig. 3B). One hundred nine LOCKs were lost in both monocytes and erythroid precursors and 38 LOCKs were retained in both. Even within the LOCKs retained in the monocytes and erythroblasts, H3K27me3-marked regions showed evidence of contraction, with an average width of 52 Kb compared to 268 Kb seen in related progenitor subsets (GMPs and MEPs, Fig. 3C-E).
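The two summary measures in the preceding paragraph, the fraction of LOCKs in one cell type that overlap LOCKs in another and the mean width of the marked regions, can be illustrated with a short sketch. It assumes regions are given as half-open (chromosome, start, end) tuples and treats any shared base as an overlap. Both choices, along with the function names and coordinates, are assumptions made for illustration and are not the LOCK-calling procedure of the study.

```python
Interval = tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(query: list[Interval], reference: list[Interval]) -> float:
    """Fraction of query regions that overlap at least one reference region."""
    if not query:
        raise ValueError("query must contain at least one region")
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)


def mean_width_kb(regions: list[Interval]) -> float:
    """Mean region width in kilobases."""
    if not regions:
        raise ValueError("regions must contain at least one region")
    return sum(end - start for _, start, end in regions) / len(regions) / 1000


# Toy coordinates, chosen only to make the arithmetic easy to follow.
progenitor_locks = [("chr1", 0, 200_000), ("chr2", 0, 300_000)]
monocyte_locks = [("chr1", 50_000, 90_000), ("chr2", 10_000, 70_000), ("chr3", 0, 50_000)]

print(fraction_overlapping(monocyte_locks, progenitor_locks))  # 2 of 3 regions overlap
print(mean_width_kb(progenitor_locks), mean_width_kb(monocyte_locks))
```

The quadratic scan over region pairs is adequate for a few hundred domains. Genome-scale peak sets would normally be handled with sorted intervals or an interval tree.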
The corresponding punctate signature observed in monocytes and erythroblasts is reminiscent of a chromatin state initially annotated in human embryonic stem cells (ESCs) that is lost following their differentiation [23]. This reversion of differentiated monocytes and erythroblasts to a chromatin state previously noted in undifferentiated ESCs was thus unexpected and prompted us to quantify the similarities between the LOCKs present in all the CB cell types analyzed and ESCs. This showed that a majority of the H3K27me3-marked regions within the progenitor LOCKs that were also present in the monocytes and erythroblasts were likewise marked by H3K27me3 in the ESCs (Fig. 3F). These data thus provide evidence of a genome-wide contraction of H3K27me3 density during the process by which certain CB progenitors generate mature progeny (Fig. 3G) that appears to be lacking in the lymphoid restriction process.

H3K27me3 LOCKs are co-marked with H3K9me3 and enriched in lamina-associated domains (LADs)
To further characterize the H3K27me3 LOCKs lost during the terminal differentiation of monocytes and erythroblasts, we next examined the co-occurrence of other histone modifications and DNA methylation in these same regions. For this, we applied ChromHMM [25] to generate an 18-state model based on H3K4me1, H3K4me3, H3K27me3, H3K27ac, H3K36me3, and H3K9me3 occupancy across all of the 8 CB cell types examined (Fig. 4A). H3K9me3 was the most highly enriched mark within H3K27me3-defined LOCKs in all 8 cell types (Fig. 4B, C). Indeed, H3K9me3 is significantly higher at polycomb-enriched regions within the LOCKs compared to the regions outside the LOCKs in progenitor and lymphoid populations, suggesting co-occurrence of H3K27me3 and H3K9me3 is a key characteristic of these structures (Additional file 1: Fig. S4F). Furthermore, H3K27me3-enriched regions with or without H3K9me3 were found to be enriched in lamina-associated domains (LADs) in both the progenitor subsets and in the B and T cells, but not in monocytes or erythroblasts (  with laminB binding sites (> 55%) and these appeared to be specifically lost in monocytes and erythroblasts (Fig. 4F, G). In addition, H3K9me3 occupancy, like that of H3K27me3, was significantly reduced in the monocytes and erythroblasts compared to the progenitor subsets (2-sided t-test, p < 2.2 × 10^−16, Fig. 4H). H3K9me3 is also highly correlated between the 4 progenitor populations compared to any of the 4 mature cell types studied and there was a directional loss of H3K9me3 occupancy specifically in monocytes and erythroblasts (Additional file 1: Fig. S4B and C). Consistent with the genome-wide reduction in heterochromatic states exhibited by the monocytes and erythroblasts, these two cell types showed reduced CpG methylation in their LOCKs compared to the 4 progenitor subsets and the B and T cells (2-sided t-test, p < 5.5 × 10^−8; Fig. 4I).
Erythroblasts particularly, but also the monocytes, showed reduced DNA methylation more broadly in chromatin states identified as polycomb-repressed and/or heterochromatin-repressed in the ChromHMM model (Fig. 4J and K and Additional file 1: Fig. S4D and E). However, unlike H3K9me3, DNA methylation levels at polycomb-enriched regions were not significantly higher inside LOCKs as compared to polycomb-enriched regions outside LOCKs (Additional file 1: Fig. S4G). These findings are consistent with a previously reported reduction in DNA methylation of monocytes and neutrophils relative to different progenitor populations [13]. They also corroborate a previously reported relationship between DNA methylation and H3K9me3 densities [26][27][28]. In the present context, they also suggest that human CD34+ hematopoietic progenitors share a higher order chromatin structure that is associated strongly with LADs and is enriched in sites of H3K27me3 and H3K9me3.

H3K27me3 contraction is associated with reduced BMI1 expression
We next asked whether there were any similarities in the expression of polycomb group (PcG) proteins in monocytes and erythroblasts in comparison to ESCs. B cell-specific Moloney murine leukemia virus integration site 1 (BMI1), a component of the polycomb repressive complex 1 (PRC1), demonstrated the most significant reduction in expression across all 3 of these cell types compared to the other 6 types profiled here (Fig 5A, B and Additional file 1: S5A), a feature conserved in their mouse counterparts (Fig 5C) [10]. Transcriptional repression of BMI1 has been previously associated with a loss of both H3K27me3 and H3K9me3, and a concomitant reduction in heterochromatin [29,30]. Likewise, inhibition of BMI1 expression in vivo compromised hematopoietic progenitor self-renewal in mice [31,32]. These observations suggest that the contraction of H3K27me3 observed in monocytes and erythroblasts and shared with ESCs is associated with reduced expression of BMI1, a PRC1 complex member previously implicated in the maintenance and spreading of H3K27me3.
To investigate the role of BMI1 expression in H3K27me3 maintenance, we first identified HL60 as a cell line model that mirrored the H3K27me3 LOCKs seen in primary CD34+ hematopoietic progenitors. We next examined the ability of HL60 cells to maintain a progenitor-like H3K27me3 state at LOCKs (Fig. 3A) in the presence or absence of BMI1. We utilized CRISPR/Cas9 to target the BMI1 allele in two independent biological replicates and validated its knockout using a T7 endonuclease mismatch cleavage assay and at the protein level by western blot (Additional file 2: Table S1 and Additional file 1: Fig S5B- [29,30]. ChIP-seq confirmed a genome-wide, rather than localized, reduction of H3K27me3 in the BMI1-KO cells, suggesting loss of H3K27me3 both within and outside of LOCKs (Additional file 1: Fig S5G). These results confirm previously reported associations between BMI1 loss and H3K27me3 reduction and support a model implicating BMI1 in the expansion of H3K27me3 in CD34+ hematopoietic progenitors and lymphoid cells [29,30].

EZH2 inhibition differentially alters the production of different hematopoietic lineages consistent with their acquired epigenomic features
The broad H3K27me3 domains shared by CB progenitors that are no longer present in monocytes and erythroblasts but persist in mature lymphoid cells (Fig. 3) led us to hypothesize that these differences have functional roles in their differentiation. To test this possibility, we first asked how the mature cell outputs of a CB cell type with dual and almost exclusive granulopoietic and B-lymphopoietic differentiation potential would be affected by exposure to either of 2 inhibitors: one, EPZ-6438, that targets EZH2, a major component of the polycomb repressive complex-2 (PRC2), and another, GSK-J4, that targets an H3K27me3 demethylase. Accordingly, CD34+CD38midCD71− CB cells were incubated with these inhibitors (0.1% DMSO as a control) for 3 weeks in cultures optimized to support their differentiation into mature GM and B cells. Analysis of the numbers of these generated in bulk cultures showed a selective decrease in CD19+ B-lineage cells in the presence of EPZ-6438 (2-sided t-test, p < 0.014), with no effects of either inhibitor on the output of cells expressing phenotypic markers of monocyte/macrophages or neutrophils (Fig. 6A We then asked if these 2 inhibitors also affect the ability of HL60 cells to activate a granulopoietic differentiation program [33]. After 3 days of treatment with EPZ-6438, HL60 cells showed the same growth arrest obtained with all-trans retinoic acid (ATRA) (Fig. 6D, E) and an expected subsequent loss of viability. In contrast, in the presence of GSK-J4, the results were indistinguishable from those of the DMSO-treated controls. Confirmation of induced granulopoietic differentiation in the presence of ATRA and EPZ-6438 was obtained by FACS detection of the appearance of increased numbers of CD11b+ cells after 48 h of exposure to these treatments compared to controls (2-sided t-test, p < 0.001, Fig. 6F).
Thus, the inferred EPZ-6438-mediated removal of H3K27me3 marks from HL60 cells appears to promote the same differentiation alterations as ATRA, consistent with the loss of H3K27me3 seen to distinguish CB monocytes from their co-isolated progenitors [34]. Further support of this inference was the finding of a significantly lower H3K27me3 density (2-sided t-test, p < 0.001) in the EPZ-6438-treated HL60 cells compared to the DMSO control at HL60-defined LOCKs (Fig. 6G) in parallel with those that overlapped with LOCKs in CB progenitors (Fig. 6H) and no longer evident in monocytes (Fig. 6I).
Together, these results suggest that a global reduction of H3K27me3 is required for the differentiation of mature myeloid cells from progenitors with that potential, whereas maintenance of H3K27me3 domains is required for the production of mature lymphoid cells.

Lineage-specific enhancers marked by H3K27ac are hypermethylated in hematopoietic progenitor subsets
We next sought to identify and compare the enhancer states of the 4 progenitor populations and the 4 more mature cell types examined. Accordingly, we identified the H3K27ac- and H3K4me1-marked regions in each population and used the results to create a catalogue of active (H3K27ac and H3K4me1) and primed (H3K4me1) enhancers. Primed enhancers were relatively consistent across all 4 progenitor populations (Spearman, average R > 0.8; Fig. 7A) with active enhancers showing increased progenitor specificity (Spearman, average R > 0.52; Fig. 7B and Additional file 1: Fig. S6G). The progenitor populations also consistently showed a higher number of total enhancers, as measured by the sum of both the active and primed enhancer states, by comparison to the 4 more mature cell types (Additional file 1: Fig. S6C). In contrast, the number of active enhancers was higher in 3 of the 4 more mature cell types (2-sided t-test, p = 0.026), the exception being the erythroblasts, which had the lowest number of active enhancers compared to all other cell types (Additional file 1: Fig. S6E). As expected, a majority of H3K27ac (>90%) and H3K4me1 (>60%) enriched regions overlap with open chromatin regions (Additional file 1: Fig. S6D and F) [11]. We next traced the gain or loss of H3K27ac and H3K4me1 from the most primitive CD34+CD38− progenitor subset to each differentiated cell type according to published trajectories. This confirmed a directional loss of total enhancers during progenitor differentiation with the erythroblasts showing the greatest overall loss of enhancers (Fig. 7C). However, this directional loss was largely restricted to primed enhancers with active enhancers staying either at the same frequency or at a frequency that increased with differentiation (Fig. 7C). The majority of primed enhancers (>90%) and, surprisingly, of active enhancers (>72%) found in mature cells were already evident in their proximal progenitor populations (Fig. 7D, E and Additional file 1: Fig. S6H).
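The enhancer catalogue described above distinguishes active enhancers (carrying both H3K4me1 and H3K27ac) from primed enhancers (H3K4me1 only). A minimal sketch of that classification is given below, assuming each mark is supplied as a list of half-open (chromosome, start, end) regions and that any shared base counts as co-marking. The overlap rule, function names, and coordinates are illustrative assumptions. The criteria actually applied are those given in the "Methods" section.

```python
Interval = tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_enhancers(
    h3k4me1: list[Interval], h3k27ac: list[Interval]
) -> dict[str, list[Interval]]:
    """Split H3K4me1-marked regions into active (also H3K27ac) and primed (H3K4me1 only)."""
    catalogue: dict[str, list[Interval]] = {"active": [], "primed": []}
    for region in h3k4me1:
        state = "active" if any(overlaps(region, peak) for peak in h3k27ac) else "primed"
        catalogue[state].append(region)
    return catalogue


# Toy regions for one hypothetical cell type.
h3k4me1_regions = [("chr1", 100, 600), ("chr1", 1000, 1500), ("chr2", 200, 900)]
h3k27ac_regions = [("chr1", 400, 800), ("chr2", 1000, 1200)]

catalogue = classify_enhancers(h3k4me1_regions, h3k27ac_regions)
print(catalogue["active"])  # the chr1:100-600 region overlaps an H3K27ac peak
print(catalogue["primed"])  # the remaining H3K4me1-only regions
```

Running the same classification on a progenitor and on its mature progeny, then intersecting the two active lists, gives the kind of "already evident in the progenitor" fraction reported in the text.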
A significant fraction (>80%) of the active enhancers in the differentiated cell types were also already primed across the progenitor subsets (Additional file 1: Fig. S6I and J), as noted earlier for similar subsets of mouse hematopoietic cells [10]. Interestingly, genes associated with enhancers found to be active in monocytes and erythroblasts, but already active in their corresponding progenitors, produced significantly higher levels of transcripts in the mature cells compared to their inferred parental progenitors (pairwise Wilcoxon signed-rank test, p < 2 × 10^−5, Fig. 7F and G). These genes were enriched in terms related to specific hematopoietic differentiation programs (Benjamini corrected p <10e−30) (Additional file 1: Fig. S7A and B). Strikingly, CpGs within these active enhancers showed significantly higher levels of methylation in progenitors compared to mature cells (2-sided t-test, p < 2 × 10^−16), suggesting that CpG methylation states within active enhancers may be predictive of activity and coordinately regulated during hematopoietic differentiation. For example, SPI1 and EPB41 were associated with methylated active enhancers in progenitor cells that were hypomethylated in lineage-restricted cells concomitant with their expression (Fig. 7H, I). Motif enrichment analysis identified the CEBP and GATA families of TF motifs as the most enriched in these lineage enhancers (Additional file 1: Fig. S7C and D). Collectively, these observations suggest that a majority of active enhancers driving terminal hematopoietic differentiation programs appear in early progenitors but are constrained by CpG methylation until lineage-specific programs are initiated.
To further understand how active enhancer states differ between progenitors and terminally differentiated cells, we leveraged self-organizing maps (SOM) and used ranked normalized H3K27ac density across all hematopoietic enhancers to identify progenitor, monocyte, erythroid, B, and T cell enhancer clusters (the "Methods" section; Additional file 1: Fig. S7E). Relative to the CD34+CD38− population, GMPs and MEPs showed progression towards, and enrichment in, enhancers belonging to the monocyte and erythroblast enhancer clusters, respectively (Additional file 1: Fig. S7F). Enrichment of these differentiated enhancer clusters in MEPs and GMPs supports a model in which pre-existing cell-type-specific active enhancer signatures emerge in proximal progenitors and may thus contribute to driving a transcriptional program activated at later stages of differentiation. To that end, we compared the TF binding sites across all cell types in this study with respect to the CD34+CD38− cell population. Monocytes and erythroblasts clustered with GMPs and MEPs, respectively, indicating that the lineage-specific enhancers shared between progenitors and differentiated cells also share lineage-specific TF binding sites (Additional file 1: Fig. S8A and B). The most significantly enriched motifs belonged to the CEBP and GATA families of TFs in monocytes and erythroblasts, respectively. Interestingly, CMPs show enrichment in both lymphoid and monocyte active enhancer clusters, consistent with emerging evidence suggesting that CMPs comprise multiple subsets with different mature output capabilities [18,19].
We next used the ROSE RANK algorithm [35] to identify high amplitude clusters of H3K27ac (super-enhancers) in each of the 8 cell types analyzed. We observed a greater than 2-fold increase in the number of super-enhancers in the mature cell types compared to the numbers of these in the progenitor populations (Fig. 8A). Surprisingly, we also found that a majority (>70%) of the active enhancers seen only in the mature cells were located within the boundaries of super-enhancers, including the human counterparts of enhancers previously identified in differentiating mouse hematopoietic cells [10] (Fig. 8B, C). Pathway analysis showed that genes associated with monocyte and erythroblast super-enhancers were enriched in terms related to leukocyte activity and erythroid differentiation, respectively (Additional file 1: Fig. S7G and H). Taken together, these data suggest a model in which a majority of active enhancers are initially marked in progenitor populations and then further reinforced to form super-enhancers during the final processes of terminal blood cell differentiation.
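Rank-based super-enhancer calling of the kind referred to above orders enhancer clusters by H3K27ac signal and places a cutoff where the ranked-signal curve turns sharply upward. The following is a simplified stand-in for that idea and is not the ROSE code itself. It omits ROSE's stitching of nearby enhancers and input subtraction, and it assumes the cutoff can be approximated as the point on the rescaled curve lying furthest below the diagonal, which is where a tangent of slope 1 would touch a convex curve. The function names and signal values are hypothetical.

```python
from typing import Sequence


def super_enhancer_cutoff(signals: Sequence[float]) -> float:
    """Signal at the point of the rank-ordered curve where the tangent slope is ~1.

    Ranks and signals are both rescaled to [0, 1]; the point that lies
    furthest below the diagonal of that unit square is taken as the cutoff.
    """
    ranked = sorted(signals)
    n = len(ranked)
    if n < 2 or ranked[0] == ranked[-1]:
        raise ValueError("need at least two distinct signal values")
    lowest, highest = ranked[0], ranked[-1]
    best_index = max(
        range(n),
        key=lambda i: i / (n - 1) - (ranked[i] - lowest) / (highest - lowest),
    )
    return ranked[best_index]


def call_super_enhancers(signals: Sequence[float]) -> list[float]:
    """Return the signals that exceed the cutoff, in their original order."""
    cutoff = super_enhancer_cutoff(signals)
    return [s for s in signals if s > cutoff]


# Hypothetical H3K27ac signal for ten enhancer clusters (arbitrary units).
cluster_signal = [1, 2, 3, 4, 5, 6, 7, 8, 50, 100]

print(super_enhancer_cutoff(cluster_signal))  # last value before the sharp rise
print(call_super_enhancers(cluster_signal))   # the two high-amplitude clusters
```

Because the cutoff is set relative to each sample's own signal distribution, counts of super-enhancers can be compared across cell types without choosing a fixed signal threshold.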

Discussion
Epigenetic mechanisms have long been postulated to play a central role in hematopoietic cell fate decisions [36][37][38][39][40][41] and repressive H3K27me3 chromatin modifications have been implicated from investigations of the regulation of hematopoiesis both in vivo and in vitro [42][43][44][45][46]. Manipulation of H3K27me3 has been implicated particularly as impacting the steps progenitors undergo to produce mature myeloid and lymphoid cell types [32,44]. In addition, both overexpression and inactivation of PRC2 components (responsible for the methylation of H3K27) have been reported in hematopoietic malignancies, suggesting a critical role of H3K27me3 in regulating normal hematopoiesis [47,48]. Here we revealed a broad and stable repressive H3K27me3 landscape across multiple progenitor subsets present in normal human CB, i.e., those conventionally defined phenotypically as an HSC-enriched subset, CMPs, GMPs, and MEPs. This finding contrasts markedly with the dynamic and cell-type-specific landscape we report for active histone modifications. We also identified a striking and lineage-selective genome-wide H3K27me3 signature evident in 2 mature myeloid cell types (monocytes and erythroblasts) that was not present in the co-isolated mature B and T cells. Intriguingly, the H3K27me3 signature common to the monocytes and erythroblasts also displayed a punctate H3K27me3 profile reminiscent of that previously uniquely associated with ESCs [49].
We also found that the observed contraction of H3K27me3 domains evident in monocytes and erythroblasts includes a specific loss of H3K27me3-marked LOCKs that in the progenitors and mature lymphoid cells show co-occupancy with another repressive histone modification, H3K9me3. The lineage-specific restructuring of H3K27me3 specifically during differentiation into the monocyte and erythroid lineages highlights the importance of higher order chromatin structure in differentiation and establishment of cellular identity in the human hematopoietic system. H3K27me3- and H3K27me3/K9me3-enriched regions common to progenitor and lymphoid cells and lost in different mature myeloid cell types are strongly enriched in LADs, reinforcing the concept that myeloid cells can also be distinguished from lymphoid and progenitor cells based on lamin distribution and nucleus rigidity [50]. Taken together, these observations provide molecular support for the observation that manipulation of lamin expression specifically modulates myeloid, but not lymphoid, cell differentiation [50].
In contrast to H3K27me3, we found that the active enhancer mark H3K27ac is dynamic across all progenitor populations analyzed and that the majority of active enhancers identified in the mature cells were first detected in their traditionally defined progenitor populations, despite recent evidence of the considerable heterogeneity in differentiation potential and other molecular features these populations display [23]. Key lineage-specific regulators were identified among the genes associated with active enhancers in the progenitor populations. During differentiation, these active enhancers increase in width and amplitude and the expression of associated genes increases. Our results thus suggest that priming of key regulatory regions during hematopoietic differentiation occurs in the context of traditionally described active enhancers whose activity is reinforced and increases as part of the lineage restriction process. This finding refines the currently proposed primed (H3K4me1) to active enhancer model [10] and suggests that additional features beyond the presence of H3K27ac may be required for full enhancer activity; for example, loss of DNA methylation at active enhancers as shown here and the formation of lineage-specific phase-separated condensates [51].
Table S2). E Percentage of viable cells in D (N=3, Additional file 2: Table S2). F Percentage of CD11b+ cells assessed by FACS after 48 and 72 h of treatment with ATRA, GSK-J4, or EPZ. H3K27me3 density at LOCKs identified in HL60 cells (N=3, Additional file 2: Table S3). G LOCKs identified in HL60 cells that overlapped with primary CB progenitor LOCKs (H) and were lost in monocytes (I) (two-sided t-test *p < 0.05, **p < 0.01 and ***p < 0.001)

Conclusion
In the classical Waddington view [52], the chromatin of very primitive cells is now envisaged to exist in a more plastic state allowing for a more heterogeneous dynamic remodeling process that culminates in lineage restriction and subsequent activation of a differentiation program. One early finding in support of such privileged chromatin in ESCs is their bivalent state [23,49,53,54], in which the nucleosomes in the promoters of developmentally important genes are marked by both permissive and repressive histone modifications (H3K4me3 and H3K27me3, respectively) that subsequently resolve to a homogeneously active or repressed chromatin state as differentiation occurs [49,53]. However, subsequent epigenomic studies across a broad range of primary tissues have revealed that bivalent promoters are not unique to ESC chromatin and can be found in many cell types, including fully functional, terminally differentiated cell types [4,21]. Another prevailing model of H3K27me3 occupancy during ESC differentiation posits that, upon differentiation, H3K27me3 genomic occupancy spreads outwards from discrete focal regions to occupy large genomic regions [23], a feature that has been correlated with the differentiation capacity of the cells. Here, we demonstrate the surprising finding that H3K27me3 reverts to a punctuated profile in cells that have differentiated, but only into certain types of mature blood cells. This observation now raises the possibility that an overall loss of repressive chromatin, rather than its further compaction, is critical for the activation of monocyte and erythroblast differentiation programs. We correlated H3K27me3 contraction with a marked reduction in expression of the PRC1 complex member BMI1 in monocytes, erythroblasts, and ESCs, and demonstrated that loss of its expression and/or activity results in a genome-wide loss of H3K27me3 and certain myeloid phenotypes.
This finding is consistent with the model of lineage restriction in the neonatal human hematopoietic system we now propose. Why B and T cells retain a broader H3K27me3 landscape remains an interesting open question. One possibility is that this could be related to their requisite ability to activate a "stem-like" state upon antigenic stimulation in order to generate a large expansion of their progeny [55], a feature not shared by cells within the myeloid lineages.

Preparation of human CB cells
Anonymized, anticoagulated samples of normal CB cells were obtained with informed consent according to University of British Columbia Research Ethics Board-approved protocols. CD34+ cells were enriched by EasySep™ from the light-density fraction of Lymphoprep™ or RosetteSep™-depleted CD11b+;CD3+;CD19+ cells (STEMCELL Technologies) and then used after cryopreservation in DMSO and fetal bovine serum (FBS, STEMCELL Technologies). Three pools of cord bloods consisting of collections from 646, 255, and 6 individual donors were used to isolate 10,000 cells of each compartment for histone modification ChIP-seq, whole genome DNA methylation, and RNA-seq analyses. For the inhibitor experiments, cells were isolated from a pool of cord bloods consisting of 3 donors.

Isolation of human CB populations
Frozen CB cells were thawed by drop-wise addition to Iscove's modified Dulbecco's medium (IMDM) (STEMCELL Technologies) supplemented with 10% FBS and 10 μg/mL DNase (Sigma-Aldrich). Cells were suspended in Hanks Balanced Salt Solution (HBSS) supplemented with 5% human serum and 1.5 μg/mL anti-human CD32 antibody (Clone IV.3; STEMCELL Technologies) and then stained with designated antibodies for 1-2 h on ice prior to sorting on a Becton Dickinson FACSAria™ Fusion or FACSAria™ III sorter. Cells from the following populations were sorted directly into DNA LoBind tubes (Eppendorf) containing HBSS + 2% FBS: CD34+CD38-, CD34+CD38+CD10-CD7-CD135-CD45RA- (MEP), CD34+CD38+CD10-CD7-CD135+CD45RA- (CMP), CD34+CD38+CD10-CD7-CD135+CD45RA+ (GMP), CD45-CD34-GPA+ (erythroid precursors), CD45+CD34-CD11b+CD33+CD14+ (monocytes), and CD45+CD34-CD11b-CD33-CD19+CD7- (B cells). Naïve CD4 T cells were isolated from CB mononuclear cells (CBMCs) obtained using Lymphoprep (STEMCELL Technologies Inc., Canada), followed by a first step of negative selection by magnetic bead separation using the EasySep Human Naïve CD4 T cell isolation kit (STEMCELL Technologies Inc., Canada) and FACS on a BD FACSAria™ II using the following markers/staining: anti-CD3 PE (clone UCHT1; BD Bioscience), anti-CD4 BV605 (clone OKT4; BioLegend), anti-CD25 PE-Cy7 (clone M-A251; BD Bioscience), anti-CD45RO Alexa Fluor 700 (clone UCHL1; BioLegend), anti-CCR7 Alexa Fluor 647 (clone 3D12; BD Bioscience, ON, Canada), and anti-CD235 eFluor 450 (clone 6A7M; eBioscience). Isolated cells were then centrifuged (500 rcf, 6 min), the supernatant was removed, and the cells were rapidly frozen using dry ice or liquid nitrogen and then stored at −80°C.

org/) was applied to qualify the resulting data. Pairwise differential expression analysis was carried out by an in-house MATLAB script, DEfine, on RPKM values corrected for transcript GC bias [58,59].
The DEfine module is accessible through the Zenodo repository (DEfine v0.9.5 - a matlab function for the pair-wise differential gene expression analysis | Zenodo). DEfine is designed to perform a pair-wise differential expression (DE) analysis of RNA-seq data. The DEfine heuristic assumes that most of the genes (~90%) are not differentially expressed. DEfine uses RPKM as well as raw read counts for every gene as input. First, considering the fold-change between read counts for all genes as a function of gene GC content, DEfine corrects any existing GC bias by re-normalizing the expression metric (RPKM) values. This is done under the assumption that the mode of the distributions is GC independent. The normalization is done for 10 bins over the values of GC content. Next, for every quantile of the mean gene expression for two samples, DEfine performs a best Gaussian fit of the distribution of log-transformed fold-changes of the expression values (after removing extreme outliers) and assigns p-values using the result of the fit. This is followed by an FDR control process, with the FDR threshold being a user-defined parameter (0.001 in this study). MetaScape version 2.0 [60] (http://metascape.org) was used for gene ontology analysis of differentially expressed genes. All figures were generated by R statistical software [61]. The RNA-seq data set for mouse hematopoietic cells was obtained from Lara-Astiaso et al. [10].
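The last two steps described above (a Gaussian fit to the log fold-changes, followed by FDR control) can be illustrated with a minimal Python sketch. This is not the DEfine code: the function names, the median/MAD outlier rule, and the use of the Benjamini-Hochberg procedure are assumptions made for illustration, and the GC-bias correction and per-quantile binning are omitted. The input is assumed to be the log fold-changes of the genes within a single expression quantile.

```python
import math
import statistics


def gaussian_pvalues(log_fc, trim_z=3.0):
    """Two-sided p-values for log fold-changes under a fitted Gaussian null.

    Outliers are dropped before fitting using a median/MAD rule
    (an assumed rule; the text only says extreme outliers are removed).
    """
    med = statistics.median(log_fc)
    mad = statistics.median([abs(x - med) for x in log_fc])
    robust_sd = 1.4826 * mad  # MAD scaled to match a Gaussian standard deviation
    if robust_sd > 0:
        kept = [x for x in log_fc if abs(x - med) <= trim_z * robust_sd]
    else:
        kept = list(log_fc)
    mu = statistics.mean(kept)
    sigma = statistics.stdev(kept)
    if sigma == 0:
        raise ValueError("cannot fit a Gaussian to constant values")
    # P(|Z| >= z) for a standard normal Z equals erfc(z / sqrt(2)).
    return [math.erfc(abs(x - mu) / (sigma * math.sqrt(2))) for x in log_fc]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def call_de(log_fc, fdr=0.001):
    """Flag genes whose adjusted p-value passes the FDR threshold."""
    adjusted = benjamini_hochberg(gaussian_pvalues(log_fc))
    return [q <= fdr for q in adjusted]
```

Fitting the null after trimming reflects the stated assumption that most genes are not differentially expressed, so the bulk of the fold-change distribution estimates the noise.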

Low input native ChIP-seq
ChIP-seq was performed as previously described [20]. In brief, cells were lysed in 0. […] [63] and converted to bam format by Sambamba (version 0.5.5). Sequence reads with BWA mapping quality scores <5 were discarded, and reads that aligned to the same genomic coordinate were counted only once in the profile generation. A standardized analytical pipeline developed by the International Human Epigenome Consortium (IHEC) (http://ihec-epigenomes.org/) was applied to qualify the resulting data.
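The two read-level filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Read` record and `filter_reads` function are hypothetical names, reads are represented as simple tuples rather than BAM records, and treating strand as part of the "same genomic coordinate" is an assumption not stated in the text.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal alignment record (not a real BAM record)."""
    chrom: str
    start: int
    strand: str
    mapq: int


def filter_reads(reads, min_mapq=5):
    """Drop reads with MAPQ below min_mapq and keep one read per coordinate."""
    seen = set()
    kept = []
    for read in reads:
        if read.mapq < min_mapq:
            continue  # mapping quality score < 5 is discarded
        key = (read.chrom, read.start, read.strand)  # strand inclusion is assumed
        if key in seen:
            continue  # a coordinate already represented is counted only once
        seen.add(key)
        kept.append(read)
    return kept
```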

ChIP-seq analysis
Genome browser tracks were generated by converting bam files to wiggle files using a custom script (http://www.epigenomes.ca/tools-and-software). Wiggle files were then converted to bigwig format for display on the UCSC genome browser using the UCSC tool wig2Bigwig. MACS2 [64] was employed to identify enriched regions with a false discovery rate (FDR) value of ≤ 0.01 for H3K4me3, H3K4me1, and H3K27ac peaks in ChIP-seq data. FindER2.0 (http://www.epigenomes.ca/tools-and-software/finder) was used to identify enriched regions for H3K27me3 and H3K9me3 ChIP-seq data. Enriched regions overlapping with ENCODE blacklist regions were eliminated. ChIP-seq signal was calculated as tag density generated using HOMER v4.10 [65] and normalized to the total number of tags within enriched regions. Heat maps of fragment distribution at promoters were generated by deeptools [66]. All other figures were generated by R statistical software [61]. ChromHMM [67] was used to identify 18 chromatin states based on MACS2 [64] identified enriched regions in each cell type as previously described [25]. To identify active enhancers, we first collected the MACS2-identified enriched H3K27ac regions in all populations (CD34+CD38−, CMP, GMP, MEP, monocytes, erythroblasts, B cells, and T cells) and created an enhancer catalogue of hematopoietic cells. MACS2 was run with the following parameters: --broad -g hs -B --broad-cutoff […] with distance of greater than 50 kb to enhancer regions were eliminated. We used GREAT, an online ontology analysis tool, for gene ontology analysis of regulatory regions. To identify H3K27me3 LOCKs, MACS2 broad mode was used to call peaks for H3K27me3 with an FDR cutoff of 0.05. The resulting peaks were then entered into CREAM [70] with a WScutoff of 1.5, MinLength of 1000, and peakNumMin of 2 to identify large H3K27me3-enriched regions. Regions with lengths less than 100 kb were eliminated and the remaining regions were identified as LOCKs.
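Two of the region-level filters described above, removal of enriched regions overlapping blacklist regions and the 100 kb minimum length for LOCKs, can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, regions are assumed to be half-open `(chrom, start, end)` tuples as in BED files, and the clustered regions are assumed to have been produced already (for example by CREAM).

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def remove_blacklisted(regions, blacklist):
    """Drop every region that overlaps any blacklist interval."""
    return [r for r in regions if not any(overlaps(r, b) for b in blacklist)]


def select_locks(clustered_regions, min_length=100_000):
    """Keep clustered regions of at least min_length bases (shorter ones are eliminated)."""
    return [r for r in clustered_regions if r[2] - r[1] >= min_length]
```

The linear scan over the blacklist is adequate for a sketch; a genome-scale run would more likely rely on an interval tree or a dedicated tool such as bedtools.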
To examine the relationship between LOCKs and LADs, we used the Roadmap reference epigenome LAD annotation for hg19 [3]. H3K27me3-marked genes in CD34+CD38−, monocyte, erythroid precursor, and B and T cells were annotated by overlapping FindER-identified H3K27me3-enriched regions with hg19v75 Ensembl gene annotations. We then selected for genes with >20% of their length covered with H3K27me3. Significantly enriched motifs were identified in 300-bp windows centered on the regions within our enhancer catalogue with respect to randomized regions using the MEME-SUIT fimo command and the JASPAR catalogue [71]. Motifs for TFs with an expression level of <1 RPKM across all cell types were eliminated from subsequent analysis. To calculate the motif enrichment score with respect to the CD34+CD38− compartment, the cumulative quantile-normalized H3K27ac signal was first calculated for each significant motif across all cell types. Then, the standard score of the cell type of interest was subtracted from that of the CD34+CD38− cells for motifs that showed a significant increase or decrease in the cumulative H3K27ac signal between the cell type of interest and the CD34+CD38− cells.
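The gene-level coverage criterion above (more than 20% of a gene covered by H3K27me3-enriched regions) can be sketched in Python. This is an illustrative reading, not the authors' code: the function names are hypothetical, coordinates are assumed to be half-open, and coverage is assumed to be measured over the annotated gene span with overlapping enriched regions counted once.

```python
def covered_fraction(gene_start, gene_end, regions):
    """Fraction of [gene_start, gene_end) covered by (start, end) regions.

    Regions are clipped to the gene and overlapping regions are counted once.
    """
    clipped = sorted(
        (max(s, gene_start), min(e, gene_end))
        for s, e in regions
        if s < gene_end and e > gene_start
    )
    covered = 0
    cur_end = gene_start  # right edge of the coverage counted so far
    for s, e in clipped:
        s = max(s, cur_end)  # skip bases already counted
        if e > s:
            covered += e - s
            cur_end = e
    return covered / (gene_end - gene_start)


def marked_genes(genes, regions, min_fraction=0.20):
    """Names of genes with more than min_fraction of their span covered.

    genes: dict mapping gene name to (chrom, start, end)
    regions: list of (chrom, start, end) enriched regions
    """
    marked = []
    for name, (chrom, start, end) in genes.items():
        same_chrom = [(s, e) for c, s, e in regions if c == chrom]
        if covered_fraction(start, end, same_chrom) > min_fraction:
            marked.append(name)
    return marked
```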

Whole genome bisulfite sequencing
Whole genome bisulfite libraries were constructed as previously described [72]. In brief, genomic DNA was sheared and subjected to bisulfite conversion using the MethylEdge Bisulfite Conversion kit (Promega, N1301) with a bead-based automated protocol. Bisulfite-converted DNA was mixed with 180 μl of MethylEdge Binding Buffer and 1.8 μl of 20 mg/ml decontaminated MagSi-DNA all-round silica beads (MagnaMedics, MD02018), left at room temperature for 15 min, and washed twice with 220 μl of 80% ethanol for 30 s. A total of 60 μl of MethylEdge desulfonation buffer was added to the beads and incubated at room temperature for 15 min; the beads were then washed twice with 100 μl of 80% ethanol and dried for 1 min. To elute the DNA, 20 μl of 10 mM Tris-HCl, pH 8.5 (Qiagen, 19086), was added to the DNA-bead mix and incubated in a Thermomixer C (Eppendorf, 5382000015) at 56°C while being centrifuged at 2000 rpm for 15 min. The bisulfite-converted single-stranded DNA was converted to double-stranded DNA through 1 cycle of PCR with random hexamers as previously described [72], followed by standard Illumina library construction. Libraries were aligned using Novoalign V3.02.10 (www.novocraft.com) to human genome assembly GRCh37 (hg19). Duplicate reads were marked by Picard V1.31 (http://picard.sourceforge.net) and discarded. Methylation calls were generated using Novomethyl V1.01 (www.novocraft.com). All figures were generated by R statistical software [61].

Quantitative measurement of CD11b in HL60
HL60 cells were cultured at 1,000,000 cells/mL in RPMI prior to analysis by flow cytometry. Flow cytometry data were analyzed in R using the package flowCore [73] and custom scripts.
For clonal cultures, single P-NML cells were deposited into each well of a 96-well plate preloaded with 9000 MS-5 cells and 333 each of M210B4 mouse fibroblasts expressing human IL-3 and G-CSF, and sl/sl mouse fibroblasts expressing human SCF and IL-3, and human FLT3L, in alpha-MEM medium with 2 mM glutamine, 7.5% FBS, and 10⁻⁴ M β-mercaptoethanol (Sigma) plus 50 ng/mL SCF (Novartis) and 10 ng/mL FLT3L (Immunex) added for the first 2 weeks. Weekly half-medium changes were performed. After 3 weeks, all cells were harvested, stained with antibodies, and assessed by FACS to detect clones of >10 monocytes, neutrophils, or B cells using the same FACS analysis protocol as for the bulk cultures.