Introduction

Mixed lineage leukemia (MLL), encoded by the lysine (K)-specific methyl transferase 2A (KMT2A) gene at 11q23, is a H3K4me3-depositing protein active during early development. MLL rearranged (MLLr) leukemias are responsible for about 10% of all acute lymphoblastic and myeloid leukemias (ALL/AML).1 The N-terminal CXXC-domain containing DNA-binding part of the MLL gene can, by genomic translocation, be fused to >60 different fusion partners.2 The two most common fusion partners, AF9 (MLLT3) and AF4 (AFF1), are found in 30% of MLLr AML and 66% of MLLr ALL cases, respectively.3

The most prevalent fusion partners (including AF9 and AF4) are components of the super elongation complex (SEC), which points to a general MLLr mechanism of action even though the fusion proteins differ in composition.4, 5 The SEC normally binds to RNA polymerase II and facilitates transcriptional elongation. In MLLr leukemias, the SEC is tethered to the DNA-binding domain of MLL via the fusion partner, leading to aberrant transcription of MLL target genes. In addition, AF9 as well as other fusion partners such as ENL, AF10 and AF17 are present in the DOT1L complex (DOTCOM). This complex deposits H3K79me2 on actively transcribed genes, leading to aberrant H3K79me2 deposition in these subsets of MLLr-induced leukemias.6, 7, 8 As such, a promising avenue for treatment of MLLr-induced leukemia is inhibition of DOT1L.9, 10

Furthermore, AMLs in general and MLLr leukemia in particular have been shown to be sensitive to inhibition of BET family proteins such as BRD4, which regulate transcription elongation via P-TEFb at promoters and enhancers.11, 12, 13, 14 MLL fusion proteins have also been shown to interfere with RUNX1 (refs 15, 16, 17) and modulate PU.1 via its distal regulatory elements.18 Moreover, MLLr and other AMLs are sensitive to inhibition of mediator kinases,19 linking modulation of distal regulatory elements to execution of the MLLr leukemic program.

Thus far, genome-wide maps of MLL binding are only available in mouse models of AML,7 human ALL20, 21, 22 and of MLL-AF6 in the human AML cell line ML-2.23 No reports to date have described genome-wide MLL-AF9 and -AF4 binding in human AML.

Here we set out to investigate the molecular mechanisms and targets of MLLr-induced AML. For this, we characterized the genome-wide binding, epigenetic signature and gene expression program of wild-type (WT) MLL, MLL-AF9 and MLL-AF4 in human AML cells. We show that MLL fusion proteins bind in a ‘broad’ mode elongating over the gene body as well as in a ‘sharp’ mode stalled on the transcription start site (TSS), in addition to non-genic elements, such as distal enhancers. We show that MLL-AF9 and MLL-AF4 share only a subset of target genes, yet show enrichment for the same pathways in both the shared as well as the MLL-AF9- and MLL-AF4-specific gene sets. These target genes are marked by H3K79me2, H3K4me3 and H3K27ac enrichment as well as by RUNX1 occupancy and constitute a mixture of CD34+ and monocyte-expressed gene sets. Together these results suggest that, in MLLr AML, the RUNX1-mediated progenitor to monocyte differentiation program is deregulated.

Results

MLL fusions and WT MLL show a broad and sharp binding mode

To investigate the binding sites of MLL-AF9, we used a substrain of THP-1 cells that express both WT MLL as well as MLL-AF9 but not WT AF9 (Figures 1a and b, Supplementary Figure S1A). Using antibodies against the N-terminus of MLL-1 (ab1542/ab1547) and the C-terminus of AF9 (ab1327/ab1474) (Figure 1a) in chromatin immunoprecipitation (ChIP)-qPCR (quantitative PCR) showed enrichment for the canonical HOXA MLL fusion target region in THP-1 cells (Figure 1c), while HOXA7 and MEIS1 were, in contrast to expectations,24, 25, 26 not enriched. Western analysis against MLL and AF9 using our antibodies yielded no results (data not shown), suggesting that our antibodies were of ‘ChIP-seq grade’ but not of ‘western grade’. Subsequently, we performed ChIP-seq experiments using MLL and AF9 antibodies in THP-1 MLLr cells that confirmed enrichment on the HOXA locus for both antibodies (Figure 1d, top). Interestingly, we observed that both MLL and AF9 show not only ‘broad mode’-enriched regions elongating over gene bodies (Figure 1d, middle), but also ‘sharp mode’-enriched regions on target promoters and enhancers (Figure 1d, bottom).

Figure 1

Genome-wide binding patterns of MLL and AF9 in THP-1 cells. (a) Schematic representation of MLL, AF9 and MLL-AF9. Antibody binding locations are indicated with dotted lines, primer regions used in panel (b) with a filled line. (b) Reverse transcriptase–qPCR experiments (n=5) in THP-1 cells with primers against the C and N termini of MLL and AF9 normalized to GAPDH. The N-terminus of AF9 is not expressed, indicating that there is no WT expression of AF9 in this cell line. ***P<0.001 (Welch’s t-test). (c) ChIP-qPCR experiments using two anti-MLL-1 and two anti-AF9 antibodies in THP-1 cells and primers for HOXA7, 9, 10 and MEIS1. (d) ChIP-seq overview of MLL and AF9 binding at the HOXA, ZEB2 and CDKN2C loci in THP-1 cells. (e) Classification of MLL and AF9 binding events in ‘broad’ and ‘sharp’ modes. Left: boxplot showing dispersion of peak lengths. Right: barplot showing genomic distributions. (f) Classification of MLL-AF9 and MLL WT binding events. Average profiles showing ChIP-seq signal intensities for MLL-AF9 and MLL WT binding events in THP-1 cells. (g) Left: Distribution of MLL-AF9 and MLL WT binding events in the ‘broad’ and ‘sharp’ modes. Right: Genomic distribution of MLL-AF9 and MLL WT binding events in the ‘broad’ and ‘sharp’ modes. (h) Venn diagram illustrating the overlap between our human (THP-1) MLL-AF9 AML targets, MLL-AF9 targets in a mouse LSC model (Bernt et al.7) and human MLL-AF4 ALL targets (Guenther et al.20).

Using MACS2 for defining sharp peaks and HOMER for defining broad regions (see Materials and methods), we identified 16 099 unfiltered MLL occupied regions in THP-1 cells (Supplementary Table S1). Of these, 8217 were of ‘broad mode’ (mean length ~12 kb) and 7882 of ‘sharp mode’ (mean length ~4.6 kb) (Figure 1e, left). Analysis of the genomic distribution revealed that the ‘broad mode’ peaks cover more TSS regions (Figure 1e, right), while the ‘sharp mode’ peaks seem to occur in intergenic regions more often.

As AF9 was not expressed from its endogenous locus in this substrain of THP-1 cells (Figure 1b, Supplementary Figure S1A), we defined MLL-AF9-binding peaks as those MLL-binding sites that show a high AF9 signal, and MLL WT-binding events as those that show a low AF9 signal (Figure 1f). This distilled our list of MLL-occupied regions down to 1613 high-confidence MLL-AF9-fusion-binding sites, including known AML and MLLr targets such as HOXA9, CDK6, MYB, MYC, JMJD1C, FOXP2, FLI1, RUNX1, PBX3, BCL2 and BRD4, as well as 439 high-confidence MLL WT-binding sites (Supplementary Table S1). In all, 84% of these MLL-AF9-binding sites are ‘broad mode’, versus 58% of WT MLL-binding sites (Figure 1g, left), and they occupy more TSS regions as compared with WT MLL (Figure 1g, right).
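
The classification described above can be sketched as follows. This is a minimal illustration and not the pipeline used in this study: the function name, the dictionary keys and both AF9 signal cutoffs are hypothetical, because the text does not state the thresholds that separate a 'high' from a 'low' AF9 signal.

```python
def classify_mll_peaks(peaks, high_cutoff, low_cutoff):
    """Split MLL peaks into fusion-like and WT-like sets by AF9 signal.

    peaks: list of dicts with hypothetical keys 'id' and 'af9_signal'.
    high_cutoff / low_cutoff: assumed thresholds (not given in the text).
    Peaks with an intermediate AF9 signal are left unclassified.
    """
    if low_cutoff > high_cutoff:
        raise ValueError("low_cutoff must not exceed high_cutoff")
    fusion, wild_type, unclassified = [], [], []
    for peak in peaks:
        signal = peak["af9_signal"]
        if signal >= high_cutoff:
            fusion.append(peak["id"])        # candidate MLL-AF9 site
        elif signal <= low_cutoff:
            wild_type.append(peak["id"])     # candidate MLL WT site
        else:
            unclassified.append(peak["id"])  # dropped from both sets
    return fusion, wild_type, unclassified


# Toy input; the signal values are invented for illustration only.
example_peaks = [
    {"id": "peak_1", "af9_signal": 42.0},
    {"id": "peak_2", "af9_signal": 1.5},
    {"id": "peak_3", "af9_signal": 9.0},
]
fusion_ids, wt_ids, unclassified_ids = classify_mll_peaks(
    example_peaks, high_cutoff=20.0, low_cutoff=3.0
)
```

Leaving intermediate peaks unclassified is a design choice of this sketch; it is consistent with the reduction from 16 099 unfiltered regions to 1613 fusion and 439 WT high-confidence sites, but the actual filtering criteria may differ.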

As MEIS1 occupancy and expression are near universal for all MLL-fusion-induced AML and ALL,27, 28 we investigated the MEIS1 locus in THP-1 cells using our genome-wide data (Supplementary Figure S1B). This corroborated our ChIP-qPCR finding that MEIS1 is not bound by MLL-AF9 and not expressed in THP-1 cells, as it is marked only by H3K4me3 at its promoter but not by MLL, AF9, H3K27ac or H3K79me2. MEIS1 expression in MLL-fusion-induced AML has been shown to be especially important for the initial transformation of the leukemic cells in mouse models.29, 30, 31, 32 It is therefore conceivable that, at some point since the original establishment of the THP-1 cell line in 1980,33 the locus was silenced and its role in leukemic maintenance taken over by another TALE class protein such as, for example, PBX3, which is expressed and regulated by MLL-AF9 in THP-1 cells.

In order to further validate our findings, we performed additional ChIP-seq experiments against MLL (ab1542) and AF9 (ab1474) in one MLL-AF9-positive AML patient and repeated the MLL ChIP-seq in THP-1 cells with an antibody targeting a different epitope (ab1547). ChIP-seq signal intensity at our designated MLL-AF9-binding regions shows a good enrichment in all three cases (Supplementary Figure S1C). This indicates that our selected MLL-AF9 targets are not only bound by the fusion protein in the cell line system but also in more plastic primary cells.

As MLL-fusion-induced leukemia has previously been studied in various other human and mouse models, for example, Bernt et al.7 and Guenther et al.,20 we set out to compare our MLL-AF9 target genes with target genes identified in these studies. We found a relatively minor overlap of 12–23% between our set and the various other sets (Figure 1h, Supplementary Table S2), which was about the same range of overlap found between the different mouse studies. This indicates that, while a core set of targets is present in both human and mouse models, mouse models do not fully recapitulate the situation in human leukemogenesis.

Epigenetic signature of MLL-AF9 target genes

As MLLr leukemias have been suggested to alter the epigenetic signature of affected cells, we compared the epigenetic state of the MLL fusion with WT MLL target genes focusing on H3K4me3, H3K79me2 and H3K27ac. For this, we took the set of high-confidence MLL-AF9 and MLL WT binding events overlapping with RefSeq hg19 genes, identifying 962 MLL-AF9 and 76 MLL WT target genes, corresponding to 11% and 1% of all expressed genes (reads per kilobase of exon per million reads mapped (RPKM)>0.5, cutoff based on RPKM distribution), respectively (Figure 2a). Promoters of MLL-AF9 target genes were marked by H3K4me3 and H3K27ac, while an H3K79me2 signal on gene bodies decreasing in the 5′ to 3′ direction was observed (Figures 2b and c, top), indicating that these genes are actively transcribed. A similar pattern was seen on WT MLL target genes, and a random pool of expressed genes, albeit with a mildly reduced signal strength for H3K79me2 (P=3.79e−10 and P<2.2e−16, respectively) (Figure 2c, middle, bottom). The lower occupancy of H3K79me2 was also reflected by a lower level of gene expression of WT MLL targets as opposed to MLL-AF9 target genes, as determined by RNA-seq (Figure 2d). This suggests that fusion target genes are more highly expressed, in concordance with the paradigm that MLLr activates MLL target genes by aberrant elongation.
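
For reference, the RPKM measure and the expression cutoff used above can be written out as a short sketch. The formula follows directly from the definition (reads per kilobase of exon per million mapped reads); the function names and the example counts are illustrative assumptions and are not taken from the data of this study.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("lengths and library size must be positive")
    # reads / (length in kb * library size in millions)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def is_expressed(rpkm_value, cutoff=0.5):
    """Apply the RPKM > 0.5 cutoff stated in the text."""
    return rpkm_value > cutoff


# Invented example: 500 reads on 2 kb of exon in a 20-million-read library.
example_rpkm = rpkm(500, 2000, 20_000_000)
```

In this invented example the gene has an RPKM of 12.5 and would be counted as expressed.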

Figure 2

Epigenetic signature of MLL target genes. (a) Distribution of expressed genes (RPKM>0.5), silent genes, MLL-AF9 and MLL WT target genes. (b) Overview of AF9, MLL, H3K79me2, H3K27ac and H3K4me3 binding and transcriptional activity at the ZEB2 locus in THP-1 cells. (c) Average signal of H3K27ac, H3K4me3 and H3K79me2 at MLL-AF9 (top), and MLL WT (middle) target genes, as compared with a random set of expressed genes (bottom). (d) Expression levels of MLL-AF9 and MLL WT target genes. ***P<0.001 (Welch’s t-test). (e) Motif families enriched over background in MLL-AF9 target gene promoters (top left), MLL WT target gene promoters (top right) and motif families in MLL-AF9 target gene promoters enriched over MLL WT target gene promoters (bottom).

Pathway enrichment analysis of MLL-AF9 target genes revealed a significant (Benjamini–Hochberg-adjusted P-value<1e−6) enrichment of immune system, hemostasis and adaptive immune system pathways (Supplementary Figure S1D, top). MLL WT target genes, in contrast, only revealed enrichment for the PDGFRB pathway (Supplementary Figure S1D, bottom), often involved in translocation events leading to myeloproliferative disorder.34 Motif analysis35 of MLL-AF9 targets revealed enrichment of the ETS, AP2 and C2H2-Zf family, while the POU family was depleted over background (Figure 2e, top left). WT MLL target genes were enriched for motifs recognized by C2H2-Zf and ETS families, while the NR family was depleted over background (Figure 2e, top right). Direct comparison of MLL-AF9 and MLL WT targets revealed both gene sets as similar in terms of motif enrichment, except for the NR and AP-2 motif families, which were enriched in the MLL-AF9 target gene set (Figure 2e, bottom). This suggests that MLL-AF9-mediated deregulation of NR and AP-2 signaling might be involved in aberrant hematopoietic and immunological processes.

Similarities of MLL-AF9 and MLL-AF4 binding patterns

To identify common MLL fusion targets, we expanded our analysis by including an MLL-AF4-expressing AML. As both MLL-AF9 and MLL-AF4 are thought to bind to MLL target genes and are linked to aberrant elongation and transcription of their targets via the SEC, we set out to assess the subset of MLL target genes commonly bound by the fusion proteins. First, we created genome-wide binding profiles for MLL, AF4, H3K4me3, H3K79me2 and H3K27ac, as well as an RNA-seq expression profile in MV4-11 AML cells expressing the MLL-AF4 fusion protein (Figure 3a, Supplementary Figure S2A–B). As before, we divided the unfiltered MLL targets (28 656) into ‘broad mode’ (18 782) and ‘sharp mode’ (9874), and by rate of AF4 occupancy filtered them down to fusion (2560) and WT (828) binding events (Figure 3b, Supplementary Figures S2C and D, Supplementary Table S3), showing a similar distribution as in THP-1 cells (Supplementary Figure S2E). Expression of MLL-AF4 target genes (1722, identified by overlapping the high-confidence fusion-binding events with RefSeq hg19 genes) was significantly higher than MLL WT genes (308) (Figure 3c). Epigenetic signatures and pathway enrichments were also comparable to MLL-AF9 AMLs (Supplementary Figures S2F–H), suggesting that both MLL-AF9 and MLL-AF4 use similar molecular mechanisms to induce leukemia. We also found a similar overlap with the various other target sets as observed for MLL-AF9 (Supplementary Figure S2I, Supplementary Table S2).

Figure 3

Comparison of MLL-AF4 and MLL-AF9 target genes. (a) Overview of AF4, MLL, H3K79me2, H3K27ac and H3K4me3 binding and transcriptional activity at the ZEB2 locus in MV4-11 cells. (b) Genomic distribution of MLL-AF4 and MLL WT binding events in the ‘broad’ and ‘sharp’ modes. (c) Expression levels of MLL-AF4 and MLL WT target genes. ***P<0.001 (Welch’s t-test). (d) Distribution of MLL-AF9-specific and MLL-AF9 and MLL-AF4 common target genes (top). Distribution of MLL WT target genes in THP-1 and MV4-11 (bottom). (e) Average signal of H3K4me3, H3K27ac and H3K79me2 on MLL-AF4 and MLL-AF9 common and specific target genes. (f) Expression level of MLL-AF9 target genes shared (+) or not shared (−) by MLL-AF4 in THP-1 and MV4-11 cells (left). Expression level of MLL-AF4 target genes shared (+) or not shared (−) by MLL-AF9 in THP-1 and MV4-11 cells (right). ***P<0.001 (Welch’s t-test).

Comparing the MLL-AF9 and MLL-AF4 target gene sets (Supplementary Table S4) revealed that a significant (P=1.98e−11) 29% (277) of MLL-AF9 target genes are also targeted by MLL-AF4 (Figure 3d, top), including known MLLr targets, such as BCL2, HOXA9, MYB and BRD4. In contrast, only 3% of WT MLL target genes found in THP-1 were targeted by WT MLL in MV4-11 (Figure 3d, bottom). Next, we analyzed the activity of these common MLL fusion and MLL WT target genes versus MLL-AF9- and -AF4-specific target genes. Common MLL-AF9 and MLL-AF4 target genes, as well as those specific for MLL-AF9 and -AF4 show a comparable level of H3K27ac and H3K4me3 (Figure 3e), while H3K79me2 is slightly lower in MLL-AF9 AMLs. The presence of H3K79me2 signal on MLL-AF4 target genes confirms the deposition of this histone tail mark also on MLL-AF4 targets in AML.22, 36
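
The overlap computation described above can be illustrated with a small sketch. The text does not name the statistical test behind the reported P-value, so the hypergeometric upper tail used here is an assumption, as are the function names; the two gene sets are toy subsets built only from genes named in this article and do not represent the full target lists.

```python
from math import comb


def shared_fraction(set_a, set_b):
    """Return the genes shared by both sets and the fraction of set_a shared."""
    if not set_a:
        raise ValueError("set_a must not be empty")
    shared = set_a & set_b
    return shared, len(shared) / len(set_a)


def hypergeom_upper_tail(universe, marked, drawn, observed):
    """P(X >= observed) when drawing `drawn` genes without replacement
    from `universe` genes of which `marked` belong to the other set.
    Assumed test; the source does not state which test was used."""
    total = comb(universe, drawn)
    upper = min(marked, drawn)
    favourable = sum(
        comb(marked, i) * comb(universe - marked, drawn - i)
        for i in range(observed, upper + 1)
    )
    return favourable / total


# Toy subsets (illustration only, not the real target lists).
af9_targets = {"BCL2", "HOXA9", "MYB", "ZNF521"}
af4_targets = {"BCL2", "HOXA9", "MYB", "CDKN2A"}
shared_genes, fraction = shared_fraction(af9_targets, af4_targets)

# Small worked case: universe of 10 genes, 4 marked, 3 drawn, >= 2 shared.
toy_p = hypergeom_upper_tail(universe=10, marked=4, drawn=3, observed=2)
```

With the counts reported above, 277 of 962 genes corresponds to 28.8%, which rounds to the stated 29%. The reported P-value cannot be reproduced from this sketch because the size of the gene universe used for the test is not given.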

The gene expression levels as determined by RNA-seq are comparable for shared fusion targets in THP-1 and MV4-11, while the MLL-AF9- and -AF4-specific target genes are expressed at lower levels in MV4-11 and THP-1, respectively, with median RPKM values of 23 and 17 for the MLL-AF9-specific target genes and 25 and 18 for the MLL-AF4-specific target genes, respectively (Figure 3f). Together, this indicates that the set of shared MLL fusion target genes might represent a ‘core’ set of targets important for driving the leukemic potential of both MLL-AF9 and -AF4.

Interplay of MLLr target genes with RUNX1 and CTCF

As RUNX1 is a known factor in several types of translocated AML and ALL37, 38, 39, 40 and has been suggested to be involved in MLLr leukemias,15, 16, 41 we investigated RUNX1 DNA-binding in MLL-fusion-induced AML. In addition, it was recently shown that MLL-translocated leukemias are affected by mediator kinase inhibition.19 The mediator complex42 is associated with regulation of RNA-polymerase II at promoters and distal regulatory elements. Cohesin, important for the establishment of promoter-enhancer interactions, co-localizes with CTCF43 and mediator.44 Cohesin mutations are prevalent in non-translocated AMLs45 and CTCF is implicated in T-ALL.46 Therefore, we investigated CTCF binding at MLL fusion targets, as a proxy for mediator/cohesin binding. CTCF47 (GSM1335528) and RUNX1 show enrichment on MLL-AF9 target genes, while the signal on WT MLL genes is slightly lower for RUNX1 and almost absent for CTCF (Figure 4a, Supplementary Figure S3A). A total of 22% (215) of MLL-AF9 target gene promoters are occupied by CTCF, versus 11% for MLL WT target genes. For RUNX1, this overlap is 80% (767) and 61% (46), respectively (Figure 4b, left). Moreover, RUNX1 co-occupied MLL-AF9 target genes are more highly expressed than MLL-AF9 target genes without RUNX1 co-occupancy (Figure 4b, right). As RUNX1 binding to MLL-AF4 target genes in MV4-11 cells follows a similar pattern as discussed for MLL-AF9 in THP-1 cells (Supplementary Figure S3B), together these results suggest that targeting the RUNX1 gene program is a common feature of MLLr AMLs.

Figure 4

Characterization of MLL-AF9-bound distal regulatory elements. (a) Average signal of RUNX1 and CTCF on MLL-AF9 and MLL WT target genes. (b) Rate of co-occupancy of MLL-AF9 and MLL WT target genes by RUNX1 and CTCF (left). Expression level of MLL-AF9 target genes grouped by RUNX1 co-occupancy. ***P<0.001 (Welch’s t-test) (right). (c) Average signal on MLL-AF9 (left) and MLL WT (right) bound enhancers for H3K4me3 and H3K27ac (top), MLL and AF9 (middle) and RUNX1, CTCF and H3K79me2 (bottom). (d) Genomic distribution of MLL-AF9 and MLL WT enhancers (top left). Co-occupancy of MLL-AF9 and MLL WT bound enhancers by CTCF, H3K79me2 and RUNX1 (top right). Expression levels of MLL-AF9 and MLL WT intergenic enhancers (bottom left). Expression level of MLL-AF9-bound enhancers grouped by H3K79me2 co-occupancy. *P<0.05 (Welch’s t-test). (e) Motif family enrichment for MLL-AF9-bound enhancers. (f) Overview of HOXA locus in THP-1 cells showing alignment of AF9, MLL, H3K79me2, H3K4me3, H3K27ac, RUNX1 and RNA expression signal with the HOXA TAD boundary (gray box). (g) Pathway enrichments of active genes nearest to an MLL-AF9-bound enhancer within the same TAD. (h) Long range interactions from the BCL2 and PHLPP1 promoters as measured by 4C-seq in THP-1 cells (black bars, q<0.01) aligned with MLL, AF9 and H3K27ac ChIP-seq patterns on MLL-AF9-bound enhancers (gray boxes). Arrows highlight examples of interactions of the baited promoters with MLL-AF9-bound enhancers.

MLL fusion binding at distal regulatory regions

MLLr leukemias have largely been described to function via aberrant elongation of MLL target genes. However, as we noticed a significant portion (~25%) of MLL-AF9 and MLL WT peaks occurring in distal regions (Figure 1g, right), we next set out to characterize these putative MLL-bound enhancers. Overlapping distal MLL peaks with H3K27ac peaks yielded 342 MLL-AF9-bound and 75 MLL WT-bound active enhancers with high H3K27ac and low H3K4me3 signal (Figure 4c, top and middle, Supplementary Table S5). Interestingly, while RUNX1 was detected on both MLL-AF9 and MLL WT enhancer peaks, CTCF and H3K79me2 signals were specific for MLL-AF9-bound enhancers (Figure 4c bottom, Supplementary Figure S4A).

As MLL-AF9 and MLL WT enhancers occupy both intergenic as well as intronic regions (Figure 4d, top left), we further characterized only the set of MLL-bound intergenic enhancers to prevent mixing gene body and intronic enhancer chromatin signatures. A higher percentage of MLL-AF9-bound enhancers (39%) than MLL WT-bound enhancers (12%) was marked by H3K79me2 (Figure 4d, top right). This might reflect aberrant deposition of the histone mark by DOT1L tethered to the MLL-AF9 fusion protein. RUNX1 differences were less striking, while a CTCF peak is present in 50% of all MLL-AF9-bound intergenic regions, versus 11% for MLL WT, pointing towards increased interaction with mediator for the MLL fusion-bound enhancers. In addition, MLL-AF9-bound intergenic enhancer regions showed a significantly higher expression of enhancer RNA than MLL WT-bound regions (Figure 4d, bottom left), with MLL-AF9-bound intergenic regions co-occupied by H3K79me2 showing an even higher expression (Figure 4d, bottom right). Taken together, these findings indicate that MLL-AF9-bound enhancer regions are epigenetically more activated than their WT counterparts and might show aberrant H3K79me2 deposition and expression owing to the binding of the MLL fusion protein.

Next we wondered how distal binding sites identified in MV4-11 cells for the MLL-AF4 fusion would compare to the MLL-AF9-bound enhancers, as a difference in distal regulatory elements could potentially explain the difference in gene expression we found between the MLL-AF9- and -AF4-specific target genes (Figure 3f). MLL-AF4 active distal binding regions were similarly grouped based on H3K27ac (although average H3K27ac signal was lower as compared with MLL-AF9 distal regions), MLL and AF4 signals (Supplementary Figure S4B, Supplementary Table S5). Unlike the MLL fusion target genes, there was virtually no overlap (2%) between MLL-AF9 and MLL-AF4 enhancers, indicating that the core set of common target genes might be regulated by a variable set of regulatory regions in the different MLL fusions. Interestingly, we observed the same decrease in co-occupancy of H3K79me2 on MLL-AF4 versus MLL WT intergenic enhancers, while RUNX1 co-occupancy was slightly higher in the WT set, and no significant difference in enhancer RNA expression between MLL-AF4 and WT MLL intergenic enhancers was observed (Supplementary Figure S4C). This could indicate that, as the epigenetic landscape of MLL-AF9- and MLL-AF4-bound active enhancers is similar except for a lower H3K27ac signal, transcriptional activity of these elements might be restricted to MLL-AF9-based leukemia, potentially either as a result of differences in complex presence (DOTCOM vs SEC) or H3K27ac occupancy.

Next we set out to determine the genes these enhancers are potentially interacting with. The majority of MLL-fusion-bound enhancer regions are located between 5 and 500 kb from the nearest TSS (Supplementary Figures S4D and E). However, as the TSS closest to an enhancer is not necessarily the one it acts upon, we refined our list of closest genes by comparison with topologically associating domain (TAD) data and by including active genes only. For this, we combined the CTCF-binding data in THP-1 with TADs in human monocytes as determined by Hi-C (TK, S-YW and HGS, in preparation) to get an approximate distribution of TADs. Subsequently, we linked the MLL-AF9-bound enhancers to the closest active (H3K27ac marked) gene within the same TAD, resulting in 247 genes putatively regulated by MLL-AF9-bound enhancers, including BCL2, PHLPP1, RUNX1 and SPI1 (Supplementary Table S6). We confirmed that our THP-1 TAD list includes the boundary at the 5′ of the HOXA cluster, as described in THP-1 cells47 (Figure 4f), and performed further validation using 4C-seq experiments on the promoters of BCL2 and PHLPP1, both MLL-AF9 target genes identified in this study, as bait. This allowed us to confirm interactions formed by these promoters with MLL-AF9-bound active enhancers (Figure 4h).
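
The enhancer-to-gene assignment described above (closest active gene within the same TAD) can be sketched as follows. This is a simplified illustration under stated assumptions: a single chromosome, TADs given as half-open coordinate intervals, and hypothetical field names ('name', 'tss', 'active'); it is not the implementation used in this study, and all coordinates are invented.

```python
def find_tad(position, tads):
    """Return the (start, end) TAD containing position, or None."""
    for start, end in tads:
        if start <= position < end:
            return (start, end)
    return None


def link_enhancer_to_gene(enhancer_mid, genes, tads):
    """Return the name of the closest active gene in the enhancer's TAD.

    genes: list of dicts with hypothetical keys 'name', 'tss' and
    'active' (True when the gene is H3K27ac marked).
    Returns None when the enhancer lies outside all TADs or when its
    TAD contains no active gene.
    """
    tad = find_tad(enhancer_mid, tads)
    if tad is None:
        return None
    candidates = [
        gene for gene in genes
        if gene["active"] and tad[0] <= gene["tss"] < tad[1]
    ]
    if not candidates:
        return None
    closest = min(candidates, key=lambda gene: abs(gene["tss"] - enhancer_mid))
    return closest["name"]


# Toy data with invented coordinates.
toy_tads = [(0, 1000), (1000, 2000)]
toy_genes = [
    {"name": "geneA", "tss": 980, "active": True},    # nearest, other TAD
    {"name": "geneB", "tss": 1400, "active": False},  # same TAD, inactive
    {"name": "geneC", "tss": 1700, "active": True},   # same TAD, active
]
linked_gene = link_enhancer_to_gene(1020, toy_genes, toy_tads)
```

In the toy data the enhancer at position 1020 is linked to geneC: geneA is nearer but lies across a TAD boundary, and geneB is in the same TAD but inactive. This shows why the refined assignment can differ from a simple nearest-TSS rule.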

Interestingly, of the 247 genes regulated by MLL-AF9-occupied enhancers, 25% (61) are MLL-AF9 promoter/gene body-occupied genes, which is significantly (P<0.0001) more than in a random selection of genes. In addition, GSEA (gene set enrichment analysis) pathway analysis of this set of 247 genes revealed a strong enrichment for cancer (q<1e−5), CMYB (q<1e−5), AML (q<1e−4) and immune system (q<1e−4) pathways (Figure 4g), providing a strong indication that these genes and their putative enhancers are indeed implicated in MLL-AF9-mediated leukemogenesis and/or maintenance.

Expression of MLL-AF9 targets in primary cells

Finally, we compared gene expression of MLL fusion cell lines and primary AML blasts to CD34+ hematopoietic progenitor cells and primary human monocytes (Supplementary Table S7). We established that patient blasts cluster separately from CD34+ cells and monocytes (Figure 5a). Next we focused on the subset of MLL fusion target genes and investigated their spread by principal component (PC) analysis (Figure 5b, Supplementary Table S8). We found that primary MLL fusion samples and cell lines differ from monocytes by one principal component (PC1) and from CD34+ cells by another (PC2), which is confirmed by functional analysis of PC1-associated pathways (Figure 5c, top left) and analysis of the average RPKM of genes in PC1 in the various cell types, which revealed a higher spread of expression levels for this subset in monocytes (Figure 5c, bottom left). Analogous functional analysis of PC2 revealed enrichment for pathways more associated with dividing (CD34+) progenitor cells (Figure 5c, top right) and an average CD34+ RPKM with a higher spread (Figure 5c, bottom right). Together, this suggests that AML-associated MLL fusions impose a block during monocyte differentiation.

Figure 5

Gene expression levels of MLL-AF9 patient blasts as compared with CD34+ cells, monocytes and AML blasts. (a) Distance-based clustering on the expression of all hg19 RefSeq genes. (b) PC analysis on the expression of MLL-AF9 target genes. (c) Pathway enrichments for MLL-AF9 target genes in PC1 (top left) and PC2 (top right). Expression levels of MLL-AF9 target genes in PC1 (bottom left) and PC2 (bottom right). (d) Distribution of differentially expressed AP2, C2H2-Zf, ETS, NR, POU and T-box TFs. Green dots: Benjamini–Hochberg-adjusted P-value<0.01 and fold change >4; orange dots: Benjamini–Hochberg-adjusted P-value<0.05; red dots: fold change >1. (e) Euclidean distance clustering of 74 TFs expressed significantly different in MLL-AF9 cells versus monocytes, CD34+ and MLL-AF4 cells. (f and g) Mean RPKM of MLL-AF9 AML high (f) and low (g) expressed TFs in CD34+ cells (n=3), THP-1 (n=2), MV4-11 (n=1), monocytes (n=3), MLL-AF9 blasts (n=5), AML1-ETO blasts (CGA, n=7), CBFß-MYH11 blasts (CGA, n=11), MLLr blasts (CGA, n=11), PML-RAR blasts (CGA, n=16) and other AMLs (CGA, n=134).

To identify the transcription factors (TFs) that cooperate with MLL fusion in driving leukemogenesis, we investigated whether the expression of 878 TFs associated with the MLL-fusion-enriched motif families (Figures 2e and 4e, Supplementary Figures S2H and S4F) is different between MLL-fusion-positive cells, CD34+ cells and monocytes. We identified 146 TFs differentially expressed in one or more cell types (Figure 5d, green dots, Supplementary Table S9). Subsequently, we filtered these TFs based on an RPKM cutoff of 5 in at least one sample, similar to that of the MLL-AF9 samples, and a deviation from the mean in the same direction in at least 2 samples. We then clustered the remaining 74 TFs on expression pattern in monocytes, CD34+ cells and MLL-AF9- and MLL-AF4-expressing cells, revealing 6 TF clusters (Figure 5e), each of which is differentially expressed in MLL-AF9 cells as compared with normal cell types or MLL-AF4-expressing cells and potentially involved in co-regulating MLL-AF9-binding sites. To confirm this specificity, we compared the clustering results to the expression in other types of AML (Ley et al.48). This identified three factors, ZNF521, ZNF433 and ZNF532, for which expression (Figure 5f) was increased in MLL-AF9-positive cells only, suggesting that these collaborate with MLL-AF9 in deregulating gene expression and driving leukemogenesis. It also identified several tumor-suppressing factors, such as ETV3, NR4A1 and EGR2, whose expression (Figure 5g) is downregulated in MLLr as well as all other types of AML included in the analysis. Deregulation of, for example, ZNF521, NR4A1 and EGR2, has indeed previously been implicated in AML.49, 50, 51

In summary, these results show that MLL fusion target genes identified in this study can be divided into a group behaving more like CD34+ cells and a group behaving more like monocytes. Similarly, expression of TF family members whose motifs were enriched under MLL-AF9 and -AF4 target genes and enhancers can also be classified as CD34+ like, monocyte like or MLLr specific. Disturbance of the normal gene expression patterns of both direct MLLr targets and co-regulating TFs potentially produces the leukemogenic phenotype witnessed in MLL-translocated AML.

Discussion

In this study, we investigated the genome-wide binding and epigenetic signature of MLL-AF9, MLL-AF4 and WT MLL in AML-derived cell lines carrying the respective MLL fusions. Enrichment of H3K79me2, H3K27ac and RUNX1 signal was high on both MLL-AF9 and MLL-AF4 target genes. Enrichment of H3K79me2 confirms that deposition of this histone modification on aberrantly activated MLL fusion target genes is also a feature of MLL-AF4-induced AML, as was shown for MLL-AF4 in murine and human ALL models.22, 36 Deposition of H3K79me2 is possibly deregulated in all MLLr acute leukemias involving a component of the SEC via indirect association of the SEC with DOT1L via AF9 or ENL.52, 53 However, as H3K79me2 is enriched on all activated genes in general and not just on aberrantly activated MLL fusion targets, inhibiting DOT1L function to non-specifically stop H3K79me2 deposition9, 54 may introduce deleterious off-target effects.

Enrichment of H3K27ac on aberrantly activated MLLr target gene promoters, together with BRD4 being an MLLr target, corroborates the facilitating role of bromodomain proteins in transcription of MLLr targets,55 as evident from AML susceptibility to BET inhibition.56 This suggests a positive feedback loop where transcription of MLLr targets is facilitated by BRD4, and BRD4 is transcribed because it is an MLLr target.

The large overlap between MLLr and RUNX1-binding sites and the identification of RUNX1 as an MLLr target gene suggests that MLLr AML deregulates (a subset of) the RUNX1 program, important for hematopoietic development.57 This is reminiscent of the way the CBFß-MYH11 oncofusion modulates the expression of RUNX1 targets37 and AML1-ETO can increase the expression of a subset of RUNX1 targets.58, 59 Together with aberrant expression of genes such as MYC in most AML subtypes,60, 61 this hints at the existence of a core set of genes, including a subset of the RUNX1 program, that is important for leukemic transformation and maintenance in translocation-induced AML.

In addition to MLL fusion binding to promoter regions as expected by the consensus of MLLr acting through aberrant activation of MLL target genes, we identified a significant number of MLL-fusion-binding sites at active distal regulatory elements enriched for H3K27ac, H3K79me2 and RUNX1 and in close proximity to genes enriched for pathways related to leukemia. In light of this, it seems likely that MLL fusions not only act directly on their target genes but can also modulate the expression of target genes via distal regulatory elements. The MLL-AF4- and MLL-AF9-bound enhancers showed almost no overlap, indicating that each MLLr subtype has a distinct enhancer repertoire. This is in line with several studies linking differences in acquired and innate resistance to BET inhibition in various AML cell lines and different clones to a dynamic or variable enhancer landscape.12, 62

Moreover, MLL-AF9-bound enhancers are enriched for CTCF binding, which is in line with MLLr AML cells being responsive to treatment with mediator kinase inhibitors.19 This suggests an active role for MLL-AF9 in modulating the chromatin conformation to facilitate target gene expression via interference with the interplay between CTCF, RAD21 (cohesin) and the mediator complex.63, 64

We extracted an extended set of core MLLr target genes including known targets such as MYC, RUNX1, BCL2 and CDK6, which can potentially be used to develop new strategies for combating MLLr leukemias, for instance, by fine-tuning the targeting of existing potential treatments, such as inhibition of DOT1L,7, 9, 54 BCL2 (ref. 65) or BET (ref. 11), to MLL fusion target genes only. In addition, the uncovered sets of MLL-AF9- and -AF4-specific target genes such as ZNF521 and CDKN2A, respectively, indicate that each specific fusion partner also has its own unique binding signature, which may potentially be exploited against MLLr blasts with resistance against a more general treatment, such as BET inhibition.55

Finally, we show that gene expression of MLL-AF9 target genes can be divided into CD34+-like and monocyte-like groups, thereby keeping the MLL-AF9-positive leukemic cells in a state between CD34+ progenitor cells and fully differentiated monocytes. Likewise, TFs from families with motifs enriched under MLL fusion targets can be divided into CD34+-like, monocyte-like and MLLr-specific groups, uncovering TFs such as ZNF521 and ZNF433 that are indirectly involved in the expression or selection of MLLr target genes and tumor-suppressing TFs such as ETV3 and NR4A1 that are downregulated in MLLr leukemias. Interestingly, ZNF521 is also an MLL-AF9 target, suggesting a feed-forward loop of ZNF521 and the MLL-AF9 leukemic program.

Materials and methods

Cell culture

THP-1 (ref. 33) and MV4-11 (ref. 66) cells were routinely cultured in RPMI 1640 supplemented with 10% fetal calf serum and 1% penicillin/streptomycin at 37 °C in a humidified incubator with 5% CO2. Mycoplasma status was determined every 6 months.

Patient samples

Bone marrow samples from MLL-AF9-positive AML patients were collected at diagnosis. The study was conducted in accordance with the Declaration of Helsinki and institutional guidelines and regulations (CMO 2013/064). Patient data are summarized in Table 1.

Table 1 Patient data

ChIP and ChIP-seq

Chromatin from cell lines was harvested as described.67 ChIPs were performed using antibodies against MLL-1, AF9, AF4, RUNX1, H3K4me3, H3K27ac and H3K79me2 and analyzed by qPCR or sequencing. Relative occupancy was calculated as fold over background, for which the promoter of the Myoglobin gene was used.
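
As an illustration of a fold-over-background calculation for ChIP-qPCR, a minimal sketch is given below. The text states only that relative occupancy was calculated as fold over background using the Myoglobin promoter; the use of Ct values, the percent-of-input normalization, the assumed amplification efficiency of 2 per cycle and all function and parameter names are assumptions of this sketch, and the Ct values are invented.

```python
import math


def percent_input(ct_chip, ct_input, input_fraction):
    """Percent of input recovered in the ChIP, from qPCR Ct values.

    input_fraction: fraction of chromatin kept as input (e.g. 0.01 for 1%).
    Assumes perfect doubling per PCR cycle (efficiency of 2).
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to what a 100% input sample would give.
    adjusted_input_ct = ct_input - math.log2(1 / input_fraction)
    return 100 * 2 ** (adjusted_input_ct - ct_chip)


def fold_over_background(target, background, input_fraction):
    """Ratio of percent input at a target region to a background region.

    target / background: (ct_chip, ct_input) tuples for each primer pair;
    the background would be the Myoglobin promoter in this study.
    """
    target_signal = percent_input(target[0], target[1], input_fraction)
    background_signal = percent_input(
        background[0], background[1], input_fraction
    )
    return target_signal / background_signal


# Invented Ct values: the target amplifies 3 cycles earlier than background.
example_fold = fold_over_background(
    target=(24.0, 25.0), background=(27.0, 25.0), input_fraction=0.01
)
```

With equal input Ct values, a 3-cycle difference between target and background corresponds to an 8-fold enrichment under the assumed efficiency.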

Reverse transcriptase–qPCR and RNA-seq

Total RNA was extracted with TRIzol (Invitrogen, Bleiswijk, The Netherlands) or RNAsol (GenDepot, Barker, TX, USA), treated with DNAse on column (Qiagen, Venlo, The Netherlands) and analyzed by reverse transcriptase–qPCR or strand-specific sequencing.

Illumina high-throughput sequencing

ChIP-seq and RNA-seq libraries were prepared according to the manufacturer’s instructions (Illumina, Eindhoven, The Netherlands). All data can be downloaded from the Gene Expression Omnibus GSE79899, GSM1631708, GSM1704846 and GSM1704847 or through the Blueprint DCC (http://dcc.blueprint-epigenome.eu/#/home).