Dynamic super-enhancer core regulatory circuits and epigenetic landscapes drive malignant progression and refractory disease in multiple myeloma

The plasma cell malignancy multiple myeloma (MM) evolves from a pre-malignant state and remains all but incurable due to emergence of therapy resistance. Despite intensive analyses, mechanisms driving MM progression and refractory disease are poorly understood. Integrating topologic, expression and epigenetic analyses of 1,016 patient specimens, we report super-enhancer core regulatory circuits (SECRCs) that drive and sustain MM epigenetic states. Reprogramming of cell identity and tumor microenvironment genes drive malignant conversion, while alterations in cell cycle and metabolic control genes cause refractory disease. Thus, select epigenetic states drive progression and therapy resistance, providing strategies to prevent and effectively treat MM. While MM refractory disease is the product of selection for the fastest growing therapy-refractory clones during relapse and tumor re-growth. b, We have explored the central biology involved in MM progression and refractory disease by leveraging a new database of combined clinical and molecular data, from a cohort of 1,016 bone marrow aspirates collected from patients across MM spectrum treated at Mott Cancer Center. c, Differential gene expression associated with MM progression or d, refractory disease. e, Differential relevance (based on enrichment score) of Cancer Hallmarks during MM progression or f, refractory disease. Differential gene expression was calculated on 16,738 expressed genes (z-normalized as Log2(FPKM+10-3)). Single sample gene set enrichment analysis (ssGSEA) was used to calculate the enrichment score of each gene set per sample. Unpaired two-sided t-tests with multiple test correction (q-value < 1%) were conducted in all comparisons above.


Results
Transcriptional topology reveals regulatory pathways in MM biology. To identify the mechanisms driving transcriptional alterations across MM spectrum, we recreated a transcriptional topology of MM using dimensionality reduction analysis (t-SNE 26 Figure 3a). Supercluster a's expression correlated with 14 hallmarks, including protein secretion, oxidative phosphorylation, and DNA repair, while the expression of supercluster b correlated with 17 hallmarks involved in EMT, in ammation, immunogenicity and hypoxiaresponse. These observations suggested that these biological pathways are coordinated by a shared mechanism of transcriptional regulation. Importantly, the integrity of these superclusters was conserved in the t-SNE (Figure 3a), and their differential expression, as estimated by ssGSEA NES, signi cantly changed, with a increasing and b decreasing (Figure 3b), along transitions to NDMM or LRMM.
Independently, genes differentially expressed during progression and refractory disease also co-localized in the transcriptional topology, suggesting shared transcriptional mechanisms (Figure 3c Epigenetic modi ers are dysregulated in MM. Using the pathway analysis aggregator Enrichr 28 , based on Chip-seq data (ENCODE 29 and Chea consensus), we identi ed DNA-binding proteins (DBP) with enriched binding to the genes differentially expressed between MM states. 13 DBPs were enriched for the genes differentially expressed between MGUS/SMOL and NDMM, whereas 75 DBPs were enriched for the NDMM-to-LRMM transition (Extended Data Figure 3d-e). While the former included genes associated with PRC2 complex 30 and plasma cell differentiation 31 (SUZ12, EZH2 and SOX2), the latter were notably  associated with telomere formation/TERT activation in cancer (RFX5, CEBPB, MAX, SP1, MYC, STAT3,   E2F1, KLF4, ETS1, USF2, USF1) 32 as well as the formation of topologically associated domains (TADs), essential to the formation of super-enhancers and establishment of cell identity 29,33,34 (YY1, CTCF, GABPA and NRF1). Given the role of these DBPs, and the contribution of epigenetics to cancer 1,7 and hematopoietic cell identity 31 , we have investigated the putative role of methylation or acetylation modi cations of histone 3 lysine 27 in transcriptional dysregulation across MM spectrum. Figure 3e depicts gene clusters individually enriched for histone modi cations responsible for increased (H3K27ac) or decreased (H3K27me3) chromatin accessibility [35][36][37] , based on enrichment score calculated from Chipseq data repositories. H3K27ac-and H3K27me3-enriched gene clusters coalesced in two large regions of the t-SNE, suggesting the overall transcriptional topology of MM is de ned by two epigenetic regulatory mechanisms that control cell differentiation or super-enhancers. Differential expression of these two putatively epigenetically-regulated gene sets, as calculated by ssGSEA NES, revealed signi cant suppression of H3K27me3-enriched genes between MGUS/SMOL and NDMM samples, while the expression of H3K27ac-enriched genes increased from NDMM to LRMM (Figure 3f). These ndings were also observed in 8 individual sequential samples of MM patients in this cohort (Extended Data Figure 4). Collectively, these data suggest that the transition from pre-malignant to active disease is characterized by increased H3K27me3 histone modi cation, and consequent reduced chromatin accessibility and decreased expression, of gene sets associated with TME-dependency and cytokine-mediated signaling cascades. In turn, refractory MM correlated with increased H3K27ac histone modi cation, chromatin accessibility and expression of gene sets involved in cell cycle and metabolism. Additionally, we have observed a strong correlation between expression of epigenetically regulating genes with opposing roles, such as HAT1 and HDAC1 as well as SUZ12 and KDM1A (Extended Data Figure 5), suggesting that increased activity of acetylating/de-acetylating, or methylating/demethylating enzymes must be coordinated, to maintain distinct epigenetics states required for transitions to NDMM or LRMM. Together with SOX2, POU5F1 is part of the Yamanaka factors, which regulate pluripotency and cell differentiation 12 and, with NANOG, these pluripotency-inducer transcription factors ("OSN") modulate gene repression by recruitment of Polycomb group chromatin regulators, which induce reduction in chromatin accessibility through spread of H3K27me3 [43][44][45][46] , disrupting cohesin and CTCF and YY1 binding to chromatin, ultimately disassembling TADs in non-OSN SEs. These ndings suggest a coordinated mechanism behind the opposing accessibility pattern of OSN and CTCF and YY1 biding sites 47 . Six DBPs were both enriched for binding to differentially expressed genes (Extended Data Figure 3d) and showed differential motif accessibility in the MGUS-to-NDMM transition: RUNX1, TCF3, GATA2, IRF8, ESR1 and TP63. Twenty-nine others matched these requirements for the NDMM-to-LRMM transition, among these were RFX5, USF2, YY1, NRF1 and POU5F1. We have identi ed 1,883 genes whose differential expression across disease states was associated with changes in chromatin accessibility at promoter and distal levels. These genes were used in a pseudotime analysis 48 to estimate the trajectory from SMOL-to-LRMM samples, based on overall (distal + promoter) single cell chromatin accessibility (Figure 5a-b). This topology suggested increased inter-patient tumor heterogeneity with disease progression, with cells from different SMOL samples more closely located, while NDMM and LRMM samples were markedly further away in pseudotime. Super-enhancer-regulated self-regulatory gene networks maintain MM transcriptional dysregulation. SEbased gene regulation is central to maintaining cell identity, cancer progression and drug response 41,42,49 .
SE-regulated transcription factors (TF) form self-reinforcing transcriptional networks through binding to their own SE regions, stabilizing chromatin in an accessible state (H3K27ac), dramatically increasing their transcription and that of thousands of downstream genes, as a means of epigenetic memory 50 .
These networks are termed SE-regulated core regulatory circuits (SECRC) 50 . We generated a transcriptional network, based on a previously established method 50,51 , to evaluate the role of SECRCs in transcriptional reprogramming in MM 37,50,51 . This analysis identi ed 60 SECRC TFs and 1,865 downstream genes regulated by enhancers with transcription factor binding sites identi ed for at least one the 60 SECRCs, as determined by dbSuper/GeneCards, in a consensus of B cell lineages. We have investigated changes in chromatin accessibility of these 60 SECRCs across a pseudotime trajectory connecting 3 SMOL, 2 NDMM and one LRMM sample (Figure 5b-c). EBF1 and ZEB2 were examples of genes with no signi cant changes, while RFX5, YY1 and CTCF showed marked increase in accessibility.
In addition, RFX5 was notable as a transcription factor whose changes in expression, motif and combined enhancer/promoter accessibility signi cantly increased towards refractory disease (Extended Data Figure 6e-g), suggesting a driver role. This concept was reinforced by Pearson correlation between RFX5 expression (RNAseq) and chromatin accessibility (scATAC-Seq) of YY1 and POU5F1 groups in seven MM samples with both datasets available (Extended Data Figure 7). Additionally, Pearson correlation between motif accessibility and ssGSEA of gene clusters, cancer hallmarks and KEGG pathways in these seven samples provided biological pathways and gene sets positively-or negativelycorrelated with YY1 and POU5F1 motifs (Extended Data Figure 8). These observations a rm the correlation between chromatin accessibility of OSN and TAD motifs and differential expression of pathways involved in MM progression and refractory disease, as well as expression of SECRCs (e.g., RFX5) and the accessibility of these motifs.
We thus investigated SECRCs that could putatively regulate OSN or TAD-associated chromatin accessibility motifs, and thus drive MM progression: 8 of the 60 SECRCs were under-expressed and 3 over-expressed in NDMM when compared to MGUS/SMOL samples. Notably, among the under-expressed SECRC were RB1, ELF1, IKZF1, FOXP1 KLF6, SPI1, ZEB2, and EBF1. The most differentially underexpressed SECRC was EBF1 (Figure 5d), which regulates B cell maturation and with SPI1 52 stabilize chromatin accessibility 53,54 . RB1 55  Multifactorial in uences of the MM epigenetic landscape. Our data indicated that transcriptional reprogramming follows an evolving epigenetic landscape. Epigenetics has been linked to the high-risk translocation of IgH enhancer-driven expression of the methyltransferase NSD2/MMSET in t(4;14)positive tumors 38,62 . Further, MM progression is associated with accumulation of cytogenetic abnormalities and mutations in putative driver genes 5 . Therefore, we also examined changes in mutational load of individual samples and mutational frequency of individual genes associated with progression (MGUS, n=64; SMOL, n=65; and NDMM, n=199) and refractory (ERMM, n=342; and LRMM, n=146) disease states (Extended Data Figure 9). In agreement with previous studies 5 , we observed increased mutational load per sample, and mutational frequency of putative "driver genes", across disease state (Figure 6a-b). Only two of these genes demonstrated statistically different frequency between two disease states, KRAS and NRAS, between MGUS and SMOL (Extended Data Figure 10a-c) 63 , although neither could solely account for disease progression and refractory disease. These observations led us to investigate whether well-established genetic aberrations in MM could converge within a common transcriptional dysregulation mechanism, thus de ning a common "driver biology". Figure 6c-d and Extended Data Figure 10d-k illustrate the differentially expressed genes and gene clusters in samples positive or negative for frequently mutated genes and cytogenetic abnormalities. Notably, mutations in NRAS and KRAS, as well as del13q, correlated with differential expression associated with transition from pre-malignant to active disease, while TP53 mutation or loss in del17p was associated with refractory disease. In contrast, amp1q21 is associated with differential expression of both transitions, suggesting it contributes to both pre-malignant and refractory disease. We have con rmed that amp1q21, a marker of high-risk MM and disease progression 64 , correlated with over-expression of RFX5 and MCL1, both genes being located in 1q21 region, although samples without amp1q21 also show comparable expression of these genes, suggesting amp1q21 as a su cient but not necessary condition for RFX5 over-expression (Extended Data Figure 10l). Additionally, scATAC-seq analysis of LRMM samples demonstrates dramatic increase in chromatin accessibility of RFX5, but only moderate for MCL1 (Figure 5c), indicating the former's association with SE activation, but not the latter's. t (11;14) was associated with a unique pattern of differential expression, but which did not match either pattern identi ed for progression or refractory disease, con rming its neutral prognostic impact 65 . Moreover, mutations in large genes such as TTN, MUC4 and MUC16, despite increased frequency across MM spectrum (Extended Data Figure 9b), did not signi cantly correlate with transcriptional reprogramming associated with disease progression or refractory disease, serving as a negative control for this analysis (Figure 6c, Extended Data Figure 10 j-k).

Discussion
Integrating WES, RNA-seq and scATAC-seq of a new cohort of MM patient specimens, we have shown that a unifying SECRC control of distinct epigenetic transcriptional programs drives MM progression and refractory disease. Despite being associated with increased mutational load per sample, as well as increased mutational frequency of putative driver genes (e.g. KRAS, NRAS, TP53) 66 , adaptations in MM biology between MGUS-to-NDMM and NDMM-to-LRMM states are fundamentally implemented by transcriptional reprogramming. The transition from pre-malignant to active MM is dominated by reduced expression of genes linked to cell identity, cell adhesion and in ammation, whereas hallmarks of refractory MM are associated with cell cycle, DNA repair, metabolism, protein and RNA synthesis and degradation. Underscoring the importance of these transcriptional programs is the fact that they are prognostic in de ning time to progression of SMOL patients and overall survival of patients with active disease.
Notably, integrating the topology of MM transcriptional control and pathway analysis, revealed that epigenetic-mediated changes in chromatin accessibility are the driving mechanism of transcriptional changes observed in MM progression and refractory disease. Speci cally, there is an enrichment for H3K27me3 and PRC2-involved DBPs (SUZ12 and EZH2) in under-expressed genes that characterize the MGUS-to-NDMM transition, while H3K27ac and TAD-forming DBPs (YY1, CTCF, GABPA and NRF1) are enriched in the progression to refractory disease. ScATAC-seq analyses of primary MM samples across MM spectrum con rmed these conclusions and revealed that the two families of motifs are inversely correlated with chromatin accessibility patterns, where (1) those of the OSN factors POU5F1 and SOX2 (among others) have increased accessibility from SMOL to NDMM and a decrease in therapy refractory disease, and (2) those of CTCF, YY1, NRF1 (among others) that direct the formation of topologically associating domains (TADs) in chromatin, had the inverse behavior. Given the roles of OSN factors in inducing iPSCs, and those of TADs in establishing SEs and cell identity, we propose that the progression from pre-malignant towards refractory disease should be described as "dysdifferentiation", a term previously proposed to explain loss of cell identity in cancer cells 67 , where pre-existing epigenetic mechanisms of transcriptional control are hijacked through oncogenic events (e.g., mutations, translocations, SECRC dysregulation, etc.).
Mechanistically, transcriptional modeling and CRISPR tness screens revealed that select SECRCs drive and sustain MM states. Such stable self-regulating transcriptional circuits provide a form of epigenetic memory 52 , and MM progression and refractory disease were speci cally associated with SECRCs that are either under-expressed (e.g. EBF1, ZEB2, during MM progression) or overexpressed (e.g., IRF4, PRDM1 during MM progression and RFX5, YY1, CTCF in the transition to LRMM). Notably, publicly available CRISPR screens of 20 human MM cell lines revealed that NDMM-to-LRMM associated SECRCs were necessary for cell tness, suggesting targeting such SECRCs as a therapeutic strategy to overcome the emergence of refractory disease.
Pre-malignant MM cells are limited to the perivascular niche 68 , but as they acquire independence from the TME, they expand into other niches of the marrow (e.g. endosteum, interstitium), leading to increased tumor burden 69 and active MM. TME-independence is acquired by a process akin to EMT in epithelial cells 70 , involving loss of hematopoietic cell markers and adhesion molecules. Following the initiation of treatment, MM cells are selected for two traits: resistance to therapy and accelerated growth, where the fastest growing clones among the therapy-resistant cells re-populate the tumor during relapse.
The role of MM well-established mutations, and cytogenetic abnormalities, in regulation of epigenetic modi cations in histones has been previously characterized in different cancers and iPSC 71,72 . More speci cally, the OSN core transcriptional network is modulated by TGF-β, MAPK, Akt-PI3K and Wnt signaling pathways, among others [73][74][75][76] . Here, we have shown the correlation between the occurrence of these genetic events and transcriptional dysregulation associated with MM progression and refractory disease. Collectively, our ndings support a uni ed model of MM progression and emergence of refractory disease, where niche and therapy adaptations are result of genome-wide transcriptional reprogramming via hijacking of OSN, TAD and SECRC networks, which can be provoked by combinations of mutations, cytogenetic abnormalities, or environmental stress-induced epigenetic dysregulation 77 (Figure 6d-f). Among the new putative targets identi ed herein was RFX5, a TF that was originally identi ed in MHCII transcriptional regulation 78,79 yet that also controls telomere maintenance 21,22 . Critically, our data indicate that clinical management of MM patients should include therapeutic targeting of altered transcriptional programming, to both mitigate malignant transformation and/or prevent development of drug resistance, and that the signi cant inter-patient transcriptional heterogeneity manifest within clinical classi cations, such as SMOL and therapy naïve MM, will require precision therapeutic approaches.

Additional Information
Supplementary Information is available for this paper online.
Correspondence and requests for materials should be addressed to K.H.S. and A.S.S.
Biological materials availability: All analysis was performed on primary samples from Mo tt Cancer Center patients enrolled in the ORIEN/AVATAR consortium study, and thus are not publicly available.
Code availability: Not custom software was developed for this study, except for scripts used for data formatting, ltering and sorting. All other software from third parties were cited with version details in the  Differential transcriptional pro le across MM spectrum. a, Model of MM clinical evolution suggests that transition from pre-malignant -monoclonal gammopathy of undetermined signi cance (MGUS) or smoldering MM (SMOL) -to active disease (newly diagnosed, NDMM) consists of loss of TME dependency and accumulation of genetic abnormalities. Successive lines of therapy response are followed by early relapses (ERMM, 1-3 lines of therapy) and eventually refractory disease (late relapse MM, LRMM, >3 lines of therapy), characterized by high expression of cell cycle genes and increased genetic heterogeneity. While MM progression is selected by TME-imposed restrictions (i.e., limited availability of plasma cell-supporting niches in the marrow), refractory disease is the product of selection for the fastest growing therapy-refractory clones during relapse and tumor re-growth. b, We have explored the central biology involved in MM progression and refractory disease by leveraging a new database of combined clinical and molecular data, from a cohort of 1,016 bone marrow aspirates collected from patients across MM spectrum treated at Mo tt Cancer Center. c, Differential gene expression associated with MM progression or d, refractory disease. e, Differential relevance (based on enrichment score) of Cancer Hallmarks during MM progression or f, refractory disease. Differential gene expression was calculated on 16,738 expressed genes (z-normalized as Log2 (FPKM+10-3)). Single sample gene set enrichment analysis (ssGSEA) was used to calculate the enrichment score of each gene set per sample. Unpaired two-sided t-tests with multiple test correction (q-value < 1%) were conducted in all comparisons above. disease states. b -f, Samples were displayed in 5 plots, according to disease state, to facilitate visualization. Percentages depict the proportion of samples in each quadrant. g, Distribution of samples from each disease state in the 4 PC quadrants annotated in a. MGUS samples are primarily located in quadrants Q1 and Q2, while LRMM samples are primarily located in quadrants Q3 and Q4. Samples in quadrants Q1 or Q2 were labeled as "MGUS-like" while those in quadrants Q3 and Q4 were labeled as "LRMM-like". "MGUS-like" patients had a better prognosis than "LRMM-like", as measured by (h) time to progression of SMOL patients, as well as overall survival of NDMM (i), ERMM (j) and LRMM (k) patients, indicating that the enrichment of speci c cancer hallmarks has clinical implications.