H3K27me3 Signal in the Cis Regulatory Elements Reveals the Differentiation Potential of Progenitors During Drosophila Neuroglial Development

Drosophila neural development undergoes extensive chromatin remodeling and precise epigenetic regulation. However, the roles of chromatin remodeling in establishment and maintenance of cell identity during cell fate transition remain enigmatic. Here, we compared the changes in gene expression, as well as the dynamics of nucleosome positioning and key histone modifications between the four major neural cell types during Drosophila neural development. We find that the neural progenitors can be separated from the terminally differentiated cells based on their gene expression profiles, whereas nucleosome distribution in the flanking regions of transcription start sites fails to identify the relationships between the progenitors and the differentiated cells. H3K27me3 signal in promoters and enhancers can not only distinguish the progenitors from the differentiated cells but also identify the differentiation path of the neural stem cells (NSCs) to the intermediate progenitor cells to the glial cells. In contrast, H3K9ac signal fails to identify the differentiation path, although it activates distinct sets of genes with neuron-specific and glia-related functions during the differentiation of the NSCs into neurons and glia, respectively. Together, our study provides novel insights into the crucial roles of chromatin remodeling in determining cell type during Drosophila neural development.

Abstract Drosophila neural development undergoes extensive chromatin remodeling and precise epigenetic regulation. However, the roles of chromatin remodeling in establishment and maintenance of cell identity during cell fate transition remain enigmatic. Here, we compared the changes in gene expression, as well as the dynamics of nucleosome positioning and key histone modifications between the four major neural cell types during Drosophila neural development. We find that the neural progenitors can be separated from the terminally differentiated cells based on their gene expression profiles, whereas nucleosome distribution in the flanking regions of transcription start sites fails to identify the relationships between the progenitors and the differentiated cells. H3K27me3 signal in promoters and enhancers can not only distinguish the progenitors from the differentiated cells but also identify the differentiation path of the neural stem cells (NSCs) to the Introduction Chromatin structure and state regulate many biological processes through controlling DNA accessibility. In general, chromatin structure is more open in the pluripotent stem cells than in the differentiated cells [1]. As the basic repeating unit of chromatin, nucleosome is important for establishing chromatin architecture. Therefore, nucleosome organization in the genome plays a critical role in modulating transcription, DNA replication, DNA repair, and other DNA templatebased processes [2]. Nucleosome eviction and the resultant gene activation promote endodermal differentiation of mouse embryonic stem cells (mESCs) [3]. Nucleosome repositioning and variable nucleosome repeat length also have regulatory functions in cell lineage commitment [4]. Differential nucleosome occupancy between somatic cells and pluripotent stem cells often occurs in the regions containing binding sties for the pluripotency transcription factors OCT4, SOX2, etc. [5]. Thus, dynamic nucleosome positioning is a key regulatory mechanism underlying cell fate transition.
Similarly, histone modifications (HMs) are also involved in the regulation of many biological processes. HMs can serve a signal to recruit non-histone proteins to the target DNA sites [6]. Therefore, changes in HMs can lead to alterations in the chromatin state. Both global and gene-specific HM changes are required for mESC differentiation [7]. For instance, extensive HM changes occur and have impact on the differentiation of mESCs into neural progenitor cells [8]. In humans, H3K9ac signal recruits SOX2 and PAX6 to their target sites to facilitate the neuroectodermal differentiation from ESCs [9]. Interestingly, underacetylated H4 is essential for X-inactivation after mESC differentiation [10]. Profiling HMs across 16 developmental stages of hematopoietic differentiation in mice reveals the commitment path of each lineage without erroneously clustering intermediate progenitors of different cell lineages together [11]. Therefore, in addition to the pivotal regulatory role in the differentiation, HMs are able to reveal the differentiation potential of progeny.
Our previous study shows that during the early Drosophila embryonic development, formation of nucleosome-depleted regions (NDRs) in enhancers activates gene transcription for neural stem cells (NSCs) to differentiate into neurons. H3K27ac and H3K9ac changes in promoters, in coordination with their changes in enhancers, regulate gene expression during this process [12]. We also reveal chromatin remodeling patterns and the associated functions in the glial differentiation during early Drosophila embryonic development [13]. However, little is known about the epigenetic regulatory network of chromatin remodeling during neurogenesis from NSCs to intermediate progenitors to terminally differentiated neural cells. It also remains unclear which HM(s) can reveal the differentiation potential of intermediate progenitors.
To address the questions above, we integrated RNA-seq, MNase-seq, and HM ChIP-seq data of four Drosophila neural cell types: NSCs, intermediate neural progenitors with lineage commitment to the glial cells, as well as the neuronal cells and the glial cells. Our systematic epigenomic analysis reveals that H3K27me3 signal in the regulatory regions could establish the differentiation paths of NSCs and the relationships among NSCs, intermediate progenitors, and terminally differentiated neural cells.

Results and discussion
Gene expression profiles distinguish the neural progenitor cells and the differentiated cells During Drosophila neuroglial development, upon the activation of the complicated gene regulatory network, the NSCs differentiate into glia and neurons, the two major types of neural cells ( Figure 1A). As a transcription factor (TF), glial cell missing (Gcm) drives the glial differentiation [14]. The multipotent neural progenitor cells are committed to the glial cells when Gcm is expressed [15] (in this study, we name the Gcmexpressed neural progenitor cells as GNP cells).
Global gene expression profiles contain information regarding the transcriptional state of cells and can distinguish between cells of different types or states [16,17]. Therefore, comparison of the gene expression profiles could allow us to identify the difference between these four neural cell types. Our results show that these cells are clustered into two groups: the progenitors (NSCs and GNP cells) and the terminally differentiated cells (neurons and glia) ( Figure 1B). Nevertheless, the gene expression profiles failed to build the differentiation path of the NSCs to the GNP cells to the glial cells, indicating that gene expression may primarily reflect the pluripotency during Drosophila neuroglial development.

Nucleosome organization in promoters fails to reveal the differentiation potential of intermediate progenitors
Accurate nucleosome remodeling, especially around transcription start sites (TSSs), is critical to cell fate transition [18]. Therefore, we profiled nucleosome organization around TSSs. The result shows the typical arrangement at À1, NDR, +1, +2, +3 nucleosomes, etc. around TSSs in all four cell types. The nucleosome array in the regions downstream of TSSs gradually weakens as nucleosome positioning extends into the NDR borders to shorten the NDRs. The nucleosome array finally disappears when the NDRs are fully occupied by nucleosomes ( Figure 2A). Further unsupervised hierarchical clustering analysis based on the nucleosome occupancy in the NDRs grouped the NSCs and the GNP cells together and left neurons and glia outside of the group ( Figure 2B). The resultant clustering structure deviates from the lineage-commitment relationships between the four cell types. This suggests that nucleosome organization in the NDRs alone is not able to identify the unique and consistent difference between the cell types.

H3K27me3 signal in the promotors reveals the lineage-commitment paths of NSCs
We next examined HM changes in the promoters during the neuroglial development. Gene transcription is poised when the promoter is marked by both H3K4me3 and H3K27me3, i.e., bivalent [19]. Bivalent promoters are rarely seen in Drosophila genome [20]. The low level of H3K9ac is important to the glial differentiation in Drosophila [15]. Therefore, we profiled H3K9ac and H3K27me3 in the Drosophila genome using ChromHMM [21] and classified the genome into four chromatin states: H3K9ac + , H3K27me3 + , both (H3K9ac + / H3K27me3 + ), and unmarked. Almost half of the genomic regions are marked by H3K9ac in both NSCs and GNP cells, whereas only an insignificant fraction (14.9% and 15.7%) of the genome is marked by H3K9ac in neurons and glia ( Figure 3A). Moreover, most of H3K9ac + regions are maintained from NSCs to GNP cells. Although most of H3K9ac + , H3K27me3 + and H3K9ac + /H3K27me3 + regions in GNP cells are inherited from NSCs, approximately one third of H3K9ac + and half of H3K27me3 + regions in NSCs become unmarked in GNP cells. This suggests that there exist characteristic chromatin states between NSCs and GNP cells, albeit both are pluripotent progenitors. In contrast, neurons and glia share similar genome-wide chromatin states. As expected, chromatin states are prominently different between the progenitor cells and the differentiated cells ( Figure 3A).
The characteristic difference in chromatin states of the four cell types prompted us to examine that specific HM(s) could be used to construct the neuroglial development pathways. We first analyzed the correlation between HMs in the promoters   Figure 3B). Specifically, the H3K27me3 change in the promoters is inversely correlated with the change in gene expression from NSCs to GNP cells ( Figure S1 and Figure S2A). Consistently, it has been reported that the signals of H3K9ac and H3K27me3 in the promoters are predictive markers for gene activity in the neuroectodermal differentiation of human ESCs [9]. We next classified the four neural cell types using the signals of H3K9ac and H3K27me3 in the promoters. The result shows that H3K9ac signal in the promoters separates the progenitor cells (NSCs and GNP cells) from the differentiated cells (neurons and glia) ( Figure 3C). In contrast, H3K27me3 signal in the promoters constructs the lineage-commitment paths of NSCs ( Figure 3D). Together, chromatin states in promoters are predictive for gene activity. Furthermore, H3K27me3 signal in the promoters can reveal the differentiation potential of progenitor cells.

H3K27me3 signal in the enhancers reveals the differentiation potential of progenitor cells
Enhancers are distal cis regulatory elements critical to establish and maintain cell identity [22,23]. Since H3K4me1 is the predictive chromatin signature of enhancers [24], we identified enhancers by predicting H3K4me1 peaks using HOMER [25] (details in Methods). In addition, HMs in enhancers determine the chromatin state [23]. For example, enhancers with H3K27me3 are poised. Thus, we analyzed the correlation between HMs in enhancers and gene activity. The results show that H3K9ac signal in enhancers is not predictive for gene activity but can separate the progenitor cells from the differentiated cells ( Figure 4A and B). However, H3K27me3 signal in enhancers is not only predictive for gene activity but also can construct the neuroglial development path (Figure 4A and C). Specifically, the H3K27me3 change in the enhancers is inversely correlated with the change in gene expression from NSCs to GNP cells ( Figure S2B). This result agrees with the reported finding that enhancer establishment and its chromatin state can reveal the differentiation potential of progeny during hematopoiesis [11].
H3K9ac signal in enhancers is associated with neuron-and gliarelated functions Although H3K9ac signal in promoters and enhancers fails to reveal the differentiation potential of progenitor cells ( Figure 3C and Figure 4B), a low level of global H3K9ac is required for gliogenesis in Drosophila [15]. Conversely, we found that H3K9ac signal in promoters was increased in the glia-specific genes up-regulated during the glial differentiation in Drosophila [13]. Additionally, H3K9ac signal in promoters was increased during the neuronal differentiation in Drosophila [12]. This indicates that H3K9ac signal in promoters activates distinct sets of genes with neuron-and glia-related functions during their differentiation.
Similarly, in order to understand the distinct functions of H3K9ac in enhancers in neurons and glia, we compared the neuron-and glia-specific H3K9ac + enhancers and find that approximately half of H3K9ac + enhancers are specific to neurons or glia ( Figure 5A). We define the nearest gene of an enhancer as its target gene. Consequently, we obtained 92 target genes for the neuron-specific H3K9ac + enhancers and 79 target genes for the glia-specific H3K9ac + enhancers, respectively (Table S1), among which six genes were commonly  Figure 5B), whereas gliaspecific target genes are enriched for nervous system development, tissue morphogenesis, etc. ( Figure 5C). This suggests that H3K9ac signal in enhancers activates the target genes with functions specific to neural cell subtypes.
Dynamics of HMs in promoters and enhancers have been reported to synergistically regulate gene activity in the cell fate transition [26]. We have also revealed that HM changes in promoters and enhancers synergistically regulate gene expression during the neuronal differentiation in Drosophila embryos [12]. Here, we find that HMs in promoters and enhancers are crucial for Drosophila neuroglial development. However, it is not clear whether there exists looping between enhancers and promoters for their synergistic regulation. Further study of chromatin interactions through chromatin conformation capture carbon copy (5C) or Hi-C technology could resolve this issue.

Data collection
The high-throughput sequencing data of the four types of neural cells (NSCs, GNP cells, glia, and neurons) were from our recent work [12,13]. In brief, four distinct Drosophila strains were generated for purification of these cells. NSCs and GNP cells were isolated from 5 to 7 h after egg laying (AEL) embryos (stage 11), whereas glia and neurons were isolated from 12 to 14 h AEL embryos (stages 15 and 16). The nuclei specific for the four types of neural cells were purified from the corresponding embryos using INTACT technology [27].
Nuclear RNA was extracted from nuclei that were isolated with the RNeasy Micro Kit (Catalog No. Y5-74004, Qiagen, Valencia, CA). Mononucleosomes were obtained from the nuclei by MNase digestion. Nucleosomes with specific HMs were further purified using the corresponding HM antibodies. The purified RNA, as well as nucleosomal DNA with and without HMs, was used to construct sequencing library with standard Illumina library prep protocols. The sequencing was performed on Illumina HiSeq2000 system with read length of 49 bp, single end. The sequencing datasets were deposited in the Gene Expression Omnibus database with the accession numbers GSE80458 and GSE83377.

RNA-seq analysis
We mapped sequencing reads to the annotated Drosophila transcripts (FlyBase r5.43) using the tool Tophat (v1.3.1) with the default parameters [28], and retained the uniquely mapped reads to calculate gene expression levels using Cuffdiff (v1.3.0) [28]. We calculated and normalized gene expression levels as read per kilobase per million mapped reads (RPKM), and identified the differentially expressed genes (DEGs) between two samples with false discovery rate (FDR) < 0.05.

MNase-seq analysis
We mapped the nucleosomal reads to the reference genome (dm3) of Drosophila using Bowtie with maximal two mismatches [29]. We next retained the uniquely mapped reads for nucleosome analysis. We then calculated nucleosomal read counts using a 10 bp bin in the 2 kb regions centered at TSS. We further calculated and normalized the read count in each bin as read per million mapped reads (RPM) that was represented in heatmaps. Regions ranging from 150 bp upstream to 50 bp downstream of TSS were defined as NDRs. Genes were ascendingly ordered by nucleosome occupancy in the NDRs.

HM signal in promoters and enhancers
We mapped the HM reads to the reference genome (dm3) of Drosophila using Bowtie with maximal two mismatches [29], and retained the uniquely mapped reads for downstream analysis.
Regions ranging from 1 kb upstream to 1 kb downstream of TSS were defined as the promoters. We calculated and normalized HM read counts in the promoters as RPM in the same way of MNase-seq analysis mentioned above.
Enhancers were defined as H3K4me1 peaks predicted by HOMER [25] with a 1 kb sliding window and FDR of 0.001. H3K4me1 peaks overlapping with promoter regions were discarded. H3K9ac and H3K27me3 read counts in the enhancers were calculated and normalized as RPM in the same way as for the promoters.