NCOR1 and OCT4 Facilitate Early Reprogramming by Co-Suppressing Fibroblast Gene Expression

Reprogramming somatic cells to induced pluripotent stem cells (iPSC) succeeds only in a small fraction of cells within the population. Reprogramming occurs in distinctive stages, each facing its own bottlenecks. It initiates with overexpression of transcription factors OCT4, SOX2, KLF4 and c-MYC (OSKM) in somatic cells such as mouse embryonic fibroblasts (MEFs). OSKM bind chromatin, silencing the somatic identity and starting the stepwise reactivation of the pluripotency program. However, inefficient suppression of the somatic lineage leads to unwanted epigenetic memory from the tissue of origin, even in successfully generated iPSCs. Thus, it is essential to shed more light on chromatin regulators and processes involved in dissolving the somatic identity. Recent work characterized the role of transcriptional co-repressors NCOR1 and NCOR2 (also known as NCoR and SMRT), showing that they cooperate with c-MYC to silence pluripotency genes during late reprogramming stages. NCOR1/NCOR2 were also proposed to be involved in silencing fibroblast identity, however it is unclear how this happens. Here, we shed light on the role of NCOR1 in early reprogramming. We show that siRNA-mediated ablation of NCOR1 and OCT4 results in very similar phenotypes, including transcriptomic changes and highly correlated high content colony phenotypes.. Both NCOR1 and OCT4 bind to promoters co-occupied by c-MYC in MEFs. During early reprogramming, downregulation of one group of somatic MEF-expressed genes requires both NCOR1 and OCT4, whereas another group of MEF-expressed genes is downregulated by NCOR1 but not OCT4. Our data suggest that NCOR1, assisted by OCT4 and c-MYC, facilitates transcriptional inactivation of genes with high expression in MEFs, which need to be suppressed to bypass an early reprogramming block. This way, NCOR1 facilitates early reprogramming progression.


INTRODUCTION
Induced pluripotent stem cells (iPSCs) are generated in vitro by overexpressing factors OCT4, SOX2, KLF4 and c-MYC (OSKM factors) in somatic cells [1]. iPSCs have been successfully used in disease modeling and cell transplantation, showcasing their relevance in regenerative medicine [2]. Despite significant improvements to the original protocol [3][4][5], in general, only a small percentage of somatic cells become pluripotent [6]. The reason is that, as embryonic development proceeds, cells gradually commit to a certain lineage, and acquire full stability upon differentiation [7]. Thus, in vivo, differentiated cells are not able to become other cell types, and such stability is conferred by chromatin state. Chromatin state defines the properties of regulatory elements as a function of DNA sequence, histone modifications, DNA methylation, chromatin architecture and the activity of transcription factor networks [7]. To understand reprogramming and develop new or improve existent iPS generation protocols, it is crucial to understand the interplay of chromatin modulators and transcriptional regulation.
Pluripotency acquisition involves silencing of somatic genes [8], and reactivation of the pluripotency network, which occurs in a stepwise manner [9]. The OSKM reprogramming transcription factors play different roles, but assist the two processes [10]. For instance, it has been shown that KLF4 and SOX2 bind to closed chromatin in MEFs [11,12]. These binding events are related to opening up of silenced pluripotency enhancers [12]. It has also been suggested that opening up of pluripotency enhancers requires cooperative binding of at least two of the reprogramming factors [10]. OCT4 was shown to bind to closed chromatin in somatic cells [11,13], but accumulating evidence indicates that OCT4 first binds open chromatin in MEFs and is recruited by other reprogramming factors to assist in opening-up of pluripotency enhancers [10,14,15]. OSK also are involved in transcriptional silencing of somatic genes, facilitating the relocation of somaticspecific transcription factors to loci co-occupied by OSK [10]. Thus, reprogramming factors facilitate the repression of the somatic program but also the activation of the pluripotency network. This transcriptional duality is elicited by differential interaction with transcriptional co-activators or co-repressors [16].
The transcriptional co-repressors NCOR1 (nuclear receptor co-repressor 1) and its paralogue NCOR2 ( silencing mediator of retinoic acid and thyroid hormone receptor or NCoR2), interact with transcription factors to suppress their target genes [16]. NCOR1/NCOR2 are essential enzymatic co-factors of histone deacetylases, especially HDAC3 [17] and mediate transcriptional repression of their targets via deacetylation of Lysine residues of histone tails [18,19]. NCOR1/NCOR2 are essential for organism homeostasis, and have nonredundant roles in metazoans, as mutation of each gene individually leads to embryonic lethality [16]. In fact, NCOR1/NCOR2 regulate several essential metabolic and cellular processes, such as circadian rhythm [20] and mitochondrial function [21]. Recently, it was shown that NCOR1/NCOR2 interact with all four OSKM factors during reprogramming from MEFs to iPS [22]. NCOR1/NCOR2 interaction with c-MYC promotes the repression of pluripotency genes in later reprogramming stages [22], imposing a barrier for iPS generation. In addition, NCOR1/NCOR2 knockdowns induced transcriptional upregulation of somatic genes in the earliest reprogramming stages [22], arguing in favor of a dual role for these co-repressors. However, it is unknown how these early effects of NCOR1/SMART are brought about.
Previously, we performed a high-content imaging siRNA screening, combined with a secondary RNA-sequencing screen, to identify chromatin-associated regulators during early reprogramming from MEFs to iPS [23]. Using this data from two orthogonal screens, we identified strong phenotypic similarities between NCOR1 and OCT4 knockdowns. We have identified functional interactions from knockdowns showing similar phenotypes before [23,24]. Here we characterize the relationship of NCOR1 and OCT4 in early reprogramming. For such purpose, we first looked closer into the phenotypic similarities of both knockdowns, based on the high-content imaging. Then we set out to investigate the effect on the reprogramming transcriptome in Ncor1 and Oct4 knockdowns with RNAsequencing. Finally, the analyses we compared our RNA-seq findings to published ChIP-seq and RNA-seq datasets. These analyses not only document the cooperation of OCT4 and NCOR1 in downregulating one set of somatic genes, they also show an antagonistic role in the regulation of another subset of somatic genes during early reprogramming.

High content screening reveals early reprogramming disruption upon Ncor1 depletion
In a previous study, we performed a high-content siRNA screen, targeting 300 chromatin-associated factors during early reprogramming (Fig. 1A) [23]. We hypothesized that the onset of reprogramming would be associated with colonylevel phenotypic changes and that these changes are affected upon gene perturbation. Moreover, using computational analysis of high-content imaging data, perturbed genes can be grouped according to their phenotypic similarities in a multidimensional space, rather than focusing on a single phenotype [25]. We used early pluripotency markers CDH1 and SALL4 to detect colonies at day 6 of reprogramming. Additionally, multiple colony features were extracted, including pluripotency marker expression levels, morphological traits such as colony roundness, area and symmetry, and many other colony texture and morphological features [23]. We selected 30 hits that were subjected to a secondary RNA-seq-based screening [23] (overview in Fig. 1A). We have previously identified functional interactions from knockdowns showing similar phenotypes [23,24,26]. Using this rationale, we compared knockdown-toknockdown correlations based on high-content phenotypes and also based on transcriptomes (Fig. 1B). We filtered for pairwise correlations showing highest scores in both the high-content and the RNA-seq screen (both R> 0.4). We visualized the resulting potential interactions between genes (nodes) as networks, considering both high-content imaging correlations (edge width) and transcriptome-based correlations (edge color) of all siRNA knockdowns. We previously uncovered a functional interaction between BRCA1, BARD1 and WDR5 [23]. The strongest correlation however is between NCOR1-OCT4, raising the possibility of a functional interaction between these factors (Fig. 1B).
We verified that, based on high-content features, siNcor1 positively correlated with siOct4, but anti-correlated with the negative (nt, non-targeting) control ( Fig  1C). Colony number analysis derived from the screen showed a significant reduction of colony formation upon Ncor1 depletion (Fig. 1D, Supplemental  Fig.1). High-content images from Ncor1 and Oct4 knockdowns showed very low CDH1 and SALL4 intensities and compromised colony formation, when compared to the nt control. We validated this finding using an independent In-Cell Western assay and confirmed that siNcor1 and siOct4 show similarly impaired reprogramming phenotypes (Fig 1E, Supplemental Fig. 1).
From these experiments we concluded that siRNA targeting of Ncor1 results in an early reprogramming phenotype that is similar to that caused by Oct4 knockdown.

Ncor1 depletion affects transcriptional regulation of fibroblast identity
To gain insight in the molecular events that explain early reprogramming disruption by Ncor1 depletion, we performed gene expression analyses. We first asked whether Ncor1 mRNA expression followed a dynamic pattern during early reprogramming. For this, Ncor1 mRNA expression was quantified by RT-qPCR at different reprogramming timepoints, from MEFs to reprogramming day 6. We found that Ncor1 mRNA expression was relatively stable throughout reprogramming, with slight variations ( Fig. 2A). Then, we sought to understand the role of NCOR1 in transcriptional dynamics during early phases of iPSC reprogramming. For this purpose, we performed RNA-sequencing at day 3 and day 6 after expression of OSKM factors in MEFs transfected with control or Ncor1 targeting siRNAs, respectively. Differential gene expression analysis revealed 405 deregulated genes at day 6 (siControl versus siNcor1, adjusted pvalue <0.05, Fig. 2B-C). The effect size was relatively small for most transcripts, suggesting that the strong defects in repogramming observed with siNcor1, are brought about by moderate changes in expression of hundreds of transcripts (mean absolute log 2 -fold change of 0.43). The majority of the genes (94%) were up-regulated, as expected for a transcriptional co-repressor, although the deregulation of some of these genes may be an indirect consequence of the reduction of NCOR1 rather than a reduced NCOR1-mediated repression on that particular gene.
To gain insight into which processes were affected in Ncor1-depleted cells, we performed Gene Ontology classification (GO) for upregulated genes because the downregulated were very few (Fig. 2D). This analysis showed that upregulated genes were associated with cell-cell adhesion, poly-A-RNA-binding, exosome and cytoskeleton (Fig. 2D). Using our reprogramming RNA-seq time course dataset [23], we observed that genes associated with these terms were indeed downregulated in the course of early reprogramming (Fig. 2E). To note, the cell adhesion terms featured laminins, collagens and other genes related to basement membrane and extracellular matrix. These data suggest that reduction of Ncor1 prevents downregulation of genes related to fibroblast identity during MEF to iPSC reprogramming.

NCOR1 and OCT4 co-regulate two distinct MEF-expressed groups of genes
To gain insight into the functional relationship between OCT4 and NCOR1, we examined RNA-seq profiles at reprogramming day 6 for both knockdowns. To verify Ncor1 and Oct4 mRNA-targeting by the corresponding siRNAs, we plotted the RNA-seq normalized values for each of the knockdowns, compared to the controls at day 3 (Supplemental Fig. 2). Oct4 knockdown did not show a high efficiency of knockdown at day 3 in these samples, however from time course experiments we have observed that Oct4 mRNA is targeted and efficiently silenced at day 2, but rapidly recovers within a couple of days (Supplemental Fig. 2). This is due to its high overexpression driven by the tet-OSKM cassette [27]. Even with such transient early knockdown we observe a strong phenotype ( Fig.1) and many deregulated genes ( Fig. 2A), highlighting the important contribution of OCT4 in OSKM reprogramming.
When we compared the genes deregulated in each knockdown we found that approximately half of the genes deregulated in siNcor1 cells were also deregulated in siOct4 cells (3.4 fold enrichment compared to random overlap, hypergeometric p-value 8.9 x 10 -54 ; Fig.3A). After hierarchical clustering of genes deregulated by both siNcor1 and siOct4 based on RNA-seq, we identified one small cluster (Cluster 1) and a large cluster that was further divided in two (Cluster 2A, 2B; Fig.3B). Cluster 1 genes were relatively unaffected at day 3, but at day 6 they were downregulated for both knockdowns. For cluster 2A we found that genes were also unaffected at day 3, but showed a similar upregulated pattern in both Oct4 and Ncor1 siRNAs at day 6 ( Fig. 3B). Cluster 2B contained genes with high expression in MEFs, most of which failed to be downregulated at day 3, whereas by day 6 they predominantly showed opposite expression changes in the knockdowns (siNcor1 versus siOct4) (Fig. 3B). Our RNA-seq time course dataset [23] showed that genes in cluster 1 mostly remain the same in time, whereas clusters 2A and 2B tend to go down in time, but with different dynamics (Fig. 3C).
To shed some light on the difference between clusters 2A and 2B, we performed GO classification for genes of each cluster (Fig. 3D). NCOR1-OCT4 co-regulated genes (cluster 2A) showed an association with RNA metabolism and cell adhesion (Fig. 3D). The term "poly(A)-RNA binding" contained several components for pre-mRNA splicing. Genes increased by siNcor1 but not siOct4 (Cluster 2B), were associated with functions in extracellular matrix organization and other cell adhesion features. We wondered whether the "cell adhesion" terms were related to the mesenchymal-to-epithelial transition (MET) observed in early reprogramming [28,29]. We found that most mesenchymal genes exhibit a cluster 2B-type pattern, with increased expression at day 3 in both knockdowns and opposite effects of the two knockdowns at day 6. Epithelial genes on the other hand show more variable patterns (Fig. 3E). In siOct4 cells, the downregulation of mesenchymal gene expression is delayed. By day 6, it not only catches up, this decrease in expression is more severe in siOct4 cells relatively to control cells, suggesting that OCT4 moderates this decrease during normal reprogramming. In siNcor1 cells on the other hand, mesenchymal gene expression is maintained at higher levels at both time points, although some epithelial genes are also expressed at higher levels (Fig. 3E).
These results identify two distinct groups of somatic genes, the expression of which is normally reduced during reprogramming. Ncor1 depletion prevents downregulation of both groups, whereas Oct4 depletion prevents the downregulation of one group and causes a stronger decrease in expression in the other group of genes.

NCOR1 and OCT4 target promoters associated to cellular and metabolic processes in MEFs and early reprogramming cells
To gain insight into the functional relationship between NCOR1 and OCT4, we analyzed published ChIP-seq and RNA-seq data [10,22]. First, we asked whether the up-regulated genes that we identified in siNcor1 and siOct4 cells, were either MEF genes or genes dynamically changing during reprogramming. Consistent with our earlier analysis (Fig. 3C), these independent data confirmed that upregulated genes in siNcor1 cells were mostly MEF-expressed and very early reprogramming (48h) genes, but were mostly inactive in ES cells (Fig. 4A). In the heatmap several genes related to fibroblast identity are labeled (lamins, collagens, fibronectin, signaling genes; Fig. 4A). In contrast, genes upregulated in siOct4 cells showed a variety of expression patterns during reprogramming (Fig. 4B).
Then, we asked which fraction of differentially expressed genes in Ncor1-or Oct4-knockdowns were directly bound by NCOR1. We overlapped NCOR1 binding sites in reprogramming cells with siNcor1 and siOct4 differential genes (Fig.4C, left and right, respectively) and calculated the fold enrichment. In all cases, the overlap was higher than expected by random chance, as determined with a hypergeometric test ( Table 1). These data confirmed that NCOR1 directly binds to a significant fraction of deregulated genes in Ncor1 and Oct4 knockdowns (Fig. 4C, Table 1  We then asked whether NCOR1-bound genes with increased expression in siNcor1 cells, were co-bound by OCT4 or c-MYC or were decorated with the H3K4me3 promoter mark (121 genes in Fig. 4C left). We observed that these NCOR1-bound regions were also enriched with OCT4 and c-MYC at 48 hours of reprogramming, whereas OCT4 and c-MYC binding was lower in ESC (Fig.4D).
Since c-MYC is bound to those regions in MEFs already (Fig. 4D) and c-MYC, OCT4 and NCOR1 colocalize in the genome during reprogramming [22], it is possible that c-MYC facilitates OCT4 and NCOR1 binding to those regions early in reprogramming. Furthermore, the enrichment of H3K4me3 in MEFs, 48 h and ESC suggests that these genomic locations represent promoters (Fig.4D).
To explore the function of genes co-regulated by NCOR1 and OCT4, we performed a Gene Ontology classification of genes analyzed in Fig. 4D  (Supplemental Fig.3). This analysis confirmed once more the terms we observed before (Figs. 2D and 3D), related to cell-cell adhesion, extracellular exosome, extracellular matrix, cytoskeleton, focal adhesion and poly(A) RNA binding, among others (Supplemental Fig.3). As before, the poly(A) RNA binding term included several genes related to translation and pre-mRNA splicing.

DISCUSSION
In this study we found that depletion of Ncor1 results in disrupted colony morphology in early iPS cells. Transcriptome analyses showed that NCOR1 has a role in suppression of fibroblast identity and signaling modulation (Fig. 2-3). Our findings extend an earlier report on the role of NCOR1 in later phases of reprogramming [22]. NCOR1 function could be mediated by c-MYC, that occupies many MEF regulatory regions [10].
Using multi-dimensional readouts, we found strong phenotypic similarities between Ncor1 and Oct4 depletion. Such phenotypic similarities often reflect functional interactions [23,24,26]. Our analyses indicate that NCOR1 and OCT4 co-bind a group of promoters that are also bound by c-MYC in early reprogramming cell populations. Genes associated with these promoters are expressed in MEFs and during the first 48 hours of reprogramming, and are mostly transcriptionally silent in ESCs (Fig.4). Upon Ncor1 knockdown they are upregulated. This could mean that the repressor NCOR1 is recruited to such promoters by OCT4/c-MYC quite early in reprogramming, and transcriptional downregulation occurs via histone deacetylation. There was a surprisingly moderate effect on the transcriptome associated with siNcor1. Because of the increased proliferation of reprogramming cells [30], Ncor1 knockdown at day 3 was 50%. Perhaps the effects on the transcriptome would have been stronger if we had transfected siRNAs at day 3 instead of day 0, or alternatively with a complete loss of function model. In addition, Ncor1 and Ncor2 may have partially redundant functions [31], which may have contributed to the relatively mild effect on the transcriptome. Nevertheless, we observe a strong phenotype in colony formation, and the transcriptome data suggest that Ncor1 attenuates signaling, which might be crucial to weaken fibroblast identity. Consistent with this, NCOR1/NCOR2 are known to be connected to various signaling pathways [31].
Other genes involved are related to essential cellular functions, such as cell-cell adhesion, vesicle-mediated transport and RNA splicing. In agreement, complexes and proteins associated with these functions have been shown to be downregulated during intermediate stages of reprogramming [32]. Moreover some of these processes (e.g. cell adhesion, vesicle-mediated transport) have been shown to be barriers for reprogramming [33,34]. One apparent discrepancy in these analyses might be that the active promoter mark H3K4me3, seems abundant in MEFs, 48h reprogramming cells and ESCs. However, one possible explanation would be that target genes are downregulated in intermediate stages but then required again in late stages, as has been shown for some of the processes involved (cell adhesion, spliceosome) [32]. Moreover, some of these genes are essential for the cellular functions and continue to be expressed in ESCs, albeit at lower levels.
Recent work has shown that NCOR1/NCOR2 interact with all OSKM factors, but especially with c-MYC [22]. During late reprogramming stages, the co-repressors NCOR1/NCOR2 interact with c-MYC to silence the pluripotency network, posing a barrier for reprogramming [22]. Our study contributes with further characterization of an early facilitating function of NCOR1 in reprogramming, which in cooperation with OCT4 and c-MYC, may suppress cellular and metabolic functions sustaining fibroblast identity. This is consistent with a role of NCOR1 in overall cellular homeostasis and metabolic regulation [16,21]. It also highlights the emerging role and cooperation of reprogramming factors in transcriptional repression [10]. Future work will focus on potential functional interactions of NCOR1 with other chromatin factors, for instance with PHF1 or APEX1 (Fig.1B), and their physical interactions in early and late stages of reprogramming.  . Data points represent the mean ± SD from two replicates from independent samples. B) Volcano plot for differentially expressed genes in Ncor1 knockdown at reprogramming day 6, after RNA-sequencing. Blue data points represent the down-regulated genes and red datapoints represent up-regulated genes. C) Heatmap of differentially expressed genes from siNcor1 at reprogramming day 3 and day 6. Color scale represents the log2-fold change compared to nontargeting control. D) Gene Ontology (GO) classification from differential genes in C represented as a bubble plot. The x-axis of the plot is the -log10 of the adjusted p-value, and some of the most significant terms are represented in red (upregulated). Size of the bubbles represents the fold-enrichment per category. E) Graphical representation of reprogramming RNA-seq temporal dynamics from some of the upregulated GO categories in D, shown in different colors. Each datapoint represents the median and the lines the 95% CI. Per GO category, medians were calculated based on the normalized expression value (log2-counts per million reads) of each gene, relative to the gene´s maximum value (maximum set to 1).   [10,22] at NCOR/SMRT-bound genomic locations [22]) that are associated with siNcor1-upregulated genes in our data. The heatmap shows ChIP signals (color intensity) at genomic locations representing peak summits (center) plus and minus 5kb (left-right). Vertically different genomic locations are shown. Scales represent RPKM-normalized ChIP-seq signals.