A lncRNA/Lin28/Mirlet7 axis coupled to DNA methylation fine tunes the dynamics of a cell state transition

Execution of pluripotency requires progression from the naïve status represented by mouse embryonic stem cells (ESCs) to a condition poised for lineage specification. This process is controlled at transcriptional, post-transcriptional and epigenetic levels and non-coding RNAs are contributors to this regulation complexity. Here we identify a molecular cascade initiated by a long non-coding RNA (lncRNA), Ephemeron (Epn), that modulates the dynamics of exit from naïve pluripotency. Epn deletion delays the extinction of ESC identity, an effect mediated by perduring expression of the pivotal transcription factor Nanog. In the absence of Epn, Lin28a expression is reduced, resulting in an elevated level of Mirlet7g that suppresses de novo methyltransferases Dnmt3a/b. Dnmt3a/b deletion also retards exit from the ESC state, and is associated with delayed promoter methylation and slower down-regulation of Nanog. Altogether, our findings reveal a lncRNA/miRNA/DNA methylation axis that facilitates a timely stem cell state transition.


Introduction
Mouse embryonic stem cells (ESCs), in vitro counterparts of the pre-implantation epiblast, exhibit dual properties of self-renewal and differentiation (Boroviak et al., 2015;Bradley et al., 1984;Evans and Kaufman, 1981;Martin, 1981). These properties make them an attractive system for investigating cell fate decision making. In the embryo, spatially and temporally coordinated signals direct the rapid and continuous transition of the epiblast towards lineage specification (Acampora et al., 2016;Kalkan et al., 2017). In contrast, ESCs can be suspended in a ground state of pluripotency, where self-renewal is decoupled from lineage specification, using two inhibitors (2i) of glycogen synthase kinase 3 (GSK3) and mitogen-activated protein kinase kinase (MEK1/2), along with the cytokine leukaemia inhibitory factor (LIF) (Ying et al., 2008). Therefore, ESCs provide a unique opportunity to explore the principles and molecular players underlying the developmental progression of pluripotency (Kalkan and Smith, 2014).
While it is increasingly clear that the ESC state is maintained by a core network of transcription factors (Dunn et al., 2014;Ivanova et al., 2006;Niwa et al., 2000;Niwa et al., 2009;Wray et al., 2011), little is known about how cells progress from this state to lineage specification (Buecker et al., 2014;Kalkan and Smith, 2014). Loss-of-function screens have highlighted a multi-layered machinery for dissolving this transcription factor network (Betschinger et al., 2013;Leeb et al., 2014) and the latency period for exiting the naïve state depends on the clearance kinetics of the network members (Dunn et al., 2014). The coordination of multiple antagonistic regulators thus ensures a rapid and complete dismantling of this core network and consequent timely extinction of ESC identity upon 2i withdrawal (Kalkan and Smith, 2014).
In addition to protein coding genes, accumulating evidence suggests that non-coding RNAs can contribute to the regulatory complexity of cell fate transitions. Within this class, long non-coding RNAs (lncRNAs) comprise a large fraction of the transcriptome in diverse cell types and exhibit specific spatio-temporal expression (FANTOMConsortium, 2005;Guttman et al., 2009;Necsulea et al., 2014). The genomic distribution of lncRNAs is non-random (Luo et al., 2016), and a subclass of lncRNAs that are divergently transcribed from neighbouring genes is thought to regulate proximal gene expression in cis, either through the process of transcription itself (Ebisuya et al., 2008;Engreitz et al., 2016;Martens et al., 2004) or through lncRNA-protein interactions that recruit regulatory complexes (Lai et al., 2013;Lee, 2012;Luo et al., 2016;Nagano et al., 2008).
However, the functions and mode of action of the vast majority of lncRNAs remain unknown and require case-by-case experimental investigation. In mouse ESCs, knockdowns of a number of lncRNAs have been reported to exert effects on the transcriptome (Bergmann et al., 2015;Dinger et al., 2008;Guttman et al., 2011;Lin et al., 2014;Sheik Mohamed et al., 2010) and in some cases impair self-renewal (Lin et al., 2014;Luo et al., 2016;Savić et al., 2014). Therefore, lncRNAs could provide an additional layer of regulation in cell fate transition from the ESC state.
We investigated the potential involvement of lncRNAs in transition from the naïve ESC state and identified a dynamically regulated lncRNA that we named Ephemeron (Epn). We present functional evaluation of Epn and delineation of a downstream molecular cascade, which is an integral part of the regulatory machinery driving the irreversible exit from naïve pluripotency.

Identification of lncRNAs associated with transition from naïve pluripotency
Post-implantation epiblast derived stem cells (EpiSCs) represent a primed state of pluripotency developmentally downstream of the naïve state ESCs (Brons et al., 2007;Nichols and Smith, 2009;Tesar et al., 2007). To identify lncRNA candidates with a possible role in ESC transition, we analysed in silico the effect of genetic perturbation on expression of ESC and EpiSC states based on published data. We first selected genes that are over ten-fold differentially enriched in ESCs (182 genes) and EpiSCs (131 genes) relative to each other (Tesar et al., 2007) as molecular signatures to represent these two states. We investigated the impact on these two signature sets when individual lncRNAs (147 in total) and known protein coding regulators (40 in total) were knocked down in ESCs grown in LIF/serum, based on a published study (Guttman et al., 2011) (Fig1A, Fig1-source data 1). Serum culture supports a heterogeneous mixture of naïve and primed cells (Chambers et al., 2007;Kolodziejczyk et al., 2015;Marks et al., 2012). Therefore, analysis in this condition could potentially reveal regulators of the ESC and EpiSC states. The effect of each gene knockdown was plotted based on the percentage of genes significantly altered within ESC and EpiSC signature sets (FDR<0.05 and fold change > 2 or <0.5 over negative control defined by the original study). Applying this approach, we identified lncRNAs that increased ESC and decreased EpiSC signatures when knocked down, suggestive of a potential role in transition from the ESC state (Fig1A bottom right quadrant). We validated the approach by analysing the knockdown effects of known ESC self-renewal regulators. As predicted, depletion of factors that maintain the ESC state, such as Stat3, Esrrb, Sox2 and Klf4, led to a decrease in ESC and increase in EpiSC signature (Fig1A), while knockdown of Oct4 gave rise to a decrease in both ESC and EpiSC signatures, consistent with its requirement in both states (Niwa et al., 2000;Osorno et al., 2012).
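The signature scoring described above can be illustrated with a minimal Python sketch. It applies the stated thresholds (FDR<0.05 and fold change >2 or <0.5) and reports the percentage of a signature set that is significantly increased or decreased. The split into separate "up" and "down" percentages, the record type, the function names and the gene identifiers are all assumptions made for illustration; the exact scoring used to place each knockdown on the plot is not specified beyond the thresholds.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    fold_change: float  # knockdown relative to negative control
    fdr: float


def is_altered(r, fdr_cutoff=0.05, up=2.0, down=0.5):
    """Apply the thresholds quoted in the text."""
    return r.fdr < fdr_cutoff and (r.fold_change > up or r.fold_change < down)


def signature_shift(results, signature):
    """Return (% increased, % decreased) among the genes of one signature set."""
    hits = [r for r in results if r.gene in signature and is_altered(r)]
    n_up = sum(1 for r in hits if r.fold_change > 1)
    n_down = sum(1 for r in hits if r.fold_change < 1)
    return 100.0 * n_up / len(signature), 100.0 * n_down / len(signature)


# Hypothetical four-gene signature and knockdown results (illustrative only).
signature = {"geneA", "geneB", "geneC", "geneD"}
results = [
    GeneResult("geneA", 3.0, 0.01),   # significant increase
    GeneResult("geneB", 0.4, 0.02),   # significant decrease
    GeneResult("geneC", 2.5, 0.20),   # fails the FDR cutoff
    GeneResult("geneD", 1.2, 0.001),  # fails the fold-change cutoff
]
up_pct, down_pct = signature_shift(results, signature)
print(up_pct, down_pct)
```

Running the same function once with the ESC signature and once with the EpiSC signature would give, for each knockdown, the pair of values needed to assign it to a quadrant.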
We next examined expression profiles of these candidate lncRNAs during exit from self-renewal in defined conditions, exploiting the Rex1::GFP (RGd2) reporter ESC cell line (Kalkan et al., 2017;Wray et al., 2011).  Ribosomal profiling analysis indicated that Epn is indeed a non-coding RNA, with the longest predicted open reading frame (80 amino acids) possessing a ribosome release score typical of a non-coding sequence (Guttman et al., 2013). Epn is located in a region of high transposable element (TE) content, with its exons comprised of 76.4% annotated TE sequences (including ERV-K, LINE L1, and SINE B2 elements, Fig1 - Figure Supplement 1A). This genomic region exhibits minimal sequence conservation in mammals (Fig1 - Figure Supplement 1A) and we failed to identify any human homologue either within the syntenic region or elsewhere in the human genome. However, a positionally conserved spliced transcript (CA504619) that shares 79% sequence identity to exon 3 of mouse Epn is present within the rat syntenic region (Fig1 - Figure Supplement 1C). Therefore, it is likely that Epn has been conserved in rodents for over 30 million years since the mouse-rat lineage divergence.
We conducted RT-qPCR, Northern blotting and RNA-FISH to evaluate expression, transcription variants and subcellular localisation of Epn in ESCs. Epn showed strong induction within 12 hours of 2i/LIF withdrawal, but To explore the expression regulation of Epn, two inhibitors and LIF were withdrawn singly or dually for 24 hours. In conditions lacking Gsk3 inhibitor CHIRON99021 (CH), Epn was upregulated (Fig1E). In LIF/serum, Epn expression was twofold higher than in 2i (Fig 1-Figure Supplement 1G) and the addition of CH to LIF/serum culture suppressed Epn expression within 24 hours irrespective of the presence of MEK inhibitor PD0325901 (PD) (Fig1 - Figure Supplement 1H). Upon PD/LIF withdrawal, Epn expression was maintained for 24 hours then declined (Fig1F). Therefore, Epn is suppressed by CH in self-renewing ESCs.
During early mouse development (Boroviak et al., 2015), Epn expression peaked at E4.5 and was present in both epiblast and primitive endoderm of the mature blastocyst, and was absent or low in E5.5 post-implantation epiblast (Fig1 - Figure Supplement 2A) and later stages between E7 and E17 (Fig1 - Figure Supplement 2B). Amongst somatic tissues analysed, Epn was only detected in kidney, but at a much lower level than in ESCs. We also observed that Epn expression is restored upon naïve state resetting from EpiSCs (Guo et al., 2009;Yang et al., 2010) (Fig1 - Figure Supplement 2C,D). We conclude that Epn expression is highly specific to ESCs and the early mouse embryo.
LINE and ERVL-MaLR elements are present within the Epn proximal promoter region (2kb upstream of TSS) (Fig1 - Figure Supplement 1A). Such repetitive elements gain DNA CpG methylation dramatically during the pre- to post-implantation transition. By examining published data from embryos (Wang et al., 2014) and ESC progression in vitro (Kalkan et al., 2017), we found that CpG methylation gain at the Epn promoter was more extensive in the primed E6.5 epiblast (3% to 80%) than the average changes across all promoters (9% to 35%) or the genome (24% to 70%) (Fig1 - Figure Supplement 2E).
In contrast, no major CpG methylation gain at Epn was present in ESCs 24 hours post 2i withdrawal. These data suggest that Epn promoter methylation does not initiate repression, but could contribute to maintaining Epn silencing in gastrulating epiblast.

Loss of Ephemeron delays exit from naïve pluripotency
Initiation of ESC differentiation in defined media upon withdrawal of self-renewal factors recapitulates features of peri-implantation epiblast development (Kalkan et al., 2017). The naïve state exit latency varies, however, according to the starting self-renewal condition (Dunn et al., 2014;Wray et al., 2011). Higher activity of the core network in PD/LIF compared with 2i results in slower network dissolution, reflected in later onset of RGd2 downregulation (Dunn et al., 2014). PD/LIF and 2i also feature different levels of Epn due to CH To assess the effect of Epn depletion functionally, we conducted colony forming assays, in which cells maintained in PD/LIF were subjected to 40 hours culture in N2B27 and then plated at clonal density in 2i/LIF to assay the persistence of ESC self-renewal potential (Betschinger et al., 2013). Epn KO and knockdown cells both gave rise to substantially more undifferentiated colonies than wild type controls (

Molecular consequences of Ephemeron loss
We performed RNA-sequencing and compared the transcriptome of wild type and Epn KO ESCs using three independently targeted KO ESC lines and three subclones of the parental wild type ESCs. Twenty-two and fifty-five genes were significantly differentially expressed between wild type and Epn KO cells in PD/LIF and 8 hours after PD/LIF withdrawal respectively (Benjamini-Hochberg adjusted p<0.05, fold change >1.5 or <0.7) ( Kumar et al., 2014). Lin28a was the most differentially expressed gene among these 16 differentially expressed genes common to both time points, with Epn KO cells displaying a twofold reduction in mean expression level (Fig2E). Attenuated downregulation of members of the core naïve transcription factor network is one explanation for delayed exit from the ESC state (Kalkan and Smith, 2014). We hypothesised that Lin28a could be a negative regulator of the naïve network. Although Lin28a is commonly considered a pluripotency factor, its expression actually increases when cells transit out of the naïve state in vivo and in vitro (Boroviak et al., 2015;Kalkan et al., 2017;Marks et al., 2012). We examined expression of naïve

Lin28a and Nanog are in a genetic network with Ephemeron
Based on the preceding data, we hypothesised that Lin28a could be a downstream effector of Epn and the delayed exit phenotype might be attributed, at least in part, to the elevated starting level and increased perdurance of Nanog. To characterise further the relationship between Epn, Lin28a and the naïve transcription factor network, we carried out a series of genetic perturbation experiments and measured both Rex1-GFP reporter dynamics and colony formation upon withdrawal from PD/LIF. Nanog depletion in wild type cells did not substantially alter Rex1-GFP profile, but did reduce the colony formation capacity of cells recovered at 40  We investigated whether lowered expression of Lin28a contributes to the slower exit from naïve pluripotency and the increased Nanog expression. We manipulated Lin28a dosage by either overexpression or knockdown in Epn KO cells. In wild type cells, Lin28a overexpression had no significant effect. In Epn KO cells, however, it  (Ohnishi et al., 2014). These data are consistent with the proposition that Lin28a is genetically downstream of Epn and may facilitate exit from naïve pluripotency by accelerating downregulation of Nanog.
To assess whether Epn could regulate Lin28a or Nanog expression directly by promoter localisation (Rinn and Guttman, 2014), we employed chromatin isolation by RNA purification (ChIRP) (Chu et al., 2011). Using this method, we were able to selectively pull down endogenous Epn RNA (Fig3- Figure   One explanation for anti-correlated expression could be direct negative regulation of Lin28a by Nanog. We therefore inspected two published Nanog chromatin immunoprecipitation (ChIP) sequencing datasets (Chen et al., 2008;Marson et al., 2008) but observed no localisation of Nanog at the Lin28a locus (Fig3- Figure   Supplement 2G). Furthermore, we did not observe Lin28a downregulation in Nanog knockdown cells (Fig 3-Figure Supplement 2H). Therefore, Nanog does not appear to be a direct upstream regulator of Lin28a.

The function of Lin28a in ESC transition is mediated by suppression of Mirlet7g
Lin28a is an RNA binding protein with a well-established function in suppressing maturation of Mirlet7 family miRNAs (Cho et al., 2012;Viswanathan et al., 2008). We investigated whether the role of Lin28a in naïve state exit is Mirlet7 dependent. We profiled mature miRNA expression of Mirlet7 family members using RT-qPCR.
Expression of Mirlet7a, Mirlet7d, Mirlet7e, Mirlet7g and Mirlet7i decreased 24 hours after 2i/LIF withdrawal, coinciding with the increase in Lin28a expression (Fig4A). However, mature miRNA Mirlet7c expression was unaffected, suggesting that Mirlet7c expression is independent of Lin28a. This observation is in agreement with a recent finding that Mirlet7c-2, the major Mirlet7c isoform expressed in mouse ESCs, bypasses Lin28a regulation due to lack of a GGAG recognition motif in its loop region (Triboulet et al., 2015). The Lin28a regulated Mirlet7 miRNAs, but not Mirlet7c, are expressed at higher levels in ESCs in 2i/LIF than in LIF/serum (Pandolfini et al., 2016)  genes that are upregulated upon 2i withdrawal in our RNA-sequencing dataset and are either known or predicted Mirlet7g targets using the RNA22 tool (Miranda et al., 2006). DNA methyltransferases Dnmt3a and Dnmt3b emerged as prime candidates as has previously been proposed (Kumar et al., 2014). Expression of both increases during transition from both 2i and PD/LIF (Kalkan et al., 2017).

Dnmt3a and Dnmt3b methylate the Nanog promoter during naïve state exit
Epiblast progression is associated with genome-wide de novo methylation during pre- to post-implantation development (Auclair et al., 2014). This phenomenon is recapitulated when naïve ESCs are withdrawn from 2i (Buecker et al., 2014;Kalkan et al., 2017). Previous studies demonstrated hypomethylation of the Nanog promoter in mouse ESCs compared to lineage committed cells (Farthing et al., 2008;Yu et al., 2007). We speculated that impeded de novo DNA methylation could allow perdurance of Nanog expression at the onset of naïve state exit. To investigate this hypothesis, we carried out bisulfite sequencing analysis across the Nanog proximal promoter region, 1 kb upstream of the TSS, after siRNA knockdown of Dnmt3a/3b singly or together (Fig5A). We observed a marked reduction of CpG methylation in the -1kb to -761 bp region (region 1) of the These data suggest that Dnmt3a and Dnmt3b have overlapping roles in mediating de novo methylation at the Nanog proximal promoter.
To explore the role of de novo DNA methylation in ESC transition, we investigated functional consequences of Dnmt3a and Dnmt3b depletion. We generated Dnmt3a and Dnmt3b single and compound knockouts in RGd2 ESCs using CRISPR/Cas9. Using two guide RNAs (gRNAs), we generated deletions of highly conserved PC and ENV motifs (motifs IV and V) within the catalytic domain for both Dnmt3a and Dnmt3b, recapitulating the previously characterised Dnmt3a and Dnmt3b mutant gene structures (Okano et al., 1999) Figure Supplement 1E). The promoters of these genes are methylation refractory in 2i withdrawal time course (Kalkan et al., 2017). Therefore, the elevated expression should be secondary to some other factor(s) such as increased Nanog expression.
Dnmt3a/3b compound KO also resulted in impeded upregulation of peri-implantation markers such as Fgf5, Oct6 and Otx2 at 24 hours post PD/LIF withdrawal (Fig5 - Figure Supplement 1F). These data indicate that de novo DNA methylation facilitates timely progression from the ESC state. Importantly, however, de novo methylation by Dnmt3a/3b is not essential for the exit from naïve pluripotency.

Discussion
In this study, we identified a genetic network that connects a lncRNA, Ephemeron, with known players in post-transcriptional and epigenetic regulation. Epn sits at the apex of this cascade, upstream of Lin28a/Mirlet7g and Dnmt3a/3b (Fig5E), and ultimately contributes to downregulation of the potent naïve transcription factor Nanog. Epn facilitates upregulation of Lin28a, although how this is achieved remains unclear. Lin28a in turn suppresses sub-family members of the Mirlet7 miRNAs such as Mirlet7g, targets of which include the de novo DNA methyltransferases Dnmt3a and Dnmt3b. Increased Dnmt3a/3b activity correlates with Nanog proximal promoter CpG methylation, which may consolidate the lowered expression triggered by withdrawal of 2i/LIF. This mechanism provides an additional dimension to the multi-layered molecular machinery that expedites the irreversible ESC transition from naïve to formative pluripotency (Jang et al., 2017;Kalkan et al., 2017;Kalkan and Smith, 2014).
ES cell maintenance is robust due to parallel pathway wiring (Dunn et al., 2014). Progression from such a resilient state requires a powerful and coordinated dissolution machinery. Our findings indicate that one component is the Epn cascade. Gsk3 inhibition by CH represses Epn and thereby Lin28a, consistent with insulation of the naïve transcription factor network (Wray et al., 2011). In contrast, ESCs cultured in PD/LIF express Epn and Lin28a but without overt consequence, presumably due to the potent self-renewal environment of Stat3 activation and MEK inhibition that sustain expression of Nanog and other naïve factors.
However, loss of Epn in PD/LIF resulted in elevated Nanog.
LncRNAs are more tolerant of TE integration than protein coding genes, and TEs could drive their more rapid evolution relative to protein coding genes (Kelley and Rinn, 2012;Necsulea et al., 2014). Epn is comprised of 76.4% TEs, compared to the average of 41.4% TE composition in the mouse genome and of 33% reported for mouse multi-exon lincRNA sequences (Kelley and Rinn, 2012). The aligned sequence between Epn and the rat transcript from the syntenic region includes ERVK LTR and SINE B2 elements. These sequences have been preserved for over 30 million years since mouse-rat lineage divergence, which could be indicative of functional constraint on Epn sequence and these domesticated TEs. Interestingly, non-coding transcripts harbouring TE sequences are enriched in ESCs and early embryo development for both mouse and human (Fort et al., 2014;Göke et al., 2015;Kelley and Rinn, 2012) and in several instances have been proposed to regulate pluripotency through different mechanisms (Durruthy-Durruthy et al., Fort et al., 2014;Lu et al., 2014).
However, Epn is only found in rodents. In fact, species-specific lncRNAs comprise the majority of the lncRNAs discovered, in particular within the primate branches (Necsulea et al., 2014). Due to their more rapidly evolving nature, it is thought that lncRNAs are more likely to acquire species-specific and lineage-restricted functions and several such examples have recently been characterised (Durruthy-Durruthy et al., Paralkar et al., 2014;Rani et al., 2016). The rodent specificity of Epn might be related to the more rapid progression from pre-implantation epiblast to gastrulation in rodents and the associated requirements for acute extinction of the naïve pluripotency programme (Smith, 2017).
In common with Lin28a (Shinoda et al., 2013), Epn is dispensable for development in vivo, as we obtained a Mendelian ratio of homozygous Epn mutant mice from heterozygous intercrosses (5:18:7, number of wild type: heterozygous: homozygous offspring). Thus ESCs provide a sensitised platform for dissecting redundant individual elements within a multi-layered control machinery for pluripotency regulation (Leeb et al., 2014;Martello and Smith, 2014).
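One way to check that the reported offspring counts (5:18:7) are compatible with the 1:2:1 ratio expected from a heterozygous intercross is a chi-square goodness-of-fit test, sketched below in Python. This is offered as an illustration only: the text does not state which test, if any, was applied, and the function name is hypothetical. With three genotype classes there are 2 degrees of freedom, for which the chi-square survival function reduces to exp(-x/2), so no external library is needed.

```python
import math


def mendelian_chi_square(wild_type, heterozygous, homozygous):
    """Chi-square goodness-of-fit test against a 1:2:1 Mendelian ratio.

    Returns (chi-square statistic, p-value) with 2 degrees of freedom.
    """
    observed = (wild_type, heterozygous, homozygous)
    total = sum(observed)
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 2 degrees of freedom the survival function is exactly exp(-chi2 / 2).
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


# Offspring counts reported in the text (wild type : heterozygous : homozygous).
chi2, p_value = mendelian_chi_square(5, 18, 7)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2f}")  # chi2 = 1.47, p = 0.48
```

A p-value of this size gives no grounds to reject the Mendelian expectation, although with 30 offspring the test has limited power to detect a modest deficit of homozygotes.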
Lin28a is known as a human somatic cell reprogramming factor that acts to suppress Mirlet7 miRNA family members, which are highly expressed in differentiated cells (Melton et al., 2010;Yu et al., 2007). However, Lin28a is expressed at a low level in ground state mouse ESCs (Marks et al., 2012) and pre-implantation epiblast, but at high levels in post-implantation epiblast and EpiSCs (Boroviak et al., 2015). The expression pattern is consistent with our evidence that up-regulation of Lin28a at the onset of mouse ESC differentiation functions to facilitate transition from the naïve state. During human iPSC generation, it is plausible that Lin28a promotes acquisition of primed pluripotency, the endpoint for current human somatic cell reprogramming.
Lin28a itself is a target of Mirlet7 miRNAs (Kumar et al., 2014;Melton et al., 2010), and this double-negative feedback loop can act as a bimodal switch to facilitate network transition. Our findings are consistent with the recent report that loss of Lin28a reduces ESC heterogeneity in serum/LIF, an effect mediated by Mirlet7g (Kumar et al., 2014). However, independent of Mirlet7, Lin28a can post-transcriptionally regulate the expression and/or translation of many RNAs (Cho et al., 2012;Zhang et al., 2016), which could also contribute to naïve state exit.
We observed that de novo DNA methyltransferases Dnmt3a/3b are targets of Mirlet7g. Naïve ESCs (Ficz et al., 2011;Habibi et al., 2013;Leitch et al., 2013) and pre-implantation epiblast (Monk et al., 1987;Sanford et al., 1987) display global DNA hypomethylation. De novo methyltransferases Dnmt3a/3b are lowly expressed and dispensable in ESCs (Okano et al., 1999). However, the post-implantation epiblast rapidly acquires global DNA methylation and this process is dependent on Dnmt3a/3b (Auclair et al., 2014). A similar trend is also observed upon naïve ESC withdrawal from 2i (Kalkan et al., 2017). Loss of Dnmt3a and Dnmt3b, singly or in combination, delays naïve state exit. The role of de novo methylation in facilitating the ESC state exit may be exerted on specific naïve pluripotency associated factors, such as Nanog. It is noteworthy, however, that this effect is transient and ESC identity does not persist. Therefore, although de novo DNA methylation facilitates the rapid dissolution of ESC identity, consistent with placement of Dnmt3a/3b downstream of Epn, it is not required for the exit from naïve pluripotency.
In summary, we have mapped a genetic interaction pathway consisting of a novel lncRNA, proteins and miRNAs serving as an integral part of the multi-layered molecular machinery that propels mouse ESCs towards lineage competence. The defined mouse ESC system for phased progression of pluripotency is a sensitive experimental platform for the functional annotation of lncRNAs. We speculate that the fine-tuning effect of Epn may be representative of lncRNA actions on specific cellular processes.

The Lin28a overexpression vector was constructed by PCR cloning mouse Lin28a from cDNA using forward primer AATTGTCGACATGGGCTCGGTGTCCAACCAGCAGT and reverse primer AATTGCGGCCGCTCAATTCTGGGCTTCTGGGAGCAG, followed by cloning into the pENTR2B vector. It was subsequently cloned into a PiggyBac-based expression vector using Gateway LR clonase (Thermo Fisher 11791020) to generate the pCAG-Lin28a-pA:PGK-hygro-pA plasmid.

(Okano et al., 1999), with the highly conserved PC and ENV motifs (motifs IV and V) within the catalytic domain deleted. The gRNAs were generated by annealing the indicated oligos (Supplementary file 2A), which were subsequently ligated into pX458 vector (Addgene) digested with BbsI. The constructs were sequence validated before transfection.

Naïve pluripotency exit assays
ESCs were plated at 1x10⁴/cm² in 2i/LIF or PD/LIF. The next day, cells were carefully washed with PBS before switching to NDiff B27 medium. The Rex1-GFP profile was analysed at the indicated time points in at least two independent experiments on a Cyan or Fortessa FACS analyser, and the GFP-high population was quantified and indicated in all flow cytometry profiles. Live/dead discrimination was performed using Topro-3. For the clonal assay, 24h after 2i withdrawal or 40h after PD/LIF withdrawal, 300-500 cells were plated per well of a 12 well plate coated with Laminin (1:100 dilution, Sigma L2020) and cultured in 2i/LIF for 6 days. Alkaline Phosphatase staining (Sigma 86R-1KT) was conducted to detect ES colonies. AP-stained plates were imaged using an Olympus IX51 with a DP72 camera and CellSens software, and colonies were subsequently counted manually using ImageJ software.

EpiSC derivation from ESCs and EpiSC resetting
ESCs were plated at 1x10⁴/cm² in 2i/LIF on gelatin coated plates. The next day, cells were washed with PBS before the medium was switched to NDiff B27 medium supplemented with 20ng/ml Activin A and 12ng/ml Fgf2 together with 2µM XAV939 (Sigma, X3004) (A/F/X). Cells were then passaged to fibronectin coated plates in A/F/X medium. EpiSCs were passaged at least seven times before gene expression analysis and resetting. For EpiSC resetting, EpiSCs were stably transfected with the GY118F construct by piggyBac transposition (Yang et al., 2010). 1x10⁴ cells were plated in one well of a 12 well plate in A/F/X; the next day, 2i plus GCSF was supplied to initiate resetting.

Differentiation assays
For neuronal differentiation, ESCs were plated at 1x10 4 /cm 2 in NDiff B27 medium on laminin ( Detailed protocols can be found in Mulas et al (Mulas et al., 2016).

Generation of Dnmt3a and Dnmt3b KO ESCs with CRISPR/Cas9
A pair of gRNA-containing plasmids based on pX458 was transfected using Fugene HD (Promega).

Southern blotting
Genomic DNA of individually picked ESC clones was extracted and digested with XmnI, size-fractionated on a 0.8% agarose gel and transferred to Hybond-XL blotting membrane (Amersham) using standard alkaline transfer methods. The 5' and 3' external probes were generated by PCR with primer sequences shown in Supplementary file 4A. Southern blot hybridization was conducted as described previously (Li et al., 2011).

Northern blotting
10 µg of purified RNA was resolved by denaturing formaldehyde agarose gel electrophoresis with MOPS buffer.
RNA was transferred to Hybond-XL (GE Healthcare, RPN2020S) membrane in 1xSSC buffer overnight by capillary transfer. RNA was UV cross-linked to the membrane and pre-hybridised with Expresshyb (Clontech, 636831) for 2 hours at 65°C. The DNA probe was generated by PCR (primers are shown in Supplementary file 4B) and 25ng of probe DNA was labelled with [32P]-dCTP using the Radprime DNA labeling system (Invitrogen, 18428-011). Free nucleotides were removed from the labelled probe using a G-50 column (GE Healthcare, 27-5330-01), and the probe was heat-denatured followed by snap cooling. The probe was added to the pre-hybridised membrane and incubated overnight at 65°C in a rolling incubator. The membrane was washed 3 times at 65°C, at 10 min intervals, with wash buffer containing 0.1 x SSC and 0.1% SDS. The membrane was placed in a phosphoimager and exposed at least overnight at -80°C before being scanned using the Typhoon 9410 phosphoimager system (GE Healthcare).

5' and 3' RACE
5' RACE was conducted using the 5'-Full RACE Core Set (Takara, #6122) following the manufacturer's protocol. The sequences for the RT-primer and nested PCR primers A1, A2, S1, and S2 are shown in Supplementary file 5A. 3' RACE was conducted by using a polyT RT-primer with a unique sequence tag to synthesise cDNA. The 3' end region was PCR amplified using a primer specific to the RT-primer and a gene specific primer. The primers are shown in Supplementary file 5B. Both 5' and 3' RACE PCR products were cloned into plasmids using the Zero blunt TOPO PCR cloning kit (Life Technologies, 451245) for subsequent sequencing.

RNA extraction, reverse transcription and Real-time PCR
Total RNA was isolated using Trizol (Invitrogen) or RNeasy kit (Qiagen) and DNase treatment was conducted either after RNA purification or during column purification. cDNA was transcribed from 0.5-1 µg RNA using SuperScriptIII (Invitrogen) and oligo-dT priming. Real-time PCR was performed using a StepOnePlus machine (Applied Biosystems) with Fast Sybrgreen master mix (Applied Biosystems). Target gene primer sequences are shown in Supplementary file 6. Expression levels were normalised to Actinβ. Technical replicates for at least two independent experiments were conducted. The results are shown as mean and standard deviation calculated by StepOnePlus software (Applied Biosystems). The cDNA libraries for E7-E17 embryos and adult somatic tissues were purchased from Clontech (Mouse Total RNA Master Panel, cat. no. 636644).
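Normalisation of a target gene to a reference gene such as Actinβ is commonly done with the comparative Ct method, sketched below in Python. This is an assumption made for illustration: the quantities reported here were calculated by the StepOnePlus software, whose exact settings are not described. The sketch also assumes approximately 100% amplification efficiency (a doubling per cycle), and the Ct values and function name are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct):
    """Expression of a target relative to a reference gene (2^-dCt).

    target_ct, reference_ct: Ct values from technical replicates.
    Assumes the amplicon doubles every cycle (100% efficiency).
    """
    delta_ct = mean(target_ct) - mean(reference_ct)
    return 2.0 ** (-delta_ct)


# Hypothetical technical triplicates (illustrative values only).
target_ct = [24.0, 24.0, 24.0]     # gene of interest
reference_ct = [18.0, 18.0, 18.0]  # reference gene

rel = relative_expression(target_ct, reference_ct)
print(rel)  # 0.015625, i.e. 2^-6
```

Dividing the value obtained for one condition by that of a control condition would give the fold change between the two, which corresponds to the 2^-ddCt form of the same calculation.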

RNA-FISH
RNA-FISH was conducted using ViewRNA ISH Cell Assay for Fluorescence RNA In Situ Hybridization system (Affymetrix Panomics) with modifications and imaged on a DeltaVision Core system (Applied Precision), as described in Bergmann et al., 2015. The probe set used for Ephemeron was VX1-99999-01.

Luciferase assay
The entire 3'UTRs of both Dnmt3a and Dnmt3b were PCR cloned downstream of the firefly luciferase coding region into the pGL3 vector. For the Dnmt3a 3'UTR, forward primer AATTGGCCGGCCGGGACATGGGGGCAAACTGAAGTAG and reverse primer AATTGGATCCGGGAAGCCAAAACATAAAGATGTTTATTGAAGCTC were used for PCR cloning. For the Dnmt3b 3'UTR, forward primer AATTGGCCGGCCTTCTACCCAGGACTGGGGAGCTCTC and reverse primer AATTGGATCCTTATAGAGAAATACAACTTTAATCAACCAGAAAGG were used for PCR cloning. The pGL3 vector without the 3'UTR clone was used as a control. Each firefly luciferase construct (500ng) together with the Renilla luciferase construct (10ng) was co-transfected with either Mirlet7g mimic or scramble control (20nM). Firefly and Renilla luciferase activities were determined by dual luciferase assay (Promega, catalogue no. E1960) 48h post-transfection.
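Dual luciferase readings are typically analysed by dividing the firefly signal by the Renilla signal from the same well, then expressing the miRNA mimic condition relative to the scramble control. The Python sketch below illustrates that calculation. The readings and function names are hypothetical, and the normalisation scheme is a common convention rather than one spelled out in this section.

```python
from statistics import mean


def normalised_activity(firefly, renilla):
    """Firefly signal corrected for transfection efficiency using Renilla."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def relative_to_control(mimic_wells, control_wells):
    """Mean normalised activity with mimic, relative to scramble control.

    Each argument is a list of (firefly, renilla) readings, one per well.
    """
    mimic = mean(normalised_activity(f, r) for f, r in mimic_wells)
    control = mean(normalised_activity(f, r) for f, r in control_wells)
    return mimic / control


# Hypothetical readings in arbitrary luminescence units (illustrative only).
scramble_wells = [(1000.0, 100.0), (1200.0, 100.0)]
mimic_wells = [(500.0, 100.0), (600.0, 100.0)]

ratio = relative_to_control(mimic_wells, scramble_wells)
print(ratio)  # 0.5
```

Applying the same calculation to the pGL3 vector lacking a 3'UTR would indicate whether any reduction depends on the cloned 3'UTR rather than on a general effect of the mimic.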

Immunostaining
Cells were fixed in 4% paraformaldehyde for 10 min at room temperature and blocked with blocking buffer (5% semi-skimmed milk with 0.1% Triton in PBS) for 2 hours at room temperature. Primary antibodies were diluted in blocking buffer and incubated at 4°C overnight. Primary antibody was washed away with 0.1% Triton in PBS three times, with 10 min incubation between washes. Secondary antibody diluted in blocking buffer (1:1000) was incubated at room temperature for 1 hour, followed by three washes with 0.1% Triton in PBS. Nuclei were counterstained with DAPI. Primary antibodies used were Nanog (eBioscience, 14-5761, RRID:AB_763613, 1:200) and Lin28a (Cell Signaling, 3978, RRID:AB_2297060, 1:800; 8706, RRID:AB_10896850, 1:200). Images from random fields were taken with a Leica DMI3000. Images from different fields at each time point were combined and analysed using CellProfiler software (Broad Institute, RRID:SCR_007358) to segment nuclear and cytoplasmic compartments. Total fluorescence intensity for each sub-cellular compartment, as well as for the whole cell, was extracted for each cell for correlation analysis.

Chromatin isolation by RNA purification (ChIRP)
The antisense oligo probes were selected with GC content in the range of 40%-50% in regions of the Epn exons without repetitive sequences (Figure 1-Figure Supplement 1A). The probe sequences are shown in Supplementary file 7. ChIRP was conducted following a published protocol (Chu et al., 2011). The data are available at the NCBI Gene Expression Omnibus (accession number: GSE90574). The link to the data is as follows: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=cralcwwqvtaptgv&acc=GSE90574
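The GC-content criterion for probe selection can be expressed as a simple filter. This is a minimal sketch with hypothetical candidate sequences and function names; it does not implement the exclusion of repetitive sequences, which would need a separate repeat annotation.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def in_gc_window(seq, low=0.40, high=0.50):
    """True if the GC fraction lies within [low, high] (40%-50% by default)."""
    return low <= gc_fraction(seq) <= high

# Hypothetical candidate probes (not the probes listed in Supplementary file 7).
candidates = ["ATGCATGCAT", "GGGGCCCCAT", "ATATATATAT"]
selected = [s for s in candidates if in_gc_window(s)]
print(selected)
```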

ChIP
The experimental procedure was conducted as described previously (Betschinger et al., 2013). 2 µg of H3K4me3 antibody (Diagenode pAb-003-050) and IgG control (Santa Cruz, sc-2345) were used for 4×10⁶ cells per ChIP. qPCR was performed with the primers shown in Supplementary file 8.
500 ng of purified genomic DNA was treated with sodium bisulfite to convert all unmethylated cytosine residues into uracil residues using the Imprint DNA Modification Kit (Sigma, MOD50-1KT) according to the manufacturer's protocol. Nanog proximal promoter regions (Region 1 and 2 as indicated in Figure 5a) were amplified using a nested PCR approach with KAPA HiFi Uracil+ ReadyMix (KapaBiosystems, KK2801). The PCR conditions for both rounds of nested PCR were as follows: denaturation at 98°C for 5 minutes, followed by 10 cycles of touchdown PCR at 98°C for 15 seconds, 62°C (starting annealing temperature) for 15 seconds, with the annealing temperature reduced by 1°C per cycle, and 72°C for 1.5 minutes. This was followed by 35 cycles of 98°C for 15 seconds, 58°C for 15 seconds and 72°C for 1.5 minutes. 2 µl of first-round PCR product was used as template for the nested PCR. All primer sequences are shown in Supplementary file 9. The PCR products were verified and purified by gel electrophoresis and subsequently subcloned into PCR4.1 TOPO vector (Invitrogen) according to the manufacturer's protocol. Reconstructed plasmids were purified and individual clones were sequenced (Eurofins).
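Because bisulfite converts unmethylated cytosines to uracil (read as T after PCR) while methylated cytosines remain C, the methylation state of each CpG can be read off a clone sequence by comparison with the unconverted reference. The sketch below illustrates the principle under simplifying assumptions: each clone read is already aligned to the reference with no insertions or deletions, and only the top strand is considered. The sequences and function names are hypothetical.

```python
def cpg_positions(reference):
    """0-based indices of the C in every CpG dinucleotide of the reference."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]

def call_methylation(reference, read):
    """Map CpG position -> True (methylated, C retained) or False (converted to T).

    Positions where the read shows any other base are skipped as uninformative.
    """
    if len(reference) != len(read):
        raise ValueError("read must be aligned to the reference without gaps")
    calls = {}
    for i in cpg_positions(reference):
        base = read[i].upper()
        if base == "C":
            calls[i] = True
        elif base == "T":
            calls[i] = False
    return calls

def percent_methylated(reference, reads):
    """Percentage of informative CpG calls that are methylated across all clones."""
    methylated = total = 0
    for read in reads:
        calls = call_methylation(reference, read)
        methylated += sum(calls.values())
        total += len(calls)
    return 100.0 * methylated / total if total else float("nan")

# Hypothetical reference and two sequenced clones.
reference = "ACGTTCGA"
clones = ["ACGTTCGA",   # both CpGs methylated
          "ATGTTTGA"]   # both CpGs unmethylated
print(percent_methylated(reference, clones))
```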

Transcriptome sequencing and analysis
Total RNA was isolated with the RNeasy RNA purification kit (Qiagen). Ribo-zero rRNA-depleted RNA was used to generate sequencing libraries for wild type and Ephemeron knockout cells in PD/LIF and 8 hours after PD/LIF withdrawal, from three independent cell lines. Single-end sequencing was performed and the reads were aligned to the NCBI38/mm10 reference genome with Ensembl version 75 annotations using tophat2. Only uniquely mapped reads were used for further analysis. Gene counts were obtained from SAM files using htseq-count in intersection-nonempty mode with -s reverse. Differential gene expression analysis was conducted using the Bioconductor R package DESeq2 version 1.4.5. DESeq2 reports a raw P-value and a Benjamini-Hochberg adjusted P-value for each gene. An adjusted P-value threshold of 0.05 was used to determine differential gene expression, corresponding to an expected false discovery rate of 5%. The data are available at the NCBI Gene Expression Omnibus (accession number: GSE90574). https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=cralcwwqvtaptgv&acc=GSE90574
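For readers unfamiliar with the Benjamini-Hochberg adjustment, the following minimal sketch shows how adjusted P-values are derived from raw P-values. It is a plain re-implementation for illustration with hypothetical input values, not the DESeq2 code; DESeq2 additionally applies steps such as independent filtering, so its reported values can differ.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in padj]
print(padj, significant)
```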

Epn promoter CpG methylation analysis
Using published genome-wide bisulphite sequencing data (Kalkan et al., 2017;Seisenberger et al.;Wang et al., 2014), the Epn promoter region was defined as the 2 kb region upstream of the TSS and the percentage of CpG methylation within this region was quantified. For the promoter average, the percentage of CpG methylation in the 2 kb promoter region of each annotated gene was quantified and the values were averaged across all genes. For the genome average, the percentage of CpG methylation in each 50 kb tiling window was quantified and the values were averaged across all windows.
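The promoter quantification can be sketched as follows. The sketch assumes per-CpG methylated and total read counts as input and pools reads across the window, which is one of several reasonable ways to compute a regional percentage; the coordinates, counts and function names are hypothetical.

```python
def promoter_window(tss, strand, size=2000):
    """Half-open [start, end) interval covering `size` bp upstream of the TSS."""
    if strand == "+":
        return tss - size, tss
    if strand == "-":
        return tss + 1, tss + 1 + size
    raise ValueError("strand must be '+' or '-'")

def percent_methylation(cpgs, start, end):
    """Pooled CpG methylation (%) for sites with start <= position < end.

    `cpgs` is an iterable of (position, methylated_reads, total_reads).
    Returns None if the window has no coverage.
    """
    methylated = total = 0
    for position, meth_reads, total_reads in cpgs:
        if start <= position < end:
            methylated += meth_reads
            total += total_reads
    return 100.0 * methylated / total if total else None

# Hypothetical CpG calls on one chromosome: (position, methylated reads, total reads).
cpgs = [(9000, 8, 10), (9500, 2, 10), (12000, 10, 10)]
start, end = promoter_window(tss=10000, strand="+")
print(percent_methylation(cpgs, start, end))
```

The promoter and genome averages would then be the mean of this value over all annotated promoters or over all 50 kb tiling windows, respectively.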

Acknowledgments
We thank Kosuke Yusa and Graziano Martello for comments on the manuscript. We are grateful to Carla Mulas for assisting with the miRNA expression plot, Yiping Zhang for lncRNA candidate prediction analysis and Rosalind Drummond for technical support. We thank Heather Lee for providing Dnmt3a and Dnmt3b siRNAs. We also thank Peter Humphreys and Andy Riddell for technical support with imaging analysis and flow cytometry, respectively. We thank Nicholas Ingolia for useful discussion on ribosomal footprinting on Epn. A.S is

A, Experimental scheme for analysing naïve state exit using Rex1GFPd2 reporter cells. B, Rex1-GFP flow cytometry profiles of wild type and Epn KO cells in 2i and PD/LIF and during transition from these starting conditions. Two independent clones for wild type and Epn KO cells were analysed. The percentage of GFP-high cells was quantified. C, Experimental scheme for colony formation assay. D, Colony formation assay for wild type and Epn KO cells in 2i/LIF 40 hours post PD/LIF withdrawal. Colonies were stained for alkaline phosphatase (AP), with representative images shown. Percentage clonogenicity was calculated as the number of AP-positive colonies divided by the total number of cells plated. Mean+/-SD, n=3. E, Lin28a and Nanog expression relative to β-actin in three independent wild type and Epn KO cell lines measured by RT-qPCR. Mean+/-SD, n=3. *p<0.05, **p<0.01, Student's t-test. F, Nanog and Lin28a expression kinetics upon PD/LIF withdrawal in wild type and Epn KO cells. Mean+/-SD, n=3. G, Rex1-GFP flow cytometry profile for wild type, Epn KO and Epn rescue cells 40 hours post PD/LIF withdrawal. The percentage of GFP-high cells was quantified.

Note that the Northern blotting probe region overlaps with a LINE element. Epn expression was detectable by FANTOM5 CAGE expression. B, Epn expression upon 2i withdrawal in cells fractionated based on Rex1-GFP expression. 25hH and 25hL: Rex1-GFP high and low cells, respectively, sorted at 25 hours post 2i withdrawal. C, Sequence alignment of Exon3 of Epn and a rat EST transcribed from the syntenic region. Epn EST, mouse AK131952; rat EST, CA504619. D, Northern blot of wild type and Epn KO cells in 2i/LIF and 24 hours post 2i/LIF withdrawal. * non-specific hybridisation. 28S and 18S gel electrophoresis served as loading control. E, 5'RACE sequence confirming the 5' start of Epn RNA. F, 3'RACE sequence confirming the 3' end of Epn RNA. G, Epn expression relative to β-actin measured by RT-qPCR upon 2i/LIF component addition to LIF/serum (LS) culture. PD and/or CH were added to cells maintained in LIF/serum for the indicated time. Mean+/-SD, n=3.

promoter TSS site was cloned to be directly upstream of the first base of Epn exon 1. B, Long-range PCR genotyping of targeted clone. 5' long range PCR product, 6 kb; 3' long range PCR product, 6.2 kb. C, Expression of Epn, Lin28a and Nanog relative to β-actin in PD/LIF measured by RT-qPCR. Mean+/-SD, n=3.