Cytosine hydroxymethylation by TET enzymes: From the control of gene expression to the regulation of DNA repair mechanisms, and back

Chromatin is a complex multi-scale structure composed of DNA wrapped around histone octamers to form nucleosomes. Its compaction state is finely regulated, mainly by epigenetic marks present not only on nucleosomes but also on the DNA itself. The most studied DNA modification is 5-methylcytosine (5-mC). Methylation of cytosines at CpG islands located in promoters is associated with repression of transcription. In contrast, enrichment of 5-hydroxymethylcytosine (5-hmC), one of the oxidation products of 5-mC generated by TET (ten-eleven translocation) enzymes, at promoters and enhancers promotes transcription activation. Recently, a new role for 5-hmC has been proposed in the context of DNA repair: 5-hmC was found to be enriched at DNA lesions, and knockdown of TET enzymes led to impaired repair efficiency. Here, we review our current knowledge regarding the role of the regulation of the 5-mC/5-hmC balance by TET enzymes in the context of transcription modulation as well as DNA repair processes. In a final section, we speculate on the potential involvement of TET proteins in DNA repair mechanisms associated with transcription activation.


Introduction
Research performed over the last fifty years has shown that our genetic material is not simply a linear sequence of 3 billion base pairs coding for all the proteins required by the cell. An additional layer of complexity, which allows different outcomes to be reached from the same DNA sequence, is provided by epigenetic regulatory mechanisms. These processes rely on labile chemical tags, such as phosphorylation or methylation, marking the DNA double helix or the nucleosomes. Epigenetic marks serve two major functions: (i) they participate in signaling pathways; and (ii) they regulate chromatin architecture [1]. The principle of epigenetic signaling follows a fairly simple rule: a given mark will attract or repel specific effector proteins participating in physiological mechanisms such as DNA transcription, replication or repair. The structural role of the epigenetic code remains less well defined but might rely on a modification of the physico-chemical properties of the chromatin fiber, thus impacting its folding state inside the nucleus. Whether epigenetic marks are able to regulate all scales of the multi-step 3D chromatin organization or only specific ones remains unclear [2].
The most studied epigenetic marks are those found along the N-terminal tails of the core histones. These tails, which are located at the surface of the compact nucleosome particle, are readily accessible to nuclear proteins, including specific enzymes responsible for writing or erasing epigenetic marks as well as effector proteins involved in cellular processes using DNA as a template [3]. Methylation or acetylation of the histone tails is also known to regulate the interaction between the histones and the DNA, thus affecting the stability of the nucleosome. At higher folding scales, these epigenetic marks probably also regulate nucleosome/nucleosome interactions, which in turn could impact higher folding levels of the chromatin as well as its compaction state [4].
Besides histones, the DNA molecule itself is also subject to epigenetic modifications. In contrast to histone marks, which are highly diverse, only one major epigenetic modification is found on DNA: cytosine methylation on carbon 5, mainly in a CpG dinucleotide context. With the exception of CpG islands, CpGs are highly methylated throughout the genome of somatic cells, and changes in their level of methylation have been associated with differentiation and tumorigenesis [5]. While the direct impact of methylated cytosines (5-mC) on chromatin structure remains unclear, this mark shows key signaling functions in relation to transcription regulation. Indeed, depending on their CpG density, promoters enriched in 5-mC at CpG sites tend to have a lower transcription rate, a fact that has been attributed to a lack of transcription factor binding [6]. This mark is also involved in X-chromosome inactivation [7] as well as genomic imprinting [8].
Cytosine methylation, in line with all other epigenetic modifications, is a dynamic mark. DNMT (DNA methyltransferase) enzymes are in charge of adding this mark along the DNA helix [9]. 5-mC removal has been proposed to occur by passive dilution [10] but may also involve two different active multi-step processes (Figure 1). First, activation-induced deaminase (AID), which deaminates cytosine into uracil, is also able to convert 5-mC into thymine, yet with a lower efficiency [11]. The relevance of this 5-mC clearance pathway, which later involves the Base Excision Repair (BER) machinery to replace the thymine with a cytosine, remains debated [12]. Alternatively, 5-mC undergoes successive oxidation steps that lead first to 5-hmC (5-hydroxymethylcytosine), then to 5-fC (5-formylcytosine) and finally to 5-caC (5-carboxylcytosine). These oxidation steps are carried out by the TET (Ten-eleven translocation) enzymes, which are 2-oxoglutarate and Fe(II)-dependent dioxygenases [13,14]. 5-fC and 5-caC are then excised and repaired by the BER machinery to ultimately return to the unmethylated cytosine state [15,16]. Interestingly, mutual regulation has been reported between enzymes controlling DNA methylation and those in charge of adding and erasing methylation at histone tails, suggesting dynamic crosstalk between DNA and histone epigenetic marks [17]. The TET family is composed of three members: TET1, TET2 and TET3. Despite sharing the same catalytic activity, these proteins are expressed differentially during development and are also present in different cell types, suggesting that they may fulfill different functions [18].
Figure 1. Cytosine can be methylated by DNA methyltransferase 1 (DNMT1) for methylation maintenance or by DNMT3a and DNMT3b for de novo methylation. TET (Ten-eleven translocation) enzymes oxidize 5-mC into 5-hmC (5-hydroxymethylcytosine), then 5-fC (5-formylcytosine) and finally 5-caC (5-carboxylcytosine). Both 5-fC and 5-caC are recognized and repaired by the base excision repair (BER) system, leading to the restoration of an unmodified cytosine.
In the dynamic regulation of cytosine methylation illustrated in Figure 1, 5-hmC can be seen as a transient demethylation intermediate similar to 5-fC and 5-caC. However, several observations suggest that 5-hmC might be more than such a transient intermediate. First, 5-hmC is much more stable than the two other oxidized forms of 5-mC. Second, specific readers of 5-hmC have been identified, such as the DNMT1-interacting protein UHRF2 or the homeobox protein Zhx1 [19]. Conversely, 5-hmC may also repel proteins that bind to 5-mC [19]. Altogether, these findings led to the idea that 5-hmC could be a fully-fledged epigenetic mark and serve signaling functions by attracting or repelling specific factors [20]. In fact, data are accumulating in favor of specific functional roles for 5-hmC and for the regulation of the 5-mC/5-hmC balance via TET enzymes. High levels of 5-hmC, associated with strong expression of TET enzymes, correlate with cell pluripotency [21] and are important for embryonic stem cell (ESC) maintenance, since impairing 5-hmC by knocking out TET1 induces abnormal ESC differentiation [21,22]. More recently, it was also shown that hydroxymethylation of specific loci along the genome is important at late differentiation stages, such as the transition from B cells to plasma cells [23], and during cell reprogramming towards pluripotency [24]. In adult organisms, 5-hmC shows tissue specificity, with high levels found in the brain in contrast to other organs [25–27]. While the exact function of 5-hmC in the brain remains unclear [28], a decreased level of this epigenetic mark has been observed in several neurodegenerative diseases [29]. Finally, reduced 5-hmC levels are a feature shared by multiple cancers [30]. Importantly, decreased 5-hmC levels do not depend on tumor stage, indicating that loss of 5-hmC is an early event in cancer development [31].
These various results suggest that, while 5-hmC is definitely a central player in transcription regulation, it may also participate in other cellular functions involving DNA transactions. In this review, we present the findings that recently improved our understanding of the exact role of 5-hmC and TET enzymes in two different processes: gene transcription and DNA repair.

Cytosine hydroxymethylation by TET enzymes: A central regulator of gene transcription
Before describing the current knowledge regarding the function of cytosine hydroxymethylation, it is useful to briefly recall the function of the best-known modified state of cytosine: 5-methylation. This epigenetic mark is a well-known negative regulator of transcription. It can regulate the binding of transcription factors at CpG islands located in regulatory regions [6,32], but it also recruits methyl-CpG binding proteins, including MeCP2 and MBD2, which can in turn attract co-repressor complexes displaying histone deacetylase activities such as SIN3A [33]. The cooperative action of these different players is thought to create a closed chromatin state repressive to transcription [34]. Outside the areas enriched in CpGs, the exact function of 5-mC is less clear [35]. Mirroring the inhibitory role of cytosine methylation, 5-hmC is found at genomic areas associated with active transcription. High levels of 5-hmC are found at active promoters, enhancers, transcription start sites (TSS) and gene bodies [36,37]. Since 5-hmC is a demethylation intermediate of 5-mC, these increased levels of hydroxymethylation at active transcription sites may just reflect an increased demethylation activity in these areas to prevent silencing [38]. However, several lines of evidence suggest a specific role for hydroxymethylation by TET enzymes both in the recruitment of early components of the transcription machinery and in the establishment of a chromatin landscape favorable for transcription.
First, 5-hmC is important at the initial stages of transcription activation, during the recruitment of pioneering factors responsible for chromatin opening at enhancer elements to facilitate the binding of subsequent transcription factors [32]. In the context of P19 cell differentiation induced by retinoic acid, two pioneering factors, MEIS1 and PBX1, are retained more strongly at binding sites enriched in 5-hmC [32]. At enhancers bound by these factors, 5-hmC is also required for chromatin opening and for the first step of enhancer activation, named enhancer priming. The mechanism underlying this positive impact of 5-hmC on the recruitment of pioneering factors nevertheless remains unclear. One hypothesis is that 5-hmC may recruit remodeling factors leading to nucleosome eviction or destabilization, thus favoring the binding of pioneering factors [32]. In line with this model, TET enzymes were shown to associate with several chromatin remodeling complexes. During osteogenesis, the activity of TET1/2 at the promoter of the bone master transcription factor Sp7 promotes the recruitment of the catalytic subunits BRG1 and BRM of the SWI/SNF chromatin remodeler, which leads to the eviction of histone H3 [39]. Surprisingly, TET1 was shown to associate not only with remodelers promoting transcription such as SWI/SNF, but also with repressive remodeling factors such as the NuRD (Nucleosome Remodeling and Deacetylase) or the Polycomb complexes [40,41]. These interactions with remodelers displaying opposite functions led to the proposal that TET1 might balance the activity of both types of remodelers to allow the fine-tuning of gene expression levels [40].
A second important function of cytosine hydroxymethylation via TET enzymes is the establishment of a chromatin landscape favorable for transcription, both in terms of epigenetic marks and 3-dimensional structure. At promoters, TET2 participates in the enrichment in the active H3K4me3 mark (tri-methylation of lysine 4 of histone H3) [42] and might also help to erase the repressive marks H3K9me3 and H3K27me3 [39]. TET enzymes also promote the mono-methylation of lysine 4 on histone H3 (H3K4me1) at enhancers, which is important for the activation of these regulatory elements [32]. However, cytosine hydroxymethylation is not only associated with active epigenetic histone marks. Similar to the association of TET proteins with antagonistic remodeling complexes discussed in the previous paragraph, these enzymes also contribute to the formation of dual epigenetic patterns. TET2 activity appears crucial for the establishment of bivalent chromatin domains enriched in both the active H3K4me3 and the repressive H3K27me3 marks at CpG islands [43]. All these different impacts of the 5-mC/5-hmC balance on the epigenetic histone code are mediated by complex cross-regulations between TET enzymes and histone methyltransferases and demethylases [39,43]. For example, the activity of TET1/2 was shown to be essential for the recruitment of Wdr5 and Set1b, two subunits of the histone methyltransferase complex COMPASS, and of the histone demethylases Jmjd2a and Jmjd3 at the Sp7 promoter upon activation of this gene [39]. This composite epigenetic network not only marks genomic regions for the transcription machinery, it also modulates the chromatin compaction state, a process which could regulate access to the target DNA sequence. Evidence for a specific effect of 5-hmC on the chromatin structure nevertheless remains relatively sparse.
Results obtained on nucleosomes reconstituted in vitro show that 5-hmC is able to modulate the interactions between the nucleosome core particle and the DNA, thus affecting both nucleosome stability and compactness [44]. In cells, high 5-hmC levels are associated with loose chromatin packing [32], but it is difficult to establish a direct causal link between these two features.
The data presented above show that the 5-mC to 5-hmC oxidation dynamics controlled by TET enzymes is involved at multiple stages of transcription regulation, allowing the expression of our genome to be fine-tuned. Nevertheless, for many of the results mentioned in this section, it is difficult to discriminate a specific effect of the 5-hmC mark from the demethylation activity of the TET enzymes. Moreover, it has also been proposed that TET enzymes may act as scaffolding proteins for the recruitment of members of the transcription machinery independently of their enzymatic activity. Deplus et al. showed that, while TET2/3 participate in the recruitment of the SET1/COMPASS methyltransferase complex as well as the O-GlcNAc transferase at TSS and CpG-rich sites, no enrichment of 5-hmC is observed in these genomic regions [45]. Furthermore, the TET3 isoform seems to participate in the stabilization of several nuclear receptors at their binding sites on the chromatin independently of its dioxygenase activity [46]. Future work should help to better understand these intricate functions of the TET enzymes.

An emerging role for TET enzymes in DNA repair mechanisms
Recent results suggest that, besides its role in the regulation of transcription, the 5-mC to 5-hmC transition mediated by TET enzymes may also be involved in the DNA damage response (DDR). First, TET2 was recently shown to be crucial for the clearance of aberrant DNA methylation associated with oxidative stress [47]. Furthermore, in mouse cells of the haematopoietic lineage, single knockout of TET1 or double knockout of TET2 and TET3 led to DNA repair defects, as shown by increased phosphorylation of the histone variant H2AX (γH2AX) both in the absence of exogenous DNA damage and after X-ray irradiation [48,49]. In line with this impaired DDR, removal of TET enzymes induces chromosome segregation abnormalities during mitosis [50] and unbalanced chromosome translocations [49], two mechanisms leading to genomic instability [51] and, ultimately, to tumorigenesis [49]. This involvement of TET enzymes in the DDR might simply be explained by the impaired expression of many key repair proteins in TET knockout cells [48,49]. However, several lines of evidence also suggest a more direct involvement of the TET enzymes in the DDR.
An important finding suggesting a direct function of TET enzymes during DNA repair is the recruitment of these enzymes to DNA lesions induced by laser irradiation, leading to a local increase of the 5-hmC mark [50]. Depending on the DNA damaging conditions, such a gain in 5-hmC levels seems to require the activation of the DNA damage-related PI3K-like kinases ATM, which is mainly involved in double-strand break repair mechanisms, or ATR, which is activated by a large spectrum of DNA insults [52,53]. Both kinases participate in DNA damage signaling, for example via H2AX phosphorylation, and also target multiple repair proteins to modulate their activity during the DDR [54]. More specifically, TET1 is a substrate of ATM, while TET3 interacts with ATR and is phosphorylated by this enzyme [52,53]. In addition to this regulation via the ATM/ATR kinases, a complex interplay has also been reported between TET enzymes and poly-ADP-ribose polymerase 1 (PARP1). PARP1-dependent signaling plays multiple roles during the DDR, from the recruitment of early repair factors to chromatin remodeling in the vicinity of the DNA lesions [55,56]. In vitro, PARP1 and TET1 are able to modulate each other's activities [57]. Moreover, PARP1 activity regulates the transcription of the TET1 gene [58]. Altogether, the recruitment of TET enzymes to DNA lesions and their relationships with key components of the DNA repair machinery point towards a specific function of 5-hmC during the DDR. Nevertheless, more work is needed to delineate the exact function of this mark at DNA breaks: could it be the equivalent, along the DNA double helix, of ATM-dependent H2AX phosphorylation, which signals the presence of DNA breaks, or could it promote, together with PARP1, the formation of a loose chromatin architecture facilitating access for the DNA repair machinery?

Coupling DNA repair mechanisms and transcription modulation: A new job for TET enzymes?
DNA transcription and DNA repair are not independent mechanisms in the cell nucleus but instead display complex inter-dependencies. Multiple factors involved in transcription inhibition, including subunits of the Polycomb and NuRD complexes [59] or heterochromatin protein 1 [60], are recruited to DNA breaks, where they add repressive epigenetic marks such as trimethylation of lysine 27 of histone H3 [61]. Thus, it seems important to shut down transcription in the vicinity of DNA damage sites to avoid interference between the transcription machinery and the recruitment of DNA repair proteins [62]. However, the reverse does not seem to hold true, and several recent lines of experimental evidence suggest that activating the transcription of certain genes actually requires the induction and repair of DNA breaks [63].
In 2006, Ju et al. were the first to report that efficient estrogen-dependent transcription of the TFF1 gene requires the occurrence of double-strand breaks (DSBs) at the promoter via a topoisomerase IIβ activity [64]. Intriguingly, it was later reported that inducing DSBs at promoters of neuronal activity-regulated genes by chemical agents or the CRISPR-Cas9 endonuclease system was sufficient to activate the transcription of these genes [65]. Other types of DNA lesions were also observed in relation to transcription activation, such as base oxidation [66] or DNA nicks [67], and these lesions were found not only at gene promoters but also within gene bodies [68] and at enhancers [67]. Importantly, transcriptional activation required not only the induction of DNA lesions but also their resolution via multiple repair factors, including PARP1 and the DDR-dependent PIKK kinase DNA-PK [64].
The fact that many of the transcription-related DNA lesions are induced by topoisomerases, and that DNA break induction seems tightly linked to transcription elongation, both suggest that the formation of these breaks is necessary for the release of the topological constraints associated with RNA polymerase II progression [69]. Besides this topological effect, the presence of DNA damage at transcriptionally active genes might also be related to the formation of a non-canonical DNA structure: the R-loop [70]. R-loops are three-stranded structures composed of a hybrid duplex associating nascent RNA with the template DNA strand [71], and the single complementary DNA strand. R-loops, which are favored by CG enrichment, might help to stabilize the transcription bubble [70], but their resolution is associated with the formation of DNA lesions that might result in genomic instability if incorrectly processed [72].
Since TET enzymes have been implicated in both the early stages of transcription activation and repair mechanisms, it is tempting to speculate that they may participate in the complex interplay between the repair and transcription machineries observed upon activation of the transcription process. In favor of this hypothesis, the TFF1 promoter, at which transcription-related DNA lesions were initially observed [64], is also known to undergo cycles of methylation/demethylation [73]. Furthermore, it was also recently reported that the pioneering factor FOXA1, which is found preferentially at hydroxymethylated enhancers [74], is able to nucleate numerous repair factors [75]. It nevertheless remains unclear whether TET enzymes directly participate in these different processes.
Affecting the 5-mC/5-hmC balance via TET activity in the context of transcription-related DNA lesions may fulfill several functions. Because transcription requires nucleosome eviction, the epigenetic histone modifications classically used by the cell as signaling marks are transiently absent from the transcribed area. Cytosine hydroxymethylation via TET activity at nucleosome-depleted areas, such as R-loops [76] or topologically constrained areas, may help to signal DNA lesions to the repair machinery or orient the repair process towards the most appropriate pathway to avoid deleterious effects [72]. Independently of their activity, TET enzymes may also serve as a recruitment platform for repair factors, similar to what was reported for the pioneering factor FOXA1 [75]. Repair factors found at active transcription sites not only allow DNA break resolution, they also seem to contribute to the establishment of a transcription-competent chromatin structure. For example, upon activation of the TFF1 gene, topoisomerase IIβ and PARP1 were shown to promote the exchange of linker histone H1, involved in transcription repression, for the high mobility group B proteins [64]. This chromatin remodeling activity may be coordinated with the one reported for TET enzymes via cross-regulations such as the one reported between PARP1 and TET1 [57]. These different roles of the TET enzymes are nevertheless currently mostly speculative, and more work is needed to delineate more precisely the involvement of the TET enzymes in transcription-induced DNA repair mechanisms.
In this short review, we have highlighted how recent findings have extended the spectrum of action of the TET enzymes from the modulation of transcription to a contribution to DNA repair. The involvement of TET enzymes in both mechanisms might reflect the fact that repair and transcription are not independent of each other but appear highly entangled.